A Brief Recap
In our last post, we detailed a case where a tree had been removed without the necessary permit. One of the key issues was the species of the tree, as some species are granted greater protections than others within the municipality. With only a stump remaining, the arborist involved—Charlie Marcus of Legacy Arborist Services—reached out to several experts for input. Andrew was one of them and recalled that a colleague, Sarah, had recently taught her undergraduate genetics class at the University of Tampa about DNA barcoding—a method of species identification that uses a short section of DNA from a specific gene.

Andrew reached out to Dr. Sarah Wilson to see if we could apply the method to the stump—both to assist Charlie and because, honestly, it sounded fascinating. Fortunately, she agreed and had some leftover reagents from teaching DNA barcoding in the Spring. We obtained wood from the stump and prepared it for sequence analysis by a third-party lab. A complete pictorial walkthrough of this process is available in our Part 1 post.
When we last wrote about this effort, we had just packaged three samples that, based on electrophoresis, we were confident contained the plant DNA needed for identification. We sent them out and waited for the results. We didn’t have to wait long—just two days—before the report from Genewiz landed in our inbox!

Following DNA extraction and PCR, the samples were run on an agarose gel. Ladder (L) was loaded into the first two wells. All three PCRs (1, 2, 4) appeared to have worked, and the product size (~500 base pairs) was in the expected range for the primers used. The blank (B) reaction showed no band, suggesting there was no contamination.
Solving a Mystery with Science (By Dr. Sarah Wilson)

PCR samples (Figure 2) were mailed to Genewiz for sequencing. A method called Sanger sequencing was used so that the order of nucleotides (the building blocks of DNA) could be determined for the DNA barcode region. The resulting sequence data (As, Ts, Gs, and Cs) were uploaded by Genewiz directly into DNA Subway for further analysis.

First, the sequences were trimmed using the sequence trimmer. Several nucleotides at the beginning and end of the sequence were removed before analyzing further. Trimming the sequence helps improve reliability in the next steps.
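For readers who like to see the idea in code, here is a minimal sketch of end trimming. It assumes the unreliable base calls at each end show up as "N" (an uncalled base); the post does not describe the exact rule the trimmer applies, so the function name, the rule, and the example read are all hypothetical.

```python
def trim_ends(sequence: str, ambiguous: str = "N") -> str:
    """Strip ambiguous base calls from both ends of a sequence read.

    Assumption: low-quality positions are recorded as "N". Ambiguous
    calls in the middle of the read are left in place.
    """
    return sequence.upper().strip(ambiguous)


# Made-up read for illustration only (not data from this project).
raw_read = "NNNNGATTCCGTANACGGTTNNN"
trimmed = trim_ends(raw_read)
print(len(raw_read), "->", len(trimmed), "nucleotides")
print(trimmed)
```

Only the ends are touched because that is where Sanger reads tend to be least reliable; the interior of the read is left alone.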

The first 110 nucleotides of the trimmed sequence for sample 1 are shown in Figure 5. The DNA sequencing software measures the fluorescence emitted in each of four channels (A, T, C, G) and records these in a graph called an electropherogram. Each peak color on the graph corresponds to one nucleotide at that position of the sequence: black is guanine (G), red is thymine (T), green is adenine (A), and blue is cytosine (C).
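As a rough illustration of how four fluorescence channels become a sequence, the sketch below calls the base with the strongest signal at each position. This is a deliberate simplification (real base-calling software also handles peak shape, spacing, and quality scores), and the peak heights are made up for the example.

```python
# Trace colors as described for the electropherogram in this post.
TRACE_COLORS = {"G": "black", "T": "red", "A": "green", "C": "blue"}


def call_base(signal):
    """Pick the base whose channel has the strongest signal at one position."""
    return max(signal, key=signal.get)


# Hypothetical peak heights for three positions (illustration only).
positions = [
    {"A": 35, "T": 910, "C": 20, "G": 15},
    {"A": 12, "T": 30, "C": 18, "G": 870},
    {"A": 790, "T": 25, "C": 40, "G": 22},
]

for signal in positions:
    base = call_base(signal)
    print(base, "->", TRACE_COLORS[base], "peak")
```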

A BLAST (basic local alignment search tool) search can quickly identify any close matches to a given sequence by comparing it to a sequence database. The “hits” are sorted so those at the top of the list are most likely matches to the DNA barcode.
Several statistics shown in the BLAST results allow the hits to be compared with one another. The Aln. (alignment) length indicates how many nucleotides align between the DNA barcode and the database entry. The top six hits have either 564, 554, or 544 nucleotides in alignment. These are all good results.
The bit score is calculated from the raw alignment score, which rewards matches and penalizes mismatches and gaps, and is normalized for the scoring system used; the higher the bit score, the better the alignment. Bit scores for the top two hits are the highest, suggesting these are likely the best matches for the barcode. But there are other factors to consider as well, most importantly the number of mismatches.
The Expectation value, or E-value, is the number of alignments scoring at least this well that would be expected to occur by chance in the database. The lower the E-value, the less likely it is that the hit is just a chance match to the DNA barcode. All six of the top hits have an E-value of zero, indicating they are all good contenders.
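To see why long, clean alignments end up with an E-value of essentially zero, here is a small sketch using the standard relationship between bit score and E-value, E = m × n × 2^(−S'), where S' is the bit score, m is the query length, and n is the total length of the database. The query and database lengths below are hypothetical placeholders, not values from our search.

```python
def expect_value(bit_score: float, query_length: int, database_length: int) -> float:
    """Expected number of chance alignments scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with S' the bit score, m the query
    length, and n the total database length.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical sizes for illustration only.
QUERY_LENGTH = 550            # nucleotides in the barcode
DATABASE_LENGTH = 10**9       # total nucleotides in the database

for score in (40, 100, 1000):
    e_value = expect_value(score, QUERY_LENGTH, DATABASE_LENGTH)
    print(f"bit score {score}: E-value {e_value:.3g}")
```

Because the E-value halves with every additional bit of score, an alignment spanning several hundred nucleotides produces a number so small that it is displayed as zero.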
The number of mismatches is probably the most important factor. The top two hits each have 2 mismatches, the third hit has 1, and the sixth hit has 0; all four of these alignments are for the same genus, Nyssa. With 5 mismatches, the hits for Davidia are unlikely to be the correct genus. The Nyssa sylvatica hit with 0 mismatches is the most likely match at both the genus and species level, since the entire barcode matches perfectly.

Although the sequences were trimmed using DNA Subway, additional nucleotides should have been removed from both ends. Those extra nucleotides on each end of the sequence altered the ranking of the hits in the BLAST search. The database hit with 0 mismatches, Nyssa sylvatica, is the best choice when looking at the data more closely, even though it was the sixth hit on the list.
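The "look more closely" step amounts to re-sorting the hit list by mismatches instead of trusting the default order. The sketch below uses the mismatch counts reported above; the assignment of the two Davidia hits to positions 4 and 5 is inferred from the description rather than stated outright, and the `Hit` class is a hypothetical container, not part of any BLAST tool.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    rank: int        # position in the original BLAST list
    genus: str
    mismatches: int


# Mismatch counts from the text. Positions 4 and 5 are assumed to be
# the Davidia hits (inferred, not stated explicitly).
hits = [
    Hit(1, "Nyssa", 2),
    Hit(2, "Nyssa", 2),
    Hit(3, "Nyssa", 1),
    Hit(4, "Davidia", 5),
    Hit(5, "Davidia", 5),
    Hit(6, "Nyssa", 0),
]

# Fewest mismatches first; the original rank breaks ties.
reranked = sorted(hits, key=lambda hit: (hit.mismatches, hit.rank))
best = reranked[0]

for hit in reranked:
    print(f"original rank {hit.rank}: {hit.genus}, {hit.mismatches} mismatches")
```

Sorted this way, the sixth hit moves to the top and both Davidia hits fall to the bottom, which matches the conclusion reached by reading the table by hand.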

The BLAST database includes links to Wikipedia so that you can see images and relevant facts about each organism. All three samples aligned perfectly (0 mismatches) to the database entry for Nyssa sylvatica (common name: Tupelo).
Conclusion
Andrew was pleasantly surprised—once again—to discover that science worked. All his fears about moss DNA contamination or inaccurate identifications were put to rest when he saw the species, which is native to North Florida. Moreover, he remembered that in Charlie’s first email, a Tupelo species had been one of the initial hunches suggested to the group, given the stump’s proximity to a body of water.

As for us, we had so much fun with this project that we’re planning to try more identifications. Andrew has a research block of oaks that were purchased from a native nursery as sand live oak but appear to be more like live oak. We’re also eager to try our hand at identifying decay fungi—or perhaps even conduct a survey of Swedish meatballs at IKEA or tuna salad at Subway!
About this blog
Rooted in Tree Research is a joint effort by Andrew Koeser and Alyssa Vinson. Andrew is a Research and Extension Professor at the University of Florida Gulf Coast Research and Education Center near Tampa, Florida. Alyssa Vinson is the Urban Forestry Extension Specialist for Hillsborough County, Florida.
The mission of this blog is to highlight new, exciting, and overlooked research findings (tagged Tree Research Journal Club) while also examining many arboricultural and horticultural “truths” that have never been empirically studied—until now (tagged Show Us the Data!).