Journal article
Authors list: Kolter, Andreas; Gemeinholzer, Birgit
Publication year: 2021
Pages: 265-298
Journal: Genome
Volume number: 64
Issue number: 3
ISSN: 0831-2796
eISSN: 1480-3321
DOI Link: https://doi.org/10.1139/gen-2019-0198
Publisher: Canadian Science Publishing
Abstract:
The problem of low species-level identification rates in plants by DNA barcoding is exacerbated by the fact that reference databases are far from being comprehensive. We investigate the impact of increased sampling depth on identification success by analyzing the efficacy of established plant barcode marker sequences (rbcL, matK, trnL-trnF, psbA-trnH, ITS). Adding sequences of the same species to the reference database led to an increase in correct species assignment of +10.9% for rbcL and +19.0% for ITS. Simultaneously, erroneous identification dropped from ~40% to ~12.5%. Despite its evolutionary constraints, ITS showed the highest identification rate and identification gain by increased sampling effort, which makes it a very suitable marker in the planning phase of a barcode study. The limited sequence availability of trnL-trnF is problematic for an otherwise very promising plastid plant barcoding marker. Future developments in machine learning algorithms have the potential to give new impetus to plant barcoding, but are dependent on extensive reference databases. We expect that our results will be incorporated into future plans for the development of DNA barcoding reference databases and will lead to these being developed with greater depth and taxonomic coverage.
Citation Styles
Harvard Citation style: Kolter, A. and Gemeinholzer, B. (2021) Plant DNA barcoding necessitates marker-specific efforts to establish more comprehensive reference databases, Genome, 64(3), pp. 265-298. https://doi.org/10.1139/gen-2019-0198
APA Citation style: Kolter, A., & Gemeinholzer, B. (2021). Plant DNA barcoding necessitates marker-specific efforts to establish more comprehensive reference databases. Genome, 64(3), 265-298. https://doi.org/10.1139/gen-2019-0198
Keywords
DISCRIMINATE; GenBank; IMPROVEMENTS; MATK; NONCODING REGIONS; plant barcoding; PRIMERS; PSEUDOGENES; reference database; reference DNA library; species-level resolution