An approach for estimating haplotype diversity from sequences with unequal lengths

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

  • Ping Fan
  • Jon Fjeldså
  • Xuan Liu
  • Yafei Dong
  • Yongbin Chang
  • Yanhua Qu
  • Gang Song
  • Fumin Lei

Genetic diversity is an essential component of biodiversity. Robust quantification methods are critical for depicting the genetic diversity underlying the geographical distributions of species, especially for sequence data of unequal lengths. Traditional calculations of genetic diversity depend on sequences of equal length. However, many homologous sequences downloaded from online repositories vary in length, which poses a significant challenge for quantifying genetic diversity, especially haplotype diversity. We developed a new approach that is independent of sequence length by applying the same parameters used in calculating nucleotide diversity to estimate haplotype diversity. We compared this novel approach with calculations by the program DnaSP, and we used simulation data from terrestrial vertebrates (birds, mammals and amphibians) and Homo sapiens to validate the method's performance. We further applied this approach to explore the global latitudinal gradients of haplotype diversity in amphibians, mammals and birds, and compared the results with those from traditional methods. The haplotype diversity calculated by our novel approach is consistent with the results from DnaSP. The simulations showed that our approach is robust and performs well on sequence data with unequal lengths. For the datasets of terrestrial vertebrates and H. sapiens, our approach is capable of estimating haplotype diversity from intraspecific sequences of unequal length. In contrast to patterns based on traditional methods, we observed different latitudinal patterns of haplotype diversity between the northern and southern hemispheres for terrestrial vertebrates, a finding consistent with the updated pattern of nucleotide diversity for mammals. The present work contributes to the development of more precise quantification methods, which may be broadly applied to assessing biogeographical patterns of genetic diversity.
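To make the idea of a length-independent estimate concrete, the following is a minimal Python sketch and not the authors' implementation. It relies on a standard identity: Nei's haplotype diversity, n/(n-1) * (1 - sum of squared haplotype frequencies), equals the fraction of sequence pairs that are different haplotypes. The sketch assumes that sequences are already aligned, that missing positions are padded with `-`, and that two sequences count as different only if they disagree at a site where both have a base. These rules and the function names (`differ`, `haplotype_diversity_pairwise`, `haplotype_diversity_nei`) are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import combinations

GAP = "-"  # assumed padding symbol for positions a sequence does not cover


def differ(a, b):
    """Return True if two aligned sequences disagree at any shared site.

    Sites where either sequence has a gap are skipped, so only the
    overlapping region is compared (an assumption of this sketch).
    """
    return any(x != y for x, y in zip(a, b) if x != GAP and y != GAP)


def haplotype_diversity_pairwise(seqs):
    """Fraction of sequence pairs that differ over their shared sites."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(differ(a, b) for a, b in pairs) / len(pairs)


def haplotype_diversity_nei(seqs):
    """Classical estimate for equal-length sequences:
    n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


# Equal-length input: both estimators agree.
equal = ["ACGT", "ACGT", "ACGA", "TCGA"]
h_pair_equal = haplotype_diversity_pairwise(equal)
h_nei_equal = haplotype_diversity_nei(equal)

# Unequal coverage: the pairwise estimator still works, whereas treating
# the gap-padded strings as distinct haplotypes would inflate diversity.
unequal = ["ACGT", "ACG-", "AC--", "TCGT"]
h_pair_unequal = haplotype_diversity_pairwise(unequal)
```

The pairwise form avoids assigning every sequence to a single global haplotype, which is the step that breaks down when sequences cover different regions. How ambiguous pairs should be scored in practice is a design choice that this sketch does not settle.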

Original language: English
Journal: Methods in Ecology and Evolution
Volume: 12
Issue number: 9
Pages (from-to): 1658-1667
Number of pages: 10
ISSN: 2041-210X
DOI
Status: Published - 2021

Bibliographical note

Funding Information:
We thank Xiaolu Jiao and Xin Yu for assistance with data analysis, Weiwei Zhai, Liang Ma and Hechuan Yang for discussion and Huijie Qiao for his generous help during our revision. This study was funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19050202 to F.L.), the Second Tibetan Plateau Scientific Expedition and Research (STEP) Program (2019QZKK0304 to F.L. and G.S.), National Science Foundation of China (32070434 & 31572291 to G.S.; 31630069 to F.L.), the National Science and Technology Basic Resources Survey Program of China (2019FY100204 to F.L. and P.F.) and China Scholarship Council, Grant/Award Number: [2017]7011 to P.F.

Publisher Copyright:
© 2021 British Ecological Society

ID: 273697299