The severe acute respiratory syndrome COVID-19 was discovered on December 31, 2019, in China. Cases were subsequently reported in many other countries. However, in some of those countries, such as France and Italy, positive COVID-19 samples have been reported that predate the cases officially accepted by health authorities. It is therefore of great importance to determine where SARS-CoV-2 was first transmitted to humans. To this end, we analyze SARS-CoV-2 genomes using the k-mer natural vector method and compare the similarities of global SARS-CoV-2 genomes under a new natural metric. Because it is commonly accepted that SARS-CoV-2 originated from bat coronavirus RaTG13, we only need to determine which SARS-CoV-2 genome sequence is closest to RaTG13 under our natural metric. Our analysis indicates that SARS-CoV-2 most likely already existed in other countries, such as France, India, the Netherlands, England and the United States, before the outbreak in Wuhan, China.
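The comparison described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: the k-mer natural vector is taken to hold, for each k-mer, its count, its mean 1-based start position, and its second central moment of position normalized by the count times the number of k-mer windows (one common form in the k-mer natural vector literature); plain Euclidean distance stands in for the paper's "new natural metric", which is not defined here; and the function names `kmer_natural_vector`, `euclidean` and `closest_to_reference` are hypothetical.

```python
from itertools import product
import math


def kmer_natural_vector(seq, k):
    """Return a 3 * 4**k vector: counts, mean positions, second moments.

    Assumption: 1-based start positions, and second central moments
    normalized by (count * number of k-mer windows).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    positions = {kmer: [] for kmer in kmers}
    total = len(seq) - k + 1  # number of k-mer windows in the sequence
    for i in range(total):
        word = seq[i:i + k]
        if word in positions:  # skip windows containing ambiguous bases such as N
            positions[word].append(i + 1)
    counts, means, moments = [], [], []
    for kmer in kmers:
        pos = positions[kmer]
        n = len(pos)
        mu = sum(pos) / n if n else 0.0
        d2 = sum((p - mu) ** 2 for p in pos) / (n * total) if n else 0.0
        counts.append(float(n))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def euclidean(u, v):
    """Euclidean distance, a stand-in for the paper's natural metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_to_reference(reference, candidates, k=3):
    """Return the name of the candidate sequence nearest to the reference.

    `candidates` maps a sample name to its nucleotide sequence.
    """
    ref_vec = kmer_natural_vector(reference, k)
    return min(
        candidates,
        key=lambda name: euclidean(ref_vec, kmer_natural_vector(candidates[name], k)),
    )
```

In this sketch the reference would be the RaTG13 genome and `candidates` the collected SARS-CoV-2 genomes; the value of k and the metric actually used in the paper may differ. Because the vectors are computed without alignment, each genome is processed independently and the comparison scales linearly with the number of genomes.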
Shaojun PEI, Stephen S.-T. YAU. ANALYSIS OF THE GENOMIC DISTANCE BETWEEN BAT CORONAVIRUS RATG13 AND SARS-COV-2 REVEALS MULTIPLE ORIGINS OF COVID-19[J]. Acta mathematica scientia, Series B, 2021, 41(3): 1017-1022.
DOI: 10.1007/s10473-021-0323-x