A bioinformatic analysis of the spike glycoprotein & evolution of COVID-19

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of COVID-19, has produced the pandemic of the century owing to its high infectivity. SARS and MERS, the most notable earlier coronavirus outbreaks, infected only thousands of people, far fewer than COVID-19. COVID-19’s global spread has been attributed to its high rate of asymptomatic transmission and its explosive infectivity, mainly due to mutational changes in the spike glycoprotein. The purpose of this research is to understand and evaluate the divergent evolution of the spike glycoprotein in COVID-19 and other coronaviruses at the molecular level via bioinformatic analysis. A phylogenetic tree was constructed from spike glycoprotein sequences taken from viral genomes using the MEGA X program. Nucleotide composition analysis and a study of genome organization were carried out. Dot plot comparisons were performed using the EMBOSS dotmatcher program. Phylogenetic analysis produced four distinct clades, one for each coronavirus genus, with a common ancestral origin sometime in recent history. More importantly, COVID-19 and SARS formed their own subclade, suggesting that the spike glycoprotein sequence has evolved over time. Genome organization and nucleotide composition provided further evidence of mutational changes in the spike glycoprotein. The results of this study demonstrate the divergent evolution of coronaviruses. Mutational changes in the spike glycoprotein have resulted in more virulent forms of COVID-19.
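As a rough illustration of two of the analyses named above, nucleotide composition and dot plot comparison, the following Python sketch may help. It is not the authors' pipeline: the function names `nucleotide_composition` and `dot_plot` are hypothetical, the input sequences are toy strings rather than real spike sequences, and the dot plot scores windows by simple identity, whereas EMBOSS dotmatcher scores them with a substitution matrix.

```python
from collections import Counter


def nucleotide_composition(seq):
    """Return the fraction of A, C, G and T in a nucleotide sequence.

    U is counted as T so that RNA and DNA input give the same result.
    Ambiguous symbols such as N are ignored.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: counts[base] / total for base in "ACGT"}


def dot_plot(seq_a, seq_b, window=3, threshold=3):
    """Return (i, j) pairs where windows of the two sequences agree.

    A dot is recorded when the window starting at position i of seq_a
    and the window starting at position j of seq_b are identical in at
    least `threshold` positions. This identity count is a simplifying
    assumption, not the scoring scheme used by EMBOSS dotmatcher.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= threshold:
                dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(nucleotide_composition("AACG"))
    print(dot_plot("ACGT", "ACGT", window=2, threshold=2))
```

In a plot of the returned pairs, two closely related sequences give a long unbroken diagonal, while insertions, deletions and divergent regions appear as gaps or offsets in that diagonal.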
