Bioinformatics-aided identification of SARS-CoV-2 (COVID-19)

The sharp increase in the number of discovered coronaviruses and sequenced coronavirus genomes has given us an unprecedented opportunity to carry out genomic and bioinformatic analyses of this virus family. Coronaviruses have one of the largest genomes (approximately 30 kb) among all known RNA viruses. The genome carries several genes (open reading frame 1ab [ORF1ab], spike, envelope, membrane and nucleocapsid). The family Coronaviridae comprises four genera (Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus), and Betacoronavirus is further divided phylogenetically into four subgroups, A, B, C and D. Molecular analysis using various coronavirus gene loci has shown that the 2003 SARS coronavirus changed relative to its ancestral virus at an annual rate of between 4×10⁻⁴ and 2×10⁻². Coronaviruses are also prone to recombination, which has been reported among different murine hepatitis viruses (MHV), among different infectious bronchitis viruses, between MHV and bovine coronaviruses, between feline coronavirus (FCoV) type 1 and canine coronavirus, and among human coronaviruses. Because these viruses are so open to mutation and genetic change, bioinformatic analyses, and the virus characterizations reported from them, are needed to investigate and understand their continually changing interactions. In this review, we highlight current progress in coronavirus bioinformatics, a lesser-known field in need of further research, and discuss potential directions for future development. We give an overview of the techniques that are currently available and widely used, and summarize some of the main advantages and disadvantages that bioinformatics brings to virology, with a focus on SARS-CoV-2.
