Biology:Circular consensus sequencing

Short description: DNA sequencing method

Circular consensus sequencing (CCS) is a DNA sequencing method used in conjunction with single-molecule real-time sequencing to yield highly accurate long-read datasets, with read lengths averaging 15–25 kb and median accuracy greater than 99.9%.[1][2] These long reads, each a consensus sequence built from multiple passes over a single DNA molecule, can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes.[3]
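Read accuracy is often quoted on the Phred quality scale, where Q = −10·log10(error rate). The following minimal Python sketch shows that conversion; the function names are illustrative and do not come from any sequencing toolkit.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy in [0, 1) to a Phred quality score."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # An accuracy of 99.9% corresponds to roughly Q30 (1 error in 1,000 bases).
    print(round(accuracy_to_phred(0.999), 3))
    print(phred_to_accuracy(20))
```

On this scale, the 99.9% median accuracy quoted above corresponds to about Q30, or one expected error per thousand bases.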

CCS allows resolution of large or complex genomes of any species (such as the California redwood genome, which is nine times the size of the human genome), including detection of variants ranging from single nucleotide variants (SNVs) to structural variants, with high precision.[4][5] CCS also enables separation of the different copies of each chromosome (e.g., maternal and paternal in a diploid organism), known as haplotypes. CCS reads offer accuracy equivalent to that of short-read sequencing data, but with the length necessary for complex genome assemblies and for phasing variants across the genome.[6][7]
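Read length matters for phasing because only a read that spans two heterozygous sites can show which alleles sit on the same chromosome copy. The toy Python sketch below illustrates the idea. The read representation (a dictionary mapping position to observed allele) and the function name are assumptions made for illustration, not the data model of any real phasing tool.

```python
from collections import Counter


def phase_two_sites(reads, site_a, site_b):
    """Count allele pairs observed together on reads spanning both sites.

    `reads` is a list of dicts mapping a genomic position to the allele
    seen at that position (a hypothetical, simplified representation).
    Returns the two most frequently co-observed allele pairs.
    """
    pairs = Counter()
    for read in reads:
        # Only reads long enough to cover both sites are informative.
        if site_a in read and site_b in read:
            pairs[(read[site_a], read[site_b])] += 1
    return pairs.most_common(2)


if __name__ == "__main__":
    reads = [
        {100: "A", 5000: "G"},
        {100: "A", 5000: "G"},
        {100: "T", 5000: "C"},
        {100: "T", 5000: "C"},
        {100: "T"},            # too short to reach the second site
        {100: "A", 5000: "G"},
    ]
    print(phase_two_sites(reads, 100, 5000))
```

In this example the read covering only position 100 contributes nothing, which is why short reads cannot link variants that lie far apart.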

Technology

Revio SMRT cell.

In this method, circularized fragments of DNA in solution float across the surface of a nanofluidic chip called a SMRT (Single Molecule, Real-Time) Cell. The surface of the chip is covered with millions of wells called zero-mode waveguides (ZMWs), each tens of nanometers wide.[8] To prepare a sample for CCS/HiFi sequencing, primers and DNA polymerase are added to SMRTbell libraries. The circularized DNA becomes trapped in a ZMW, nucleotides are added, and the DNA polymerase begins to copy the molecule base by base. As each fluorescently labeled nucleotide is incorporated, a tiny pulse of light is emitted and read by a detector, which allows the sequencer’s computer to determine the order of bases in the sample. The circularized DNA is sequenced in repeated passes to ensure accuracy (hence the name “circular” consensus sequencing). The primers and adapters are then removed computationally to deliver a highly accurate consensus DNA read.[9]
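The benefit of repeated passes can be illustrated with a deliberately simplified model. The Python sketch below assumes that the passes are already aligned to equal length, that errors are substitutions only, and that errors are independent between passes. Production consensus callers use more elaborate statistical models, so this is a toy illustration rather than the algorithm used on the instrument, and all names are hypothetical.

```python
import math
from collections import Counter


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned passes."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")
    # For each aligned column, keep the most frequently observed base.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*passes))


def majority_error_probability(per_pass_error, n_passes):
    """Probability that more than half of n independent passes are wrong.

    Under the independence assumption this is an upper bound on the
    per-base error of a majority vote over an odd number of passes.
    """
    if n_passes % 2 == 0:
        raise ValueError("use an odd number of passes")
    k_min = n_passes // 2 + 1
    return sum(
        math.comb(n_passes, k)
        * per_pass_error ** k
        * (1.0 - per_pass_error) ** (n_passes - k)
        for k in range(k_min, n_passes + 1)
    )


if __name__ == "__main__":
    # Five noisy passes over the same molecule, each with one substitution
    # error at a different position (the first pass happens to be clean).
    passes = ["ACGTACGT", "ACCTACGT", "ACGTACGA", "TCGTACGT", "ACGTAGGT"]
    print(majority_consensus(passes))
    for n in (1, 3, 5, 9):
        print(n, majority_error_probability(0.1, n))
```

Under these assumptions, the bound on the consensus error falls quickly as the number of passes grows, which is the intuition behind building one accurate read from many noisier passes over the same molecule.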

In CCS, the genomic DNA is prepared without amplification such that individual base modifications such as methylation can be detected during sequencing. This allows for the capture of both sequence and valuable methylation information in a single experiment.[10]

History

This sequencing method was first described by Travers et al. in Nucleic Acids Research in 2010.[3] It was later commercialized by Pacific Biosciences in 2018 and made available on the Sequel II and Revio long-read sequencing instruments.[11][12]

CCS technology has subsequently been used in numerous studies across several fields, including human telomere-to-telomere whole genome assembly and pangenome research,[13][14][15] pediatric rare disease genomic analysis,[16][17] DNA methylation profiling in rare disease cohorts,[18] assembly of whole genomes of non-human vertebrates,[19] assembly of whole genomes of agriculturally significant species,[20] analysis of cancer genomes,[21][22] and metagenomics and microbial research, among others.[23][24]

Recognizing the importance of this technology in future genomic exploration and discovery, the editors of Nature Methods named long-read sequencing their Method of the Year for 2022.[25]

Applications

Human and conservation biology

CCS can be useful to researchers seeking to perform de novo sequence assembly or to study haplotype-phased sequences from each chromosomal copy, regardless of how many chromosomes are present in the species. Many biodiversity-oriented consortia have used the technology in their conservation biology studies, including the African BioGenome Project, California Conservation Genomics Project, Darwin Tree of Life, Desert Agriculture Initiative, Earth BioGenome Project, Global Ant Genomics Alliance, Human Pangenome, Telomere-to-Telomere Consortium, the 10,000 Fish Genomes Project, and the Vertebrate Genomes Project.[26][27][28]

Human health

Circular consensus sequencing helps researchers identify and characterize rare or structural variants with high confidence, and so better determine the genomic basis of a given phenotype. It has numerous applications in human health, including rare disease research, microbiology and infectious disease, cancer research, and other areas of genetic disease research.[29][30]

Rare diseases

Although each occurs at low frequency in the human population, rare diseases are collectively common, and most have a genetic cause, presenting unique diagnostic challenges. An estimated 50–80% of structural variants are tandem repeats.[31]

Because CCS provides a comprehensive view of variation in the human genome, producing complete, accurate, and phased assemblies for variant calling, identification of repeat expansions and medically relevant interruption sequences, it is enabling the identification of causative pathogenic variants and helping researchers discover novel disease-associated genes.[32]

Microbiology and infectious diseases

Circular consensus sequencing can rapidly identify emerging pathogens and detect changes in pathogen genomes as part of regional or global surveillance operations. Where other molecular technologies for public health surveillance may require re-validation or the development of new panels, the unbiased nature of circular consensus sequencing delivers comprehensive genetic information to further characterize global outbreaks, pandemics, and epidemics.[12]

Cancer research

Comprehensive resolution of structural variants enables researchers to better study and detect somatic variants driving cancer. Because of their size (>50 bp), structural variants and tandem repeats account for much genomic variation between individuals.[33]
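The size threshold mentioned above can be made concrete with a small Python sketch that sorts a variant into a coarse class from the lengths of its reference and alternate alleles. The function name, the class labels, and the handling of edge cases are illustrative assumptions. The 50 bp cutoff follows the text, and rearrangements that do not change allele length (such as inversions) are not captured by this simple rule.

```python
def classify_variant(ref: str, alt: str, sv_min_bp: int = 50) -> str:
    """Coarsely classify a variant by comparing allele lengths.

    `sv_min_bp` is the size above which a length change is treated as a
    structural variant (50 bp here, following the text).
    """
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNV"
    size = abs(len(ref) - len(alt))
    if size > sv_min_bp:
        return "structural variant"
    if size > 0:
        return "small indel"
    # Same-length changes other than a single base (or no change at all).
    return "other"


if __name__ == "__main__":
    print(classify_variant("A", "G"))             # single-base substitution
    print(classify_variant("ACG", "A"))           # 2 bp deletion
    print(classify_variant("A", "A" + "T" * 60))  # 60 bp insertion
```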

Long-read RNA sequencing can be useful in cancer research to uncover sources of alternative splicing and fusion events which power cancer growth.[34][35][36][37] CCS also provides an advantage over other sequencing technologies as it can provide phasing information of expressed mutations.[38]

References

  1. Mastrorosa, Francesco Kumara; Miller, Danny E.; Eichler, Evan E. (2023-06-14). "Applications of long-read sequencing to Mendelian genetics". Genome Medicine 15 (1): 42. doi:10.1186/s13073-023-01194-3. ISSN 1756-994X. PMID 37316925. 
  2. Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Chang, Pi-Chuan; Hall, Richard J.; Concepcion, Gregory T.; Ebler, Jana; Fungtammasan, Arkarachai et al. (2019-10-12). "Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome" (in en). Nature Biotechnology 37 (10): 1155–1162. doi:10.1038/s41587-019-0217-9. ISSN 1546-1696. PMID 31406327. 
  3. Travers, K. J.; Chin, C.-S.; Rank, D. R.; Eid, J. S.; Turner, S. W. (2010-08-01). "A flexible and efficient template format for circular consensus sequencing and SNP detection" (in en). Nucleic Acids Research 38 (15): e159. doi:10.1093/nar/gkq543. ISSN 0305-1048. PMID 20571086.
  4. Sharma, Priyanka; Masouleh, Ardashir Kharabian; Topp, Bruce; Furtado, Agnelo; Henry, Robert J. (February 2022). "De novo chromosome level assembly of a plant genome from long read sequence data". The Plant Journal 109 (3): 727–736. doi:10.1111/tpj.15583. ISSN 0960-7412. PMID 34784084. 
  5. Cheng, Haoyu; Concepcion, Gregory T; Feng, Xiaowen; Zhang, Haowen; Li, Heng (2021-02-01). "Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm". Nature Methods 18 (2): 170–175. doi:10.1038/s41592-020-01056-5. ISSN 1548-7091. PMID 33526886. 
  6. Cheng, Haoyu; Concepcion, Gregory T.; Feng, Xiaowen; Zhang, Haowen; Li, Heng (2021-02-01). "Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm" (in en). Nature Methods 18 (2): 170–175. doi:10.1038/s41592-020-01056-5. ISSN 1548-7105. PMID 33526886. 
  7. Nurk, Sergey; Walenz, Brian P.; Rhie, Arang; Vollger, Mitchell R.; Logsdon, Glennis A.; Grothe, Robert; Miga, Karen H.; Eichler, Evan E. et al. (2020-09-01). "HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads" (in en). Genome Research 30 (9): 1291–1305. doi:10.1101/gr.263566.120. ISSN 1088-9051. PMID 32801147. PMC 7545148. https://genome.cshlp.org/content/30/9/1291. 
  8. Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David et al. (2009-01-02). "Real-Time DNA Sequencing from Single Polymerase Molecules" (in en). Science 323 (5910): 133–138. doi:10.1126/science.1162986. ISSN 0036-8075. PMID 19023044. Bibcode: 2009Sci...323..133E. https://www.science.org/doi/10.1126/science.1162986.
  9. Travers, K. J.; Chin, C.-S.; Rank, D. R.; Eid, J. S.; Turner, S. W. (2010-06-22). "A flexible and efficient template format for circular consensus sequencing and SNP detection" (in en). Nucleic Acids Research 38 (15): e159. doi:10.1093/nar/gkq543. ISSN 0305-1048. PMID 20571086. PMC 2926623. https://doi.org/10.1093/nar/gkq543. 
  10. Flusberg, Benjamin A.; Webster, Dale R.; Lee, Jessica H.; Travers, Kevin J.; Olivares, Eric C.; Clark, Tyson A.; Korlach, Jonas; Turner, Stephen W. (2010-05-09). "Direct detection of DNA methylation during single-molecule, real-time sequencing" (in en). Nature Methods 7 (6): 461–465. doi:10.1038/nmeth.1459. ISSN 1548-7105. PMID 20453866. 
  11. Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Chang, Pi-Chuan; Hall, Richard J.; Concepcion, Gregory T.; Ebler, Jana; Fungtammasan, Arkarachai et al. (2019-08-12). "Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome" (in en). Nature Biotechnology 37 (10): 1155–1162. doi:10.1038/s41587-019-0217-9. ISSN 1546-1696. PMID 31406327. 
  12. Oehler, Josephine B.; Wright, Helen; Stark, Zornitza; Mallett, Andrew J.; Schmitz, Ulf (2023-08-08). "The application of long-read sequencing in clinical settings". Human Genomics 17 (1): 73. doi:10.1186/s40246-023-00522-3. ISSN 1479-7364. PMID 37553611.
  13. Jarvis, Erich D.; Formenti, Giulio; Rhie, Arang; Guarracino, Andrea; Yang, Chentao; Wood, Jonathan; Tracey, Alan; Thibaud-Nissen, Francoise et al. (2022-11-22). "Semi-automated assembly of high-quality diploid human reference genomes" (in en). Nature 611 (7936): 519–531. doi:10.1038/s41586-022-05325-5. ISSN 1476-4687. PMID 36261518. Bibcode: 2022Natur.611..519J.
  14. Nurk, Sergey; Koren, Sergey; Rhie, Arang; Rautiainen, Mikko; Bzikadze, Andrey V.; Mikheenko, Alla; Vollger, Mitchell R.; Altemose, Nicolas et al. (2022-03-31). "The complete sequence of a human genome" (in en). Science 376 (6588): 44–53. doi:10.1126/science.abj6987. ISSN 0036-8075. PMID 35357919. Bibcode: 2022Sci...376...44N.
  15. Gao, Yang; Yang, Xiaofei; Chen, Hao; Tan, Xinjiang; Yang, Zhaoqing; Deng, Lian; Wang, Baonan; Kong, Shuang et al. (2023-06-14). "A pangenome reference of 36 Chinese populations" (in en). Nature 619 (7968): 112–121. doi:10.1038/s41586-023-06173-7. ISSN 1476-4687. PMID 37316654. Bibcode: 2023Natur.619..112G.
  16. Cohen, Ana S.A.; Farrow, Emily G.; Abdelmoity, Ahmed T.; Alaimo, Joseph T.; Amudhavalli, Shivarajan M.; Anderson, John T.; Bansal, Lalit; Bartik, Lauren et al. (June 2022). "Genomic answers for children: Dynamic analyses of >1000 pediatric rare disease genomes" (in en). Genetics in Medicine 24 (6): 1336–1348. doi:10.1016/j.gim.2022.02.007. PMID 35305867. https://linkinghub.elsevier.com/retrieve/pii/S1098360022006530. 
  17. Sanford Kobayashi, Erica; Batalov, Serge; Wenger, Aaron M.; Lambert, Christine; Dhillon, Harsharan; Hall, Richard J.; Baybayan, Primo; Ding, Yan et al. (2022-10-09). "Approaches to long-read sequencing in a clinical setting to improve diagnostic rate" (in en). Scientific Reports 12 (1): 16945. doi:10.1038/s41598-022-20113-x. ISSN 2045-2322. PMID 36210382. Bibcode: 2022NatSR..1216945S.
  18. Cheung, Warren A.; Johnson, Adam F.; Rowell, William J.; Farrow, Emily; Hall, Richard; Cohen, Ana S. A.; Means, John C.; Zion, Tricia N. et al. (2023-05-29). "Direct haplotype-resolved 5-base HiFi sequencing for genome-wide profiling of hypermethylation outliers in a rare disease cohort" (in en). Nature Communications 14 (1): 3090. doi:10.1038/s41467-023-38782-1. ISSN 2041-1723. PMID 37248219. Bibcode: 2023NatCo..14.3090C.
  19. Rhie, Arang; McCarthy, Shane A.; Fedrigo, Olivier; Damas, Joana; Formenti, Giulio; Koren, Sergey; Uliano-Silva, Marcela; Chow, William et al. (2021-04-29). "Towards complete and error-free genome assemblies of all vertebrate species" (in en). Nature 592 (7856): 737–746. doi:10.1038/s41586-021-03451-0. ISSN 0028-0836. PMID 33911273. Bibcode: 2021Natur.592..737R.
  20. Chen, Jian; Wang, Zijian; Tan, Kaiwen; Huang, Wei; Shi, Junpeng; Li, Tong; Hu, Jiang; Wang, Kai et al. (July 2023). "A complete telomere-to-telomere assembly of the maize genome" (in en). Nature Genetics 55 (7): 1221–1231. doi:10.1038/s41588-023-01419-6. ISSN 1546-1718. PMID 37322109. 
  21. Veiga, Diogo F. T.; Nesta, Alex; Zhao, Yuqi; Mays, Anne Deslattes; Huynh, Richie; Rossi, Robert; Wu, Te-Chia; Palucka, Karolina et al. (2022-01-21). "A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer" (in en). Science Advances 8 (3): eabg6711. doi:10.1126/sciadv.abg6711. ISSN 2375-2548. PMID 35044822. Bibcode: 2022SciA....8.6711V.
  22. Choy, L Y Lois; Peng, Wenlei; Jiang, Peiyong; Cheng, Suk Hang; Yu, Stephanie C Y (19 May 2022). "Single-Molecule Sequencing Enables Long Cell-Free DNA Detection and Direct Methylation Analysis for Cancer Patients". Clinical Chemistry 68 (9): 1151–1163. doi:10.1093/clinchem/hvac086. PMID 35587130. https://academic.oup.com/clinchem/article/68/9/1151/6588669. 
  23. Reiter, Taylor E.; Brown, C. Titus (2022-03-22). "MAGs achieve lineage resolution" (in en). Nature Microbiology 7 (2): 193–194. doi:10.1038/s41564-021-01027-2. ISSN 2058-5276. PMID 34980920. https://www.nature.com/articles/s41564-021-01027-2. 
  24. Oyewole, Oluwaseun Rume-Abiola; Latzin, Philipp; Brugger, Silvio D.; Hilty, Markus (2022-09-22). "Strain-level resolution and pneumococcal carriage dynamics by single-molecule real-time (SMRT) sequencing of the plyNCR marker: a longitudinal study in Swiss infants". Microbiome 10 (1): 152. doi:10.1186/s40168-022-01344-6. ISSN 2049-2618. PMID 36138483. 
  25. Marx, Vivien (2023-01-12). "Method of the year: long-read sequencing" (in en). Nature Methods 20 (1): 6–11. doi:10.1038/s41592-022-01730-w. ISSN 1548-7105. PMID 36635542. https://www.nature.com/articles/s41592-022-01730-w. 
  26. Nurk, Sergey; Koren, Sergey; Rhie, Arang; Rautiainen, Mikko; Bzikadze, Andrey V.; Mikheenko, Alla; Vollger, Mitchell R.; Altemose, Nicolas et al. (2022-03-31). "The complete sequence of a human genome" (in en). Science 376 (6588): 44–53. doi:10.1126/science.abj6987. ISSN 0036-8075. PMID 35357919. Bibcode: 2022Sci...376...44N.
  27. Aganezov, Sergey; Yan, Stephanie M.; Soto, Daniela C.; Kirsche, Melanie; Zarate, Samantha; Avdeyev, Pavel; Taylor, Dylan J.; Shafin, Kishwar et al. (2022-04-01). "A complete reference genome improves analysis of human genetic variation" (in en). Science 376 (6588): eabl3533. doi:10.1126/science.abl3533. ISSN 0036-8075. PMID 35357935. 
  28. Vollger, Mitchell R.; Guitart, Xavi; Dishuck, Philip C.; Mercuri, Ludovica; Harvey, William T.; Gershman, Ariel; Diekhans, Mark; Sulovari, Arvis et al. (2022-04-01). "Segmental duplications and their variation in a complete human genome" (in en). Science 376 (6588): eabj6965. doi:10.1126/science.abj6965. ISSN 0036-8075. PMID 35357917. 
  29. Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Chang, Pi-Chuan; Hall, Richard J.; Concepcion, Gregory T.; Ebler, Jana; Fungtammasan, Arkarachai et al. (2019-08-12). "Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome" (in en). Nature Biotechnology 37 (10): 1155–1162. doi:10.1038/s41587-019-0217-9. ISSN 1546-1696. PMID 31406327. 
  30. Salk, Jesse J.; Schmitt, Michael W.; Loeb, Lawrence A. (2018-03-26). "Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations" (in en). Nature Reviews. Genetics 19 (5): 269–285. doi:10.1038/nrg.2017.117. PMID 29576615. 
  31. English, Adam C.; Menon, Vipin K.; Gibbs, Richard A.; Metcalf, Ginger A.; Sedlazeck, Fritz J. (2022-12-27). "Truvari: refined structural variant comparison preserves allelic diversity". Genome Biology 23 (1): 271. doi:10.1186/s13059-022-02840-6. ISSN 1474-760X. PMID 36575487. 
  32. "Customer Success Story: Experts at Children's Mercy Kansas City Turn to Long-Read Whole Genome Sequencing to Find Answers for Rare Diseases" (in en-US). https://www.pacb.com/learn/case-studies/customer-success-story-childrens-mercy-kansas-city/. 
  33. Ebert, Peter; Audano, Peter A.; Zhu, Qihui; Rodriguez-Martin, Bernardo; Porubsky, David; Bonder, Marc Jan; Sulovari, Arvis; Ebler, Jana et al. (2021-04-02). "Haplotype-resolved diverse human genomes and integrated analysis of structural variation". Science 372 (6537): eabf7117. doi:10.1126/science.abf7117. ISSN 1095-9203. PMID 33632895. 
  34. Veiga, Diogo F. T.; Nesta, Alex; Zhao, Yuqi; Deslattes Mays, Anne; Huynh, Richie; Rossi, Robert; Wu, Te-Chia; Palucka, Karolina et al. (2022-01-21). "A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer". Science Advances 8 (3): eabg6711. doi:10.1126/sciadv.abg6711. ISSN 2375-2548. PMID 35044822. Bibcode: 2022SciA....8.6711V.
  35. Mikheenko, Alla; Prjibelski, Andrey D.; Joglekar, Anoushka; Tilgner, Hagen U. (2022-04-01). "Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns" (in en). Genome Research 32 (4): 726–737. doi:10.1101/gr.276405.121. PMID 35301264. 
  36. Nattestad, Maria; Goodwin, Sara; Ng, Karen; Baslan, Timour; Sedlazeck, Fritz J.; Rescheneder, Philipp; Garvin, Tyler; Fang, Han et al. (2018-08-28). "Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line". Genome Research 28 (8): 1126–1135. doi:10.1101/gr.231100.117. ISSN 1549-5469. PMID 29954844. 
  37. Miller, Anthony R.; Wijeratne, Saranga; McGrath, Sean D.; Schieffer, Kathleen M.; Miller, Katherine E.; Lee, Kristy; Mathew, Mariam; LaHaye, Stephanie et al. (December 2022). "Pacific Biosciences Fusion and Long Isoform Pipeline for Cancer Transcriptome–Based Resolution of Isoform Complexity" (in en). The Journal of Molecular Diagnostics 24 (12): 1292–1306. doi:10.1016/j.jmoldx.2022.09.003. PMID 36191838. https://linkinghub.elsevier.com/retrieve/pii/S1525157822002653. 
  38. Olson, Nathan D.; Wagner, Justin; McDaniel, Jennifer; Stephens, Sarah H.; Westreich, Samuel T.; Prasanna, Anish G.; Johanson, Elaine; Boja, Emily et al. (2022-05-11). "PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions". Cell Genomics 2 (5): 100129. doi:10.1016/j.xgen.2022.100129. ISSN 2666-979X. PMID 35720974.