What to know
The University of California, Santa Cruz (UCSC) advanced the understanding of SARS-CoV-2 by developing a genomic data platform for both scientists and public health officials. The platform simplified interactions with genomic data and enabled users to rapidly cross-reference and identify genomic variation data sets in a unified setting. It provided downloads of SARS-CoV-2 evolutionary maps both as scientific diagrams and in plain language to facilitate widespread use. Together, these efforts ensured that users had access to complete, accurate, and understandable genome sequence data for all SARS-CoV-2 variants.
New and improved sequencing software tools
This project:
- Expanded and improved data aggregation, display, and visualization within the UCSC SARS-CoV-2 Genome Browser through development of the following tools:
  - Cluster-Tracker rapidly identifies strains that have recently been introduced into, and transmitted within, a region, along with the likely geographic origin of each strain. Its documentation allows other jurisdictions to build their own versions of the tool.1
  - The matUtils tool suite enables rapid queries, interpretation, and manipulation of mutation-annotated phylogenetic trees.2
  - Big Tree Explorer allows effective analysis of global SARS-CoV-2 and other pathogen phylogenies.
- Developed a database of SARS-CoV-2 phylogenetic trees, updated daily from public data, that provides a comprehensive view of the virus's evolutionary history.2
- Developed matOptimize, a parallel tree optimization method that enables online phylogenetics for SARS-CoV-2. Phylogenetics is the study of evolutionary relationships among organisms; in online phylogenetics, trees are updated continually as new sequences arrive. The method supports significantly larger workloads (e.g., extremely large data sets arriving daily) when refining SARS-CoV-2 phylogenetic trees. It uses and maintains the same libraries as UShER, a program that places new genome sequences onto an existing phylogenetic tree. matOptimize can also be installed with workflows available on free, open-source platforms (e.g., Conda, Dockstore).3
- Developed ShUShER, a private, client-side port of UShER that performs phylogenetic placement of private genome sequence data, so that sensitive sequences can be analyzed behind a firewall. Codebase and documentation are available.4
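To make the idea of a mutation-annotated tree and of phylogenetic placement more concrete, the following is a minimal sketch in Python. It is not the UShER, matUtils, or ShUShER code: the `Node` class, the `genotypes` and `place_sample` functions, and the tree itself are hypothetical and exist only for illustration. The sketch assumes each branch simply adds mutations (it ignores reversions) and picks the node whose accumulated mutations differ least from the new sample.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a toy mutation-annotated tree (hypothetical structure)."""
    name: str
    mutations: frozenset = frozenset()  # mutations on the branch leading to this node
    children: list = field(default_factory=list)


def genotypes(root):
    """Map each node name to the set of mutations accumulated from the root."""
    result = {}
    stack = [(root, frozenset())]
    while stack:
        node, inherited = stack.pop()
        full = inherited | node.mutations
        result[node.name] = full
        for child in node.children:
            stack.append((child, full))
    return result


def place_sample(root, sample_mutations):
    """Return (node name, cost) for the node closest to the sample.

    Cost is the size of the symmetric difference between the node's
    accumulated mutations and the sample's mutations. Ties go to the
    name that sorts first.
    """
    sample = frozenset(sample_mutations)
    costs = {name: len(geno ^ sample) for name, geno in genotypes(root).items()}
    best = min(sorted(costs), key=costs.get)
    return best, costs[best]


# Toy tree; the mutation labels are used purely as example strings.
leaf_a = Node("A", frozenset({"C241T"}))
leaf_b = Node("B", frozenset({"A23403G", "G28881A"}))
inner = Node("inner", frozenset({"C3037T"}), [leaf_a, leaf_b])
root = Node("root", frozenset(), [inner])

new_sample = {"C3037T", "A23403G", "G28881A", "C14408T"}
print(place_sample(root, new_sample))
```

In this toy case the new sample lands next to node `B`, needing one extra mutation. Storing mutations on branches rather than full sequences at every node is what keeps this kind of tree compact when it holds very large numbers of genomes.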
New SARS-CoV-2 dashboards and websites
RIVET is a platform for exploring putatively recombinant SARS-CoV-2 lineages. The platform runs the RIPPLES algorithm daily to identify potential recombinant lineages and provides exhaustive quality control to support their exploration.
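The intuition behind recombination detection can be shown with a small Python sketch. This is not the RIPPLES algorithm or RIVET code. It is a simplified, hypothetical single-breakpoint scan, and the function name `best_breakpoint`, the positions, and the alleles are all assumptions made for illustration. It checks whether a query genome is better explained by one candidate parent on the left of a breakpoint and another on the right than by either parent alone.

```python
def best_breakpoint(query, donor, acceptor, positions):
    """Scan every single breakpoint and return (breakpoint_index, mismatches).

    Sites before the breakpoint are compared with `donor`, the rest with
    `acceptor`. Each genome is a dict mapping a position to its allele.
    Index 0 means "all acceptor"; index len(positions) means "all donor".
    """
    best = None
    for i in range(len(positions) + 1):
        left, right = positions[:i], positions[i:]
        mismatches = sum(query[p] != donor[p] for p in left)
        mismatches += sum(query[p] != acceptor[p] for p in right)
        if best is None or mismatches < best[1]:
            best = (i, mismatches)
    return best


# Toy data: five variable sites, listed in genome order.
positions = [241, 3037, 14408, 23403, 28881]
donor = {241: "T", 3037: "T", 14408: "C", 23403: "A", 28881: "G"}
acceptor = {241: "C", 3037: "C", 14408: "T", 23403: "G", 28881: "A"}
query = {241: "T", 3037: "T", 14408: "T", 23403: "G", 28881: "A"}

print(best_breakpoint(query, donor, acceptor, positions))
```

Here the query matches the donor at the first two sites and the acceptor at the remaining three, so a breakpoint after the second site explains it with no mismatches, while either parent alone leaves at least two. A large improvement of this kind is the sort of signal that flags a lineage as putatively recombinant and worth closer quality control.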
References
1. Identifying SARS-CoV-2 Regional Introductions and Transmission Clusters in Real Time. Virus Evolution, 2022.
2. A Daily-Updated Database and Tools for Comprehensive SARS-CoV-2 Mutation-Annotated Trees. Molecular Biology and Evolution, 2021.
3. matOptimize: A Parallel Tree Optimization Method Enables Online Phylogenetics for SARS-CoV-2. Bioinformatics, vol. 38, 2022.
4. ShUShER: A Private Browser-based Placement of Sensitive Genome Samples on Phylogenetic Trees. The Journal of Open Source Software, 2021.