UTA – Universal Transcript Archive

UTA is a data archive and set of tools that aims to improve the precision of sequence-based descriptions of genomic variation. It will eventually contain transcripts and their alignments to multiple reference genomes, generated with multiple algorithms and obtained from NCBI, Ensembl, LRG, and UCSC.

Contents

Overview

UTA stores transcripts aligned to genome reference assemblies using multiple alignment methods, in order to improve the precision and accuracy with which the scientific and clinical communities describe variants.

It facilitates the following:

  • interpreting variants reported in the literature against obsolete transcript records
  • identifying regions where transcript and reference genome sequence assemblies disagree
  • characterizing transcripts of the same gene across transcript sources
  • projecting (“lifting over”) variants from one transcript to another
  • identifying transcripts and genomic regions with ambiguous alignments that may affect clinical interpretation
  • querying multiple transcript sources through a single interface

Examples

Getting Started

Installation

Database

Schema

Database Loading

Modules

uta.db.transcriptdb

Indices and tables