November 5, 2014

Create a REFSEQ transcript database

Transcript databases (TxDb) in Bioconductor are very good annotation packages. They help researchers annotate genomic regions of interest against genic elements such as exons, introns, UTRs, CDS and genes. For the human genome, Bioconductor offers TxDb packages only for the UCSC knownGene track. Here, I share the code needed to generate a human RefSeq TxDb using the Bioconductor package "GenomicFeatures".

The following code/function can be used to generate a TxDb of choice for any organism of interest. It is a very simple function. However, its name and its default parameters hide its full potential for creating a variety of databases. In other words, the same function can generate a TxDb from any of the transcript tables at UCSC that it supports.
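As a minimal sketch of the idea, assuming the function in question is makeTxDbFromUCSC (called makeTranscriptDbFromUCSC in older GenomicFeatures releases, and moved to the txdbmaker package in recent ones), a RefSeq TxDb could be built roughly as follows. The helper name build_refseq_txdb, the genome build "hg19" and the output file name are my own illustrative choices, and the call needs internet access to reach UCSC.

```r
# Sketch only: build a TxDb from the UCSC RefSeq track (table "refGene").
# build_refseq_txdb is a hypothetical helper name, not part of any package.
build_refseq_txdb <- function(genome = "hg19", tablename = "refGene") {
  # Assumption: either 'txdbmaker' (recent Bioconductor) or
  # 'GenomicFeatures' (older releases) is installed and exports
  # makeTxDbFromUCSC.
  pkg <- if (requireNamespace("txdbmaker", quietly = TRUE)) {
    "txdbmaker"
  } else {
    "GenomicFeatures"
  }
  make_txdb <- getExportedValue(pkg, "makeTxDbFromUCSC")

  # Downloads the table from UCSC, so this step needs a network connection.
  make_txdb(genome = genome, tablename = tablename)
}

# Run the download only in an interactive session.
if (interactive()) {
  txdb <- build_refseq_txdb()

  # Save the database so it can be reloaded later with AnnotationDbi::loadDb().
  AnnotationDbi::saveDb(txdb, file = "hg19_refGene.sqlite")

  # Extract genic elements from the new TxDb, e.g. all exons.
  refseq_exons <- GenomicFeatures::exons(txdb)
}
```

Changing the genome and tablename arguments is what lets the same call target other organisms and other UCSC tracks; supportedUCSCtables() in the same package should list the table names accepted for a given genome.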