Normalize RNA-seq data in R
Qlucore Omics Explorer
Date Created
2011-09-19 16:52:31
Date Updated
2017-09-11 12:50:39
How can I normalize RNA-seq data in R to fit Qlucore Omics Explorer?
With Qlucore Omics Explorer you can import aligned BAM files directly. During the import you choose from several normalization options such as TMM and TPM. With the NGS module you can also analyze your data with the Genome browser and filters applied to variants and read coverage.

If you prefer an R-based workflow, the white paper "Analyzing RNA-seq data with Qlucore Omics Explorer" (Qlucore White Paper Analyzing RNA-seq data C.pdf) provides background and a description of the required steps.

We recommend that the observed gene counts are pre-processed according to the following steps before they are imported into Qlucore Omics Explorer:

Step 1: Normalize the counts for each sample by dividing by a correction factor based on the sequencing depth of the sample. The most straightforward estimate of sequencing depth is the total number of mapped reads. However, this measure can have serious drawbacks if the pool of expressed RNA differs between samples. In that case an additional correction factor, called “TMM” (trimmed mean of M values), can be estimated as described by Robinson and Oshlack (2010).

Step 2: Normalize the observed count for each gene by dividing by the length of the gene (the number of nucleotides).
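The two steps above can be sketched in base R as follows. This is an illustrative sketch only and not the contents of RNAseqNormalization.R: the function name normalize_counts_sketch, the scaling to "per million reads" and "per kilobase", and the optional norm_factors argument (a place to plug in TMM factors estimated elsewhere) are all assumptions made for the example, and the counts are made up.

```r
# Hypothetical helper; NOT the Normalize.RNAseqCounts function from the Qlucore script.
# counts:       numeric matrix, genes in rows and samples in columns (assumed layout)
# gene_lengths: one gene length in nucleotides per row of counts
# norm_factors: optional extra per-sample factors (e.g. TMM); defaults to 1 (no correction)
normalize_counts_sketch <- function(counts, gene_lengths,
                                    norm_factors = rep(1, ncol(counts))) {
  stopifnot(is.matrix(counts),
            length(gene_lengths) == nrow(counts),
            length(norm_factors) == ncol(counts))

  # Step 1: divide each sample (column) by its effective sequencing depth,
  # here the total count times the extra factor, in millions of reads.
  depth <- colSums(counts) * norm_factors / 1e6
  per_million <- sweep(counts, 2, depth, "/")

  # Step 2: divide each gene (row) by its length, here in kilobases.
  sweep(per_million, 1, gene_lengths / 1e3, "/")
}

# Made-up example: 3 genes, 2 samples, each sample with 100 reads in total.
counts <- matrix(c(10, 20, 70,
                   30, 30, 40),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("sample1", "sample2")))
gene_lengths <- c(geneA = 1000, geneB = 2000, geneC = 500)

normalized <- normalize_counts_sketch(counts, gene_lengths)
print(normalized)
```

In sample1, geneB has twice the raw count of geneA but is also twice as long, so after both steps the two genes receive the same normalized value.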

To make this easy, you can download the R script RNAseqNormalization.R. Its function Normalize.RNAseqCounts performs the two steps above.

The script assumes that the counts are stored in a p x N matrix and that the lengths of the respective genes are stored in the vector "TranscriptLengths". If this vector is absent, all genes will be assumed to have equal length.
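As a rough illustration of this input layout, the sketch below builds a small count matrix and a matching length vector. It assumes that p is the number of genes (rows) and N the number of samples (columns). How the script itself detects a missing "TranscriptLengths" vector is not described here, so the helper resolve_lengths and its NULL default are hypothetical, and all values are made up.

```r
# Hypothetical helper, not taken from RNAseqNormalization.R.
# Returns the supplied lengths, or equal lengths when none are given.
resolve_lengths <- function(counts, TranscriptLengths = NULL) {
  if (is.null(TranscriptLengths)) {
    # No lengths supplied: treat every gene as having the same length.
    return(rep(1, nrow(counts)))
  }
  # Lengths must line up with the rows (genes) of the count matrix.
  stopifnot(length(TranscriptLengths) == nrow(counts))
  TranscriptLengths
}

# p = 4 genes (rows), N = 3 samples (columns).
counts <- matrix(c(12, 0,  7,
                   30, 5,  9,
                    4, 1,  2,
                   50, 8, 21),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

lengths_given   <- resolve_lengths(counts, c(1500, 800, 2200, 950))
lengths_default <- resolve_lengths(counts)
print(dim(counts))
print(lengths_default)
```

With equal lengths, the length normalization divides every gene by the same constant, so only the sequencing depth correction changes the relative values.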

The Bioconductor package "GenomicFeatures" provides information about reference genomes and gene annotations, and the package "Rsamtools" includes the tools required to read BAM (SAM) files. "Rsamtools" also includes functionality to retrieve the number of counts per gene. The MapQuantifyRNAseqCounts.pdf document will guide you in generating gene lengths from a reference genome.
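One common way to define a gene length from an annotation is the number of nucleotides covered by the union of its exons. The base R sketch below illustrates that idea on a made-up exon table. It is not the procedure from MapQuantifyRNAseqCounts.pdf, and the table layout, the 1-based inclusive coordinates, and the helper union_length are assumptions. In practice the exon coordinates would come from a gene annotation rather than being typed in by hand.

```r
# Made-up exon table; coordinates are assumed to be 1-based and inclusive.
exons <- data.frame(
  gene  = c("geneA", "geneA", "geneA", "geneB"),
  start = c(100, 150, 400, 1000),
  end   = c(200, 250, 450, 1099)
)

# Hypothetical helper: length of the union of a set of intervals,
# so that nucleotides shared by overlapping exons are counted once.
union_length <- function(start, end) {
  ord <- order(start)
  start <- start[ord]
  end <- end[ord]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      # Overlapping or adjacent exon: extend the current merged interval.
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap: close the current interval and start a new one.
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

# One length per gene, named by gene.
gene_lengths <- sapply(split(exons, exons$gene),
                       function(g) union_length(g$start, g$end))
print(gene_lengths)
```

The resulting named vector must be in the same gene order as the rows of the count matrix before it is used for length normalization.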

To convert data in R to .gedata, see "Convert from R to Qlucore data file format".

Note that this script is developed as support for users of Qlucore Omics Explorer. It is not part of the Qlucore Omics Explorer software and is not tested in the same way. The responsibility for data correctness rests with the user.