Normalize RNA-seq data in R
How can I normalize RNA-seq data in R to fit Qlucore Omics Explorer?
With Qlucore Omics Explorer you can import aligned BAM files directly. During the import you choose from several normalization options such as TMM and TPM. With the NGS module you can also analyze your data with the Genome browser and filters applied to variants and read coverage.
If you prefer an R-based workflow, the white paper "Analyzing RNA-seq data with Qlucore Omics Explorer" (Qlucore White Paper Analyzing RNA-seq data C.pdf) provides background and a description of the required steps.
We recommend that the observed gene counts are pre-processed according to the following steps before they are imported into Qlucore Omics Explorer:
Step 1: Normalize the counts for each sample by dividing by a correction factor based on the sequencing depth of the sample. The most straightforward estimate of sequencing depth is the total number of mapped reads. However, this measure can have serious drawbacks if the pool of expressed RNA differs between two samples. In that case an additional correction factor, called “TMM” (trimmed mean of M values), can be estimated as described by Robinson and Oshlack (2010).
Step 2: Normalize the observed count for each gene by dividing by the length of the gene (the number of nucleotides).
To make this easy, you can download the R script RNAseqNormalization.R and use it in R. Its function Normalize.RNAseqCounts handles the two steps above.
The script assumes that the counts are stored in a p x N matrix and that the lengths of the respective genes are stored in the vector "TranscriptLengths". If this vector is absent, all genes are assumed to have equal length.
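As a rough illustration of the two steps and of the input layout, here is a minimal base-R sketch. It is not the code in RNAseqNormalization.R, and the helper name normalize_counts_sketch is hypothetical. It assumes that the p rows of the count matrix are genes and the N columns are samples, uses the column totals as the sequencing-depth estimate (no TMM factor), and rescales to per-million and per-kilobase units, which is a convention chosen here rather than something stated above.

```r
# Hypothetical helper for illustration only; NOT the Normalize.RNAseqCounts
# function shipped in RNAseqNormalization.R.
# counts: numeric matrix, assumed to be genes (rows) x samples (columns).
# transcript_lengths: gene lengths in nucleotides, one per row;
#                     NULL means all genes are treated as equally long.
normalize_counts_sketch <- function(counts, transcript_lengths = NULL) {
  counts <- as.matrix(counts)

  # Step 1: divide each sample (column) by its sequencing depth,
  # estimated here as the column total, expressed in millions of reads.
  depth <- colSums(counts)
  scaled <- sweep(counts, 2, depth / 1e6, FUN = "/")

  # Step 2: divide each gene (row) by its length, expressed in kilobases.
  if (is.null(transcript_lengths)) {
    transcript_lengths <- rep(1000, nrow(counts))  # equal lengths (1 kb each)
  }
  stopifnot(length(transcript_lengths) == nrow(counts))
  sweep(scaled, 1, transcript_lengths / 1e3, FUN = "/")
}

# Toy data: 3 genes x 2 samples; sample s2 was sequenced ten times deeper.
counts <- matrix(c(10, 20, 70,
                   100, 200, 700),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
TranscriptLengths <- c(500, 1000, 2000)

normalized <- normalize_counts_sketch(counts, TranscriptLengths)
print(normalized)
```

In this toy example the two samples differ only in sequencing depth, so their normalized columns come out identical. If a TMM correction is wanted, one possible route is the calcNormFactors function in the Bioconductor package edgeR, whose factors can be multiplied with the column totals to give an effective depth; whether that matches what the Qlucore script does internally is not confirmed here.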
The Bioconductor package "GenomicFeatures" provides more information about reference genomes and gene annotations, and the package "Rsamtools" includes the tools required to read BAM (SAM) files. "Rsamtools" also includes functionality to retrieve the number of counts per gene. The MapQuantifyRNAseqCounts.pdf document guides you through generating gene lengths from a reference genome.
To convert data in R to .gedata, see "Convert from R to Qlucore data file format".
Note that this script is developed as a support to users of Qlucore Omics Explorer. It is not part of the Qlucore Omics Explorer software and is not tested in the same way. The responsibility for data correctness is with the user.
- Analyzing Flow Cytometry data
- Anova in Qlucore
- Computation of R/R2 statistics
- Convert from R to Qlucore data file format
- Find favorite genes, proteins or variables using the search function
- How to import 10X single cell data
- How to import data (Affymetrix, Illumina, 10x, Agilent, Wizard, tab separated, csv, txt, RNA-seq, bam)
- Loading and creating annotations
- Low expression levels RNA-seq data filtering
- Multi group comparison and ANOVA
- Open interface to R (API)
- RNAseq and array technologies
- Statistical tests for the Extended option using the Open R API: Welch, Wilcoxon, Limma, Mann-Whitney
- Using Limma and Extended Statistics
- What data is used in the Two group comparison (t-test)
- Working with RNA-seq data (bam)