One way to handle normalization using housekeeping genes is to compute a 'normalization factor' for each sample from the housekeeping genes, and then apply this factor to the remaining genes as well (note that this assumes that any external factors affect all genes in the same way). The procedure is:
1. Select a reference sample. This could be one of the actual samples, or a sample constructed as the average over all samples.
2. Use the housekeeping genes to compute a normalization factor for each of the samples relative to the reference sample. The factor could, for example, be the sample's average over the housekeeping genes minus the reference sample's average over the same genes.
3. Normalize all of the genes in each of the samples by e.g. subtracting the normalization factor computed in step 2.
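The three steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the data is a genes x samples matrix on an additive scale, the reference is the average over all samples, and the names `normalize_subtractive` and `housekeeping_idx` are made up for this example.

```python
import numpy as np

def normalize_subtractive(data, housekeeping_idx, reference=None):
    """Subtractive housekeeping normalization (illustrative sketch).

    data: genes x samples array, assumed to be on an additive scale.
    housekeeping_idx: row indices of the housekeeping genes (hypothetical name).
    reference: optional per-gene reference profile.
    """
    data = np.asarray(data, dtype=float)

    # Step 1: pick a reference sample; here, the average over all samples.
    if reference is None:
        reference = data.mean(axis=1)
    reference = np.asarray(reference, dtype=float)

    # Step 2: one factor per sample = housekeeping average of the sample
    # minus housekeeping average of the reference.
    hk_sample_mean = data[housekeeping_idx, :].mean(axis=0)
    hk_reference_mean = reference[housekeeping_idx].mean()
    factors = hk_sample_mean - hk_reference_mean

    # Step 3: subtract each sample's factor from all of its genes.
    normalized = data - factors
    return normalized, factors

# Toy data: rows 0 and 1 are treated as housekeeping genes.
data = [[1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],
        [5.0, 5.0, 8.0]]
normalized, factors = normalize_subtractive(data, housekeeping_idx=[0, 1])
```

In this toy example the housekeeping rows end up identical across samples after normalization, which is what the method is designed to achieve; the remaining gene is shifted by the same per-sample amount.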
The scale of the data determines whether it is preferable to subtract or divide by the normalization factor in step 3 (and also whether the factor in step 2 should be computed using subtraction or division). For data on an 'exponential' scale (e.g. counts, or other data with a skewed distribution and only positive values), division is the proper choice, while for 'normally' distributed data (symmetrically distributed, with both positive and negative values) subtraction is a better approach.
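For positive, count-like data, the same procedure with division in both steps might look like the sketch below. It assumes a genes x samples matrix of positive values, uses the average over all samples as the reference, and keeps the plain arithmetic average described above; `normalize_ratio` and `housekeeping_idx` are hypothetical names for this example.

```python
import numpy as np

def normalize_ratio(counts, housekeeping_idx):
    """Ratio-based housekeeping normalization (illustrative sketch).

    counts: genes x samples array of positive values (e.g. counts).
    housekeeping_idx: row indices of the housekeeping genes (hypothetical name).
    """
    counts = np.asarray(counts, dtype=float)

    # Step 1: reference sample = average over all samples.
    reference = counts.mean(axis=1)

    # Step 2: factor = housekeeping average of the sample
    # divided by housekeeping average of the reference.
    hk_sample_mean = counts[housekeeping_idx, :].mean(axis=0)
    hk_reference_mean = reference[housekeeping_idx].mean()
    factors = hk_sample_mean / hk_reference_mean

    # Step 3: divide each sample's genes by that sample's factor.
    return counts / factors, factors

# Toy data: each sample is a scaled copy of the first one.
counts = [[10.0, 20.0, 40.0],
          [30.0, 60.0, 120.0],
          [5.0, 10.0, 20.0]]
scaled, ratio_factors = normalize_ratio(counts, housekeeping_idx=[0, 1])
```

Because the toy samples differ only by a constant multiplier, dividing by the factors makes all three samples identical, whereas subtracting a constant would not remove this kind of difference.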