This lesson is still being designed and assembled (Pre-Alpha version)

RNA-seq: Glossary

Key Points

Workshop Overview
  • RNA-seq is a commonly used technology for profiling the transcriptome.

  • RNA-seq data has a number of different applications - we’ll be looking at detecting differentially expressed genes, using data from a yeast RNA-seq experiment.

Quality control of the sequencing data.
  • Quality assessment is a key initial step in the analysis of RNA-seq data.

  • The FastQC and MultiQC applications are useful tools for quality assessment of RNA-seq data.
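
To make the idea of read quality concrete, here is a minimal sketch of how the quality line of a FASTQ record decodes to Phred scores. It is not how FastQC or MultiQC are implemented; it assumes Phred+33 encoding, and the record and the function names are made up for illustration.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores.

    Assumes Phred+33 encoding (the offset is a parameter in case it differs).
    """
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Average Phred score across one read."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores)


# Made-up FASTQ record: header, sequence, separator, quality string.
record = ["@read1", "ACGTACGT", "+", "IIIIII##"]

print(phred_scores(record[3]))
print(mean_quality(record[3]))
```

A per-read or per-position summary like this is the kind of quantity that quality reports plot across a whole file.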

Trimming and Filtering reads
  • Adapter removal and trimming (optional) are important steps in processing RNA-seq data.
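
The two operations named above can be sketched in a few lines. This is a simplified illustration, not the behaviour of any particular trimming tool: it assumes exact adapter matches, Phred+33 quality encoding, and an arbitrary quality threshold of 20 and minimum length of 5. The adapter sequence and function names are chosen for the example only.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def keep_read(seq, min_length=5):
    """Filter step: discard reads that became too short after trimming."""
    return len(seq) >= min_length


# Made-up read: 8 bases of insert followed by an illustrative adapter.
adapter = "AGATCGGAAGAGC"
seq = "ACGTACGT" + adapter
qual = "IIIIII##" + "I" * len(adapter)

seq, qual = trim_adapter(seq, qual, adapter)
seq, qual = trim_low_quality_tail(seq, qual)
print(seq, qual, keep_read(seq))
```

Real tools also handle partial adapter matches and sequencing errors, which this sketch ignores.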

Map and count reads
  • Alignment and feature counting are used to generate read counts for each genomic feature (e.g., genes) of interest, per sample.

  • The count data can then be used for statistical analysis (e.g., to identify differentially expressed genes).
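
As a rough sketch of what feature counting means, the example below counts aligned reads per gene by coordinate overlap. The gene names, coordinates, and alignments are invented, and the rule used here (count a read only if it overlaps exactly one gene) is an assumption made for the example rather than the default of any specific counting tool.

```python
def count_reads(alignments, genes):
    """Count reads per gene for one sample.

    alignments: list of (chrom, start, end) for aligned reads
    genes:      list of (gene_id, chrom, start, end)
    Reads overlapping no gene, or more than one gene, are not counted.
    """
    counts = {gene_id: 0 for gene_id, _, _, _ in genes}
    for chrom, start, end in alignments:
        hits = [
            gene_id
            for gene_id, g_chrom, g_start, g_end in genes
            if g_chrom == chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


# Hypothetical annotation and alignments.
genes = [
    ("geneA", "chrI", 100, 500),
    ("geneB", "chrI", 450, 900),
    ("geneC", "chrII", 1, 300),
]
alignments = [
    ("chrI", 120, 170),     # geneA
    ("chrI", 200, 250),     # geneA
    ("chrI", 480, 530),     # overlaps geneA and geneB: ambiguous, skipped
    ("chrI", 600, 650),     # geneB
    ("chrII", 250, 300),    # geneC
    ("chrII", 1000, 1050),  # no gene
]

print(count_reads(alignments, genes))
```

Running this once per sample and combining the results gives the gene-by-sample count table used in the statistical analysis.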

Differential Expression
  • Statistical analysis is required to identify genes exhibiting altered expression between experimental conditions.

  • The limma processing pipeline is a fairly standard (and robust) way to do this.

  • DESeq2 and edgeR offer alternative methods for identifying differentially expressed genes.
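
The sketch below shows only the quantity these methods start from: library-size-normalised expression and a log2 fold change between two conditions. It is not limma, DESeq2, or edgeR, and it performs no statistical test. The counts are invented, and the log2(CPM + 1) transform is an assumption chosen to keep the example simple.

```python
import math


def log_cpm(counts):
    """log2(counts-per-million + 1) for one sample (dict of gene -> count)."""
    library_size = sum(counts.values())
    return {
        gene: math.log2(count * 1e6 / library_size + 1)
        for gene, count in counts.items()
    }


def log2_fold_change(control_samples, treated_samples):
    """Mean log-CPM in treated samples minus mean log-CPM in controls."""
    def group_mean(samples):
        logged = [log_cpm(sample) for sample in samples]
        return {
            gene: sum(s[gene] for s in logged) / len(logged)
            for gene in logged[0]
        }

    control = group_mean(control_samples)
    treated = group_mean(treated_samples)
    return {gene: treated[gene] - control[gene] for gene in control}


# Invented counts; the second control was sequenced twice as deeply.
control = [
    {"geneA": 7, "geneB": 999_993},
    {"geneA": 14, "geneB": 1_999_986},
]
treated = [
    {"geneA": 31, "geneB": 999_969},
    {"geneA": 31, "geneB": 999_969},
]

lfc = log2_fold_change(control, treated)
print(lfc)
```

Here geneA has a log2 fold change of 2 and geneB is essentially unchanged. Whether a difference like that is larger than expected from variation between replicates is exactly what the statistical methods above are needed to decide.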

Overrepresentation analysis (Gene Ontology)
  • Coordinated changes in groups of functionally related genes can tell us about the underlying biological mechanisms that change between experimental conditions.

  • The characteristics of RNA-seq experiments mean that gene-length correction is required; without it, standard approaches to over-representation analysis can give erroneous results.
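
For reference, the standard approach mentioned above is typically a hypergeometric (one-sided Fisher) test, sketched here with made-up numbers. Note that this is the uncorrected test: it treats every gene as equally likely to be called differentially expressed, which is the assumption that breaks down in RNA-seq data, where longer genes collect more reads and are easier to detect. Function and variable names are illustrative.

```python
from math import comb


def overrepresentation_p(k, K, n, N):
    """P(X >= k) under the hypergeometric distribution.

    N: genes tested in total
    K: genes annotated to the category (e.g., one GO term)
    n: differentially expressed genes
    k: differentially expressed genes that are also in the category
    """
    total = comb(N, n)
    upper = min(K, n)
    return sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    ) / total


# Made-up example: 10 genes tested, 4 in the category,
# 3 differentially expressed, 2 of those in the category.
p_value = overrepresentation_p(k=2, K=4, n=3, N=10)
print(p_value)
```

A length-aware method replaces the equal-probability assumption with gene-specific weights before computing the category p-value.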

Glossary

FIXME