Read below for more information on annotation and enrichment analysis:

Annotation of genes is a crucial part of any gene expression analysis. The raw output of an expression analysis is usually not easy to interpret because it does not include gene names. By performing annotation, the expression measurements are matched with gene names in databases. This couples the results with existing biological knowledge to aid in the interpretation.
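
The matching step described above can be sketched as a simple lookup. This is a minimal illustration, not a specific annotation tool: the function `annotate`, the small `ID_TO_SYMBOL` table and the expression values are all hypothetical, and a real analysis would draw the mapping from an annotation database rather than a hand-written dictionary.

```python
# Hypothetical lookup table from gene identifiers to gene names.
# In a real analysis this mapping would come from an annotation database.
ID_TO_SYMBOL = {
    "ENSG00000141510": "TP53",
    "ENSG00000012048": "BRCA1",
    "ENSG00000157764": "BRAF",
}


def annotate(measurements, id_to_symbol):
    """Attach a gene name to each measurement.

    Returns a list of (gene_id, gene_name, value) tuples; gene_name is None
    when the identifier is not found in the lookup table.
    """
    return [
        (gene_id, id_to_symbol.get(gene_id), value)
        for gene_id, value in measurements.items()
    ]


# Made-up expression values keyed by gene identifier (illustration only).
measurements = {
    "ENSG00000141510": 1.8,
    "ENSG00000012048": -0.4,
    "ENSG00000999999": 2.1,  # placeholder identifier with no entry in the table
}

annotated = annotate(measurements, ID_TO_SYMBOL)
for gene_id, gene_name, value in annotated:
    print(gene_id, gene_name or "(no match)", value)
```

Identifiers without a match are kept rather than dropped, so that unannotated measurements remain visible in the results.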

If the experiment is based on RNA-seq, microarray or any other method that covers a nearly exhaustive number of genes or proteins, the dataset becomes very large, so large that interpretation can be quite overwhelming even after annotation. Enrichment analysis is a great tool to help in this process (Subramanian et al., 2005).

Expression analysis often includes a test for differential expression to investigate which genes are expressed at different levels in cases and controls. With a long list of differentially expressed genes, Gene Set Enrichment Analysis (GSEA) (Mootha et al., 2003) can help to identify what these genes have in common. GSEA can be used to test if a list of genes is enriched for genes encoding proteins in a certain pathway, genes from a certain chromosomal region or genes that encode proteins which fall in the same category, e.g. membrane transport. The test asks whether the list contains more genes from a certain pathway or category than would be expected by chance. The assumptions of the test require the number of genes to be very large, so enrichment analysis is only suitable for experiments using methods like RNA-seq or microarray as mentioned above. Annotation, however, is useful for any number of genes (Reimand et al., 2019).
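
The idea of "more genes than expected by chance" can be made concrete with a small calculation. The sketch below uses a one-sided hypergeometric test, which is one common way to test for over-representation; it is an assumption made for illustration and not necessarily the statistic used by a particular GSEA implementation. The function name `enrichment_p_value` and all the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_in_set, n_selected, n_overlap):
    """One-sided hypergeometric p-value for over-representation.

    n_background: total number of genes measured
    n_in_set:     genes among them that belong to the pathway or category
    n_selected:   genes in the list being tested (e.g. differentially expressed)
    n_overlap:    genes in the list that also belong to the pathway or category

    Returns the probability of seeing n_overlap or more set members in a
    random list of n_selected genes drawn without replacement.
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20000 genes measured, 100 of them in the pathway,
# 500 genes in the list, 10 of which are in the pathway.
n_background, n_in_set, n_selected, n_overlap = 20000, 100, 500, 10

expected_overlap = n_selected * n_in_set / n_background  # 2.5 genes by chance
p_value = enrichment_p_value(n_background, n_in_set, n_selected, n_overlap)
print(f"expected by chance: {expected_overlap}, observed: {n_overlap}")
print(f"p-value: {p_value:.3g}")
```

In practice this test is repeated for many pathways or categories, so the resulting p-values would normally need a correction for multiple testing.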




