Basic Workflow

This is a minimal example. For details, see the other articles listed in the sidebar on the left of the website.

Basic enrichment analysis performs a batch “dictionary lookup” for a set of genes to determine which annotated terms they are associated with. The most commonly used method is over-representation analysis (ORA). If you also have a weight for each gene, Gene Set Enrichment Analysis (GSEA) is another classic choice. After obtaining the enrichment result, a re-enrichment analysis may help you draw more insight from it.

ORA

library(dplyr)
library(tibble)
library(org.Hs.eg.db)  # (1)
library(gt)
library(EnrichGT)
library(readr)

DEGexample <- read_csv("~/Documents/4Fun/EGTFun/DEG.csv")
# Keep the significant genes, then the significantly up-regulated subset
DEGexample2 <- DEGexample |> dplyr::filter(pvalue < 0.05)
DEGexample_UpReg <- DEGexample |> dplyr::filter(pvalue < 0.05, log2FoldChange > 0.7)
# read_csv() names an unnamed first column `...1`; here it holds the gene symbols
DEGs <- DEGexample_UpReg$...1
# The first example:
ora_result <- egt_enrichment_analysis(
                genes = DEGs,  # (2)
                database = database_Reactome(OrgDB = org.Hs.eg.db)  # (3)
                )
# The second example:
another_example <- egt_enrichment_analysis(
                genes = genes_with_weights(DEGexample2$...1, DEGexample2$log2FoldChange),  # (4)
                database = database_GO_BP(org.Hs.eg.db))  # (5)
# Plotting
egt_plot_results(ora_result, showIDs = TRUE, ntop = 20)  # (6)

(1) EnrichGT uses AnnotationDbi to fetch most databases and gene annotations. If an error is thrown, re-check this step.
(2) ORA needs just two inputs. The first is a character vector containing gene symbols, like c("TP53","PLP1","FABP1","VCAM1").
(3) The second is your favourite database. EnrichGT supports many. See Database usage.
(4) Alternatively, input genes with direction. See enrichment details.
(5) Enrich using Gene Ontology BP.
(6) Show the enrichment result as a figure. See more in visualization help.
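
To make the “dictionary lookup” idea behind ORA more concrete, here is a minimal base-R sketch of the over-representation test for a single gene set. It is an illustration only, not EnrichGT's implementation: the gene names, set sizes, and variable names are all made up, and the test shown is the standard hypergeometric (one-sided Fisher) test commonly used for ORA.

```r
# Toy over-representation test for one gene set, using only base R.
# All gene names and sizes below are hypothetical.
universe <- paste0("G", 1:1000)                    # background genes
gene_set <- universe[1:50]                         # a made-up pathway with 50 genes
query    <- c(universe[1:10], universe[901:940])   # 50 input genes, 10 of them in the pathway

k <- length(intersect(query, gene_set))  # overlap between query and pathway
K <- length(gene_set)                    # pathway size
N <- length(universe)                    # universe size
n <- length(query)                       # query size

# Probability of seeing at least k pathway genes in a random draw of n genes
p_value <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on the 2x2 table
tab <- matrix(c(k, K - k, n - k, N - K - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
```

In a real analysis this test is repeated for every term in the database, which is why the resulting p-values are usually adjusted for multiple testing.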

Re-enrichment

Re-enrichment supports ORA output from both EnrichGT and clusterProfiler, so you can run the initial enrichment with your favourite tool.

re_enrichment_results <- egt_recluster_analysis(ora_result)  # (1)
re_enrichment_results  # (2)
egt_plot_results(re_enrichment_results, ntop = 3)  # (3)
egt_plot_umap(re_enrichment_results)  # (4)
re_enrichment_results |> egt_infer_act(DB = "collectri", species = "human")  # (5)

(1) Run the re-enrichment analysis (see Re-enrichment usage).
(2) Show the GT HTML report. See more in visualization help.
(3) View the re-enrichment result as a dot plot. See more in visualization help.
(4) View the re-enrichment result as a UMAP plot. See more in visualization help.
(5) Infer TF/pathway activity. This is experimental and might not be correct. See more in re-enrichment help.

LLM-based annotations

See more in large language models integration.

library(ellmer)
dsAPI <- "sk-**********"
chat <- chat_deepseek(api_key = dsAPI, model = "deepseek-chat", system_prompt = "")  # (1)
re_enrichment_results <- egt_llm_summary(re_enrichment_results, chat)
egt_summary(re_enrichment_results, 1)  # (2)
summary_with_knowledge <- egt_llm_summary(
  re_enrichment_results, chat,
  background_knowledges = paste(readLines("paperReference.txt"), collapse = " ")  # (3)
)
# deepseekAI and QwenAI are ellmer chat objects created beforehand, in the same way as `chat` above
LLMcompare <- egt_llm_multi_summary(
  re_enrichment_results,
  chat_list = list(`deepseek-v3.2-exp` = deepseekAI, `qwen3-max` = QwenAI),  # (4)
  background_knowledges = paste(readLines("paperReference.txt"), collapse = " ")
)

(1) This step creates a DeepSeek interface. EnrichGT uses ellmer for LLM support. For details, refer to the ellmer tidyverse website. It provides a uniform interface to most LLMs in R.
(2) Display the LLM annotation of cluster 1.
(3) Use background_knowledges to give the LLM background knowledge to refer to (e.g., papers, your opinions …).
(4) EnrichGT can compare the annotation results of multiple LLMs.

GSEA

Use GSEA if you have a set of genes with known weights, such as log2FC or the loadings from PCA or NMF.

GSEAexample <- egt_gsea_analysis(
            genes =
              genes_with_weights(DEGexample2$...1, DEGexample2$log2FoldChange),  # (1)
            database =
              database_from_gmt("gmt_file.gmt")  # (2)
            )
egt_plot_results(GSEAexample)  # (3)
egt_plot_gsea(GSEAexample$Description[1],
              genes = genes_with_weights(genes = DEGexample2$...1,
                                         weights = DEGexample2$log2FoldChange),
              database = database_GO_BP(org.Hs.eg.db))  # (4)
egt_plot_gsea(GSEAexample[1:10, ],  # (5)
              genes = genes_with_weights(genes = DEGexample2$...1,
                                         weights = DEGexample2$log2FoldChange),
              database = database_GO_BP(org.Hs.eg.db))  # (6)
re_enrichment_results <- egt_recluster_analysis(GSEAexample)  # (7)

(1) GSEA is also supported, see enrichment details. genes_with_weights() is used to generate weighted genes.
(2) Additional pathway information, such as a GMT file or a table, is also supported. See Database usage.
(3) Show a basic bar plot of the results. See more in visualization help.
(4) Single ranking plot. See more in visualization help.
(5) Subset the results first to avoid plotting too many pathways and wasting time.
(6) Show a table-like result as a figure. See more in visualization help.
(7) GSEA results are also supported in re-cluster analysis.
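
To show why GSEA needs a weight for every gene, here is a small base-R sketch of the classic weighted running-sum statistic. It is a simplified illustration under assumed toy data, not the algorithm EnrichGT runs internally: the gene names, weights, and gene set are invented, and no permutation test or normalization is performed.

```r
# Toy running-sum enrichment score in base R (hypothetical genes and weights).
weights <- c(G1 = 2.5, G2 = 1.8, G3 = 1.2, G4 = 0.4,
             G5 = -0.3, G6 = -0.9, G7 = -1.5, G8 = -2.2)
gene_set <- c("G1", "G2", "G4")          # a made-up pathway

ranked <- sort(weights, decreasing = TRUE)   # rank genes by weight
in_set <- names(ranked) %in% gene_set        # TRUE where a ranked gene is in the pathway

# Step up at pathway genes (proportional to |weight|), step down at all others
hit  <- abs(ranked) * in_set / sum(abs(ranked[in_set]))
miss <- (!in_set) / sum(!in_set)
running <- cumsum(hit - miss)

# Enrichment score: the running sum's largest deviation from zero
es <- running[which.max(abs(running))]
```

A positive score means the pathway genes cluster near the top of the ranking, and a negative score means they cluster near the bottom. Plotting `running` against rank gives the familiar GSEA ranking curve.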

Fusing results

Fusing supports ORA output from both EnrichGT and clusterProfiler, so you can produce the individual results with your favourite tool.

# Fusing results:
ora_result_A <- egt_enrichment_analysis(genes = DEGs, database = database_Reactome(OrgDB = org.Hs.eg.db))  # (1)
ora_result_B <- egt_enrichment_analysis(genes = DEGs, database = database_kegg(kegg_organism = "hsa", OrgDB = org.Hs.eg.db))  # (2)
fused_result <- egt_recluster_analysis(list(ora_result_A, ora_result_B))  # (3)

(1) ORA result 1, with Reactome.
(2) ORA result 2, with KEGG.
(3) Fuse them together.
Important

Only enrichment results from the same kind of source can be merged.

For example, the result of

egt_enrichment_analysis(genes = DEGs, database = database_Reactome(OrgDB = org.Hs.eg.db))

and

egt_enrichment_analysis(genes = DEGs, database = database_kegg(OrgDB = org.Hs.eg.db))

can be merged,

but the result of

egt_enrichment_analysis(genes = DEGs, database = database_Reactome(OrgDB = org.Hs.eg.db))

and

egt_enrichment_analysis(genes = genes_with_weights(DEGexample2$...1,DEGexample2$log2FoldChange), database = database_kegg(OrgDB = org.Hs.eg.db))

CAN’T be merged.

If you have ORA results that you want to compare, use egt_compare_groups().

Gene Annotation Converter

IDs_of_genes <- convert_annotations_genes(DEGexample$...1[1:10], from_what = "SYMBOL", to_what = c("ENTREZID", "ENSEMBL", "GENENAME"), OrgDB = org.Hs.eg.db)  # (1)

(1) Just tell it what to convert from and what to convert to :)