Basic Workflow
This is a minimal example. For details, see the articles listed on the left side of the website.
Basic enrichment analysis performs a batch “dictionary lookup” for a set of genes to determine their associations. The most commonly used method is over-representation analysis (ORA). If you also have weights for your genes, Gene Set Enrichment Analysis (GSEA) is another classic choice. Once you have enrichment results, a re-enrichment analysis may help you gain more insight from them.
ORA
library(dplyr)
library(tibble)
library(org.Hs.eg.db)  # (1)
library(gt)
library(EnrichGT)
library(readr)
DEGexample <- read_csv("~/Documents/4Fun/EGTFun/DEG.csv")
DEGexample2 <- DEGexample |> dplyr::filter(pvalue < 0.05)
DEGexample_UpReg <- DEGexample |> dplyr::filter(pvalue < 0.05, log2FoldChange > 0.7)
# `...1` is the name readr assigns to the unnamed first column (here, the gene symbols)
DEGs <- DEGexample_UpReg$...1
# The first example:
ora_result <- egt_enrichment_analysis(
  genes = DEGs,  # (2)
  database = database_Reactome(OrgDB = org.Hs.eg.db)  # (3)
)
# The second example:
another_example <- egt_enrichment_analysis(
  genes = genes_with_weights(DEGexample2$...1, DEGexample2$log2FoldChange),  # (4)
  database = database_GO_BP(org.Hs.eg.db)  # (5)
)
# Plotting
egt_plot_results(ora_result, showIDs = TRUE, ntop = 20)  # (6)

- (1) EnrichGT uses AnnotationDbi to fetch most databases and gene annotations. If an error is thrown, please re-check this step;
- (2) ORA needs just two inputs. The first is a character vector containing gene symbols, like c("TP53","PLP1","FABP1","VCAM1");
- (3) The second is your favourite database. EnrichGT supports many. See Database usage;
- (4) Alternatively, you can input genes with direction. See enrichment details;
- (5) Enriching using Gene Ontology BP;
- (6) Showing the enrichment result as a figure. See more in visualization help.
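To make the “dictionary lookup” idea behind ORA concrete, the sketch below runs a one-sided hypergeometric test in base R. It illustrates the general technique only and is not EnrichGT's implementation; the toy gene universe, the pathway, the DEG list, and the helper name `ora_pvalue` are all hypothetical.

```r
# Hypothetical toy data: a universe of 100 genes, one pathway, one DEG list.
universe <- paste0("gene", 1:100)
pathway  <- universe[1:10]                     # 10 genes annotated to the pathway
degs     <- c(universe[1:5], universe[51:65])  # 20 DEGs, 5 of them in the pathway

# Hypothetical helper: one-sided hypergeometric p-value for over-representation.
ora_pvalue <- function(degs, pathway, universe) {
  degs    <- intersect(degs, universe)
  pathway <- intersect(pathway, universe)
  k <- length(intersect(degs, pathway))  # DEGs that fall in the pathway
  m <- length(pathway)                   # pathway genes in the universe
  n <- length(universe) - m              # universe genes outside the pathway
  # P(X >= k) when drawing length(degs) genes without replacement
  phyper(k - 1, m, n, length(degs), lower.tail = FALSE)
}

overlap <- length(intersect(degs, pathway))
p_value <- ora_pvalue(degs, pathway, universe)
```

In a real analysis this test is repeated for every gene set in the database, and the resulting p-values are then adjusted for multiple testing.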
Re-enrichment
Re-enrichment supports ORA output from both EnrichGT and clusterProfiler, so you can use your favourite tool for the enrichment step.
re_enrichment_results <- egt_recluster_analysis(ora_result)  # (1)
re_enrichment_results  # (2)
egt_plot_results(re_enrichment_results, ntop = 3)  # (3)
egt_plot_umap(re_enrichment_results)  # (4)
re_enrichment_results |> egt_infer_act(DB = "collectri", species = "human")  # (5)

- (1) Doing re-enrichment analysis (see Re-enrichment usage);
- (2) Showing the GT HTML report, see more in visualization help;
- (3) Viewing the re-enrichment result as a dot plot, see more in visualization help;
- (4) Viewing the re-enrichment result as a UMAP plot, see more in visualization help;
- (5) Inferring TF/pathway activity. This is experimental and might not be correct, see more in re-enrichment help.
LLM-based annotations
See more in large language models integration
library(ellmer)
dsAPI <- "sk-**********"
chat <- chat_deepseek(api_key = dsAPI, model = "deepseek-chat", system_prompt = "")  # (1)
re_enrichment_results <- egt_llm_summary(re_enrichment_results, chat)
egt_summary(re_enrichment_results, 1)  # (2)
summary_with_knowledge <- egt_llm_summary(re_enrichment_results, chat, background_knowledges = paste(readLines("paperReference.txt"), collapse = " "))  # (3)
# deepseekAI and QwenAI are ellmer chat objects that must be created beforehand (not shown here)
LLMcompare <- egt_llm_multi_summary(re_enrichment_results, chat_list = list(`deepseek-v3.2-exp` = deepseekAI, `qwen3-max` = QwenAI), background_knowledges = paste(readLines("paperReference.txt"), collapse = " "))  # (4)

- (1) In this step we create a DeepSeek interface. We use ellmer for LLM support; it provides a uniform interface to most LLMs in R. For details, please refer to the ellmer tidyverse website;
- (2) Display the LLM annotation of cluster 1;
- (3) You can use background_knowledges to add background knowledge for the LLMs to refer to (e.g., papers, your opinions …);
- (4) EnrichGT enables comparison of annotation results from multiple LLMs.
GSEA
Use GSEA if you have a set of genes with known weights, such as log2FC or the loadings from PCA or NMF.
GSEAexample <- egt_gsea_analysis(
  genes = genes_with_weights(DEGexample2$...1, DEGexample2$log2FoldChange),  # (1)
  database = database_from_gmt("gmt_file.gmt")  # (2)
)
egt_plot_results(GSEAexample)  # (3)
egt_plot_gsea(GSEAexample$Description[1],
              genes = genes_with_weights(genes = DEGexample2$...1,
                                         weights = DEGexample2$log2FoldChange),
              database = database_GO_BP(org.Hs.eg.db))  # (4)
# Pass a subset of the result table (rows 1 to 10) rather than a single description
egt_plot_gsea(GSEAexample[1:10, ],  # (5)
              genes = genes_with_weights(genes = DEGexample2$...1,
                                         weights = DEGexample2$log2FoldChange),
              database = database_GO_BP(org.Hs.eg.db))  # (6)
re_enrichment_results <- egt_recluster_analysis(GSEAexample)  # (7)

- (1) GSEA is also supported, see enrichment details. genes_with_weights() is used to generate weighted genes;
- (2) Additional pathway information, such as a GMT file or a table, is also supported. See Database usage;
- (3) Showing a basic bar plot of the results, see more in visualization help;
- (4) Single ranking plot, see more in visualization help;
- (5) Please subset the results to avoid plotting too many of them and wasting time;
- (6) Showing a table-like result as a figure, see more in visualization help;
- (7) GSEA results are also supported in re-cluster analysis.
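As a rough illustration of how gene weights such as log2FC drive GSEA, the sketch below computes a classic weighted running-sum enrichment score in base R. It is a simplified sketch of the general technique, not EnrichGT's implementation; the toy weights, the gene set, and the helper name `running_es` are hypothetical.

```r
# Hypothetical toy input: named weights (e.g. log2FC) for ten genes.
weights <- c(g1 = 2.5, g2 = 2.0, g3 = 1.5, g4 = 1.0, g5 = 0.5,
             g6 = -0.5, g7 = -1.0, g8 = -1.5, g9 = -2.0, g10 = -2.5)
gene_set <- c("g1", "g2", "g4")

# Hypothetical helper: weighted running-sum enrichment score.
running_es <- function(weights, gene_set) {
  ranked <- sort(weights, decreasing = TRUE)    # rank genes by weight
  in_set <- names(ranked) %in% gene_set
  stopifnot(any(in_set), any(!in_set))          # need both hits and misses
  hit  <- ifelse(in_set, abs(ranked), 0)
  hit  <- hit / sum(hit)                        # step up at gene-set members
  miss <- ifelse(in_set, 0, 1 / sum(!in_set))   # step down at all other genes
  running <- cumsum(hit - miss)
  unname(running[which.max(abs(running))])      # largest deviation from zero
}

es <- running_es(weights, gene_set)
```

A positive score means the gene set is concentrated among the highest-weighted genes; real GSEA additionally estimates significance, typically by permutation.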
Fusing results
This function supports ORA output from both EnrichGT and clusterProfiler, so you can use your favourite tool for the enrichment step.
# Fusing results:
ora_result_A <- egt_enrichment_analysis(genes = DEGs, database = database_Reactome(OrgDB = org.Hs.eg.db))  # (1)
ora_result_B <- egt_enrichment_analysis(genes = DEGs, database = database_kegg(kegg_organism = "hsa", OrgDB = org.Hs.eg.db))  # (2)
fused_result <- egt_recluster_analysis(list(ora_result_A, ora_result_B))  # (3)

- (1) ORA result 1 with Reactome;
- (2) ORA result 2 with KEGG;
- (3) Fuse them together.
Only enrichment results from the same source can be merged.
For example, the results of
egt_enrichment_analysis(genes = DEGs, database = database_Reactome(OrgDB = org.Hs.eg.db))
and
egt_enrichment_analysis(genes = DEGs, database = database_kegg(OrgDB = org.Hs.eg.db))
can be merged, because both were run on the same plain gene list. However, the results of
egt_enrichment_analysis(genes = DEGs, database = database_Reactome(OrgDB = org.Hs.eg.db))
and
egt_enrichment_analysis(genes = genes_with_weights(DEGexample2$...1,DEGexample2$log2FoldChange), database = database_kegg(OrgDB = org.Hs.eg.db))
CAN’T be merged, because the second was run on weighted genes.
If you have ORA results that you want to compare rather than merge, please use egt_compare_groups().
Gene Annotation Converter
IDs_of_genes <- convert_annotations_genes(DEGexample$...1[1:10], from_what = "SYMBOL", to_what = c("ENTREZID", "ENSEMBL", "GENENAME"), OrgDB = org.Hs.eg.db)  # (1)

- (1) Just tell it what to convert from and what to convert to :)