MicrobiomeStatPlot | 热图结合柱状图展示组间差异Heatmap bar plot tutorial

学术 2024-11-07 07:02 广东

简介

在微生物组分析过程中，常常有多组之间的比较，用常用的柱状图、箱线图等不能同时展现多个物种多组的差异，可以用热图展示微生物物种的组间差异显著性，并用不同的颜色确定物种是显著上升还是下降。同时为了配合显著性差异热图，可以用柱状图展示微生物物种在每个分组中的流行率和相对丰度。

标签：#微生物组数据分析 #MicrobiomeStatPlot #热图结合柱状图展示组间差异 #R语言可视化 #Heatmap bar plot

作者：First draft(初稿)：Defeng Bai(白德凤)；Proofreading(校对)：Ma Chuang(马闯) and Jiani Xun(荀佳妮)；Text tutorial(文字教程)：Defeng Bai(白德凤)

源代码及测试数据链接：

https://github.com/YongxinLiu/MicrobiomeStatPlot/项目中目录 3.Visualization_and_interpretation/HeatmapBarPlot

或公众号后台回复“MicrobiomeStatPlot”领取

热图结合柱状图展示组间差异分析案例

这是Damian R. Plichta团队2023年发表于Nature Microbiology上的一篇论文，论文题目为：Centenarians have a diverse gut virome with the potential to modulate metabolism and promote healthy lifespan. https://doi.org/10.1038/s41564-023-01370-6

图 3a | 百岁老人 (CE)、老年人和年轻参与者之间按细菌宿主分组的 vOTU 种群相对病毒丰度的变化（在物种水平上）。

热图显示差异丰富的 vOTU 种群（Wilcoxon 秩和检验，双侧，*Padj < 0.05 FDR 校正）。只有在 CE vs 老年人或 CE vs 年轻人群中发现富集或耗尽的 vOTU 种群，例如与 C. scindens 或 F. prausnitzii 相关的种群，才会被纳入最终的热图。色标表示 MaAslin2 中一般线性模型的 beta 系数，表示富集程度（红色）或耗尽程度（蓝色）。vOTU 按其流行程度（水平条形面板）排序，其中与 C. scindens 或 A. muciniphila 相关的病毒在 >40% 的 CE 样本中检测到，在 <10% 的年轻成人样本中检测到。

结果

使用病毒宿主标签，我们在物种水平上鉴定出与 Alistipes、Parabacteroides、Clostridium、Eggerthella、Ruminococcus 和 Akkermansia 相关的多发性（WRST，双侧，Padj < 0.05）病毒群（图 3a）。相反，我们发现百岁老人微生物群中拟杆菌和粪杆菌病毒显著减少（WRST，双侧，Padj < 0.05）（图 3a、c），这与这些先前描述的共生细菌的相对减少相吻合。

R语言实战

源代码及测试数据链接：

https://github.com/YongxinLiu/MicrobiomeStatPlot/

或公众号后台回复“MicrobiomeStatPlot”领取

软件包安装

# 基于CRAN安装R包，检测没有则安装 Installing R packages based on CRAN and installing them if they are not detectedp_list = c("ggplot2", "patchwork", "dplyr", "reshape2", "ggprism", "plyr",           "magrittr","ggfun","cowplot" )for(p in p_list){if (!requireNamespace(p)){install.packages(p)}    library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)}
# 加载R包 Loading R packagessuppressWarnings(suppressMessages(library(ggplot2)))suppressWarnings(suppressMessages(library(patchwork)))suppressWarnings(suppressMessages(library(dplyr)))suppressWarnings(suppressMessages(library(reshape2)))suppressWarnings(suppressMessages(library(ggprism)))suppressWarnings(suppressMessages(library(plyr)))suppressWarnings(suppressMessages(library(magrittr)))suppressWarnings(suppressMessages(library(ggfun)))suppressWarnings(suppressMessages(library(cowplot)))

计算物种流行率

A组、B组和C组流行率计算

# 计算流行率# Load datadata_fungi <- read.table("data/species_count_data.txt", sep = "\t", header = TRUE, check.names = FALSE)design_A <- read.table("data/design_A.txt", sep = "\t", header = TRUE, row.names = 1)design_B <- read.table("data/design_B.txt", sep = "\t", header = TRUE, row.names = 1)design_C <- read.table("data/design_C.txt", sep = "\t", header = TRUE, row.names = 1)difference_species <- read.table("data/Difference_species.txt", sep = "\t", header = TRUE, row.names = 1)
# Preprocess datarownames(data_fungi) <- data_fungi$Speciesdata_fungi <- data_fungi[, -1] %>% apply(2, function(x) x / sum(x))data_fungi02 <- as.data.frame(t(data_fungi))
# Function to process each design groupprocess_design <- function(design, data_fungi02, all_counts_value) {  data_fungi_group <- data_fungi02[rownames(data_fungi02) %in% rownames(design), ] %>% t() %>% as.data.frame()    zero_counts <- rowSums(data_fungi_group == 0)  data_species2 <- data_fungi_group %>%    mutate(zero_counts = zero_counts,           sample_percent = round(1 - zero_counts / all_counts_value, 6))
  data_species3 <- data_species2[rownames(data_species2) %in% rownames(difference_species), ]  return(data_species3)}
# Process each design group and write to CSVdata_species3_C <- process_design(design_A, data_fungi02, 30)write.csv(data_species3_C, "data/data_species3_prevalence_A.csv")
data_species3_E <- process_design(design_B, data_fungi02, 35)write.csv(data_species3_E, "data/data_species3_prevalence_B.csv")
data_species3_Y <- process_design(design_C, data_fungi02, 50)write.csv(data_species3_Y, "data/data_species3_prevalence_C.csv")

热图+柱状图

# 载入数据# Load datadata <- read.table("data/Difference_species2.txt",header = TRUE,row.names = 1,sep = "\t")data[which(data$FDR<0.05),'sig'] <- '*'# 排序# Set orderdata$Species2 = factor(data$Species, levels = unique(data$Species))data = data %>%  mutate(Species2 = ordered(Species2,                         levels=c("Species21",                                  "Species20","Species19",                                  "Species18","Species17",                                  "Species16","Species15","Species14",                                  "Species13",                                  "Species12","Species11","Species10",                                  "Species09","Species08","Species07",                                   "Species06","Species05","Species04",                                  "Species03","Species02","Species01"                                  )))# 绘图# Plotp1 <- ggplot(data, aes(Group,Species2)) +    geom_tile(aes(fill=B_coef), color="white") +      geom_text(aes(label=sig), color="black", size=6,vjust = 0.8) +   scale_fill_gradient2(low='#0085c1', high='red',mid = 'white', limit=c(-1.3,1.3),name=paste0("B-coef.")) +  labs(x=NULL,y=NULL) +    theme_classic()+  theme(axis.text.x = element_text(size=10,angle = 45,hjust = 1,color = "black"),                    axis.text.y = element_text(size=10,color = "black"),         axis.line.x = element_blank(),        axis.line.y = element_blank(),        axis.ticks.length = unit(2.0, "mm"),        panel.background=element_blank(),        legend.position = "left")#p1#ggsave(paste("results/age_fheatmap01",".pdf", sep=""), p1, width=89 * 1.5, height=180 * 1.5, unit='mm')# Bar plot prevalence# 柱状图展示每个物种的流行率imp_species <- read.table("data/Prevalence2.txt", header = TRUE, sep = "\t")imp_species <- imp_species %>%  mutate(Species2 = ordered(Species,                            levels = c("Species21", "Species20", "Species19", "Species18", "Species17",                                       "Species16", "Species15", "Species14", "Species13",                                       "Species12", "Species11", "Species10", "Species09",                                       "Species08", "Species07", "Species06", "Species05",                                       "Species04", "Species03", "Species02", "Species01")))# 获取所有唯一的组名groups <- unique(imp_species$Group)# 为每个组指定不同的颜色colors <- c("A" = "#ffad00", "B" = "#5196d5", "C" = "#00fc8d")# 使用 lapply 绘制每个组的图plots <- lapply(seq_along(groups), function(i) {  g <- groups[i]  p <- ggplot(subset(imp_species, Group == g), aes(x = Species2, y = Prevalence)) +    geom_bar(stat = "identity", fill = colors[g]) +    coord_flip() +    theme_classic() +    theme(text = element_text(family = 'sans', size = 14),          panel.grid.major = element_blank(),          panel.grid.minor = element_blank(),          legend.position = "none",          axis.line.x = element_blank(),          axis.line.y = element_blank(),          axis.title.y = element_blank(),          axis.text.x = element_text(size = 10, angle = 45, hjust = 1, color = "black"),          axis.text.y = element_blank()          ) +    geom_hline(yintercept = c(0.25, 0.5, 0.75, 1), colour = 'black', lwd = 0.36, linetype = "dashed") +    #scale_y_continuous(expand = expansion(0, 0), limits = c(0, 1), breaks = c(0, 0.25, 0.5, 0.75, 1)) +    #labs(title = paste("Prevalence for", g))  facet_set(label=paste(g))    if (i != 1) {    p <- p + theme(axis.text.y = element_blank(),                   axis.ticks.y = element_blank())  }    return(p)})# 使用 plot_grid 将三个图放在一起，并确保宽度一致final_plot <- plot_grid(plotlist = plots, ncol = 3, rel_widths = c(1, 1, 1))# 显示最终图形#print(final_plot)library(cowplot)p2 <- ggdraw() +  draw_plot(p1, 0, 0.01, .35, 0.945)+  draw_plot(final_plot, 0.350, -0.01, .58, 1.002)#p2pdf("results/age_fungi_heatmap_ra_bar1.pdf", height = 7.2, width = 8)p2dev.off()#> png #>   2# Bar plot relative abundance# 柱状图展示每个物种的相对丰度# 如果要按照相对丰度降序排列，请自行调整imp_species <- read.table("data/Prevalence2.txt", header = TRUE, sep = "\t")imp_species <- imp_species %>%  mutate(Species2 = ordered(Species,                            levels = c("Species21", "Species20", "Species19", "Species18", "Species17",                                       "Species16", "Species15", "Species14", "Species13",                                       "Species12", "Species11", "Species10", "Species09",                                       "Species08", "Species07", "Species06", "Species05",                                       "Species04", "Species03", "Species02", "Species01")))# 获取所有唯一的组名groups <- unique(imp_species$Group)# 为每个组指定不同的颜色colors <- c("A" = "#ffad00", "B" = "#5196d5", "C" = "#00fc8d")# 使用 lapply 绘制每个组的图plots <- lapply(seq_along(groups), function(i) {  g <- groups[i]  p <- ggplot(subset(imp_species, Group == g), aes(x = Species2, y = RA)) +    geom_bar(stat = "identity", fill = colors[g]) +    coord_flip() +    theme_classic() +    theme(text = element_text(family = 'sans', size = 14),          panel.grid.major = element_blank(),          panel.grid.minor = element_blank(),          legend.position = "none",          axis.line.x = element_blank(),          axis.line.y = element_blank(),          axis.title.y = element_blank(),          axis.text.x = element_text(size = 10, angle = 45, hjust = 1, color = "black"),          axis.text.y = element_blank()          ) +    geom_hline(yintercept = c(0.05, 0.1, 0.15, 0.20), colour = 'black', lwd = 0.36, linetype = "dashed") +    #scale_y_continuous(expand = expansion(0, 0), limits = c(0, 1), breaks = c(0, 0.25, 0.5, 0.75, 1)) +    #labs(title = paste("Prevalence for", g))  facet_set(label=paste(g))    if (i != 1) {    p <- p + theme(axis.text.y = element_blank(),                   axis.ticks.y = element_blank())  }    return(p)})# 使用 plot_grid 将三个图放在一起，并确保宽度一致final_plot2 <- plot_grid(plotlist = plots, ncol = 3, rel_widths = c(1, 1, 1))# 显示最终图形#print(final_plot)library(cowplot)p3 <- ggdraw() +  draw_plot(p1, 0, 0.01, .35, 0.945)+  draw_plot(final_plot2, 0.350, -0.01, .58, 1.002)#p2pdf("results/age_fungi_heatmap_ra_bar2.pdf", height = 7.2, width = 8)p3dev.off()#> png #>   2

使用此脚本，请引用下文：

Yong-Xin Liu, Lei Chen, Tengfei Ma, Xiaofang Li, Maosheng Zheng, Xin Zhou, Liang Chen, Xubo Qian, Jiao Xi, Hongye Lu, Huiluo Cao, Xiaoya Ma, Bian Bian, Pengfan Zhang, Jiqiu Wu, Ren-You Gan, Baolei Jia, Linyang Sun, Zhicheng Ju, Yunyun Gao, Tao Wen, Tong Chen. 2023. EasyAmplicon: An easy-to-use, open-source, reproducible, and community-based pipeline for amplicon data analysis in microbiome research. iMeta 2: e83. https://doi.org/10.1002/imt2.83

宏基因组推荐

本公众号现全面开放投稿，希望文章作者讲出自己的科研故事，分享论文的精华与亮点。投稿请联系小编（微信号：yongxinliu 或 meta-genomics）

10000+：菌群分析宝宝与猫狗梅毒狂想曲提DNA发Nature

系列教程：微生物组入门 Biostar 微生物组宏基因组

专业技能：学术图表高分文章生信宝典不可或缺的人

一文读懂：宏基因组寄生虫益处进化树必备技能：提问搜索 Endnote

扩增子分析：图表解读分析流程统计绘图

16S功能预测 PICRUSt FAPROTAX Bugbase Tax4Fun

生物科普: 肠道细菌人体上的生命生命大跃进细胞暗战人体奥秘

写在后面

为鼓励读者交流快速解决科研困难，我们建立了“宏基因组”讨论群，己有国内外6000+ 科研人员加入。请添加主编微信meta-genomics带你入群，务必备注“姓名-单位-研究方向-职称/年级”。高级职称请注明身份，另有海内外微生物PI群供大佬合作交流。技术问题寻求帮助，首先阅读《如何优雅的提问》学习解决问题思路，仍未解决群内讨论，问题不私聊，帮助同行。

点击阅读原文

http://mp.weixin.qq.com/s?__biz=MzUzMjA4Njc1MA==&mid=2247513363&idx=2&sn=5793b3b569960fcd59176046634fcfb9

宏基因组

宏基因组/微生物组是当今世界科研最热门的研究领域之一，为加强本领域的技术交流与传播，推动中国微生物组计划发展，中科院青年科研人员创立“宏基因组”公众号，目标为打造本领域纯干货技术及思想交流平台。

iMeta2024 浙江编委会暨长短读长测序技术 SEQ ALL 沙龙(2025 .11.19浙大)

iMeta实时CiteScore突破20 (2024/11/5)

iMeta | 浙大儿院倪艳/傅君芬组-MetOrigin 2.0微生物代谢物的发现和溯源

南加大 David Hutchins 实验室招收博士生及博士后研究员

MPB：上海交大肖湘等-海洋微生物的厌氧高压培养实验

MicrobiomeStatPlot | 孟德尔随机化分析教程Mendelian Randomization Analysis

iMetaOmics | 重庆大学王贵学组-肠道微生物细胞外囊泡在神经退行性疾病中的新作用及其治疗策略

视频回放 | 陈连民-我国人群特征性肠道微生物遗传图谱构建及其心血管代谢调控功能解析

iMetaOmics | 刘永鑫/陈同-用于食物微生物组成和时间序列研究的微生物组数据库FoodMicroDB

Cell | 病毒学未来50年面临的机遇和挑战

iMeta | 最全可视化韦恩图和集合图工具EVenn使用手册

iMeta | Sangerbox 2: 临床生信分析平台

iMeta 讲坛14 | 陈连民-我国人群特征性肠道微生物遗传图谱构建及其心血管代谢调控功能解析

iMeta2024 北京编委会暨长短读长测序技术沙龙(11/15北林）

Horticulture Research | 安徽农业大学植物病原细菌学团队在猕猴桃微生物组与溃疡病抗性方面取得新进展

Nature Reviews Microbiology：在单细胞水平研究微生物组功能的下一代生理学方法

Nature Biotechnology：反向基因组学对未培养细菌的定向分离与培养

npj biofilms microbiomes | 南京林业大学樊奔组发现细菌磷素平衡新型调控分子

MPB | 林科院袁志林组-枫香-真菌互作培养体系构建

MicrobiomeStatPlot | 蒙特尔检验相关性热图教程Mantel test correlation heatmap

iMeta | 被引超5200次，发文225篇，平均引用23.19，百引耗时7天(2024/11/10)

Nat Rev Microbiol | 深大周之超等综述微生物硫代谢的多样性及生态

MicrobiomeStatPlot | 棒棒糖图教程Lollipop Chart tutorial

MPB | 北林袁峥嵘组-16S扩增子分析中常用软件及数据库应用现状

视频回放 | 陈汉清-消化系统疾病危险因素及干预策略

喜报 | iMeta被SCIE、ESI等数据库正式收录!(2024/11/8)

MicrobiomeStatPlot | 线图教程Line plot tutorial

MPB | 中科院城环所杨军组-基于DNA宏条形码的水体微型真核生物群落测序建库方法

Nature综述 | 食源性细菌病原体:基于基因组的方法应对复杂和不断变化的世界中持续出现和新出现的威胁

喜讯 | 高起点新刊Engineering Microbiology 被 PubMed Central (PMC) 收录！！

MPB | 李文均团队-盐湖微生物的分离培养及保藏方法

iMeta | 中国海洋大学张伟鹏组-海洋生物被膜抗菌肽

MicrobiomeStatPlot | 热图结合柱状图展示组间差异Heatmap bar plot tutorial

iMeta 2024湖北区编委会暨2024 SEQ ALL系列线下沙龙首站顺利召开

ISME | 中科院版纳园庞志强/徐鹏提出植物根系分泌物-功能微生物组互作研究的新模型

MicrobiomeStatPlot | 分层关联分析Halla analysis

Microbiome|韦中/吴凤芝团队揭示挥发物介导的种间植物相互作用通过诱导根分泌物的变化促进有益细菌的根定植

MicrobiomeStatPlot | 广义估计方程分析Generalizedestimatingequation

iMeta 讲坛13 | 陈汉清-消化系统疾病危险因素及干预策略

ISME：利用基因组信息预测细菌的环境偏好

iMeta最终2023影响因子23.8 (附中国期刊最新影响因子)

MicrobiomeStatPlot | 频率分布直方图教程Frequency distribution histogram

MPB | 中农戴兆来组-猪肠道微生物样品的采集与核酸提取

【2025肠道大会】征稿正式启动！

iMeta成都-四川分舵线下编委会(11.4下午成都大学)

MPB：中农戴兆来组-猪肠道微生物的体外培养与功能研究

MicrobiomeStatPlot | 森林图教程Forest plot tutorial

视频回放 | 陈程杰-“接地气的”生物软件开发与社区维护-“用户视角”

MPB：陈同等-ImageGP在微生物组可视化中的应用

分类

时事

民生

政务

教育

文化

科技

财富

体娱

健康

情感

旅行

百科

职场

楼市

企业

乐活

学术

汽车

时尚

创业

美食

幽默

美体

文摘

原创标签

时事社会财经军事教育体育科技汽车科学房产搞笑综艺明星音乐动漫游戏时尚健康旅游美食生活摄影宠物职场育儿情感小说曲艺文化历史三农文学娱乐电影视频图片新闻宗教电视剧纪录片广告创意壁纸头像心灵鸡汤星座命理教育培训艺术文化金融财经健康医疗美妆时尚餐饮美食母婴育儿社会新闻工业农业时事政治星座占卜幽默笑话独立短篇连载作品文化历史科技互联网

发布位置

广东北京山东江苏河南浙江山西福建河北上海四川陕西湖南安徽湖北内蒙古江西云南广西甘肃辽宁黑龙江贵州新疆重庆吉林天津海南青海宁夏西藏香港澳门台湾美国加拿大澳大利亚日本新加坡英国西班牙新西兰韩国泰国法国德国意大利缅甸菲律宾马来西亚越南荷兰柬埔寨俄罗斯巴西智利卢森堡芬兰瑞典比利时瑞士土耳其斐济挪威朝鲜尼日利亚阿根廷匈牙利爱尔兰印度老挝葡萄牙乌克兰印度尼西亚哈萨克斯坦塔吉克斯坦希腊南非蒙古奥地利肯尼亚加纳丹麦津巴布韦埃及坦桑尼亚捷克阿联酋安哥拉

MicrobiomeStatPlot | 热图结合柱状图展示组间差异Heatmap bar plot tutorial

一站式论文提升服务，助您顺利发高分论文！

猜你喜欢

写在后面