MicrobiomeStatPlot | 边绑定图教程Edge Bundling Plot

学术   2024-10-18 07:01   中国香港  

边绑定图简介

什么是Edge Bundling图?

参考:

https://mp.weixin.qq.com/s/YYvb25LRzpZfB5pTIBSygw

Edge Bundling图是一种数据可视化方式,用于展示不同节点之间的联系。与普通的network的差别在于,它使用曲线来展示节点间的连接,而非直线,并会把相同趋势的曲线捆绑再一起,就像整理数据线的“环”。因此在关联较多的情况下,这一类可视化的方式可能有利于展现趋势的变化,而不会显得复杂而混乱。

Edge Bundling图除了环形,还有多种形式。在微生物组领域因为存在物种分类的信息,所以可能环形的排布方式能够呈现出更加直观的规律。

标签:#微生物组数据分析  #MicrobiomeStatPlot  #边绑定图  #R语言可视化 #Edge bundling plot

作者:First draft(初稿):Defeng Bai(白德凤);Proofreading(校对):Ma Chuang(马闯) and Jiani Xun(荀佳妮);Text tutorial(文字教程):Defeng Bai(白德凤)

源代码及测试数据链接:

https://github.com/YongxinLiu/MicrobiomeStatPlot/项目中目录 3.Visualization_and_interpretation/EdgeBundlingPlot

或公众号后台回复“MicrobiomeStatPlot”领取


边绑定图应用案例

下图是来自于中山大学肿瘤防治中心Liu Na团队2022年发表于Jama Oncology(Qiao et al., 2022)上的一篇论文附件中的图。论文题目为:Association of Intratumoral Microbiota With Prognosis in Patients With Nasopharyngeal Carcinoma From 2 Hospitals in China.https://doi.org/10.1001/jamaoncol.2022.2810

图 17 |  基于微生物组和差异免疫表达基因的Spearman相关系数的网络分析。

绿色圆点代表微生物,六边形代表目标基因(调整后的Benjamini-Hochberg P < .05),六边形内的黑色三角形代表每个基因所涉及的免疫功能。边的颜色表示微生物节点和基因节点的Spearman相关性。BCR表示B细胞受体;TCR表示T细胞受体。

结果

进一步注释了差异免疫基因集,并分析了它们与鼻咽癌肿瘤内细菌的相关性。结果显示,大多数与免疫相关的基因(如CXCL13)的表达水平与肿瘤内细菌(如牙龈卟啉单胞菌)的丰度呈负相关(见补充材料1中的图eFigure 17)。

边绑定图R语言实战

源代码及测试数据链接:

https://github.com/YongxinLiu/MicrobiomeStatPlot/

或公众号后台回复“MicrobiomeStatPlot”领取

软件包安装

# 基于CRAN安装R包,检测没有则安装p_list = c("psych", "magrittr", "reshape2", "dplyr", "readxl", "igraph", "tidyverse", "RColorBrewer", "ggraph")for(p in p_list){if (!requireNamespace(p)){install.packages(p)}    library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)}
# 加载R包 Load the packagesuppressWarnings(suppressMessages(library(psych)))suppressWarnings(suppressMessages(library(magrittr)))suppressWarnings(suppressMessages(library(reshape2)))suppressWarnings(suppressMessages(library(dplyr)))suppressWarnings(suppressMessages(library(readxl)))suppressWarnings(suppressMessages(library(igraph)))suppressWarnings(suppressMessages(library(tidyverse)))suppressWarnings(suppressMessages(library(RColorBrewer)))suppressWarnings(suppressMessages(library(ggraph)))

实战

这里实现两种类型对象数据进行Spearman相关性检验,并通过筛选符合条件的r和p值数据,利用Edge Bundling图展示正负相关性,通过连线的粗细展示相关性的强弱。 

#1.加载数据#1.Load datadata5 <- read.csv("data/data5_used.csv", row.names = 1, header = TRUE, check.names = FALSE)#2.计算Spearman相关系数并调整p值#2. Calculate the Spearman correlation coefficient and adjust the p-valueCor_selected <- corr.test(data5, method="spearman", adjust="BH")Cor <- as.data.frame(Cor_selected$r)# 准备相关性数据# Prepare correlation datar.cor <- data.frame(Cor_selected$r)[1:36, 37:78]#3.创建数据框并设置对角块为零#3. Create a data frame and set the diagonal blocks to zerodata <- Cor %>% as.data.frame()data[1:36, 1:36] <- 0data[37:78, 37:78] <- 0# 添加id列用于melt# Add id column for meltdata$id <- colnames(data)#4.准备用于绘图的相关性Connect数据#4. Prepare the correlation data for plotting# 定义相关性计算函数,这里只计算组件对象的相关性,不计算组内对象的相关性# Define the correlation calculation function. Here, only the correlation of component objects is calculated, not the correlation of objects within the group.calculate_correlations <- function(data, species_index, start_col, end_col) {  rows <- end_col - start_col + 1  Correlations <- data.frame(    variable = character(length = rows),    correlation = numeric(length = rows),    p_adj = numeric(length = rows),    stringsAsFactors = FALSE  )  for (i in 1:rows) {    temp1 <- colnames(data5)[i + start_col - 1]    temp2 <- corr.test(data5[, species_index], data5[, i + start_col - 1], method="spearman", adjust="BH")    temp3 <- temp2$p.adj    Correlations[i, 1] <- temp1    Correlations[i, 2] <- temp2$r    Correlations[i, 3] <- temp3  }  Correlations$species <- colnames(data5)[species_index]  return(Correlations)}# 计算相关性# Calculate correlationstart_col <- 37end_col <- ncol(data5)species_count <- 36all_correlations <- list()for (species_index in 1:species_count) {  correlations <- calculate_correlations(data5, species_index, start_col, end_col)  all_correlations[[species_index]] <- correlations}# 合并所有相关性数据# Merge all related datafinal_correlations <- do.call(rbind, all_correlations)#筛选p_adj <= 0.05和r的绝对值>=0.4的数据#Filter data with p_adj <= 0.05 and absolute value of r >= 0.4final_correlations <- final_correlations[abs(final_correlations$correlation) >= 0.4 ,]final_correlations <- final_correlations[final_correlations$p_adj <= 0.05 ,]#data2 <- read.csv("species_gene_correlation_36S_42genes_selected.csv", row.names = 1, header = TRUE, check.names = FALSE)data2 <- data.frame(from = final_correlations$species, to = final_correlations$variable, r = final_correlations$correlation, p_adj = final_correlations$p_adj)cor.data2 <- data2 %>%  data.frame() %>%  mutate(    linecolor = ifelse(r > 0, "positive", "negative"),    linesize = abs(r)  )colnames(cor.data2) <- c("to", "from", "r", "FDR", "linecolor", "linesize")connect_n <- cor.data2[, c(2, 1, 3, 5, 6)]#5. 移除id列并创建分层边数据#5. Remove the id column and create hierarchical edge datadata <- data %>% select(-id)edge <- data.frame(from = 'Origin', to = colnames(data))vertices <- data.frame(name = c('Origin', as.character(edge$to)))colnames(vertices) <- "name"# 对顶点进行分组# Group the verticesvertices$group <- "group"vertices$group[1:37] <- "Species"vertices$group[38:79] <- "Gene"#6.计算角度并调整标签位置#6. Calculate the angle and adjust the label positionall_leaves <- which(is.na(match(vertices$name, edge$from)))nleaves <- length(all_leaves)vertices$id[all_leaves] <- seq(1, nleaves)vertices$angle <- 90 - 360 * vertices$id / nleavesvertices$hjust <- ifelse(vertices$angle < -90, 1, 0)vertices$angle <- ifelse(vertices$angle < -90, vertices$angle + 180, vertices$angle)#7.将连接与顶点匹配#7. Match connections to verticesfrom <- match(connect_n$from, vertices$name)to <- match(connect_n$to, vertices$name)connect1 <- connect_n[connect_n$r >= 0, ]from1 <- match(connect1$from, vertices$name)to1 <- match(connect1$to, vertices$name)connect2 <- connect_n[connect_n$r < 0, ]from2 <- match(connect2$from, vertices$name)to2 <- match(connect2$to, vertices$name)#8.创建图对象并生成图#8. Create a graph object and generate a graphmygraph <- graph_from_data_frame(edge, vertices = vertices)# 生成图# Generate graphpdf("results/EdgeBundingPlot01.pdf",family = "serif",width = 6,height = 7)ggraph(mygraph, layout = 'dendrogram', circular = TRUE) +  geom_conn_bundle(data = get_con(from = from, to = to),                    aes(edge_width = rep(connect_n$linesize, 3), edge_colour = rep(connect_n$linecolor, 3), edge_alpha = 0.8),                   tension = 0.8) +  geom_node_point(aes(filter = leaf, x = 1.05 * x, y = 1.05 * y, size = 5, color = group, alpha = 0.2)) +  geom_node_text(aes(filter = leaf, x = 1.1 * x, y = 1.1 * y, label = name, angle = angle, hjust = hjust, color = group), size = 2.5) +  scale_edge_color_manual(values = c("#BA55D3", "#CCEEFF")) +  scale_edge_width_continuous(range = c(0.2,1.6)) +  scale_size_continuous(range = c(0.1, 1.5)) +  scale_color_manual(values = rep(c('#6A5ACD', '#189078', '#54278f'), 30)) +  theme_void() +  theme(plot.margin = unit(c(0, 0, 0, 0), 'cm')) +  coord_fixed() +  expand_limits(x = c(-1.2, 1.2), y = c(-1.2, 1.2))dev.off()#> png #>   2

使用此脚本,请引用下文:

Yong-Xin Liu, Lei Chen, Tengfei Ma, Xiaofang Li, Maosheng Zheng, Xin Zhou, Liang Chen, Xubo Qian, Jiao Xi, Hongye Lu, Huiluo Cao, Xiaoya Ma, Bian Bian, Pengfan Zhang, Jiqiu Wu, Ren-You Gan, Baolei Jia, Linyang Sun, Zhicheng Ju, Yunyun Gao, Tao Wen, Tong Chen. 2023. EasyAmplicon: An easy-to-use, open-source, reproducible, and community-based pipeline for amplicon data analysis in microbiome research. iMeta 2: e83. https://doi.org/10.1002/imt2.83

Copyright 2016-2024 Defeng Bai baidefeng@caas.cn, Chuang Ma 22720765@stu.ahau.edu.cn, Jiani Xun 15231572937@163.com, Yong-Xin Liu liuyongxin@caas.cn

宏基因组推荐
本公众号现全面开放投稿,希望文章作者讲出自己的科研故事,分享论文的精华与亮点。投稿请联系小编(微信号:yongxinliu 或 meta-genomics)

猜你喜欢

iMeta高引文章 fastp 复杂热图 ggtree 绘图imageGP 网络iNAP
iMeta网页工具 代谢组MetOrigin 美吉云乳酸化预测DeepKla
iMeta综述 肠菌菌群 植物菌群 口腔菌群 蛋白质结构预测

10000+:菌群分析 宝宝与猫狗 梅毒狂想曲 提DNA发Nature

系列教程:微生物组入门 Biostar 微生物组  宏基因组

专业技能:学术图表 高分文章 生信宝典 不可或缺的人

一文读懂:宏基因组 寄生虫益处 进化树 必备技能:提问 搜索  Endnote

扩增子分析:图表解读 分析流程 统计绘图

16S功能预测   PICRUSt  FAPROTAX  Bugbase Tax4Fun

生物科普:  肠道细菌 人体上的生命 生命大跃进  细胞暗战 人体奥秘  

写在后面

为鼓励读者交流快速解决科研困难,我们建立了“宏基因组”讨论群,己有国内外6000+ 科研人员加入。请添加主编微信meta-genomics带你入群,务必备注“姓名-单位-研究方向-职称/年级”。高级职称请注明身份,另有海内外微生物PI群供大佬合作交流。技术问题寻求帮助,首先阅读《如何优雅的提问》学习解决问题思路,仍未解决群内讨论,问题不私聊,帮助同行。

点击阅读原文


宏基因组
宏基因组/微生物组是当今世界科研最热门的研究领域之一,为加强本领域的技术交流与传播,推动中国微生物组计划发展,中科院青年科研人员创立“宏基因组”公众号,目标为打造本领域纯干货技术及思想交流平台。
 最新文章