散点图简介
散点图(Scatter Plot) 是一种统计图形,用于展示两个变量之间的关系。它通过二维坐标系中的点来表示观察数据,每个点的位置由两个变量的取值决定,横坐标表示自变量(X 轴),纵坐标表示因变量(Y 轴)。散点图的应用:相关性分析,判断两个变量之间是否存在某种关联,以及这种关联是正相关、负相关还是无关。模型拟合,通过观察散点图,可以选择合适的回归模型(如线性回归、非线性回归等)来拟合数据。异常检测,可以通过观察散点图中的异常点来检测潜在的异常数据或错误观测值。
标签:#微生物组数据分析 #MicrobiomeStatPlot #散点图 #R语言可视化 #Scatter plot
作者:First draft(初稿):Defeng Bai(白德凤);Proofreading(校对):Ma Chuang(马闯) and Jiani Xun(荀佳妮);Text tutorial(文字教程):Defeng Bai(白德凤)
源代码及测试数据链接:
https://github.com/YongxinLiu/MicrobiomeStatPlot/项目中目录 3.Visualization_and_interpretation/ScatterPlot
或公众号后台回复“MicrobiomeStatPlot”领取
这是Patrick Lehodey在2020年发表于Biogeosciences上的文章,第一作者为Audrey Delpech,题目为:Influence of oceanic conditions in the energy transfer efficiency estimation of a micronekton model. https://bg.copernicus.org/articles/17/833/2020/
图 5 | 显示了在平面 (a) (𝒯,𝒱)和(b)(𝒯,𝒮)上基于核密度估计的散点图及边缘分布,分别对应表3中实验3a、3b、3c 和 3d 的配置。
绿色圆点代表微生物,六边形代表目标基因(调整后的Benjamini-Hochberg P < .05),六边形内的黑色三角形代表每个基因所涉及的免疫功能。边的颜色表示微生物节点和基因节点的Spearman相关性。BCR表示B细胞受体;TCR表示T细胞受体。
结果
图 5 显示,每种配置的次要变量分布都非常接近,足以进行实验比较,避免了任何交叉相关的风险。
这是Masahira Hattori 在2022年发表于NC上的文章,第一作者为Suguru Nishijima,题目为:Extensive gut virome variation and its associations with host and environmental factors in a population-level cohort https://doi.org/10.1038/s41467-022-32832-W
图 2 | 从 4198 个人类肠道元基因组中重建的噬菌体基因组概览
从 4198 个全元基因组数据集中重建的噬菌体基因组(n = 4709)的基因组大小和 GC 含量。顶部和右侧的柱状图分别描述了基因组大小和 GC 含量的分布情况。
结果
对序列相似度大于 95% 的 4709条噬菌体序列进行聚类,生成了1347个病毒操作分类单元(vOTUs)(相当于物种水平)(图 1a 和补充图 2a,补充数据 4)。
这是Damian R. plichta 在2023年发表于Nature Microbiology上的文章,第一作者为Joachim Johansen ,题目为:Centenarians have a diverse gut virome with the potential to modulate metabolism and promote healthy lifespan https://www.nature.com/articles/s41467-024-45793-z
图 3 | 噬菌体特征与独特的百岁老人细菌群落相关群落相关。
一些富集的温和 vOTUs 的 PtW 丰度(log10 标度)与 MSP 宿主丰度呈相关趋势,如 E. boltea vOTUs 丰度的 Spearman 等级相关性为 0.51(P = 4.82 × 10-66)、C. scindens(P = 4.49 × 10-08)和 P. distasonis(P = 6.00 × 10-11)。
结果
推测这一趋势可能反映了病毒细菌宿主的丰度。因此,我们将预测的温带病毒的病毒特征与其预测的宿主相关联。预测的温带病毒的病毒特征与其预测的宿主相关联。发现C.scindens等过量细菌的特征与相关病毒有显著相关性(Cor = 0.30, P = 4.49 × 10-08)。在 Akkermansia muciniphila中也发现了类似的趋势(Cor = 0.25、 P=6.23×10-11)、Enterocloster bolteae(Cor=0.51,P=4.82×10-66)和 Parabacteroides distasonis(Cor = 0.35,P = 6.00 × 10-11)(图 3d)。
源代码及测试数据链接:
https://github.com/YongxinLiu/MicrobiomeStatPlot/
或公众号后台回复“MicrobiomeStatPlot”领取
软件包安装
# 基于CRAN安装R包,检测没有则安装
p_list = c("ggplot2", "ggpubr", "ggpmisc", "doBy", "FactoMineR", "factoextra",
"tidyverse", "ggExtra", "vegan", "cowplot", "MASS", "scales","showtext","grid")
for(p in p_list){if (!requireNamespace(p)){install.packages(p)}
library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)}
# 加载R包 Load the package
suppressWarnings(suppressMessages(library(ggplot2)))
suppressWarnings(suppressMessages(library(ggpubr)))
suppressWarnings(suppressMessages(library(ggpmisc)))
suppressWarnings(suppressMessages(library(doBy)))
suppressWarnings(suppressMessages(library(FactoMineR)))
suppressWarnings(suppressMessages(library(factoextra)))
suppressWarnings(suppressMessages(library(tidyverse)))
suppressWarnings(suppressMessages(library(ggExtra)))
suppressWarnings(suppressMessages(library(vegan)))
suppressWarnings(suppressMessages(library(cowplot)))
suppressWarnings(suppressMessages(library(MASS)))
suppressWarnings(suppressMessages(library(scales)))
suppressWarnings(suppressMessages(library(showtext)))
suppressWarnings(suppressMessages(library(grid)))
实战1
散点图加拟合线
# 读取数据
# Load data
Scatterplot <- read.table("data/figure1e.txt", header = TRUE, sep = "\t", comment.char = "")
colnames(Scatterplot) <- c("PROGENY", "KEGG")
# 散点图
# Plot
p1 <- ggplot(Scatterplot, aes(x = PROGENY, y = KEGG)) +
geom_point(size = 3.5, color = "#1F78B4") +
geom_smooth(method = "lm", color = "#E31A1C", size = 1.3, linetype = "dashed") +
stat_cor(method = "pearson",
label.sep = '\n',
p.accuracy = 0.001,
r.digits = 3,
label.x = 9.2,
size = 5, color = "darkred") +
scale_x_continuous(limits = c(8.9, 10.6), expand = c(0, 0)) +
scale_y_continuous(limits = c(6.2, 9.8), expand = c(0, 0)) +
labs(x = "MAPK-PROGENy GES", y = "IL-17 GES") +
ggtitle("Scatter Plot of PROGENy vs IL-17 GES") +
theme_minimal(base_size = 16) +
theme(
axis.text = element_text(color = "black", size = 13),
axis.title = element_text(size = 15, face = "bold"),
plot.title = element_text(size = 17, face = "bold", hjust = 0.5),
panel.border = element_rect(color = "gray", fill = NA, size = 0.8)
)
# 将图形保存为PDF文件
# Save plot
ggsave("results/scatterplot01.pdf", plot = p1, width = 6, height = 4.5)
实战2
散点图加直方图
参考:https://mp.weixin.qq.com/s/3wBXO3gHWj2TpgHqJdBm6w https://mp.weixin.qq.com/s/k6XH4KQaJ6BSDwNO_xUx4w
# 创建示例数据
# Create sample data
covariance <- matrix(c(1, -0.2, -0.2, 1), nrow = 2, byrow = TRUE)
data <- mvrnorm(n = 1000, mu = c(0.5, 3), covariance)
data <- as.data.frame(data)
colnames(data) <- c("Genome size", "GC content")
# 添加分组信息
# Add group info
data$group <- "Host unknown"
data$group[order(data$`Genome size`)[c(sample(10:30, 10), sample(500:1000, 300))]] <- "Bacteroidetes"
data$group[order(data$`Genome size`)[c(sample(30:60, 20), sample(200:800, 300))]] <- "Firmicutes"
data$group[order(data$`Genome size`)[c(sample(40:80, 20), sample(100:200, 20))]] <- "Actinobacteria"
data$group[order(data$`Genome size`)[sample(1:1000, 10)]] <- "Proteobacteria"
# 设置颜色
# Set color
colors <- c("Bacteroidetes" = "#EB746A",
"Firmicutes" = "#7AA82C",
"Actinobacteria" = "#1EB5B8",
"Proteobacteria" = "#A07DB7",
"Host unknown" = "#AAAAAA")
# 散点图
# Plot
p21 <- ggplot(data) +
geom_point(aes(`Genome size`, `GC content`, color = group), alpha = 0.8, size = 3.5) +
scale_color_manual(values = colors) +
theme_minimal() +
theme(
legend.position = "top",
legend.title = element_blank(),
axis.text = element_text(size = 12),
axis.title = element_text(size = 14, face = "bold"),
axis.line = element_line(size = 0.6),
panel.grid = element_line(color = "grey90")
) +
labs(x = "Genome Size", y = "GC Content")
# 上方直方图
# Top bar plot
p22 <- ggplot(data) +
geom_histogram(aes(`Genome size`), binwidth = 0.2, fill = "#EB746A", color = "white", size = 0.8) +
theme_minimal() +
theme(
axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line = element_line(size = 0.6), # 添加刻度线
panel.grid = element_blank()
) +
ylab("# of vOTUs")
# 右侧直方图
# Right bar plot
p23 <- ggplot(data) +
geom_histogram(aes(`GC content`), binwidth = 0.2, fill = "#A07DB7", color = "white", size = 0.8) +
theme_minimal() +
theme(
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.line = element_line(size = 0.6), # 添加刻度线
panel.grid = element_blank()
) +
coord_flip() +
ylab("# of vOTUs")
# 拼图
# Patchwork
p24 <- ggdraw() +
draw_plot(p21, 0.05, 0, .75, .8) +
draw_plot(p22, 0.05, 0.8, .75, .2) +
draw_plot(p23, 0.8, 0, .2, .8)
# 保存优化后的图像为PDF
# Save plot
ggsave("results/scatter_histogram_high.pdf", plot = p24, height = 6, width = 8)
实战3
散点图加密度图
# 数据为“2.Scatter plot plus histogram 散点图加直方图”使用的数据
# Data inherted from "2.Scatter plot plus histogram 散点图加直方图"
# 绘制密度图
# Plot density plot
p31 <- ggplot(data, aes(x = `Genome size`, color = group, fill=group)) +
geom_density(alpha = 0.4, size = 1) +
scale_y_continuous(expand = c(0,0))+
scale_color_manual(values = c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"))+
scale_fill_manual(values =alpha(c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"),0.6) )+
theme_classic()+
theme(panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.ticks = element_line(size=0.6,colour = "black"),
axis.text.y = element_text(size=10,colour = "black"),
axis.title.y = element_text(size=16),
axis.title.x = element_blank(),
axis.ticks.x = element_blank(),
axis.text.x = element_blank(),
legend.position = "none"
)
#p31
p32 <- ggplot(data, aes(x = `GC content`, color = group, fill=group)) +
geom_density(alpha = 0.4, size = 1) +
scale_y_continuous(expand = c(0,0))+
scale_color_manual(values = c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"))+
scale_fill_manual(values =alpha(c("#f28c88","#ffbe5d","#9db1c1","#d6d6d6","#77bda1"),0.6) )+
theme_classic()+
theme(panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.ticks = element_line(size=0.6,colour = "black"),
axis.text.x = element_text(size = 12,colour = "black"),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
axis.text.y = element_blank(),
legend.position = "none"
)+coord_flip()
#p32
# 拼图
# Patchwork
p33 <- ggdraw() +
draw_plot_label("a", size = 20)+
draw_plot(p21, 0.05, 0, .75, .8)+
draw_plot(p31, 0.05, 0.8, .75, .2)+
draw_plot(p32, 0.8, 0, .2, .8)
ggsave("results/plot03.pdf", height = 5, width = 8, plot = p33)
#p33
实战4
散点图加箱线图
# 载入数据
# Load data
distance_mat=read.table(paste0("data/OTUID_beta_diversity_yjb_all.txt"), header=T, row.names=1, sep="\t", comment.char="")
distance_mat[1:3, 1:3]
metadata=read.table("data/metadata_double_single3.txt", header=T, row.names=1, sep="\t", comment.char="", stringsAsFactors=F)
df <- metadata
df <- df[rownames(df)%in%rownames(distance_mat), ]
# 进行PCoA分析(PCoA analysis)
# pcoa_result <- cmdscale(distance_mat, eig = TRUE, k = 2)
# pcoa_result2 <- pcoa_result[["points"]]
# bray_dis <- vegdist(varespec, method = 'bray')#基于Bray-curtis 距离测算(Based on Bray-Curtis distance measurement)
nmds_dis <- metaMDS(distance_mat, k = 2)#NMDS排序计算,一般定义 2 个维度(NMDS sorting calculation generally defines 2 dimensions)
#> Run 0 stress 0.2595632
#> Run 1 stress 0.2705372
#> Run 2 stress 0.2719117
#> Run 3 stress 0.2767482
#> Run 4 stress 0.2696427
#> Run 5 stress 0.270412
#> Run 6 stress 0.2639526
#> Run 7 stress 0.276302
#> Run 8 stress 0.2705819
#> Run 9 stress 0.2721413
#> Run 10 stress 0.2796006
#> Run 11 stress 0.2667527
#> Run 12 stress 0.2614544
#> Run 13 stress 0.2842278
#> Run 14 stress 0.2627725
#> Run 15 stress 0.2724872
#> Run 16 stress 0.2630389
#> Run 17 stress 0.2624092
#> Run 18 stress 0.2691989
#> Run 19 stress 0.2830036
#> Run 20 stress 0.2712429
#> *** Best solution was not repeated -- monoMDS stopping criteria:
#> 7: no. of iterations >= maxit
#> 12: stress ratio > sratmax
#> 1: scale factor of the gradient < sfgrmin
nmds_dis_site <- data.frame(nmds_dis$points)#计算样点得分(Calculate sample point scores)
df$nmds1 <- nmds_dis_site[, 1]
df$nmds2 <- nmds_dis_site[, 2]
# 定义颜色
# Define color
colors <- c("Patients" = "#FF6F61", "Healthy" = "#6B5B95")
# 绘制PCoA散点图
# Draw PCoA scatter plot
p41 <- ggplot(df, aes(x = nmds1, y = nmds2, color = Group)) +
geom_point(alpha = 0.7, size = 3) +
stat_ellipse(level = 0.68) +
scale_color_manual(values = colors) +
theme_bw(base_size = 15) +
labs(x = paste0("NMDS1"),
y = paste0("NMDS2")
) +
theme(
legend.position = "bottom",
legend.title = element_text(size = 14, face = "bold"),
legend.text = element_text(size = 12),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_text(size = 12),
axis.title = element_text(size = 14, face = "bold"),
plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
plot.margin = unit(c(1, 1, 1, 1), "cm")
)
# 添加边缘箱线图
# Add marginal box plots
p42 <- ggMarginal(p41, type = "boxplot", margins = "both", groupColour = TRUE, groupFill = TRUE)
# 显示图(Show plot)
ggsave("results/NMDS_plot01.pdf", plot = p42, width = 6, height = 6)
实战5
分组散点图添加误差棒和平均值
参考:https://mp.weixin.qq.com/s/e6CVI23i-qgWANTaDXAJdw
# 载入数据
# Load data
test_design <- read.csv("data/test_design.csv",row.names = 1)
df2 <- test_design[,c(2,5)]
# 统计平均值
# Calculate the average value
mean_df <- aggregate(df2[,2],by=list(df2[,1]),FUN=mean)
rownames(mean_df) <- mean_df[,1]
# 统计标准差
# Standard deviation
sd_df <- aggregate(df2[,2],by=list(df2[,1]),FUN=sd)
rownames(sd_df) <- sd_df[,1]
sd_df <- sd_df[,-1]
# 计算标准误
# Calculate the standard error
sd_df<-sd_df/2
se<-as.data.frame(sd_df)
# 合并数据
# Merge data
df1<-cbind(mean_df,se)
colnames(df1)<-c("group","mean","se")
colnames(df2)<-c("group","Gene_Abundance")
# 绘图
# Plot
p51 <- ggplot()+stat_summary(fun = "mean",geom = "crossbar", mapping = aes(x=df1$group,y=df1$mean),
width=0.65,size=1.0, color = "gray2")+
geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group,shape = group),
size = 3.5, height = 0.01,width = 0.1)+
scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+
geom_errorbar(data=df1,mapping=aes(x = group,ymin = mean-se, ymax = mean+se), width = 0.2,
color = "gray2",size=1)+
labs(y="Abundance", x="")+
theme_bw(base_line_size = 1.05,base_rect_size =1.05)+
theme(panel.grid.major=element_blank(),panel.grid.minor=element_blank())+
theme(axis.text=element_text(colour='black',size=12))
# 改变样式
# Change the style
p52 <- ggplot()+stat_summary(fun = "mean",geom = "crossbar", mapping = aes(x=df1$group,y=df1$mean),
width=0.2,size=0.5, height =0.01 ,color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+
geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group),
size = 3.5,height = 0.01,width = 0.1, shape = 20)+
scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+
geom_errorbar(data=df1,mapping=aes(x = group,ymin = mean-se, ymax = mean+se), width = 0.2,
color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"),size=1)+
labs(y="Abundance", x="")+
theme_classic()+
theme(panel.grid.major=element_blank(),panel.grid.minor=element_blank())+
theme(axis.text=element_text(colour='black',size=12))
# 改变样式
# Change the style
p53 <- ggplot(df2, aes(group, Gene_Abundance))+
stat_summary(fun.data="mean_sdl", fun.args = list(mult=1), geom="pointrange", color = "pink",size = 1.2)+
stat_summary(fun="mean", fun.args = list(mult=1), geom="point", color = "white",size = 4)+
geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group),
size = 3.5,height = 0.01,width = 0.1, shape = 20)+
scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+
labs(y="Abundance", x="")+
theme_classic()
# 添加显著性标记
# Add significance
p54 <- ggplot()+stat_summary(fun = "mean",geom = "crossbar", mapping = aes(x=df1$group,y=df1$mean),
width=0.2,size=0.5, height =0.01 ,color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+
geom_jitter(data=df2,mapping=aes(x=group,y=Gene_Abundance,fill = group,colour = group),
size = 3.5,height = 0.01,width = 0.1)+
scale_color_manual(values = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"))+
geom_errorbar(data=df1,mapping=aes(x = group,ymin = mean-se, ymax = mean+se), width = 0.2,
color = c("#EB746A","#7AA82C","#1EB5B8","#A07DB7"),size=1)+
labs(y="Abundance", x="")+
theme_classic()+
scale_y_continuous(limits = c(0, 50))+
geom_signif(data = df2, mapping=aes(x=group,y=Gene_Abundance),
comparisons = list(c("AA", "BB"),
c("AA", "CC"),
c("AA", "DD")
),
map_signif_level=F,
tip_length=c(0,0,0,0,0,0,0,0,0,0,0,0),
y_position = c(45,40,35),
size=1,
textsize = 4,
test = "wilcox.test"
)
ggsave("results/scatter_plot_error_bar.pdf", p54, width = 6, height = 6)
实战6
质心+误差棒1
例:Wang, T., et al. (2024). “Divergent age-associated and metabolism-associated gut microbiome signatures modulate cardiovascular disease risk.” Nat Med 30(6): 1722-1731.https://doi.org/10.1038/s41591-024-03038-y
参考:https://mp.weixin.qq.com/s/h2EwRvvM7BXVlwpzC7_gNQ
# 读取数据
# Load data
mydata <- read.csv("data/pca_data.csv", header = TRUE)
data1 <- mydata[, 1:10]
data <- scale(data1)
# 进行PCA分析
# Perform PCA analysis
res.pca <- PCA(data, graph = FALSE)
# 提取样本PC1、PC2坐标和分组信息
# Extract sample PC1, PC2 coordinates and grouping information
result <- data.frame(res.pca$ind$coord)
pc1 <- result[, 1]
pc2 <- result[, 2]
group <- mydata[, 11]
# 合并数据用于绘图
# Combining data for plotting
plotdata <- data.frame(pc1 = pc1, pc2 = pc2, group = group)
# 计算均值、标准差、标准误差和95%置信区间
# Calculation of mean, standard deviation, standard error and 95% confidence interval
se <- function(x) sd(x) / sqrt(length(x))
conf_95 <- function(x) t.test(x, conf.level = 0.95)$conf.int[2]
sumdata <- summaryBy(pc1 + pc2 ~ group, data = plotdata, FUN = c(mean, sd, se, conf_95))
# 绘图
# Plot
p61 <- ggplot(sumdata, aes(x = pc1.mean, y = pc2.mean, color = group)) +
geom_point(size = 3) +
scale_color_manual(values = c("#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e", "#e6ab02")) +
#theme_minimal() +
theme_bw()+
#theme_classic()+
labs(x = 'PC1', y = 'PC2') +
theme(
text = element_text(family = "Arial", size = 16),
axis.title = element_text(face = "bold"),
axis.ticks.length = unit(0.2, "cm"),
axis.text = element_text(color = "black"),
legend.position = "right",
legend.title = element_blank()
)
# 添加误差棒
# Add error bars
p62 <- p61 +
geom_errorbarh(aes(xmin = pc1.mean - pc1.sd, xmax = pc1.mean + pc1.sd), height = 0.1, size = 0.8) +
geom_errorbar(aes(ymin = pc2.mean - pc2.sd, ymax = pc2.mean + pc2.sd), width = 0.1, size = 0.8)
# 添加误差棒
# Adding error bars
p63 <- p61 +
geom_errorbarh(aes(xmin = pc1.mean - pc1.se, xmax = pc1.mean + pc1.se), height = 0.1, size = 0.8) +
geom_errorbar(aes(ymin = pc2.mean - pc2.se, ymax = pc2.mean + pc2.se), width = 0.1, size = 0.8)
# 添加误差棒
# Add error bars
showtext_auto()
p64 <- p61 +
geom_errorbarh(aes(xmin = pc1.mean - pc1.conf_95, xmax = pc1.mean + pc1.conf_95), height = 0.1, size = 0.8) +
geom_errorbar(aes(ymin = pc2.mean - pc2.conf_95, ymax = pc2.mean + pc2.conf_95), width = 0.1, size = 0.8) +
theme(text = element_text(family = "Arial", size = 16))
# 保存图像为PDF文件
# Save as PDF
ggsave("results/Centroid_Error_Bar.pdf", plot = p64, width = 7, height = 4)
实战7
质心+误差棒2
参考:https://mp.weixin.qq.com/s/7SHgJXhctRCi4QnHyaAZug
例:Pärn, J., Verhoeven, J.T.A., Butterbach-Bahl, K. et al. Nitrogen-rich organic soils under warm well-drained conditions are global nitrous oxide emission hotspots. Nat Commun 9, 1135 (2018).https://doi.org/10.1038/s41467-018-03540-1
# 加载数据
# Load data
data1 <- read.csv("data/test_data2.csv",header = T)
# 重新计算质心数据
# Recalculate the centre of mass data
X <- aggregate(X ~ group, data1, function(x) c(mean = mean(x), sd = sd(x)))
Y <- aggregate(Y ~ group, data1, function(x) c(mean = mean(x), sd = sd(x)))
data2 <- data.frame(
group=X$group,
X=X$X[,1],
X_error=X$X[,2],
Y=Y$Y[,1],
Y_error=Y$Y[,2]
)
# 绘图
# Plot
p71 <- ggplot(data2, aes(X, Y)) +
geom_errorbar(aes(xmin = X - X_error, xmax = X + X_error, color = group),
linewidth = 0.6, width = 0.2, alpha = 0.7) +
geom_errorbar(aes(ymin = Y - Y_error, ymax = Y + Y_error, color = group),
linewidth = 0.6, width = 0.2, alpha = 0.7) +
geom_point(aes(color = group), size = 3, alpha = 0.8) +
scale_color_manual(values = c("#D55E00", "#0072B2", "#F0E442", "#009E73",
"#CC79A7", "#56B4E9", "#E69F00", "#999999",
"#1A85FF", "#FF6666", "#66CC99", "#006400",
"#8B0000", "#DAA520", "#556B2F", "#FF4500",
"#4682B4", "#9400D3")) +
geom_smooth(method = "lm", color = "darkblue", fill = "#ADD8E6", se = TRUE,
formula = y ~ x, linetype = "dashed", alpha = 0.3) +
stat_poly_eq(formula = y ~ x,
aes(label = paste(after_stat(eq.label), after_stat(adj.rr.label), sep = "~~~")),
parse = TRUE, size = 4, label.x.npc = "right", label.y.npc = "top") +
labs(x = " category_A", y = " category_B", color = "group") +
#theme_minimal(base_size = 15) +
theme_bw()+
theme(
legend.position = "right",
legend.title = element_text(size = 12, face = "bold"),
legend.text = element_text(size = 10),
axis.title = element_text(size = 16),
axis.text = element_text(size = 14),
panel.grid.major = element_line(size = 0.5, linetype = "dotted", color = "gray80"),
panel.grid.minor = element_blank()
)
# 保存PDF
# Save as PDF
ggsave("results/Centroid_Error_Bar02.pdf", plot = p71, width = 8, height = 6, dpi = 300)
使用此脚本,请引用下文:
Yong-Xin Liu, Lei Chen, Tengfei Ma, Xiaofang Li, Maosheng Zheng, Xin Zhou, Liang Chen, Xubo Qian, Jiao Xi, Hongye Lu, Huiluo Cao, Xiaoya Ma, Bian Bian, Pengfan Zhang, Jiqiu Wu, Ren-You Gan, Baolei Jia, Linyang Sun, Zhicheng Ju, Yunyun Gao, Tao Wen, Tong Chen. 2023. EasyAmplicon: An easy-to-use, open-source, reproducible, and community-based pipeline for amplicon data analysis in microbiome research. iMeta 2: e83. https://doi.org/10.1002/imt2.83
Copyright 2016-2024 Defeng Bai baidefeng@caas.cn, Chuang Ma 22720765@stu.ahau.edu.cn, Jiani Xun 15231572937@163.com, Yong-Xin Liu liuyongxin@caas.cn
猜你喜欢
iMeta高引文章 fastp 复杂热图 ggtree 绘图imageGP 网络iNAP
iMeta网页工具 代谢组MetOrigin 美吉云乳酸化预测DeepKla
iMeta综述 肠菌菌群 植物菌群 口腔菌群 蛋白质结构预测
10000+:菌群分析 宝宝与猫狗 梅毒狂想曲 提DNA发Nature
一文读懂:宏基因组 寄生虫益处 进化树 必备技能:提问 搜索 Endnote
16S功能预测 PICRUSt FAPROTAX Bugbase Tax4Fun
生物科普: 肠道细菌 人体上的生命 生命大跃进 细胞暗战 人体奥秘
写在后面
为鼓励读者交流快速解决科研困难,我们建立了“宏基因组”讨论群,己有国内外6000+ 科研人员加入。请添加主编微信meta-genomics带你入群,务必备注“姓名-单位-研究方向-职称/年级”。高级职称请注明身份,另有海内外微生物PI群供大佬合作交流。技术问题寻求帮助,首先阅读《如何优雅的提问》学习解决问题思路,仍未解决群内讨论,问题不私聊,帮助同行。
点击阅读原文