A Pretty Multi-Group Volcano Plot for Single-Cell Data

Technology   2024-11-07 12:10   Guangdong
Today is day 1018 of 生信星球 (Bioinformatics Planet) keeping you company

   

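I recently came across a really good single-cell paper that also works well as teaching material, published in a journal with an impact factor of about 16.

"Single-cell transcriptome analysis reveals the association between histone lactylation and cisplatin resistance in bladder cancer"

Its Fig. 1B shows the marker genes of every cell type so clearly that I was hooked. I reproduced it here, and the code should work on any annotated single-cell dataset. Help yourself.

The input is a Seurat object whose cell types have already been annotated; nothing else is needed. Everything below is plain code.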

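rm(list = ls())   # start from an empty workspace
library(Seurat)
library(dplyr)
library(patchwork)
library(ggplot2)
load("sce.Rdata")   # provides the annotated Seurat object `sce`
scRNA = sce
# Store the active identities (the cell type annotation) in the metadata
scRNA@meta.data$celltype = Idents(scRNA)
ctys = levels(scRNA)   # cell type names, in identity-level order
ctys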
## [1] "naive B"     "CD8 T"       "Naive CD4 T" "plasma B"    "CD14+ Mono" 
## [6] "endothelial" "Fibroblasts" "NK" "DC"
scRNA.markers <- FindAllMarkers(scRNA, min.pct = 0.25,
                                logfc.threshold = 0.25)
head(scRNA.markers)
##                  p_val avg_log2FC pct.1 pct.2     p_val_adj cluster     gene
## CD79A     0.000000e+00   5.356611 0.915 0.060  0.000000e+00 naive B    CD79A
## BANK1     0.000000e+00   6.778892 0.853 0.026  0.000000e+00 naive B    BANK1
## MS4A1     0.000000e+00   5.677155 0.848 0.050  0.000000e+00 naive B    MS4A1
## HLA-DRA  1.482079e-306   2.775984 0.994 0.336 3.060197e-302 naive B  HLA-DRA
## HLA-DQB1 7.341732e-295   2.622558 0.926 0.207 1.515921e-290 naive B HLA-DQB1
## HLA-DQA1 5.404715e-292   2.727477 0.865 0.138 1.115966e-287 naive B HLA-DQA1
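colnames(scRNA.markers)[colnames(scRNA.markers) == "cluster"] = "celltype"   # rename by name, not by position
# Keep only markers with an adjusted p-value below 0.05
k = scRNA.markers$p_val_adj < 0.05; table(k)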
## k
## FALSE TRUE
## 3960 8946
scRNA.markers = scRNA.markers[k,]
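
A note on the `p_val_adj` column used in this filter: as far as I understand the Seurat documentation, it is a Bonferroni correction of `p_val` over all genes in the dataset. The base-R sketch below reproduces the three non-zero adjusted values printed above. The gene count of 20648 is my own inference from the ratio of the printed numbers; the post does not state it.

```r
# p_val values copied from the head(scRNA.markers) output above
p_val <- c(1.482079e-306, 7.341732e-295, 5.404715e-292)

# Assumption: number of genes in the dataset, inferred from p_val_adj / p_val
n_genes <- 20648

# Bonferroni: multiply each p-value by the number of tests and cap at 1
p_adj <- p.adjust(p_val, method = "bonferroni", n = n_genes)
print(p_adj)

# The same threshold as in the script
keep <- p_adj < 0.05
print(keep)
```

Because Bonferroni is conservative, raising or lowering the 0.05 cutoff mostly changes how many weak markers end up in the grey bars, not which genes get labeled.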

# Label up- and down-regulated genes
scRNA.markers$label <- ifelse(scRNA.markers$avg_log2FC < 0, "sigDown", "sigUp")
# Per cell type: the 10 genes with the largest and the 10 with the smallest log2FC
topgene <- scRNA.markers %>%
  group_by(celltype) %>%
  top_n(n = 10, wt = avg_log2FC) %>%
  bind_rows(group_by(scRNA.markers, celltype) %>%
              top_n(n = 10, wt = -avg_log2FC))
head(topgene)
## # A tibble: 6 × 8
## # Groups:   celltype [1]
##       p_val avg_log2FC pct.1 pct.2 p_val_adj celltype gene      label
##       <dbl>      <dbl> <dbl> <dbl>     <dbl> <fct>    <chr>     <chr>
## 1 0               6.78 0.853 0.026 0         naive B  BANK1     sigUp
## 2 3.13e-250       6.98 0.498 0.011 6.47e-246 naive B  LINC00926 sigUp
## 3 2.59e-207       6.02 0.423 0.01  5.34e-203 naive B  CD24      sigUp
## 4 1.42e-204       6.90 0.411 0.008 2.92e-200 naive B  LINC02397 sigUp
## 5 1.15e-196       7.13 0.403 0.01  2.38e-192 naive B  IGHD      sigUp
## 6 5.77e-166       7.47 0.335 0.006 1.19e-161 naive B  PAX5      sigUp
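
To make the top-gene selection easier to follow, here is a minimal base-R sketch of the same idea on made-up data: for each cell type, take the n largest and the n smallest log2FC values and stack the two selections. The data frame `toy` and the helper `top_by_group` are hypothetical names for illustration, not part of the original script, and ties are handled more simply here than in `top_n`.

```r
# Made-up marker table: two cell types, five genes each
toy <- data.frame(
  celltype   = rep(c("A", "B"), each = 5),
  gene       = paste0("g", 1:10),
  avg_log2FC = c(3.1, -2.0, 0.8, 5.4, -4.2, 1.5, 2.5, -0.7, -3.3, 6.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the first n rows per cell type after sorting by log2FC
top_by_group <- function(df, n, decreasing) {
  pieces <- lapply(split(df, df$celltype), function(d) {
    head(d[order(d$avg_log2FC, decreasing = decreasing), ], n)
  })
  do.call(rbind, pieces)
}

# Top 2 up-regulated, then top 2 down-regulated genes per cell type
toy_top <- rbind(top_by_group(toy, 2, TRUE), top_by_group(toy, 2, FALSE))
print(toy_top)
```

One thing to keep in mind: if a cell type has fewer than 2n significant genes, the two selections overlap and the same gene is labeled twice, so removing duplicate rows before plotting may be worthwhile.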
# Set the length of the background bars from each cell type's log2FC range:
dfbar = scRNA.markers %>%
  group_by(celltype) %>%
  summarise(low = round(min(avg_log2FC) - 0.5),
            up = round(max(avg_log2FC) + 0.5))
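
The `round(min - 0.5)` and `round(max + 0.5)` expressions push the bar ends outward to whole numbers so that every point sits inside its grey bar. A small sketch with made-up log2FC values for one cell type shows the effect; for non-integer extremes like these it gives the same result as `floor()` and `ceiling()`.

```r
# Made-up log2FC values for a single cell type
fc <- c(-3.2, -0.4, 1.9, 7.47)

# Same expressions as in the script
low <- round(min(fc) - 0.5)
up  <- round(max(fc) + 0.5)
print(c(low = low, up = up))

# For comparison: rounding the extremes outward directly
print(c(low = floor(min(fc)), up = ceiling(max(fc))))
```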

# Draw the background bars and the jittered points:
p1 <- ggplot() +
  geom_col(aes(x = celltype, y = low), dfbar,
           fill = "#dcdcdc", alpha = 0.6) +
  geom_col(aes(x = celltype, y = up), dfbar,
           fill = "#dcdcdc", alpha = 0.6) +
  geom_jitter(aes(x = celltype, y = avg_log2FC, color = label), scRNA.markers,
              width = 0.4, size = 1) +
  scale_color_manual(values = c("#0077c0", "#c72d2e")) +
  theme_classic()
p1
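
A side note on the colors: `scale_color_manual()` matches an unnamed `values` vector to the labels in order, which for character data is alphabetical, so blue goes to "sigDown" and red to "sigUp". The base-R sketch below only emulates that matching (it does not call ggplot2) to show why a named vector can be safer if a dataset happens to contain only one of the two labels.

```r
unnamed <- c("#0077c0", "#c72d2e")

# Both labels present: alphabetical order gives sigDown = blue, sigUp = red
lv <- sort(unique(c("sigUp", "sigDown", "sigUp")))
both <- setNames(unnamed[seq_along(lv)], lv)
print(both)

# Only up-regulated genes present: the first color (blue) now lands on sigUp
only_up <- sort(unique(c("sigUp", "sigUp")))
shifted <- setNames(unnamed[seq_along(only_up)], only_up)
print(shifted)

# A named vector pins each color to its label regardless of what is present
named <- c(sigDown = "#0077c0", sigUp = "#c72d2e")
print(named[only_up])
```

Passing `values = c(sigDown = "#0077c0", sigUp = "#c72d2e")` to `scale_color_manual()` would give the same plot here while keeping red for up-regulated genes in every case.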
# Colored tile labels along the x-axis:
library(RColorBrewer)
mycol <- colorRampPalette(rev(brewer.pal(n = 7, name = "Set1")))(length(ctys))
p2 <- p1 +
  geom_tile(aes(x = ctys, y = 0),
            height = 0.5, fill = mycol, show.legend = FALSE) +
  geom_text(aes(x = ctys, y = 0, label = ctys),
            size = 3, fontface = "bold")
p2
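
The tile colors come from `colorRampPalette()`, which returns a function that interpolates between anchor colors and hands back as many colors as requested. That is how a 7-color palette can cover 9 cell types. The sketch below uses three plain anchor colors as a stand-in for the reversed Set1 palette (an assumption made so the example runs without RColorBrewer).

```r
# Stand-in anchor colors (assumption; the script uses rev(brewer.pal(7, "Set1")))
anchors <- c("red", "blue", "green")

# The post has 9 cell types, so 9 colors are needed
n_types <- 9

# colorRampPalette() returns a function; calling it with n gives n hex colors
make_palette <- colorRampPalette(anchors)
mycol_demo <- make_palette(n_types)
print(mycol_demo)
```

With more cell types than anchors, neighboring tiles get interpolated shades, so for many cell types a palette with more distinct anchors may be easier to read.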
library(ggrepel)
# Label the top genes of each cell type and polish the details:
p3 <- p2 +
  geom_text_repel(aes(x = celltype, y = avg_log2FC, label = gene),
                  topgene, size = 3) +
  labs(x = "CellType", y = "Average log2FoldChange",
       title = "Differential expression genes") +
  theme(
    plot.title = element_text(size = 14, color = "black", face = "bold"),
    axis.title = element_text(size = 12, color = "black", face = "bold"),
    axis.line.y = element_line(color = "black", linewidth = 0.8),
    axis.line.x = element_blank(),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    panel.grid = element_blank(),
    # In ggplot2 >= 3.5.0 a numeric legend.position is deprecated; use
    # legend.position = "inside" with legend.position.inside = c(0.98, 0.96)
    legend.position = c(0.98, 0.96),
    legend.background = element_blank(),
    legend.title = element_blank(),
    legend.direction = "vertical",
    legend.justification = c(1, 0),
    legend.text = element_text(size = 12)
  ) +
  guides(color = guide_legend(override.aes = list(size = 4)))
p3
It looks just like the original figure.
