Spatial transcriptomics | CARD: annotating spatial transcriptomics spots with scRNA, and enhancing spatial resolution too?!

Academic   Other   2023-08-16 09:00   Beijing

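Earlier posts covered two ways of annotating spatial transcriptomics spots with the help of single-cell data: Seurat (Spatial transcriptomics | Annotating spots with scRNA (Seurat Mapping) & a bonus (code for the cover figure)) and SPOTlight (Spatial transcriptomics | I, SPOTlight, use deconvolution to annotate spatial transcriptomics spots!). This post introduces CARD, the method from the paper "Spatially informed cell type deconvolution for spatial transcriptomics", published in NBT (Nature Biotechnology) in 2022.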

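CARD is worth covering on top of Seurat and SPOTlight because, besides the algorithmic differences, it supports several extra use cases:

(1) Refined spatial map (increase the resolution of the spatial map)

(2) Reference-free version: CARDfree (annotate spatial spots from gene sets)

(3) Single-cell resolution mapping (infer and plot the spot annotation at single-cell resolution)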

1 Loading the R packages and data


1.1 Installing the R package

Install the CARD package and load it

### Install the CARD package
devtools::install_github('YingMa0107/CARD')
### Load CARD and the other packages
library(CARD)
library(Seurat)
library(patchwork)
library(tidyverse)

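Error 1: if the installation fails with an error about missing packages, install the packages named in the ERROR message and then reinstall CARD.

More R package problems will come up during the analysis; they are handled as they appear.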

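1.2 CARD input data

(1) According to the CARD website, CARD needs both spatial transcriptomics data and single-cell transcriptomics data:

Single-cell data (raw counts expression matrix + an annotation table with one row per cell),

Spatial data (raw counts expression matrix + a location table with one row per spot).

(2) Prepare the CARD inputs following the example data

Load the Brain_ST_scRNA.sBio.Rdata file from the earlier SPOTlight post (the single-cell data are already annotated) and reshape it to look like the example data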

load("Brain_ST_scRNA.sBio.Rdata")

### Counts expression matrix of the spatial data
spatial_count <- Brain_ST@assays$Spatial@counts
spatial_count[1:4,1:4]
#4 x 4 sparse Matrix of class "dgCMatrix"
#            AAACACCAATAACTGC-1 AAACAGCTTTCAGAAG-1 AAACAGGGTCTATATT-1 AAACCGGGTAGGTACC-1
#MIR1302-2HG . . . .
#FAM138A . . . .
#OR4F5 . . . .
#AL627309.1 . . . .

### Spatial location table of the spatial data
spatial_loca <- Brain_ST@images$slice1@coordinates
spatial_location <- spatial_loca[,2:3]
# The columns must be named x and y, otherwise CARD_deconvolution fails later
colnames(spatial_location) <- c("x","y")
spatial_location[1:4,]
#                    x  y
#AAACACCAATAACTGC-1 59 19
#AAACAGCTTTCAGAAG-1 43  9
#AAACAGGGTCTATATT-1 47 13
#AAACCGGGTAGGTACC-1 42 28

### Counts expression matrix of the single-cell data
sc_count <- Brain_scRNA@assays$RNA@counts
sc_count[1:4,1:4]
#4 x 4 sparse Matrix of class "dgCMatrix"
#            AAACACCAATAACTGC-1 AAACAGCTTTCAGAAG-1 AAACAGGGTCTATATT-1 AAACCGGGTAGGTACC-1
#MIR1302-2HG . . . .
#FAM138A . . . .
#OR4F5 . . . .
#AL627309.1 . . . .

### Cell annotation table of the single-cell data
sc_meta <- Brain_scRNA@meta.data %>%
  rownames_to_column("cellID") %>%
  dplyr::select(cellID,orig.ident,celltype) %>%
  mutate(CB = cellID) %>%
  column_to_rownames("CB")
head(sc_meta)
#                               cellID orig.ident   celltype
#AAACACCAATAACTGC-1 AAACACCAATAACTGC-1      scRNA      Tumor
#AAACAGCTTTCAGAAG-1 AAACAGCTTTCAGAAG-1      scRNA      Tumor
#AAACAGGGTCTATATT-1 AAACAGGGTCTATATT-1      scRNA      Tumor
#AAACCGGGTAGGTACC-1 AAACCGGGTAGGTACC-1      scRNA Fibroblast
#AAACCGTTCGTCCAGG-1 AAACCGTTCGTCCAGG-1      scRNA        CD8
#AAACCTCATGAAGTTG-1 AAACCTCATGAAGTTG-1      scRNA Fibroblast

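Note: when reading the location table from Brain_ST@images$slice1@coordinates, check the actual name of the image in your object (slice1 here), and remember to rename the location columns to x and y.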
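Before building the CARD object it can help to check the two points from the note in code. The sketch below uses a hypothetical helper, check_card_inputs (not part of CARD), and toy data in place of the real objects. It assumes that the spot barcodes in the counts matrix and in the location table are expected to agree.

```r
# Hypothetical helper (not part of CARD): collect problems with the
# spatial inputs before they are handed to createCARDObject.
check_card_inputs <- function(spatial_count, spatial_location) {
  problems <- character(0)
  if (!identical(colnames(spatial_location)[1:2], c("x", "y"))) {
    problems <- c(problems, "spatial_location columns must be named x and y")
  }
  # Assumption: the same spots should appear in both inputs
  if (!setequal(colnames(spatial_count), rownames(spatial_location))) {
    problems <- c(problems, "spot barcodes differ between counts and locations")
  }
  problems
}

# Toy data standing in for spatial_count and spatial_location
toy_count <- matrix(0, nrow = 2, ncol = 3,
                    dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))
toy_loc <- data.frame(row = c(59, 43, 47), col = c(19, 9, 13),
                      row.names = c("s1", "s2", "s3"))

bad <- check_card_inputs(toy_count, toy_loc)   # columns are still row/col
colnames(toy_loc) <- c("x", "y")
good <- check_card_inputs(toy_count, toy_loc)  # nothing left to report
print(bad)
print(good)
```

An empty character vector means both checks passed.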

2 CARD deconvolution


2.1 Building the CARD object

With the data above prepared, the CARD object can be built.

CARD_obj = createCARDObject(
  sc_count = sc_count,
  sc_meta = sc_meta,
  spatial_count = spatial_count,
  spatial_location = spatial_location,
  ct.varname = "celltype",               # name of the cell type column in sc_meta
  ct.select = unique(sc_meta$celltype),  # cell types to use
  sample.varname = "orig.ident",
  minCountGene = 100,
  minCountSpot = 5)
## QC on scRNASeq dataset! ...
## QC on spatially-resolved dataset! ..

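The spatial data are stored in CARD_obj@spatial_countMat and CARD_obj@spatial_location, and the scRNA-seq data are stored in CARD_obj@sc_eset in SingleCellExperiment format.

ct.varname is the name of the meta.data column that holds the cell type annotation.

2.2 CARD deconvolution

Run the deconvolution with the CARD_deconvolution function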

CARD_obj = CARD_deconvolution(CARD_object = CARD_obj)
## create reference matrix from scRNASeq...

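Error 2: on the first run you will very likely hit an error about the missing MuSiC package; install it with devtools::install_github('xuranw/MuSiC').

Error 3: the TOAST package will very likely be missing as well; install it with BiocManager::install("TOAST").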

#Error in library(MuSiC) : there is no package called 'MuSiC'
#> devtools::install_github('xuranw/MuSiC')
# or download the archive and install it locally
#remotes::install_local("E:/bioinformation/sc_ST/MuSiC-master.zip",upgrade = F,dependencies = T)
#ERROR: dependency 'TOAST' is not available for package 'MuSiC'
#BiocManager::install("TOAST")

Once these are installed, rerunning the deconvolution works without problems.
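Instead of finding the missing dependencies one error at a time, they can be listed up front. This is only a sketch: missing_packages is a hypothetical helper built on base R's requireNamespace, and the package list simply repeats the names mentioned in this post.

```r
# Hypothetical helper: return the packages from pkgs that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages named in this post; the result depends on the local R library
to_install <- missing_packages(c("CARD", "MuSiC", "TOAST"))
print(to_install)
```

Anything printed here still has to be installed from the right source (GitHub for CARD and MuSiC, Bioconductor for TOAST), as described above.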

CARD_obj = CARD_deconvolution(CARD_object = CARD_obj)
## Select Informative Genes! ...
## Deconvolution Starts! ...
## Deconvolution Finish! ...
print(CARD_obj@Proportion_CARD[1:6,])

This gives the estimated cell type proportions for every spot.
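A common next step is to turn the proportions into one label per spot. The sketch below is not a CARD function; it uses a made-up toy matrix in place of CARD_obj@Proportion_CARD and assumes the layout printed above (spots in rows, cell types in columns, each row summing to about 1).

```r
# Toy proportion matrix standing in for CARD_obj@Proportion_CARD;
# the values are invented for illustration only.
toy_prop <- matrix(c(0.7, 0.2, 0.1,
                     0.1, 0.6, 0.3,
                     0.2, 0.3, 0.5),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(c("spot1", "spot2", "spot3"),
                                   c("Tumor", "Fibroblast", "CD8")))

# Sanity check: proportions within a spot should add up to 1
row_sums <- rowSums(toy_prop)

# Label each spot with its most abundant cell type
dominant <- colnames(toy_prop)[max.col(toy_prop, ties.method = "first")]
names(dominant) <- rownames(toy_prop)
print(dominant)
```

A single label throws away the mixture information, so it is best treated as a quick summary next to the full proportion matrix.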

3 CARD: spot visualization


3.1 Visualizing spot proportions


Use the CARD.visualize.pie function to draw a pie chart of the cell type composition at each spot


colors = c("#FFD92F","#4DAF4A","#FCCDE5","#D9D9D9","#377EB8","#7FC97F","#BEAED4",
           "#FDC086","#FFFF99","#386CB0","#F0027F","#BF5B17","#666666","#1B9E77","#D95F02",
           "#7570B3","#E7298A","#66A61E","#E6AB02","#A6761D")
p1 <- CARD.visualize.pie(proportion = CARD_obj@Proportion_CARD,
                         spatial_location = CARD_obj@spatial_location,
                         colors = colors)
p1

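3.2 Showing the cell types of interest

The focus here is on how each cell type is distributed in space, which is what makes spatial transcriptomics distinctive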

## select the cell type that we are interested
ct.visualize = c("CD4","CD8","NK")
## visualize the spatial distribution of the cell type proportion
p2 <- CARD.visualize.prop(
  proportion = CARD_obj@Proportion_CARD,
  spatial_location = CARD_obj@spatial_location,
  ct.visualize = ct.visualize,                 ### selected cell types to visualize
  colors = c("lightblue","lightyellow","red"), ### if not provide, we will use the default colors
  NumCols = 4)                                 ### number of columns in the figure panel
print(p2)

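3.3 Visualizing two cell types together

Two cell types can be shown on the same plot, which makes their spatial positions easier to compare.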

## visualize the spatial distribution of two cell types on the same plot
p3 = CARD.visualize.prop.2CT(
  proportion = CARD_obj@Proportion_CARD,         ### Cell type proportion estimated by CARD
  spatial_location = CARD_obj@spatial_location,
  ct2.visualize = c("CD4","CD8"),                ### two cell types you want to visualize
  colors = list(c("lightblue","lightyellow","red"),c("lightblue","lightyellow","black")))  ### two color scales
p3

Note: passing three cell types raises an error.
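To fail early with a clearer message, the selection can be checked before plotting. This is a sketch with a hypothetical guard, check_two_cell_types (not part of CARD), run here against a toy proportion matrix with invented values.

```r
# Hypothetical guard (not part of CARD) for the two-cell-type plot
check_two_cell_types <- function(ct2.visualize, proportion) {
  if (length(ct2.visualize) != 2) {
    stop("ct2.visualize must contain exactly two cell types")
  }
  unknown <- setdiff(ct2.visualize, colnames(proportion))
  if (length(unknown) > 0) {
    stop("not found in the proportion matrix: ", paste(unknown, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy proportion matrix; the values are made up for illustration
toy_prop <- matrix(c(0.5, 0.3, 0.2,
                     0.2, 0.2, 0.6),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("spot1", "spot2"), c("CD4", "CD8", "NK")))

ok <- check_two_cell_types(c("CD4", "CD8"), toy_prop)
three <- tryCatch(check_two_cell_types(c("CD4", "CD8", "NK"), toy_prop),
                  error = function(e) conditionMessage(e))
print(three)
```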

3.4 Correlation plot of cell type proportions

p4 <- CARD.visualize.Cor(CARD_obj@Proportion_CARD,colors = NULL) # if not provide, we will use the default colors
p4
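The numbers behind such a plot can also be inspected directly. The sketch below assumes the plot shows pairwise correlations between cell type proportions across spots, and computes them with base R's cor on a toy matrix with invented values; with real data the input would be CARD_obj@Proportion_CARD.

```r
# Toy proportion matrix (4 spots x 3 cell types); the values are made up
toy_prop <- matrix(c(0.7, 0.2, 0.1,
                     0.1, 0.6, 0.3,
                     0.2, 0.3, 0.5,
                     0.5, 0.4, 0.1),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(paste0("spot", 1:4),
                                   c("Tumor", "Fibroblast", "CD8")))

# Pearson correlation between cell types, computed across spots
cor_mat <- cor(toy_prop)
print(round(cor_mat, 2))
```

Negative values suggest cell types that tend to avoid each other in space; positive values suggest co-localization.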

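4 Refining the spatial map


As mentioned at the start, besides annotating spots CARD also provides the CARD.imputation function to refine the spatial map. The result has a higher resolution than the one originally measured, but whether to use the imputed result depends on the situation.

4.1 Imputing a finer grid with CARD.imputation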

#1. Imputation on the newly grided spatial locations
CARD_obj = CARD.imputation(CARD_obj,NumGrids = 2000,ineibor = 10,exclude = NULL)
## Visualize the newly grided spatial locations to see if the shape is correctly detected.
## If not, the user can provide the row names of the excluded spatial location data into the CARD.imputation function
location_imputation = cbind.data.frame(
  x=as.numeric(sapply(strsplit(rownames(CARD_obj@refined_prop),split="x"),"[",1)),
  y=as.numeric(sapply(strsplit(rownames(CARD_obj@refined_prop),split="x"),"[",2)))
rownames(location_imputation) = rownames(CARD_obj@refined_prop)
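The row-name parsing above can be tried out on its own. The sketch below assumes, as the strsplit call implies, that each row name of refined_prop has the form "<x>x<y>"; the toy names are invented for illustration.

```r
# Toy row names in the assumed "<x>x<y>" form
toy_names <- c("4.5x10.2", "4.5x11.2", "5.5x10.2")

# Split each name on the literal "x" and convert both halves to numbers
parts <- strsplit(toy_names, split = "x", fixed = TRUE)
toy_location <- data.frame(
  x = as.numeric(vapply(parts, `[`, character(1), 1)),
  y = as.numeric(vapply(parts, `[`, character(1), 2)),
  row.names = toy_names
)
print(toy_location)
```

Splitting once and reusing the result avoids running strsplit twice over all grid points.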

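NumGrids is user-defined; the larger the value, the finer the imputed grid.

Grid outlines of the spatial locations before and after refinement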

p5 <- ggplot(location_imputation,
             aes(x = x, y = y)) + geom_point(shape=22,color = "#7dc7f5")+
  theme(plot.margin = margin(0.1, 0.1, 0.1, 0.1, "cm"),
        legend.position="bottom",
        panel.background = element_blank(),
        plot.background = element_blank(),
        panel.border = element_rect(colour = "grey89", fill=NA, size=0.5))
p50 <- ggplot(CARD_obj@spatial_location,
              aes(x = x, y = y)) + geom_point(shape=22,color = "#7dc7f5")+
  theme(plot.margin = margin(0.1, 0.1, 0.1, 0.1, "cm"),
        legend.position="bottom",
        panel.background = element_blank(),
        plot.background = element_blank(),
        panel.border = element_rect(colour = "grey89", fill=NA, size=0.5))
p50 + p5

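The right panel shows the imputed result. The resolution improves somewhat, and a larger NumGrids would make the improvement more visible.

4.2 Enhanced resolution: visualizing cell type proportions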

p6 <- CARD.visualize.prop(
  proportion = CARD_obj@refined_prop,
  spatial_location = location_imputation,
  ct.visualize = ct.visualize,
  colors = c("lightblue","lightyellow","red"),
  NumCols = 4)
p6

4.3 Enhanced resolution: visualizing marker genes

# Refined spatial locations
p7 <- CARD.visualize.gene(
  spatial_expression = CARD_obj@refined_expression,
  spatial_location = location_imputation,
  gene.visualize = c("CD3E","CD248","CD14","C1QC"),
  colors = NULL,
  NumCols = 6)
# Original spatial locations
p77 <- CARD.visualize.gene(
  spatial_expression = CARD_obj@spatial_countMat,
  spatial_location = CARD_obj@spatial_location,
  gene.visualize = c("CD3E","CD248","CD14","C1QC"),
  colors = NULL,
  NumCols = 6)
p7 / p77

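That is all for this post. The next one will cover CARD's two other heavy hitters: (1) annotating spots from gene sets without any single-cell data, and (2) plotting the spot annotation at single-cell resolution.

Reference: Example Analysis (yingma0107.github.io)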
