Many packages and tools can draw gene lollipop plots. I collected 10 of them:
1. trackViewer (R package): https://jianhong.github.io/trackViewer/articles/lollipopPlot.html
2. maftools (R package): https://bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/maftools.html
3. GenVisR (R package): https://github.com/griffithlab/GenVisR
4. proteinpaint (web page): https://docs.gdc.cancer.gov/Data_Portal/Users_Guide/proteinpaint_lollipop/
5. G3viz (R package): https://g3viz.github.io/g3viz/, https://github.com/g3viz/g3lollipop.js
6. lolliprot (R package): https://github.com/vaporised/lolliprot
7. lollipops: https://github.com/pbnjay/lollipops, https://joiningdata.com/lollipops/index.html
8. muts-needle-plot: https://github.com/bbglab/muts-needle-plot
9. Mutplot (web page): https://github.com/VivianBailey/Mutplot
10. ggplot2 (R package): https://stackoverflow.com/questions/77473777/adding-branches-to-ggplot-mutation-lollipop-plot
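Today we will learn the first one: trackViewer (R package)
Official tutorial: https://jianhong.github.io/trackViewer/articles/lollipopPlot.html
The input data can be methylation or mutation sites, together with annotation data.
Let's start with a simple lollipop plot: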
# Clear the workspace
rm(list=ls())
library(trackViewer)
# Positions of the four SNPs
SNP <- c(10, 12, 1400, 1402)
# One width-1 range per SNP on chr1
sample.gr <- GRanges("chr1", IRanges(SNP, width=1, names=paste0("snp", SNP)))
# Three feature blocks drawn along the axis
features <- GRanges("chr1", IRanges(c(1, 501, 1001),
                                    width=c(120, 400, 405),
                                    names=paste0("block", 1:3)))
sample.gr
features
lolliplot(sample.gr, features)
This plot shows the four SNP sites on chromosome 1 in the sample.
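To see in numbers what the plot displays, here is a minimal sketch that looks up which block each SNP falls into. It uses findOverlaps() from GenomicRanges, which is not part of the original tutorial; the variable name snp_block is only for illustration.

```r
library(GenomicRanges)

# Same toy data as in the simple example
SNP <- c(10, 12, 1400, 1402)
sample.gr <- GRanges("chr1", IRanges(SNP, width=1, names=paste0("snp", SNP)))
features <- GRanges("chr1", IRanges(c(1, 501, 1001),
                                    width=c(120, 400, 405),
                                    names=paste0("block", 1:3)))

# Find every (SNP, block) pair that overlaps
hits <- findOverlaps(sample.gr, features)

# snp_block is an illustrative name, not something trackViewer needs
snp_block <- data.frame(snp = names(sample.gr)[queryHits(hits)],
                        block = names(features)[subjectHits(hits)])
snp_block
```

With these coordinates, snp10 and snp12 sit in block1 (1-120), while snp1400 and snp1402 sit in block3 (1001-1405).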
Changing colors
Give each block a different color
## More SNPs
SNP <- c(10, 100, 105, 108, 400, 410, 420, 600, 700, 805, 840, 1400, 1402)
sample.gr <- GRanges("chr1", IRanges(SNP, width=1, names=paste0("snp", SNP)))
features <- GRanges("chr1", IRanges(c(1, 501, 1001),
width=c(120, 400, 405),
names=paste0("block", 1:3)))
# Give each block a different fill color
features$fill <- c("#FF8833", "#51C6E6", "#DFA32D")
lolliplot(sample.gr, features)
Change the colors of the lollipops:
sample.gr$color <- sample.int(6, length(SNP), replace=TRUE)
sample.gr$border <- sample(c("gray80", "gray30"), length(SNP), replace=TRUE)
sample.gr$alpha <- sample(100:255, length(SNP), replace = TRUE)/255
sample.gr
# GRanges object with 13 ranges and 3 metadata columns:
# seqnames ranges strand | color border alpha
# <Rle> <IRanges> <Rle> | <integer> <character> <numeric>
# snp10 chr1 10 * | 4 gray30 0.494118
# snp100 chr1 100 * | 3 gray80 0.741176
# snp105 chr1 105 * | 1 gray80 0.470588
# snp108 chr1 108 * | 2 gray30 0.835294
# snp400 chr1 400 * | 5 gray30 0.984314
# ... ... ... ... . ... ... ...
# snp700 chr1 700 * | 3 gray30 0.811765
# snp805 chr1 805 * | 2 gray80 0.450980
# snp840 chr1 840 * | 5 gray30 0.847059
# snp1400 chr1 1400 * | 3 gray30 0.843137
# snp1402 chr1 1402 * | 4 gray30 0.921569
# -------
# seqinfo: 1 sequence from an unspecified genome; no seqlengths
lolliplot(sample.gr, features)
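The colors, borders, and alpha values above come from random sampling, so your values will differ from the printed output on every run. If you want a reproducible figure, one option is to fix the random seed first. This is a small sketch in base R; the seed value 2024 and the names n, color1, and color2 are arbitrary choices for illustration.

```r
n <- 13  # number of SNPs in the example above

# Fix the seed, then sample the colors
set.seed(2024)
color1 <- sample.int(6, n, replace=TRUE)

# Resetting the same seed reproduces exactly the same draw
set.seed(2024)
color2 <- sample.int(6, n, replace=TRUE)

identical(color1, color2)
```

In the tutorial code, calling set.seed() once before the first sample.int() line should be enough to make the whole figure repeatable.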
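Add labels inside the nodes
Node labels can be controlled one by one through metadata columns whose names start with node.label. To style each node label, node.label.gp must be a list that controls the label's style. Alternatively, you can simply use node.label.col, node.label.cex, node.label.fontsize, node.label.fontfamily, node.label.fontface, and node.label.font to assign label properties.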
sample.gr$node.label <- as.character(seq_along(sample.gr))
sample.gr$node.label.col <-
ifelse(sample.gr$alpha>0.5 | sample.gr$color==1, "white", "black")
sample.gr$node.label.cex <- sample.int(3, length(sample.gr), replace = TRUE)/2
# [1] 1.0 1.0 1.0 1.5 0.5 1.0 1.0 1.5 1.0 0.5 1.0 0.5 1.5
# Plot
lolliplot(sample.gr, features)
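The rule used for node.label.col is easier to follow on a tiny hand-made example. The sketch below applies the same ifelse() condition to three made-up nodes; the vectors alpha, color, and label_col are illustrative and not taken from the tutorial data. The apparent intent is that a fairly opaque node (alpha > 0.5) or a node with color 1 (black in R's default palette) gets a white label, and everything else gets a black one.

```r
# Three made-up nodes: semi-transparent, opaque, and semi-transparent but black
alpha <- c(0.30, 0.80, 0.40)
color <- c(2L, 3L, 1L)

# Same condition as in the tutorial code
label_col <- ifelse(alpha > 0.5 | color == 1, "white", "black")
label_col
```

Only the first node fails both conditions, so it is the only one that receives a black label.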
Change the height of the block boxes in the plot
features$height <- c(0.02, 0.05, 0.08)
lolliplot(sample.gr, features)
Plot the SNPs against two transcripts at the same time
features.mul <- rep(features, 2)
features.mul$height[4:6] <- list(unit(1/8, "inches"), unit(0.5, "lines"), unit(.2, "char"))
features.mul$fill <- c("#FF8833", "#F9712A", "#DFA32D", "#51C6E6", "#009DDA", "#4B9CDF")
end(features.mul)[5] <- end(features.mul[5])+50
features.mul$featureLayerID <- paste("tx", rep(1:2, each=length(features)), sep="_")
# names(features.mul) <- paste(features.mul$featureLayerID, rep(1:length(features), 2), sep="_")
## One name per transcript
names(features.mul) <- features.mul$featureLayerID
features.mul
# GRanges object with 6 ranges and 3 metadata columns:
# seqnames ranges strand | fill height featureLayerID
# <Rle> <IRanges> <Rle> | <character> <list> <character>
# tx_1 chr1 1-120 * | #FF8833 0.02 tx_1
# tx_1 chr1 501-900 * | #F9712A 0.05 tx_1
# tx_1 chr1 1001-1405 * | #DFA32D 0.08 tx_1
# tx_2 chr1 1-120 * | #51C6E6 0.125inches tx_2
# tx_2 chr1 501-950 * | #009DDA 0.5lines tx_2
# tx_2 chr1 1001-1405 * | #4B9CDF 0.2char tx_2
# -------
# seqinfo: 1 sequence from an unspecified genome; no seqlengths
# Plot
lolliplot(sample.gr, features.mul)
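Before plotting two layers, it can help to check that the features really are grouped the way you expect. The following sketch rebuilds a simplified features.mul (without the fill and height columns) and counts the blocks per featureLayerID. It only uses GenomicRanges accessors and base R table(); it is a sanity check I added, not a step from the official tutorial.

```r
library(GenomicRanges)

features <- GRanges("chr1", IRanges(c(1, 501, 1001),
                                    width=c(120, 400, 405),
                                    names=paste0("block", 1:3)))

# Duplicate the three blocks so that there is one copy per transcript
features.mul <- rep(features, 2)
# Extend the second block of the second transcript by 50 bp
end(features.mul)[5] <- end(features.mul[5]) + 50
features.mul$featureLayerID <- paste("tx", rep(1:2, each=length(features)), sep="_")

# Count the blocks in each layer
layer_counts <- table(features.mul$featureLayerID)
layer_counts
```

Each of tx_1 and tx_2 should contain three blocks, and the fifth range should now span 501-950, matching the printed GRanges output above.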
Stack multiple SNP nodes at the same position
# Note: the score value is an integer less than 10
sample.gr$score <- sample.int(5, length(sample.gr), replace = TRUE)
lolliplot(sample.gr, features)
# Try a score value greater than 10
sample.gr$score <- sample.int(15, length(sample.gr), replace=TRUE)
sample.gr$node.label <- as.character(sample.gr$score)
sample.gr
lolliplot(sample.gr, features)
# Try a floating-point score
sample.gr$score <- runif(length(sample.gr))*10
sample.gr$node.label <- as.character(round(sample.gr$score, digits = 1))
pdf(file = "test1.pdf",width = 8,height = 4)
lolliplot(sample.gr, features, lollipop_style_switch_limit=15, yaxis=FALSE)
dev.off()
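If the plotting call fails between pdf() and dev.off(), the PDF device stays open and later plots may go to the wrong place. One way to guard against that is a small wrapper that closes the device with on.exit(). The helper save_plot_pdf() below is hypothetical (it is not part of trackViewer), and the demo uses base plot() so it runs without any extra packages.

```r
# Hypothetical helper: open a PDF device, draw, and always close the device
save_plot_pdf <- function(file, plot_fun, width = 8, height = 4) {
  pdf(file = file, width = width, height = height)
  # Runs even if plot_fun() throws an error
  on.exit(dev.off(), add = TRUE)
  plot_fun()
  invisible(file)
}

# Demo with a base R plot written to a temporary file
out <- save_plot_pdf(file.path(tempdir(), "demo.pdf"), function() plot(1:10))
file.exists(out)
```

For the figure above, the same helper could be called with a function that wraps the lolliplot(sample.gr, features, lollipop_style_switch_limit=15, yaxis=FALSE) call.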
There are many more details; take a look at the official site: https://jianhong.github.io/trackViewer/articles/lollipopPlot.html
Apprentice homework
Next time, we will draw the lollipop plot from this paper:
Image source: "TP53 Germline Variations Influence the Predisposition and Prognosis of B-Cell Acute Lymphoblastic Leukemia in Children"