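Conclusion: of course you shouldn't merge blindly; several factors need to be weighed. One workable option is to remove the TCR/BCR-related genes first and then run the merged analysis.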
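Where the question came from
I recently ran into an interesting problem: a friend's single-cell dataset was sequenced with additional TCR/BCR libraries, and they want to merge it with a public single-cell dataset for joint analysis. The public dataset, however, has no TCR/BCR sequencing. Can the two still be integrated with harmony?
My first reaction was that this isn't a great idea, but my friend still wanted to merge them.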
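My plan
My original solution: from my friend's raw data, use only the scRNA-seq library (dropping the BCR/TCR libraries) and rerun the cellranger pipeline.
Then I happened to come across a paper whose approach also looks good, and is quicker and more convenient.
Don't be put off by the journal's low impact factor; the work is quite interesting. The corresponding author is Swedish.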
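Corresponding author: Inga-Lill Mårtensson, Department of Rheumatology and Inflammation Research, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden. Tel: +46(0)703640068; email: (email address).
Looking at the corresponding author's earlier publications, this appears to be a group that does careful research.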
The paper's solution
Use regular expressions to remove the BCR and TCR genes separately
BCR-genes were removed from the count data using regular expression commands with the following patterns: ‘IG[HKL]V’, ‘IG[KL]J’, ‘IG[KL]C’, ‘IGH[ADEGM]’.
TCR-genes were removed from the count data using a regular expression command with the following pattern: ‘^TR[ABDG][VJC]’. A cluster containing cells positive for CD3E, CD19 and CD14 was discarded.
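To make the quoted patterns concrete, here is a minimal Python sketch of the filtering step. It is not the authors' code: the function names, the toy gene list, and the toy count matrix (one row per gene, one column per cell) are all made up for illustration, and I am assuming the patterns are applied as a plain regex search on gene symbols.

```python
import re

# Patterns copied from the quoted methods text.
BCR_PATTERNS = ["IG[HKL]V", "IG[KL]J", "IG[KL]C", "IGH[ADEGM]"]
TCR_PATTERNS = ["^TR[ABDG][VJC]"]

# One combined regex; only the TCR pattern is anchored to the start of the name.
RECEPTOR_RE = re.compile("|".join(BCR_PATTERNS + TCR_PATTERNS))


def is_receptor_gene(gene):
    """Return True if the gene symbol matches any BCR/TCR pattern."""
    return RECEPTOR_RE.search(gene) is not None


def drop_receptor_genes(genes, counts):
    """Remove matching genes and their rows from a gene-by-cell count table.

    genes  : list of gene symbols
    counts : list of rows, counts[i] holds the per-cell counts of genes[i]
    """
    keep = [i for i, g in enumerate(genes) if not is_receptor_gene(g)]
    return [genes[i] for i in keep], [counts[i] for i in keep]


# Toy data (hypothetical): 8 genes x 2 cells.
genes = ["CD3E", "IGHV3-23", "IGKC", "TRAC", "TRBV7-2", "MS4A1", "IGHM", "CD14"]
counts = [[4, 0], [0, 9], [0, 30], [6, 0], [2, 0], [0, 5], [0, 12], [1, 1]]

kept_genes, kept_counts = drop_receptor_genes(genes, counts)
print(kept_genes)
```

One thing worth checking on real data: because the BCR patterns are unanchored, a symbol such as IGHMBP2 also matches ‘IGH[ADEGM]’, so it may be sensible to inspect the list of removed genes before running the merged analysis.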
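The paper's main conclusions
Of course, the paper is not trying to answer the question I started with. Its main point is that BCR/TCR genes affect clustering results.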
the genes that encode B-cell antigen receptors interfere with the process of unsupervised clustering, as well as the down-stream analyses of these cells.
the genes that encode T-cell receptors interfere with the unsupervised clustering of these cells.
This interference is likely due to the high frequencies of B- and T-cell receptor genes among the genes that account for most of the variance in each of the PCs used for the clustering.
The effects of the B-cell and T-cell receptor genes are abrogated upon their exclusion before clustering is undertaken.
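A rough way to see this effect in your own data is to check how many of the top-loading genes of each PC are receptor genes. The sketch below assumes you have already exported the top genes per PC from whatever toolkit you use; the gene lists here are invented for illustration, and `receptor_fraction` is a hypothetical helper, not something from the paper.

```python
import re

# Same patterns as quoted in the paper's methods, combined into one regex.
RECEPTOR_RE = re.compile(r"IG[HKL]V|IG[KL]J|IG[KL]C|IGH[ADEGM]|^TR[ABDG][VJC]")


def receptor_fraction(top_genes):
    """Fraction of gene symbols in top_genes that match a BCR/TCR pattern."""
    if not top_genes:
        return 0.0
    hits = sum(1 for g in top_genes if RECEPTOR_RE.search(g))
    return hits / len(top_genes)


# Hypothetical top-loading genes for two PCs (made-up example, not real results).
top_genes_per_pc = {
    "PC1": ["IGKC", "IGHG1", "IGLC2", "JCHAIN"],
    "PC2": ["CD3E", "TRAC", "NKG7", "GNLY"],
}

fractions = {pc: receptor_fraction(g) for pc, g in top_genes_per_pc.items()}
for pc, frac in fractions.items():
    print(pc, frac)
```

If the fractions are high before filtering and drop to zero after removing the receptor genes and recomputing the PCs, that is consistent with what the paper describes, though this check is only a quick sanity test.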
Happily, the authors have made their code available
Code availability
All custom codes used for data processing and computational analyses are available under project BCR_TCR_Interference at https://github.com/MartenssonLab/.
Criticism and corrections are welcome
By 生信小博士