Foreword
A roundup of data analysis tools currently in common use for single-cell and spatial transcriptomics, using 10x Genomics technology as the example.
Single-cell transcriptomics
Upstream analysis
Cell Ranger (Linux)
https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/installation
Downstream analysis
Comprehensive analysis
Seurat (R) https://satijalab.org/seurat/
Ref: Hao Y, Hao S, Andersen-Nissen E. et al. Integrated analysis of multimodal single-cell data. Cell. 2021 Jun 24;184(13):3573-3587.e29.
Scater (R) https://github.com/jimhester/scater
Scanpy (Python) https://scanpy.readthedocs.io/en/stable/
Ref: Wolf, F., Angerer, P. & Theis, F. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19, 15 (2018).
Ambient RNA removal
SoupX (R) https://github.com/constantAmateur/SoupX
Ref: Young MD, Behjati S. SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience. 2020 Dec 26;9(12):giaa151.
Doublet filtering
DoubletFinder (R) https://github.com/chris-mcginnis-ucsf/DoubletFinder
Ref: McGinnis CS, Murrow LM, Gartner ZJ. DoubletFinder: Doublet Detection in Single-Cell RNA Sequencing Data Using Artificial Nearest Neighbors. Cell Syst. 2019 Apr 24;8(4):329-337.e4.
scDblFinder (R) https://github.com/plger/scDblFinder
Ref: Germain PL, Lun A, Garcia Meixide C et al. Doublet identification in single-cell sequencing data using scDblFinder. F1000Research 2022, 10:979.
Scrublet (Python) https://github.com/swolock/scrublet
Ref: Wolock SL, Lopez R, Klein AM. Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Systems. 2019 Apr;8(4):281-291.e9.
Expression imputation
MAGIC (R, Python) https://github.com/KrishnaswamyLab/MAGIC
Ref: van Dijk D, Sharma R, Nainys J. et al. Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. Cell. 2018 Jul 26;174(3):716-729.e27.
scImpute (R) https://github.com/Vivianstats/scImpute
Ref: Li, W.V., Li, J.J. An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat Commun 9, 997 (2018).
Batch correction
Harmony (R, Python) https://github.com/immunogenomics/harmony
Ref: Korsunsky, I., Millard, N., Fan, J. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16, 1289–1296 (2019).
Liger (R) https://github.com/welch-lab/liger
Ref: Welch JD, Kozareva V, Ferreira A, Vanderburg C, Martin C, Macosko EZ. Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity. Cell. 2019 Jun 13;177(7):1873-1887.e17.
Clustering
scNMF (R) https://github.com/ttriche/scNMF
Ref: Lopez R, Regier J, Cole MB, Jordan MI, Yosef N. Deep generative modeling for single-cell transcriptomics. Nat Methods. 2018 Dec;15(12):1053-1058.
Cell type identification
SingleR (R) https://github.com/dviraran/SingleR
Ref: Aran, D., Looney, A.P., Liu, L. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol 20, 163–172 (2019).
Garnett (R) https://cole-trapnell-lab.github.io/garnett/docs_m3/
Ref: Pliner HA, Shendure J, Trapnell C. Supervised classification enables rapid annotation of cell atlases. Nat Methods. 2019 Oct;16(10):983-986.
cellassign (R) https://github.com/Irrationone/cellassign
Ref: Zhang, A.W., O’Flanagan, C., Chavez, E.A. et al. Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling. Nat Methods 16, 1007–1015 (2019).
celaref (R) https://github.com/MonashBioinformaticsPlatform/celaref
CHETAH (R) https://github.com/jdekanter/CHETAH
Ref: de Kanter JK, Lijnzaad P, Candelli T, Margaritis T, Holstege FCP. CHETAH: a selective, hierarchical cell type identification method for single-cell RNA sequencing. Nucleic Acids Res. 2019 Sep 19;47(16):e95.
Clade relationship identification (tree-based)
tooManyCellsR (R) https://github.com/GregorySchwartz/tooManyCellsR
Ref: Schwartz, G.W., Zhou, Y., Petrovic, J. et al. TooManyCells identifies and visualizes relationships of single-cell clades. Nat Methods 17, 405–413 (2020).
Pseudotime analysis
Monocle2 (R) http://cole-trapnell-lab.github.io/monocle-release/
Ref: Trapnell C, Cacchiarelli D, Grimsby J, Pokharel P, Li S, Morse M, Lennon NJ, Livak KJ, Mikkelsen TS, Rinn JL. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol. 2014 Apr;32(4):381-386.
Monocle3 (R) https://cole-trapnell-lab.github.io/monocle3/
Palantir (Python) https://github.com/dpeerlab/Palantir/
Ref: Setty, M., Kiseliovas, V., Levine, J. et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat Biotechnol 37, 451–460 (2019).
PAGA (Python) https://github.com/theislab/paga
Ref: Wolf, F.A., Hamey, F.K., Plass, M. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol 20, 59 (2019).
Slingshot (R, Python) https://github.com/kstreet13/slingshot
Ref: Street, K., Risso, D., Fletcher, R. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).
RNA velocity
velocyto (R, Python) http://velocyto.org/
Ref: La Manno, G., Soldatov, R., Zeisel, A. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
scVelo (Python) https://github.com/theislab/scvelo
Ref: Bergen, V., Lange, M., Peidli, S. et al. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat Biotechnol 38, 1408–1414 (2020).
Transcriptional regulatory networks
SCENIC (R, Python) https://scenic.aertslab.org/
Ref: Aibar, S., González-Blas, C., Moerman, T. et al. SCENIC: single-cell regulatory network inference and clustering. Nat Methods 14, 1083–1086 (2017).
DoRothEA (R) https://github.com/saezlab/DoRothEA
Ref: Holland, C.H., Tanevski, J., Perales-Patón, J. et al. Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data. Genome Biol 21, 36 (2020).
hdWGCNA (R) https://smorabit.github.io/hdWGCNA/
Ref: Morabito, S., Miyoshi, E., Michael, N. et al. Single-nucleus chromatin accessibility and transcriptomic characterization of Alzheimer’s disease. Nat Genet 53, 1143–1155 (2021).
scWGCNA (R) https://github.com/CFeregrino/scWGCNA
Ref: Feregrino C, Tschopp P. Assessing evolutionary and developmental transcriptome dynamics in homologous cell types. Dev Dyn. 2022 Sep;251(9):1472-1489.
Cell-cell communication
CellPhoneDB (Python) https://pypi.org/project/CellphoneDB/
Ref: Efremova, M., Vento-Tormo, M., Teichmann, S.A. et al. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat Protoc 15, 1484–1506 (2020).
CellChat (R) https://github.com/sqjin/CellChat
Ref: Jin, S., Guerrero-Juarez, C.F., Zhang, L. et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun 12, 1088 (2021).
Celltalker (R) https://github.com/arc85/celltalker
NicheNet (R) https://nichenet.be/
Ref: Browaeys, R., Saelens, W. & Saeys, Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 17, 159–162 (2020).
iTALK (R) https://github.com/Coolgenome/iTALK
Ref: Wang, Y. et al. iTALK: an R Package to Characterize and Illustrate Intercellular Communication. bioRxiv (2019).
CSOmap (R) https://github.com/zhongguojie1998/CSOmap
Ref: Ren X, Zhong G, Zhang Q, Zhang L, Sun Y, Zhang Z. Reconstruction of cell spatial organization from single-cell RNA sequencing data based on ligand-receptor mediated self-assembly. Cell Res. 2020 Sep;30(9):763-778.
Functional enrichment
irGSEA (R) https://github.com/chuiqin/irGSEA
Metabolic analysis
Compass (Python) https://github.com/YosefLab/Compass
Ref: Wagner A, Wang C, Fessler J. et al. Metabolic modeling of single Th17 cells reveals regulators of autoimmunity. Cell. 2021 Aug 5;184(16):4168-4185.e21.
scFEA (Python) https://github.com/changwn/scFEA
Ref: Alghamdi N, Chang W, Dang P. et al. A graph neural network model to estimate cell-wise metabolic flux using single-cell RNA-seq data. Genome Res. 2021 Oct;31(10):1867-1884.
scMetabolism (R) https://github.com/wu-yc/scMetabolism
Ref: Wu Y, Yang S, Ma J. et al. Spatiotemporal Immune Landscape of Colorectal Cancer Liver Metastasis at Single-Cell Level. Cancer Discov. 2022 Jan;12(1):134-153.
Aging analysis
scAge (Python) https://github.com/alex-trapp/scAge
Ref: Trapp, A., Kerepesi, C. & Gladyshev, V.N. Profiling epigenetic age in single cells. Nat Aging 1, 1189–1201 (2021).
Stemness analysis
CytoTRACE (R) https://cytotrace.stanford.edu/
Ref: Gulati GS, Sikandar SS, Wesche DJ. et al. Single-cell transcriptional diversity is a hallmark of developmental potential. Science. 2020 Jan 24;367(6476):405-411.
Copy number variation
InferCNV (R, Python) https://github.com/broadinstitute/inferCNV/wiki
Ref: Patel AP, Tirosh I, Trombetta JJ. et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science. 2014 Jun 20;344(6190):1396-401.
CopyKAT (R) https://github.com/navinlabcode/copykat
Ref: Gao R, Bai S, Henderson YC. et al. Delineating copy number and clonal substructure in human tumors from single-cell transcriptomes. Nat Biotechnol. 2021 May;39(5):599-608.
Souporcell (Python) https://github.com/wheaton5/souporcell
Ref: Heaton H, Talman AM, Knights A, Imaz M, Gaffney DJ, Durbin R, Hemberg M, Lawniczak MKN. Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes. Nat Methods. 2020 Jun;17(6):615-620.
Variant site analysis
scSNV (Python) https://github.com/GWW/scsnv
Ref: Wilson, G.W., Derouet, M., Darling, G.E. et al. scSNV: accurate dscRNA-seq SNV co-expression analysis using duplicate tag collapsing. Genome Biol 22, 144 (2021).
Fusion gene analysis
scFusion (Python) https://github.com/ZijieJin/scFusion
Ref: Jin, Z., Huang, W., Shen, N. et al. Single-cell gene fusion detection by scFusion. Nat Commun 13, 1084 (2022).
Alternative splicing
MARVEL (R) https://github.com/wenweixiong/MARVEL
Ref: Wen WX, Mead AJ, Thongjuea S. MARVEL: an integrated alternative splicing analysis platform for single-cell RNA sequencing data. Nucleic Acids Res. 2023 Mar 21;51(5):e29.
3' UTR analysis
SCAPTURE (Python) https://github.com/YangLab/SCAPTURE
Ref: Li GW, Nan F, Yuan GH, Liu CX, Liu X, Chen LL, Tian B, Yang L. SCAPTURE: a deep learning-embedded pipeline that captures polyadenylation information from 3' tag-based RNA-seq of single cells. Genome Biol. 2021 Aug 10;22(1):221.
Transposable element analysis
scTE (Python) https://github.com/JiekaiLab/scTE
Ref: He, J., Babarinde, I.A., Sun, L. et al. Identifying transposable element expression dynamics and heterogeneity during development at the single-cell level with a processing pipeline scTE. Nat Commun 12, 1456 (2021).
L1 mRNA analysis
SCIFER (R) https://github.com/rodrigarc/scifer
Ref: Stow, E.C., Baddoo, M., LaRosa, A.J. et al. SCIFER: approach for analysis of LINE-1 mRNA expression in single cells at a single locus resolution. Mobile DNA 13, 21 (2022).
Deconvolution and integration with bulk transcriptomes
Scissor (R) https://github.com/sunduanchen/Scissor
Ref: Sun, D., Guan, X., Moran, A.E. et al. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat Biotechnol 40, 527–538 (2022).
CIBERSORTx https://cibersortx.stanford.edu/
MuSiC2 (R) https://github.com/Jiaxin-Fan/MuSiC2
Ref: Fan J, Lyu Y, Zhang Q, Wang X, Li M, Xiao R. MuSiC2: cell-type deconvolution for multi-condition bulk RNA-seq data. Brief Bioinform. 2022 Nov 19;23(6):bbac430.
Metacell analysis
MetaCell (R) https://github.com/tanaylab/metacell
Ref: Baran, Y., Bercovich, A., Sebe-Pedros, A. et al. MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions. Genome Biol 20, 206 (2019).
SuperCell (R) https://github.com/GfellerLab/SuperCell
Ref: Bilous M, Tran L, Cianciaruso C, Gabriel A, Michel H, Carmona SJ, Pittet MJ, Gfeller D. Metacells untangle large and complex single-cell transcriptome networks. BMC Bioinformatics. 2022 Aug 13;23(1):336.
HLA (human leukocyte antigen) analysis
scHLAcount (Python) https://github.com/10XGenomics/scHLAcount
Ref: Darby CA, Stubbington MJT, Marks PJ, Martínez Barrio Á, Fiddes IT. scHLAcount: allele-specific HLA expression from single-cell gene expression data. Bioinformatics. 2020 Jun 1;36(12):3905-3906.
Drug sensitivity analysis
Beyondcell (R) https://github.com/cnio-bu/beyondcell
Ref: Fustero-Torre, C., Jiménez-Santos, M.J., García-Martín, S. et al. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med 13, 187 (2021).
Integration with GWAS
scDRS (Python) https://github.com/martinjzhang/scDRS
Ref: Zhang, M.J., Hou, K., Dey, K.K. et al. Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nat Genet 54, 1572–1580 (2022).
GSSG (R) https://github.com/kkdey/GSSG
Ref: Jagadeesh, K.A., Dey, K.K., Montoro, D.T. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat Genet 54, 1479–1492 (2022).
Immune repertoire reconstruction
TRUST4 (Linux) https://github.com/liulab-dfci/TRUST4
Ref: Song, L., Cohen, D., Ouyang, Z. et al. TRUST4: immune repertoire reconstruction from bulk and single-cell RNA-seq data. Nat Methods 18, 627–630 (2021).
Protein expression matrix
metaVIPER (R, Python)
https://www.bioconductor.org/packages/release/bioc/html/viper.html http://califano.c2b2.columbia.edu/aracne
Ref: Ding, H., Douglass, E.F., Sonabend, A.M. et al. Quantitative assessment of protein activity in orphan tissues and single cells using the metaVIPER algorithm. Nat Commun 9, 1471 (2018).
PISCES (R) https://github.com/califano-lab/PISCES/
Ref: Vlahos, L.J., Obradovic, A., Worley, J., Tan, X., Howe, A., Laise, P., Wang, A.L., Drake, C.G., & Califano, A. (2023). Systematic, Protein Activity-based Characterization of Single Cell State. bioRxiv.
Spatial transcriptomics
Upstream analysis
Space Ranger (Linux)
https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/installation
Downstream analysis
Comprehensive analysis
Giotto (R, Python) https://rubd.github.io/Giotto_site/
Ref: Dries R, Zhu Q, Dong R. et al. Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biol. 2021 Mar 8;22(1):78.
STUtility (R) https://github.com/jbergenstrahle/STUtility/
Ref: Bergenstråhle, J., Larsson, L. & Lundeberg, J. Seamless integration of image and molecular analysis for spatial transcriptomics workflows. BMC Genomics 21, 482 (2020).
Squidpy (Python) https://github.com/scverse/squidpy
Ref: Palla, G., Spitzer, H., Klein, M. et al. Squidpy: a scalable framework for spatial omics analysis. Nat Methods 19, 171–178 (2022).
Clustering analysis
BayesSpace (R) https://github.com/edward130603/BayesSpace
Ref: Zhao, E., Stone, M.R., Ren, X. et al. Spatial transcriptomics at subspot resolution with BayesSpace. Nat Biotechnol 39, 1375–1384 (2021).
SpatialCPie (R) https://github.com/jbergenstrahle/SpatialCPie/
Ref: Bergenstråhle, J., Bergenstråhle, L. & Lundeberg, J. SpatialCPie: an R/Bioconductor package for spatial transcriptomics cluster evaluation. BMC Bioinformatics 21, 161 (2020).
SRTsim (R) https://github.com/xzhoulab/SRTsim
Ref: Zhu, J., Shang, L. & Zhou, X. SRTsim: spatial pattern preserving simulations for spatially resolved transcriptomics. Genome Biol 24, 39 (2023).
Spatially variable genes
SpatialDE (Python) https://github.com/Teichlab/SpatialDE
Ref: Svensson, V., Teichmann, S. & Stegle, O. SpatialDE: identification of spatially variable genes. Nat Methods 15, 343–346 (2018).
SOMDE (Python) https://github.com/WhirlFirst/somde
Ref: Hao M, Hua K, Zhang X. SOMDE: A scalable method for identifying spatially variable genes with self-organizing map. Bioinformatics. 2021 Jun 24:btab471.
scGCO (Python) https://github.com/WangPeng-Lab/scGCO
Ref: Zhang, K., Feng, W. & Wang, P. Identification of spatially variable genes with graph cuts. Nat Commun 13, 5488 (2022).
trendsceek (R) https://github.com/edsgard/trendsceek
Ref: Edsgärd, D., Johnsson, P. & Sandberg, R. Identification of spatial expression trends in single-cell gene expression data. Nat Methods 15, 339–342 (2018).
SPARK (R) https://github.com/xzhoulab/SPARK
Ref: Sun, S., Zhu, J. & Zhou, X. Statistical analysis of spatial expression patterns for spatially resolved transcriptomic studies. Nat Methods 17, 193–200 (2020).
Tissue segmentation
Baysor (Julia) https://github.com/kharchenkolab/Baysor
Ref: Petukhov V, Xu RJ, Soldatov RA, Cadinu P, Khodosevich K, Moffitt JR, Kharchenko PV. Cell segmentation in imaging-based spatial transcriptomics. Nat Biotechnol. 2022 Mar;40(3):345-354.
SPATA (R) https://themilolab.github.io/SPATA2/
Ref: Jan Kueckelhaus, Jasmin von Ehr, Vidhya M. Ravi et al. Inferring spatially transient gene expression pattern from spatial transcriptomic studies. bioRxiv (2020).
Cell type identification and deconvolution
SPOTlight (R) https://github.com/MarcElosua/SPOTlight
Ref: Elosua-Bayes M, Nieto P, Mereu E, Gut I, Heyn H (2021): SPOTlight: seeded NMF regression to deconvolute spatial transcriptomics spots with single-cell transcriptomes. Nucleic Acids Res 49(9):e50.
Cell2location (Python) https://github.com/BayraktarLab/cell2location
Ref: Kleshchevnikov, V., Shmatko, A., Dann, E. et al. Cell2location maps fine-grained cell types in spatial transcriptomics. Nat Biotechnol 40, 661–671 (2022).
CellTrek (R) https://github.com/navinlabcode/CellTrek
Ref: Wei, R., He, S., Bai, S. et al. Spatial charting of single-cell transcriptomes in tissues. Nat Biotechnol 40, 1190–1199 (2022).
RCTD (R) https://github.com/dmcable/RCTD
Ref: Cable DM, Murray E, Zou LS, Goeva A, Macosko EZ, Chen F, Irizarry RA. Robust decomposition of cell type mixtures in spatial transcriptomics. Nat Biotechnol. 2022 Apr;40(4):517-526.
stereoscope (Python) https://github.com/almaan/stereoscope
Ref: Andersson, A., Bergenstråhle, J., Asp, M. et al. Single-cell and spatial transcriptomics enables probabilistic inference of cell type topography. Commun Biol 3, 565 (2020).
DSTG (Python) https://github.com/Su-informatics-lab/DSTG
Ref: Song Q, Su J. DSTG: deconvoluting spatial transcriptomics data through graph-based artificial intelligence. Brief Bioinform. 2021 Sep 2;22(5):bbaa414.
GraphST (Python) https://github.com/JinmiaoChenLab/GraphST
Ref: Long, Y., Ang, K.S., Li, M. et al. Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST. Nat Commun 14, 1155 (2023).
Tangram (Python) https://github.com/broadinstitute/Tangram
Ref: Biancalani, T., Scalia, G., Buffoni, L. et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat Methods 18, 1352–1362 (2021).
STRIDE (Python) https://github.com/wanglabtongji/STRIDE
Ref: Sun D, Liu Z, Li T, Wu Q, Wang C. STRIDE: accurately decomposing and integrating spatial transcriptomics using single-cell RNA sequencing. Nucleic Acids Res. 2022 Apr 22;50(7):e42.
Cell trajectory inference
stLearn (Python) https://github.com/BiomedicalMachineLearning/stLearn
Ref: Pham et al. stLearn: integrating spatial location, tissue morphology and gene expression to find cell types, cell-cell interactions and spatial trajectories within undissociated tissues. bioRxiv (2020).
Stemness analysis
CytoSPACE (Python) https://github.com/digitalcytometry/cytospace
Ref: Vahid, M.R., Brown, E.L., Steen, C.B. et al. High-resolution alignment of single-cell and spatial transcriptomes with CytoSPACE. Nat Biotechnol (2023).
Neighborhood analysis
PRECAST (R) https://github.com/feiyoung/PRECAST
Ref: Liu, W., Liao, X., Luo, Z. et al. Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST. Nat Commun 14, 296 (2023).
Cell-cell communication
GCNG (Python) https://github.com/xiaoyeye/GCNG
Ref: Yuan, Y., Bar-Joseph, Z. GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics data. Genome Biol 21, 300 (2020).
SpaOTsc (Python) https://github.com/zcang/SpaOTsc
Ref: Cang, Z., Nie, Q. Inferring spatial and signaling relationships between cells from single cell transcriptomic data. Nat Commun 11, 2084 (2020).
MISTyR (R) https://github.com/saezlab/mistyR
Ref: Tanevski, J., Flores, R.O.R., Gabor, A. et al. Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biol 23, 97 (2022).