Advanced single-cell transcriptome analysis | An introduction to CellPhoneDB v5

Digest   2024-10-16 16:43   Jiangsu


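CellPhoneDB v5 now supports custom databases, which makes it easier to study species other than human and mouse. The best learning resource for CellPhoneDB is the original paper published in Nature Protocols. Following that protocol, going from installation through a full run to visualizing an example dataset takes only about two hours.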

This protocol typically takes ~2 h to complete, from installation to statistical analysis and visualization, for a dataset of ~10 GB, 10,000 cells and 19 cell types, and using five threads.

This post excerpts parts of "CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes". Later posts will cover installation and how to use the human, mouse, and custom databases.

Introduction

What is cell–cell communication?

Complex extracellular responses start with the binding of a ligand to its cognate receptor and the activation of specific cell signaling pathways. Mapping these ligand–receptor interactions is fundamental to understanding cellular behavior and response to neighboring cells.

With the exponential growth of single-cell RNA sequencing (scRNA-seq)1, it is now possible to measure the expression of ligands and receptors in multiple cell types and systematically decode intercellular communication networks that will ultimately explain tissue function in homeostasis and their alterations in disease.

Identifying ligand–receptor interactions from scRNA-seq requires both the annotation of complex ligand–receptor relationships from the literature and a statistical method that integrates the resource with scRNA-seq data and selects relevant interactions from the dataset.

What is CellphoneDB?

CellphoneDB is a publicly available repository of HUMAN curated receptors, ligands and their interactions paired with a tool to interrogate your own single-cell transcriptomics data (or even bulk transcriptomics data if your samples represent pure populations!).

Features:

1. [Accounts for subunit architecture] In contrast to other repositories, our database takes into account the subunit architecture of both ligands and receptors, representing heteromeric complexes accurately. We include subunit architecture for both ligands and receptors to represent heteromeric complexes accurately (Fig. 1). This is critical, because cell–cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies.

2. [Manually curated] Our repository relies on the use of public resources to annotate receptors and ligands, as well as manual curation of specific families of proteins involved in cell–cell communication.

How it differs from other tools

The majority of these methods use lists of binary ligand–receptor pairs to assign communication between cells, without considering multimeric receptors. Relevant interactions are inferred by filtering on the basis of the expression levels of the ligand and receptor.

A major strength of CellPhoneDB, as compared with most other databases, is that it takes into account the structural composition of ligands and receptors, which is important because ligand–receptor interactions often involve multiple subunits.
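To make the difference between a binary pair list and a subunit-aware database concrete, here is a minimal sketch. It assumes the rule stated in the protocol's Fig. 2 caption, namely that a complex is represented by its least-expressed subunit. The `Partner` class, the `partner_expression` helper, and the expression values are hypothetical and are not part of the CellPhoneDB package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partner:
    """One side of an interaction: a single protein or a multi-subunit complex."""
    name: str
    subunits: tuple  # gene symbols; a simple protein has exactly one entry


def partner_expression(partner, mean_expr):
    """Expression of a partner in one cell state.

    Assumption: the complex is represented by its least-expressed subunit,
    so a subunit that is absent drives the whole complex to 0.
    """
    return min(mean_expr.get(gene, 0.0) for gene in partner.subunits)


# Hypothetical mean expression values in one cell state, for illustration only.
receptor_complex = Partner("TGFbeta_receptor", ("TGFBR1", "TGFBR2"))
cell_state_means = {"TGFBR1": 2.0, "TGFBR2": 0.0}

# A binary ligand-receptor list that only names TGFBR1 sees an expressed receptor.
binary_view = cell_state_means["TGFBR1"]
# The subunit-aware view notices that the second subunit is missing.
complex_view = partner_expression(receptor_complex, cell_state_means)
```

In this toy case the binary view reports the receptor as expressed, while the subunit-aware view reports 0 because the second subunit is missing.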

Future improvements

These also fall under two aspects, the database and the statistical test:

Our database, although comprehensive, is not a complete list of all possible ligand–receptor interactions, and this should be taken into consideration when interpreting cell–cell communication networks, especially the total number of interactions between cell types.

Furthermore, our statistical method prioritizes cell-type-enriched and potentially biologically important interactions that would result in downstream signaling events. Therefore, a non-significant P value does not indicate that the interaction is not present, only that it is not highly specific between two cell types.

Interpreting the results

a, Overview of selected ligand–receptor interactions using CellPhoneDB on the decidua dataset from ref. 5; P values are indicated by circle size; scale is shown below the plot. The means of the average expression level of interacting molecule 1 in cluster 1 and interacting molecule 2 in cluster 2 are indicated by color.

b, Heatmap showing the total number of interactions between cell types in the decidua dataset obtained with CellPhoneDB.
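The numbers behind a heatmap like panel b can be thought of as a simple tally. The sketch below counts, for every ordered pair of cell types, how many interactions have a P value under a cutoff. The data layout, the interaction and cell-type names, and the 0.05 cutoff are assumptions for illustration, not the package's output format.

```python
from collections import Counter


def count_significant(pvalues, alpha=0.05):
    """Count interactions with p < alpha for each (cell_type_a, cell_type_b) pair.

    pvalues: {interaction_name: {(cell_type_a, cell_type_b): p_value}}
    """
    counts = Counter()
    for per_pair in pvalues.values():
        for pair, p in per_pair.items():
            if p < alpha:
                counts[pair] += 1
    return counts


# Hypothetical P values for two interactions across ordered cell-type pairs.
pvalues = {
    "L1_R1": {("A", "B"): 0.001, ("B", "A"): 0.80, ("A", "A"): 1.0},
    "L2_R2": {("A", "B"): 0.020, ("B", "A"): 0.03, ("A", "A"): 0.6},
}
interaction_counts = count_significant(pvalues)
```

Because the database is not exhaustive, these totals are best read as relative rather than absolute numbers, as the authors caution above.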

How it works

Cells with the same cluster annotation are pooled together as a cell state. We derive enriched ligand–receptor interactions between two cell states on the basis of expression of a receptor by one cell state and a ligand by another cell state. For each gene in the cluster, the percentage of cells expressing the gene and the gene expression mean are calculated (Fig. 2b).
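The per-cluster summary described here (fraction of expressing cells and mean expression for each gene) can be sketched in a few lines. This is a toy illustration on plain dictionaries, not the package's implementation. The function name `cluster_stats`, the data layout, and the "expression > 0" definition of an expressing cell are assumptions.

```python
from collections import defaultdict


def cluster_stats(counts, labels):
    """Return {cluster: {gene: (fraction_expressing, mean_expression)}}.

    counts: {cell_id: {gene: expression}}
    labels: {cell_id: cluster}
    """
    cells_by_cluster = defaultdict(list)
    for cell, cluster in labels.items():
        cells_by_cluster[cluster].append(cell)

    genes = sorted({gene for expr in counts.values() for gene in expr})
    stats = {}
    for cluster, cells in cells_by_cluster.items():
        stats[cluster] = {}
        for gene in genes:
            values = [counts[cell].get(gene, 0.0) for cell in cells]
            # Assumption: a cell "expresses" a gene when its value is above 0.
            fraction = sum(v > 0 for v in values) / len(values)
            mean = sum(values) / len(values)
            stats[cluster][gene] = (fraction, mean)
    return stats


# Hypothetical toy data: four cells, two clusters, one ligand and one receptor.
counts = {
    "c1": {"L1": 4.0, "R1": 0.0},
    "c2": {"L1": 2.0, "R1": 0.0},
    "c3": {"L1": 0.0, "R1": 3.0},
    "c4": {"L1": 0.0, "R1": 1.0},
}
labels = {"c1": "A", "c2": "A", "c3": "B", "c4": "B"}
stats = cluster_stats(counts, labels)
```

Here cluster A expresses the ligand L1 in all of its cells with a mean of 3.0, while cluster B expresses the receptor R1 in all of its cells with a mean of 2.0.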

We consider the expression levels of ligands and receptors within each cell state and use empirical shuffling to calculate which ligand–receptor pairs display significant cell-state specificity (Fig. 2c,d). This predicts molecular interactions between cell populations via specific protein complexes and generates potential cell–cell communication networks, which can be visualized using intuitive tables and plots (Fig. 2e). Specificity of the ligand–receptor interaction is important, because some of the ligand–receptor pairs are ubiquitously expressed by the cells in a tissue and therefore are not informative regarding specific communication between particular cell states.

Fig. 2 | Overview of the statistical method framework used to infer ligand–receptor complexes specific to two cell types from single-cell transcriptomics data. a, CellPhoneDB input data consist of a scRNA-seq counts file and celltype annotation. Large datasets can be subsampled using geometric sketching3. b, Enriched receptor–ligand interactions between two cell types are derived on the basis of expression of a receptor by one cell type and a ligand by another cell type. The member of the complex with the minimum average expression is considered for the subsequent statistical analysis. c, We generate a null distribution of the mean of the average ligand and receptor expression in the interacting clusters by randomly permuting the cluster labels of all cells. d, The P value for the likelihood of cell-type specificity of a given receptor–ligand complex is calculated on the basis of the proportion of the means that are as high as or higher than the actual mean. e, Ligand–receptor pairs are ranked on the basis of their total number of significant P values across the cell populations. Visualization of the results using intuitive tables and plots is provided via the web interface. L1, example ligand L1; R1, example receptor R1; Sub., subunit. Adapted from ref. 5, Macmillan Publishers Limited.
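Panels c and d of the caption describe a label-permutation test, which can be sketched as follows. This is a simplified illustration under stated assumptions: it ignores complexes and any expression-percentage filtering, and the function names, toy data, iteration count, and seed are made up for the example rather than taken from the package.

```python
import random


def cluster_mean(expr, labels, gene, cluster):
    """Average expression of `gene` over the cells labelled `cluster`."""
    values = [expr[cell][gene] for cell in expr if labels[cell] == cluster]
    return sum(values) / len(values)


def pair_mean(expr, labels, ligand, receptor, cluster_a, cluster_b):
    """Mean of the ligand average in cluster_a and the receptor average in cluster_b."""
    return (cluster_mean(expr, labels, ligand, cluster_a)
            + cluster_mean(expr, labels, receptor, cluster_b)) / 2


def permutation_pvalue(expr, labels, ligand, receptor, cluster_a, cluster_b,
                       iterations=1000, seed=0):
    """Proportion of label shuffles whose pair mean is >= the observed pair mean."""
    rng = random.Random(seed)
    observed = pair_mean(expr, labels, ligand, receptor, cluster_a, cluster_b)
    cells = list(labels)
    shuffled = [labels[cell] for cell in cells]
    hits = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)  # permute cluster labels; cluster sizes are preserved
        permuted = dict(zip(cells, shuffled))
        if pair_mean(expr, permuted, ligand, receptor, cluster_a, cluster_b) >= observed:
            hits += 1
    return hits / iterations


# Hypothetical toy data: L1/R1 are cluster-specific, L2/R2 are expressed everywhere.
expr, labels = {}, {}
for i in range(10):
    expr[f"a{i}"] = {"L1": 5.0, "R1": 0.0, "L2": 1.0, "R2": 1.0}
    labels[f"a{i}"] = "A"
    expr[f"b{i}"] = {"L1": 0.0, "R1": 5.0, "L2": 1.0, "R2": 1.0}
    labels[f"b{i}"] = "B"

p_specific = permutation_pvalue(expr, labels, "L1", "R1", "A", "B")
p_ubiquitous = permutation_pvalue(expr, labels, "L2", "R2", "A", "B")
```

The ubiquitous pair gets a P value of 1 because every shuffle reproduces the same mean. This is the situation the text describes: the interaction may well be present, but it is not specific to any pair of cell states.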

Versions

v2

Compared to the original CellPhoneDB platform, our updated version, CellPhoneDB v.2.0, has incorporated new features, such as subsampling of the original dataset to enable the fast querying of large datasets (geometric sketching)3 or the visualization of results using intuitive tables, plots and network files that can be directly uploaded into Cytoscape (https://cytoscape.org/). In addition, we now offer users the possibility of using their own list of ligand–receptor interactions through our easy-to-use Python GitHub package.

v5

From the database to the algorithms, v5 differs in the following ways:

  1. New Python package that can be easily executed in Jupyter Notebook and Colab.

  2. A scoring methodology to rank interaction based on the expression specificity of the interacting partners.

  3. A CellSign module to leverage interactions based on the activity of the transcription factor downstream of the receptor. This module is accompanied by a collection of 211 well-described direct receptor–transcription factor relationships.

  4. A new method for querying CellphoneDB results: search_utils.search_analysis_results.

  5. Tutorials to run CellphoneDB

  6. Improved computational efficiency of method 2 (cpdb_statistical_analysis_method).

  7. A new database (cellphonedb-data v5.0) with more manually curated interactions, making up to a total of ~3,000 interactions. This release of the CellphoneDB database has the following main changes:

  • Integrates new manually reviewed interactions with evidenced roles in cell-cell communication.
  • Includes non-protein molecules acting as ligands.
  • For interactions with a demonstrated signalling directionality, partners have been ordered accordingly (ligand is partner A, receptor is partner B).
  • Interactions have been classified within signaling pathways.
  • CellphoneDB no longer imports interactions from external resources. This is to avoid the inclusion of low-confidence interactions.
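Item 2 in the list above mentions ranking interactions by the expression specificity of the two partners. The release notes quoted here do not spell out the formula, so the following is only one plausible sketch of the idea and not the package's actual scoring code: scale each gene's mean expression across cell types to [0, 1] and multiply the scaled ligand and receptor values. All names and numbers are hypothetical.

```python
def minmax_scale(means):
    """Scale a {cell_type: mean_expression} mapping to [0, 1].

    A gene with identical expression everywhere carries no specificity,
    so it is mapped to 0 for every cell type.
    """
    lo, hi = min(means.values()), max(means.values())
    if hi == lo:
        return {cell_type: 0.0 for cell_type in means}
    return {cell_type: (v - lo) / (hi - lo) for cell_type, v in means.items()}


def specificity_scores(ligand_means, receptor_means):
    """Score every (sender, receiver) pair as scaled ligand x scaled receptor."""
    ligand = minmax_scale(ligand_means)
    receptor = minmax_scale(receptor_means)
    return {(a, b): ligand[a] * receptor[b] for a in ligand for b in receptor}


# Hypothetical per-cell-type mean expression of one ligand and one receptor.
ligand_means = {"A": 4.0, "B": 0.0, "C": 2.0}
receptor_means = {"A": 0.0, "B": 3.0, "C": 3.0}
scores = specificity_scores(ligand_means, receptor_means)
```

With these toy values the A-to-B pair scores 1.0, C-to-B scores 0.5, and any pair whose ligand or receptor sits at its minimum scores 0, so pairs with cell-type-restricted partners rise to the top of the ranking.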
