Advanced single-cell transcriptome analysis | Introduction to CellPhoneDB v5

Digest 2024-10-16 16:43 Jiangsu


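CellPhoneDB v5 now supports custom databases, which makes it easier to study species other than human and mouse. The best learning resource for CellPhoneDB is the original paper published in Nature Protocols; following that protocol, going from installation through running to visualizing an example dataset takes only about two hours.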

This protocol typically takes ~2 h to complete, from installation to statistical analysis and visualization, for a dataset of ~10 GB, 10,000 cells and 19 cell types, and using five threads.

This post excerpts parts of "CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes"; later posts will cover installation and how to use the human, mouse, and custom databases.

Introduction

What is cell-cell communication?

Complex extracellular responses start with the binding of a ligand to its cognate receptor and the activation of specific cell signaling pathways. Mapping these ligand–receptor interactions is fundamental to understanding cellular behavior and response to neighboring cells.

With the exponential growth of single-cell RNA sequencing (scRNA-seq)1, it is now possible to measure the expression of ligands and receptors in multiple cell types and systematically decode intercellular communication networks that will ultimately explain tissue function in homeostasis and their alterations in disease.

Identifying ligand–receptor interactions from scRNA-seq requires both the annotation of complex ligand–receptor relationships from the literature and a statistical method that integrates the resource with scRNA-seq data and selects relevant interactions from the dataset.

What is CellphoneDB?

CellphoneDB is a publicly available repository of HUMAN curated receptors, ligands and their interactions paired with a tool to interrogate your own single-cell transcriptomics data (or even bulk transcriptomics data if your samples represent pure populations!).

Features:

1. [Accounts for subunits] In contrast to other repositories, our database takes into account the subunit architecture of both ligands and receptors, representing heteromeric complexes accurately (Fig. 1). This is critical, because cell–cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies.

2. [Manually curated] Our repository relies on the use of public resources to annotate receptors and ligands, as well as manual curation of specific families of proteins involved in cell–cell communication.

Differences from other tools

The majority of these methods use lists of binary ligand–receptor pairs to assign communication between cells, without considering multimeric receptors. Relevant interactions are inferred by filtering on the basis of the expression levels of the ligand and receptor.

A major strength of CellPhoneDB, as compared with most other databases, is that it takes into account the structural composition of ligands and receptors, which is important because ligand–receptor interactions often involve multiple subunits.

Future improvements

These also fall into two areas, the database and the statistical test:

Our database, although comprehensive, is not a complete list of all possible ligand–receptor interactions, and this should be taken into consideration when interpreting cell–cell communication networks, especially the total number of interactions between cell types.

Furthermore, our statistical method prioritizes cell-type-enriched and potentially biologically important interactions that would result in downstream signaling events. Therefore, a non-significant P value does not indicate that the interaction is not present, only that it is not highly specific between two cell types.

Interpreting the results

a, Overview of selected ligand–receptor interactions using CellPhoneDB on the decidua dataset from ref. 5; P values are indicated by circle size; scale is shown below the plot. The means of the average expression level of interacting molecule 1 in cluster 1 and interacting molecule 2 in cluster 2 are indicated by color.

b, Heatmap showing the total number of interactions between cell types in the decidua dataset obtained with CellPhoneDB.

How it works

Cells with the same cluster annotation are pooled together as a cell state. We derive enriched ligand–receptor interactions between two cell states on the basis of expression of a receptor by one cell state and a ligand by another cell state. For each gene in the cluster, the percentage of cells expressing the gene and the gene expression mean are calculated (Fig. 2b).
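The per-cluster summary described above can be sketched in plain Python. This is only an illustration under assumed inputs, not CellPhoneDB's own code: the function name cluster_gene_stats and the dict-based data layout are hypothetical, and a gene is treated as "expressed" in a cell when its value is greater than zero.

```python
from collections import defaultdict


def cluster_gene_stats(counts, labels):
    """Summarize expression per (cluster, gene).

    counts: dict mapping cell id -> dict of gene -> expression value
            (hypothetical layout chosen for this sketch).
    labels: dict mapping cell id -> cluster annotation.
    Returns a dict mapping (cluster, gene) ->
        (fraction of cells expressing the gene, mean expression).
    """
    # Pool cells that share a cluster annotation into one cell state.
    cells_by_cluster = defaultdict(list)
    for cell, cluster in labels.items():
        cells_by_cluster[cluster].append(cell)

    genes = sorted({gene for expr in counts.values() for gene in expr})

    stats = {}
    for cluster, cells in cells_by_cluster.items():
        for gene in genes:
            values = [counts[cell].get(gene, 0.0) for cell in cells]
            # Assumption: "expressing" means a value strictly above zero.
            fraction = sum(v > 0 for v in values) / len(values)
            mean = sum(values) / len(values)
            stats[(cluster, gene)] = (fraction, mean)
    return stats
```

A threshold on the fraction of expressing cells can then be used to drop genes that are detected in too few cells of a cluster before any interaction is tested.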

We consider the expression levels of ligands and receptors within each cell state and use empirical shuffling to calculate which ligand–receptor pairs display significant cell-state specificity (Fig. 2c,d). This predicts molecular interactions between cell populations via specific protein complexes and generates potential cell–cell communication networks, which can be visualized using intuitive tables and plots (Fig. 2e). Specificity of the ligand–receptor interaction is important, because some of the ligand–receptor pairs are ubiquitously expressed by the cells in a tissue and therefore are not informative regarding specific communication between particular cell states.

Fig. 2 | Overview of the statistical method framework used to infer ligand–receptor complexes specific to two cell types from single-cell transcriptomics data. a, CellPhoneDB input data consist of a scRNA-seq counts file and celltype annotation. Large datasets can be subsampled using geometric sketching3. b, Enriched receptor–ligand interactions between two cell types are derived on the basis of expression of a receptor by one cell type and a ligand by another cell type. The member of the complex with the minimum average expression is considered for the subsequent statistical analysis. c, We generate a null distribution of the mean of the average ligand and receptor expression in the interacting clusters by randomly permuting the cluster labels of all cells. d, The P value for the likelihood of cell-type specificity of a given receptor–ligand complex is calculated on the basis of the proportion of the means that are as high as or higher than the actual mean. e, Ligand–receptor pairs are ranked on the basis of their total number of significant P values across the cell populations. Visualization of the results using intuitive tables and plots is provided via the web interface. L1, example ligand L1; R1, example receptor R1; Sub., subunit. Adapted from ref. 5, Macmillan Publishers Limited.
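Steps b to d of the caption can be illustrated with a small label-permutation test. This is a minimal sketch under stated assumptions, not the CellPhoneDB implementation: all function names and the dict-based inputs are hypothetical, a complex is represented as a list of subunit genes, and the P value is taken as the plain proportion of permuted means that reach the observed mean.

```python
import random


def cluster_mean(expr, labels, gene, cluster):
    """Mean expression of one gene over the cells labelled `cluster`."""
    values = [expr[cell][gene] for cell in expr if labels[cell] == cluster]
    return sum(values) / len(values) if values else 0.0


def complex_mean(expr, labels, subunits, cluster):
    """A complex is represented by its lowest-expressed subunit (step b)."""
    return min(cluster_mean(expr, labels, gene, cluster) for gene in subunits)


def interaction_mean(expr, labels, ligand, receptor, cluster_a, cluster_b):
    """Mean of ligand expression in cluster_a and receptor expression in cluster_b."""
    lig = complex_mean(expr, labels, ligand, cluster_a)
    rec = complex_mean(expr, labels, receptor, cluster_b)
    return (lig + rec) / 2


def permutation_p_value(expr, labels, ligand, receptor,
                        cluster_a, cluster_b, n_perm=1000, seed=0):
    """Shuffle cluster labels across all cells to build a null distribution (step c),
    then return (observed mean, proportion of permuted means >= observed) (step d)."""
    rng = random.Random(seed)
    observed = interaction_mean(expr, labels, ligand, receptor, cluster_a, cluster_b)
    cells = list(labels)
    label_values = [labels[cell] for cell in cells]
    hits = 0
    for _ in range(n_perm):
        shuffled = label_values[:]
        rng.shuffle(shuffled)
        permuted = dict(zip(cells, shuffled))
        null_mean = interaction_mean(expr, permuted, ligand, receptor,
                                     cluster_a, cluster_b)
        if null_mean >= observed:
            hits += 1
    return observed, hits / n_perm
```

In this sketch a pair that is expressed at the same level in every cell gets a P value of 1, because every permutation reproduces the observed mean. That matches the point made earlier in the text: a non-significant P value reflects a lack of cell-type specificity, not the absence of the interaction.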

Versions

v2

Compared to the original CellPhoneDB platform, our updated version, CellPhoneDB v.2.0, has incorporated new features, such as subsampling of the original dataset to enable the fast querying of large datasets (geometric sketching)3 or the visualization of results using intuitive tables, plots and network files that can be directly uploaded into Cytoscape (https://cytoscape.org/). In addition, we now offer users the possibility of using their own list of ligand–receptor interactions through our easy-to-use Python GitHub package.

v5

The differences span both the database and the algorithm:

New Python package that can be easily executed in Jupyter Notebook and Colab.
A scoring methodology to rank interactions based on the expression specificity of the interacting partners.
A CellSign module to leverage interactions based on the activity of the transcription factor downstream of the receptor. This module is accompanied by a collection of 211 well-described receptor-transcription factor direct relationships.
A new method for querying CellphoneDB results: search_utils.search_analysis_results.
Tutorials to run CellphoneDB.
Improved computational efficiency of method 2, cpdb_statistical_analysis_method.
A new database (cellphonedb-data v5.0) with more manually curated interactions, making up a total of ~3,000 interactions. This release of the CellphoneDB database has three main changes:

Integrates new manually reviewed interactions with evidenced roles in cell-cell communication.
Includes non-protein molecules acting as ligands.
For interactions with a demonstrated signalling directionality, partners have been ordered accordingly (ligand is partner A, receptor is partner B).
Interactions have been classified within signaling pathways.
CellphoneDB no longer imports interactions from external resources. This is to avoid the inclusion of low-confidence interactions.
