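A team led by Sarah A. Teichmann at the Wellcome Sanger Institute in the UK has achieved automatic cell-type harmonization and integration across Human Cell Atlas datasets. The study was published on December 21, 2023 in Cell, a leading international journal.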
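The researchers present CellHint, a predictive clustering tree-based tool for resolving cell-type differences in annotation resolution and technical biases across datasets. CellHint accurately quantifies cell-cell transcriptomic similarities and places cell types into a relationship graph that hierarchically defines shared and unique cell subtypes. Applied to multiple immune datasets, it recapitulates expert-curated annotations. CellHint also reveals underexplored relationships between healthy and diseased lung cell states in eight diseases.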
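In addition, the researchers present a workflow for fast cross-dataset integration guided by harmonized cell types and the cell hierarchy, which uncovers underappreciated cell types in the adult human hippocampus. Finally, they apply CellHint to 12 tissues from 38 datasets, providing a deeply curated cross-tissue database of about 3.7 million cells, along with various machine learning models for automatic cell annotation across human tissues.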
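By way of background, harmonizing cell types across the single-cell community and assembling them into a common framework is central to building a standardized Human Cell Atlas.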
Appendix: original English text
Title: Automatic cell-type harmonization and integration across Human Cell Atlas datasets
Author: Chuan Xu, Martin Prete, Simone Webb, Laura Jardine, Benjamin J. Stewart, Regina Hoo, Peng He, Kerstin B. Meyer, Sarah A. Teichmann
Issue&Volume: 2023/12/21
Abstract: Harmonizing cell types across the single-cell community and assembling them into a common framework is central to building a standardized Human Cell Atlas. Here, we present CellHint, a predictive clustering tree-based tool to resolve cell-type differences in annotation resolution and technical biases across datasets. CellHint accurately quantifies cell-cell transcriptomic similarities and places cell types into a relationship graph that hierarchically defines shared and unique cell subtypes. Application to multiple immune datasets recapitulates expert-curated annotations. CellHint also reveals underexplored relationships between healthy and diseased lung cell states in eight diseases. Furthermore, we present a workflow for fast cross-dataset integration guided by harmonized cell types and cell hierarchy, which uncovers underappreciated cell types in adult human hippocampus. Finally, we apply CellHint to 12 tissues from 38 datasets, providing a deeply curated cross-tissue database with ~3.7 million cells and various machine learning models for automatic cell annotation across human tissues.
DOI: 10.1016/j.cell.2023.11.026
Source: https://www.cell.com/cell/fulltext/S0092-8674(23)01312-0