Aberrant developmental pathways during breast cancer metastasis

  • March 30, 2020
  • Notes

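When your talent cannot yet support your ambition, settle down, stay grounded, and improve step by step with us. Before we knew it, we had spent nearly two years curating knowledge in the single-cell transcriptomics field, and through this Literature Express column we have been lucky to gather a group of peers to learn and grow together.

The Literature Express column gives brief introductions that broaden your knowledge. Follow along every day, and we hope you get something out of it too!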

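Article information

Today's paper is a bioRxiv preprint that outlines the developmental mechanisms that go wrong during carcinogenesis in a PABC model. Its title is: Single-cell RNAseq uncovers involution mimicry as an aberrant development pathway during breast cancer metastasis

Abstract

Using Drop-seq on a pregnancy-associated breast cancer (PABC) transgenic mouse model (Elf5 overexpression), the paper characterises the cellular composition and functional diversity of the breast tumours: 1. it infers the lineage of each subpopulation of mammary epithelial cells; 2. it reveals the mechanism of cancer progression in PABC: an aberrant involution process led by alveolar milk secretory cells and assisted by multiple cell types in the tumour microenvironment (TME); 3. it maps the cellular and molecular pathway network in the TME: the involution process involves abundant interactions between cell types and is characterised by ECM remodelling and inflammation.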

Background

Workflow

Results

1. General cell population characterization

1.1 Unbiased high-resolution scRNAseq captures cell heterogeneity of MMTV-PyMT mammary tumours

Figure 1.

A) Experimental workflow showing a schematic representation of the transgenic MMTV-PyMT/Elf5 mouse model, the number of tumours analysed and the number of cells passing the QC filter in each genotype. B) Distribution of variable genes defined by expression and dispersion, highlighting typical canonical markers for each of the main lineages. C) Heatmap showing differential expression of the top expressed genes contributing to the epithelial, stromal and immune signature. The top right panel shows the score of the signature and the percentage of cells classified to each main cell lineage analysed. The tSNE visualisation shows the coordinates of each analysed cell after dimensional reduction coloured by its main cell lineage. Roman numerals define each of the spatially formed clusters (inset). The dot plot shows the top differential markers from each of the main cell lineages and their level of expression. Bottom left panel shows a representative contour plot of the cell composition of an MMTV-PyMT tumour analysed by FACS defined by EpCAM antibodies (epithelial cells), CD45 (leukocytes) and double negative cells (stroma). Violin plot shows distribution of the number of genes per cell in each of the main cell lineages. D) Feature tSNE plots showing the expression of typical canonical markers of each of the main cell lineages.

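Coarse cell grouping:

  • Marker genes (labelled): average expression vs dispersion
  • Cell type identification: the cells were FACS-sorted, so there are only three types (epithelial, immune, stromal); shown are the expression of their marker genes and the proportion of each of the three components in the tumour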
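The "average expression vs dispersion" view used to pick variable genes can be sketched as below. This is a minimal illustration and not the authors' pipeline: the expression values, the thresholds, and the choice of dispersion as variance divided by mean are all assumptions made for the example.

```python
from statistics import mean, pvariance

# Toy expression values per gene across six cells (invented for illustration).
expr = {
    "Epcam": [5.0, 6.0, 0.0, 0.0, 5.5, 0.0],
    "Ptprc": [0.0, 0.0, 4.0, 5.0, 0.0, 0.0],
    "Actb": [3.0, 3.1, 2.9, 3.0, 3.0, 3.1],
}

def mean_and_dispersion(values):
    """Return (mean, dispersion), with dispersion defined as variance / mean."""
    m = mean(values)
    disp = pvariance(values) / m if m > 0 else 0.0
    return m, disp

def variable_genes(expr, min_mean=0.1, min_disp=1.0):
    """Keep genes that are expressed on average and vary strongly between cells.

    The two thresholds are hypothetical defaults, not values from the paper.
    """
    selected = []
    for gene, values in expr.items():
        m, disp = mean_and_dispersion(values)
        if m >= min_mean and disp >= min_disp:
            selected.append(gene)
    return sorted(selected)

print(variable_genes(expr))
```

Lineage markers that are on in some cells and off in others get a high dispersion, while a uniformly expressed gene such as the housekeeping example does not.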

Figure 2.

A) tSNE plot showing cell clusters defined in each of the main cell lineages and their relative frequency. Far right column depicts the main cell lineage of origin for each cluster, showing 6 clusters of epithelial origin, 5 immune and 5 stromal. B) Heatmap showing the top differentially expressed genes that define each of the clusters. C) Cluster tree modelling the phylogenic relationship of the different clusters in each of the main cell types compartments at different clustering resolutions. Dashed red line shows the resolution chosen. Coloured circles in the cluster tree represent the origin of the clusters represented in the tSNE plot shown in panel A, the full cluster tree can be found in SuppFig 4A. D) Visualisation of the top differential genes for each of the defined clusters in the immune lineage (top) and in the stroma (bottom). E) Cell identification using score values for each of the metasignatures of the xCell algorithm in the immune and stromal compartment divided by cluster.

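Finer-grained cell grouping:

  • K-means clustering-based subclusters: 6 epithelial, 5 immune, 5 stromal
  • A deduced phylogenic tree is built between the subclusters
  • Cell type identification for the subclusters: stromal and immune cells are scored against the various xCell metasignatures (as above, showing marker gene expression and the cell types within each subcluster with their proportions)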
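The idea of naming a subcluster by scoring it against signatures can be sketched as below. This is a simplified stand-in and not the xCell algorithm: the score is just the mean expression of the signature genes, and the marker sets and expression values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical marker sets; real metasignatures contain many more genes.
signatures = {
    "T cell": ["Cd3e", "Cd3d"],
    "Macrophage": ["Adgre1", "Csf1r"],
}

# Toy cells: (cluster id, expression per gene). Values are invented.
cells = [
    (0, {"Cd3e": 2.0, "Cd3d": 1.5, "Adgre1": 0.0, "Csf1r": 0.1}),
    (0, {"Cd3e": 1.8, "Cd3d": 2.2, "Adgre1": 0.2, "Csf1r": 0.0}),
    (1, {"Cd3e": 0.0, "Cd3d": 0.1, "Adgre1": 2.5, "Csf1r": 1.9}),
    (1, {"Cd3e": 0.2, "Cd3d": 0.0, "Adgre1": 2.1, "Csf1r": 2.4}),
]

def signature_score(expr, genes):
    """Score one cell as the mean expression of the signature genes."""
    return mean([expr.get(g, 0.0) for g in genes])

def annotate_clusters(cells, signatures):
    """Label each cluster with the signature that has the highest mean score."""
    per_cluster = defaultdict(lambda: defaultdict(list))
    for cluster, expr in cells:
        for name, genes in signatures.items():
            per_cluster[cluster][name].append(signature_score(expr, genes))
    return {
        cluster: max(scores, key=lambda name: mean(scores[name]))
        for cluster, scores in per_cluster.items()
    }

print(annotate_clusters(cells, signatures))
```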

2. Epithelial cell population: assign lineages for subpopulations and compare Elf5 OE vs WT models

2.1. PyMT cancer cells are organised in a structure that resembles the mammary gland epithelial hierarchy

Figure 3.

A) tSNE visualisation of the cell groups defined by k-means clustering analysis. Bottom panel shows a gene-expression heatmap of the top expressed genes for each cell cluster. B) Distribution of cells by genotype in the defined tSNE dimensions. The Sankey diagram shows the contribution of each of the genotypes to the cell clusters. Cluster numbers are coloured by the dominant genotype (>2-fold cell content of one genotype), Elf5 (red), WT (green). Violin plots showing Elf5 (upper plot) and PyMT (bottom plot) expression in each cell cluster. C) Scatter plot showing FACS data to define the % alveolar versus luminal progenitors using canonical antibodies that define the epithelial mammary gland hierarchy (EpCAM, CD49f, Sca1 and CD49b), in PyMT tumours. Each dot represents one animal (WT n = 6 and Elf5 n = 5); bottom panels are representative FACS plots of one of the replicates for each genotype. D) Dot plot representing the expression level (red jet) and the number of expressing cells (dot size) of the transcriptional mammary gland epithelium markers in each PyMT cluster. These marker genes were grouped according to each mammary epithelial cell type as defined by Bach, et al.: Hormone sensing differentiated (Hs-d, dark pink), Hormone-sensing progenitor (Hs-p, light pink), Luminal progenitor (LP, orange), Alveolar differentiated (Alv-d, dark red), Alveolar progenitor (Alv-p, light red), Basal (B, light purple), Myoepithelial (Myo, dark purple), Undifferentiated (Multi, light blue). E) Dot plot of the expression level of the top differential marker genes in each of the PyMT clusters coloured by genotype. The yellow rectangles highlight the top genes represented by each cluster. The size of the dots represents the percentage of cells/cluster that express each particular gene (pct. exp) and the colour gradient shows the level of expression for each gene/cluster. 
Note both colours are shown only when the cluster was populated similarly by both genotypes according to panel B.

Analysis of the epithelial cells:

  • Two ways of grouping the cells:
  • de novo clustering, which yields 11 subclusters (of unknown biological significance)
  • by genotype: WT and Elf5 overexpression (OE) PyMT
  • Two aims:
  • Annotate the identities of these 11 subclusters with literature-derived marker genes (Aim A)
  • Compare the phenotypic differences between WT and Elf5 OE, guided by previous findings about Elf5 OE (Aim B)
  • Aim A, mapping clusters onto the lineage hierarchy:
  • Difference in the alveolar vs luminal proportion between the two genotypes (WT and Elf5 OE), measured by FACS on surface markers => Elf5 forces differentiation of the luminal progenitors (into alveolar cells)
  • Expression of the mammary gland epithelium markers (Bach et al.) within each subcluster; this amounts to forward annotation, i.e. annotating de novo clusters with known signatures => Clusters 0, 7: stem cells; 9: hormone sensing; 8: myoepithelial; 2, 3, 4: luminal progenitor; 1, 6: luminal, alveolar differentiated; 5, 10: undefined (mixture of luminal and myoepithelial features)
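The two quantities encoded in the marker dot plots (dot size as the percentage of expressing cells, colour as the average expression) can be computed per cluster as in this minimal sketch. The expression values are invented, and treating any value above zero as "expressing" is an assumption.

```python
from collections import defaultdict

# Toy data: (cluster id, expression of one gene in one cell). Values are invented.
elf5_expression = [
    (1, 2.0), (1, 0.0), (1, 4.0), (1, 2.0),
    (2, 0.0), (2, 0.0), (2, 1.0), (2, 0.0),
]

def dot_plot_stats(observations):
    """Return cluster -> (percent of cells expressing, average expression)."""
    by_cluster = defaultdict(list)
    for cluster, value in observations:
        by_cluster[cluster].append(value)
    stats = {}
    for cluster, values in by_cluster.items():
        pct_exp = 100.0 * sum(v > 0 for v in values) / len(values)
        avg_exp = sum(values) / len(values)
        stats[cluster] = (pct_exp, avg_exp)
    return stats

print(dot_plot_stats(elf5_expression))
```

Reading both numbers together matters for annotation: a marker expressed strongly in a small fraction of a cluster suggests a mixed cluster rather than a uniform identity.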

2.2 Dynamic relationship and states of the malignant lineages of PyMT tumours: implications for the cell of origin of cancer

Figure 4.

A) Pseudotiming alignment of the PyMT cancer epithelial cells along the gene signatures that define the main lineages of the mammary gland epithelial hierarchy using the DDRTree method in Monocle2. Right panels show the distribution of the cell states by genotype. B) Projection of the states defined by pseudotime analysis into tSNE clustering coordinates overall and per pseudotime state (miniaturised tSNE plots). Right panels show the projection by genotype. C) Overlay representation of the cell identities (k-means clustering as per Fig. 3) and cell lineage identification (pseudotime analysis); the proportion of cells in each cluster that belong to each defined state is shown in the bar chart (right hand side). D) Enrichment analysis (GSVA score) for the gene signatures that define the main mammary gland lineages: Basal, Luminal Progenitor (LP) and Mature Luminal (ML) for each of the clusters. Bottom panels show the expression of each of the gene signatures at single cell resolution. The top bar shows the assigned mammary epithelial cell type as per section C. E) Frequency of the different cell lineages in each genotype. F) Cluster tree showing the phylogenetic relationship of the different clusters. Red arrow shows the resolution used (0.7). G) Illustration of cell diversity of PyMT tumours based on the canonical structure of the mammary gland epithelial lineages.

Aim A: another set of mammary gland hierarchy gene signatures (Pal et al.) is used to build a trajectory and define 7 pseudotime states; the correspondence between clusters under the two classification schemes is then compared (I personally consider this somewhat redundant)

  • Aim B: Elf5 OE clearly drives the cell population towards state 7 (alveolar lineage) from states 2, 4, 5 and 6
  • The tSNE clusters are annotated with GSVA, and also with the pseudotime states
  • => Finally, a phylogenic tree between the clusters is built

Altogether, the combination analysis of gene signatures, gene markers and pseudotiming enabled the precise annotation of PyMT cell clusters within the mammary hierarchy proposed by Pal et al., identifying a large luminal lineage that retains most of the cell diversity and strong plasticity, a basal/myoepithelial compartment and a hormone-sensing lineage
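A shift of one genotype towards a particular pseudotime state can be quantified as the fraction of cells per state within each genotype. The sketch below shows that bookkeeping only; the cell assignments are invented and the pseudotime inference itself (done with Monocle2 in the paper) is not reproduced.

```python
from collections import Counter

# Toy assignments: (genotype, pseudotime state) per cell. Values are invented.
cells = [
    ("WT", 2), ("WT", 4), ("WT", 7), ("WT", 5),
    ("Elf5", 7), ("Elf5", 7), ("Elf5", 7), ("Elf5", 4),
]

def state_fractions(cells):
    """Return (genotype, state) -> fraction of that genotype's cells in the state."""
    totals = Counter(genotype for genotype, _ in cells)
    counts = Counter(cells)
    return {
        (genotype, state): n / totals[genotype]
        for (genotype, state), n in counts.items()
    }

fractions = state_fractions(cells)
print(fractions[("WT", 7)], fractions[("Elf5", 7)])
```

Normalising within genotype keeps the comparison fair when the two genotypes contribute different numbers of cells.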

2.3. Elf5 OE vs WT: molecular effects => upregulated involution signatures

Molecular mechanisms of cancer progression associated to cancer cells of Alveolar origin

Figure 5.

A) Cell cycle stages of the PyMT cancer cells as defined by gene expression signatures using tSNE coordinates and their deconvolution (middle panel). Circled area shows the cycling cluster (C5 in Fig. 3) characterised by a total absence of G1 cells. The quantification of the proportion of cells in each stage grouped by genotype is shown in the bar chart. B) Enrichment GSVA analysis of gene expression metasignatures of cancer-related and Elf5-related hallmarks associated to PyMT/WT (green) and /Elf5 (red) tumours. C) tSNE representation of the EMT gene expression metasignature at the single cell level. Right panel shows a western blot of canonical EMT markers (E-Cadh, E-Cadherin and Vim, vimentin) on PyMT/WT or ELF5 full tumour lysates. Note: the two images correspond to the same western blot gel cropped to show the relevant samples. D) Hypoxia metasignature at the single cell level is shown in the tSNE plot, bottom panel shows a bar plot of the extension of the hypoxic areas in PyMT/WT (green) and /Elf5 (red) tissue sections (tumours and lung metastasis) stained using IHC based on hypoxyprobe binding, representative images are shown in the right panels. E) Lactation and Late Involution (stage 4, S4) metasignatures at the single cell level are shown in the tSNE plots. Pictures show IHC with an anti-milk antibody in tissue sections from a lactating mammary gland at established lactation compared with a mammary gland from an age-matched virgin mouse; and in PyMT/WT and Elf5 tumours. F) Kaplan-Meier survival curves based on Elf5 expression using the METABRIC cohort. Patients were segregated according to Elf5 expression levels based on tertiles. Elf5-high patients (red) were defined as the top-tertile and Elf5-low patients (green) as the bottom-tertile, Log-rank p values <0.05 are shown in red. The bar chart (bottom panel) corresponds to the distribution of the PAM50 classified breast cancer subtypes in the top and bottom Elf5 expressing tertile of patients.
G) Upper panel: Survival analysis (Kaplan-Meier curves) for the expression of Elf5 (left hand side) and the Involution metasignature (right hand side) in luminal breast cancer patients as per section F) Bottom panel: Kaplan-Meier survival curves for the late involution metasignature in Elf5-high patients (left hand side, ELF5-H) and Elf5-low patients (right hand side, ELF5-L). Each group of patients (ELF5-H and ELF5-L) were segregated according to tertiles for the combined expression levels of the genes from the involution metasignature: Green, inv low, bottom third; Blue, inv mid, middle third and Red, inv high, top third. Log-rank p values <0.05 are shown in red.

  • Validation of previously reported features of the ELF5 OE tumour model, so most of these are explanatory results:
  • cell cycle => increased G1 fraction (consistent with previous findings that Elf5 decreases cell proliferation)
  • GSVA for specific signatures of known function: EMT, hypoxia, lactation and late involution; each is also validated experimentally
  • EMT: WB shows that Elf5 OE reduces Vimentin, indicating MET
  • Hypoxia: hypoxyprobe is used to quantify the hypoxic region area (% area); Elf5 OE shows extensive hypoxic regions
  • Lactation & involution: IHC shows that Elf5 OE has strong milk production features => (since involution is biologically known to be associated with milk stasis) an involution gene signature is indeed highly expressed in the alveolar-enriched subpopulation of the luminal lineage
  • Clinical value (prognostic significance) of Elf5 OE and its associated lactation & involution signatures:
  • survival analysis:
  • High Elf5 expression is already known to be associated with poor prognosis in the luminal subcohort of patients
  • Testing the lactation & involution signatures: only the involution signature, and only in Elf5-high patients, shows high expression associated with poor prognosis
  • (Caution! => Most Random Gene Expression Signatures Are Significantly Associated with Breast Cancer Outcome https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002240)
  • PAM50: distribution of subtypes in ELF5-high/low patients (since PAM50 is also strongly related to prognosis): the proportion of the Basal subtype is clearly higher in ELF5-high

At this point, the downstream effects of Elf5 OE are narrowed down to the involution process
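The survival analysis above (split patients into expression tertiles, then compare Kaplan-Meier curves) can be sketched in plain Python as below. This is a minimal illustration under stated assumptions: the expression values and follow-up data are invented, the tertile split is by rank, and no log-rank test is implemented.

```python
def tertile_groups(values):
    """Label each value 'low', 'mid' or 'high' by rank tertile."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    k = len(values) // 3
    labels = ["mid"] * len(values)
    for i in order[:k]:
        labels[i] = "low"
    for i in order[len(values) - k:]:
        labels[i] = "high"
    return labels

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for x in times if x >= t)
        died = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve

# Invented example: expression per patient, follow-up time, event indicator.
expression = [0.1, 0.9, 0.5, 0.3, 0.8, 0.2]
print(tertile_groups(expression))
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Given the caution cited above about random signatures, a sensible companion check would be to repeat the same split with random gene sets of the same size and see how often they also separate the curves.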

3. Fibroblast cell population: assign lineages for subpopulations and compare Elf5 OE vs WT models

3.1 Characterisation of cancer-associated fibroblasts in PyMT tumours

Figure 6.

A) tSNE plot groups defined by k-means clustering analysis showing a total of three cell clusters defined within the fibroblast subtype. B) Metasignatures of Cancer-associated fibroblast (CAF) signature (upper plot) and myofibroblasts (bottom panel) plotted in the fibroblast tSNE. The gene list of each metasignature was manually annotated from published scRNAseq data in human tumours 64,65. C) Desmoplastic (upper plot), Inflammatory (middle plot) and Contractile (bottom plot) metasignatures plotted in the fibroblast tSNEs from public data 67. D) Violin plots displaying marker genes for each of the three fibroblast clusters defined in section A: ECM-CAFs (0), immune-CAFs (iCAFs, 1) and myofibroblasts (2). E) Upper plot: tSNE illustration of the involution signature from 62. Middle plot: tSNE plot defined by k-means clustering analysis at resolution 1 of the fibroblast population showing a total of nine cell clusters. Bottom section: Violin plots on these nine cell clusters of three of the four genes from the involution signature. F) Upper plot: Distribution of fibroblasts by genotype (Elf5: red, WT: green) in the defined tSNE dimensions. Bottom plot: Sankey diagram showing the contribution percentage of each of the genotypes to the cell clusters. Cluster numbers are coloured by the dominant genotype (>2-fold cell content of one genotype), Elf5 (red), WT (green). G) GSVA enrichment analysis of involuting mammary fibroblast metasignatures associated to PyMT/WT and /Elf5 tumours. Violin plots of Cxcl12, Mmp3 and Col1a1 genes in all fibroblasts of each genotype. PyMT/WT (green) PyMT/Elf5 (red).

For the fibroblast population:

  • Identify subpopulations:
  • Distinguish CAFs and myofibroblasts based on metasignatures (the figure shows the expression of each signature): 0, 1: CAFs; 2: myofibroblasts
  • Three subclusters and the metasignature each is enriched for: 0: ECM-CAFs (secretory); 1: immune-CAFs (iCAFs); 2: contractile CAFs (myofibroblasts) (double-checked by GSVA using the hallmark gene sets and an additional CAF function dataset, ref 67)
  • Using involution marker genes, subpopulations with involution features (involution fibroblasts) are found within the ECM-CAFs and iCAFs:
  • involution ECM-CAF: top enriched pathways relate to lipid metabolism => inferred to be adipocyte-derived fibroblasts, which are known to have high ECM remodeling potential and invasiveness
  • involution iCAF: top enriched pathways relate to wound healing, which corresponds to late stages of involution.
  • Compare Elf5 OE vs WT:
  • GSVA using the CAF-involution signature: more enriched in CAFs from Elf5 OE than from WT (epithelial education?)
  • Wet-lab validation (CAFs from Elf5/PyMT mammary tumours show features of mimicry involution) of the involution phenotype that scRNAseq found to be significantly enriched in ELF5 OE. Since involuting fibroblasts have high fibrillar collagen activity, the following were quantified:
  • Collagen coverage & intensity (SHG imaging)
  • Collagen thickness (Polarised light imaging)
  • Collagen spatial alignment (SHG imaging)
  • This confirms that Elf5 OE has higher fibroblast activity in both collagen deposition and collagen rearrangement
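One way to summarise collagen spatial alignment is the fraction of fibres whose orientation lies within 10 degrees of the peak orientation. The sketch below is an assumption-laden illustration, not the authors' image analysis: the angles are invented, the peak is taken as the centre of the fullest 10-degree histogram bin, and orientations are treated as circular over 0 to 180 degrees.

```python
from collections import Counter

def alignment_fraction(angles_deg, window=10, bin_width=10):
    """Fraction of fibres within +/- window degrees of the peak orientation."""
    # Histogram the orientations; fibre angles are equivalent modulo 180 degrees.
    bins = Counter(int(a % 180) // bin_width for a in angles_deg)
    # Fullest bin wins; ties are broken in favour of the lower angle.
    peak_bin = max(bins, key=lambda b: (bins[b], -b))
    peak = peak_bin * bin_width + bin_width / 2

    def circular_distance(angle):
        d = abs((angle % 180) - peak)
        return min(d, 180 - d)

    aligned = sum(1 for a in angles_deg if circular_distance(a) <= window)
    return aligned / len(angles_deg)

# Invented orientations: most fibres near 90 degrees, two outliers.
angles = [88, 90, 92, 91, 89, 95, 10, 170]
print(alignment_fraction(angles))
```

A higher fraction means fibres are more parallel, which is the kind of rearrangement the alignment measurement is meant to capture.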

Figure 7.

A) Representative bright field images and quantification of total coverage of picrosirius red-stained PyMT/WT and PyMT/ELF5 tumours sections n=4 mice per genotype with 10 regions of interest (ROI) per tumour. B) Representative maximum intensity projections of SHG signal and quantification of SHG signal intensity at depth (µm) and at peak in PyMT/WT and PyMT/ELF5 tumour sections, n=6 mice per genotype with 6 ROI per tumour. C) Polarised light imaging of picrosirius red stained PyMT/WT and PyMT/ELF5 tumour sections, and quantification of total signal intensity acquired via polarised light. Thick remodelled fibres/high birefringence (red-orange), medium birefringence (yellow) and less remodelled fibres/low birefringence (green) n=4 mice per genotype with 10 ROI. D) SHG images of PyMT/WT and PyMT/ELF5 tumours assessed for differences in fibre orientation angle and quantification of frequency of fibre alignment ranging from the peak alignment. Different colours correspond to specific angles of orientation n=6 PyMT/WT and n=4 PyMT/ELF5. Inset shows the cumulative frequency of fibre alignment +/-10 degrees from peak.

4. Crosstalks between epithelial cells/fibroblasts/immune cells

Characterisation of the cell-to-cell interactions involved in the cancer-associated involution mimicry

A) Heatmap of the cell-cell interactions of all cell types from PyMT tumours based on CellphoneDB. Cell classification was based on the annotation from Figure 3 for the epithelial compartment; from Figure 6 at resolution 1 in the case of fibroblasts, where the cycling cluster (Cluster 6) and the residual cluster of 15 cells (Cluster 8) were removed; in addition, Clusters 7 and 2 were considered as a sole group annotated as "Myofibroblasts". The rest of the cells from the immune and stromal compartments were classified according to the annotation done in Figure 2 (See Supplementary Figure 9C for global annotation). The Scale at the right-hand side shows the interaction strength based on the statistical framework included in CellphoneDB (count of statistically significant (p<0.01) interactions above mean= 0.3, see methods). B) Graphical representation of all significant cell-cell interactions identified by CellphoneDB using the parameters of more than 10 significant interactions with a mean score greater than 0.3, number cut as more than 10 connections and number split 10. The red circles correspond to the cell types from the epithelial compartment; the blue triangles represent the cells from the stromal compartment and the green squares are the cells from the immune compartment. The size of geometric figures is relative to the number of cells involved in the interactions (display as count). Different number splits were applied to establish the most significant interactions for the fibroblast and immune cell types. Fibroblasts showed the strongest interactions (highlighted as blue lines) when a number split of 67 (1st Tier) and 50 (2nd Tier) were used. The immune system showed weaker interactions (highlighted as green lines) at a number split of 15 (3rd Tier) and 11 (4th Tier). C) Representative dot plots of ligand (no background)-receptor (red background) pairs.
The size of the circles is relative to the number of cells within each annotated cluster that showed a positive expression of each gene and the blue gradient represents the average scaled expression. D) Violin plots of genes from canonical pathways known to recruit and expand MDSCs. E) Proposed molecular model of involution mimicry driven by Elf5 where CAFs and MDSCs are the major cell types involved.

Using known receptor-ligand pairs together with their expression in cells, the intercellular interactome is inferred:

  • Particular cell populations:
  • Among fibroblasts: the ECM-CAF and involution iCAF populations interact strongly both within themselves and with other cell types (key talkers)
  • Among immune cells: myeloid cells are the key talkers and hub talkers (they talk to epithelial cells, fibroblasts and endothelium)
  • Particular ligand-receptor interactions between cell types, aiming to find common pathways linked to involution:
  • involution: TGF
  • immune suppressive ecosystem: Cxcl12 & Dpp4
  • ECM remodeling: IGF
  • The authors wanted to validate previous findings of increased infiltration of myeloid-derived suppressor cells (MDSCs) in Elf5 OE tumours
  • Because there were too few cells to subcluster the myeloid cells further, they instead checked the expression, across cell types, of genes that promote MDSC infiltration (genes in signaling pathways for MDSC expansion, recruitment and malignant activation)

This part of the analysis ultimately builds a picture of the interactions among the cell subgroups in PyMT tumours (without emphasising the differences between Elf5 OE and WT)
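The general logic of scoring a ligand-receptor pair between two cell types and testing it against shuffled cluster labels can be sketched as below. This is a simplified stand-in for the kind of statistical framework CellphoneDB provides, not its actual implementation or API. The expression values, the cluster labels, and the assignment of Cxcl12 to CAFs and Dpp4 to myeloid cells are invented for the example.

```python
import random
from statistics import mean

# Toy cells: expression of one ligand and one receptor. Values are invented.
cells = (
    [{"Cxcl12": 5.0, "Dpp4": 0.0} for _ in range(4)]
    + [{"Cxcl12": 0.0, "Dpp4": 5.0} for _ in range(4)]
)
labels = ["CAF"] * 4 + ["Myeloid"] * 4

def interaction_score(cells, labels, ligand, receptor, sender, receiver):
    """Mean of the sender's average ligand and the receiver's average receptor."""
    lig = [c[ligand] for c, lab in zip(cells, labels) if lab == sender]
    rec = [c[receptor] for c, lab in zip(cells, labels) if lab == receiver]
    if not lig or not rec:
        return 0.0
    return (mean(lig) + mean(rec)) / 2

def permutation_test(cells, labels, ligand, receptor, sender, receiver,
                     n_perm=1000, seed=0):
    """Return (observed score, empirical p-value) from shuffled cluster labels."""
    rng = random.Random(seed)
    observed = interaction_score(cells, labels, ligand, receptor, sender, receiver)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        score = interaction_score(cells, shuffled, ligand, receptor, sender, receiver)
        if score >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

score, p_value = permutation_test(cells, labels, "Cxcl12", "Dpp4", "CAF", "Myeloid")
print(score, p_value)
```

Shuffling labels asks whether the pairing is specific to these two cell types or would appear for any random grouping of the same cells.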

Comment

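This paper is built on a very good model (a preclinical transgenic mouse model of PABC), whose tumourigenesis mimics pregnancy-associated alveolar epithelium differentiation. Although it does not dynamically track the development of pregnancy-related tumours, its snapshot of that process carefully describes a complete tumour ecosystem and addresses several major questions in the breast cancer field:

  • Tracing cancerous epithelial cell lineages back to normal breast tissue
  • The role of cancer-related fibroblasts and immune cells
  • Communications among different types of cells

The first two points rely almost entirely on previously published signatures to repeatedly define cluster identities in the samples. Although no new markers or lineages are discovered, the definitions become more solid when multiple approaches point to similar functions, and this also supports the third point, the construction of the whole ecosystem. This part is therefore mostly explanatory, amounting to a precise quantitative description of TME interactions that earlier work had hypothesised.

What is specific to this paper is the Elf5 OE model. Because some biological facts about the model are already known, the bioinformatic validation is more accurate and reliable, and wet-lab validation is easier to obtain. Most of the results are nonetheless explanatory. The new finding is that an involution-related signature is also present in CAFs, which is validated by collagen detection; its prognostic power depends on the overexpression of Elf5 itself, so it is not especially striking. In the final part describing the interactome, Elf5 OE and WT are barely distinguished (possibly because no strong difference was found), and the analysis stays largely descriptive.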