- **Gene expression matrix**: a *k x m* matrix with *k* rows of genes and *m* columns of samples. Each data point in this matrix represents the expression of a given gene in a given sample.
- **Gene signature matrix**: a *k x n* matrix with *k* rows of genes and *n* columns of cell types. Each data point in this matrix represents the contribution of a given gene to a given cell type.
- **Cell proportion matrix**: an *n x m* matrix with *n* rows of cell types and *m* columns of samples. Each data point in this matrix represents the proportion of a given cell type in a given sample.
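
The dimensions of the three matrices fit together under the usual linear mixing model of deconvolution, where the expression matrix is approximately the signature matrix times the proportion matrix. That model is an assumption here, since the text above only gives the shapes. The sketch below builds small random matrices to show how the shapes line up. The variable names and the simulated values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(42)
k, n, m = 6, 3, 4  # genes, cell types, samples (toy sizes)

# Gene signature matrix: k genes x n cell types (simulated values)
signature = rng.uniform(0.0, 10.0, size=(k, n))

# Cell proportion matrix: n cell types x m samples,
# each column sums to 1 because it holds proportions
proportions = rng.dirichlet(np.ones(n), size=m).T

# Gene expression matrix under the assumed linear mixing model: k genes x m samples
expression = signature @ proportions

print(signature.shape, proportions.shape, expression.shape)
```

Each column of `expression` is then a weighted average of the cell-type profiles in `signature`, with the weights given by the matching column of `proportions`.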

Supported input files are:

- Gene expression matrix (for the Module "Number of CellTypes")
- Gene signature matrix (for the Module "Components enrichment")
- Cell proportion matrix (for the Module "Proportion visualization")

- Module "Number of CellTypes": helps determine the number of cell types from the Gene expression matrix.
- Module "Components enrichment": runs an enrichment analysis on each component (putative cell type) of the Gene signature matrix and outputs a csv file of the top 100 gene markers of each component.
- Module "Proportion visualization": displays the Cell proportion matrix as interactive plots.

This module provides guidance in determining the number of putative cell types from omics data (e.g. a Gene expression matrix).

Cattell’s rule is suggested here for choosing K [1]: it states that the components corresponding to eigenvalues to the left of the straight line in the scree plot should be retained. When the actual number of cell types is K, we expect (K-1) eigenvalues to correspond to the mixture of cell types and the remaining eigenvalues to correspond to noise (or other unaccounted-for confounders). Indeed, one PCA axis is needed to separate two cell types, two axes for three cell types, and so on. However, when confounders are not accounted for, Cattell’s rule overestimates the number of PCs.

[1] Cattell RB. The scree test for the number of factors. Multivariate Behav Res. 1966;1(2):245–76.
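
The (K-1) argument can be checked on simulated data. The sketch below mixes a known number of cell types under an assumed linear mixing model with a little noise, computes the PCA eigenvalues, and counts the eigenvalues that sit before the sharpest drop. That largest-ratio heuristic is only a stand-in for reading the straight line on a scree plot by eye, and all function names and simulation settings are hypothetical, not part of the module.

```python
import numpy as np

def simulate_mixture(k=200, n=4, m=30, noise=0.01, seed=0):
    """Simulate a k x m expression matrix from n cell types (assumed linear mixing)."""
    rng = np.random.default_rng(seed)
    signature = rng.gamma(shape=2.0, scale=1.0, size=(k, n))
    proportions = rng.dirichlet(np.ones(n), size=m).T  # columns sum to 1
    return signature @ proportions + rng.normal(scale=noise, size=(k, m))

def pca_eigenvalues(expression):
    """Eigenvalues of the PCA on samples, in decreasing order."""
    centered = expression - expression.mean(axis=1, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values**2 / (expression.shape[1] - 1)

def count_leading_eigenvalues(eigenvalues):
    """Number of eigenvalues before the largest relative drop (illustrative heuristic)."""
    usable = eigenvalues[:-1]  # the last one is ~0 because of the centering
    ratios = usable[:-1] / usable[1:]
    return int(np.argmax(ratios)) + 1

eigenvalues = pca_eigenvalues(simulate_mixture(n=4))
n_axes = count_leading_eigenvalues(eigenvalues)
print("axes retained:", n_axes, "-> suggested K:", n_axes + 1)
```

Because the proportions of each sample sum to one, the centered mixture has rank at most K-1, which is why one fewer axis than cell types carries the signal in this simulation. Real data with confounders will not separate this cleanly.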

This module helps in the biological interpretation of the components identified by unsupervised approaches.

It performs gene set enrichment analysis (fgsea R package [1]) against the cell-type gene marker database 'CellMatch' [2].

Note: with ICA-based methods, the components are reoriented, as proposed in the deconica R package [3].

[1] Korotkevich G, Sukhov V, Sergushichev A (2019). “Fast gene set enrichment analysis.” bioRxiv. doi: 10.1101/060012.

[2] Shao et al., scCATCH: Automatic Annotation on Cell Types of Clusters from Single-Cell RNA Sequencing Data, iScience, Volume 23, Issue 3, 27 March 2020. doi: 10.1016/j.isci.2020.100882

[3] https://urszulaczerwinska.github.io/DeconICA/
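
As a rough illustration of how the gene markers of one component could be ranked, the sketch below orients a component and then keeps its highest-weighted genes. Both steps are assumptions: the orientation rule used here (flip the sign when the largest absolute weight is negative) is a simplification and not the deconica implementation, and ranking by weight is only one plausible way to obtain a top-100 list. The function names are hypothetical.

```python
import numpy as np

def orient_component(weights):
    """Flip the sign so the most extreme weight is positive (assumed, simplified rule)."""
    weights = np.asarray(weights, dtype=float)
    if abs(weights.min()) > abs(weights.max()):
        return -weights
    return weights

def top_markers(weights, gene_names, n_top=100):
    """Return the n_top gene names with the highest oriented weights."""
    oriented = orient_component(weights)
    order = np.argsort(oriented)[::-1][:n_top]
    return [gene_names[i] for i in order]

# Toy component over five genes; the strongest weight is negative, so it gets flipped
genes = [f"gene{i}" for i in range(5)]
component = np.array([0.1, -3.0, 0.2, -0.5, 0.4])
print(top_markers(component, genes, n_top=2))
```

Without the reorientation step, the genes that drive this toy component would end up at the bottom of the ranking instead of the top.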

Sturm et al (2019). "Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology." Bioinformatics. https://doi.org/10.1093/bioinformatics/btz363