SPADE on viSNE: Automatically Categorize viSNE Populations

Background

viSNE is an excellent method for reducing high-dimensional data to two dimensions, which enables rapid exploratory data analysis and visualization of complex results. For cytometry data, this may assist with the categorization of events/cells into biological populations. For bulk data, this may help you understand the heterogeneity in your samples. In either case, it is sometimes useful to categorize groups seen on the viSNE map for downstream analysis. This can be done using gates:

(areas of a viSNE map categorized into populations using gates)

To learn more about this process, read our article about gating a viSNE map. Unfortunately, gating a viSNE map can be a time-consuming, subjective and detail-driven process. An alternative to this is to use a computer-driven clustering method to categorize groups seen on the viSNE map automatically. There are multiple approaches that can be used for clustering a viSNE map, one of which is running SPADE on the coordinates of the viSNE map itself. The end result of running SPADE on a viSNE map is a collection of clusters that correspond to spatial locations on the viSNE map.

When to Use SPADE on viSNE

SPADE on viSNE can be a useful approach for defining groups of cells or groups of samples when the dimensionality of your data is very high. In these cases, the "curse of dimensionality" may prevent a clustering method like SPADE from performing well unless you first reduce the dimensionality of the data. Unfortunately, because every dataset is different, it's hard to know when you may reach this point. SPADE on viSNE may be a better option for defining groups of cells or groups of samples if you have very high-dimensional cytometry data, if you have data with hundreds of markers measured in all of your samples, or if you are noticing that your SPADE results don't make sense.

Directions for Running SPADE on viSNE

1) Clone the viSNE analysis

In order to run SPADE on viSNE results, first navigate to a viSNE result. Within this viSNE experiment that houses the viSNE result files, click to run a SPADE analysis. Currently Cytobank forces you to clone this viSNE experiment so that it becomes visible in the inbox instead of being hidden within its parent experiment.

(click to run a SPADE analysis within a completed viSNE analysis)

 

2) Create a new SPADE

Within the resulting cloned viSNE experiment, do the same operation again to create a new SPADE analysis. This time it will ask for a name and proceed normally to the SPADE setup page.

 

3) Configure the SPADE run

There are a variety of configurations for a SPADE analysis. To run SPADE on viSNE, follow these guidelines:

Population

The files being included for this SPADE analysis are the results from your previous viSNE analysis. Thus, the ungated population actually corresponds to the population that was previously chosen for viSNE. For that reason, simply choose ungated for the SPADE analysis. A more restrictive population can be chosen if desired for some other workflow objective.

Channels

Choose only the two tSNE channels. This application is for clustering the tSNE map only and thus other channels should not be included for the clustering step.

(choose the tSNE channels of the viSNE map to be clustered)

Fold-Change Groups

The typical logic applies for choosing fold-change groups and baselines. This is a useful way of getting fold-change visualizations for a viSNE map, which is usually not possible due to the single-cell nature of viSNE results.

Number of Clusters (Nodes)

The number of clusters may need to be tuned empirically, but a good starting point is ~7 times the number of populations you expect to find based on manual gating. On some of the publicly available datasets published in Weber & Robinson (2016), we found that starting with this number of clusters captured all of the populations with a frequency > 0.5%, with an F measure comparable to the other clustering methods used in that paper.
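As a quick worked example of the starting-point arithmetic described above, the following Python sketch multiplies the expected number of populations by the ~7x rule of thumb. The function name and its parameters are hypothetical and only illustrate the calculation; they are not part of Cytobank.

```python
def starting_cluster_count(expected_populations, multiplier=7):
    """Suggest a starting number of SPADE clusters (nodes).

    expected_populations: how many populations you expect from manual gating.
    multiplier: the ~7x rule of thumb; adjust it when tuning empirically.
    Both names are illustrative assumptions, not Cytobank settings.
    """
    if expected_populations < 1:
        raise ValueError("expected_populations must be at least 1")
    return expected_populations * multiplier


# Example: 12 manually gated populations are expected on the viSNE map.
suggested = starting_cluster_count(12)
print(suggested)  # 84
```

Treat the result only as a first value to enter for the number of clusters, and then adjust it up or down based on how the clusters look on the viSNE map.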

Downsampled Events Target

Running SPADE on viSNE does not require downsampling to aid algorithm performance in the same way that SPADE on high parameter cytometry data requires it. Therefore, set the target to 100 percent. This means that all of the events will be included.

 

Analysis of SPADE on viSNE

General Analysis

Analysis of SPADE on viSNE can proceed exactly the same as analysis of a normal SPADE run, including coloring by channel, bubbling, fold change analysis, statistics, exporting FCS files based on bubbles, etc. The way in which the SPADE tree was created is different, but the analysis follows the same principles.

 

Colored Cluster Overlay

It can be informative to visualize each clustered segment of the viSNE map by color, because this helps you judge the quality of the SPADE clustering. The colors should correspond to intuitive spatial groupings on the map. If a spatially distinct viSNE population is colored the same as a different nearby population, the two populations ended up in the same cluster and perhaps should not have. In that case, consider increasing the number of SPADE clusters, or use a manual gate for these anomalies and keep the other cluster results as they are.

To make this figure, start in the SPADE result and draw a single bubble around the entire SPADE tree. Next, export the bubble as new FCS files. The resulting files will have a Cluster ID channel that can be used to draw cluster gates, which then will allow the visualization of the viSNE map colored by cluster using colored overlay populations.

Note that the cluster gating workflow that enables this visualization can be easily changed into a heatmap or any other normal analysis method.
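The visual check of the colored overlay can also be roughed out numerically outside of Cytobank. The Python sketch below assumes you have already read the exported events into (tSNE1, tSNE2, cluster ID) tuples by some means; the function names, the sample data, and the spread threshold are all hypothetical. It flags clusters whose events sit far from their own centroid on average, which may indicate a cluster that spans spatially distinct regions of the map.

```python
from collections import defaultdict
from math import hypot


def cluster_spread(events):
    """Summarize each cluster's position and spread on the viSNE map.

    events: iterable of (tsne1, tsne2, cluster_id) tuples (assumed layout).
    Returns {cluster_id: (centroid_x, centroid_y, mean_distance_to_centroid)}.
    """
    grouped = defaultdict(list)
    for x, y, cluster_id in events:
        grouped[cluster_id].append((x, y))

    summary = {}
    for cluster_id, points in grouped.items():
        cx = sum(p[0] for p in points) / len(points)
        cy = sum(p[1] for p in points) / len(points)
        spread = sum(hypot(p[0] - cx, p[1] - cy) for p in points) / len(points)
        summary[cluster_id] = (cx, cy, spread)
    return summary


def flag_wide_clusters(summary, max_spread):
    """Return the sorted IDs of clusters whose mean spread exceeds max_spread."""
    return sorted(cid for cid, (_, _, s) in summary.items() if s > max_spread)


# Made-up events: cluster 1 is compact, cluster 2 straddles two distant regions.
events = [
    (0.0, 0.0, 1), (1.0, 0.0, 1), (0.0, 1.0, 1), (1.0, 1.0, 1),
    (-20.0, 0.0, 2), (20.0, 0.0, 2),
]
summary = cluster_spread(events)
print(flag_wide_clusters(summary, max_spread=5.0))  # [2]
```

A flagged cluster is only a prompt to look at the overlay more closely. The threshold depends on the scale of your viSNE map, and large but genuinely contiguous populations can also have a wide spread.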

 

 


