After successfully setting up and running a viSNE analysis, you can use your viSNE map for exploratory data analysis and for visualization of the results of other, more quantitative downstream analyses. Before you do this, do a quick check of viSNE map quality to decide whether to kick off another run with any of the advanced viSNE settings tuned. This article outlines methods and considerations for assessing viSNE map quality, doing exploratory analysis with viSNE, and using viSNE for visualization of downstream analyses. It covers the following sections:
- Assess viSNE Quality
- Exploratory Data Analysis with viSNE
- Color the viSNE Map by Functional Markers Not Used for Clustering
- Assess Differences Between Groups or Quality Control with viSNE Contour Plots
- Visualize Sample Heterogeneity Across Multiple Dimensions with viSNE Grid
- Color the viSNE Map by Additional Overlaid Variables: Manual Gates, Disease Groups, etc.
- Concatenate viSNE Map across Samples
- Downstream Analyses Using viSNE for Visualization
A key workflow for assessing the quality of a viSNE map is to color it by channel. Use the dot plots colored by channel functionality to color each event in the viSNE map according to its intensity on a channel within the dataset. The patterns that emerge show why dots in the map are near each other, that is, which markers make them similar to each other. In cell phenotyping, you'll see events close together on the viSNE map based on their phenotypes. In bulk sample heterogeneity analyses, you'll see samples close together on the viSNE map based on their similarity across multiple biomarkers. A good way to set up the Working Illustration for this check is to put the clustering channels on the columns and the files (or a subset of the files) on the rows, as in the example below. A view like this lets you quickly assess viSNE map quality as explained in the section below.
(dot plots colored by channel within the Working Illustration. Relevant settings are highlighted in red and green. Green settings indicate the key components of making a figure that rotates through a variety of channels. The Panel/Channel Values option as the coloring channel defers control of this parameter to the Channels Figure Dimension Box where multiple selections can be made. The rows display selected files, which in this case are organized by condition.)
Before proceeding with exploratory data analysis or visualization of downstream results on a viSNE map, you’ll need to assess whether the viSNE settings used in your run produced a good quality map. Even with good quality data, you might get a poorly resolved viSNE map if you haven’t used the optimal settings for your data type, panel, and number of events. When you color on the clustering channels as described above, a poorly resolved viSNE map will have overlapping and poorly formed islands that don’t separate the expression of a single marker into distinct locations on the map. This example compares a poorly converged viSNE map and a nicely converged viSNE map across several markers:
(viSNE maps colored by channel comparing a poorly converged viSNE map to a nicely converged viSNE map. Each row is a single sample colored by four different markers as indicated. The two samples are not related and are from different viSNE analyses. They are only shown together for purposes of comparison and reference for the idea of poor versus good convergence.)
The most typical reason for poor convergence is too few iterations for a viSNE run with a large number of events (as a rough example, in excess of 400,000 events). If coloring the viSNE map by channel results in overlapping and poorly formed islands of events that don't separate the expression of a single marker into distinct locations on the map, another run should be attempted with more iterations to improve the clarity of the results. In addition, changing the perplexity may improve the separation of events in the viSNE map.
Note that if you have pre-gated to a fairly granular starting population for your viSNE, such as CD4 T cells or B cells, you will generally not see distinct islands resolved within these pre-gated populations, but you should see cells with high expression of a single marker appearing in distinct locations on the map. For an example of what you might expect in CD4 T cells, check out our webinar where Drs. Shahram Kordasti and Richard Ellis describe their work on Tregs that was published on the cover of Blood.
If your panel includes markers that you are interested in studying but did not include as clustering markers for viSNE (for example, signaling markers, activation markers, or inhibitory receptors), you can use dot plots colored by channel as described above for the clustering markers to color the cell populations or groups of samples on your viSNE map according to their expression of these markers. If you’ve already gated on the viSNE map or used a clustering method for automated categorization of the cell populations, you can also display the gate labels as you view functional marker expression for easier interpretation.
(viSNE maps colored by expression of p-STAT5, a functional marker that was not used to cluster the viSNE. CD4+ T cell subset gate is shown and labeled, facilitating the comparison of this subset across the three unique conditions - basal, BCR, and IL-7. This visualization reveals the upregulation of p-STAT5 in the CD4+ T cell subset following IL-7 stimulation.)
Note that this type of visual can also be used to visualize the correlation between continuous meta-data variables (e.g. age) and cell populations or groups of samples.
Oftentimes cell populations will be present at some level in multiple samples under different conditions or from different clinical groups, but the abundance of these populations will differ across samples. Contour plots are a great way to quickly visualize these differences in abundance. With single cell data, you can use them to see differences in abundance across samples from different experimental or disease conditions (e.g. unstim vs stim or responder vs non-responder), or to locate any potential differences across batch if the same technical control sample is run across repeated experimental batches. To make this illustration, use the same settings as depicted above (e.g. make x and y axes tSNE1 and tSNE2 etc.), but change “plot type” to contour and “color by” to density.
(viSNE maps colored by cell density across three unique conditions - basal, BCR, and IL-7. Arrows highlight the appearance of an abundance difference of a region within the CD4+ T cell compartment. This region is more sparse in the basal condition as compared to BCR and IL-7 conditions.)
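The abundance comparison that contour plots show visually can also be approximated numerically. The following is a minimal sketch, not Cytobank functionality: it assumes you have (tSNE1, tSNE2) coordinates per condition (the coordinates, bin count, and axis range below are hypothetical), bins the map into a coarse grid, and reports the change in the fraction of events per grid cell between two conditions.

```python
from collections import Counter


def density_grid(events, bins=4, lo=-10.0, hi=10.0):
    """Return the fraction of events that fall in each (x_bin, y_bin) cell."""
    width = (hi - lo) / bins
    counts = Counter()
    for x, y in events:
        # Clamp so that points on or beyond the axis limits land in an edge bin.
        bx = max(0, min(int((x - lo) / width), bins - 1))
        by = max(0, min(int((y - lo) / width), bins - 1))
        counts[(bx, by)] += 1
    total = len(events)
    return {cell: n / total for cell, n in counts.items()}


def density_difference(events_a, events_b, **grid_options):
    """Per-cell change in event fraction going from condition A to condition B."""
    grid_a = density_grid(events_a, **grid_options)
    grid_b = density_grid(events_b, **grid_options)
    cells = set(grid_a) | set(grid_b)
    return {c: grid_b.get(c, 0.0) - grid_a.get(c, 0.0) for c in cells}


# Hypothetical (tSNE1, tSNE2) coordinates for two conditions.
basal = [(-8.0, -8.0), (-7.0, -9.0), (-6.0, -7.0), (6.0, 6.0)]
il7 = [(-8.0, -8.0), (6.0, 6.0), (7.0, 8.0), (8.0, 7.0)]

diff = density_difference(basal, il7)
for cell in sorted(diff):
    print(cell, round(diff[cell], 2))
```

Using fractions rather than raw counts keeps the comparison meaningful when the conditions contain different numbers of events.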
Many of the views described above work best when your experiment has a single dimension of interest that you’d like to compare across, like stim condition or disease outcome. For more complex experiments with samples spread across multiple dimensions that you’re interested in comparing, you can use a viSNE grid to make easy visual comparisons across these dimensions. For example, in a study with multiple outcomes or stim conditions compared across several time points, you might lay out the outcomes or conditions on the rows and the time points on the columns. Note that depending on your experimental setup, setting up a grid may result in empty plots; just make sure they are expected based on your data, and if something is unexpected, check that your sample tags are correct.
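One way to anticipate which grid positions will be empty is to lay out the sample tags yourself before building the figure. The sketch below is illustrative only: the file names and the tag names `condition` and `timepoint` are hypothetical, and it does not use any Cytobank API.

```python
def build_grid(samples, row_key, col_key):
    """Map every (row, column) tag combination to the files tagged with it."""
    rows = sorted({s[row_key] for s in samples})
    cols = sorted({s[col_key] for s in samples})
    grid = {(r, c): [] for r in rows for c in cols}
    for s in samples:
        grid[(s[row_key], s[col_key])].append(s["file"])
    return grid


def empty_cells(grid):
    """Return the tag combinations that have no sample assigned."""
    return sorted(cell for cell, files in grid.items() if not files)


# Hypothetical sample tags: conditions on the rows, time points on the columns.
samples = [
    {"file": "pt1_basal_d0.fcs", "condition": "basal", "timepoint": "day0"},
    {"file": "pt1_il7_d0.fcs", "condition": "IL-7", "timepoint": "day0"},
    {"file": "pt1_basal_d7.fcs", "condition": "basal", "timepoint": "day7"},
]

grid = build_grid(samples, "condition", "timepoint")
missing = empty_cells(grid)
print(missing)
```

Any combination listed in `missing` that you expected to contain data points to a sample tag worth checking.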
Colored overlay dot plots can help you explore viSNE results in two different contexts. First, they are useful for comparing traditional manual gating schemes with viSNE results. Second, they are useful for exploring the relationship between cell populations or groups of similar samples displayed on the viSNE map and other meta-data variables like disease status, treatment status, or response group.
Color Overlay of Manually Gated Populations
The workflow comparing viSNE results to manual gates has been demonstrated many times in the literature and can quickly show four important results:
- Populations that were not captured by traditional gates and thus were uncategorized
- Manual gates that are capturing populations that they should not be capturing
- Manual gates that are not capturing all the cells that they should be capturing
- Populations that appear to correlate nicely between the viSNE map and traditional gates
The workflow is summarized in the visual below where the cells falling into manual gates are colored on the viSNE map:
(viSNE map with cells colored by which manual gate they belong to)
The next image shows an example highlighting the four situations outlined above. The dataset was gated manually by a researcher and then run through viSNE, with all manually gated populations shown as colors on the viSNE map. Events colored dark blue did not fall into any manual gate. Each other color corresponds to a different manual gate:
(Situation 1 is seen indicated by a discrete viSNE population with no gate color. Situation 2 (top) is seen by a discrete viSNE population that has the same color as a different discrete viSNE population, showing that a sub-population exists that was not captured by manual gating. Situation 2 (bottom) is also demonstrated by a population that is mostly uncategorized except for small numbers of events captured by various manually gated populations. Situation 3 is shown in the bottom left with a population that seems contiguous in the viSNE map but is only partially captured by manual gates, leading to the spotty coloring. Situation 4 is shown as the small population in the top right that appears categorized nicely both in the viSNE map and by manual gates, since there is very little dark blue (ungated) and no other colors simultaneously present)
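The four situations above can also be checked with a simple cross-tabulation once each event carries both a viSNE-derived label and a manual gate label. This is a minimal sketch under stated assumptions: the island names, gate names, and the `"ungated"` label are hypothetical, and how you export per-event labels is up to your own workflow.

```python
from collections import Counter, defaultdict


def crosstab(events):
    """Count events per (viSNE island, manual gate) pair."""
    table = defaultdict(Counter)
    for island, gate in events:
        table[island][gate] += 1
    return table


def ungated_fraction(table, ungated_label="ungated"):
    """Fraction of each island's events that fell into no manual gate."""
    return {
        island: counts[ungated_label] / sum(counts.values())
        for island, counts in table.items()
    }


# Hypothetical per-event labels: (island drawn on the viSNE map, manual gate).
events = (
    [("island_A", "CD4 T cells")] * 9
    + [("island_A", "ungated")] * 1
    + [("island_B", "ungated")] * 5
    + [("island_C", "B cells")] * 3
    + [("island_C", "ungated")] * 3
)

table = crosstab(events)
fractions = ungated_fraction(table)
for island in sorted(fractions):
    print(island, fractions[island], dict(table[island]))
```

Read against the list above, an island that is almost entirely ungated resembles situation 1, an island that is only partly covered by a gate resembles situation 3, and an island dominated by one gate with little ungated content resembles situation 4.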
Color Overlay of Other Variables
Colored overlay dots can also be used to visualize differences between groups of samples:
(dots colored by their disease tissue type show that the ~800 RNA transcripts used to separate these samples on the viSNE map group them into distinct islands that correlate with their disease type)
Note that the examples shown above describe overlay of discrete or categorical variables.
For many of the visualizations described above, it may sometimes be helpful to concatenate the files from multiple samples together. This is often the case if you are trying to compare across groups and have multiple samples per group. There are multiple ways to concatenate your files, such as using Cytobank’s concatenation tool or using R.
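For readers who script this step outside of Cytobank, the idea behind concatenation can be sketched in a few lines. The example below uses Python rather than R and assumes, hypothetically, that each sample has been exported as tab-delimited text with one header row of channel names; the sample names and values are made up. It stacks the events and adds a `sample` column so that group membership is preserved.

```python
import csv
import io


def concatenate_exports(exports):
    """Stack events from several exports, tagging each row with its sample name.

    `exports` maps a sample name to an open text handle of tab-delimited
    events whose first row is the channel header (an assumed layout).
    """
    header = None
    rows = []
    for sample, handle in exports.items():
        reader = csv.reader(handle, delimiter="\t")
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            # Refuse to stack files whose channels differ or are reordered.
            raise ValueError(f"{sample}: channel columns do not match")
        for row in reader:
            rows.append([sample] + row)
    return ["sample"] + (header or []), rows


# Hypothetical exported events for two samples in the same group.
exports = {
    "pt1": io.StringIO("tSNE1\ttSNE2\tCD3\n1.5\t-2.0\t120\n0.3\t4.1\t8\n"),
    "pt2": io.StringIO("tSNE1\ttSNE2\tCD3\n-3.2\t0.7\t95\n"),
}

header, rows = concatenate_exports(exports)
print(header)
print(rows)
```

Checking that every file has the same header before stacking guards against silently mixing channels from different panels.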
Once you’ve done some exploratory data analysis by creating different views of your viSNE map as described above, you will likely want to perform some more quantitative analyses to summarize the data, perform statistical tests, or quantitatively compare viSNE results to a traditional manual approach. The simplest way to derive quantitative statistics from a viSNE map is to manually gate the viSNE map. A simple strategy for manually gating the viSNE map is to use dot plots colored by channel (see above) and the natural separations of the viSNE map to draw gates. Zooming the gating interface will also help with this process. For regions that are difficult to gate because of a lack of clear separation between more continuous phenotypes, other plot types such as uncolored contour plots and contour plots colored by density can be used to help reveal density trends in the viSNE map that can guide gate placement. Black dot plots can be used to increase contrast for small populations that might go overlooked in other contexts:
(different plot types such as contour plots colored by density, black dot plots, and uncolored contour plots can help with gating. Density visualizations assist with dissection of phenotypic subsets that are biologically unique but in a more continuous distribution. Arrow indicates one large subset with more continuous smaller nested subsets, as is often seen in T and B cell biology, for example)
(example of a viSNE map that has been gated)
The simplest workflow for defining groups from your viSNE map is to manually gate the map (as seen above). This is a great starting point as you are learning viSNE, but it is a somewhat labor-intensive, non-scalable, and subjective process. It can be challenging (just as in traditional sequential gating) to draw gates in places where there are subtle differences between adjacent populations. Furthermore, gates drawn on one viSNE map aren't portable to future analyses because of the stochastic nature of viSNE. Researchers interested in accelerating viSNE-based workflows should understand options for automated methods for categorizing viSNE maps.
After identifying populations or groups from a viSNE map, it's often useful to summarize the expression of all channels for each of these populations. This gives you a quick summary of the variety of phenotypes or groups present on a viSNE map in a condensed, easy-to-interpret space:
(methodological example of representing populations identified in a viSNE map as a heatmap with their component expression on each channel within the dataset)
Remember that proper scaling of channel values is an important concept when working with heatmaps. Read about how to configure a heatmap of populations versus many channels and also how to scale it properly.
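To make the heatmap idea concrete, the sketch below computes the per-population median of each channel after a transform. The arcsinh transform with a cofactor of 5 is an assumption chosen for illustration (it is a common choice for mass cytometry data); it is not necessarily the scaling configured in your experiment, and the population names and values are hypothetical.

```python
import math
from statistics import median


def arcsinh_transform(value, cofactor=5.0):
    """Assumed scaling: arcsinh of the raw value divided by a cofactor."""
    return math.asinh(value / cofactor)


def population_medians(events, channels, cofactor=5.0):
    """Return {population: {channel: median transformed intensity}}."""
    grouped = {}
    for event in events:
        grouped.setdefault(event["population"], []).append(event)
    return {
        population: {
            ch: median(arcsinh_transform(e[ch], cofactor) for e in members)
            for ch in channels
        }
        for population, members in grouped.items()
    }


# Hypothetical raw intensities for events gated on the viSNE map.
events = [
    {"population": "T cells", "CD3": 0.0, "CD19": 0.0},
    {"population": "T cells", "CD3": 5.0, "CD19": 0.0},
    {"population": "T cells", "CD3": 10.0, "CD19": 5.0},
    {"population": "B cells", "CD3": 0.0, "CD19": 50.0},
]

heatmap = population_medians(events, ["CD3", "CD19"])
for population, row in heatmap.items():
    print(population, {ch: round(v, 3) for ch, v in row.items()})
```

Each inner dictionary corresponds to one row of the heatmap, with populations as rows and channels as columns.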
A number of methods have been described in the literature whereby cell populations can be defined by semi-automated or automated clustering methods. In Cytobank, you can run SPADE, SPADE on the viSNE map coordinates, or CITRUS to define cell populations automatically. In addition, computational biologists (or our services team) can use our API to run other clustering algorithms. The resulting clusters from any of these methods can be displayed back onto the viSNE map to help visualize the results of quantitative summaries of these clusters.
Generally, you will want to complete your viSNE before running any method that will cluster your data. This is important because your files need to contain the tSNE channels for you to be able to overlay clusters onto your viSNE map. Starting from the viSNE experiment in Cytobank, run the clustering method using any clustering channels you are interested in (the channels you do not use, including the tSNE channels, will stay with the data so you can use them for downstream analyses and visualization later). Once the clustering is done, you can follow the cluster gating workflow and then use color overlay dot plots to display the clusters overlaid on the tSNE axes.
Note that overlaying CITRUS clusters specifically on a viSNE map will require exporting the significant clusters that you want to overlay along with the original files, and then using the FCS files representing the significant CITRUS clusters as the overlaid figure dimension in your color overlay dot plots. When overlaying clusters from any method on a viSNE map, it may sometimes be helpful to concatenate the files before overlaying the clusters. Concatenation can be performed using the Cytobank concatenation tool or by exporting events as text files and using an R script to concatenate files.
(CITRUS identified significant cluster 34076, orange, overlaid onto all cells, blue, on the viSNE map)
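The reason the tSNE channels must stay with the data can be seen in a small sketch of the overlay step. The code below is illustrative only and does not use the Cytobank API: the `event_id` field, the cluster labels, and the `"unclustered"` fallback are hypothetical names. It groups each event's (tSNE1, tSNE2) coordinates by cluster label, which is the information a color overlay plot needs.

```python
from collections import defaultdict


def overlay_groups(events, cluster_by_event):
    """Group (tSNE1, tSNE2) coordinates by cluster label for a color overlay."""
    groups = defaultdict(list)
    for event in events:
        if "tSNE1" not in event or "tSNE2" not in event:
            # Without the tSNE channels there is nothing to plot the cluster on.
            raise KeyError("event is missing tSNE channels; run viSNE before clustering")
        label = cluster_by_event.get(event["event_id"], "unclustered")
        groups[label].append((event["tSNE1"], event["tSNE2"]))
    return dict(groups)


# Hypothetical events that kept their tSNE channels, plus cluster assignments.
events = [
    {"event_id": 1, "tSNE1": -4.0, "tSNE2": 2.5},
    {"event_id": 2, "tSNE1": -3.5, "tSNE2": 2.0},
    {"event_id": 3, "tSNE1": 6.0, "tSNE2": -1.0},
]
cluster_by_event = {1: "cluster_34076", 2: "cluster_34076"}

groups = overlay_groups(events, cluster_by_event)
print(groups)
```

Events without a cluster assignment are kept under a separate label so they can be drawn as the background layer, similar to the blue events in the figure above.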
Just like for the manual gating analysis described above, heatmaps and other summaries of clusters can also be created for clusters defined automatically with SPADE, SPADE on viSNE, or CITRUS. Additional instructions on creating summaries for these clustering algorithms can be found in their respective support articles.