Background
After successfully setting up and running a FlowSOM analysis, you can perform exploratory or quantitative downstream analysis on your FlowSOM results. As part of the analysis process, you’ll want to assess the quality of the FlowSOM analysis, which can be done by displaying FlowSOM metaclusters on a viSNE map and with heatmaps that show the marker expression of the FlowSOM metaclusters. This article outlines the content of static output files from FlowSOM, how to interact with the analysis output within Cytobank, ways to assess FlowSOM quality, and how to perform downstream exploratory data analysis with FlowSOM. Click the links below to jump to the relevant section:
- Output files for FlowSOM
- Interacting with FlowSOM analysis output within Cytobank
- How to assess FlowSOM Quality
- Overlay FlowSOM metaclusters on a viSNE map
- Heatmap view of FlowSOM metaclusters by clustering channel
- Color FlowSOM metaclusters by clustering channels
- Exploratory Data Analysis with FlowSOM
Output files for FlowSOM
You will receive an email notification once the analysis is completed. You can find your analysis from the email link, from the setup page for the FlowSOM run within the original experiment (navigate here via the Experiment Summary page or the Advanced Analyses menu), and in the Attachments section of the FlowSOM analysis experiment. From the setup page, you can click View created experiment or Download run info and plots. Click Download run info and plots to download the FlowSOM_test_results folder.
FlowSOM run info file
Within that folder, there is a FlowSOM run info file that records the run information and settings associated with this particular analysis for reference. This file contains essential attributes of the run, including basic metadata such as which user executed the run, what settings were used, and URLs back to the setup page in Cytobank. This improves the traceability of results packages that are downloaded from Cytobank onto local file systems.
Supporting_files
Within the same top-level folder, there is a folder called supporting_files. This folder will generally not help you interpret the results of the analysis, but it contains important supporting files for other uses. For example, these files are used behind the scenes when writing new FCS files based on FlowSOM clusters, and they can also serve as a complete record of the settings at the time of analysis.
Results
Within the results folder, you will find a set of CSV files along with PDF and/or PNG files, depending on the output file type you chose during the FlowSOM setup. Please note that if you choose to output PNG files, the resulting file sizes will be much larger. An example of the results folder contents is shown below.
The legend.pdf file contains the IDs in minimum spanning tree (MST) and self-organizing map (SOM) grid format, for both clusters and meta-clusters, and is available in relative size and fixed size. During the analysis setup process, you can choose to toggle the meta-cluster background on or off.
The channel_colored_MSTs folder contains aggregated_channel_colored_MSTs.pdf, which shows the MST of all samples combined, colored by each channel you selected for the PDF output settings during the FlowSOM setup. Within the same folder, each sample selected for the analysis has its own PDF file showing the MST colored by each channel. The meta-cluster background is on by default, but during analysis setup, you can choose to toggle the meta-cluster background on or off.
Metaclustering_comparisons.pdf: This file compares the metacluster MSTs produced by different clustering methods: hierarchical consensus clustering, hierarchical clustering, and k-means. The method used in the analysis is labeled. It may be useful to inspect the results of the methods that were not used in the analysis, in case a different method produced more favorable results and you want to use it for further optimization.
Star_plots.pdf: This file shows the results in MST and SOM grid formats where each cluster is represented by a star plot. These star plots indicate the mean intensities of all clustering markers for all cells in that cluster. The height of each segment indicates the intensity: if the segment reaches the border of the circle, the cells have high expression for that marker. An example of the star plot is shown below.
An advantage of FlowSOM is that you can re-run the analysis with the same seed and choose different channels to highlight, because clustering channels and output coloring channels are selected separately.
If you get a result with empty clusters, try re-running FlowSOM with a lower target number of clusters. If you still get empty clusters, you could be feeding in a homogeneous population in which more events are grouped into fewer clusters, or you could have a low event count per file. In either case, re-evaluate the number of clusters needed.
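If you export the per-event cluster assignments, a quick check for empty clusters can be sketched as follows. This is a minimal illustration and not part of Cytobank: the function name, the sample assignments, and the assumption that cluster IDs run from 1 to the target number of clusters are all hypothetical.

```python
from collections import Counter

def find_empty_clusters(cluster_ids, n_clusters):
    """Return the cluster IDs in 1..n_clusters that received no events.

    Assumes cluster IDs are integers starting at 1 (an assumption
    made for this sketch).
    """
    counts = Counter(cluster_ids)
    return [cid for cid in range(1, n_clusters + 1) if counts[cid] == 0]

# Hypothetical per-event cluster assignments for a run with 6 target clusters.
assignments = [1, 1, 2, 4, 4, 4, 6]
empty = find_empty_clusters(assignments, n_clusters=6)
print(empty)
```

If the returned list is not empty, lowering the target number of clusters is the first thing to try.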
Population_pie_charts.pdf: This file compares manual gating with automated clustering. For each cluster identified by FlowSOM, a pie chart indicates the percentage of each manually gated population from the originating experiment. The output can be visualized in both the grid and MST formats. An example of the pie charts in MST format is shown below.
During analysis setup, you can choose to set the metacluster background on or off.
The mst_definition folder contains information that defines the minimum spanning tree (MST) coordinates and other information relevant to the MST settings.
The rest of the files in the results folder that are not listed above are CSV files that contain CV, median, and abundance stats for both the FlowSOM clusters and metaclusters for each sample.
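To make these three statistics concrete, the sketch below computes them for one channel from a small table of events. It is an illustration only: the event layout, the `cluster_id` key, the `CD3` channel name, and the values are hypothetical, and CV is assumed here to be the population standard deviation divided by the mean, which may differ from the definition Cytobank uses.

```python
from collections import defaultdict
from statistics import mean, median, pstdev

def cluster_stats(events, channel):
    """Compute abundance (%), median, and CV of one channel per cluster.

    events: list of dicts, each with a 'cluster_id' key and channel values.
    CV is taken as population standard deviation / mean (an assumption).
    """
    by_cluster = defaultdict(list)
    for ev in events:
        by_cluster[ev["cluster_id"]].append(ev[channel])

    total = len(events)
    stats = {}
    for cid, values in sorted(by_cluster.items()):
        mu = mean(values)
        stats[cid] = {
            "abundance_pct": 100.0 * len(values) / total,
            "median": median(values),
            "cv": pstdev(values) / mu if mu else float("nan"),
        }
    return stats

# Hypothetical events from a single sample.
events = [
    {"cluster_id": 1, "CD3": 2.0},
    {"cluster_id": 1, "CD3": 4.0},
    {"cluster_id": 2, "CD3": 10.0},
    {"cluster_id": 2, "CD3": 10.0},
]
stats = cluster_stats(events, "CD3")
print(stats)
```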
Interacting with FlowSOM analysis output within Cytobank
When a FlowSOM run completes, a new experiment is created that contains the original files with new channels added for the FlowSOM cluster ID and metacluster ID. You can access this experiment either from the FlowSOM settings page for the run (click “View created experiment” on the page-level navbar), or from the Experiment Summary page of the originating experiment (click on the FlowSOM run name and you will be taken to the FlowSOM experiment). Here is an example showing the FlowSOM_metacluster_id and FlowSOM_cluster_id channels:
Following completion of the FlowSOM run, the metaclusters identified by FlowSOM are automatically gated as populations FlowSOM-metacluster1 to FlowSOM-metacluster"n", where "n" is the number of metaclusters you specified during setup.
You can also combine clusters or metaclusters to generate a new population. To generate new gates from the clusters or metaclusters, go to Gating and click the Automatic cluster gates button.
In the popup window, choose either FlowSOM_cluster_id or FlowSOM_metacluster_id and enter a gate name. Preferably, use either cluster_ or metacluster_ as a prefix to indicate which channel the new gates were generated from. Under "Define specific gates", you have the following options to generate new gates (check here for more details):
1) A comma-separated list of cluster ID numbers will result in the creation of gates for each number entered, one gate for each cluster.
2) You can use parentheses in the format of (1,2,3) or (6,8,10) to merge clusters. Each number inside the parentheses is a cluster or metacluster ID. The listed clusters are combined into a single new gate, which appears under Gates alongside all other gates. If you are combining sequential clusters such as 1, 2, and 3, you can also use (1:3), where a colon denotes a consecutive range of clusters. You can combine non-consecutive clusters using a syntax such as (1:3,10:12), which groups clusters 1 through 3 and 10 through 12 into one gate.
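To check which cluster IDs a given entry would cover before typing it into the dialog, the syntax described above can be expanded with a short script. This is a sketch written from the description in this article, not Cytobank's own parser: the function name is hypothetical, ranges are only handled inside parentheses, and malformed input is not validated.

```python
import re

def parse_gate_spec(spec):
    """Expand a gate definition string into a list of gates.

    Each gate is a sorted list of cluster IDs. Bare numbers become
    one gate each; a parenthesized group becomes one merged gate,
    with 'a:b' inside the group meaning the inclusive range a..b.
    """
    gates = []
    # Match either a parenthesized group or a bare cluster number.
    for group, single in re.findall(r"\(([^()]*)\)|(\d+)", spec):
        if single:
            gates.append([int(single)])
            continue
        ids = set()
        for part in group.split(","):
            part = part.strip()
            if ":" in part:
                start, end = (int(x) for x in part.split(":"))
                ids.update(range(start, end + 1))
            elif part:
                ids.add(int(part))
        gates.append(sorted(ids))
    return gates

gates = parse_gate_spec("4,7,(1:3,10:12),(6,8,10)")
print(gates)
```

In this example, the entries 4 and 7 each produce their own gate, while each parenthesized group produces a single merged gate.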
With the metaclusters and clusters now available as populations, you can use all the Cytobank tools to build graphical and statistical reports and to assess FlowSOM quality, as detailed below.
How to assess FlowSOM Quality
Overlaying FlowSOM-identified metaclusters onto a viSNE map can help you assess the quality of the metaclustering and can inform how you may need to iterate on various settings, including the target number of metaclusters and normalization. If you run viSNE on your dataset prior to running FlowSOM, the dataset will have the tSNE1 and tSNE2 channels, on which you can overlay the metaclusters from FlowSOM. Learn more about how to perform this workflow.
- Heatmap View of FlowSOM Metaclusters by Clustering Channel
You can build a heatmap view of the FlowSOM metaclusters by clustering channel to quickly identify the phenotype of each metacluster. To do so, choose the FlowSOM_metacluster populations in the Populations dimension and set the plot type to heatmap. For more detailed instructions, please refer to How to create and configure a Heatmap.
An example of the heatmap view and overlay of the FlowSOM populations on viSNE is shown below.
- Color FlowSOM Metaclusters by Clustering Channel
You can also use Dot Plots Colored by Channel to color each metacluster with the clustering channels. An example of the dot plots colored by channel view of the FlowSOM populations is shown below.
Exploratory Data Analysis with FlowSOM
When you go to the created FlowSOM experiment, you will find saved illustrations named Metacluster Box Plots and Metacluster Heatmaps (see below). You may adjust the layout based on the sample annotations. For more details, see Overview of Summary plots generation using the Illustration Editor.
(Example of saved illustrations Metacluster Box Plots in the created FlowSOM experiment)
- Color Heatmaps and Dot Plots by Functional Markers
The Metacluster Heatmaps illustration is also saved in the created experiment. For more information, see How to create and configure a Heatmap.
If your panel includes functional markers such as signaling markers, activation markers, or inhibitory receptors, you can use the dot plots colored by clustering channel described above for those functional markers that you may not have included as clustering channels.
- Assess differences between groups or quality control with contour plots
With single cell data, you can use contour plots to see differences in abundance across samples from different experimental or disease conditions (e.g. unstimulated vs stimulated or responder vs non-responder). You can gate these plots to reveal statistical differences or export statistics for tests of significance in downstream applications. You may select the contour plots from the plot settings to view the FlowSOM populations.
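Once abundance statistics are exported, a first look at group differences can be as simple as averaging the abundance of one metacluster within each condition. The sketch below assumes you have already extracted, for each sample, the percentage of events in that metacluster; the function name, sample names, group labels, and values are all hypothetical. It is a descriptive summary only and does not replace a test of significance.

```python
from statistics import mean

def group_mean_abundance(abundance, groups):
    """Average one metacluster's percent abundance within each group.

    abundance: dict mapping sample name -> percent of events in the metacluster
    groups:    dict mapping sample name -> group label
    """
    by_group = {}
    for sample, pct in abundance.items():
        by_group.setdefault(groups[sample], []).append(pct)
    return {label: mean(values) for label, values in by_group.items()}

# Hypothetical abundances (%) of a single metacluster in four samples.
abundance = {"s1": 10.0, "s2": 14.0, "s3": 30.0, "s4": 26.0}
groups = {"s1": "unstimulated", "s2": "unstimulated",
          "s3": "stimulated", "s4": "stimulated"}
summary = group_mean_abundance(abundance, groups)
print(summary)
```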
- Assess differences between groups or quality control by concatenating samples
For many of the visualizations described above, it can be helpful to concatenate the files from multiple samples. Concatenating samples lets you view the consensus map for the completed FlowSOM analysis instead of visualizing each file separately. This is often useful when you are comparing across groups and have multiple samples per group. After the FlowSOM analysis has completed, go to the FlowSOM experiment, download the files, concatenate them, and then upload the concatenated files to a new experiment. There are multiple ways to concatenate your files. Please refer to the FCS file concatenation tool for instructions.
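Conceptually, concatenation stacks the events from every sample into one table while recording which file each event came from. The sketch below shows that idea on events that have already been read into Python dictionaries. It does not read or write FCS files, and the function name, the `sample` key, and the example data are hypothetical; for actual FCS files, use the concatenation tool referenced above.

```python
def concatenate_samples(samples):
    """Stack events from several samples into one list.

    samples: dict mapping sample name -> list of event dicts.
    Each output row is a copy of the event with a 'sample' key added,
    so the originating file can still be identified after merging.
    """
    combined = []
    for name, events in samples.items():
        for ev in events:
            row = dict(ev)        # copy so the input is not modified
            row["sample"] = name  # remember the originating sample
            combined.append(row)
    return combined

# Hypothetical events from two samples in the same group.
samples = {
    "donor1": [{"FlowSOM_metacluster_id": 1}, {"FlowSOM_metacluster_id": 2}],
    "donor2": [{"FlowSOM_metacluster_id": 2}],
}
combined = concatenate_samples(samples)
print(len(combined))
```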