For more reading, visit Articles on FlowSOM
What is FlowSOM?
FlowSOM is a clustering algorithm that uses self-organizing maps to shorten time to analysis and improve clustering quality. It can reveal how all markers are behaving on all cells, and can detect subsets that might otherwise be missed. It clusters cells (or other observations) based on chosen clustering channels (or markers/features), generates a Self-Organizing Map (SOM) of clusters, produces a Minimum Spanning Tree (MST) of the clusters, and assigns each cluster to a metacluster, effectively grouping them into a population. The FlowSOM algorithm outputs SOMs and MSTs showing population abundances and marker expression in various formats, including pie charts, star plots, and channel-colored plots. Cytobank has extended this functionality: it automatically creates a new experiment from the files written out of FlowSOM, which contain the original channels plus cluster and metacluster ID channels, so that you can interact with the cluster and metacluster output. Read the original FlowSOM publication by Van Gassen et al., Cytometry A (2015).
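To make the SOM and MST steps described above more concrete, here is a minimal toy sketch in plain Python. It is not the FlowSOM R package or Cytobank's implementation. The function names, grid size, learning-rate schedule, and synthetic two-marker data are all assumptions chosen for illustration, and the metaclustering step is omitted.

```python
import math
import random


def nearest_node(nodes, event):
    """Index of the SOM node whose codebook vector is closest to the event."""
    return min(range(len(nodes)), key=lambda n: math.dist(nodes[n], event))


def train_som(events, grid_x=3, grid_y=3, epochs=10, seed=0):
    """Train a tiny rectangular SOM; returns one codebook vector per node."""
    rng = random.Random(seed)
    # Initialise each node from a randomly chosen event.
    nodes = [list(rng.choice(events)) for _ in range(grid_x * grid_y)]
    coords = [(i % grid_x, i // grid_x) for i in range(grid_x * grid_y)]
    total_steps = epochs * len(events)
    step = 0
    for _ in range(epochs):
        order = list(range(len(events)))
        rng.shuffle(order)
        for idx in order:
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac)                          # learning rate decays
            radius = max(grid_x, grid_y) / 2 * (1.0 - frac)  # neighbourhood shrinks
            event = events[idx]
            bmu = nearest_node(nodes, event)                 # best-matching unit
            for n, xy in enumerate(coords):
                # Pull the best-matching node and its grid neighbours toward the event.
                if math.dist(xy, coords[bmu]) <= radius:
                    nodes[n] = [w + lr * (e - w) for w, e in zip(nodes[n], event)]
            step += 1
    return nodes


def minimum_spanning_tree(nodes):
    """Prim's algorithm over codebook vectors; returns a list of (i, j) edges."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(nodes):
        i, j = min(
            ((a, b) for a in in_tree for b in range(len(nodes)) if b not in in_tree),
            key=lambda e: math.dist(nodes[e[0]], nodes[e[1]]),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Synthetic data: two well-separated "populations" measured on two markers.
data_rng = random.Random(1)
events = [[data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)] for _ in range(100)]
events += [[data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)] for _ in range(100)]

nodes = train_som(events)                                  # 9 SOM nodes (clusters)
cluster_ids = [nearest_node(nodes, e) for e in events]     # one cluster ID per event
tree = minimum_spanning_tree(nodes)                        # 8 edges linking 9 nodes
```

In this sketch every event receives a cluster ID, which mirrors the idea of cluster ID channels being added to each file. A real run would additionally group the nodes into metaclusters.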
What are the main advantages to running FlowSOM in Cytobank?
With FlowSOM in Cytobank, you can run multiple files through the algorithm concurrently, without having to concatenate, thereby preserving the per-file analysis in the output. Additionally, Cytobank automatically creates a FlowSOM analysis experiment after the run finishes, automatically gating the FlowSOM-identified metaclusters, so you can interact with the cluster and metacluster output on a population-level or a single-cell level, and analyze groups of samples. Courtesy of the power of cloud compute, you can also run up to 90 million events through FlowSOM in Cytobank, and you can run multiple FlowSOM runs in parallel for rapid exploration. And you never need to wrestle with installing R or any packages - we make algorithms easy to use.
How long does FlowSOM take to run?
FlowSOM takes on the order of “minutes” to run, as opposed to SPADE and viSNE which can take “hours” to “days” depending on the number of events, iterations, channels, and other factors. For example, we’ve found that running a dataset with 1 million events and 30 channels through FlowSOM takes 4 minutes. For more information on how FlowSOM settings impact runtime, view the article Effect of FlowSOM Settings on Algorithm Run Time and Memory Usage.
Does FlowSOM downsample or upsample data?
FlowSOM does not downsample or upsample data - it uses all events supplied to it throughout the process. You can, however, choose to run FlowSOM on a subset of events via the FlowSOM setup page, in which case the sampling will be random.
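As an illustration of what random subsetting means, the sketch below draws events uniformly at random without replacement. This is an assumption about the general technique only; the function name and the `seed` parameter are hypothetical and do not describe how Cytobank samples internally.

```python
import random


def subsample_events(events, n_target, seed=None):
    """Return n_target events drawn uniformly at random without replacement.

    If the file has n_target events or fewer, all events are kept.
    """
    if n_target >= len(events):
        return list(events)
    rng = random.Random(seed)
    # Sort the sampled indices so the kept events stay in their original order.
    keep = sorted(rng.sample(range(len(events)), n_target))
    return [events[i] for i in keep]


events = list(range(1000))                     # stand-in for 1,000 event rows
subset = subsample_events(events, 100, seed=42)
print(len(subset))                             # 100
```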
I am not allowed to run FlowSOM. Why?
FlowSOM is a premium feature and is only available on Premium and Enterprise Cytobank.
You must have full access to the experiment on which you want to run FlowSOM. If you are viewing a public dataset, you will need to clone it before running FlowSOM. If you have access to the experiment via a project, the owner of the project may have disabled your ability to run FlowSOM on the experiment; you can try cloning the experiment first if the project owner has enabled cloning.
Where are my FlowSOM analysis results?
FlowSOM in Cytobank delivers results in two forms: static PDF/PNG output and interactive FlowSOM analysis experiments. The static PDF/PNG output is attached in zip format to the interactive experiment’s Experiment Summary Page in the Attachments section, and also to the frozen settings page for the FlowSOM run. The interactive FlowSOM analysis experiment is created following completion of the FlowSOM run, and is linked from the Advanced Analyses menu (under FlowSOM -> View All) and the Experiment Summary page of the originating experiment.
How do I interpret my FlowSOM results?
Read our support article on Analysis and Interpretation of FlowSOM Results for a full walk-through. To summarize, a new experiment is created following completion of the FlowSOM run. Metaclusters are automatically gated, and you can perform downstream analysis on these metaclusters using the Working Illustration. You can overlay metacluster populations on viSNE maps; rework the metacluster assignments with the Automatic cluster gates functionality; explore single cell data on a (meta)cluster level; explore marker expression of metaclusters; and more.
How can I automatically gate clusters or groups of clusters?
FlowSOM automatically draws gates around metaclusters in the resulting interactive experiment that is created following run completion. However, you can redo the metacluster assignments, gate clusters, or gate files from other non-FlowSOM experiments that have (meta)cluster ID information written in as channels using Cytobank’s Automatic cluster gates functionality.
How do I merge clusters or metaclusters?
Use the Automatic cluster gates functionality within the gating interface to specify groups of (meta)clusters that you want to combine into one Gate/Population. You can combine ranges of (meta)clusters (e.g. 1-5) and/or combinations of non-consecutive (meta)clusters (e.g. 1,5,10). You cannot change which (meta)cluster IDs an existing gate/population encompasses, but you can delete it and remake it as a combination of different (meta)clusters.
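If you work with exported files outside of Cytobank, the same range and list notation can be expanded with a few lines of code. The sketch below is a hypothetical helper, not Cytobank's parser; the function names are assumptions, and it only shows how a spec such as 1-5 or 1,5,10 maps to a set of (meta)cluster IDs.

```python
def parse_cluster_spec(spec):
    """Expand a spec such as "1-5,8,10" into a sorted list of cluster IDs."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


def merge_clusters(cluster_ids, spec):
    """Return a True/False mask marking events whose cluster ID is in the spec."""
    wanted = set(parse_cluster_spec(spec))
    return [cid in wanted for cid in cluster_ids]


print(parse_cluster_spec("1-5"))                 # [1, 2, 3, 4, 5]
print(parse_cluster_spec("1,5,10"))              # [1, 5, 10]
print(merge_clusters([1, 2, 7, 10], "1-2,10"))   # [True, True, False, True]
```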
Why do my results look slightly different when run on the same data set?
Although FlowSOM plots tend to have similar groupings of cell subsets, FlowSOM plots may look slightly different because the algorithm is stochastic. It’s okay to run FlowSOM multiple times and pick the one that gives the best visualization. If you’re interested in re-running FlowSOM on the same data with the same settings and getting the exact same map (e.g. for validation of a workflow), you can set the seed.
Can I run viSNE or FlowSOM on my FlowSOM-generated files?
Yes - we actually recommend running viSNE on your FlowSOM results (or vice versa) as a means of displaying FlowSOM metaclustering results on a dimensionality reduction map for visualization. To do this, you’ll first need to navigate to the resulting analysis experiment (the FlowSOM analysis experiment, or the viSNE experiment if you ran viSNE first), and then you can run viSNE or FlowSOM from that experiment.
How can I compare groups of samples with FlowSOM?
Run FlowSOM with all of your sample files selected. Then, in the resulting FlowSOM analysis experiment, annotate your files based on their group identity. You can then compare (meta)cluster abundance and expression in the Working Illustration of the resulting FlowSOM analysis experiment.
How do I find statistically significant differences in marker expression or population abundance with FlowSOM results?
You can compare marker expression and abundance differences for metaclusters/populations identified by FlowSOM within the Working Illustration of the FlowSOM analysis experiment (see “Statistics and Fold Change Equations in the Working Illustration” and “Overview of Figure Generation using the Working Illustration”). For more sophisticated statistical analyses, including tests of significance, you can download the FlowSOM analysis experiment files that have the cluster and metacluster IDs added as columns, or you can Export Statistics to compare in downstream tools outside of Cytobank. Make a feature request if you’d like to see this functionality integrated within Cytobank!
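As one example of a downstream analysis outside of Cytobank, the sketch below computes a metacluster's abundance per file and compares two sample groups with an exact permutation test. The function names are hypothetical, the abundance values are made-up illustration data, and a permutation test is only one of many reasonable choices, not a recommendation from the FlowSOM authors.

```python
from itertools import combinations
from statistics import mean


def abundance(metacluster_ids, target):
    """Fraction of a file's events assigned to the target metacluster."""
    return sum(1 for m in metacluster_ids if m == target) / len(metacluster_ids)


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = total = 0
    # Enumerate every way of relabelling the pooled files into two groups.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# One file's metacluster ID column (toy values): half the events are in metacluster 1.
print(abundance([1, 1, 2, 3], target=1))   # 0.5

# Made-up per-file abundances of one metacluster in two annotated groups.
control = [0.10, 0.12, 0.11, 0.09]
treated = [0.20, 0.22, 0.19, 0.21]
p_value = permutation_p_value(control, treated)
```

With only four files per group there are 70 possible relabellings, so the smallest achievable two-sided p-value is 2/70. Small group sizes therefore limit what any such test can show.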
How can I run up to 90 million events through FlowSOM?
You can run up to 4M events through FlowSOM on a standard Premium account and up to 8M events on an Enterprise account. If you are an Enterprise Cytobank user and would like to run more events (up to 90M) through FlowSOM, please submit a support ticket to inquire about upgrading the compute power for your server.
Can I run multiple FlowSOM runs at the same time?
Yes. The number of runs you can run in parallel is a function of how your server is configured (i.e. how many supporting compute servers have been purchased by your site administrators). If you would like to run more FlowSOM runs in parallel than you currently are able to, please submit a support ticket to inquire about upgrading the compute power for your server.
The clusters in the PDF output are overlapping and difficult to discern. How do I address this?
If the clusters in your PDF output are overlapping, we recommend reducing the cluster size and/or reducing the target number of clusters. In some cases, it might help to use a different seed, or to interpret the SOM results in grid format instead of MST format (both formats are provided in PDF output). Alternatively, you can view and interact with your cluster and metacluster data in the resulting FlowSOM analysis experiment within the Working Illustration, which provides you with a way to analyze (meta)cluster abundance and marker expression at both a population-level and a single-cell level without the issue of overlap.