Every viSNE run has several settings that can be adjusted, and each of them affects how long the analysis takes to complete. This article reports run times for viSNE analyses performed in Cytobank under different settings and can serve as a reference for what to expect when you change the settings for an analysis.
The dataset used for these tests is a selection of data from the Bendall et al. (Science, 2011) mass cytometry dataset, which is publicly available on all Cytobanks. Except where a setting was deliberately varied for a test, all settings were left at their default values, with 13 CD markers selected as channels. Click the links below to jump to the relevant section.
- Discussion on Run Stability
- Event Count and Run Time
- Number of Iterations and Run Time
- Perplexity and Run Time
- Theta and Run Time
- Number of Channels and Run Time
Keep in mind when adjusting viSNE settings that they also affect how much of the available computational resources the algorithm uses. Certain combinations of settings can therefore increase the chance that a viSNE run fails because it exceeds the compute resource allocations currently offered by Cytobank. Resource use rises most substantially with the number of events, perplexity, and the number of channels. Pushing all three of these values to their limits simultaneously is likely to result in failure. Theta and the number of iterations affect resource use less, but they do have a large impact on how long the algorithm takes to complete. For theta, the relationship is inverse and non-linear: a higher theta gives a shorter run time.
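The guidance above can be turned into a quick pre-flight check. The following is a minimal sketch only: the `VisneSettings` class, the `ASSUMED_LIMITS` values, and the 80% threshold are all hypothetical and are not Cytobank's actual limits or API. It simply reports which of the three resource-heavy settings (events, perplexity, channels) sit close to an assumed ceiling.

```python
from dataclasses import dataclass


@dataclass
class VisneSettings:
    """Hypothetical container for the settings discussed in this article."""
    events: int
    perplexity: int
    channels: int
    iterations: int
    theta: float


# HYPOTHETICAL ceilings, for illustration only. They are not Cytobank's
# actual limits; substitute the limits that apply to your own account.
ASSUMED_LIMITS = {"events": 2_000_000, "perplexity": 100, "channels": 50}


def near_limit(settings, limits=ASSUMED_LIMITS, fraction=0.8):
    """Return the resource-heavy settings at or above `fraction` of their limit."""
    return [
        name for name, limit in limits.items()
        if getattr(settings, name) >= fraction * limit
    ]


def stability_note(settings, limits=ASSUMED_LIMITS, fraction=0.8):
    """Summarize how many resource-heavy settings are close to their ceiling."""
    flagged = near_limit(settings, limits, fraction)
    if len(flagged) == len(limits):
        return "high risk: " + ", ".join(flagged) + " are all near their limits"
    if flagged:
        return "moderate: near limit for " + ", ".join(flagged)
    return "low: no resource-heavy setting is near its limit"


# Illustrative values only, not recommendations.
run = VisneSettings(events=1_900_000, perplexity=30, channels=13,
                    iterations=1000, theta=0.5)
print(stability_note(run))
```

Theta and the number of iterations are carried in the settings object but deliberately left out of the check, since they mainly affect run time rather than resource use.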
Increasing the event count increases the run time:
(increased event count results in increased run time)
Increasing the number of iterations increases the run time:
(increased number of iterations results in increased run time - displayed stratified by event count)
Increasing perplexity increases the run time:
(increased perplexity results in increased run time)
Increasing theta decreases the run time:
(increased theta results in decreased run time - displayed stratified by event count)
Increasing the number of channels increases the run time:
(increased number of channels results in increased run time - displayed stratified by number of iterations, for 2 million events)
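To compare your own runs with the trends described in this article, one option is to record the run time at several values of a single setting and estimate how steeply run time changes. The sketch below is not part of Cytobank; `loglog_slope` is a hypothetical helper that fits a straight line to log(run time) against log(setting value) using ordinary least squares. The numbers in the example are synthetic, generated from a known formula to check the function, and are not measured viSNE run times.

```python
import math


def loglog_slope(values, seconds):
    """Least-squares slope of log(seconds) against log(values).

    A slope near 1 suggests run time grows roughly in proportion to the
    setting; above 1, faster than proportionally; below 0, run time falls
    as the setting rises (the direction this article reports for theta).
    """
    if len(values) != len(seconds) or len(values) < 2:
        raise ValueError("need at least two (value, seconds) pairs")
    xs = [math.log(v) for v in values]
    ys = [math.log(s) for s in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("setting values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic check, NOT measured data: points generated from seconds = 3 * n**1.5
events = [1_000, 10_000, 100_000]
synthetic_seconds = [3 * n ** 1.5 for n in events]
print(round(loglog_slope(events, synthetic_seconds), 3))
```

Vary only one setting at a time when collecting the pairs, as was done for the tests in this article; otherwise the slope mixes the effects of several settings.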