Background
Cytobank is a cloud-based analytical platform, and one of the many benefits of this architecture is access to on-demand compute. This is especially valuable for running machine-learning algorithms on scientific data, which is quickly becoming a standard strategy in data analysis. Cytobank creates dedicated cloud compute nodes as needed when running these resource-intensive algorithms and other related operations. This allows a researcher to scale analyses to larger datasets, run several analyses simultaneously, and ultimately arrive at results faster without the bottleneck of analyzing on a personal computer.
When and How Do Tasks Compute Asynchronously?
A variety of operations in Cytobank are computed asynchronously on a dedicated compute node within the Cytobank Cloud. Examples include SPADE, FlowSOM, viSNE, CITRUS, and a variety of exports. In general, any task that displays a progress bar with the recognizable styling seen below is executing asynchronously. The progress bar appears after clicking to start the functionality in question. There may be a lag before the task starts executing while on-demand compute is being allocated (see the discussion on queues below). Because the task executes asynchronously, the page can be closed or navigated away from once the progress bar is present without disrupting the task. An email will be sent when the task completes, or the page can simply be checked later.
(a typical asynchronous computation progress bar. The exact styling may change depending on the task, but the underlying principle is the same)
Asynchronous Task Limits and Queues
After an asynchronous task is started, a progress bar will appear. The progress bar may indicate that the task is currently in a queue:
(progress bar queue message)
There are three reasons a task could be in the queue:
1) Compute resources are currently being allocated -- Cytobank manages an extensive array of compute nodes within the Cytobank Cloud. They are allocated and de-allocated dynamically. If a compute resource is available when a job is created, the job is assigned to that resource immediately. In this case the queue message may still appear, but only very briefly. If no compute resource is available, one will be allocated. This process should take a few minutes at most, during which time the task will remain queued.
2) The Per-User Task Limit has been reached -- Every Cytobank user has a cap on the number of asynchronous tasks that can run simultaneously, regardless of task type. This limit is three simultaneously executing tasks, assuming that task slots are available within the Server Task Limit (see below). Any tasks submitted beyond this limit are queued and then executed as the user's existing tasks complete.
3) The Server Task Limit has been reached -- Each Enterprise Cytobank server has its own task limit. When this cap has been reached, no new compute resources will be allocated for the server, regardless of how many tasks any individual user is running. All tasks submitted beyond this cap are queued until slots open up for computation. The default cap for Enterprise Cytobank is five simultaneous tasks for the server. A larger task limit can be purchased.
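The interaction between the Per-User Task Limit and the Server Task Limit can be illustrated with a small model. The Python sketch below is not Cytobank's implementation. The Scheduler class and its submit and complete methods are hypothetical names made up for this example, and the order in which queued tasks are promoted (submission order, skipping tasks that are still blocked) is an assumption. The time spent allocating compute nodes (reason 1) is ignored. Only the two limits, three per user and five per server by default, come from the text above.

```python
from collections import deque
from dataclasses import dataclass, field

PER_USER_LIMIT = 3  # from the article: three simultaneous tasks per user
SERVER_LIMIT = 5    # from the article: default Enterprise server cap


@dataclass
class Scheduler:
    """Toy model of the two task limits (hypothetical, for illustration only)."""
    per_user_limit: int = PER_USER_LIMIT
    server_limit: int = SERVER_LIMIT
    running: list = field(default_factory=list)   # (user, task) pairs
    queued: deque = field(default_factory=deque)  # (user, task) pairs, oldest first

    def _can_start(self, user):
        # A task may start only if both the server and the user have a free slot.
        user_running = sum(1 for u, _ in self.running if u == user)
        return (len(self.running) < self.server_limit
                and user_running < self.per_user_limit)

    def submit(self, user, task):
        if self._can_start(user):
            self.running.append((user, task))
            return "running"
        self.queued.append((user, task))
        return "queued"

    def complete(self, user, task):
        self.running.remove((user, task))
        # Assumption: promote queued tasks in submission order,
        # leaving any task that is still blocked by a limit in the queue.
        still_waiting = deque()
        while self.queued:
            u, t = self.queued.popleft()
            if self._can_start(u):
                self.running.append((u, t))
            else:
                still_waiting.append((u, t))
        self.queued = still_waiting


if __name__ == "__main__":
    demo = Scheduler()
    # A fourth task from the same user exceeds the per-user limit.
    print([demo.submit("alice", f"a{i}") for i in range(4)])
    # prints ['running', 'running', 'running', 'queued']
```

In this model, a second user who submits three tasks while the first user has three running would see two start and the third queue, because the server cap of five is reached before that user's own limit of three.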