# Overview of statistical inference methods

Introduction to the statistical inference tool

Significance tests

Multiple testing methods

How to run a significance test

A workflow for setting up a statistical significance test

How to interpret the results

Key statistical terms

### Introduction to the statistical inference tool

The statistical inference tool in the Cytobank platform offers several statistical hypothesis testing methods that are widely used by researchers to analyze biological data. You can directly run a significance test in the Cytobank platform using results from manual gating or machine learning-based analysis, such as FlowSOM, without exporting analysis results manually outside of the Cytobank platform.

Figure 1: Statistical inference methods implemented in the Cytobank platform.

The following statistical inference methods are implemented in the Cytobank platform:

• Significance tests:
  • Student’s t-test
  • Mann-Whitney U test
  • Paired Student’s t-test
  • Wilcoxon signed-rank test
  • Kruskal-Wallis H test
  • One-way analysis of variance (One-way ANOVA)
  • Two-way analysis of variance (Two-way ANOVA)
• Multiple testing and correction methods:
  • Tukey’s HSD
  • Bonferroni correction
  • FDR correction – Benjamini & Hochberg procedure

For detailed definitions of the implemented methods, see the sections below.

Note: See the Key statistical terms section for definitions of statistical terms, and How to set up a paired statistical test for more details on sample pairing.

### Significance tests

• Student’s t-test (also known as t-test)

Student’s t-test is a parametric hypothesis test to determine if there are any differences in group means between the two groups of samples of an independent variable. It assumes that the data follows a normal distribution.

Example: You want to determine if Treatment A lowered the percentage of CD4+ T cells that are Tregs as compared to Treatment B. Perform a Student’s t-test to compare the percentage of Treg cells of CD4+ T cells between a group of mice that was treated with Treatment A to a group of mice treated with Treatment B, if you know that the percent of Tregs follows a normal distribution.

• Mann-Whitney U test (also known as the Mann-Whitney-Wilcoxon or Wilcoxon-Mann-Whitney test)

The Mann-Whitney U test is a non-parametric hypothesis test used to determine whether the distributions of two sample groups of an independent variable are equal. It does not assume that the data follows a parametric distribution, such as a normal distribution.

Example: You want to determine if Treatment A lowered the percentage of CD4+ T cells that are Tregs as compared to Treatment B. Perform a Mann-Whitney U test to compare the percentage of Treg cells of CD4+ T cells between a group of mice that was treated with Treatment A to a group of mice treated with Treatment B, if you know that the percent of Tregs does not follow a normal distribution.
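The rank-based idea behind the Mann-Whitney U test can be sketched in a few lines of standard-library Python. This is an illustration only: the function names and the Treg percentages are hypothetical, and the exact enumeration used here for the p-value is an assumption chosen for clarity, not a description of how the Cytobank platform computes its result.

```python
from itertools import combinations


def u_statistic(a, b):
    """U for group a: count pairs where a > b, with ties counting one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def exact_mann_whitney(a, b):
    """Two-sided exact p-value by enumerating every split of the pooled data.

    Feasible only for small groups; the number of splits grows combinatorially.
    """
    pooled = list(a) + list(b)
    n_a = len(a)
    center = n_a * len(b) / 2  # expected U when both groups share a distribution
    observed = abs(u_statistic(a, b) - center)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        grp_a = [pooled[i] for i in idx]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count splits at least as far from the center as the observed one.
        if abs(u_statistic(grp_a, grp_b) - center) >= observed:
            extreme += 1
    return u_statistic(a, b), extreme / total


# Hypothetical % Tregs of CD4+ T cells (made-up values).
treatment_a = [4.1, 5.0, 3.8, 4.6]
treatment_b = [6.2, 7.5, 5.9, 8.8]

u, p = exact_mann_whitney(treatment_a, treatment_b)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

Because the test uses only the ordering of the values, replacing any measurement with another value that keeps the same ordering leaves U and the p-value unchanged, which is why no distributional assumption is needed.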

• Paired Student’s t-test

Paired Student’s t-test is a parametric hypothesis test used to determine if there are significant differences between groups in an experimental design with paired samples, such as samples taken from the same individuals pre- and post-treatment. Like Student’s t-test, it assumes that the data follows a normal distribution and determines whether the means of the two groups are equal.

Example: You want to determine if a treatment lowered the percentage of CD4+ T cells that are Tregs after treatment as compared to before treatment, in the same mice. Perform a Paired Student’s t-test to compare the percentage of Treg cells of CD4+ cells measured on the same mice pre- and post-treatment, if you know that the percent of Tregs follows a normal distribution.
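To make the difference between the unpaired and paired designs concrete, the sketch below computes both t statistics with Python's standard library. The percentages are made-up values and the function names are hypothetical. The unpaired version assumes the classic pooled-variance form of Student's t-test; this article does not state which variant the Cytobank platform uses. Only the test statistic and degrees of freedom are computed; a p-value would additionally require the t distribution.

```python
import math
import statistics


def student_t(a, b):
    """Pooled-variance Student's t statistic and degrees of freedom (assumed variant)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / se, df


def paired_t(before, after):
    """Paired t statistic: a one-sample t on the per-individual differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical % Tregs of CD4+ T cells for the same five mice.
pre = [10.0, 12.0, 9.0, 11.0, 13.0]
post = [8.0, 11.0, 7.0, 10.0, 11.0]

t_unpaired, df_unpaired = student_t(pre, post)
t_paired, df_paired = paired_t(pre, post)
print(f"unpaired: t={t_unpaired:.3f}, df={df_unpaired}")
print(f"paired:   t={t_paired:.3f}, df={df_paired}")
```

With these values the paired statistic is considerably larger than the unpaired one, because working on per-mouse differences removes the mouse-to-mouse variation from the comparison.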

• Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric test used to determine whether the distributions of two sample groups of an independent variable in a paired experimental design are equal. Like the Mann-Whitney U test, it determines whether two groups of samples have the same distribution, but it applies to paired samples.

Example: You want to determine if a treatment lowered the percentage of CD4+ T cells that are Tregs after treatment as compared to before treatment, in the same mice. Perform a Wilcoxon signed-rank test to compare the percentage of Treg cells of CD4+ cells measured on the same mice pre- and post-treatment, if you know that the percent of Tregs does not follow a normal distribution.

• Kruskal-Wallis H test

The Kruskal-Wallis H test is a non-parametric method that does not assume the data follows a normal distribution. The test is used to compare more than two groups of an independent variable and determine whether all the groups have the same distribution.

Example: A use case of the test is to assess treatment effectiveness. For example, a study has three groups of patients. They receive different levels of treatment and then report outcomes. A Kruskal-Wallis H test can determine if the treatment outcome differs between the levels of treatment.

• One-way analysis of variance (One-way ANOVA)

One-way ANOVA is a parametric test that can analyze an independent variable with more than two groups. The test determines if there are any differences in group means among the sample groups.

Example: A use case of One-way ANOVA is to compare three different age groups of samples. The result can tell if the effectiveness of treatment shows any difference between different age groups.

• Two-way analysis of variance (Two-way ANOVA)

Two-way ANOVA is a parametric method that analyzes two independent variables simultaneously, where each independent variable has at least two groups. In the Cytobank platform, two-way ANOVA evaluates the effect of each of the two independent variables without evaluating the effect of their interaction.

Example: A use case of the Two-way ANOVA test is to find out if a new drug or treatment works after controlling for the effect of time. For example, a study may contain two groups of samples. One group receives the new treatment, and the other group does not receive any treatment. All samples are measured at three time points: pre-treatment, post-treatment Day 1, and post-treatment Day 7. A two-way ANOVA test can determine if there is a difference between the treatment group and the untreated group regardless of the differences between the time points.

### Multiple testing methods

When more than one statistical test is being performed simultaneously, it is advisable to apply a correction, or adjustment, to p-values. This correction is applied to avoid finding significant results solely because you have done more tests, known as the “multiple testing problem.” For example, in cytometry data analysis, researchers usually describe the immune system in a sample based on multiple cell populations. When comparing samples, it is common that a statistical significance test is repeated for each cell population, known as “multiple testing.” A selection of multiple testing methods is available to handle this scenario.

Note that in the Cytobank platform, each Table dimension, as configured in the layout settings, is handled independently; multiple testing correction is performed within each table dimension. P-values are not corrected across table dimensions.
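The effect of correcting within each Table dimension, rather than across all of them, can be illustrated with a short sketch. Everything here is hypothetical: the function name, the "Day 1" and "Day 7" table dimension values, the population names, and the p-values. A simple Bonferroni-style adjustment (multiply each p-value by the number of tests, capped at 1) is assumed for illustration.

```python
from collections import defaultdict


def bonferroni_within_dimensions(results):
    """Adjust p-values separately inside each table dimension value.

    `results` is a list of (table_dimension, population, p_value) tuples.
    """
    by_dimension = defaultdict(list)
    for dimension, population, p in results:
        by_dimension[dimension].append((population, p))

    adjusted = {}
    for dimension, tests in by_dimension.items():
        m = len(tests)  # number of tests in this dimension only
        for population, p in tests:
            adjusted[(dimension, population)] = min(1.0, p * m)
    return adjusted


# Made-up p-values: two tests under "Day 1", three under "Day 7".
results = [
    ("Day 1", "Tregs", 0.01),
    ("Day 1", "B cells", 0.04),
    ("Day 7", "Tregs", 0.02),
    ("Day 7", "B cells", 0.03),
    ("Day 7", "NK cells", 0.50),
]

adjusted = bonferroni_within_dimensions(results)
for key, value in adjusted.items():
    print(key, round(value, 4))
```

In this sketch the "Day 1" p-values are multiplied by 2 and the "Day 7" p-values by 3; a correction across all table dimensions would instead have used a factor of 5 for every test.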

I. Following a two-group test (Student’s t-test, Mann-Whitney U test, Paired Student’s t-test, or Wilcoxon signed-rank test), the Bonferroni and FDR correction methods can be selected from the Multiple test methods menu.

• Bonferroni correction

Bonferroni correction is a common adjustment method to address the multiple testing problem. It applies a correction based on the number of tests performed together.

• FDR correction – Benjamini & Hochberg procedure

Benjamini & Hochberg procedure is a multiple testing correction method that is designed to control the false discovery rate. Compared to the Bonferroni method, an FDR-controlling method provides less stringent control of Type I errors.
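The two corrections can be compared with a minimal standard-library sketch. The function names and p-values are hypothetical, and the code follows the textbook definitions of the two procedures (Bonferroni: multiply each p-value by the number of tests; Benjamini & Hochberg: scale each p-value by the number of tests divided by its rank, then enforce monotonicity). It is not taken from the Cytobank implementation.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini & Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never
    # increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, one per cell population tested.
raw = [0.01, 0.02, 0.03, 0.20]

bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
n_sig_bonf = sum(p < 0.05 for p in bonf)
n_sig_bh = sum(p < 0.05 for p in bh)
print("Bonferroni:", [round(p, 4) for p in bonf], "significant:", n_sig_bonf)
print("B&H FDR:   ", [round(p, 4) for p in bh], "significant:", n_sig_bh)
```

For these four p-values, only one comparison stays below 0.05 after the Bonferroni correction, while three stay below 0.05 after the Benjamini & Hochberg procedure, which reflects the less stringent control described above.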

II. Following the Kruskal-Wallis H test or the Two-way ANOVA test, a combination of a two-group test and correction method can be selected, to perform pairwise comparisons between all groups, as well as apply a p-value adjustment. This will identify which specific groups differ significantly from each other in the case of more than two groups. If Multiple testing is set to “None,” the only significance test results will be the overall p-value indicating significance among the IV(s).

III. Following Kruskal-Wallis H test:

• Mann-Whitney U test + Bonferroni

Perform all pairwise comparisons using Mann-Whitney U test and apply a Bonferroni correction.

• Mann-Whitney U test + FDR

Perform all pairwise comparisons using Mann-Whitney U test and apply a false discovery rate correction.

IV. Following Two-way ANOVA test:

• Student’s t test + Bonferroni

Perform all pairwise comparisons using Student’s t test and apply a Bonferroni correction.

• Student’s t test + FDR

Perform all pairwise comparisons using Student’s t test and apply a false discovery rate correction.

If a one-way ANOVA is used, Tukey’s HSD test is available for pairwise comparisons between each group of the independent variable.

• Tukey’s HSD (honestly significant difference) test

Tukey’s HSD test, also known as Tukey’s range test or Tukey’s test, is a single-step method that conducts pairwise comparisons while correcting for the multiple testing problem. It compares all possible pairs of group means to identify which groups of an independent variable differ significantly from each other.
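The number of pairwise comparisons grows quickly with the number of groups, which is why a correction matters for follow-up tests. The short sketch below enumerates all pairs for a hypothetical four-group timepoint variable; the helper name is an assumption made for illustration.

```python
from itertools import combinations


def pairwise_comparisons(groups):
    """Return every unordered pair of groups, i.e. k * (k - 1) / 2 comparisons."""
    return list(combinations(groups, 2))


# Hypothetical groups of a timepoint independent variable.
groups = ["Day 1", "Day 5", "Day 10", "Day 20"]
pairs = pairwise_comparisons(groups)

for a, b in pairs:
    print(f"{a} vs {b}")
print("number of comparisons:", len(pairs))
```

Four groups already produce six pairwise comparisons, so a Bonferroni correction applied to these follow-up tests would multiply each p-value by six.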

### How to run a significance test

The statistical inference tool is available on the Illustration Editor page. To access the tool, create a new illustration in the Illustration Editor and set the plot type to Box, Violin, Bar, Line, or Summary dot. Then select a Significance test method from the dropdown list that is compatible with your experimental design, based on the number of independent variables (I.V.), the number of groups, whether the test is parametric or nonparametric, and whether the samples are paired or unpaired.

Figure 2: To see the list of significance test methods, click Illustrations -> New illustration -> Plots -> Select a summary plot from the plot type dropdown list.
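The selection criteria can be summarized as a small decision helper. This is a sketch based only on the test descriptions in this article; the function name and parameters are hypothetical and it is not part of the Cytobank platform.

```python
def choose_significance_test(n_ivs, n_groups, parametric, paired=False):
    """Suggest a test from the experimental design, per the descriptions above.

    n_ivs:      number of independent variables (1 or 2)
    n_groups:   number of groups per independent variable
    parametric: True if the data can be assumed to follow a normal distribution
    paired:     True if the same individuals are measured in both groups
    """
    if n_ivs == 2:
        # Two-way ANOVA is the only two-I.V. test described, and it is parametric.
        if parametric and not paired and n_groups >= 2:
            return "Two-way ANOVA"
    elif n_ivs == 1:
        if n_groups == 2:
            if paired:
                return "Paired Student's t-test" if parametric else "Wilcoxon signed-rank test"
            return "Student's t-test" if parametric else "Mann-Whitney U test"
        if n_groups > 2 and not paired:
            return "One-way ANOVA" if parametric else "Kruskal-Wallis H test"
    raise ValueError("no test described in this article matches this design")


print(choose_significance_test(n_ivs=1, n_groups=2, parametric=False, paired=True))
print(choose_significance_test(n_ivs=1, n_groups=3, parametric=True))
```

Designs that the article does not cover, such as a paired comparison of more than two groups, raise an error rather than guessing at a test.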

### A workflow for setting up a statistical significance test

1. Choose which statistic should be compared and which statistical test to use from the plot settings menu.

I. Select a summary Plot type from the Plot type dropdown.

II. Choose a Statistic from the Statistic dropdown. This statistic will be calculated for each sample selected for inclusion in the illustration, and will be used for significance testing.

III. Pick a significance test from the Significance test dropdown based on the displayed assumptions and constraints of each test shown.

• If a paired test is selected (Paired Student’s t test or Wilcoxon signed-rank test), the Cytobank platform automatically extracts pairing information from the sample tags of an experiment. For more information on setting up a paired test, see this article.

IV. If applicable (performing multiple two-group tests within each plot, or choosing a Kruskal-Wallis H, One-way ANOVA, or Two-way ANOVA test), choose an appropriate Multiple testing method.

2. Configure the illustration layout from the layout menu

I. Assign the variable defining groups of interest as the independent variable (I.V.). The second row of the layout menu, labelled Subgroups, will be used to assign the I.V. by default. (Note: in some layouts, the I.V. may move to the first row of the layout menu.) Choose an annotated Figure dimension as the Subgroup/IV(s) to be compared.

II. Select at least two Sample tags within the I.V. to indicate sample groups.

i. If a two-group test is selected (Student’s t-test, Mann-Whitney U test, Paired Student’s t-test, or Wilcoxon signed-rank test), make sure the figure dimension specified as the I.V. has exactly two groups selected. An error message will appear if you have selected fewer than or more than two sample tags as groups.

ii. If a two-way ANOVA significance test is selected, two Figure dimensions are required, labelled I.V. #1 and I.V. #2 in the first and second rows of the Layout settings menu.

Figure 3: Workflow to set up a significance test

### How to interpret the results

The Cytobank statistical inference tool displays the result of the significance test in two ways.

1. P-values or asterisks displayed in the Illustration

P-values or asterisks are shown within and/or adjacent to the Illustration plot. The format of the displayed p-values or symbols depends on the statistical inference test and the selected Show p-values option. Significance symbols correspond to p-values as described in the table below.

• Two-group tests (Student’s t test, Mann-Whitney U test, Paired Student’s t test and Wilcoxon signed-rank test):
  • If Show p-values is set to Symbols or P-values, the selected option is displayed on the plot, centered and underlined between the two groups being compared, either for all comparisons, or only significant comparisons, defined by p < 0.05.
  • When a follow-up “multiple testing” option is enabled, the p-values and corresponding symbols displayed on the plot are the adjusted p-values for all the significance tests.

Figure 4: Boxplots showing significant results following Student’s t-test (left) and Student’s t test with Bonferroni correction (right)

• One-way ANOVA, Kruskal-Wallis H test:
  • If Show p-values is set to Symbols or P-values, the selected option appears atop a group of data on the plot to indicate that there are significant differences between groups in that X-axis group, if an X-axis group is selected. If there is no X-axis group selected, the symbol represents significant differences between groups among all the samples displayed underneath the symbol or p-value underline.
  • If a multiple testing option is enabled, an additional summary result table appears to the right of each plot if there are any statistically significant (p < 0.05) pairwise comparisons. This table lists all significant pairwise comparisons and their corresponding significance symbols.

Figure 5: Bar plots showing One-way analysis of variance (left) and One-way analysis of variance with Tukey HSD follow up test significant comparisons (right).

• Two-way ANOVA:
  • The names of the significant I.V.s are displayed adjacent to each plot along with their corresponding significance symbol.
  • Enabling a multiple testing method does not change the display of significance on the plot, only the Statistical Inference table (see below).

Figure 6: Box plots showing significant variables identified by Two-way analysis of variance test.

For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown if Show p values is set to P values (all) or Symbols (all).

2. Complete testing results in the Statistical Inference table

A statistical inference table will be displayed in the Illustration when a significance test is performed. The table is located below the summary plots in a section called “Statistical Inference.”

Two-group tests

For two-group tests (Student’s t-test, Mann-Whitney U test, Paired Student’s t-test, Wilcoxon signed-rank test), each row of the table summarizes one comparison. Depending on the selected significance testing method, the following columns will be displayed in the table.

a. Mean A or Mean B

i. Calculated mean of independent variable groups, referred to as A and B, if the Summary method in the Plot settings is set to mean (option available for Bar and Line plots)

b. Median A or Median B

i. Calculated median of independent variable groups, referred to as A and B, if the Summary method in the Plot settings is set to median (option available for Bar and Line plots)

c. SD A or SD B

i. Standard deviation of independent variable groups A and B

d. “n A” or “n B”

i. Number of data points of independent variable groups A and B

e. p-value

i. Unadjusted p-value for the comparison of group A and group B

ii. For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown.

f. Adjusted p-value

i. If a multiple testing method is used following a two-group test, both the p-value and the adjusted p-value are shown.

ii. If a multiple testing method is used following a Kruskal-Wallis H, One-way ANOVA, or Two-way ANOVA test, only adjusted p-values appear in this table.

iii. For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown.

g. Significance p-value

i. Symbol denoting unadjusted p-value for the comparison of group A and group B

ii. For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown.

h. Significance adjusted p-value

i. Symbol denoting adjusted p-value for the comparison of group A and group B

ii. For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown.

Kruskal-Wallis H, One-way ANOVA, Two-way ANOVA

If One-way ANOVA, Two-way ANOVA, or Kruskal-Wallis H test is selected, and a multiple testing option is enabled, two result tables will be displayed. The result table at the top is labelled with the test name and shows the omnibus results, indicating whether the groups of the I.V.(s) differ significantly. The following columns are displayed in this table:

• Table figure dimension name(s)
  • If a Table(s) dimension is enabled, the name of this dimension appears in the leftmost column(s)
• X-axis figure dimension name
  • If an X-axis figure dimension is selected for a test with 1 I.V., the name of the dimension appears in the next column. For a Two-way ANOVA test, the X-axis figure dimension will appear in the I.V. column
• Independent variable (I.V.)
  • Name of the independent variable(s) used for comparison
• P-value
  • Overall p-value for the variable(s) listed in that table row
  • For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown.
• Significance p-value
  • Symbol denoting overall p-value significance
  • For any tests that cannot be completed, due to missing data, or data that have no variation, NaN will be shown.

The second table at the bottom of the statistical inference table section is labelled with the name of the selected multiple testing method and shows the follow-up analysis results. The columns follow the format of the two-group test tables (above).

For example, if One-way ANOVA is performed and Tukey HSD test is selected as the follow-up analysis, the top result table reports the result of the One-way ANOVA test. The bottom table reports the results of Tukey HSD test.

### Key statistical terms

• Paired samples
  • Samples that have been measured multiple times in one experiment. A common example of paired samples is an individual measured before and after experimental intervention. In this case, the Individual is used as the “pairing dimension.”
• Non-parametric vs. parametric
  • Parametric methods are statistical inference methods that assume sample data follows a probability distribution that can be described using a set of parameters. For example, Student’s t-test is a parametric method because it assumes sample data follows a normal distribution that can be defined by two parameters, mean and variance. Unlike parametric methods, non-parametric methods do not assume sample data follow a parameterized distribution.
• Independent variable (I.V.)
  • An I.V. is an experimental factor that defines an assignment of samples. It is a categorical variable that has a fixed number of possible values. For example, in a dataset that has pre-treatment and post-treatment samples, the I.V. of the dataset is a condition factor that defines the two groups of samples. An I.V. is selected using annotated Figure dimensions.
• Groups of an independent variable
  • Groups are the possible values of an I.V. For example, consider a dataset with measurements taken at 4 timepoints. The I.V. is the timepoint and there are 4 possible values to group the samples. They are Day 1, Day 5, Day 10, and Day 20. These four possible values are the groups of the timepoint independent variable and are annotated as Sample tags.
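The relationship between an I.V., its groups, and a pairing dimension can be illustrated with a small data-structure sketch. The sample records, tag names ("Individual", "Timepoint"), and helper function are hypothetical. Skipping individuals that lack a measurement in one of the two groups is an assumption made for this sketch; the article does not describe how the Cytobank platform handles incomplete pairs.

```python
from collections import defaultdict


def build_pairs(samples, iv, pairing_dimension, group_a, group_b):
    """Match each individual's value in group A with its value in group B."""
    by_individual = defaultdict(dict)
    for sample in samples:
        by_individual[sample[pairing_dimension]][sample[iv]] = sample["value"]

    pairs = {}
    for individual, values in by_individual.items():
        # Assumption: individuals missing either group are left out.
        if group_a in values and group_b in values:
            pairs[individual] = (values[group_a], values[group_b])
    return pairs


# Hypothetical samples annotated with sample tags (made-up values).
samples = [
    {"Individual": "Mouse 1", "Timepoint": "Pre", "value": 10.0},
    {"Individual": "Mouse 1", "Timepoint": "Post", "value": 8.0},
    {"Individual": "Mouse 2", "Timepoint": "Pre", "value": 12.0},
    {"Individual": "Mouse 2", "Timepoint": "Post", "value": 11.0},
    {"Individual": "Mouse 3", "Timepoint": "Pre", "value": 9.0},
]

pairs = build_pairs(samples, iv="Timepoint", pairing_dimension="Individual",
                    group_a="Pre", group_b="Post")
print(pairs)
```

Here "Timepoint" plays the role of the I.V., "Pre" and "Post" are its two groups, and "Individual" is the pairing dimension that links the two measurements of each mouse.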
