Statistics in Optimize
There are numerous calculations that you'll find across the Webtrends Optimize platform, which vary based on the type of Test you're running and the type of Metric you're reporting on. This document looks to summarise all of these.
Starting first with a few descriptions:
Binomial Data
The important part of the word binomial is "bi", as in 2. Binomial data in Experimentation is therefore data that has 2 possible outcomes. For us, this is that an event did, or did not happen. When considering our traditional goals - purchases, page loads, clicks, etc. - binomial calculations consider whether or not these happened. Statistics on this data consider the likelihood of these actions.
Non-Binomial Data
Unlike event-level data, where we're capturing whether (and how often) things happened, non-binomial data records a numeric value for each event. Examples include Revenue, Units Per Transaction, etc. Such data is as useful as binomial data - knowing you've made more sales is great, but money earned is as important if not more.
AB/n Tests
These are single-factor tests, i.e. those used for one controlled variable. You may have several variations as part of these tests, hence AB/n as opposed to a simple AB test.
For these tests, we use the following statistical methods:
Binomial Data: 2 Tailed Student's t-test
T-tests compare the differences in conversion rate, measuring the amount of overlap in distribution given an error rate. This value is known as the P-value. The smaller the overlap (the more extreme the data sets are), the smaller the P-value and therefore the more likely you are to have a significant result (whether positive or negative).
In Optimize, we provide the ability to control the Confidence Level. This is an inverted P-Value, which is more easily interpreted. The higher the Confidence Level, the higher our degree of stringency to call a change in conversion rate Statistically Significant.
Alongside P-Value based Significance calculations, you will find Chance to Beat Control - a Bayesian measurement. This is a percentage likelihood of one sample group outperforming another - e.g. the Variation has a 100% Chance to Beat Control.
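As a rough illustration of the idea, the sketch below computes a two-tailed test statistic for the difference between two conversion rates. It is not the platform's implementation: the function name and the sample figures are hypothetical, and the P-value uses a normal approximation to the t-distribution, which is an assumption that only holds well for samples of a few hundred visitors or more.

```python
import math
from statistics import NormalDist


def two_tailed_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Welch-style t statistic and two-tailed P-value for conversion rates.

    Assumption: the P-value comes from a normal approximation to the
    t-distribution, which is close for large samples.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Sample variance of 0/1 data is p * (1 - p) * n / (n - 1).
    var_a = rate_a * (1 - rate_a) * visitors_a / (visitors_a - 1)
    var_b = rate_b * (1 - rate_b) * visitors_b / (visitors_b - 1)
    std_error = math.sqrt(var_a / visitors_a + var_b / visitors_b)
    t_stat = (rate_b - rate_a) / std_error
    # Two-tailed: extreme results in either direction count.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Hypothetical figures: Control converts 200 of 5000, Variation 260 of 5000.
t_stat, p_value = two_tailed_test(200, 5000, 260, 5000)
print(f"t = {t_stat:.2f}, P-value = {p_value:.4f}")
```

Swapping the two groups flips the sign of the statistic but leaves the P-value unchanged, which is what "two-tailed" means in practice.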
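To make the relationship concrete, here is a minimal sketch that assumes "inverted" means 1 minus the P-value, expressed as a percentage. Both function names are hypothetical and are not part of the Optimize platform.

```python
def confidence_from_p_value(p_value):
    """Express a P-value as a confidence percentage (assumed: 1 - P)."""
    return (1 - p_value) * 100


def is_significant(p_value, confidence_level=95.0):
    """True when the observed confidence meets the chosen Confidence Level."""
    return confidence_from_p_value(p_value) >= confidence_level


# The same result can pass at one Confidence Level and fail at a stricter one.
print(is_significant(0.03, confidence_level=95.0))
print(is_significant(0.03, confidence_level=99.0))
```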
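One common way to arrive at a figure of this kind is to sample from a Beta posterior for each group and count how often the Variation comes out ahead. The sketch below takes that approach under a uniform prior. This is an assumption for illustration only: the source does not state which Bayesian model Optimize uses, and the function name and figures are hypothetical.

```python
import random


def chance_to_beat_control(conv_c, n_c, conv_v, n_v, draws=20000, seed=1):
    """Monte Carlo estimate of P(variation rate > control rate).

    Assumption: a uniform Beta(1, 1) prior on each conversion rate.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1 + successes, 1 + failures) is the posterior for each group.
        control_rate = rng.betavariate(1 + conv_c, 1 + n_c - conv_c)
        variation_rate = rng.betavariate(1 + conv_v, 1 + n_v - conv_v)
        if variation_rate > control_rate:
            wins += 1
    return wins / draws


# Hypothetical figures: Control 200/5000, Variation 260/5000.
chance = chance_to_beat_control(200, 5000, 260, 5000)
print(f"Chance to Beat Control: {chance:.1%}")
```

Two groups with identical data should land close to 50%, which is a useful sanity check for any implementation of this measure.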
Non-Binomial Data: Wilcoxon Rank Sum test / Mann-Whitney U-test
For numeric data such as Revenue, we generate a Chance to Beat All using the multiple comparison method based on the Wilcoxon Rank Sum (pairwise two-sample rankings), also known as the Mann-Whitney U Test.
Much like binomial data, this is a comparison of distributions that we build. It is, however, based on the ranks of the observed values (and so reflects the median) as opposed to the conversion rate above.
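The pairwise ranking step can be sketched as follows for a single Control vs Variation comparison. This is a simplified illustration rather than the platform's code: it uses a normal approximation with no tie or continuity correction, it does not include the multiple comparison step, and the function names and revenue figures are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def mann_whitney_u(control, variation):
    """Return (U for the variation, two-tailed P-value).

    Assumption: normal approximation, no tie or continuity correction.
    """
    n_c, n_v = len(control), len(variation)
    ranks = average_ranks(list(control) + list(variation))
    rank_sum_v = sum(ranks[n_c:])
    u_v = rank_sum_v - n_v * (n_v + 1) / 2
    mean_u = n_c * n_v / 2
    sd_u = math.sqrt(n_c * n_v * (n_c + n_v + 1) / 12)
    z = (u_v - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_v, p_value


# Hypothetical order values for each group.
control_revenue = [20.0, 35.5, 18.0, 42.0, 27.5, 31.0]
variation_revenue = [38.0, 45.0, 29.0, 52.5, 41.0, 36.0]
u_stat, p_value = mann_whitney_u(control_revenue, variation_revenue)
print(f"U = {u_stat}, P-value = {p_value:.4f}")
```

Because only the ordering of values matters, a handful of unusually large orders shifts the result far less than it would shift a comparison of means.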
Multi-Variate Tests (MVTs)
These analyse the effect of multiple independent variables (factors) on conversion rate. Each factor may have several variations, and together these form our Test Array (e.g. 3x3x3).
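To show what a 3x3x3 Test Array means in practice, the short sketch below enumerates every combination for three factors with three variations each. The factor and variation names are hypothetical.

```python
from itertools import product

# Hypothetical factors, each with three variations (levels).
factors = {
    "headline": ["A", "B", "C"],
    "image": ["A", "B", "C"],
    "button": ["A", "B", "C"],
}

# A full factorial array contains every combination of levels.
full_factorial = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

print(len(full_factorial))  # 3 * 3 * 3 = 27 experiences
print(full_factorial[0])
```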
Design of Experiments (Fisher/Taguchi)
The Design of Experiments methodology used in Webtrends Optimize facilitates our patented Fractional Factorial approach - an efficiency on traditional Full Factorial test arrays.
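For a sense of how a fractional array saves traffic, the sketch below builds a textbook 9-run orthogonal array for three 3-level factors using a Latin-square construction, then checks that every pair of factors is still balanced. This is a generic illustration and an assumption on our part. It is not the patented Webtrends Optimize array, and the function names are hypothetical.

```python
from itertools import combinations, product

LEVELS = 3


def fractional_array():
    """9-run orthogonal array for three 3-level factors.

    The third factor's level is derived from the first two
    (Latin-square construction), so only 9 of 27 runs are needed.
    """
    return [(a, b, (a + b) % LEVELS)
            for a, b in product(range(LEVELS), repeat=2)]


def is_pairwise_balanced(runs):
    """Check each pair of factors shows every level pairing equally often."""
    n_factors = len(runs[0])
    for i, j in combinations(range(n_factors), 2):
        seen = {}
        for run in runs:
            key = (run[i], run[j])
            seen[key] = seen.get(key, 0) + 1
        if len(seen) != LEVELS ** 2 or len(set(seen.values())) != 1:
            return False
    return True


runs = fractional_array()
print(len(runs), "runs instead of", LEVELS ** 3)
print(is_pairwise_balanced(runs))
```

The balance check is what allows the main effect of each factor to be estimated from a subset of runs, at the cost of not separating out higher-order interactions.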
ANOVA (Analysis of Variance)
Least Squares Estimator matrices
Uniquely in the Optimize platform, we find LSEs used as a technique for understanding Factor influence. In simple terms, the technique allows us to report back on which Factors hold the most influence over conversion rate in the scope of a given Test, and which Levels (Variations) hold the most influence over conversion rate in the scope of a given Factor.
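The general idea can be sketched with ordinary least squares on a dummy-coded design matrix: each coefficient estimates how much a Level moves conversion rate relative to the first Level of its Factor, and the spread of those coefficients gives a simple measure of Factor influence. The coding scheme, function names and sample rates below are assumptions for illustration, not the platform's actual matrices.

```python
from itertools import product


def solve(matrix, vector):
    """Solve a square linear system by Gauss-Jordan elimination."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y
                            for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def level_effects(runs, rates, n_levels=3):
    """Least-squares effect of each level, relative to level 0 of its factor."""
    # Design matrix: intercept, then one 0/1 column per non-baseline level.
    design = []
    for run in runs:
        row = [1.0]
        for level in run:
            row += [1.0 if level == k else 0.0 for k in range(1, n_levels)]
        design.append(row)
    width = len(design[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(width)]
           for i in range(width)]
    xty = [sum(r[i] * y for r, y in zip(design, rates))
           for i in range(width)]
    coeffs = solve(xtx, xty)
    effects = []
    for f in range(len(runs[0])):
        start = 1 + f * (n_levels - 1)
        effects.append([0.0] + coeffs[start:start + n_levels - 1])
    return effects


def factor_influence(effects):
    """Spread between the best and worst level of each factor."""
    return [max(e) - min(e) for e in effects]


# Hypothetical 3x3 test: factor 0 matters far more than factor 1.
runs = list(product(range(3), repeat=2))
rates = [0.05 + [0.0, 0.01, 0.02][a] + [0.0, 0.001, -0.001][b]
         for a, b in runs]
effects = level_effects(runs, rates)
print(factor_influence(effects))
```

Reading the output, the Factor with the largest spread is the one whose choice of Level moves conversion rate the most within this Test.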
Other tests and calculations which feature in the platform:
- Hodges–Lehmann estimators
- Kruskal-Wallis & Nemenyi
- ANOVA with F-Test - calculating degrees of freedom.
- Chi-Squared test
- Dunnett's procedure and the Fleiss, Tytun, Ury procedure for calculation of sample sizes.
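As one example from the list above, the two-sample Hodges–Lehmann estimator is the median of all pairwise differences between two groups, and it pairs naturally with rank-based tests on numeric data such as Revenue. The sketch below shows the textbook definition. The function name and figures are hypothetical, and how the platform applies the estimator is not described here.

```python
from statistics import median


def hodges_lehmann_shift(control, variation):
    """Median of all pairwise differences (variation - control)."""
    return median(v - c for c in control for v in variation)


# Hypothetical order values: the variation sits roughly 5 higher.
control_revenue = [10.0, 20.0, 30.0]
variation_revenue = [15.0, 25.0, 35.0]
print(hodges_lehmann_shift(control_revenue, variation_revenue))
```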