Control limits are visual references that help you detect if a statistic (or time series) is getting “out of control.” I first saw them referenced on Avinash Kaushik’s blog.

You can use a metric’s standard deviation to plot a bounded region, conventionally \(\pm 3 \sigma\) around the mean, within which the statistic is assumed to be behaving normally: it’s not wandering too far from its mean. This is useful in analytics because, more often than not, people latch onto meaningless fluctuations and believe them to be worthy of attention. Control limits can bring some clarity to these situations.

The multiple of the standard deviation you choose should depend on the time frame and the variability of the data you’re inspecting. In web analytics, depending on how much steady traffic a website gets, three standard deviations might be too much, because the bands will be too broad. If you’re on the lookout for anomalies, overly wide bands let real anomalies slip through undetected (false negatives). It’s also useful to check whether the data you’re collecting roughly follows a normal distribution. If it does, about 95% of observations fall within \(\pm 2 \sigma\) and about 99.7% within \(\pm 3 \sigma\), which can inform how tight your threshold should be.
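One rough way to check the normality assumption is to compare the share of observations that fall within \(\pm k \sigma\) against what a normal distribution would predict. The sketch below is only an illustration: `coverage` and `normal_coverage` are hypothetical helper names, and the series is synthetic data generated for the example, not real traffic.

```python
import math

import numpy as np


def coverage(values, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    stdev = values.std(ddof=1)  # sample standard deviation
    return float(np.mean(np.abs(values - mean) <= k * stdev))


def normal_coverage(k):
    """Share of a normal distribution lying within k standard deviations."""
    return math.erf(k / math.sqrt(2))


# Synthetic stand-in for a metric (an assumption); swap in your own series.
rng = np.random.default_rng(0)
sample = rng.normal(loc=550, scale=40, size=5000)

for k in (1, 2, 3):
    print(f"k={k}: observed {coverage(sample, k):.3f}, normal {normal_coverage(k):.3f}")
```

If the observed shares sit far from the normal ones, the usual 2 or 3 sigma rules of thumb may not mean what you expect for that metric, and the threshold deserves a closer look.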

import numpy as np
import matplotlib.pyplot as plt

arr = np.array([523, 579, 530, 52, 569, 603, 587, 505, 809, 956, 502, 520, 545, 521, 534])
mean = np.mean(arr)
stdev = np.std(arr, ddof=1) # sample standard deviation
ucl = mean + (2 * stdev) # upper control limit
lcl = mean - (2 * stdev) # lower control limit

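Here are our reference points: the mean, the upper control limit, and the lower control limit, in that order, each rounded to the nearest whole number.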

## 556.0
## 931.0
## 181.0
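# Color points outside the control limits red and the rest blue.
col = np.where((arr < lcl) | (arr > ucl), "red", "blue")
x_coords = np.arange(len(arr))

plt.scatter(x_coords, arr, c=col)
plt.axhline(mean, color="black")  # center line at the mean
plt.axhline(ucl, color="gray", linestyle="--")  # upper control limit
plt.axhline(lcl, color="gray", linestyle="--")  # lower control limit
plt.title("Control Limit Plot")
plt.show()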
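To see how much the choice of multiplier matters, the same calculation can be wrapped in a small function and run with different values of `k`. This is a minimal sketch on the same data as above; `flag_out_of_control` is a hypothetical helper name introduced for illustration, not part of any library.

```python
import numpy as np


def flag_out_of_control(values, k=2.0):
    """Return (lcl, ucl, indices) for points outside mean +/- k sample standard deviations."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    stdev = values.std(ddof=1)  # sample standard deviation
    lcl, ucl = mean - k * stdev, mean + k * stdev
    outside = np.flatnonzero((values < lcl) | (values > ucl))
    return lcl, ucl, outside.tolist()


arr = [523, 579, 530, 52, 569, 603, 587, 505, 809, 956, 502, 520, 545, 521, 534]

for k in (2, 3):
    lcl, ucl, flagged = flag_out_of_control(arr, k)
    print(f"k={k}: limits ({lcl:.0f}, {ucl:.0f}), flagged indices {flagged}")
```

With `k=2` the values 52 and 956 (indices 3 and 9) fall outside the limits. With `k=3` the bands widen enough that nothing is flagged, which is the false-negative risk described earlier.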