You can use a metric’s standard deviation to plot a bounded region, typically \(\pm 3\sigma\) around the mean, within which the metric is assumed to be behaving normally: it isn’t wandering too far from its mean. This is useful in analytics because, more often than not, people latch onto meaningless fluctuations and believe them to be worthy of attention. Control limits can bring some clarity to these situations.
The multiplier you apply to the standard deviation should depend on the time frame and the variability of the data you’re inspecting. In web analytics, depending on how much steady traffic a website gets, three standard deviations might be too many, because the bands become too broad. If you’re on the lookout for anomalies, bands that are too wide lead to false negatives: real anomalies that go undetected. It’s useful to check whether the data you’re collecting follows a normal distribution, which can inform how tight your threshold should be.
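One rough way to check for normality is the empirical rule: for normally distributed data, roughly 68%, 95%, and 99.7% of values fall within one, two, and three standard deviations of the mean. The sketch below is only an illustration. The helper name `sigma_coverage` and the synthetic "daily visits" sample are assumptions made for this example, not part of the original analysis, and a formal test may be more appropriate for real data.

```python
import numpy as np


def sigma_coverage(values, ks=(1, 2, 3)):
    """Return the share of values within k sample standard deviations of the mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    stdev = values.std(ddof=1)  # sample standard deviation
    return {k: float(np.mean(np.abs(values - mean) <= k * stdev)) for k in ks}


# Synthetic, normally distributed sample used purely for illustration
rng = np.random.default_rng(0)
synthetic_visits = rng.normal(loc=550, scale=40, size=10_000)

coverage = sigma_coverage(synthetic_visits)
for k, share in coverage.items():
    print(f"within {k} sigma: {share:.3f}")
```

If the shares for your own data are far from 68/95/99.7, the normality assumption behind the bands is questionable and the multiplier deserves a second look.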
import numpy as np
import matplotlib.pyplot as plt

arr = np.array([523, 579, 530, 52, 569, 603, 587, 505, 809, 956, 502, 520, 545, 521, 534])

mean = np.mean(arr)
stdev = np.std(arr, ddof=1)  # sample standard deviation
ucl = mean + (2 * stdev)  # upper control limit
lcl = mean - (2 * stdev)  # lower control limit
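With the mean and the two control limits as our reference points, we can plot the data and color any point that falls outside the limits red.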
# Red for points outside the control limits, blue otherwise
col = np.where((arr < lcl) | (arr > ucl), "red", "blue")
x_coords = np.arange(len(arr))

plt.scatter(x_coords, arr, c=col)
plt.axhline(mean, color="black")  # center line
plt.axhline(ucl, color="gray", linestyle="--")  # upper control limit
plt.axhline(lcl, color="gray", linestyle="--")  # lower control limit
plt.title("Control Limit Plot")
plt.show()
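To see how much the choice of multiplier matters, the following sketch runs the same sample through a two-sigma and a three-sigma band and lists which values each one flags. The helper `flag_outliers` is a hypothetical name introduced for this illustration.

```python
import numpy as np

arr = np.array([523, 579, 530, 52, 569, 603, 587, 505, 809, 956, 502, 520, 545, 521, 534])


def flag_outliers(values, k):
    """Return the values lying outside mean +/- k sample standard deviations."""
    mean = np.mean(values)
    stdev = np.std(values, ddof=1)  # sample standard deviation
    lcl = mean - k * stdev  # lower control limit
    ucl = mean + k * stdev  # upper control limit
    return values[(values < lcl) | (values > ucl)]


flagged_2 = flag_outliers(arr, 2)
flagged_3 = flag_outliers(arr, 3)

print("flagged at 2 sigma:", flagged_2.tolist())
print("flagged at 3 sigma:", flagged_3.tolist())
```

On this sample, the two-sigma band flags 52 and 956, while the three-sigma band flags nothing at all. Part of the reason is that the extreme values themselves inflate the standard deviation, which widens the band.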