The Chebyshev inequality goes as follows. Suppose that $$x$$ is an $$n$$-vector, and that $$k$$ of its entries satisfy $$|x_i| \geq a$$, where $$a > 0$$. Then those same $$k$$ entries satisfy $$x_i^2 \geq a^2$$. It follows that

$||x||^2 = x_1^2 + \cdots + x_n^2 \geq ka^2$

since $$k$$ of the numbers in the sum are at least $$a^2$$, and the other $$n - k$$ numbers are nonnegative. We conclude that $$k \leq ||x||^2 / a^2$$, which is the Chebyshev inequality.
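The derivation above can be checked numerically. The vector and threshold below are hypothetical values chosen for illustration:

```python
import numpy as np

# hypothetical example vector and threshold
x = np.array([0.3, -2.5, 1.1, 4.0, -0.2])
a = 2.0

k = np.count_nonzero(np.abs(x) >= a)  # number of entries with |x_i| >= a
bound = np.sum(x ** 2) / a ** 2       # ||x||^2 / a^2

print(k, bound, k <= bound)
```

Here two entries ($$-2.5$$ and $$4.0$$) have absolute value at least $$2$$, and the bound $$\|x\|^2/a^2$$ comfortably exceeds that count.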

When $$||x||^2 / a^2 \geq n$$, the inequality tells us nothing, since we always have $$k \leq n$$. In other cases, it limits the number of entries in a vector that can be large. In particular, it implies that no entry of a vector can be larger in magnitude than the norm of the vector: if some entry satisfied $$|x_i| > ||x||$$, taking $$a = |x_i|$$ would give $$k \leq ||x||^2/a^2 < 1$$, i.e., $$k = 0$$, a contradiction. Another interpretation, using the RMS value, looks like this

$\frac{k}{n} \leq \bigg( \frac{\mathbf{rms}(x)}{a} \bigg)^2$

where $$k$$ is the number of entries of $$x$$ with absolute value at least $$a$$. The left-hand side is the fraction of entries of the vector that are at least $$a$$ in absolute value. The right-hand side is the inverse square of the ratio of $$a$$ to $$\mathbf{rms}(x)$$. It says, for example, that no more than 1/25 = 4% of the entries of a vector can exceed 5 times its RMS value in absolute value. Two statements that follow from the inequality:

• not too many of the entries of a vector can be much bigger (in absolute value) than its RMS value
• at least one entry of a vector has absolute value as large as the RMS value of the vector
As a quick numerical check:

import numpy as np

a = 6

x = np.array([3, 4, 1, 2, 4, 8, 5])
n = x.size
k = np.count_nonzero(np.abs(x) >= a)  # entries with |x_i| >= a
rms = np.sqrt(np.mean(x ** 2))

k / n <= (rms / a) ** 2
## True

Relation To Standard Deviation

The Chebyshev inequality can be transcribed to an inequality expressed in terms of the mean and standard deviation: if $$k$$ is the number of entries of $$x$$ that satisfy $$|x_i - \mathbf{avg}(x)| \geq a$$, then $$k/n \leq (\mathbf{std}(x)/a)^2$$.
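This mean/standard-deviation form can be verified the same way as before, reusing the earlier example vector and an illustrative threshold of $$a = 3$$:

```python
import numpy as np

x = np.array([3.0, 4.0, 1.0, 2.0, 4.0, 8.0, 5.0])  # same example vector as above
a = 3.0

avg = np.mean(x)
std = np.std(x)                                # population standard deviation
k = np.count_nonzero(np.abs(x - avg) >= a)     # entries far from the mean

print(k / x.size <= (std / a) ** 2)
## True
```

Only the entry $$8$$ is at least $$3$$ away from the mean (about $$3.86$$), so $$k/n = 1/7$$, well under the bound.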

In probability theory the Chebyshev inequality is used to make statements along the lines of “at most 1/9 ≈ 11.1% of a distribution can lie three or more standard deviations away from the mean.”

Another way to state this is: the fraction of entries of $$x$$ within $$\alpha$$ standard deviations of $$\mathbf{avg}(x)$$ is at least $$1 - 1/\alpha^2$$ (for $$\alpha > 1$$).
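The complement form holds for any vector; the random sample below is just an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)  # hypothetical sample data, for illustration
alpha = 3.0

avg, std = np.mean(x), np.std(x)
# fraction of entries strictly within alpha standard deviations of the mean
within = np.count_nonzero(np.abs(x - avg) < alpha * std) / x.size

print(within >= 1 - 1 / alpha ** 2)
## True
```

For this sample the observed fraction is far above the guaranteed $$1 - 1/9 \approx 88.9\%$$; Chebyshev gives only a worst-case floor.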

As an example, consider a time series of returns on an investment, with a mean return of 8% and a risk (standard deviation) of 3%. By the Chebyshev inequality, the fraction of periods with a loss ($$x_i \leq 0$$) is no more than $$(3/8)^2$$, or about 14.1%.
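The arithmetic behind this bound is short. A loss means $$x_i \leq 0$$, i.e., the return deviates from the 8% mean by at least 8 percentage points, so $$a = 8$$:

```python
# parameters from the example: mean return 8%, standard deviation (risk) 3%
mu, sigma = 8.0, 3.0
a = mu  # a loss (x_i <= 0) means |x_i - mu| >= mu

bound = (sigma / a) ** 2  # Chebyshev bound on the fraction of loss periods
print(round(bound, 4))    # 0.1406, i.e. about 14.1%
```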