# Standardization

##### 139 words — categories: linear-algebra

For any vector \(x\), we refer to \(\tilde{x} = x - \mathbf{avg}(x)\mathbf{1}\) as the de-meaned version of \(x\), since it has average or mean value zero. If we then divide by the RMS value of \(\tilde{x}\) (which is the standard deviation of \(x\)), we obtain the vector

\[z = \frac{1}{\mathbf{std}(x)} (x - \mathbf{avg}(x)\mathbf{1})\]

This vector is called the **standardized** version of \(x\). It has mean zero, and standard deviation one. Its entries are sometimes called the *z-scores* associated with the original entries of \(x\). For example, \(z_4 = 1.4\) means that \(x_4\) is 1.4 standard deviations above the mean of the entries of \(x\).

```
import numpy as np
x = np.array([3, 5, 5])
def standardize(x):
return (1 / np.std(x)) * (x - np.mean(x))
z = standardize(x)
z
```

`## array([-1.41421356, 0.70710678, 0.70710678])`

`round(np.mean(z), 2)`

`## 0.0`

`round(np.std(z), 2)`

`## 1.0`