
Probability Density Function

Again, instead of memorizing the formula, I decomposed the equation to its fundamental components. I analyzed the exponent term first:

x-μ measures the difference between each element and the mean, or expected value (assuming the sample is large enough and unbiased). But what does it mean to divide this difference by σ, the standard deviation? The formula becomes a little less opaque when it's rewritten this way:
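Assuming the standard form of the normal density, where the exponent is -1/2 times the squared ratio, the core of that term can be written as:

```latex
\[
\left(\frac{x-\mu}{\sigma}\right)^{2} \;=\; \frac{(x-\mu)^{2}}{\sigma^{2}}
\]
```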

The numerator and denominator look eerily similar when I deconstruct the variance term:
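As a sketch, assuming the population variance over N elements (N and the index i are notation introduced here for illustration), expanding σ² puts a squared deviation on both sides of the fraction:

```latex
\[
\sigma^{2} = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^{2}
\qquad\Longrightarrow\qquad
\frac{(x-\mu)^{2}}{\sigma^{2}}
= \frac{(x-\mu)^{2}}{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^{2}}
\]
```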

This calculation compares each squared deviation to the average squared deviation of the whole population. In other words, it is a way to standardize the data.
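Here is a minimal sketch of that standardization in Python. The data values are made up for illustration, and the population standard deviation is assumed (dividing by N rather than N-1).

```python
from statistics import fmean, pstdev

# Made-up sample, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = fmean(data)            # mean of the data
sigma = pstdev(data, mu)    # population standard deviation

# Each deviation expressed in units of sigma (a z-score).
z_scores = [(x - mu) / sigma for x in data]

print(mu, sigma)
print(z_scores)
```

After standardizing, the values have mean 0 and standard deviation 1, whatever units the original data used.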
What about the factor -1/2? Once again, I find it easier to understand the concept when it is decomposed into -1 and 1/2. The -1 flips the symmetric smile into a symmetric mountain.

How about 1/2? This one took me some time to understand. If I remove the term, the distribution becomes more concentrated around the mean (red graph below). But why is this called 'normal' or 'standard'?
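To put a number on the narrowing, here is a small sketch that compares the two exponentials with σ assumed to be 1. The function names and the half-width-at-half-maximum measure are my own choices for illustration, not part of the original formula.

```python
import math

def with_half(z):
    """Bell curve kernel that keeps the 1/2: e^(-z^2 / 2)."""
    return math.exp(-0.5 * z * z)

def without_half(z):
    """Same kernel with the 1/2 removed: e^(-z^2)."""
    return math.exp(-z * z)

# Away from the center, the curve without the 1/2 drops faster.
for z in (0.0, 0.5, 1.0, 2.0):
    print(z, with_half(z), without_half(z))

# Distance from the center at which each curve falls to half its peak.
hwhm_with = math.sqrt(2 * math.log(2))
hwhm_without = math.sqrt(math.log(2))
print(hwhm_with, hwhm_without, hwhm_with / hwhm_without)
```

The ratio of the two half-widths is the square root of 2, so dropping the 1/2 squeezes the curve horizontally by that factor.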

After many trials, I integrated the formula in an attempt to find the reason behind this nomenclature:


Integrating the formula over the entire real line yields the square root of 2π, or 2.506628275, and integrating the same formula from 0 to 1 yields 0.855624392. By symmetry, the area between -1 and 1 is twice that, or 1.711248784, which means 68.2689% of the total area under the curve lies between -1=-σ and 1=σ (the blue area between the two green bars). The 68.27% figure matches the well-known share of a normal distribution that falls within one standard deviation of the mean, but what's so special about this? According to Wikipedia, "authors differ on which normal distribution should be called the standard one." This open-ended debate made me question how much weight one should put on human intuition (e.g. elegance) vs. data (e.g. thermal fluctuations) in defining concepts such as the standard normal distribution.
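The integrals quoted above can be checked numerically. This is a rough sketch, assuming "the formula" is the exponential term with σ = 1, and using a hand-rolled Simpson's rule so that no third-party library is needed. The helper names and the ±10 cutoff for the tails are my own choices.

```python
import math

def kernel(z):
    """Unnormalized bell curve e^(-z^2 / 2), i.e. sigma assumed to be 1."""
    return math.exp(-0.5 * z * z)

def simpson(f, a, b, n=10_000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# The tails beyond |z| = 10 contribute a negligible amount.
total_area = simpson(kernel, -10.0, 10.0)
half_band = simpson(kernel, 0.0, 1.0)

# By symmetry, the area on [-1, 1] is twice the area on [0, 1].
share_within_one_sigma = 2 * half_band / total_area

print(total_area, math.sqrt(2 * math.pi))
print(half_band)
print(share_within_one_sigma)
```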

The next step is raising e to the power of the result calculated above. In addition to the numerical validation of e, here is a good explanation of Euler's number.

The last factor, 1/(σ√(2π)), ensures that the total area under the curve is equal to one (i.e. the integral is 1).
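As a final check, here is a sketch that assembles the pieces into the full density and confirms that the area comes out to one. The values of mu and sigma are arbitrary illustrations, and the midpoint rule is just one convenient way to approximate the integral. Python's built-in statistics.NormalDist is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density built from the pieces discussed above."""
    z = (x - mu) / sigma                        # standardize
    bell = math.exp(-0.5 * z * z)               # e^(-z^2 / 2)
    scale = 1.0 / (sigma * math.sqrt(2 * math.pi))  # normalizing factor
    return scale * bell

def midpoint_area(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 3.0, 2.0  # arbitrary illustrative parameters

# Ten standard deviations on each side captures essentially all of the area.
area = midpoint_area(lambda x: normal_pdf(x, mu, sigma),
                     mu - 10 * sigma, mu + 10 * sigma)
print(area)

# Cross-check one point against the standard library.
print(normal_pdf(4.0, mu, sigma), NormalDist(mu, sigma).pdf(4.0))
```

Changing mu or sigma moves or stretches the curve, but the scaling factor keeps the area at one.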