Geometric statistics

February 28th, 2022

One of my pet projects for years has been to explain the average and standard deviation with basic linear algebra. Once we view our data as living in a vector space, for example by treating observations as functions on a finite set of points, it’s easy to motivate both by asking basic questions about those functions: Are they constant? If not, how far are they from being constant? The answers are given, respectively, by the average (the projection onto the line of constant functions) and the standard deviation (the distance from a function to that line, up to a normalizing factor that depends on the number of points).
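Here is a minimal sketch of that picture in Python. It assumes the observations are a plain list of numbers and that the inner product is the usual dot product. The function names and the sample data are made up for illustration. The rescaling by 1/sqrt(n) is my choice: it recovers the population standard deviation, and dividing by n - 1 instead would give the sample version.

```python
import math
import statistics


def project_onto_constants(x):
    """Orthogonal projection of x onto the line spanned by (1, ..., 1)."""
    ones = [1.0] * len(x)
    # The projection coefficient <x, 1> / <1, 1> is exactly the average of x.
    coeff = sum(xi * oi for xi, oi in zip(x, ones)) / sum(oi * oi for oi in ones)
    return [coeff * oi for oi in ones]


def distance_to_constants(x):
    """Distance from x to the line of constants, rescaled by 1/sqrt(n)."""
    p = project_onto_constants(x)
    squared_distance = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
    return math.sqrt(squared_distance / len(x))


# Hypothetical observations, chosen so the numbers come out round.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = project_onto_constants(data)[0]  # every coordinate of the projection
sd = distance_to_constants(data)

print(mean, statistics.mean(data))   # both should agree
print(sd, statistics.pstdev(data))   # both should agree
```

The comparison with the standard library's statistics.mean and statistics.pstdev is only a sanity check that the geometric definitions agree with the usual formulas.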

So I was very happy when a Hacker News comment pointed me to

Saville, D. J. and Wood, G. R.

A method for teaching statistics using N-dimensional geometry

The American Statistician 40:3 (1986), 205–214

which explains how to do basic statistics using geometric ideas from finite-dimensional linear algebra. The paper’s authors also say they’re writing a book to further develop these ideas, which I’d be very interested in reading.