A New View of Statistics

© 1997 Will G Hopkins


Generalizing to a Population:


I was confused by the wide variety of models until I found a simple way of categorizing them. The trick is to think about the variables in the model as either numeric or nominal, and as either dependent or independent.

You already know about numeric and nominal variables: numeric variables have numbers as values, and nominal variables have names or levels. Either type can be dependent or independent. The variable you're most interested in is known as the dependent variable, because it might be dependent on, or affected by, something else that you've measured, which is therefore an independent variable. For example, people's weight (dependent variable) might depend on their height (independent variable). Independent is not a very good term, because you can have several independent variables, and they may not be independent of each other. So a better term for independent variables is effects, because they have an effect on the dependent variable. They're also known as predictor or explanatory variables, because they are used to predict or explain the dependent variable. A nominal predictor variable is also known as a grouping variable, because it divides the data up into groups.
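To make the distinction concrete, here is a minimal Python sketch. The records, the variable name sex, and the helper functions variable_type and group_by are all made up for illustration; they are not from this text. The sketch labels each variable as numeric or nominal by looking at its values, then uses a nominal variable as a grouping variable to split the dependent variable into groups.

```python
from collections import defaultdict

# Hypothetical data: height (cm) and weight (kg) are numeric, sex is nominal.
people = [
    {"sex": "female", "height": 165.0, "weight": 60.0},
    {"sex": "male", "height": 180.0, "weight": 82.0},
    {"sex": "female", "height": 171.0, "weight": 66.0},
    {"sex": "male", "height": 176.0, "weight": 74.0},
]


def variable_type(values):
    """Return 'numeric' if every value is a number, otherwise 'nominal'."""
    all_numbers = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "numeric" if all_numbers else "nominal"


def group_by(records, grouping_variable, dependent):
    """Split the dependent variable's values by the levels of a nominal variable."""
    groups = defaultdict(list)
    for record in records:
        groups[record[grouping_variable]].append(record[dependent])
    return dict(groups)


# Classify every variable in the (assumed) data set.
types = {name: variable_type([p[name] for p in people]) for name in people[0]}

# Use the nominal variable as a grouping variable for weight.
weight_by_sex = group_by(people, "sex", "weight")

print(types)
print(weight_by_sex)
```

One caveat about this sketch: it judges a variable only by how its values are stored, so a nominal variable coded with numbers (say 1 and 2 for two groups) would be wrongly labeled numeric. In practice you decide the type from what the variable means, not from its coding.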

Now let's talk about the relationships between variables. I'm going to use a shorthand to represent the relationship between a dependent and an independent variable. For example, if I want to show that height affects weight, I will write:

weight <= height

The "<=" is a backwards-pointing arrow, by the way! Read the expression as "weight is affected by height".

Sure, it would be more sensible to write height => weight, and read it as "height affects weight", but statisticians are used to seeing the dependent variable on the left. It goes back to writing things like Y = X + 1. We don't write X + 1 = Y (although we could). So in general, let's write:

dependent <= independent

Now, if we just substitute nominal and numeric variables for the dependent and independent variables, we'll end up with four different simple models. Here they are, with their names:

numeric <= numeric: Linear Regression

numeric <= nominal: T Test and One-Way ANOVA

nominal <= nominal: Contingency Table

nominal <= numeric: Categorical Modeling

I detail each model on the next four pages.
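The four-way categorization above amounts to a lookup on the pair (type of dependent variable, type of independent variable). The following Python sketch shows that lookup; the names SIMPLE_MODELS and simple_model are hypothetical, chosen only for this illustration, and the model names are the ones listed above.

```python
# Keys are (dependent type, independent type), matching "dependent <= independent".
SIMPLE_MODELS = {
    ("numeric", "numeric"): "Linear Regression",
    ("numeric", "nominal"): "T Test and One-Way ANOVA",
    ("nominal", "nominal"): "Contingency Table",
    ("nominal", "numeric"): "Categorical Modeling",
}


def simple_model(dependent_type, independent_type):
    """Name the simple model for one dependent and one independent variable."""
    key = (dependent_type, independent_type)
    if key not in SIMPLE_MODELS:
        raise ValueError(
            "each type must be 'numeric' or 'nominal', got: %r" % (key,)
        )
    return SIMPLE_MODELS[key]


# weight <= height: both variables are numeric.
print(simple_model("numeric", "numeric"))
```

Note that the order of the arguments matters, just as it does in the shorthand: swapping the dependent and independent types changes numeric <= nominal into nominal <= numeric, which is a different model.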

Last updated 25 April 97