Association in 2-Way Tables

Before proceeding, you may wish to read about 2-way tables for categorical data.


Two variables are said to be associated when the distribution of one variable changes as the value of the other variable changes.

In the case of two-way tables, this amounts to saying, "As one variable changes values, the percentages for the other variable change." For example, consider the following data cross-classifying marital status and job grade (need more information?).

                      Marital Status
Job Grade   Single   Married   Divorced   Widowed   Total
1               58       874         15         8     955
2              222      3927         70        20    4239
3               50      2396         34        10    2490
4                7       533          7         4     551
Total          337      7730        126        42    8235

Here's the appropriate graph.

The two variables (marital status and job grade) are said to be associated because as we change from single to married, the percentage of people in grade 3 roughly doubles (from about 15% to about 31%). There are, of course, many other such relationships in these data.
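The comparison above rests on the percentages within each marital status, not on the raw counts. As a minimal sketch (the names `counts` and `column_percentages` are illustrative, not from the original page), the job grade distribution within each marital status can be computed from the table like this:

```python
# Counts from the table above: for each marital status,
# the number of people in job grades 1, 2, 3 and 4.
counts = {
    "Single":   [58, 222, 50, 7],
    "Married":  [874, 3927, 2396, 533],
    "Divorced": [15, 70, 34, 7],
    "Widowed":  [8, 20, 10, 4],
}


def column_percentages(table):
    """Return the job grade distribution (in percent) within each marital status."""
    result = {}
    for status, column in table.items():
        total = sum(column)
        result[status] = [100.0 * n / total for n in column]
    return result


percents = column_percentages(counts)
for status, pcts in percents.items():
    print(status, [round(p, 1) for p in pcts])
```

Comparing the third entry of each list shows the grade 3 difference between single and married people; any two lists that differ noticeably point to the same kind of relationship.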

Having established an association, it's always good to try to briefly describe the nature of the association. That is -- what changes occur as we move among the various marital statuses? Here we see primarily that:

These are only the broadest distinctions; try to generalize as much as possible.

Here's what the graph would look like if the two variables were not associated:

You can see that it would not matter what a person's marital status is: the job grade would have the same distribution or pattern of variation.
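To make "no association" concrete: if marital status did not matter, every marital status would show the overall job grade distribution. The sketch below (variable names are illustrative assumptions) computes that overall distribution and the counts each cell would hold if the table's totals stayed the same but the two variables were unrelated.

```python
# Counts from the table above: for each marital status,
# the number of people in job grades 1, 2, 3 and 4.
counts = {
    "Single":   [58, 222, 50, 7],
    "Married":  [874, 3927, 2396, 533],
    "Divorced": [15, 70, 34, 7],
    "Widowed":  [8, 20, 10, 4],
}

statuses = list(counts)
n_grades = 4

# Total number of people in each job grade, and overall.
grade_totals = [sum(counts[s][g] for s in statuses) for g in range(n_grades)]
grand_total = sum(grade_totals)

# Overall job grade distribution, in percent.
overall_pct = [100.0 * t / grand_total for t in grade_totals]

# Counts we would see with no association: each marital status total
# is split across job grades according to the overall distribution.
expected = {
    s: [sum(counts[s]) * t / grand_total for t in grade_totals]
    for s in statuses
}

print("Overall job grade percentages:", [round(p, 1) for p in overall_pct])
for s in statuses:
    print(s, [round(e, 1) for e in expected[s]])
```

With these counts, the percentages within every marital status would be identical, which is the flat pattern described above.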

Of course, in some situations, the differences we observe are slight. In such cases we reserve judgement. Small differences may simply be the result of chance (sampling error). When we examine the plots for association we're looking for large differences. Beyond this, inferential tools must take over. We make use of (and you can read more on) tests of association for 2-way tables.
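The page does not say which test it has in mind. One common choice is the Pearson chi-square test, and the sketch below assumes that test purely for illustration. It computes the statistic by hand for the table above and compares it with 16.919, the standard 5% critical value for 9 degrees of freedom. A couple of the expected counts in the Widowed column fall below 5, so the chi-square approximation is rough for those cells.

```python
# Observed counts: rows are job grades 1-4; columns are
# Single, Married, Divorced, Widowed.
observed = [
    [58, 874, 15, 8],
    [222, 3927, 70, 20],
    [50, 2396, 34, 10],
    [7, 533, 7, 4],
]


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count expected in this cell if the variables were not associated.
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


statistic, df = chi_square_statistic(observed)

# Upper 5% critical value of the chi-square distribution with 9 df.
CRITICAL_VALUE_5_PERCENT = 16.919

print("chi-square statistic:", round(statistic, 2), "with", df, "degrees of freedom")
print("exceeds 5% critical value:", statistic > CRITICAL_VALUE_5_PERCENT)
```

A statistic above the critical value suggests the differences are larger than chance alone would usually produce; a statistic below it is a case where we would reserve judgement.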