A study of the relationship between men's marital status and the level of their jobs used data on all 8235 male managers and professionals employed by a large manufacturing firm. Each man's job has a grade set by the company that reflects the value of that particular job to the company. Grade 1 contains jobs in the lowest quarter of job grades, and grade 4 contains those in the highest quarter.
In this study the subjects are the 8235 men at the firm. Two variables, each categorical, are measured on each subject: his marital status and his job grade. We have a single group of 8235 men, each classified in two ways, by marital status and by job grade. Note that the number of men in each marital status is not fixed in advance, nor is the number of men in each job grade; these are known only after we have the data.
Job Grade by Marital Status

| Job Grade | Single | Married | Divorced | Widowed | Total |
|---|---|---|---|---|---|
| 1 | 58 | 874 | 15 | 8 | 955 |
| 2 | 222 | 3927 | 70 | 20 | 4239 |
| 3 | 50 | 2396 | 34 | 10 | 2490 |
| 4 | 7 | 533 | 7 | 4 | 551 |
| Total | 337 | 7730 | 126 | 42 | 8235 |
Here's the marginal distribution for Job Grade alone.
| Job Grade | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Percent | 11.6% | 51.5% | 30.2% | 6.7% |
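Here's a minimal Python sketch of where those percentages come from: each job grade's row total divided by the grand total. The variable names (`observed`, `row_totals`, `marginal_pct`) are just illustrative choices, not part of the original analysis.

```python
# Observed counts: rows are job grades 1-4; columns are
# Single, Married, Divorced, Widowed.
observed = [
    [58, 874, 15, 8],
    [222, 3927, 70, 20],
    [50, 2396, 34, 10],
    [7, 533, 7, 4],
]

# Row totals give the number of men in each job grade.
row_totals = [sum(row) for row in observed]
n = sum(row_totals)  # grand total of men in the study

# Marginal distribution of job grade, as percentages rounded to one decimal.
marginal_pct = [round(100 * total / n, 1) for total in row_totals]

for grade, (total, pct) in enumerate(zip(row_totals, marginal_pct), start=1):
    print(f"Grade {grade}: {total} men, {pct}%")
```

The marginal distribution ignores marital status entirely; it only describes how the 8235 men are spread across the four grades.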
Here's the right plot for these data. There appear to be notable differences in the distribution of job grade from one marital status to another.
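Assuming the plot compares the distribution of job grade within each marital status (the usual choice for a two-way table like this one), the percentages behind it can be recomputed from the counts. This is a sketch with illustrative names, not the code that produced the original plot.

```python
statuses = ["Single", "Married", "Divorced", "Widowed"]

# Observed counts: rows are job grades 1-4, columns follow `statuses`.
observed = [
    [58, 874, 15, 8],
    [222, 3927, 70, 20],
    [50, 2396, 34, 10],
    [7, 533, 7, 4],
]

# Column totals give the number of men in each marital status.
col_totals = [sum(col) for col in zip(*observed)]

# Conditional distribution of job grade within each marital status:
# each count divided by its column total, as a percentage.
conditional_pct = {
    status: [round(100 * observed[g][j] / col_totals[j], 1) for g in range(4)]
    for j, status in enumerate(statuses)
}

for status, pcts in conditional_pct.items():
    print(f"{status:9s} grades 1-4: {pcts}")
```

If marital status and job grade were unrelated, the four lists would be roughly the same. Comparing them side by side is the numerical version of reading the plot.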

These employees are not necessarily a random sample from any larger population. In fact, the 8235 employees make up the entire population. Still, we want to decide whether the relationship between marital status and job grade is statistically significant, in the sense that it is too strong to happen just by chance if job grades were handed out at random to men of all marital statuses. That meaning makes sense even though we have data on the entire population.
We use the χ² (chi-square) test to make our inference. Our hypotheses are:
H0: marital status and job grade are not associated
HA: marital status and job grade are associated
Here's Minitab output.
Chi-Square Test

Expected counts are printed below observed counts

             C1       C2       C3       C4    Total
    1        58      874       15        8      955
          39.08   896.44    14.61     4.87

    2       222     3927       70       20     4239
         173.47  3979.05    64.86    21.62

    3        50     2396       34       10     2490
         101.90  2337.30    38.10    12.70

    4         7      533        7        4      551
          22.55   517.21     8.43     2.81

Total       337     7730      126       42     8235

Chi-Sq =  9.158 +  0.562 +  0.010 +  2.011 +
         13.575 +  0.681 +  0.407 +  0.121 +
         26.432 +  1.474 +  0.441 +  0.574 +
         10.722 +  0.482 +  0.243 +  0.504 = 67.397
DF = 9, P-Value = 0.000
2 cells with expected counts less than 5.0
Examining the table of expected counts, we see that they represent what would have happened had a) the marginal totals remained the same and b) there been no association at all between the two variables. Here's the table of expected counts:
Table of Expected Counts

| Job Grade | Single | Married | Divorced | Widowed | Total |
|---|---|---|---|---|---|
| Grade 1 | 39.08 | 896.44 | 14.61 | 4.87 | 955 |
| Grade 2 | 173.47 | 3979.05 | 64.86 | 21.62 | 4239 |
| Grade 3 | 101.90 | 2337.30 | 38.10 | 12.70 | 2490 |
| Grade 4 | 22.55 | 517.21 | 8.43 | 2.81 | 551 |
| Total | 337 | 7730 | 126 | 42 | 8235 |
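Each expected count is (row total × column total) / grand total, which is exactly what "same margins, no association" means. Here's a short Python sketch of that calculation; the names are illustrative, and the results should agree with the Minitab figures up to rounding.

```python
# Observed counts: rows are job grades 1-4; columns are
# Single, Married, Divorced, Widowed.
observed = [
    [58, 874, 15, 8],
    [222, 3927, 70, 20],
    [50, 2396, 34, 10],
    [7, 533, 7, 4],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under no association: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for grade, row in enumerate(expected, start=1):
    print(f"Grade {grade}:", [round(e, 2) for e in row])

# Count the cells whose expected count falls below 5, the situation
# the Minitab output warns about.
small_cells = sum(1 for row in expected for e in row if e < 5)
print("Cells with expected count < 5:", small_cells)
```

Note that the expected counts add up to the same row and column totals as the observed counts; only the way men are spread inside the table changes.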
Here's what the plot would look like if the expected counts were the data, which is the case of exactly no association. This chart reflects what we would see if the marginal distribution of job grade (see it above: 11.6%, 51.5%, 30.2% and 6.7% in grades 1-4 respectively) held within each marital status.

The P-value is 0.000+ (that is, less than 0.0005), so we reject H0 and conclude that marital status and job grade are associated. Examining the components of the test statistic, we note that the largest ones all come from the "single" column. It's clear that the biggest differences are between single men and everyone else. Single men are much more likely than everyone else to be in grade 1 or 2 and, consequently, much less likely to be in grades 3 and 4. The only other substantial difference is for the widowers, who are much more likely to be in job grade 1 than are the others. (Be careful: the plots put job grade 1 at the bottom, while all the tables have job grade 1 at the top!)
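To see how the test statistic, its components, and the tiny P-value fit together, here's a Python sketch using only the standard library. The helper `chi2_sf_odd_df` is a hypothetical name introduced here; it uses a standard closed-form expression for the chi-square upper tail that is valid only when the degrees of freedom are odd (here DF = 9). It is not how Minitab computes its P-value.

```python
import math

statuses = ["Single", "Married", "Divorced", "Widowed"]
observed = [
    [58, 874, 15, 8],
    [222, 3927, 70, 20],
    [50, 2396, 34, 10],
    [7, 533, 7, 4],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Each cell contributes (observed - expected)^2 / expected.
components = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi_sq = sum(sum(row) for row in components)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)(cols - 1)


def chi2_sf_odd_df(x, df):
    """Upper-tail chi-square probability; closed form for odd df only."""
    if df % 2 == 0:
        raise ValueError("this closed form applies to odd df only")
    chi = math.sqrt(x)
    normal_tail = 0.5 * math.erfc(chi / math.sqrt(2))       # P(Z > chi)
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # density at chi
    series, denominator = 0.0, 1.0
    for r in range(1, (df - 1) // 2 + 1):
        denominator *= 2 * r - 1          # 1, 1*3, 1*3*5, ...
        series += chi ** (2 * r - 1) / denominator
    return 2 * normal_tail + 2 * normal_pdf * series


p_value = chi2_sf_odd_df(chi_sq, df)

# Locate the single largest component: (value, job grade, marital status).
largest = max(
    (c, g + 1, statuses[j])
    for g, row in enumerate(components)
    for j, c in enumerate(row)
)

print(f"Chi-Sq = {chi_sq:.3f}, DF = {df}, P-value = {p_value:.2e}")
print(f"Largest component: grade {largest[1]}, {largest[2]}")
```

The P-value is far below 0.0005, which is why Minitab prints it as 0.000, and the largest component sits in the "single" column, in line with the discussion above.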