An Introduction to 2-Way Tables

2-way tables are also known as 2-way cross-classification tables and 2-way contingency tables.


The data

For each unit/subject, data is collected on each of two categorical variables.

Example 1

Cocaine addicts need the drug to feel pleasure. Perhaps giving them a medication that fights depression will help them stay off cocaine. A three-year study compared an antidepressant called desipramine with lithium (a standard treatment for cocaine addiction) and a placebo. The subjects were 72 chronic users of cocaine who wanted to break their drug habit. Twenty-four of the subjects were randomly assigned to each treatment. After treatment we record whether or not the subject relapses into cocaine use.

In this study the subjects are the 72 cocaine addicts. Two variables – each categorical – are measured on each subject: the treatment and whether or not the subject relapsed. We have three groups of 24 people, each classified by whether or not there is a relapse. The number of people in each treatment group is fixed in advance; the number of relapses is only known after we have the data.

Example 2

A study of the relationship between men's marital status and the level of their jobs used data on all 8235 male managers and professionals employed by a large manufacturing firm. Each man’s job has a grade set by the company that reflects the value of that particular job to the company. Grade 1 contains jobs in the lowest quarter of job grades, and grade 4 contains those in the highest quarter.

In this study the subjects are the 8235 males at the firm. Two variables – each categorical – are measured on each subject: his marital status and his job grade. We have a single group of 8235 men, each classified in two ways, by marital status and job grade. Note that the number of men in each marital status is not fixed in advance; nor is the number of men in each job grade – these are only known after we have the data.


Summarizing the data

Quantitative summary

The data is summarized in a "two-way (cross-classification) table."

Example 1

Treatment     Relapse   No relapse   Total
Desipramine        10           14      24
Lithium            18            6      24
Placebo            20            4      24
Total              48           24      72
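A two-way table like this one is just a tally of per-subject records. Here is a minimal sketch in Python, assuming each subject is stored as a (treatment, outcome) pair. The study's raw data are not given, so the records below are reconstructed from the cell counts, and the name `cross_tabulate` is illustrative only.

```python
from collections import Counter

# Per-subject records reconstructed from the table's cell counts
# (an assumption: only the counts are published, not the raw data).
cell_counts = {
    ("Desipramine", "Relapse"): 10, ("Desipramine", "No relapse"): 14,
    ("Lithium", "Relapse"): 18, ("Lithium", "No relapse"): 6,
    ("Placebo", "Relapse"): 20, ("Placebo", "No relapse"): 4,
}
records = [pair for pair, n in cell_counts.items() for _ in range(n)]


def cross_tabulate(records):
    """Count how many subjects fall in each (row, column) combination."""
    return Counter(records)


table = cross_tabulate(records)
treatments = ["Desipramine", "Lithium", "Placebo"]
outcomes = ["Relapse", "No relapse"]

# Print each treatment's cell counts followed by its row total.
for t in treatments:
    row = [table[(t, o)] for o in outcomes]
    print(t, row, sum(row))
```

Each subject contributes to exactly one cell, so the cell counts add up to the number of subjects.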

We see that one-third (33.3%) of the subjects were allocated into each treatment (this was by design); this is the marginal distribution for the treatment variable. Notice that two-thirds (66.7%) of the subjects relapsed and one-third (33.3%) did not; this is the marginal distribution for the "whether relapsed or not" variable. This distribution is not fixed in advance.

Example 2

 

                      Marital Status
Job Grade   Single   Married   Divorced   Widowed   Total
1               58       874         15         8     955
2              222      3927         70        20    4239
3               50      2396         34        10    2490
4                7       533          7         4     551
Total          337      7730        126        42    8235

Here the marginal distributions are best tabulated. First, by marital status:

            Marital Status
Single   Married   Divorced   Widowed
  4.1%     93.9%       1.5%      0.5%

This, of course, is not surprising, nor is it informative.

Second, by Job Grade:

         Job Grade
    1       2       3      4
11.6%   51.5%   30.2%   6.7%
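The marginal distributions can be recomputed directly from the cell counts, which also serves as a check on the printed totals. The following is a rough sketch in Python; the function name `marginal_distributions` and the list-of-lists layout are assumptions made for illustration.

```python
# Cell counts from Example 2: rows are job grades 1-4,
# columns are marital statuses.
statuses = ["Single", "Married", "Divorced", "Widowed"]
grades = [1, 2, 3, 4]
counts = [
    [58, 874, 15, 8],
    [222, 3927, 70, 20],
    [50, 2396, 34, 10],
    [7, 533, 7, 4],
]


def marginal_distributions(counts):
    """Return (row_shares, column_shares) as fractions of the grand total."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(row_totals)
    row_shares = [t / grand_total for t in row_totals]
    col_shares = [t / grand_total for t in col_totals]
    return row_shares, col_shares


grade_shares, status_shares = marginal_distributions(counts)

# Print each marginal distribution as percentages to one decimal place.
for status, share in zip(statuses, status_shares):
    print(f"{status:<10}{share:6.1%}")
for grade, share in zip(grades, grade_shares):
    print(f"Grade {grade}   {share:6.1%}")
```

Because both sets of shares are fractions of the same grand total, each should sum to 100% apart from rounding.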

Graphical summary

The marginal distributions should not be displayed in a graph -- unless, of course, your boss asked for a pie chart or bar chart (read why they stink!). A simple table is sufficient. To display both variables at the same time use a segmented or stacked bar chart. Obtain this chart using a spreadsheet program. Here are instructions for Excel.

  1. Enter the two-way table into Excel. Include headings for each row and column.
  2. Select the table: not including the marginal totals, but including the row and column designations. In the tables displayed above, you see the appropriate selections in RED. (If not viewing in color look for the lighter shaded text.)
  3. From the INSERT menu select CHART. You want a COLUMN chart -- and you want the one that produces the chart shown below.
  4. In setting up your chart make sure you choose Series in rows/columns correctly. Levels of the explanatory variable must be listed along the horizontal axis. Change the rows/columns button until you have it right.
  5. Give your chart a title and name the variables on each of the axes. A good graph tells the story without any help.
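If no spreadsheet is at hand, the idea behind a segmented bar chart can be sketched in plain text. This is only an illustration, not a replacement for the Excel chart: it draws one horizontal bar per level of the explanatory variable (treatment, in Example 1) and splits each bar in proportion to the counts. The function name `segmented_bar` and the legend symbols are assumptions.

```python
# Counts from Example 1: treatment is the explanatory variable.
counts = {
    "Desipramine": {"Relapse": 10, "No relapse": 14},
    "Lithium": {"Relapse": 18, "No relapse": 6},
    "Placebo": {"Relapse": 20, "No relapse": 4},
}
symbols = {"Relapse": "#", "No relapse": "."}  # hypothetical legend


def segmented_bar(row, symbols, width=24):
    """Render one bar whose segments are proportional to the counts in row.

    With other data, rounding may make a bar one character longer or
    shorter than width; here each group has 24 subjects, so it is exact.
    """
    total = sum(row.values())
    bar = ""
    for category, n in row.items():
        bar += symbols[category] * round(width * n / total)
    return bar


for treatment, row in counts.items():
    print(f"{treatment:<12}|{segmented_bar(row, symbols)}|")
```

Every bar has the same length, so the eye compares the share of relapses across treatments rather than the group sizes.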

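Example 1

[Figure: segmented bar chart of relapse outcome by treatment]

Example 2

[Figure: segmented bar chart of the marital status and job grade data]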

Notes on graphs

Your graph should tell the story without any additional information. Assume only that you are addressing an intelligent person who understands how to read the chart (if not, you can easily teach this) and that your reader has some understanding of the data that's been collected (in a written report, take particular care to define your variables clearly). Be particularly careful with colors and the legend. If you print in black and white, color is no good. Contrasting "patterns" (often "cross-hatched" in different ways) can also be dangerously difficult to tell apart. Make certain, then, that the graph is still readable when you print it. If you print in color, or just project from a computer, then Excel's default colors are usually OK. Make the data dominate the picture. If there's a story, the graph should tell it.

This type of data is two-dimensional. Avoid 3-D graphs.

