
Lawyers' Salaries
Does this law firm discriminate on the
basis of race?
You may view/copy the data.
Salaries of lawyers at a large firm. Information on
three variables is included: salary (quantitative) in US
$, tenure--or length of service--(quantitative) in
months, and race (categorical). (Should you review the quantitative/categorical
distinction?)
Given two quantitative (or numerical; the terms mean the
same thing) variables per case, the appropriate plotting
technique is the scatterplot. When a third variable that
is categorical is also present, adjust the scatterplot by
using a different symbol for each level of the
categorical variable.
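
As a rough illustration of the symbol-per-level idea, here
is a minimal text-only sketch in Python. The records, the
symbol choices, and the helper name ascii_scatter are all
invented for this example; they are not the firm's data.

```python
# Hypothetical (tenure_months, salary_usd, race) records, invented for illustration only.
records = [
    (12, 60000, "black"),
    (36, 75000, "black"),
    (60, 82000, "white"),
    (90, 98000, "black"),
    (150, 140000, "white"),
    (240, 200000, "white"),
    (300, 260000, "white"),
]

# One plotting symbol per level of the categorical variable (an arbitrary choice).
symbols = {"black": "o", "white": "x"}


def ascii_scatter(records, symbols, width=40, height=12):
    """Return a text scatterplot: tenure on x, salary on y, symbol chosen by group."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y, group in records:
        # Scale each value onto the character grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = symbols[group]  # row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


plot = ascii_scatter(records, symbols)
print(plot)
print("o = black, x = white; x-axis: tenure (months), y-axis: salary (US $)")
```

A plotting library would normally do the drawing; the point
here is only that each case is placed by its two
quantitative values and marked by its categorical value.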

Notice that if the salary data are aggregated over all
the different values of tenure (that is, if we simply
ignore length of service), we get an extraordinarily
misleading picture. The firm does pay blacks less than
whites, but this is because the blacks it employs have
generally been with the firm for less time than the
whites have. Averaging over the various tenures produces
an association between race and salary that is not
present when tenure is accounted for.
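
The effect of aggregating over tenure can be sketched with
a small Python example. The records below are invented for
illustration, and the 120-month cutoff between "short" and
"long" tenure is an arbitrary assumption; neither comes
from the firm's data.

```python
from statistics import mean

# Hypothetical (tenure_months, salary_usd, race) records, invented for illustration only.
records = [
    (24, 70000, "black"), (30, 72000, "white"),
    (48, 80000, "black"), (54, 81000, "black"),
    (60, 84000, "white"),
    (200, 190000, "white"), (220, 205000, "black"),
    (230, 206000, "white"), (260, 222000, "white"),
]


def mean_salary_by_race(rows):
    """Average salary for each race among the given rows."""
    groups = {}
    for _, salary, race in rows:
        groups.setdefault(race, []).append(salary)
    return {race: mean(salaries) for race, salaries in groups.items()}


# Aggregated view: tenure is ignored entirely.
overall = mean_salary_by_race(records)

# Stratified view: races are compared only within a tenure band.
CUTOFF_MONTHS = 120  # arbitrary split chosen for this sketch
short = mean_salary_by_race([r for r in records if r[0] < CUTOFF_MONTHS])
long_ = mean_salary_by_race([r for r in records if r[0] >= CUTOFF_MONTHS])

print("overall gap:     ", overall["white"] - overall["black"])
print("short-tenure gap:", short["white"] - short["black"])
print("long-tenure gap: ", long_["white"] - long_["black"])
```

In this made-up data the overall gap is large only because
most of the long-tenure cases are white; within each tenure
band the two group means are nearly equal.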
Some questions and answers. The answers are in blue.
- How would you determine (from the graphs above
only) the number of cases observed in this study?
Count the number of points
plotted in the scatterplot. (There's no way to
tell from a boxplot.)
- How many variables are measured on each case?
Give a name to each variable, then identify each
as numerical or categorical.
Three variables per case:
Race (Categorical), Salary (Quantitative) &
Tenure (Quantitative).
- How would you expect the mean to compare to the
median for each race?
Because of the extreme, or
outlying, values to the right (high salaries),
for whites we would expect the mean to far exceed
the median. The boxplot for blacks indicates
relative symmetry, and therefore we'd expect the
mean and median to be roughly the same. The figures
below bear this out:
| Race   | Mean salary (US $) | Median salary (US $) |
| Blacks | 98049              | 97434                |
| Whites | 186584             | 130371               |
- Do the two distributions (salary for each race)
have equal standard deviation? If not, which race
has a larger standard deviation? (Equal spread
among groups is an assumption that many
statistical inference techniques require the data
to satisfy.)
The standard deviation is
much larger for whites. The boxplots indicate
much greater spread in the distribution for
whites. (In fact, the standard deviations are
31923 for blacks and 157967 for whites.)
- Does it appear that race plays any role in
determining salary?
No. Once tenure is taken into account, there is
no evidence of this. However, clearly one or both of the
following factors do contribute to discrepancies
between the races:
1. In the past the company did not hire blacks.
2. In the past social discrimination led to few
blacks becoming lawyers.
- Is there equal variability in salaries at
differing experience levels?
No. The spread in salaries
gets larger as tenure increases.
- The company often hires experienced lawyers from
other firms. These new hires are paid on the
basis of their tenure with the previous firm
(among other things). Predict the salary of a new
hire who has 5 years of experience.
About $80,000.
- Predict the salary of a new hire who has 20 years
of experience. According to your predictions, how
much more will this person earn than the new hire
with 5 years of experience? (Note that there's a
difference of 15 years.)
About $200,000. This person will
earn about $120,000 more.
- Predict the salary of a new hire who has 35 years
of experience. According to your predictions, how
much more will this person earn than the new hire
with 20 years of experience? (Note that there's
again a difference of 15 years.)
About $350,000. This person will
earn about $150,000 more.
If you examine the predicted salary differences in the
previous two parts, you will notice that the rate of
change of salary increases as tenure increases. This is
a feature of certain types of curves called exponential
curves, which have the shape suggested by the scatterplot
above. (The value over time of money invested in a bank
and the size over time of a population of organisms are
two other phenomena that generally fit exponential
curves.)
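
The arithmetic behind the two 15-year comparisons, and the
way an exponential curve produces growing increments, can
be sketched in Python. The three predicted salaries are the
ones quoted in the answers above; the constants START and
RATE and the function exp_salary are hypothetical values
chosen for illustration, not a fit to the firm's data.

```python
import math

# Eyeballed predictions from the answers above: years of experience -> salary (US $).
predicted = {5: 80_000, 20: 200_000, 35: 350_000}

years = sorted(predicted)
# Increase in predicted salary over each successive 15-year step.
steps = [predicted[b] - predicted[a] for a, b in zip(years, years[1:])]
print(steps)  # [120000, 150000]

# Hypothetical exponential curve: salary = START * exp(RATE * years).
# START and RATE are made-up values, not estimated from any data.
START, RATE = 60_000, 0.05


def exp_salary(t):
    """Salary after t years on the hypothetical exponential curve."""
    return START * math.exp(RATE * t)


pairs = list(zip(years, years[1:]))
exp_steps = [exp_salary(b) - exp_salary(a) for a, b in pairs]   # dollar increases
exp_ratios = [exp_salary(b) / exp_salary(a) for a, b in pairs]  # growth factors

print([round(s) for s in exp_steps])
print([round(r, 3) for r in exp_ratios])
```

On an exponential curve, equal steps in tenure multiply
salary by the same factor, so the dollar increase grows
from one step to the next. The eyeballed predictions show
the growing increments but need not lie exactly on any one
exponential curve.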