Prozac for Reducing Relapses in Anorexia Nervosa?
Before we analyze the data you should read about the study.
Note that the language used below very closely agrees with the language of the article.
By and large the question is: Do patients on prozac have a greater likelihood of maintaining weight and reducing depression or obsessions about losing weight than do patients on placebo? We have 35 subjects, each randomly allocated to one of two treatment groups (placebo, prozac). Before the study was conducted the experimenters identified important parameters and relevant hypotheses. Let p1 be the proportion of persons on placebo who maintain weight and reduce depression or obsessions about losing weight; let p2 be the proportion of persons on prozac who maintain weight and reduce depression or obsessions about losing weight. The researchers wish to determine how much evidence there is in favor of the claim p1<p2. This statement becomes the alternative (research) hypothesis. The null hypothesis (hypothesis of no difference) has the two proportions equal. Then, in formal terms,
H0: p1 = p2
Ha: p1 < p2
We estimate p1 and p2 with the respective observed proportions: 3/16 = 0.1875 for placebo and 10/19 = 0.5263 (to four places) for prozac.
Where'd these numbers come from? Note the unequal sample sizes. This is most likely due to subjects dropping out of the study.
Your textbook undoubtedly has a section covering "comparing two proportions." The method used in that section (the two-sample Z test) is not appropriate for these data. Reason: the relatively small sample sizes. If you look carefully at the text's treatment of this subject you will find criteria that must be satisfied before using the text's formula. . .one of which is violated by these data. Most texts frame the criterion as "The observed numbers of successes and failures for each group must be at least 5." Here, the observed number of successes in the placebo group is 3--too small.
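To make the check concrete, here is a minimal Python sketch that tabulates the counts and applies the "at least 5" rule of thumb. The counts are reconstructed from figures given in this note (35 subjects, 13 maintainers, 3 of them on placebo, 10 of them among the 19 on prozac); the label "did_not_maintain" and the function name are my own, chosen only for illustration.

```python
# Counts reconstructed from the note: 16 on placebo (3 maintained),
# 19 on prozac (10 maintained). Labels are illustrative assumptions.
counts = {
    "placebo": {"maintained": 3, "did_not_maintain": 13},
    "prozac": {"maintained": 10, "did_not_maintain": 9},
}

MIN_COUNT = 5  # the usual textbook rule of thumb


def z_test_criteria_met(table, min_count=MIN_COUNT):
    """True only if every success and failure count is at least min_count."""
    return all(n >= min_count for group in table.values() for n in group.values())


for name, group in counts.items():
    n = sum(group.values())
    share = group["maintained"] / n
    print(f"{name}: {group['maintained']}/{n} maintained = {share:.4f}")

print("Z test criteria met:", z_test_criteria_met(counts))
```

Only the placebo "maintained" cell falls below 5, but one small cell is enough to make the normal approximation suspect.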
Your textbook's method is an approximation to the method that is used in practice (an approximation that works quite well when the criteria for using it are satisfied). The "real method" is not presented in textbooks because it is somewhat more complicated computationally. However, the idea is similar. The method I describe below is Fisher's exact test.
We note that 13 of the 35 people "maintained." Suppose (playing devil's advocate) that prozac has no effect and that, in fact, these 13 people were destined to maintain regardless of treatment. How likely is it that 10 or more of these 13 would be randomly allocated to the prozac group of 19? This probability is the P-value, or observed significance level (OSL), for the test. My calculations (I suspect they're correct) give P-value = 0.0417 (about 1 in 24). In fact, Fisher's exact test yields an exact P-value: 61,634,860/1,476,337,800, which to the nearest 0.0001 is 0.0417.
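For readers who want to check the arithmetic, the following Python sketch computes this one-sided probability by direct counting. It assumes the layout described in this note (19 subjects on prozac, hence 16 on placebo) and uses only the standard library; the function and variable names are mine. The idea: of the ways to pick which 13 of the 35 subjects are the maintainers, count those that put 10, 11, 12, or 13 of them in the prozac group.

```python
from math import comb  # requires Python 3.8 or later

TOTAL = 35        # subjects in the study
MAINTAINERS = 13  # subjects who maintained, across both groups
PROZAC = 19       # subjects allocated to prozac
OBSERVED = 10     # maintainers observed in the prozac group


def upper_tail(total, maintainers, group_size, observed):
    """Count allocations with at least `observed` maintainers in the group.

    Returns (favorable, possible) as exact integers.
    """
    other = total - group_size  # size of the placebo group
    favorable = sum(
        comb(group_size, k) * comb(other, maintainers - k)
        for k in range(observed, min(maintainers, group_size) + 1)
    )
    possible = comb(total, maintainers)
    return favorable, possible


favorable, possible = upper_tail(TOTAL, MAINTAINERS, PROZAC, OBSERVED)
p_value = favorable / possible
print(f"{favorable:,}/{possible:,} = {p_value:.4f}")
# prints: 61,634,860/1,476,337,800 = 0.0417
```

If SciPy is available, scipy.stats.fisher_exact with alternative="greater" on the same 2x2 table should agree, though I have not relied on it here.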
If the null hypothesis is in fact true and the experiment is replicated, there is a 1 in 24 chance of obtaining results at least as favorable to the alternative hypothesis as the observed result (10 of 13 maintainers in the prozac group of 19).
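The "replicated experiment" reading can be imitated by simulation. The sketch below is an illustration under the devil's-advocate assumption (13 of the 35 subjects maintain no matter what), not part of the original analysis: it repeatedly draws a random prozac group of 19 and records how often 10 or more maintainers land in it. The number of replications and the seed are arbitrary choices of mine.

```python
import random


def simulate_p_value(reps=50_000, seed=0):
    """Estimate P(10 or more of the 13 maintainers fall in the prozac group)."""
    rng = random.Random(seed)
    # 1 marks a subject destined to maintain, 0 one who is not.
    subjects = [1] * 13 + [0] * 22
    hits = 0
    for _ in range(reps):
        prozac_group = rng.sample(subjects, 19)  # random allocation of 19
        if sum(prozac_group) >= 10:
            hits += 1
    return hits / reps


estimate = simulate_p_value()
print(f"simulated P-value: {estimate:.4f}")
```

The estimate should land close to the exact value of 0.0417, with small differences from run to run if the seed is changed.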
The result is statistically significant at level 0.05--but just so. The result is not statistically significant at level 0.01. I would conclude that while there is evidence in favor of the alternative hypothesis (that the likelihood of maintaining is greater when prozac is used) the evidence is not irrefutably strong, and that more investigation should be done before using prozac as a treatment for relapses in anorexia.
Finally. . .before accepting the results of this study there are a number of issues that should be examined. Can you think of any?