The Basic Practice of Statistics 7th Edition

Published by W. H. Freeman
ISBN 10: 146414253X
ISBN 13: 978-1-46414-253-6

Chapter 2 - Describing Distributions with Numbers - Chapter 2 Exercises - Page 70: 2.43b

Answer

(Note: the comparative (side-by-side) boxplots were drawn using Python, but they still show all the features of boxplots, including the outliers marked as dots.)

Question: What can you say about the effects of gastric banding versus lifestyle intervention on weight loss for the subjects in this study?

Answer: For the subjects in this study, gastric banding was more effective than lifestyle intervention: its median weight loss (33.35 kg) is far higher than the lifestyle intervention median (1.7 kg). The gastric banding boxplot also shows more outliers (two, versus one for lifestyle intervention).

The overall spread of weight loss is larger for gastric banding (range 86.8 kg versus 51.6 kg), which suggests that its results are less consistent, although the IQRs of the two treatments are fairly close (about 17.9 kg versus 14.2 kg).

In terms of shape, the gastric banding weight losses are somewhat right-skewed, while the lifestyle intervention weight losses are roughly symmetric.

Work Step by Step

To draw the boxplots of both treatments (gastric banding and lifestyle intervention), we first sort the weight losses for each treatment from smallest to largest.

GB (gastric banding): -5.4, 13.4, 15.2, 19.4, 20.2, 22.0, 24.8, 27.9, 29.7, 31.0, 32.3, 32.8, 33.9, 35.6, 36.5, 37.6, 39.0, 41.7, 43.0, 49.0, 53.4, 57.6, 64.8, 81.4

LI (lifestyle intervention): -17.0, -16.7, -12.8, -4.6, -4.3, -3.1, -3.0, -1.8, 1.4, 2.0, 4.0, 6.0, 6.0, 11.6, 15.5, 15.8, 20.6, 34.6

Then we count the data points for each treatment: GB has 24 and LI has 18.

To determine the median, take the number of data points, add 1, and divide by 2 to get the rank of the median.

Median of GB weight loss: rank $\frac{24+1}{2} = 12.5$, so average the 12th and 13th data points: $\frac{32.8+33.9}{2} = 33.35$

Median of LI weight loss: rank $\frac{18+1}{2} = 9.5$, so average the 9th and 10th data points: $\frac{1.4+2.0}{2} = 1.7$

To find Q1 and Q3 of each treatment, split each sorted data set in half so that both halves have the same number of data points. Q1 is the median of the lower half and Q3 is the median of the upper half, found with the same rank formula as above. Done this way, Q1 and Q3 are 23.4 and 42.35 for GB, and -4.3 and 11.6 for LI. The quartiles used in the rest of this answer differ slightly; they appear to be the linear-interpolation quartiles that Python reports by default (for example, NumPy's percentile function), which is consistent with the plot having been drawn in Python:

Q1 and Q3 of GB weight loss: 24.1 and 42.025

Q1 and Q3 of LI weight loss: -4.0 and 10.2

Using Q1 and Q3 of each treatment, compute $1.5 \times \text{IQR} = 1.5 \times (Q3 - Q1)$.

$1.5 \times \text{IQR}$ of GB weight loss = 26.8875

$1.5 \times \text{IQR}$ of LI weight loss = 21.3

With this product we can write the rule for possible outliers. For each treatment, a data point is an outlier if it is less than $Q1 - 1.5 \times \text{IQR}$ or more than $Q3 + 1.5 \times \text{IQR}$.

For GB weight loss, a data point is an outlier if it is less than -2.7875 or more than 68.9125.

For LI weight loss, a data point is an outlier if it is less than -25.3 or more than 31.5.

(With the half-based quartiles, the cutoffs would instead be -5.025 and 70.775 for GB, and -28.15 and 35.45 for LI.)
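The calculations above can be checked with a short Python sketch. It is only an illustration, not the code used for the original plot: the helper names `quartiles_by_halves`, `quartiles_interpolated`, and `fences` are made up for this example, and the interpolated quartiles rely on the assumption that the standard library's `statistics.quantiles` with `method="inclusive"` reproduces the linear-interpolation values quoted in this answer.

```python
from statistics import median, quantiles

# Weight loss data for each treatment, as listed in the answer.
gb = [-5.4, 13.4, 15.2, 19.4, 20.2, 22.0, 24.8, 27.9, 29.7, 31.0, 32.3, 32.8,
      33.9, 35.6, 36.5, 37.6, 39.0, 41.7, 43.0, 49.0, 53.4, 57.6, 64.8, 81.4]
li = [-17.0, -16.7, -12.8, -4.6, -4.3, -3.1, -3.0, -1.8, 1.4, 2.0, 4.0, 6.0,
      6.0, 11.6, 15.5, 15.8, 20.6, 34.6]


def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves.

    When the number of values is odd, the middle value is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])


def quartiles_interpolated(values):
    """Q1 and Q3 by linear interpolation between neighbouring data points."""
    q1, _, q3 = quantiles(sorted(values), n=4, method="inclusive")
    return q1, q3


def fences(q1, q3):
    """Lower and upper cutoffs of the 1.5 x IQR rule."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


for name, data in (("GB", gb), ("LI", li)):
    print(name, "n =", len(data), "median =", round(median(data), 3))
    methods = (("halves", quartiles_by_halves), ("interpolated", quartiles_interpolated))
    for label, method in methods:
        q1, q3 = method(data)
        low, high = fences(q1, q3)
        print(f"  {label}: Q1={q1:.3f}, Q3={q3:.3f}, fences=({low:.4f}, {high:.4f})")
```

The interpolated rows correspond to the quartiles and cutoffs used in this answer, and the "halves" rows correspond to the median-of-halves method described in the text.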
Based on these cutoffs, the GB weight loss data has 2 outliers, -5.4 and 81.4, and the LI weight loss data has 1 outlier, 34.6. The LI outlier is a borderline case: if Q1 and Q3 are taken as the medians of the two halves (-4.3 and 11.6), the upper cutoff becomes 35.45 and 34.6 is no longer flagged.

Excluding the outliers, the smallest and largest remaining values of each treatment mark the ends of the whiskers. For GB weight loss these are 13.4 and 64.8; for LI weight loss they are -17.0 and 20.6.

Organize all of these results as a five-number summary for each treatment, and list the outliers alongside it.

To draw the comparative boxplot, draw an x-axis labelled "Treatments" and a y-axis labelled "Weight Loss (kg)". Put two ticks on the x-axis, named "Gastric banding" and "Lifestyle intervention". Choose the y-axis range so that it covers the lowest and highest values of both data sets combined, and mark the outliers as dots.

When interpreting the comparative boxplots, do not say that one treatment is better than the other without evidence. Always base your argument on the boxplots, for example on the difference in medians or the difference in variability. The answer above is a sample of what could be written; you can rewrite it in your own words.
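The outlier check and whisker ends described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `boxplot_parts` and the dictionary keys are hypothetical, and the quartiles are computed by linear interpolation with `statistics.quantiles(..., method="inclusive")` so that they agree with the values used in this answer rather than with the median-of-halves method.

```python
from statistics import quantiles

# Weight loss data for each treatment, as listed in the answer.
gb = [-5.4, 13.4, 15.2, 19.4, 20.2, 22.0, 24.8, 27.9, 29.7, 31.0, 32.3, 32.8,
      33.9, 35.6, 36.5, 37.6, 39.0, 41.7, 43.0, 49.0, 53.4, 57.6, 64.8, 81.4]
li = [-17.0, -16.7, -12.8, -4.6, -4.3, -3.1, -3.0, -1.8, 1.4, 2.0, 4.0, 6.0,
      6.0, 11.6, 15.5, 15.8, 20.6, 34.6]


def boxplot_parts(values):
    """Return the pieces needed to draw one boxplot by hand.

    Assumption: quartiles use linear interpolation ("inclusive" method).
    """
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    step = 1.5 * (q3 - q1)
    low_fence, high_fence = q1 - step, q3 + step

    # Points beyond the cutoffs are outliers; the whiskers end at the most
    # extreme points that are still inside the cutoffs.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]

    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


for name, data in (("Gastric banding", gb), ("Lifestyle intervention", li)):
    print(name, boxplot_parts(data))
```

Each dictionary holds what goes into one box: the box spans `q1` to `q3` with a line at `median`, the whiskers run to `whisker_low` and `whisker_high`, and each value in `outliers` is drawn as a dot.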