Understanding the Misuse of Statistical Techniques in Research
The Risks of Misinterpreting Statistical Findings
Multiple regression is an essential technique in data science, allowing analysts to derive insights from complex datasets. However, as the saying goes, "There are lies, damned lies, and statistics." The phrase captures a common concern about how easily multiple regression can be misused.
Recently, a study published in PLOS ONE, titled "Body Mass Index and All-Cause Mortality in a 21st Century U.S. Population: A National Health Interview Survey Analysis," presented some unexpected findings. The conclusion noted that the risk of all-cause mortality increased by 21–108% for individuals with a BMI of 30 or higher. However, it further stated that BMI may not independently influence mortality, particularly among older adults with an overweight BMI. The researchers called for additional studies to examine the factors that shape the association between BMI and mortality.
This seemingly revolutionary assertion, that being overweight does not elevate mortality risk until one reaches the obese range (a BMI of 30 or higher), raises questions about the study's methodology. What does "independently of other risk factors" really mean?
Evaluating Research Methodology
An excerpt from the study's "methods" section reveals that the researchers utilized multivariable Cox proportional hazards regression to assess mortality risks while adjusting for covariates and accounting for survey design. This adjustment aims to isolate the impact of body weight on mortality.
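For readers unfamiliar with the technique, a Cox proportional hazards model is usually written in the general form below. The notation is generic textbook notation rather than anything taken from the study: x stands for the exposure of interest (here a BMI category), z for the vector of adjustment covariates, and h_0(t) for an unspecified baseline hazard.

```latex
h(t \mid x, \mathbf{z}) = h_0(t)\,\exp\!\left(\beta x + \boldsymbol{\gamma}^{\top}\mathbf{z}\right),
\qquad
\mathrm{HR} = e^{\beta}
```

The hazard ratio HR compares two people who differ in x but share the same values of z, so its meaning depends entirely on which covariates are placed in z. Assuming the 21–108% figure quoted earlier was derived from hazard ratios, it would correspond to values of roughly 1.21 to 2.08.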
It is standard practice for studies to control for demographic factors like age, race, and education. Accounting for insurance coverage is less common, but it is a beneficial addition. Controlling for alcohol consumption is more contentious: drinking often correlates with weight gain, but it can also cause health problems unrelated to weight.
Controlling for smoking is crucial, but the researchers adjusted for it alongside diseases that smoking itself causes, a problem discussed below. They also adjusted for physical activity, which makes the effect of weight on mortality harder to interpret because the relationship between exercise and obesity runs in both directions; for instance, individuals who struggle with their weight may refrain from exercising due to mobility issues.
The study also controlled for comorbidities, many of which are influenced by body weight, such as diabetes and cardiovascular diseases. Furthermore, they factored in how frequently participants visited their doctors, which could indicate underlying health problems—a factor that is often more prevalent among obese individuals.
Interestingly, the researchers recognized this challenge, stating that they did not adjust for comorbidities in the main analysis because those conditions could themselves influence the relationship between BMI and mortality.
The initial analysis indicated that unadjusted mortality risks were similar across various BMI categories from 20.0 to 29.9 kg/m², but significantly elevated risks emerged at a BMI of 30 or higher.
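The concern about adjusting for comorbidities can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses ordinary least squares on a continuous risk score rather than the Cox regression used in the study, the data are simulated, and every coefficient and variable name (bmi, disease, risk) is invented for illustration.

```python
import numpy as np


def ols_slopes(predictors, outcome):
    """Fit outcome ~ intercept + predictors by least squares; return the slopes only."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefficients[1:]


rng = np.random.default_rng(0)
n = 200_000

# Hypothetical exposure: BMI values centered on 27.
bmi = rng.normal(27.0, 5.0, n)

# Hypothetical mediator: a disease-burden score that BMI partly causes
# (assumed effect of 0.10 per BMI unit).
disease = 0.10 * bmi + rng.normal(0.0, 1.0, n)

# Hypothetical outcome: a risk score with a direct BMI effect of 0.02
# and a disease effect of 0.50.
risk = 0.02 * bmi + 0.50 * disease + rng.normal(0.0, 1.0, n)

# Total effect of BMI: do not adjust for the mediator.
total_effect = ols_slopes([bmi], risk)[0]

# "Independent" effect of BMI: adjust for the mediator as if it were a confounder.
adjusted_effect = ols_slopes([bmi, disease], risk)[0]

print(f"BMI slope without adjusting for disease: {total_effect:.3f}")
print(f"BMI slope after adjusting for disease:   {adjusted_effect:.3f}")
```

By construction, the total effect of BMI in this toy model is 0.02 + 0.50 × 0.10 = 0.07, while the direct effect is only 0.02. Adjusting for the disease score recovers the smaller number, so most of the harm that BMI does through disease disappears from the BMI coefficient even though BMI caused it.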
Exploring the Impact of Smoking
Most unhealthy behaviors, including smoking, tend to complicate studies on body weight. Smoking can lead to lower weight but also contributes to numerous health issues. Thus, researchers commonly control for smoking in studies examining the health effects of body weight. However, in this case, the study adjusted for smoking alongside cancer and cardiovascular disease, potentially obscuring the true impact of smoking on health.
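Why smoking needs to be controlled in the first place can be shown with a similar sketch. As before, this is a simulated linear model rather than the study's Cox regression, and the smoking rate, effect sizes, and variable names are assumptions chosen only to make the pattern visible.

```python
import numpy as np


def fit_slopes(predictors, outcome):
    """Fit outcome ~ intercept + predictors by least squares; return the slopes only."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefficients[1:]


rng = np.random.default_rng(1)
n = 200_000

# Hypothetical confounder: 30% of the simulated population smokes.
smoker = (rng.random(n) < 0.30).astype(float)

# Smoking lowers BMI by an assumed 2 units on average.
bmi = 27.0 - 2.0 * smoker + rng.normal(0.0, 4.0, n)

# Risk score: assumed true BMI effect of 0.05 and smoking effect of 1.0.
risk = 0.05 * bmi + 1.0 * smoker + rng.normal(0.0, 1.0, n)

# Ignoring smoking mixes the harm of smoking into the low-BMI group.
unadjusted_bmi_slope = fit_slopes([bmi], risk)[0]

# Adjusting for smoking, a genuine confounder, recovers the BMI effect.
adjusted_bmi_slope = fit_slopes([bmi, smoker], risk)[0]

print(f"BMI slope ignoring smoking:      {unadjusted_bmi_slope:.3f}")
print(f"BMI slope adjusted for smoking:  {adjusted_bmi_slope:.3f}")
```

In this toy setup the unadjusted slope understates the effect of BMI because thinner people are more likely to be smokers, and adjusting for smoking brings the estimate back to about 0.05. The contrast with comorbidities is the point: smoking precedes and influences weight, so adjusting for it removes bias, whereas diseases caused by weight sit downstream of it.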
To be fair, the study's exclusion of individuals who died within two years of the survey is a sound practice. Weight loss is common in the final years of life, and this "reverse causation" can skew results in BMI studies.
Despite some valid methodologies, the study's overall approach raises concerns about the potential for misleading interpretations of data.
The Fine Line Between Good Science and Misleading Statistics
The study's initial analysis was labeled as the "main analysis," which may mislead readers since it was more a preliminary investigation than a definitive conclusion. Ultimately, the findings suggest that being slightly overweight does not significantly affect mortality risk, as long as one is not suffering from other health issues. In contrast, obesity is linked to a notable increase in mortality risk.
This misuse of multiple regression raises critical questions about the integrity of statistical practices. Using similar methods, one could manipulate data to support various misleading conclusions.
Why Do Researchers Engage in Such Practices?
There are several reasons researchers may publish studies with questionable methodologies. Academic pressure to publish often drives scholars to produce results that stand out, even if those results are not entirely accurate. Surprisingly, studies that challenge conventional wisdom often garner more attention and citations, despite their potentially flawed conclusions.
In conclusion, while multiple regression is a valuable analytical tool, it is essential to scrutinize the variables being controlled. Are researchers accounting for mechanisms through which the independent variable might influence the dependent variable? Such scrutiny is vital in ensuring that research findings contribute meaningfully to the scientific discourse.
For further insight into the nuances of statistical analysis, consider exploring the following videos:
In the video "How Statistics Can Be Misleading," Mark Liddell discusses how statistical methods can be manipulated, shedding light on the importance of critical thinking in interpreting data.
Sanne Blauw's video, "How to Defend Yourself Against Misleading Statistics in the News," provides practical tips for identifying and questioning statistical claims in everyday reporting.
Feel free to share your thoughts or questions on research methods in health science in the comments, and I will do my best to respond.