Introgen's Data-Mining Misleads
The following example, although exaggerated, will better explain why prospectively defined data analysis is more relevant than anything found retrospectively:
Suppose lung cancer Drug X is being tested in a trial of 500 patients. Half the patients receive Drug X plus chemotherapy; the other half get chemotherapy alone. The goal of the study, or its primary endpoint, is to determine whether Drug X improves survival of lung cancer patients, and by how much. When the study is finished, patients in both arms have a median survival of 11 months. Drug X, it seems, did nothing to improve survival of lung cancer patients overall.
But wait, it turns out that 75 left-handed lung cancer patients who took Drug X in the study survived for 20 months, while 25 left-handers treated with chemotherapy alone lived for only 10 months. Moreover, this result -- a doubling of survival for left-handed lung cancer patients -- was statistically significant. On the basis of this analysis, would the FDA approve Drug X as a treatment for left-handed lung cancer patients?
Of course not. The clinical trial wasn't designed in advance, or prospectively, to test that hypothesis. Instead, the conclusion that Drug X boosted survival in left-handers was derived retrospectively, after the trial was designed and the data were collected and analyzed. What may look like solid proof of Drug X's efficacy is actually just an observation found through data-mining (and a potentially biased one). Another clinical trial of Drug X, this time in left-handers, would need to be run to prove the observation true.
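A rough back-of-the-envelope calculation shows why a retrospective subgroup "win" like the left-hander result deserves suspicion. The sketch below is illustrative only and rests on assumptions that are not part of the example above: each subgroup comparison is treated as an independent test, the conventional 0.05 significance threshold is used, and the function name `chance_of_false_positive` is made up for this illustration.

```python
def chance_of_false_positive(num_subgroups, alpha=0.05):
    """Probability that at least one subgroup looks "statistically
    significant" purely by chance when the drug truly does nothing.

    Assumes the subgroup tests are independent and each uses the same
    significance threshold `alpha` (both are simplifying assumptions).
    """
    if num_subgroups < 0:
        raise ValueError("num_subgroups must be non-negative")
    # Chance that every single test correctly shows nothing: (1 - alpha)^k.
    # The complement is the chance that at least one test is a false alarm.
    return 1 - (1 - alpha) ** num_subgroups


if __name__ == "__main__":
    # One pre-specified endpoint versus a growing pile of after-the-fact slices
    # (handedness, age, sex, smoking history, and so on).
    for k in (1, 5, 10, 20):
        pct = 100 * chance_of_false_positive(k)
        print(f"{k:>2} subgroup(s) examined: {pct:.1f}% chance of a fluke 'win'")
```

Under these assumptions, a single prospectively defined test carries a 5% chance of a false alarm, while slicing the same failed trial 20 different ways pushes the chance of finding at least one "significant" subgroup to roughly 64%. Real subgroups overlap and are not truly independent, so the exact figure will differ, but the direction of the problem is the same.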














