Idea: As our scientific precision increases, we need to proportionally (or exponentially) expand our capacity to detect effects. Example: to definitively detect a statistical difference between males and females in the incidence of heart attacks, we might need, say, a total sample size of 100 (50 people in each group, so that our statistical calculations are robust enough to support solid conclusions). But as we learn more about heart attacks, we realize that more variables affect them: age, diet, physical activity, abdominal adiposity, educational attainment, socio-economic status, smoking, and alcohol consumption. Assuming each of these variables is dichotomized into only two groups (young, old; healthy diet, unhealthy diet; active, inactive; etc.) and that 50 people per group is still enough to detect a true difference in each variable (which is unrealistic), we would now need 2^9 × 50 = 25,600 people to tease out the effects of all of these different variables. With more nuanced categories, this number climbs very quickly. We know a lot about heart attacks nowadays, but there is still unexplained variation in the effects we see, which means there are other things we’re not measuring and accounting for.
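The arithmetic above can be sketched in a few lines. This is just the back-of-the-envelope calculation from the paragraph, assuming (optimistically) a flat 50 people per cell of a fully crossed design:

```python
PER_CELL = 50  # assumed minimum per group for "robust" comparisons

def required_n(num_binary_vars: int, per_cell: int = PER_CELL) -> int:
    """Total sample needed to fill every cell of a fully crossed design
    of dichotomous variables."""
    cells = 2 ** num_binary_vars
    return cells * per_cell

# Sex alone: 2 cells.
print(required_n(1))  # 100
# Sex plus the eight extra variables listed above: 2**9 = 512 cells.
print(required_n(9))  # 25600
```

Each additional dichotomous variable doubles the required sample, which is the whole point: the growth is exponential in the number of things you try to account for.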
Now think of whole-genome research, where we are handling roughly 125 megabytes of information from a single person. Then think of sequencing the genomes of one person’s entire microbiome. How many people would we need then? More than exist on Earth.
My prediction is that soon we’ll realize that the more we know, the less we can continue to learn. We will be reduced to underpowered tests of small questions. We will have hit an upper limit on what we can isolate or definitively know about anything, and there will be nothing we can do about it. (Before that, our computers will not have enough computing power, and even before that, we will never have enough money to run even one sound study of these proportions.) We’ll get to the point where we must resign ourselves to not knowing what we want to know. Science will become postmodern, accepting that we can’t do what our methods set out to do. It will be a kind of scientific Armageddon, having arrived at the limits of statistical possibility.
Just a thought.
In the meantime, I’m now wondering if I can use Bayesian data analysis to estimate the prevalence of diabetes while accounting for error in diagnosis. “Live by the harmless untruths that make you brave and kind and healthy and happy.” —Kurt Vonnegut, Cat’s Cradle
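A minimal sketch of that Bayesian prevalence idea: if a diagnostic test has known sensitivity and specificity, the probability of an *apparent* positive is p = π·Se + (1 − π)·(1 − Sp), and we can put a uniform prior on the true prevalence π and compute the posterior on a grid. All the numbers here (Se = 0.90, Sp = 0.95, 180 positives out of 1,000) are made up for illustration, not real diabetes data:

```python
from math import lgamma, exp, log

SE, SP = 0.90, 0.95       # assumed test sensitivity and specificity
POSITIVES, N = 180, 1000  # hypothetical survey: apparent positives / sampled

def log_binom(k: int, n: int, p: float) -> float:
    """Log binomial likelihood of k successes in n trials."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

# Grid approximation with a uniform prior on true prevalence pi.
grid = [i / 1000 for i in range(1, 1000)]
# Apparent-positive probability under misclassification:
# p = pi*Se + (1 - pi)*(1 - Sp)
weights = [exp(log_binom(POSITIVES, N, pi * SE + (1 - pi) * (1 - SP)))
           for pi in grid]
total = sum(weights)
posterior_mean = sum(pi * w for pi, w in zip(grid, weights)) / total
print(round(posterior_mean, 3))
```

With these made-up numbers the posterior mean lands below the naive 18% apparent prevalence, since some of those positives are false positives; the grid could just as easily yield a full posterior interval rather than a point estimate.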