My suggestions for how to deal with data proof are in the main column. I asked readers for their suggestions, and you guys came up with several great additions to the process of checking data:
1. To prove causality, you need three things: co-variation (aka correlation), time order (to cause B, A has to happen first) and non-spuriousness (the relationship isn't explained by some third factor driving both).
2. Look at the depth of correlation (not just the direction but the amount of movement). What happens when "X" changes 5% -- does it correlate with a 15% change in Y? Do other-sized changes in "X" move Y as well? Is there a tapering off of the effect? Does it invert? Are the increases linear or non-linear?
3. Always beware of "ceteris paribus." Translation: "all other things being equal." In practice, all other things are very rarely equal.
4. As George Soros wrote, markets are reflexive. Market participants "learn." For that reason alone, be extremely wary of applying historical patterns to current data. As Mark Twain is reputed to have said, "History doesn't repeat, but it rhymes."
5. Look at outliers. This is very much related to central tendency -- outliers are similar to non-recurring costs.
6. Extraordinary claims need extraordinary evidence.
7. Watch for distribution. Many statistical tests assume a normal distribution (i.e., a bell curve). If you can't provide a basis for the normality assumption, you must use a distribution-free or non-parametric test; otherwise your conclusions are dubious.
8. Most data patterns are linear, curvilinear or non-linear. Like costs in accounting, which pattern you see depends on the time period of interest (e.g., costs could be fixed, variable or mixed).
9. If you are looking to measure a data series' central tendency, consider what occurs if you use mean, median or mode. Each has pros and cons and can be "biased," producing different outcomes.
10. The key is to know what someone's interests are. Are they trying to sell you something? They will naturally frame their presentation to compel a certain action. Key question: What's missing? What one thing would have to be true for them to be wrong?
11. Present/Not Present: if X is present, when is Y not present? Find an example. Not Present/Present: if X is not present, when is Y present?
12. What kind of logic is being used? Is it inductive, deductive, analogic, etc.? There are assumptions with each form of reasoning, and each has potential pitfalls. Inductive can always be wrong -- the "black swan" issue. Deductive: What are the premises? Analogic: Do the things being compared have materially relevant similarities?
13. Case studies are often wrong. Many stories are built on them and often have compelling detail, but they are often part of a selling frame that is not generalizable to other situations.
14. Observe all points in your analysis where you assumed that events were independent (in the probabilistic sense). Re-examine whether this assumption is still valid given the current market environment. If it is not, how does this affect your calculations?
15. If an indicator has, in the past, been such a good predictor of short term moves in the market as to reach the level of "statistical significance," and if that indicator were to become widely recognized as having reached that elusive threshold of statistical significance, the indicator would cease to be effective once the majority recognized its prior effectiveness. Thus, anyone waiting for statistical certainty is certain to be disappointed.
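
For readers who like to see a couple of these checks in practice, here is a minimal Python sketch of two ideas from the list: how mean, median and mode respond to a single outlier, and how to tally the Present/Not Present cases. The numbers and the function names (`central_tendency`, `presence_table`) are made up purely for illustration; treat this as one way to run the checks, not a prescribed method.

```python
from collections import Counter
from statistics import mean, median, mode


def central_tendency(values):
    """Return the mean, median and mode of a data series."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}


def presence_table(observations):
    """Count how often X and Y appear together or apart.

    `observations` is a list of (x_present, y_present) boolean pairs.
    """
    counts = Counter(observations)
    return {
        "X and Y": counts[(True, True)],
        "X without Y": counts[(True, False)],
        "Y without X": counts[(False, True)],
        "neither": counts[(False, False)],
    }


# Hypothetical series; the last value stands in for a one-off outlier.
series = [10, 12, 12, 13, 14, 15, 120]
with_outlier = central_tendency(series)
without_outlier = central_tendency(series[:-1])

# Hypothetical observations of whether X and Y were each present.
observations = [
    (True, True),
    (True, True),
    (True, False),
    (False, True),
    (False, False),
]
table = presence_table(observations)

print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
print("presence table: ", table)
```

In this made-up series the outlier roughly doubles the mean while the median and mode barely move, which is the sort of difference worth checking before trusting any single "average." Any nonzero count in the "X without Y" or "Y without X" rows is a counterexample worth chasing down.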