The world can work better

Big Data – an illusion of power

The demand for knowledge about what is cause and what is effect in the world around us generates, to a certain extent, a demand for experimental data. The cost of such experimental research undoubtedly depends on the size of the sample. But there are also situations in which an experiment simply cannot be carried out, or would border on the unethical. In such cases, “natural experiments” are used instead. We look for a set of cases that can be compared in terms of the variables we are interested in. These cases may have been recorded during other studies, or for other reasons, when information was, or still is, collected on a mass scale. Social media and the Cambridge Analytica case are spectacular examples.

Today's constantly expanding data sources (databases, surveys, scientific research, government data, epidemiological data, censuses, medical statistics, opinion polls, social media) make it possible to base statistical studies on samples that were previously unattainable because of cost or ethical constraints. At the same time, increasingly powerful computers, online access and sophisticated multiple-regression techniques create seemingly unlimited room for action.

Working with Big Data, however, quickly runs into the constraints imposed by the epistemic opacity of the world. The more variables there are, the more correlations turn out to be statistically significant yet spurious. With 1,000 variables, the number of apparent correlations exceeds 20,000; with 2,000, it exceeds 80,000. The number of false relationships grows far faster than the amount of true information, because the number of variable pairs grows roughly with the square of the number of variables. This is one reason why, as Taleb claims, the analysis of the gigantic database obtained by decoding the human genome has not yet brought any spectacular breakthrough. But for anyone seeking confirmation of a preconceived thesis, a research intuition or a political idea, the sky is the limit. Wishes magnified by behavioral lenses will yield any theory that is needed, all of it based on Big Data. In turn, the intuition of a researcher, politician or economist who wants to save humanity on an industrial scale with new concepts, reinforced by the self-confirming testimony of others who are already convinced, has many times led to catastrophic decisions for humanity once the idea was scaled up to the entire complex system.
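
The figures quoted above for 1,000 and 2,000 variables can be roughly reproduced with a back-of-the-envelope calculation. The sketch below is only an illustration, and it rests on assumptions the text does not state: that every pair of variables is tested once, that the variables are in fact unrelated, and that the conventional 5% significance level is used. The names `ALPHA`, `pair_count` and `expected_spurious` are hypothetical.

```python
# Assumption (not stated in the text): the conventional 5% significance level.
ALPHA = 0.05


def pair_count(n_variables: int) -> int:
    """Number of distinct pairs of variables that could be correlated."""
    return n_variables * (n_variables - 1) // 2


def expected_spurious(n_variables: int, alpha: float = ALPHA) -> float:
    """Expected number of pairs that pass the significance test by chance alone,
    assuming (for illustration) that all variables are truly unrelated."""
    return pair_count(n_variables) * alpha


for n in (1_000, 2_000):
    # Print: number of variables, number of pairs, expected chance "findings".
    print(n, pair_count(n), round(expected_spurious(n)))
```

Under these assumptions the calculation gives roughly 25,000 chance correlations for 1,000 variables and roughly 100,000 for 2,000, which is consistent with the thresholds quoted in the text. Doubling the number of variables roughly quadruples the number of spurious results.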

Data juggling supported by Big Data, but without experimental verification on a suitably small scale, exposes us, as elements of a complex system, to a very risky option. The “discoverers” of an alleged causal dependency that leads to a beneficial theory may enjoy the potential profit, whereas we, the citizens, bear the potential loss (48). A “discoverer” who comes across confirmation of his intuition has the option to stop searching and not verify his theory thoroughly, so as not to risk ending up with nothing. He is ready to spread his ideas now, to implement them. He is ready to take advantage of returns to scale and earn money now. Big money. And who will find the money, and the people willing, to repeat and potentially refute existing, seemingly well-documented research? For us, the elements of the system, the self-preservation instinct should suggest that we need to observe data, put forward hypotheses, and then bear the costs of verifying and calibrating them, that is, of adapting the proposed action algorithms to reality or changing direction. It should also suggest that we implement changes first on a small scale instead of allowing them to be scaled up immediately to the entire system.