Viktor Mayer-Schönberger on Big Data - Dictionary of Arguments
Big Data/Causality/Mayer-Schönberger: Big Data refers to a phenomenon that only works on a large scale, not on a smaller scale: to gain new insights (...) that can change markets, organizations or the relationships of citizens and governments.
Causality: one does not know why, but only that things are correlated.
When the amount of data grows by orders of magnitude, its nature changes.
Big Data is not about teaching computers how to think like people, but about applying mathematics to large amounts of data in order to infer probabilities, for example the probability that an email is spam or that the letter string "teh" is a typo.
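The idea of inferring probabilities from data can be sketched with simple counting. The following is a minimal illustration, not a method taken from the book: the toy messages, the name `MESSAGES`, and the function `spam_probability` are all invented for this example, and a real filter would use far more data and a more careful model.

```python
# Hypothetical toy corpus: (message, is_spam) pairs invented for illustration.
MESSAGES = [
    ("win a free prize now", True),
    ("free offer just for you", True),
    ("claim your free prize", True),
    ("meeting moved to monday", False),
    ("lunch on monday", False),
    ("is the report free of errors", False),
]


def spam_probability(word, messages):
    """Estimate P(spam | message contains word) by counting labelled messages."""
    labels = [is_spam for text, is_spam in messages if word in text.split()]
    if not labels:
        return None  # the word never occurs, so the data says nothing about it
    # True counts as 1 and False as 0, so the sum is the number of spam messages.
    return sum(labels) / len(labels)


print(spam_probability("free", MESSAGES))    # share of "free" messages that are spam
print(spam_probability("monday", MESSAGES))  # share of "monday" messages that are spam
```

The program does not know why "free" signals spam. It only counts how often the word and the label occur together, which is the point made about correlation above.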
The tools were democratized - but not the data.
Big Data: if we look at larger amounts of data because the technical means are now available, we no longer need random samples.
Without having to resort to sampling, new hypotheses can now be tested with the same data, namely all of the data. (See Statistics/Mayer-Schönberger: with the random sample method, the question cannot be changed afterwards.)
These hypotheses can then also be tested at different levels of fineness.
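Why sampling limits the level of fineness can be shown with a small simulation. This is a sketch under invented assumptions: the population of 100,000 records, the 1,000 subgroups, the sample size, and the helper `records_in_group` are hypothetical and chosen only to make the effect visible.

```python
import random

random.seed(0)  # fixed seed so repeated runs draw the same sample

# Hypothetical population: 100,000 records spread evenly over 1,000 subgroups,
# so every subgroup contains exactly 100 records.
population = [i % 1000 for i in range(100_000)]

# A random sample of 1,000 records (1 percent of the population).
sample = random.sample(population, 1_000)


def records_in_group(data, group):
    """Count how many records in data belong to the given subgroup."""
    return sum(1 for g in data if g == group)


group = 42
print("full data:", records_in_group(population, group))
print("sample:   ", records_in_group(sample, group))
```

With all of the data, each subgroup still has 100 records to test a hypothesis on. In the sample, a single subgroup is expected to hold about one record, which is too few to support any fine-grained question that was not planned when the sample was drawn.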
Error: Large amounts of data always contain errors. One has to learn to live with them.
The mess is messy itself. ((s) No pattern can be seen in the errors).
Algorithms/Big Data/Mayer-Schönberger: the more data is available, the less the sophistication of the algorithm matters: more data trumps better algorithms. This can be seen in the way computers learn to deal with everyday language and how they translate it.
Internet search/Mayer-Schönberger: with Big Data you no longer need an initial hypothesis about a phenomenon to understand the world. We do not need to know in advance what people are searching for in order to find out how and where a flu epidemic is spreading. Likewise, it is possible to predict the development of air fares without knowing the pricing policy of the airlines. Without such a hypothesis, results are available more quickly and are less biased.
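Working without an initial hypothesis can be pictured as screening every available signal for correlation with the outcome, instead of choosing candidates in advance. The sketch below is purely illustrative: the weekly numbers, the search terms, and the names `flu_cases`, `term_counts` and `pearson` are invented here and do not describe how any real flu-tracking system works.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists.

    Assumes both lists vary; a constant list would cause a division by zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly flu case counts and weekly search counts per term.
flu_cases = [10, 20, 40, 80, 60, 30]
term_counts = {
    "cough medicine": [12, 22, 41, 79, 58, 33],
    "fever remedy": [5, 11, 19, 42, 29, 16],
    "football scores": [50, 48, 52, 49, 51, 50],
}

# Rank every term by correlation with the flu counts; no term is preselected.
ranking = sorted(
    term_counts,
    key=lambda term: pearson(term_counts[term], flu_cases),
    reverse=True,
)

for term in ranking:
    print(term, round(pearson(term_counts[term], flu_cases), 3))
```

The ranking says which terms move together with the flu counts, not why they do so.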
Spell check: Compared to Microsoft, which invested millions of dollars in improving its spell check, Google got its system virtually for free by simply examining the three billion search entries it receives each day.
Google was not the first company with this idea, by the way. Yahoo already had this idea in 2000. At that time, however, this was rejected because old data was regarded as rubbish. (See Data/Mayer-Schönberger).
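How search entries alone can yield a spell check may be sketched as follows. This is a generic frequency-based corrector written for illustration and is not claimed to be Google's actual system: the tiny `QUERY_LOG`, and the functions `one_edit_away` and `suggest`, are hypothetical.

```python
from collections import Counter

# Hypothetical query log; a real one would hold billions of entries.
QUERY_LOG = [
    "the weather",
    "the news",
    "teh news",
    "the best pizza",
    "weather today",
    "the office",
]

# How often each word was typed, across all queries.
WORD_COUNTS = Counter(word for query in QUERY_LOG for word in query.split())
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def one_edit_away(word):
    """All strings reachable by one deletion, swap, replacement or insertion."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)


def suggest(word):
    """Return the most frequently typed word among the input and its neighbours."""
    candidates = {w for w in one_edit_away(word) if w in WORD_COUNTS}
    candidates.add(word)  # the input itself is always a candidate
    return max(candidates, key=lambda w: WORD_COUNTS[w])


print(suggest("teh"))
print(suggest("pizza"))
```

No dictionary is built by hand. The correction for "teh" comes only from the fact that a nearby spelling is typed far more often in the log.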
Perception/Data/Big Data/Mayer-Schönberger: our perceptions/perceptual organs were trained for a world of scarce information, not for a world of abundance of data.
In more than half of the American states, data analysis is used to support decisions on whether a prisoner should be released from prison.
Rules: Conventional computer systems base their behaviour on explicit rules, so an error can be traced back to the rule that caused it. In the case of big data analysis, this is much more difficult. The basis of what algorithms predict is too complex for most people to understand.
There is a danger that big data forecasts will become a black box.
_____________
Explanation of symbols: ((s)…): Comment by the sender of the contribution. Translations: Dictionary of Arguments.
Viktor Mayer-Schönberger / Kenneth Cukier: Big Data: A Revolution That Will Transform How We Live, Work, and Think. New York 2013