Big Data

A Revolution That Will Transform How We Live, Work, and Think
by Viktor Mayer-Schönberger, 2013, 242 pages
3.69
8k+ ratings

Key Takeaways

1. Big Data Shifts Focus from Sampling to Comprehensive Datasets

Using all the data lets us see details we never could when we were limited to smaller quantities.

From some to all. Big data marks a shift from relying on samples to analyzing comprehensive datasets. Traditional statistics relied on sampling due to limitations in data collection and processing. However, with advancements in technology, it's now feasible to analyze vast amounts of data, providing a more granular and accurate view of phenomena.

Granularity and detail. Analyzing all available data allows for deeper insights into subcategories and submarkets that sampling methods often miss. This level of detail is crucial for identifying anomalies, understanding niche preferences, and making precise predictions. For example, Google Flu Trends uses billions of search queries to predict the spread of the flu at the city level, a feat impossible with smaller, sampled datasets.

Limitations of sampling. While random sampling has been a successful shortcut, it comes with inherent weaknesses. Its accuracy depends on ensuring randomness, which is difficult to achieve, and it doesn't scale easily to include subcategories. By embracing comprehensive datasets, we can overcome these limitations and unlock new possibilities for analysis and understanding.
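
The subcategory problem can be made concrete with a small simulation. The sketch below is illustrative only: the population size, the 0.5% niche share, and the sample size of 1,000 are assumptions chosen for the example, not figures from the book.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

POPULATION_SIZE = 200_000   # assumed size of the full dataset
NICHE_SHARE = 0.005         # assumed: 0.5% of records belong to a niche subcategory
SAMPLE_SIZE = 1_000         # assumed size of a traditional random sample

# Each record is (is_niche, value). Niche records have a different average value.
population = []
for _ in range(POPULATION_SIZE):
    is_niche = random.random() < NICHE_SHARE
    mean = 80.0 if is_niche else 50.0
    population.append((is_niche, random.gauss(mean, 10.0)))

# Draw a simple random sample without replacement.
sample = random.sample(population, SAMPLE_SIZE)

def niche_values(records):
    """Return the values of records that fall in the niche subcategory."""
    return [value for is_niche, value in records if is_niche]

full_niche = niche_values(population)
sample_niche = niche_values(sample)

print(f"niche records in full data: {len(full_niche)}")
print(f"niche records in sample:    {len(sample_niche)}")
```

With these assumed numbers the sample holds only a handful of niche records (about five in expectation), too few to say much about that subgroup, while the full dataset holds roughly a thousand.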

2. Embrace Messiness: Imperfect Data Can Yield Superior Insights

In return for relaxing the standards of allowable errors, one can get ahold of much more data.

Trading exactitude for scale. In the world of big data, a willingness to accept messiness can be a positive feature. While traditional analysis emphasizes data quality and accuracy, big data recognizes that the sheer volume of information can compensate for individual errors. This trade-off allows us to work with real-world data, which is often incomplete, inconsistent, and unstructured.

More trumps better. In a grammar-checking experiment by Microsoft researchers, a simple algorithm trained on a billion words outperformed a complex algorithm trained on a million words. Google's translation system works well for a similar reason: it draws on a larger but much messier dataset, the entire global Internet and more.

Messiness in action. The Billion Prices Project, which tracks inflation in real-time by scraping data from online retailers, accepts messiness in return for scale and timeliness. Similarly, tagging systems on platforms like Flickr embrace imprecision to create a richer and more flexible way of organizing content. By accepting messiness, we can unlock new insights and create valuable services that would be impossible with traditional methods.
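
One way to see why volume can offset imprecision is the standard error of a mean, which shrinks with the square root of the number of readings when those readings are independent and unbiased. The sketch below uses made-up noise levels and counts. It assumes the errors are random rather than systematic, since more data does not cancel a consistent bias.

```python
import math

def standard_error(noise_std: float, n_readings: int) -> float:
    """Standard error of the mean of n independent, unbiased readings."""
    if n_readings <= 0:
        raise ValueError("n_readings must be positive")
    return noise_std / math.sqrt(n_readings)

# Hypothetical scenario: a few careful readings vs. many sloppy ones.
few_precise = standard_error(noise_std=1.0, n_readings=10)
many_messy = standard_error(noise_std=5.0, n_readings=10_000)

print(f"10 precise readings:   {few_precise:.3f}")
print(f"10,000 messy readings: {many_messy:.3f}")
```

Under these assumptions, readings that are individually five times noisier still give a tighter estimate of the average once there are a thousand times as many of them.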

3. Correlation Trumps Causation: Knowing "What" Is Often Enough

In a big-data world, by contrast, we won’t have to be fixated on causality; instead we can discover patterns and correlations in the data that offer us novel and invaluable insights.

The power of prediction. Big data shifts the focus from understanding why something happens to predicting what will happen. By identifying strong correlations, we can make accurate predictions even without knowing the underlying causes. This approach has revolutionized e-commerce, healthcare, and many other fields.

Examples of correlation-based predictions:

  • Amazon's recommendation system suggests products based on purchase history, not on understanding why customers like certain items.
  • Walmart stocks Pop-Tarts before hurricanes based on historical sales data, not on understanding the psychological reasons behind the correlation.
  • FICO's Medication Adherence Score predicts whether people will take their medication based on factors like homeownership and job tenure, not on understanding their individual health beliefs.
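
A correlation of this kind can be measured without any causal model. The following sketch computes a Pearson correlation coefficient over invented weekly figures. The data and variable names are hypothetical and are not drawn from Walmart or the book.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Invented data: 1 = storm warning issued that week, 0 = no warning.
storm_warning = [0, 0, 1, 0, 1, 0, 0, 1]
# Invented data: units of a snack sold in the same weeks.
snack_units = [120, 115, 310, 130, 290, 125, 118, 305]

r = pearson(storm_warning, snack_units)
print(f"correlation: {r:.2f}")
```

A coefficient near 1 says the two series move together, which is enough to act on for stocking decisions. It says nothing about why they move together.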

Limitations of causality. While humans are naturally inclined to seek causal explanations, this can often lead to biases and erroneous conclusions. In contrast, correlation analysis allows us to discover patterns and relationships that we might never have considered otherwise. By embracing "what" instead of "why," we can unlock new insights and make more effective decisions.

4. Datafication: Transforming the Intangible into Quantifiable Data

Datafication refers to taking information about all things under the sun—including ones we never used to think of as information at all...and transforming it into a data format to make it quantified.

Quantifying the world. Datafication is the process of transforming information about all things, including those not traditionally considered data, into a quantifiable format. This allows us to analyze and use the information in new ways, such as predictive analysis. It unlocks the implicit, latent value of information.

Examples of datafication:

  • Professor Koshimizu's system transforms sitting positions into data to identify car thieves.
  • Maury transformed old ship logs into data to create navigational charts.
  • Google transforms search queries into data to predict flu outbreaks.

Datafication vs. Digitization. Datafication is distinct from digitization, which is simply the process of converting analog information into digital format. Datafication goes further by transforming information into a structured, quantifiable form that can be analyzed and used for new purposes.
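
The distinction can be illustrated in code. A transcribed log entry is digitized but still opaque to analysis, whereas parsing it into typed fields datafies it. The entry format, field names, and sample entry below are invented for illustration and do not reflect the layout of Maury's actual logs.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    date: str          # ISO date string, e.g. "1848-03-02"
    latitude: float    # degrees, north positive
    longitude: float   # degrees, east positive
    wind: str          # compass direction, e.g. "NE"

# Hypothetical entry format: "<date> lat <deg><N|S> lon <deg><E|W> wind <dir>"
ENTRY_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"lat\s+(?P<lat>\d+(?:\.\d+)?)(?P<ns>[NS])\s+"
    r"lon\s+(?P<lon>\d+(?:\.\d+)?)(?P<ew>[EW])\s+"
    r"wind\s+(?P<wind>[NESW]{1,3})"
)

def datafy(entry: str) -> Optional[LogRecord]:
    """Parse one free-text entry into a structured record, or None if it does not match."""
    match = ENTRY_PATTERN.search(entry)
    if match is None:
        return None
    lat = float(match["lat"]) * (1 if match["ns"] == "N" else -1)
    lon = float(match["lon"]) * (1 if match["ew"] == "E" else -1)
    return LogRecord(match["date"], lat, lon, match["wind"])

digitized_text = "1848-03-02 lat 34.5N lon 40.2W wind NE"  # made-up entry
record = datafy(digitized_text)
print(record)
```

Once entries are in this form they can be filtered, aggregated, and combined with other datasets, which the raw text does not allow.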

5. Data's Value Lies in Reuse and Unlocking Latent Potential

Every single dataset is likely to have some intrinsic, hidden, not yet unearthed value, and the race is on to discover and capture all of it.

Beyond primary use. The value of data is no longer limited to its original purpose. In the big data age, data's true worth lies in its potential for reuse and the unlocking of latent value. This requires a shift in mindset from treating data as a static resource to recognizing it as a dynamic asset.

Examples of data reuse:

  • Google reuses search queries to predict flu outbreaks and improve language translation.
  • UPS reuses sensor data from its vehicles to predict engine trouble and optimize routes.
  • Aviva reuses credit reports and consumer-marketing data to assess health risks.

The option value of data. Data's true value is the sum of all the possible ways it can be employed in the future. This "option value" can be unlocked through innovative analysis, recombination with other datasets, and the creation of new services. By recognizing and harnessing this potential, organizations can create significant economic value and gain a competitive advantage.

6. Big Data Reshapes Industries and Erodes the Value of Expertise

Specific area expertise matters less in a world where probability and correlation are paramount.

Shifting power dynamics. Big data is reshaping industries by challenging traditional notions of expertise and decision-making. In a world where probability and correlation are paramount, specific area expertise matters less. This shift is disrupting established hierarchies and empowering new players.

Moneyball effect. The movie Moneyball illustrates how data-driven analysis can upstage traditional expertise. Baseball scouts were replaced by statisticians who used data to identify undervalued players and build a winning team.

Specific area expertise matters less. The rise of big data is forcing an adjustment to traditional ideas of management, decision-making, human resources, and education. Subject-matter specialists will not go away, but they will have to contend with what the big-data analysis says.

7. Privacy, Propensity, and the Perils of Unchecked Data Power

Most of our institutions were established under the presumption that human decisions are based on information that is small, exact, and causal in nature.

The dark side of data. While big data offers numerous benefits, it also presents significant risks to privacy, freedom, and fairness. Unchecked data power can lead to increased surveillance, penalties based on propensities, and a dictatorship of data.

From privacy to probability. The danger shifts from privacy to probability: algorithms will predict the likelihood that a person will have a heart attack, default on a mortgage, or commit a crime. This raises an ethical question about the role of free will versus the dictatorship of data.

The dictatorship of data. We risk falling victim to a dictatorship of data, whereby we fetishize the information, the output of our analyses, and end up misusing it. Society has millennia of experience in understanding and overseeing human behavior. But how do you regulate an algorithm?

8. Accountability, Human Agency, and Algorithm Auditing: Governing Big Data

New principles are needed for the age of big data, which we lay out in Chapter Nine.

New principles for a new era. The age of big data requires new rules and principles to safeguard individual rights and ensure fairness. These principles must build upon existing values but also recognize the unique challenges posed by big data.

Accountable use. Shifting the focus from individual consent to data-user accountability is essential for protecting privacy. Data users must be held responsible for their actions and take steps to mitigate potential harm.

Human agency. We must guarantee human agency by ensuring that judgments are based on real actions, not statistical predictions. This requires a redefinition of justice to protect individual freedom and responsibility.

Algorithm auditing. New institutions and professionals are needed to audit and interpret complex algorithms, ensuring transparency and accountability. These "algorithmists" will play a crucial role in safeguarding against the misuse of big data.

Review Summary

3.69 out of 5
Average of 8k+ ratings from Goodreads and Amazon.

Big Data receives mixed reviews, with praise for its accessible overview of the topic and illustrative examples. Critics note redundancy and oversimplification. Readers appreciate insights into data's impact on society, privacy concerns, and future implications. Some find the content outdated or lacking depth. The book is recommended for those new to big data concepts but may disappoint experts. Overall, it's viewed as a thought-provoking introduction to an increasingly important field, albeit with limitations in scope and detail.

About the Author

Viktor Mayer-Schönberger is a renowned expert in big data and internet governance. As a professor at the Oxford Internet Institute, University of Oxford, he has authored numerous articles and books on digital technology's societal impact. His book "Delete: The Virtue of Forgetting in the Digital Age" explores the implications of digital memory. Mayer-Schönberger's expertise is sought after by global corporations and organizations, including Microsoft and the World Economic Forum, where he serves on advisory boards. His research and insights contribute significantly to understanding the evolving digital landscape and its effects on governance, regulation, and society.
