Key Takeaways
1. Define the Problem Before Diving into Data
Gutman and Goldmeier filter through much of the noise to break down complex data and statistical concepts we hear today into basic examples and analogies that stick.
Focus on the business problem. Before starting any data project, clearly define the problem you're trying to solve. Avoid getting caught up in the hype of new technologies or methodologies. Instead, focus on the business value and the impact of solving the problem.
Ask key questions. To ensure the problem is well-defined, ask:
- Why is this problem important?
- Who does this problem affect?
- What if we don't have the right data?
- When is the project over?
- What if we don't like the results?
Avoid methodology and deliverable focus. Be wary of projects that start with a specific technology or deliverable in mind. Instead, focus on the business problem and then determine the appropriate tools and methods.
2. Data is Encoded Information, Not Just Numbers
In demystifying these complex statistical topics, they have also created a common language that bridges the longstanding communication divide that has — until now — separated data work from business value.
Data vs. Information. Understand the difference between data and information. Data is information that has been encoded for collection and storage; information is the knowledge you recover when you decode and analyze it. Data is the raw material, and information is the result of analysis.
Data Types. Be familiar with different data types:
- Numeric (continuous and count)
- Categorical (ordered and unordered)
- Dates
Data Collection. Understand how data is collected (observational vs. experimental) and structured (structured vs. unstructured). This will help you assess its quality and limitations.
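As a minimal sketch of these data types, consider labeling the fields of a hypothetical dataset (the field names and categories below are illustrative, not from the book):

```python
# Classify hypothetical dataset fields by type.
# Field names here are made up for illustration.
field_types = {
    "daily_revenue": ("numeric", "continuous"),
    "units_sold":    ("numeric", "count"),
    "satisfaction":  ("categorical", "ordered"),    # e.g. low / medium / high
    "region":        ("categorical", "unordered"),
    "order_date":    ("date", None),
}

# Knowing each field's type tells you which summaries and charts apply.
numeric_fields = [f for f, (kind, _) in field_types.items() if kind == "numeric"]
print(numeric_fields)  # ['daily_revenue', 'units_sold']
```

Tagging fields this way up front makes it obvious, for example, that averaging `region` is meaningless while averaging `daily_revenue` is not.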
3. Statistical Thinking Requires Questioning Everything
Statistical thinking is a different way of thinking that is part detective, skeptical, and involves alternate takes on a problem.
Embrace skepticism. Develop a critical mindset and question the data and results you encounter. Don't take numbers at face value. Be especially skeptical of claims that align with your existing beliefs.
Understand variation. Recognize that there is variation in all things. Not every peak and valley needs an explanation. Differentiate between measurement variation and random variation.
Probability and Statistics. Use probability and statistics to manage uncertainty. Understand the difference between probability (reasoning from a known process down to the data it could generate) and statistics (reasoning from observed data back up to the underlying process).
4. Argue with the Data's Origin and Representativeness
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
Data Origin Story. Always ask about the origin of the data. Who collected it? How was it collected? Is it observational or experimental? This will help you assess its reliability and potential biases.
Representativeness. Ensure the data is representative of the population you care about. Ask: Is there sampling bias? How were outliers handled? What data am I not seeing? How were missing values treated?
Measurement. Can the data measure what you want it to measure? Be wary of proxy measures and indirect approximations.
5. Explore Data to Uncover Relationships and Opportunities
Gutman and Goldmeier offer practical advice for asking the right questions, challenging assumptions, and avoiding common pitfalls.
Embrace the exploratory mindset. Approach data analysis with curiosity and a willingness to iterate. Don't follow a rigid script. Be open to discovering new relationships and opportunities.
Ask guiding questions. As you explore the data, ask:
- Can the data answer the question?
- Did you discover any relationships?
- Did you find new opportunities in the data?
Use visualizations. Use histograms, box plots, bar charts, and scatter plots to explore the data and spot anomalies. Verify noteworthy correlations with visualizations.
6. Probabilities Quantify Uncertainty, Challenge Intuition
Many people’s notion of probability is so impoverished that it admits of only two values: 50-50 and 99%, tossup or essentially certain.
Probability vs. Intuition. Recognize that your intuition can play tricks on you. Don't underestimate variation, especially when dealing with small numbers.
Rules of the Game. Understand the basic rules of probability:
- Probabilities range from 0 to 1.
- The sum of all possible outcomes must equal 1.
- The chance of any two events happening together cannot be greater than either event happening by itself.
Conditional Probability. Know that all probabilities are conditional. Be careful assuming independence. Don't fall for the gambler's fallacy.
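The rules above, plus conditional probability, can be checked on the simplest possible example, a fair six-sided die (a sketch for illustration, not from the book):

```python
from fractions import Fraction

# A fair six-sided die: every outcome has probability 1/6.
outcomes = range(1, 7)
p = {o: Fraction(1, 6) for o in outcomes}

# Rule 1: every probability is between 0 and 1.
assert all(0 <= p[o] <= 1 for o in outcomes)
# Rule 2: the probabilities of all possible outcomes sum to 1.
assert sum(p.values()) == 1

even = {2, 4, 6}
low = {1, 2, 3}
p_even = sum(p[o] for o in even)   # 1/2
p_even_and_low = p[2]              # only the outcome 2 is both even and low
# Rule 3: P(A and B) can never exceed P(A) or P(B) alone.
assert p_even_and_low <= p_even

# Conditional probability: P(even | low) = P(even and low) / P(low)
p_low = sum(p[o] for o in low)
p_even_given_low = p_even_and_low / p_low
print(p_even_given_low)  # 1/3
```

Note how conditioning changes the answer: unconditionally, P(even) = 1/2, but once you know the roll was 3 or less, P(even) drops to 1/3.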
7. Challenge Statistics by Understanding Inference
The most clear, concise, and practical characterization of working in corporate analytics that I’ve seen.
Statistical Inference. Understand the process of statistical inference:
- Ask a meaningful question.
- Formulate a hypothesis test.
- Establish a significance level.
- Calculate a p-value.
- Calculate confidence intervals.
- Reject or fail to reject the null hypothesis.
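The steps above can be sketched end to end with a simulation-based test (the coin-flip numbers are made up; a textbook analysis would use the binomial distribution directly, but simulating the null hypothesis shows the logic of a p-value plainly):

```python
import random

# H0 (null hypothesis): the coin is fair (P(heads) = 0.5).
# Observed: 62 heads in 100 flips. Significance level alpha = 0.05.
random.seed(0)
observed_heads = 62
n_flips, n_sims, alpha = 100, 10_000, 0.05

# Simulate a world where H0 is true, many times, and count how often
# chance alone is at least as extreme as what we observed (two-sided).
extreme = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if abs(heads - 50) >= abs(observed_heads - 50):
        extreme += 1

p_value = extreme / n_sims
print(f"p-value ~ {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The p-value here comes out around 0.02: if the coin really were fair, a result this lopsided would occur only about 2% of the time, so at the 0.05 level we reject the null hypothesis.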
Key Questions. Ask these questions to challenge the statistics:
- What is the context for these statistics?
- What is the sample size?
- What are you testing?
- What is the null hypothesis?
- What is the significance level?
- How many tests are you doing?
- Can I see the confidence intervals?
- Is this practically significant?
- Are you assuming causality?
Decision Errors. Balance decision errors (false positives and false negatives).
8. Unsupervised Learning Reveals Hidden Groups
Becoming a Data Head raises the level of education and knowledge in an industry desperate for clarity in thinking.
Unsupervised Learning. Understand the goal of unsupervised learning: to discover hidden patterns and groups in datasets without predefined labels.
Dimensionality Reduction. Learn about dimensionality reduction and principal component analysis (PCA). PCA creates composite features that capture the most variance in the data.
Clustering. Understand clustering and k-means clustering. K-means groups similar observations together based on a distance metric.
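A minimal k-means sketch makes the alternating steps concrete: assign each point to its nearest center, then move each center to the mean of its assigned points (the 1-D toy data is made up for illustration):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal 1-D k-means: alternate between assigning points to the
    nearest center and moving each center to its cluster's mean."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two obvious groups in the data; k-means should find centers near 1 and 10.
data = [1.0, 1.2, 0.8, 9.8, 10.1, 10.3]
print(kmeans(data, k=2))
```

Real implementations (e.g. scikit-learn's `KMeans`) add multiple restarts and convergence checks, but the core loop is exactly this alternation.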
9. Regression Models Explain and Predict Relationships
Gutman and Goldmeier have written a book that is as useful for applied statisticians and data scientists as it is for business leaders and technical professionals.
Supervised Learning. Understand the goal of supervised learning: to find relationships in data with inputs and known outputs.
Regression Models. Learn about linear regression and its goal: to find the line of best fit that minimizes the sum of squared errors.
Multiple Regression. Extend linear regression to multiple features. Understand the importance of coefficients and p-values.
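For simple (one-feature) linear regression, the least-squares line has a closed form worth seeing once; a sketch with toy data:

```python
# Least-squares formulas for simple linear regression:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Toy data lying exactly on the line y = 2x + 1:
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Multiple regression generalizes this to several features at once; the fitted coefficients then describe each feature's relationship with the output holding the other features fixed.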
10. Classification Models Predict Categories
THE book that business and technology leaders need to read to fully understand the potential, power, AND limitations of data science.
Classification Models. Understand the goal of classification models: to predict a categorical variable (label).
Logistic Regression. Learn about logistic regression and its ability to predict probabilities.
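The mechanism is the sigmoid function, which squashes a linear score (the log-odds) into a probability between 0 and 1. A sketch with a made-up model (the coefficients and the churn scenario are illustrative, not from the book):

```python
import math

def sigmoid(z):
    """Map a log-odds score z to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Hypothetical fitted model: log-odds of churn = -1.5 + 0.8 * support_tickets
for tickets in (0, 2, 5):
    p = sigmoid(-1.5 + 0.8 * tickets)
    print(f"{tickets} tickets -> P(churn) = {p:.2f}")
```

Because the output is a probability rather than a hard label, you can pick the decision threshold that matches your tolerance for false positives versus false negatives.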
Decision Trees. Understand decision trees and their ability to create a flowchart of rules.
Ensemble Methods. Learn about ensemble methods (random forests and gradient boosted trees) and their ability to improve prediction accuracy.
11. Text Analytics Transforms Words into Insights
Text Analytics. Understand the goal of text analytics: to extract useful insights from raw text.
Bag of Words. Learn about the bag-of-words model and its limitations.
N-grams. Understand N-grams and their ability to capture context.
Word Embeddings. Learn about word embeddings and their ability to represent words as vectors.
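The first two representations can be sketched in a few lines; the example sentence is chosen to show the bag-of-words limitation that bigrams partially fix:

```python
from collections import Counter

# Bag of words: count word occurrences and discard order entirely.
def bag_of_words(text):
    return Counter(text.lower().split())

# N-grams (here bigrams): keep runs of n adjacent words, preserving
# a little local context that a bag of words throws away.
def ngrams(text, n=2):
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

text = "not good at all"
print(bag_of_words(text))
print(ngrams(text))  # the bigram ('not', 'good') preserves the negation
```

A bag of words sees "not good at all" and "good" as sharing the key word `good`; the bigram `('not', 'good')` is what lets a model tell them apart.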
12. Deep Learning Mimics the Brain for Complex Tasks
What is keeping data science from reaching its true potential? It is not slow algorithms, lack of data, lack of computing power, or even lack of data scientists.
Neural Networks. Understand the basic structure of neural networks: neurons, activation functions, and layers.
Deep Learning. Learn about deep learning and its ability to automate feature engineering.
Convolutional Neural Networks. Understand convolutional neural networks and their application to image analysis.
Review Summary
Becoming a Data Head is highly praised for its accessible introduction to data science concepts. Readers appreciate its clear explanations of complex topics, making it valuable for both beginners and experienced professionals. The book covers a wide range of subjects, from basic statistics to machine learning and AI. Many reviewers found it helpful for understanding data-driven decision-making in business contexts. While some felt it was too basic, most agreed it provides a solid foundation for anyone looking to enhance their data literacy.