Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

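Key Takeaways

1. Distributed systems face unique challenges caused by unreliable networks

If you are not used to distributed systems, the consequences of these problems can be profoundly disorienting.

Network uncertainty. Distributed systems operate in an environment where network faults, delays, and partitions occur frequently. Unlike in a single-node system, there is no guarantee that a message you send will arrive, or when it will arrive. To cope with this uncertainty, distributed systems must be designed with fault tolerance and resilience in mind.

Partial failures. In a distributed system, some components can fail while others keep working. Partial failure is specific to distributed systems and makes both design and operation considerably more complex. Developers have to account for scenarios such as:

A node becomes unreachable because of a network problem
Messages are lost or delayed
Some nodes process requests while others are down

Consistency challenges. Because a distributed system has no shared memory or global state, keeping nodes consistent with one another is hard. Each node has its own local view of the system, which can diverge from other nodes' views or become stale.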

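2. Clocks and time synchronization are problematic in distributed environments

There is no such thing as a globally accurate time that a distributed system can rely on.

Clock drift. Physical clocks on different machines inevitably drift apart over time. Even with regular synchronization attempts, accurate time across an entire distributed system cannot be guaranteed. This drift causes problems such as:

Difficulty ordering distributed transactions
Inconsistent event timestamps
Difficulty determining causality

Limits of synchronization. Protocols such as NTP (Network Time Protocol) try to synchronize clocks, but because they are themselves subject to network delay, perfect synchronization is impossible. As a result of this uncertainty:

Timestamps from different machines cannot be compared directly
Time-based operations (for example, distributed locks) must account for clock skew
Algorithms that assume precise timing can fail in unexpected ways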
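
To make the timestamp problem concrete, here is a minimal sketch of two writes resolved by comparing wall-clock timestamps. The `Write` class, the 50 ms offset, and the values are all hypothetical, chosen only for illustration; a real database never sees `real_time_ms`, only the writer's local clock reading.

```python
from dataclasses import dataclass


@dataclass
class Write:
    value: str
    real_time_ms: int     # when the write actually happened (unknowable in practice)
    clock_offset_ms: int  # how far the writer's clock is from true time (assumed)

    @property
    def timestamp_ms(self) -> int:
        # The only thing the database sees: the writer's local clock reading.
        return self.real_time_ms + self.clock_offset_ms


def last_write_wins(writes):
    # Pick the write with the highest local timestamp.
    return max(writes, key=lambda w: w.timestamp_ms)


first = Write("x=1", real_time_ms=100, clock_offset_ms=50)  # node A, clock 50 ms fast
second = Write("x=2", real_time_ms=120, clock_offset_ms=0)  # node B, accurate clock
winner = last_write_wins([first, second])
print(winner.value)  # x=1, even though x=2 was written later
```

The later write is silently discarded because node A's clock runs ahead, which is why timestamps taken on different machines should not be compared as if they came from one clock.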

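Logical clocks as an alternative. To address these problems, distributed systems often use logical clocks and partial orderings instead of relying on physical clocks. Lamport timestamps and vector clocks are typical examples; they make it possible to order events consistently without synchronized physical clocks.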
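
A Lamport timestamp can be sketched in a few lines. The class and method names below are illustrative assumptions, not taken from the book; the sketch shows only the two rules that matter: increment on every local event, and on receipt jump past the sender's counter.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock: a counter that only moves forward."""
    time: int = 0

    def tick(self) -> int:
        # Local event or message send: advance the counter.
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        # On receipt, move past everything the sender had seen.
        self.time = max(self.time, remote_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
sent = a.tick()             # node A sends a message stamped 1
b.tick()                    # unrelated local event on node B
received = b.receive(sent)  # B's clock jumps to 2
print(sent, received)       # the receive is always ordered after the send
```

Because a receive always gets a larger number than the matching send, causally related events are ordered correctly without any reference to wall-clock time. Two events with unrelated counters may still be concurrent, which is the case vector clocks are designed to detect.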

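3. Consensus is essential in distributed systems but hard to achieve

Discussions of these systems border on the philosophical: how do we know what is true and what is false in a system?

The challenge of agreement. Getting distributed nodes to reach consensus is a fundamental problem in distributed systems. It is essential for electing a leader, deciding the order of operations, and keeping replica state consistent, but network delays, node failures, and conflicting information make it difficult.

Implications of the CAP theorem. The CAP theorem states that when a network partition occurs, a distributed system must choose between consistency and availability. This fundamental trade-off shapes the design of consensus algorithms and distributed databases. A system has to decide whether to:

Prioritize strong consistency and sacrifice availability, or
Prioritize availability and accept weaker consistency

Consensus algorithms. Various algorithms have been developed to address the consensus problem, including Paxos, Raft, and Zab (used by ZooKeeper). Each makes different trade-offs in complexity, performance, and fault tolerance.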

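4. Distributed transactions require careful design to maintain consistency

ACID transactions are not a law of nature; they were created to simplify the programming model for applications that access a database.

ACID properties. Distributed transactions try to maintain ACID (Atomicity, Consistency, Isolation, Durability) across multiple nodes. This is hard for the following reasons:

Atomicity requires that the transaction either succeeds on all nodes or fails on all of them
Consistency must be preserved even during network partitions
Isolation requires coordination to prevent conflicting operations
Durability must be guaranteed across multiple nodes, any of which can fail

Two-phase commit. Two-phase commit (2PC) is a commonly used protocol for distributed transactions. It proceeds as follows:

Prepare phase: the coordinator asks every participant whether it can commit
Commit phase: if all participants agree, the coordinator tells them to commit; otherwise it tells them to abort
However, 2PC has limitations, most notably that participants can block if the coordinator fails.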
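
The two phases can be sketched as follows. This is a simplified in-memory illustration under stated assumptions: `Participant` and `two_phase_commit` are hypothetical names, there is no network, and nothing is written to durable storage, so coordinator crashes (the blocking case mentioned above) are not modeled.

```python
class Participant:
    """Hypothetical in-memory participant; a real one would persist its vote."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        # Vote yes only if this node can guarantee a later commit will succeed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> bool:
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2 (commit or abort): apply the same decision everywhere.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


happy = [Participant("orders"), Participant("payments")]
unhappy = [Participant("orders"), Participant("payments", can_commit=False)]
happy_result = two_phase_commit(happy)
unhappy_result = two_phase_commit(unhappy)
print(happy_result, [p.state for p in happy])
print(unhappy_result, [p.state for p in unhappy])
```

In a real deployment, a participant that has voted yes must wait for the coordinator's decision before it can release its locks. If the coordinator crashes between the two phases, those participants are stuck in the prepared state, which is the blocking behavior described above.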

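Alternative approaches. To work around the limits of strict ACID transactions in distributed systems, models such as the following have been proposed:

The Saga pattern for long-running transactions
The BASE model (Basically Available, Soft state, Eventual consistency)
Compensating transactions that undo completed work when a failure occurs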
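
As a rough illustration of how a saga combines local steps with compensating transactions, the sketch below runs a list of steps and, on failure, undoes the completed ones in reverse order. The `run_saga` helper and the order-processing steps are hypothetical, and the sketch assumes compensations themselves never fail.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order using the compensating actions.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def fail_payment():
    raise RuntimeError("payment declined")


steps = [
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("book shipping"), lambda: log.append("cancel shipping")),
    (fail_payment, lambda: log.append("refund payment")),
]
result = run_saga(steps)
print(result, log)
```

Unlike an atomic transaction, intermediate states are visible to other readers while the saga runs: stock really is reserved for a while before it is released. That is the price paid for not holding locks across services.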

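5. Replication strategies balance data availability and consistency

There are several approaches to replication, each with important trade-offs.

Replication models. Distributed systems use various replication strategies to improve availability and performance:

Single-leader replication
Multi-leader replication
Leaderless replication

Each model has different characteristics in terms of consistency, availability, and latency.

Consistency levels. Replication brings the challenge of keeping copies consistent. Many systems offer several consistency levels:

Strong consistency: reads behave as if all replicas were always in sync
Eventual consistency: replicas converge over time
Causal consistency: causal relationships between operations are preserved

Conflict resolution. When multiple copies can be updated independently, conflicts can occur. Ways of resolving them include:

Last write wins, based on timestamps
Version vectors that track update history
Application-specific merge functions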
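
The following sketch shows how version vectors distinguish "one update supersedes the other" from "the updates are concurrent and need merging". The function names and the dictionary representation (node id mapped to a counter) are illustrative assumptions.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors that map node id -> update counter."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither has seen the other's update: a real conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a: dict, b: dict) -> dict:
    # After resolving a conflict, the merged value carries the element-wise maximum.
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 1}))  # a_newer
print(compare({"n1": 2}, {"n2": 1}))                    # concurrent
print(merge({"n1": 2}, {"n2": 1}))
```

Only the "concurrent" case needs an application-specific merge function. Last write wins would instead pick one of the two values by timestamp and drop the other.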

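6. Partitioning data enables scalability but adds complexity

The main reason for partitioning data is scalability.

Partitioning strategies. Data can be split across nodes in the following ways:

Range partitioning: split by ranges of keys
Hash partitioning: distribute keys using a hash function
Directory-based partitioning: a separate service tracks where data lives

Each affects data distribution, query performance, and the flexibility of the system.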
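
A minimal sketch of the first two strategies is shown below. The boundary keys, the partition count, and the function names are assumptions made for illustration; SHA-256 is used only because it gives a stable hash across processes, not because any particular database uses it.

```python
import bisect
import hashlib

# Assumed split points: partition 0 holds keys < "h", partition 1 holds
# "h" <= key < "p", and partition 2 holds keys >= "p".
RANGE_BOUNDARIES = ["h", "p"]


def range_partition(key: str) -> int:
    # Adjacent keys land in the same partition, so range scans stay local.
    return bisect.bisect_right(RANGE_BOUNDARIES, key)


def hash_partition(key: str, num_partitions: int) -> int:
    # Hashing spreads keys evenly but destroys their sort order.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


for key in ["apple", "kiwi", "zebra"]:
    print(key, range_partition(key), hash_partition(key, 4))
```

Range partitioning keeps range queries efficient but can create hot spots when many writes target neighboring keys. Hash partitioning evens out the load at the cost of turning every range query into a query against all partitions.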

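Rebalancing challenges. As a system grows or shrinks, data has to be redistributed (rebalanced). Rebalancing should:

Minimize the amount of data moved
Keep data evenly distributed
Avoid disrupting the running system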
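
One way to keep data movement small is to create many more partitions than nodes and move whole partitions when a node joins. The sketch below assumes that approach; `add_node`, the node names, and the choice of 12 partitions are hypothetical.

```python
from typing import Dict


def add_node(assignment: Dict[int, str], new_node: str) -> Dict[int, str]:
    """Move just enough partitions to the new node to even out the load."""
    nodes = set(assignment.values()) | {new_node}
    target = len(assignment) // len(nodes)  # fair share per node, rounded down
    counts = {node: 0 for node in nodes}
    for owner in assignment.values():
        counts[owner] += 1
    result = dict(assignment)
    for partition in sorted(result):
        if counts[new_node] >= target:
            break
        owner = result[partition]
        if counts[owner] > target:
            # Take one partition from a node that holds more than its share.
            result[partition] = new_node
            counts[owner] -= 1
            counts[new_node] += 1
    return result


before = {p: f"node{p % 3}" for p in range(12)}  # 12 partitions on 3 nodes
after = add_node(before, "node3")
moved = [p for p in before if before[p] != after[p]]
print(len(moved), "of", len(before), "partitions moved")
```

Only the partitions handed to the new node change owner; the mapping from key to partition never changes. By contrast, assigning keys with `hash(key) % node_count` would relocate most keys whenever the node count changes.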

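Secondary indexes. Partitioning complicates secondary indexes. The options are:

Partitioning the index by document
Partitioning the index by term
These have different effects on write performance and on what read queries can do efficiently.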

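7. Fault tolerance is essential but requires careful design

Developing distributed systems is fundamentally different from writing software for a single computer, and the biggest difference is that there are many new and unexpected ways for things to go wrong.

Failure modes. Distributed systems have to handle a variety of failures:

Node crashes
Network partitions
Byzantine faults (nodes that malfunction or behave maliciously)

Fault-tolerant design means anticipating these failures and mitigating them.

Redundancy and replication. The main fault-tolerance strategies are:

Replicating data across multiple nodes
Using redundant components (for example, multiple network paths)
Implementing failover mechanisms
Redundancy alone is not enough, however; failures also have to be detected and handled appropriately.

Graceful degradation. A well-designed distributed system keeps working despite partial failures, in some cases with reduced functionality. This involves:

Isolating failures to prevent cascading failures
Prioritizing critical functionality
Communicating system status clearly to users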

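8. Consistency models offer trade-offs between correctness and performance

Linearizability is a very strong consistency model that comes up again and again in distributed systems.

The consistency spectrum. Distributed systems offer consistency models ranging from strong to weak:

Linearizability: the strongest of these; every operation appears to take effect atomically at a single point in time
Sequential consistency: preserves the order of each client's operations
Causal consistency: preserves causal relationships between operations
Eventual consistency: the weakest; guarantees only that replicas converge over time

Stronger models behave more intuitively but often come with higher latency and lower availability.

Impact of the CAP theorem. The choice of consistency model is constrained by the CAP theorem:

Strong consistency models limit availability during network partitions
Weaker models improve availability but may allow inconsistencies

Application considerations. The right consistency model depends on the use case:

Financial systems often need strong consistency
Social media can often tolerate eventual consistency
Some systems use different consistency levels for different operations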

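9. Distributed system design must account for partial failures

In distributed systems, the goal is to build tolerance of partial failures into the software, so that the system as a whole can keep working even when some of its components fail.

Failure detection. Detecting failures in a distributed system is hard because of network uncertainty. Common techniques include:

Heartbeat mechanisms
Gossip protocols
Phi-accrual failure detectors
However, it is often impossible to tell a failed node apart from a network partition.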
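
A heartbeat mechanism can be sketched as a table of last-seen times plus a timeout. The class below is a hypothetical illustration; the current time is passed in explicitly so the example is deterministic, and the 5-second timeout is an arbitrary assumption.

```python
from typing import Dict, List


class HeartbeatDetector:
    """Suspects a node once no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time of the most recent heartbeat from this node.
        self.last_seen[node] = now

    def suspected(self, now: float) -> List[str]:
        # A silent node may be dead, slow, or merely unreachable from here.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


detector = HeartbeatDetector(timeout=5.0)
detector.heartbeat("a", now=0.0)
detector.heartbeat("b", now=0.0)
detector.heartbeat("a", now=4.0)
print(detector.suspected(now=6.0))  # ['b']
```

The detector can only report suspicion. From its point of view, a crashed node, an overloaded node, and a node behind a network partition all look the same: the heartbeats stop arriving.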

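Handling failures. Once a failure is detected, the system has to respond appropriately, for example by:

Electing a new leader
Rerouting requests
Starting a recovery process

The goal is to maintain availability and consistency even under partial failure.

Design principles. Basic principles for building robust distributed systems are:

Design on the assumption that failures will happen
Use timeouts and retries, but understand their limits
Implement circuit breakers to prevent cascading failures
Make operations idempotent so that duplicate requests are handled safely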
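
The interaction between retries and idempotence is easiest to see in a small example. Everything below is a hypothetical sketch: `PaymentService`, the idempotency key `"order-42"`, and the simulated lost response are assumptions made for illustration.

```python
class PaymentService:
    """Hypothetical service that deduplicates requests by idempotency key."""

    def __init__(self) -> None:
        self.balance = 100
        self.processed = {}

    def charge(self, idempotency_key: str, amount: int) -> int:
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]
        self.balance -= amount
        self.processed[idempotency_key] = self.balance
        return self.balance


def call_with_retries(operation, attempts: int):
    # Assumes attempts >= 1; retries only on timeouts.
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except TimeoutError as exc:
            last_error = exc
    raise last_error


service = PaymentService()
calls = {"n": 0}


def flaky_charge():
    calls["n"] += 1
    result = service.charge("order-42", 30)
    if calls["n"] == 1:
        # The charge was applied, but the reply never reached the client.
        raise TimeoutError("response lost")
    return result


final_balance = call_with_retries(flaky_charge, attempts=3)
print(final_balance, service.balance, calls["n"])
```

The client cannot tell whether the first request was lost before or after it took effect, so it has to retry. The idempotency key is what makes that retry safe: the balance drops by 30 once, not twice.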

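This summary outlined the fundamental challenges and principles of distributed systems. It focused on the difficulties specific to them: unreliable networks, the problems of time synchronization, and the need for consensus. It discussed keeping distributed transactions consistent, balancing replication and partitioning of data, and the importance of fault-tolerant design. It also covered the trade-offs involved in choosing a consistency model and the need to design for partial failure. Throughout, both the philosophical and the practical sides of distributed systems come into view.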

Last updated: January 23, 2025

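Review Summary

4.70 / 5

Average of 10,000+ ratings from Goodreads and Amazon.

Designing Data-Intensive Applications is widely regarded as essential reading for software engineers and developers. The book covers data storage, distributed systems, and modern database concepts comprehensively, and readers particularly praise its clear explanations, practical examples, and insightful diagrams. Many readers see it as a mini-encyclopedia of data engineering that is useful to beginners and experienced professionals alike. Some find parts of it dense or overly academic, but most agree that it lays the foundation needed to understand complex data systems and architectures.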

Frequently Asked Questions

What's Designing Data-Intensive Applications about?

Focus on Data Systems: The book explores the principles and practices behind building reliable, scalable, and maintainable data-intensive applications. It covers various architectures, data models, and the trade-offs involved in designing these systems.
Enduring Principles: Despite rapid technological changes, the book emphasizes fundamental principles that remain constant across different systems, equipping readers to make informed decisions about data architecture.
Real-World Examples: Martin Kleppmann uses examples from successful data systems to illustrate key concepts, making complex ideas more accessible through practical applications.

Why should I read Designing Data-Intensive Applications?

Comprehensive Overview: The book provides a thorough examination of data systems, making it suitable for software engineers, architects, and technical managers. It covers a wide range of topics, from storage engines to distributed systems.
Improved Decision-Making: By understanding the trade-offs of various technologies, readers can make better architectural decisions for their applications, crucial for meeting performance and reliability requirements.
Curiosity and Insight: For those curious about how data systems work, the book offers deep insights into the internals of databases and data processing systems, encouraging critical thinking about application design.

What are the key takeaways of Designing Data-Intensive Applications?

Reliability, Scalability, Maintainability: The book emphasizes these three principles as essential for building robust data-intensive applications.
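Understanding Trade-offs: It highlights the importance of understanding trade-offs in system design. One example is the CAP theorem, which is often summarized as "pick two out of consistency, availability, and partition tolerance"; the book treats that framing as misleading, because network partitions are faults that happen whether or not you choose them, so the real choice during a partition is between consistency and availability.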
Data Models and Replication: The choice of data model significantly impacts application performance, and the book discusses various replication strategies and their implications for consistency.

What are the best quotes from Designing Data-Intensive Applications and what do they mean?

"Technology is a powerful force in our society.": This quote underscores the dual nature of technology, serving as a reminder of the ethical responsibilities in building data systems.
"The truth is the log. The database is a cache of a subset of the log.": This encapsulates the idea of event sourcing, where the log of events is the authoritative source, and the database provides a read-optimized view.
"If you understand those principles, you’re in a position to see where each tool fits in.": Highlights the importance of grasping fundamental principles to effectively utilize various technologies.

How does Designing Data-Intensive Applications define reliability, scalability, and maintainability?

Reliability: Refers to the system's ability to function correctly even in the face of faults, involving design strategies to tolerate hardware failures, software bugs, and human errors.
Scalability: Concerns how well a system can handle increased load, requiring strategies like partitioning and replication to cope with growth in data volume, traffic, or complexity.
Maintainability: Focuses on how easily a system can be modified and updated over time, emphasizing simplicity, operability, and evolvability for productive team work.

What is the CAP theorem in Designing Data-Intensive Applications?

Consistency, Availability, Partition Tolerance: The CAP theorem states that in a distributed data store, it is impossible to simultaneously guarantee all three properties.
Trade-offs in Design: Emphasizes the trade-offs system designers must make, such as sacrificing availability during network failures to prioritize consistency and partition tolerance.
Historical Context: Introduced by Eric Brewer in 2000, the theorem has significantly influenced the design of distributed systems.

How does Designing Data-Intensive Applications explain data models and query languages?

Data Models: Compares various data models, including relational, document, and graph models, each with strengths and weaknesses, crucial for selecting the right one based on application needs.
Query Languages: Discusses different query languages like SQL for relational databases and those for NoSQL systems, essential for effectively interacting with data.
Use Cases: Emphasizes that different applications have different requirements, guiding informed decisions about data architecture.

What are the different replication methods in Designing Data-Intensive Applications?

Single-Leader Replication: Involves one node as the leader processing all writes and replicating changes to followers, common but can lead to bottlenecks.
Multi-Leader Replication: Allows multiple nodes to accept writes, improving flexibility and availability but introducing complexities in conflict resolution.
Leaderless Replication: Any node can accept writes, improving availability but requiring careful management of consistency.

How does Designing Data-Intensive Applications address schema evolution?

Schema Changes: Discusses the inevitability of application changes requiring corresponding data schema changes, emphasizing backward and forward compatibility.
Encoding Formats: Explores various encoding formats like JSON, XML, and binary formats, highlighting trade-offs associated with each for schema evolution.
Practical Strategies: Provides advice on handling schema changes in real-world applications, ensuring old and new data versions can coexist without issues.
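
As a small illustration of the two compatibility directions, the sketch below uses JSON with one added optional field. The `decode_v1` and `decode_v2` functions and the `email` field are hypothetical; they stand in for an old and a new version of the same application.

```python
import json


def decode_v1(raw: str) -> dict:
    # Old reader: knows only "name" and ignores any fields it does not recognize.
    record = json.loads(raw)
    return {"name": record["name"]}


def decode_v2(raw: str) -> dict:
    # New reader: "email" was added later, so it must be optional with a default.
    record = json.loads(raw)
    return {"name": record["name"], "email": record.get("email")}


old_data = '{"name": "Ada"}'
new_data = '{"name": "Ada", "email": "ada@example.com"}'

print(decode_v2(old_data))  # backward compatibility: new code reads old data
print(decode_v1(new_data))  # forward compatibility: old code reads new data
```

Both directions matter during a rolling upgrade, when old and new versions of the code run side by side and read each other's records.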

What is the significance of event sourcing in Designing Data-Intensive Applications?

Immutable Event Log: Involves storing all changes as an immutable log of events, allowing easy reconstruction of the current state by replaying the log.
Separation of Concerns: Enables multiple views of data from the same log, allowing for easier application evolution over time.
Auditability and Recovery: Provides a clear audit trail of changes, simplifying recovery from errors by rebuilding the state from the event log.
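
Replaying a log to rebuild state can be sketched as a fold over the events. The event names and the bank-balance example are hypothetical and chosen only to keep the example short.

```python
from functools import reduce

# Append-only log of (event type, amount); existing entries are never modified.
events = [("deposited", 100), ("withdrawn", 30), ("deposited", 5)]


def apply_event(balance: int, event) -> int:
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event type: {kind}")


def replay(log, initial: int = 0) -> int:
    # Current state is derived data: fold the events over an initial state.
    return reduce(apply_event, log, initial)


print(replay(events))      # 75
print(replay(events[:2]))  # 70, the state as of the second event
```

Because the state is derived from the log, a bug in `apply_event` can be fixed and the state rebuilt by replaying. Replaying a prefix of the log shows the state at an earlier point in time.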

How does Designing Data-Intensive Applications propose handling network partitions?

Network Faults: Explains that network partitions can lead to inconsistencies across replicas, complicating distributed system design.
Handling Partitions: Discusses the CAP theorem as a way of framing the choice: because partitions cannot be avoided, a system has to give up either consistency or availability while one is in progress.
Practical Implications: Emphasizes designing systems that tolerate network faults and continue operating effectively.

What are the ethical considerations in Designing Data-Intensive Applications?

Responsibility of Engineers: Stresses the ethical implications of data collection and usage, including awareness of potential biases and discrimination in algorithms.
Impact of Predictive Analytics: Discusses risks associated with predictive analytics, urging careful consideration of data-driven decisions and their consequences.
Surveillance Concerns: Raises concerns about surveillance capabilities, advocating for user privacy, transparency, and control over personal data.

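About the Author

Martin Kleppmann is a well-known expert in distributed systems and data engineering. He is widely recognized for his work on Apache Samza and his contributions to LinkedIn's data infrastructure. His deep knowledge of databases, message brokers, and data processing systems shows throughout the book. His writing is clear, and he is praised for explaining complex concepts in an accessible way. With experience in both academia and industry, he connects theoretical concepts with practical application. His work has had a significant influence on the field of data-intensive applications and distributed systems.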
