Database Systems: ACID Properties, CAP Theorem, and Consistency Mechanisms
This guide organizes database systems from traditional relational databases to highly distributed NoSQL systems, focusing on their consistency models, ACID compliance, CAP theorem trade-offs, and reconciliation mechanisms.
Summary of popular databases:
MySQL/PostgreSQL: Strong consistency (ACID), structured data, transactions.
Cassandra: High availability, eventual consistency, write-heavy, distributed.
Redis: Caching, session management, real-time, high performance.
DynamoDB: Scalable, fully managed, eventually consistent reads by default with optional strongly consistent reads.
CosmosDB: Multi-model, tunable consistency.
Let's break them down further and use the CAP theorem to understand their use cases.
Traditional Relational Databases
MySQL/MariaDB
Scale: Single node to moderate clustering
ACID Compliance: Full
CAP Focus: CA (Consistency and Availability with limited Partition tolerance)
Consistency Mechanisms:
Two-Phase Commit (2PC) for distributed transactions
MVCC (Multi-Version Concurrency Control)
Row-level and table-level locking
Reconciliation: Primarily master-slave replication with binary logs
Use Cases:
E-commerce platforms requiring strict consistency (inventory, orders)
Financial applications with transaction requirements
Content Management Systems (WordPress, Drupal)
Unique: Online gaming leaderboards and state tracking where high write performance is needed
Unique: Legacy application migration path (wide tooling ecosystem)
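The two-phase commit listed among MySQL's consistency mechanisms can be sketched in a few lines. This is a minimal in-memory illustration of the protocol's two phases, not MySQL's XA implementation; the `Participant` class and `two_phase_commit` function are hypothetical names, and coordinator failure and recovery are left out.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical resource manager taking part in a distributed transaction."""
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work could be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> bool:
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1 (voting): every participant is asked to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (completion): commit only on a unanimous yes vote.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

A single "no" vote in phase 1 forces every participant to roll back, which is why the protocol preserves atomicity at the cost of blocking while votes are collected.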
PostgreSQL
Scale: Single node to moderate clustering
ACID Compliance: Full
CAP Focus: CA (Consistency and Availability)
Consistency Mechanisms:
MVCC for concurrent transactions
Triggers and constraints
Write-Ahead Logging (WAL)
Reconciliation:
Logical replication with write-ahead logs
BDR (Bi-Directional Replication) for multi-master setups
Exclusion constraints to reject conflicting rows (for example, overlapping ranges)
pglogical extension with conflict handling
Use Cases:
GIS applications (PostGIS extension)
Complex data models with referential integrity requirements
Analytics workloads requiring window functions and CTEs
Unique: Time-series data with specialized extensions (TimescaleDB)
Unique: Full-text search applications that also need ACID guarantees
Unique: Heterogeneous data storage with JSON/JSONB support while maintaining relational integrity
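The MVCC idea listed under PostgreSQL's consistency mechanisms is that writers append new versions instead of overwriting, so a reader keeps seeing the snapshot it started with. The sketch below is a toy model under simplifying assumptions (a single global commit counter, no aborts, no garbage collection); `MVCCStore` is a hypothetical name and does not mirror PostgreSQL's internal tuple format.

```python
class MVCCStore:
    """Toy multi-version store: each write appends a version stamped with a commit number."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), in ascending commit order
        self._clock = 0      # last commit number handed out

    def write(self, key, value):
        # A write never overwrites; it adds a newer version.
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        # A reader remembers the latest commit number at the time it starts.
        return self._clock

    def read(self, key, snapshot_ts):
        # Return the newest version that was committed at or before the snapshot.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None
```

A reader that took its snapshot before a later write still sees the older value, without blocking the writer.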
Distributed SQL Databases
CockroachDB
Scale: Global distribution with horizontal scaling
ACID Compliance: Full
CAP Focus: CP (Consistency and Partition tolerance)
Consistency Mechanisms:
Raft consensus algorithm
MVCC and timestamp ordering
Reconciliation:
Distributed transactions using the Raft consensus protocol
Automated range rebalancing
Use Cases:
Global SaaS applications requiring low latency in multiple regions
Financial systems requiring distributed ACID guarantees
Microservices architectures with geo-distribution
Unique: Multi-region deployment with regulatory data residency requirements
Unique: Applications requiring PostgreSQL compatibility with global scale
Unique: Hybrid cloud deployments spanning multiple cloud providers
Google Spanner
Scale: Global distribution with horizontal scaling
ACID Compliance: Full
CAP Focus: CP (with very high availability in practice)
Consistency Mechanisms:
Paxos consensus algorithm
TrueTime API (GPS and atomic clocks)
Two-phase locking
Reconciliation:
Paxos-based synchronous replication
External consistency via TrueTime
Use Cases:
Global financial trading platforms
Multi-region payment processing systems
Mission-critical applications requiring both scale and consistency
Unique: Applications requiring external consistency guarantees across global datacenters
Unique: Systems requiring both horizontal scale and serializable isolation
Unique: Applications requiring global ACID transactions with bounded staleness guarantees
YugabyteDB
Scale: Globally distributed
ACID Compliance: Full
CAP Focus: CP (Consistency and Partition tolerance)
Consistency Mechanisms:
Raft consensus for tablet replication
DocDB storage engine with MVCC
Reconciliation:
Tablet-based data distribution
Hybrid logical clocks
Use Cases:
Cloud-native microservices requiring high write throughput
Applications requiring both SQL and NoSQL interfaces (YSQL and YCQL)
Multi-region applications with geo-partitioning needs
Unique: Systems requiring PostgreSQL and Cassandra compatibility in a single database
Unique: Multi-cloud deployments with automated failover between providers
Unique: Edge-to-cloud synchronization with geo-partitioning
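The hybrid logical clocks listed under YugabyteDB combine a physical timestamp with a logical counter so that timestamps stay close to wall-clock time while still respecting causality. The sketch below follows the generally published HLC update rules; the class name and the injected `physical_clock` callable are assumptions for illustration and do not reflect YugabyteDB's internal code.

```python
class HybridLogicalClock:
    """Sketch of a hybrid logical clock producing (l, c) timestamps."""

    def __init__(self, physical_clock):
        self._now = physical_clock  # callable returning an integer wall-clock reading
        self.l = 0                  # highest physical time seen so far
        self.c = 0                  # logical counter that breaks ties within the same l

    def tick(self):
        """Timestamp a local or send event."""
        prev = self.l
        self.l = max(prev, self._now())
        self.c = self.c + 1 if self.l == prev else 0
        return (self.l, self.c)

    def update(self, remote):
        """Timestamp a receive event, given the sender's (l, c) pair."""
        remote_l, remote_c = remote
        prev = self.l
        self.l = max(prev, remote_l, self._now())
        if self.l == prev and self.l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif self.l == prev:
            self.c += 1
        elif self.l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0
        return (self.l, self.c)
```

Comparing the resulting `(l, c)` tuples lexicographically gives an order consistent with causality, even when a remote node's physical clock runs ahead of the local one.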
NoSQL - Document Databases
MongoDB
Scale: Horizontal scaling with sharding
ACID Compliance: Single-document operations are atomic; multi-document ACID transactions since v4.0 (replica sets) and v4.2 (sharded clusters), with limitations
CAP Focus: CP or AP (configurable)
Consistency Mechanisms:
Primary-Secondary replication
Read concerns (local, majority, linearizable)
Write concerns (w=1, w=majority, etc.)
Reconciliation:
OpLog (Operation Log) replication
Single primary per replica set accepts all writes, so replicas do not produce conflicting versions
Rollback of writes that were not replicated to a majority before a failover
Election process for automatic failover
Use Cases:
Content management systems
Mobile applications with flexible schema requirements
Real-time analytics with time-series data
Unique: IoT applications with hierarchical data structures
Unique: Product catalogs with complex, nested attributes
Unique: Single-view applications consolidating data from multiple systems
CouchDB
Scale: Horizontal scaling with peer-to-peer replication
ACID Compliance: Document-level only
CAP Focus: AP (Availability and Partition tolerance)
Consistency Mechanisms:
MVCC with revision tracking
Eventually consistent model
Reconciliation:
MVCC with document versioning
Conflict detection with winner selection based on revision history
Use Cases:
Mobile applications requiring offline-first capabilities
Distributed content management systems
Data synchronization across unstable networks
Unique: Offline-first applications with bidirectional sync
Unique: Edge computing with intermittent connectivity
Unique: Peer-to-peer applications with direct device-to-device sync
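The winner selection mentioned under CouchDB's reconciliation is deterministic, so every replica picks the same winning revision without coordination. The sketch below assumes the rule described in CouchDB's replication documentation (the longest revision history wins, with ties broken by comparing revision IDs in ASCII order) and ignores deleted revisions; `pick_winner` is a hypothetical helper, not a CouchDB API.

```python
def pick_winner(revisions):
    """Pick the winning revision from conflicting leaf revisions such as '3-a1b2'."""
    def sort_key(rev):
        # A revision ID has the form "<generation>-<hash>".
        generation, _, digest = rev.partition("-")
        # Higher generation (longer history) wins; the hash breaks ties.
        return (int(generation), digest)

    return max(revisions, key=sort_key)
```

The losing revisions are not discarded in CouchDB; they remain available as conflicts for the application to merge or delete.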
NoSQL - Column Family Stores
Apache Cassandra
Scale: Massive horizontal scaling, globally distributed
ACID Compliance: Row-level only, no multi-row transactions
CAP Focus: AP (Availability and Partition tolerance) with tunable consistency
Consistency Mechanisms:
Tunable consistency levels (ONE, QUORUM, ALL)
Per-write timestamps (wall-clock based, not vector clocks) used to order updates
Gossip protocol for cluster state
Reconciliation:
Column-level merge model that lets concurrent updates to different columns of a row coexist
Timestamp-based resolution (LWW) with column-level granularity
Collection types (sets, lists, maps) merged per element using the same LWW rule
Hinted handoff for missed writes
Read repair during queries
Anti-entropy repair process
Use Cases:
Time-series data for monitoring and IoT
High-throughput logging systems
Product catalogs requiring high write throughput
Unique: Recommendation engines with write-heavy workloads
Unique: Fraud detection systems requiring millisecond responses
Unique: Global messaging platforms with distributed message storage
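The column-level last-write-wins resolution listed under Cassandra's reconciliation can be illustrated with a small merge function. This is a simplified sketch: each replica's copy of a row is modeled as a dict from column name to a `(write_timestamp, value)` pair, tombstones are omitted, and breaking timestamp ties by comparing values is a simplification of Cassandra's actual rules. `merge_rows` is a hypothetical name.

```python
def merge_rows(a, b):
    """Merge two replica copies of a row; each maps column -> (write_timestamp, value)."""
    merged = dict(a)
    for column, cell in b.items():
        # Tuple comparison: the higher timestamp wins, and on a tie the larger
        # value wins, so every replica reaches the same result.
        if column not in merged or cell > merged[column]:
            merged[column] = cell
    return merged
```

Because the merge is commutative, read repair and anti-entropy repair can apply it in any order and replicas still converge.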
HBase
Scale: Massive horizontal scaling
ACID Compliance: Row-level only
CAP Focus: CP (Consistency and Partition tolerance)
Consistency Mechanisms:
Strong consistency for row operations
Write-ahead logging
Uses ZooKeeper for coordination
Reconciliation:
Region servers with leader election
HDFS replication
Use Cases:
Large-scale analytics alongside Hadoop ecosystem
Time-series data for machine log analysis
Real-time access to big data
Unique: Random, real-time read/write access to Big Data
Unique: Storing and processing data from large web crawls
Unique: Column-oriented storage for very wide tables (billions of rows by millions of columns)
ScyllaDB
Scale: Massive horizontal scaling
ACID Compliance: Similar to Cassandra (row-level)
CAP Focus: AP with tunable consistency
Consistency Mechanisms:
Tunable consistency levels
Lightweight transactions
Reconciliation:
Similar to Cassandra with performance improvements
Shard-per-core architecture
Use Cases:
High-throughput time-series applications
Cassandra-compatible applications requiring higher performance
IoT data ingestion pipelines
Unique: Low-latency adtech bidding platforms (sub-millisecond responses)
Unique: Real-time monitoring systems with C++ performance requirements
Unique: Hybrid transactional/analytical processing with low-latency requirements
NoSQL - Key-Value Stores
Redis
Scale: Single master with read replicas, Redis Cluster for horizontal scaling
ACID Compliance: Single-command atomicity; MULTI/EXEC transactions and Lua scripts execute atomically but without rollback
CAP Focus: CP (traditionally) or AP (Redis Cluster)
Consistency Mechanisms:
Single-threaded execution
Asynchronous replication; the WAIT command can block until a given number of replicas acknowledge a write
Reconciliation:
Asynchronous master-replica replication
Last-write-wins conflict resolution
Use Cases:
Caching layer for application performance
Real-time leaderboards and counting
Message broker for pub/sub systems
Unique: Rate limiting and API throttling
Unique: Real-time analytics with time-decay (e.g., trending algorithms)
Unique: Distributed locks and synchronization primitives for microservices
DynamoDB (AWS)
Scale: Massive horizontal scaling, fully managed
ACID Compliance: Item-level atomicity, transactions supported across items
CAP Focus: AP with strong consistency options
Consistency Mechanisms:
Eventually consistent reads (default)
Strongly consistent reads (option)
Transactions with optimistic concurrency
Reconciliation:
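Automatic replication across multiple Availability Zones within a region
Conflict resolution based on last-writer-wins (global tables)
Conditional writes and application-managed version attributes rather than vector clocks (vector clocks belong to the original Dynamo design)
Cross-region replication via global tables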
Use Cases:
Serverless application backends
Session stores and user profiles
High-scale mobile and gaming applications
Unique: Applications with unpredictable traffic patterns requiring auto-scaling
Unique: Time-to-Live (TTL) based workflows with automatic data expiration
Unique: Event-driven architectures with DynamoDB Streams
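The optimistic concurrency listed under DynamoDB's consistency mechanisms is commonly built from a version attribute plus a conditional write: the update succeeds only if the stored version still matches the one the client read. The sketch below models that pattern with an in-memory dict; `Table`, `put_if_version`, and `VersionConflict` are hypothetical names and are not part of any AWS SDK.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the writer expected."""


class Table:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def put_if_version(self, key, item, expected_version):
        current = self._items.get(key)
        current_version = current["version"] if current else None
        # The condition check and the write happen as one step.
        if current_version != expected_version:
            raise VersionConflict(key)
        self._items[key] = {**item, "version": (expected_version or 0) + 1}
```

A writer that loses the race gets `VersionConflict`, re-reads the item, and retries with the new version instead of silently overwriting the other update.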
etcd
Scale: Moderate cluster sizes
ACID Compliance: Atomic single-key operations and multi-key mini-transactions (compare-then-write)
CAP Focus: CP (Consistency and Partition tolerance)
Consistency Mechanisms:
Raft consensus algorithm
Linearizable and serializable operations
Reconciliation:
Raft-based log replication
Leader election for handling partitions
Use Cases:
Service discovery in microservices
Configuration management for distributed systems
Kubernetes control plane storage
Unique: Distributed locking and leader election for coordination
Unique: Metadata storage for distributed file systems
Unique: Watch-based notification systems for configuration changes
Advanced Distributed & CRDT-based Systems
Riak
Scale: Horizontal scaling
ACID Compliance: Limited, eventual consistency focused
CAP Focus: AP (Availability and Partition tolerance)
Consistency Mechanisms:
Vector clocks for causality tracking
Eventually consistent by default
Reconciliation:
CRDTs (Conflict-Free Replicated Data Types)
Sibling resolution with vector clocks
Use Cases:
Fault-tolerant storage for critical data
Session storage for web applications
High-availability enterprise applications
Unique: Multi-datacenter replication with conflict management
Unique: Large-scale storage for IoT data with network constraints
Unique: Applications requiring formal CRDT models for conflict resolution
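The vector clocks listed under Riak decide whether one version of a value supersedes another or whether the two are concurrent and must be kept as siblings. The sketch below shows only the comparison step, with clocks modeled as dicts from node name to counter; `compare` is a hypothetical helper rather than a Riak API.

```python
def compare(a, b):
    """Compare two vector clocks given as dicts mapping node -> counter.

    Returns 'equal', 'before' (a happened before b), 'after', or 'concurrent'.
    """
    nodes = set(a) | set(b)
    # A missing entry counts as zero.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Only the `concurrent` outcome produces siblings; in the other cases the later version can safely replace the earlier one.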
AntidoteDB
Scale: Geo-distributed
ACID Compliance: Transactional causal consistency
CAP Focus: AP (Availability and Partition tolerance)
Consistency Mechanisms:
Causal consistency with vector clocks
Transactional model
Reconciliation:
CRDT-based automatic conflict resolution
Transaction certification
Use Cases:
Collaborative editing applications
Multi-user gaming state management
Distributed banking and financial applications
Unique: Applications requiring strong CRDT guarantees with transaction support
Unique: Research systems exploring causal+ consistency models
Unique: Multi-player applications with conflict-free interactions
Redis Enterprise (with CRDTs)
Scale: Global distribution
ACID Compliance: Per-command atomicity as in Redis; CRDTs provide convergence across regions rather than ACID isolation
CAP Focus: AP (Availability and Partition tolerance)
Consistency Mechanisms:
CRDT-based data types
Active-Active replication
Reconciliation:
Built-in CRDTs for counters, sets, strings
Automatic conflict resolution
Use Cases:
Multi-region caching with local write capability
Global session stores requiring low latency
Real-time analytics with geo-distributed data sources
Unique: Global leaderboards for gaming applications
Unique: Distributed rate limiting across multiple regions
Unique: Edge computing caches with bidirectional synchronization
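The CRDT counters listed under Redis Enterprise's reconciliation converge because each replica only ever raises its own slot and merges take the per-replica maximum. The sketch below is a textbook grow-only counter (G-Counter), shown as a generic illustration; it is not Redis Enterprise's implementation, and `GCounter` is a hypothetical class name.

```python
class GCounter:
    """State-based grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> increments observed from that replica

    def increment(self, amount=1):
        # A replica only ever updates its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other):
        # Taking the max per slot is commutative, associative, and idempotent.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

    def value(self):
        return sum(self.counts.values())
```

Two regions can accept increments independently and exchange state in any order; repeated merges do not double count.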
Multi-Model Databases
FaunaDB
Scale: Global distribution
ACID Compliance: Full, distributed transactions
CAP Focus: CP with high availability
Consistency Mechanisms:
Calvin protocol for distributed transactions
Temporal versioning
Reconciliation:
Optimistic concurrency control
Global transaction log
Use Cases:
Serverless applications requiring global consistency
Financial applications with global transaction needs
SaaS platforms with strict data correctness requirements
Unique: GraphQL-native applications requiring global consistency
Unique: Temporally consistent data access across multiple regions
Unique: Event sourcing applications with distributed transaction support
CosmosDB (Azure)
Scale: Global distribution
ACID Compliance: Transactions scoped to a single logical partition; replication consistency is configured separately (strong to eventual)
CAP Focus: Configurable (CP or AP)
Consistency Mechanisms:
Five consistency levels (strong, bounded staleness, session, consistent prefix, eventual)
Multi-master replication
Reconciliation:
Conflict resolution policies (Last-Writer-Wins, custom merge procedures)
Automatic conflict detection
Use Cases:
Globally distributed web and mobile applications
Multi-model data storage (document, graph, key-value)
IoT data ingestion and analytics
Unique: Applications requiring multiple API interfaces (SQL, MongoDB, Cassandra, Gremlin)
Unique: Systems requiring tunable consistency on a per-request basis
Unique: Globally distributed graph databases for recommendation engines
Vector Databases (Growing Importance for AI)
Pinecone
Scale: Distributed vector search
ACID Compliance: Limited
CAP Focus: AP (Availability and Partition tolerance)
Consistency Mechanisms:
Eventually consistent indexing
Reconciliation:
Asynchronous indexing
Versioned updates
Use Cases:
AI-powered semantic search
Recommendation systems based on embeddings
Image and audio similarity search
Unique: Hybrid search combining vector and metadata filtering
Unique: Real-time personalization using embedding-based similarity
Unique: Large-scale nearest neighbor search for AI applications
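The hybrid search use case listed above combines a similarity ranking over embeddings with a filter on metadata. The sketch below does this by brute force with cosine similarity in pure Python; it is only an illustration of the idea, since production vector databases use approximate nearest neighbor indexes, and the `search` function and its `where` parameter are hypothetical rather than Pinecone's API.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(index, query, top_k=2, where=None):
    """Return the IDs of the top_k most similar vectors.

    index: list of (id, vector, metadata) tuples.
    where: optional predicate applied to metadata before ranking.
    """
    scored = [
        (cosine(vector, query), item_id)
        for item_id, vector, metadata in index
        if where is None or where(metadata)
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:top_k]]
```

Applying the metadata predicate before ranking keeps filtered-out items from occupying slots in the top-k result.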
Milvus
Scale: Horizontal scaling for vector search
ACID Compliance: Limited
CAP Focus: CP (Consistency and Partition tolerance)
Consistency Mechanisms:
Log Sequence Numbers (LSNs)
Message queue based coordination
Reconciliation:
Time Travel with snapshots
Primary-secondary replication
Key Consistency and Reconciliation Algorithms
Consensus Algorithms
Paxos: Classic consensus algorithm used in Google Spanner
Raft: More understandable alternative to Paxos used in etcd, CockroachDB
ZAB (ZooKeeper Atomic Broadcast): Powers coordination in ZooKeeper
Viewstamped Replication: Focus on primary-backup replication with view changes
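The consensus algorithms above share one core rule: an entry is safe once a majority of servers has stored it. The sketch below shows how a Raft-style leader could derive its commit index from the highest log index replicated on each server. It is a simplification that omits Raft's additional check that the entry belongs to the leader's current term, and `commit_index` is a hypothetical helper.

```python
def commit_index(match_index):
    """Highest log index stored on a majority of servers.

    match_index: highest replicated log index per server, leader included.
    """
    majority = len(match_index) // 2 + 1
    ranked = sorted(match_index, reverse=True)
    # The majority-th largest value is present on at least `majority` servers.
    return ranked[majority - 1]
```

For a five-server cluster with replicated indexes 5, 5, 3, 2, and 1, index 3 is the highest entry held by three servers, so entries up to 3 are committed.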
Conflict Resolution
CRDTs (Conflict-Free Replicated Data Types): Mathematical structures that automatically resolve conflicts
State-based CRDTs: Exchange full state
Operation-based CRDTs: Exchange operations
OT (Operational Transformation): Used in collaborative editing, transforms operations to preserve intention
MVCC (Multi-Version Concurrency Control): Maintains multiple versions to allow concurrent reads/writes
Vector Clocks: Track causality between events in a distributed system
Version Vectors: Variant of vector clocks that tracks causality between replica states rather than individual events
Dotted Version Vectors: Enhanced version vectors for handling concurrent operations
Last-Writer-Wins: Simple timestamp-based approach
Quorum-based systems: Reads and writes succeed once enough replicas respond; with N replicas, choosing R + W > N ensures every read quorum overlaps the most recent write quorum
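The quorum rule can be checked with simple arithmetic: with N replicas, a read that contacts R replicas and a write that contacts W replicas are guaranteed to share at least one replica when R + W > N. The helper names below are hypothetical and serve only to make the arithmetic concrete.

```python
def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def quorums_overlap(n, r, w):
    """True if every read set of size r intersects every write set of size w."""
    return r + w > n
```

With N = 3, reading and writing at a majority (R = W = 2) overlaps, while R = W = 1 does not and can return stale data.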