
Database Sharding vs. Partitioning

We have 50TB of user data. Explain the strategies to split this data across multiple nodes.
20 min read · 14 Jan 2026

Solution

Sharding (Horizontal Partitioning)

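Sharding distributes data across multiple physical servers (shards). Each shard holds a subset of the rows and acts as a self-contained database. By contrast, "partitioning" on its own usually refers to splitting a table into pieces inside a single database instance; sharding places those pieces on separate servers.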

Sharding Strategies

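  1. Key-Based (Hash) Sharding: shard = hash(user_id) % num_servers. Keys are distributed evenly, but adding or removing a server changes the modulus, so most keys map to a different shard and the data has to be re-sharded.
  2. Range-Based Sharding: Users A-M go to Server 1, N-Z to Server 2. Range queries are easy, but the distribution is uneven whenever keys cluster in part of the range.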
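The two strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a production router: the function names, the MD5-based hash, and the single split point at "n" are all assumptions chosen for the example.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash. Python's built-in hash() is salted per
    process for strings, so it is not suitable for routing."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hash_shard(user_id: str, num_servers: int) -> int:
    """Key-based sharding: hash(user_id) % num_servers."""
    return stable_hash(user_id) % num_servers


# Hypothetical split points for range sharding: names starting before "n"
# go to shard 0 (A-M), everything else to shard 1 (N-Z).
RANGE_BOUNDS = ["n"]


def range_shard(username: str, bounds=RANGE_BOUNDS) -> int:
    """Range-based sharding on the first letter of the username."""
    return bisect.bisect_right(bounds, username[:1].lower())


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when the server count changes."""
    moved = sum(1 for k in keys if hash_shard(k, old_n) != hash_shard(k, new_n))
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]
    print("alice ->", range_shard("alice"), "| zed ->", range_shard("zed"))
    print("moved when going from 4 to 5 servers:", moved_fraction(keys, 4, 5))
```

With a uniform hash, a key keeps its shard only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five. Growing from 4 to 5 servers is therefore expected to relocate around 80% of the keys, which is the re-sharding cost mentioned in the list.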

The Hot Partition Problem

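If you shard by a key such as user ID or region, a single popular value can send most of the traffic to one shard. For example, one shard might receive 90% of the traffic because all users are querying Justin Bieber's tweets, which live on the shard that owns his user ID. This defeats the purpose of distributing the data. Consistent Hashing is often mentioned here, but what it mainly solves is the re-sharding cost: when a server is added or removed, only a small share of keys moves. It does not spread a single hot key, which still maps to one node; hot keys are typically handled by caching, read replicas, or splitting the key across several shards.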
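As an illustration of consistent hashing, here is a minimal hash-ring sketch with virtual nodes. The class name `HashRing`, the `vnodes` parameter, and the shard names are hypothetical, and MD5 is used only as a convenient stable hash. Every lookup for a given key still resolves to a single node.

```python
import bisect
import hashlib


def _h(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each node is placed at `vnodes` positions."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _h(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def get_node(self, key: str) -> str:
        if not self._points:
            raise ValueError("ring is empty")
        # First ring position clockwise from the key, wrapping at the end.
        i = bisect.bisect_right(self._points, _h(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]
    ring = HashRing([f"shard-{i}" for i in range(4)])
    before = {k: ring.get_node(k) for k in keys}
    ring.add_node("shard-4")
    moved = sum(1 for k in keys if ring.get_node(k) != before[k])
    print("fraction moved after adding a fifth shard:", moved / len(keys))
```

Adding a node only inserts new positions on the ring, so the only keys that change owner are those that now fall just before one of the new positions, and they all move to the new node. The expected share is about 1/5 of the keys when going from four to five shards, compared with most keys under modulo-based sharding.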