Sharding implies distributing data across multiple physical servers (shards). Each shard holds a subset of the data and acts as a self-contained database.
hash(user_id) % num_servers. Good distribution, but hard to add servers (requires re-sharding).If you shard by "Celebrity User ID" or "Region", one shard might receive 90% of the traffic (e.g., all users querying Justin Bieber's tweets). This defeats the purpose of distributed systems. Consistent Hashing is often used to mitigate this.