Rough notes on high-scalability systems

  • Scalability
    • Scalability is the capability of a system, process, or network to grow and manage increased demand.
    • Horizontal
      • Scaling out by adding more servers
      • Easier to adjust dynamically
    • Vertical
      • Scaling up by beefing up server resources/power
  • Reliability
    • The probability that a system performs its intended function without failure over a given period
    • Achieved through redundancy of both the software components and data.
  • Availability
    • The time a system remains operational
    • If a system is reliable, it is available.
    • However, if it is available, it is not necessarily reliable.
  • Efficiency
    • Response Time
    • Throughput
  • Serviceability or Manageability
    • Also Observability
  • Load Balancing
    • Prevents single point of failure
    • Improves overall application availability and response time
    • Distributes Load, no single struggling server
    • Provides analytics
    • Handles failures
    • Can add LBs in three places
      • Between the user and the web server
      • Between web servers and an internal platform layer, like application servers or cache servers
      • Between internal platform layer and database.
    • How do they work
      • Health checks to backend servers
        • Methods (a sketch of a few of these appears below)
        • Round Robin
        • Weighted round robin
        • Least connections
        • Least response time
        • IP hash
    • LBs must also be redundant
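    • A minimal Python sketch of a few of the selection methods listed above (round robin, weighted round robin, IP hash); the server names and weights are made up for illustration:

```python
import itertools
import hashlib

servers = ["app-1", "app-2", "app-3"]

# Round robin: cycle through the (healthy) servers in order.
_rr = itertools.cycle(servers)

def round_robin() -> str:
    return next(_rr)

# Weighted round robin: bigger machines appear more often in the cycle.
weights = {"app-1": 3, "app-2": 2, "app-3": 1}
_wrr = itertools.cycle([s for s, w in weights.items() for _ in range(w)])

def weighted_round_robin() -> str:
    return next(_wrr)

# IP hash: the same client IP always lands on the same server (stickiness).
def ip_hash(client_ip: str) -> str:
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print([round_robin() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']
print(weighted_round_robin())             # 'app-1'
print(ip_hash("203.0.113.7"))             # always the same server for this IP
```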
  • Caching
    • LB helps scaling, but caching enables:
      • Vastly better use of resources
      • Implementing requirements that would otherwise be unattainable
      • Pros: Better for static data
      • Cons: Not a good fit for user-generated data, where freshness and consistency are critical
    • Assume 1M requests for 100 items
      • If we split the items 50/50 across 2 servers, each server can still end up handling all 1M requests (e.g. when a request needs items from both halves)
      • If we replicate 100 items to 2 servers, each will deal with 500K requests instead
    • GET requests can be UDP, SET/DELETE can be TCP
    • Application server cache
      • Could improve request/response time, but with many nodes behind a load balancer we’ll get many cache misses, since requests are routed to different nodes
    • CDNs
      • Tip: serve static content from a separate subdomain (e.g. cdn.facebook.com); it makes it easy later to point that subdomain’s DNS at your CDN
    • Cache invalidation
      • Cache is amazing, but needs management to keep cache coherent with DB
      • If data is updated on DB, needs to be updated in cache.
        • Known as cache invalidation
    • Types of Cache
      • Write-through
        • Write to cache and DB at the same time
        • More reliable, less data losses
        • But, with increased latency for write operations
      • Write-around
        • Bypass cache and write directly to DB
        • If data is being reread, cache miss will happen
      • Write-back
        • Data is written to cache and confirmed immediately to client
        • Data will be written to DB at a later time
        • Low latency and high throughput for write-intensive apps
        • Comes with risk of data loss
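      • A minimal sketch of the three write policies above, using plain dicts to stand in for the cache and the DB (a real setup would use e.g. Redis/Memcached plus an RDBMS):

```python
cache, db = {}, {}
dirty = set()  # keys written to cache but not yet flushed to the DB (write-back)

def write_through(key, value):
    # Write to cache and DB together: reliable, but every write pays DB latency.
    cache[key] = value
    db[key] = value

def write_around(key, value):
    # Bypass the cache: the next read of `key` will be a cache miss.
    db[key] = value
    cache.pop(key, None)  # drop any stale cached copy

def write_back(key, value):
    # Acknowledge after writing to the cache only; flush to the DB later.
    # Fast, but anything still in `dirty` is lost if the cache node dies.
    cache[key] = value
    dirty.add(key)

def flush():
    for key in dirty:
        db[key] = cache[key]
    dirty.clear()
```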
    • Cache Eviction policies
      • FIFO
      • LIFO
      • LRU
      • LFU
      • RR
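      • A minimal LRU sketch (the most common of the policies above): the least recently used key is evicted once capacity is exceeded:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None              # cache miss
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key

# For memoizing function calls inside one process, functools.lru_cache
# implements the same policy.
```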
  • Data Partitioning
    • Breaking up big DB in smaller parts
    • Easier to scale horizontally than vertically after a point
    • Types
      • Horizontal Partitioning / Sharding / Range based partitioning
        • Put ranges of rows in different tables
          • If we split across multiple servers, then we have sharding
        • Main problem is that if the range isn’t chosen properly it’ll lead to unbalanced servers
          • e.g. partitioning by zip code can result in imbalance (big vs. small cities)
      • Vertical Partitioning
        • Different tables in different servers
        • e.g. for instagram, separate servers for
          • Profile info
          • Friend list
          • Photos
        • Main problem is that you might need to distribute to more servers if specific server grows big (e.g. billions of photos)
      • Federation
        • split into multiple DB based on function (product, user, forums DBs)
      • Directory-based partitioning
        • Abstracts away partitioning details
    • Partitioning Criteria
      • Key or Hash-based
        • 100 servers, ID % 100
        • Must ensure uniform allocation
        • Con: changing the hash function (e.g. when adding servers) requires redistributing the data; see the sketch after this list
      • List partitioning
        • Each partition is assigned a list of key values
        • To insert a record, we look up which partition holds its key
          • e.g. store Sweden, Norway etc. in the partition for Nordic countries
      • Round Robin partitioning
        • Ensures uniform distribution
      • Composite partitioning
        • Combine above methods
          • E.g. list partitioning then hash-based
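      • A minimal sketch of the key/hash-based and list criteria above; the country groupings and server count are made up:

```python
NUM_SERVERS = 4

def hash_partition(record_id: int) -> int:
    # Key/hash-based: server = ID % N. Simple and fairly uniform, but changing
    # NUM_SERVERS remaps almost every record (hence the redistribution problem).
    return record_id % NUM_SERVERS

LIST_PARTITIONS = {
    "nordics": {"Sweden", "Norway", "Denmark", "Finland", "Iceland"},
    "dach": {"Germany", "Austria", "Switzerland"},
}

def list_partition(country: str) -> str:
    # List-based: each partition owns an explicit set of key values.
    for name, countries in LIST_PARTITIONS.items():
        if country in countries:
            return name
    return "other"

print(hash_partition(1234))      # 2
print(list_partition("Norway"))  # 'nordics'
```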
    • Problems with partitioning
      • Joins and Denormalization
        • Join needs to combine data from many servers
        • Can be mitigated with denormalization, but we need to deal with data inconsistency if we go down that path
      • Referential integrity
        • Data integrity constraints, like foreign keys, can be hard to enforce across partitions
        • Frequently not supported by the RDBMS in a partitioned setup; they need to be enforced in application code
      • Rebalancing
        • Needed when data distribution is not uniform, or if there’s a lot of load in one partition
        • Performing rebalancing without downtime is difficult
  • DB Indexing
    • One of the first methods to consider when performance is no longer satisfactory
    • Makes searches of records faster
    • Slows down data insertion & update
    • Can be created using one or more columns
    • Makes sense for read-intensive, not for write-intensive apps
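    • A small sqlite3 sketch of adding an index (table and column names are made up; the same idea applies to any RDBMS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany("INSERT INTO users (email, name) VALUES (?, ?)",
                 [(f"user{i}@example.com", f"user{i}") for i in range(1000)])

# Without an index this lookup scans the whole table; with the index below it
# becomes a B-tree search. The trade-off: every INSERT/UPDATE now also has to
# maintain the index.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

row = conn.execute("SELECT id FROM users WHERE email = ?",
                   ("user42@example.com",)).fetchone()
print(row)  # (43,)
```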
  • Proxy Servers
    • Used to
      • Filter requests
      • Log
      • Transform requests
        • Add/remove headers
        • Encrypt/Decrypt
        • Compressing
      • Cache responses
  • Redundancy & Replication
    • Redundancy is the duplication of critical components or functions of a system
    • Goals
      • Increasing the reliability of the system, usually in the form of a backup
      • Improve actual system performance
    • Without redundancy, losing the only server that holds a file means losing that file
    • Replication means sharing info to ensure consistency between redundant resources
      • Widely used in RDBMS systems, usually with a master-slave relationship
      • Master gets all updates which then ripple to the slaves
  • SQL vs NoSQL
    • In SQL you design your schema; in NoSQL you design your queries
    • Relational DBs
      • Have predefined schema
      • Use rows and columns
      • Each row contains all info for entity
      • ACID compliant
        • Atomicity, Consistency, Isolation, Durability
          • Atomicity, each transaction is all or nothing
          • Consistency, any transaction will bring the database from one valid state to another
          • Isolation, executing transactions concurrently has the same results as if the transactions were executed serially
          • Durability, once a transaction has been committed, it will remain so
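        • A small sqlite3 sketch of atomicity (“all or nothing”): if anything fails mid-transaction, the whole transfer is rolled back (the account data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        raise RuntimeError("simulated crash before bob is credited")
except RuntimeError:
    pass

print(conn.execute("SELECT * FROM accounts ORDER BY name").fetchall())
# [('alice', 100), ('bob', 0)] -- the debit was rolled back, money didn't vanish
```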
    • Non-Relational DBs
      • Sacrifice ACID for scalability and performance
      • Unstructured, Distributed, with dynamic schema
      • A single record (e.g. a document) can hold all the data for an entity
      • Types
        • Key-Value storage
          • Redis (see the sketch after this list)
        • Document DBs
          • Data stored in documents that are grouped in collections
          • Each can have entirely different structure
            • MongoDB, CouchDB
        • Graph DBs
          • Store data whose relationship is better represented as a graph
          • Data saved as nodes (entities), properties (information about the entities), and lines/edges (connections between them)
            • InfiniteGraph, Neo4j
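      • A minimal key-value sketch using the redis-py client (assumes `pip install redis` and a Redis server on localhost; the key names are made up):

```python
import redis  # third-party client for the Redis key-value store

r = redis.Redis(host="localhost", port=6379)

r.set("user:42:name", "alice", ex=3600)  # ex gives the key a TTL in seconds
print(r.get("user:42:name"))             # b'alice' (values come back as bytes)
```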
  • CAP Theorem
    • Consistency, Availability, Partition tolerance
    • A distributed system can provide at most two of the three guarantees above simultaneously
    • Consistency
      • Every read receives the most recent write or error
      • Achieved by updating all nodes before allowing further reads
        • Sacrifices Availability
      • Different from Consistency as defined for ACID
    • Availability
      • Every request receives a non-error response, w/o guarantee that it contains the most recent write
      • Every request gets a response, even if some nodes fail
    • Partition tolerance
      • The system continues to function even if nodes can’t communicate with each other
    • When a network failure happens, we have to decide to:
        1. Cancel the operation, i.e. decrease availability but ensure consistency
        2. Proceed with the operation, i.e. provide availability w/o consistency
  • Consistent Hashing
    • Use a Binary Search Tree (supporting successor lookups) or a sorted ring to locate the cache node responsible for a key
    • Only k/n keys need to be moved (k: total keys, n: servers)
      • Contrast with normal hashing, where nearly all keys need to be moved
    • DHT is one of the fundamental components in distributed systems
    • Problems with normal hashing (e.g. key % n)
      • Not horizontally scalable. It breaks when we add one more node
      • Might not be load balanced
    • To reduce variance, we can use virtual copies of caching servers
      • k copies of each server will make Binary Search Tree k times bigger
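    • A minimal consistent-hashing sketch: a sorted list plus bisect plays the role of the BST/successor lookup above, and `replicas` are the virtual copies:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (hash, node) pairs, i.e. the ring
        for node in nodes:
            self.add(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node: str):
        # Insert `replicas` virtual copies of the node to smooth out the load.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def get(self, key: str) -> str:
        # Walk clockwise from the key's hash to the next virtual node.
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        if idx == len(self.ring):
            idx = 0  # wrap around the ring
        return self.ring[idx][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get("user:42"))  # adding/removing a node only remaps ~1/3 of keys
```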
  • Polling strategies
    • AJAX Polling
      • Repeatedly poll server
    • HTTP Long Polling
      • Like regular polling, but the server holds the request open until new data is available (see the sketch after this list)
      • “Hanging GET”
    • Websockets
      • Low overhead, real-time TCP connection with server
    • Server-Sent Events (SSEs)
      • Persistent and long-term connection with server, which server uses to send data to client
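    • A minimal long-polling client loop using the `requests` library; the endpoint and its `since`/`id` parameters are hypothetical:

```python
import requests  # third-party: pip install requests

def long_poll(url="https://example.com/events", last_id=None):
    while True:
        try:
            # The server holds this GET open until it has new data or it times out.
            resp = requests.get(url, params={"since": last_id}, timeout=35)
        except requests.exceptions.Timeout:
            continue  # client-side timeout: just reconnect
        if resp.status_code == 200 and resp.content:
            event = resp.json()
            last_id = event.get("id")
            print("got event:", event)
        # An empty/204 response means the server gave up waiting; loop and reconnect.
```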
  • Eventual Consistency
    • Provide BASE, instead of ACID
      • Basically Available, Soft state, Eventual consistency
  • Strategies
    • Android and iPhone push services
    • For 10k-100k users, consider using CDN for static content
      • Pull vs Push CDN
    • For 500k users, consider breaking up service to microservices
      • Each microservice can be scaled independently
    • For 1M users, consider LBs between tiers (web, app, DB)
    • Use SQL EXPLAIN to see how queries are executed (see the sketch below)
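    • A quick sqlite3 sketch of inspecting a query plan (SQLite uses EXPLAIN QUERY PLAN; MySQL/Postgres use EXPLAIN / EXPLAIN ANALYZE); the table and index are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?",
    ("a@example.com",),
).fetchall()
print(plan)  # the plan should mention idx_users_email instead of a full table scan
```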