Google File System

Reading notes

Features

  1. GFS is a scalable distributed file system.
  2. Component failures are the norm rather than the exception.
  3. Files are huge by traditional standards; multi-GB files are common.
  4. Most files are mutated by appending new data rather than overwriting existing data. Random writes within a file are practically non-existent.
  5. Co-designing the applications and the file system API benefits the overall system by increasing flexibility.

Architecture

  • Files are divided into fixed-size chunks.
  • The master maintains all file system metadata.
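  • Neither the client nor the chunkserver caches file data.
    • Clients do cache metadata (chunk locations), so repeated accesses to the same chunk skip the master.
  • Clients never read or write file data through the master. They ask the master which chunkservers to contact, then exchange data with those chunkservers directly.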
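  • Data mutations
    • writes: data is written at an offset chosen by the application
    • record appends: data is appended atomically, at least once, at an offset chosen by GFS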
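
Because chunks are fixed-size, a client can work out which chunk holds a given byte offset before it contacts the master. The sketch below illustrates that arithmetic. The 64 MB chunk size is the value reported in the GFS paper, not something stated in these notes, and the function names are hypothetical.

```python
# Chunk size reported in the GFS paper (assumption: not stated in these notes).
CHUNK_SIZE = 64 * 1024 * 1024


def chunk_index(offset: int) -> int:
    """Return the index of the chunk that contains the given byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset // CHUNK_SIZE


def chunk_pieces(offset: int, length: int) -> list[tuple[int, int, int]]:
    """Split a byte range into (chunk index, offset within chunk, byte count) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // CHUNK_SIZE
        start = offset % CHUNK_SIZE
        # Stop at the chunk boundary or at the end of the requested range.
        count = min(CHUNK_SIZE - start, end - offset)
        pieces.append((index, start, count))
        offset += count
    return pieces
```

A read that crosses a chunk boundary splits into one piece per chunk, and each piece may be served by a different chunkserver.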

Questions

How does GFS tolerate the failure of a chunkserver?

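  • The master notices missing heartbeats from the chunkserver. Meanwhile, client requests are served from the other replicas.
  • The master decrements the replica count for every chunk stored on the dead chunkserver.
  • The master re-replicates chunks that have fallen below their replication target.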
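
The three steps above can be sketched as a toy in-memory master. This is a minimal illustration, not GFS's actual implementation: the `Master` class, its method names, and the 60-second timeout are hypothetical. The replication goal of 3 and the rule of repairing the most under-replicated chunks first follow the GFS paper.

```python
REPLICATION_GOAL = 3      # default replication level in the GFS paper
HEARTBEAT_TIMEOUT = 60.0  # seconds; hypothetical value chosen for illustration


class Master:
    def __init__(self):
        self.last_heartbeat = {}  # chunkserver name -> time of last heartbeat
        self.replicas = {}        # chunk handle -> set of chunkservers holding it

    def heartbeat(self, server, chunks, now):
        """Record a heartbeat and the chunks the server reports holding."""
        self.last_heartbeat[server] = now
        for chunk in chunks:
            self.replicas.setdefault(chunk, set()).add(server)

    def locations(self, chunk):
        """Live replicas a client can still read from."""
        return sorted(self.replicas.get(chunk, ()))

    def expire_dead_servers(self, now):
        """Drop servers whose heartbeats stopped; their replicas no longer count."""
        dead = [s for s, t in self.last_heartbeat.items()
                if now - t > HEARTBEAT_TIMEOUT]
        for server in dead:
            del self.last_heartbeat[server]
            for holders in self.replicas.values():
                holders.discard(server)
        return dead

    def rereplication_queue(self):
        """Under-replicated chunks, fewest live replicas first."""
        short = [(len(holders), chunk)
                 for chunk, holders in self.replicas.items()
                 if len(holders) < REPLICATION_GOAL]
        return [chunk for _, chunk in sorted(short)]
```

Clients keep reading through `locations` while the master works through the queue, so a single chunkserver failure does not interrupt service.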
