# googleFileSystemGo

A from-scratch implementation of the Google File System paper in pure Go. One file, zero dependencies (outside the standard library).

> **Note:** I've covered each step in detail in this blog post.

## What's implemented

| Paper concept | What it maps to |
| --- | --- |
| Chunk-based storage | 64MB chunks with handle-based lookup |
| Master server | Namespace, chunk allocation, lease management |
| ChunkServers | Local chunk storage with checksums |
| Replication | Primary-secondary across multiple servers/racks |
| Lease management | 60s lease timeout with version increment |
| Atomic record append | `Append()` with at-least-once semantics |
| Snapshot (copy-on-write) | Ref counting with local chunk copy |
| Operation log | `OperationLogEntry` for metadata changes |
| Garbage collection | Lazy deletion with 1min delay, orphaned chunk cleanup |
| Rack-aware placement | Diverse replica selection by rack |
| Checksums | CRC32 per 64KB block |
| Client location cache | 60s TTL cache for chunk locations |
| Heartbeat | 500ms interval with chunk reports |
| Re-replication | Background replication when replicas drop |
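As one example of how a table row maps to code, rack-aware placement amounts to greedily preferring racks not yet holding a replica. A minimal sketch — the type and function names here are mine, not necessarily the repo's:

```go
package main

import "fmt"

// Server is a hypothetical stand-in for the repo's chunkserver metadata.
type Server struct {
	ID   string
	Rack string
}

// pickReplicas prefers one server per distinct rack, falling back to
// reusing racks only when more replicas are needed than racks exist.
func pickReplicas(servers []Server, n int) []Server {
	usedRacks := map[string]bool{}
	var picked []Server
	// First pass: at most one server per rack.
	for _, s := range servers {
		if len(picked) == n {
			return picked
		}
		if !usedRacks[s.Rack] {
			usedRacks[s.Rack] = true
			picked = append(picked, s)
		}
	}
	// Second pass: fill remaining slots regardless of rack.
	for _, s := range servers {
		if len(picked) == n {
			break
		}
		if !contains(picked, s.ID) {
			picked = append(picked, s)
		}
	}
	return picked
}

func contains(ss []Server, id string) bool {
	for _, s := range ss {
		if s.ID == id {
			return true
		}
	}
	return false
}

func main() {
	servers := []Server{
		{"cs1", "rack1"}, {"cs2", "rack1"},
		{"cs3", "rack2"}, {"cs4", "rack3"},
	}
	for _, s := range pickReplicas(servers, 3) {
		fmt.Println(s.ID, s.Rack) // three servers on three distinct racks
	}
}
```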

## Running it

```sh
git clone https://github.com/Jitesh117/googleFileSystemGo
cd googleFileSystemGo
go run main.go
```

No `go get` needed. No external packages. Just Go.

The demo spins up a 5-chunkserver cluster across 3 racks, exercises namespace operations, writes, reads, concurrent appends, snapshots, renames, deletions, and verifies replication.

## How the data model works

Files are split into fixed-size chunks (64MB by default, 1MB for the demo):

```
filename → [chunk_0, chunk_1, chunk_2, ...]
```
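Mapping a byte offset in a file to the chunk holding it is plain integer arithmetic. A sketch, assuming the default 64MB chunk size (the function name is mine, not necessarily the repo's):

```go
package main

import "fmt"

const chunkSize = 64 << 20 // 64MB default chunk size (the demo uses 1MB)

// chunkIndex maps a byte offset within a file to the index of the chunk
// holding it and the position of that byte inside the chunk.
func chunkIndex(offset int64) (idx, offsetInChunk int64) {
	return offset / chunkSize, offset % chunkSize
}

func main() {
	idx, off := chunkIndex(150 << 20) // byte 150MB into the file
	fmt.Println(idx, off)             // chunk 2, 22MB into that chunk
}
```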

Each chunk has:

- A globally unique `ChunkHandle` (64-bit)
- A monotonically increasing `ChunkVersion`
- Multiple replicas on different ChunkServers

In Go:

```go
type ChunkHandle uint64
type ChunkVersion uint64

type LocalChunk struct {
    Handle    ChunkHandle
    Version   ChunkVersion
    Data      []byte   // chunk data
    Checksums []uint32 // CRC32 per 64KB block
    Size      int64
}
```
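The `Checksums` field holds one CRC32 per 64KB block of `Data`. A hedged sketch of how such checksums could be computed and verified with the standard library (function names are mine, not the repo's):

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const blockSize = 64 << 10 // 64KB checksum blocks, as in the paper

// blockChecksums computes one CRC32 (IEEE polynomial) per 64KB block.
func blockChecksums(data []byte) []uint32 {
	var sums []uint32
	for start := 0; start < len(data); start += blockSize {
		end := start + blockSize
		if end > len(data) {
			end = len(data)
		}
		sums = append(sums, crc32.ChecksumIEEE(data[start:end]))
	}
	return sums
}

// verify recomputes checksums and reports the first corrupted block, if any.
func verify(data []byte, sums []uint32) (badBlock int, ok bool) {
	got := blockChecksums(data)
	for i := range got {
		if i >= len(sums) || got[i] != sums[i] {
			return i, false
		}
	}
	return -1, true
}

func main() {
	data := make([]byte, 3*blockSize/2) // 1.5 blocks → 2 checksums
	sums := blockChecksums(data)
	data[blockSize+1] ^= 0xFF // corrupt a byte in the second block
	block, ok := verify(data, sums)
	fmt.Println(block, ok) // detects corruption in block 1
}
```

Checksumming per block rather than per chunk means a read can verify only the blocks it touches, and corruption is localized to a 64KB region.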

## Architecture overview

```
Client
  └── Location Cache (60s TTL)
        └── Master (single)
              ├── Namespace (file → chunk handles)
              ├── Chunk Metadata (handle → locations, version, lease)
              ├── Operation Log
              └── Lease Manager
                    └── ChunkServer × N (e.g., 5 servers across 3 racks)
                          ├── Chunk Storage (in-memory for simulation)
                          ├── Checksum verification
                          ├── Heartbeat (500ms)
                          └── Primary/Secondary replication
```

A write goes: client → push data to pipeline → primary writes → primary replicates to secondaries. A read goes: client → lookup locations → read from primary. Append is atomic: client retries on CHUNK_FULL to get a new chunk.
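The CHUNK_FULL retry is the heart of atomic record append: the primary chooses the offset, and when a record won't fit, the client moves on to a fresh chunk. A minimal in-memory sketch of that loop — the names and the tiny capacity are illustrative, not the repo's API:

```go
package main

import (
	"errors"
	"fmt"
)

// errChunkFull stands in for the CHUNK_FULL response mentioned above.
var errChunkFull = errors.New("CHUNK_FULL")

const chunkCapacity = 4 // tiny capacity so the demo fills chunks quickly

type chunk struct{ used int }

// tryAppend reserves space in the chunk or reports CHUNK_FULL.
func (c *chunk) tryAppend(n int) (offset int, err error) {
	if c.used+n > chunkCapacity {
		return 0, errChunkFull
	}
	offset = c.used
	c.used += n
	return offset, nil
}

type primary struct{ chunks []*chunk }

// recordAppend retries on CHUNK_FULL by allocating a fresh chunk, giving
// the offset-chosen-by-primary, at-least-once semantics of record append.
func (p *primary) recordAppend(n int) (chunkIdx, offset int) {
	for {
		if len(p.chunks) == 0 {
			p.chunks = append(p.chunks, &chunk{})
		}
		last := len(p.chunks) - 1
		off, err := p.chunks[last].tryAppend(n)
		if err == nil {
			return last, off
		}
		p.chunks = append(p.chunks, &chunk{}) // CHUNK_FULL: open a new chunk
	}
}

func main() {
	var p primary
	for i := 0; i < 3; i++ {
		idx, off := p.recordAppend(3)
		fmt.Println("append", i, "→ chunk", idx, "offset", off)
	}
}
```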

## Key simplifications vs. production GFS

- Uses `encoding/gob` (via `net/rpc`) instead of a custom wire protocol
- Chunks stored in memory (not persisted to local disk)
- Single master (no master replication)
- Everything runs in-process with localhost TCP
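The `net/rpc` + `encoding/gob` choice means an RPC service is just exported methods on a struct; gob encoding of arguments and replies comes for free. A self-contained sketch with a hypothetical `Ping` method, not the repo's actual RPC surface:

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// PingArgs, PingReply, and MasterService are illustrative; the repo defines
// its own RPC types, but net/rpc gob-encodes any exported struct the same way.
type PingArgs struct{ Msg string }
type PingReply struct{ Echo string }

type MasterService struct{}

func (m *MasterService) Ping(args *PingArgs, reply *PingReply) error {
	reply.Echo = "pong: " + args.Msg
	return nil
}

// startMaster registers the service and listens on an ephemeral localhost port.
func startMaster() (net.Addr, error) {
	srv := rpc.NewServer()
	if err := srv.Register(new(MasterService)); err != nil {
		return nil, err
	}
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	go srv.Accept(ln) // serve connections in the background
	return ln.Addr(), nil
}

// ping dials the server and issues one gob-encoded RPC round trip.
func ping(addr, msg string) (string, error) {
	client, err := rpc.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer client.Close()
	var reply PingReply
	if err := client.Call("MasterService.Ping", &PingArgs{Msg: msg}, &reply); err != nil {
		return "", err
	}
	return reply.Echo, nil
}

func main() {
	addr, _ := startMaster()
	echo, _ := ping(addr.String(), "hello")
	fmt.Println(echo) // pong: hello
}
```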

The architecture, data flow, and algorithms are faithful to the paper. The implementation details are simplified so the code stays readable.

## Reading the code

If you want to follow along with the paper, here's a suggested order:

1. **Core types** — `ChunkHandle`, `ChunkVersion`, `LocalChunk`, `FileMetadata`
2. **Master** — namespace operations, chunk allocation, lease management
3. **ChunkServer** — chunk storage, checksums, heartbeat
4. **Write path** — pipelined data push, primary/secondary replication
5. **Append path** — atomic record append with chunk-full retry
6. **Snapshot** — copy-on-write with reference counting
7. **GC** — lazy deletion, orphaned chunk cleanup
8. **Client** — location cache, read/write/append API