GFS (MIT 6.824)
CommentWhy is it hard? (Big store system)
Performance -> Sharding -> constant faults
Faults -> fault tolerant systems
Tolerance -> Replication
Replication -> In consistency
Consistency -> Low Performance
Consistency: like talking to one server
STRONG CONSISTENCY
Master - mapping : filename, and position
Chunk server 0
Chunk server 1
…
64MB per chunk
Master Data:
2 main tables: (On ram)
filename -> array of chunk handles (nv)
chunk handles ->list of chunk servers (v)
version number (nv)
primary (v)
lease expiration (v)
Will store some on disk
LOG, checkpoint(snapshot) - disk
n - not v - volatile
restart: recover to snapshot and replay LOG.
READ
client send filename and offset to mcoinaster
master sends handle, list of servers
(cached. Next time, won’t ask master)
client -> chunk servers: find the file
- <- return the data
WRITES
if NO PRIMARY? - ON MASTER
find up to date replica (with version number)
pick primary and secondary
increment version number
tells primary, secondary the version number - lease
master write version number to disk
primary picks an offset
all replicas told to write at offset
if all “yes”, primary “success” -> client
else “no” -> client, and client reissue