Why?

parallelism

fault tolerance

physical reasons

security goals / isolation

Challenges

  • concurrency

  • partial failure

  • performance

Infrastructure - Abstractions

  • Storage
  • Communication (more in 6.829)
  • Computation

Implementation

  • RPC, threads, concurrency control

Performance

Scalability

  • 2x computers -> 2x throughput

Fault Tolerance

A single computer can stay up for years.

Scale turns rare problems into constant ones: with enough machines, something is always failing.
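The effect of scale can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that each machine fails independently about once every 1000 days and that the cluster has 1000 machines; neither number comes from these notes, and the function name is hypothetical.

```python
def expected_failures_per_day(machines: int, mean_days_between_failures: float) -> float:
    """Expected number of machine failures per day, assuming each machine
    fails independently at a constant average rate."""
    return machines / mean_days_between_failures


# Assumed numbers, for illustration only.
MEAN_DAYS_BETWEEN_FAILURES = 1000.0

# One machine: a failure is a rare event (about once every three years).
single = expected_failures_per_day(1, MEAN_DAYS_BETWEEN_FAILURES)

# 1000 such machines: roughly one failure every day.
cluster = expected_failures_per_day(1000, MEAN_DAYS_BETWEEN_FAILURES)

print(f"single machine: {single} failures/day")
print(f"1000 machines:  {cluster} failures/day")
```

Under these assumptions, failure handling has to be built into the design rather than treated as an exceptional case.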

  • Availability
    • keep operating despite failures
  • Recoverability

Consistency

K-V db

Put(k, v) Get(k)->v
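As a point of reference, here is a minimal single-node sketch of the Put/Get interface. The class name `KVStore` and the use of a lock are assumptions made for illustration, not something these notes specify. On one machine, Get simply returns the value of the latest Put; the consistency question only becomes interesting once the data is replicated across machines that can fail or lag behind.

```python
import threading
from typing import Dict, Optional


class KVStore:
    """Hypothetical in-memory key-value store on a single machine."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        # The lock serializes concurrent Put/Get calls from multiple threads.
        self._lock = threading.Lock()

    def put(self, k: str, v: str) -> None:
        """Store v under key k, overwriting any previous value."""
        with self._lock:
            self._data[k] = v

    def get(self, k: str) -> Optional[str]:
        """Return the most recently stored value for k, or None if absent."""
        with self._lock:
            return self._data.get(k)


store = KVStore()
store.put("x", "1")
store.put("x", "2")
print(store.get("x"))        # the latest Put wins on a single node
print(store.get("missing"))  # no value stored under this key
```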

MapReduce

consider: word count

INPUT 1 -> Map -> (a, 1), (b, 1)

INPUT 2 -> Map -> (b, 1)

INPUT 3 -> Map -> (a, 1), (c, 1)

(a, 1), (a, 1) -> Reduce -> a, 2

Map(k, v)            // k: input file name, v: file contents
  split v into words
  for each word w
    emit(w, "1")
Reduce(k, v)         // k: a word, v: list of all values emitted for k
  emit(len(v))
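The word-count pseudocode can be tried out with a small sequential sketch. This is only a single-process illustration of the programming model, not the distributed implementation: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names, and the grouping step stands in for the shuffle that a real framework performs across machines.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def map_fn(key: str, value: str) -> List[Tuple[str, str]]:
    """key: input name (unused here), value: input contents.
    Emits (word, "1") for every word in the input."""
    return [(word, "1") for word in value.split()]


def reduce_fn(key: str, values: List[str]) -> str:
    """key: a word, values: every "1" emitted for that word.
    The count is just the number of values."""
    return str(len(values))


def run_mapreduce(inputs: Dict[str, str]) -> Dict[str, str]:
    """Run Map on each input, group intermediate pairs by key,
    then run Reduce once per key."""
    intermediate: Dict[str, List[str]] = defaultdict(list)
    for name, contents in inputs.items():
        for k, v in map_fn(name, contents):
            intermediate[k].append(v)
    return {k: reduce_fn(k, vs) for k, vs in sorted(intermediate.items())}


# The three inputs from the example above.
counts = run_mapreduce({"input1": "a b", "input2": "b", "input3": "a c"})
print(counts)
```

In the real system, each Map call and each Reduce call can run on a different machine, which is where the parallelism comes from.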