Query Planning

Operator Execution

Access Methods

Buffer Pool Manager

Disk Manager

The DBMS assumes that the primary storage location of the database is on non-volatile disk.

The DBMS’s components manage the movement of data between non-volatile and volatile storage.

Storage hierarchy

  • CPU registers
  • CPU caches
  • DRAM // Volatile: needs constant power
  • SSD // Non-volatile
  • HDD
  • Network storage

Intel Optane: Non-Volatile Memory

  • Is the future
  • Not widely available

System Design Goals

Allow the DBMS to manage databases that exceed the amount of memory available.

Reading/writing to disk is expensive, so it must be managed carefully to avoid large stalls and performance degradation.

Why not use the OS?

  • One can use memory mapping (mmap) to map the contents of a file into a process’ address space
  • The OS is responsible for moving the file’s pages in and out of memory
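As a rough illustration of the mmap approach, the sketch below reads one page of a file through a memory mapping and lets the OS decide when to bring it into memory. The 4096-byte `PAGE_SIZE` and the helper names `read_page_via_mmap` and `make_demo_file` are assumptions made for this example; they are not from the lecture.

```python
import mmap
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size for this sketch


def read_page_via_mmap(path, page_id):
    """Return the bytes of one page; the OS faults the data in on demand."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = page_id * PAGE_SIZE
            return mm[start:start + PAGE_SIZE]


def make_demo_file(num_pages):
    """Create a temporary file whose page i is filled with the byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i]) * PAGE_SIZE)
    return path
```

Note that the program never issues an explicit read for the page: slicing the mapping may trigger a page fault, and the calling thread stalls until the OS has loaded the data.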

What if we allow multiple threads to access the memory-mapped files to hide page fault stalls?

This works well enough for read-only access. It gets complicated when there are multiple writers.

There are some solutions to this problem:

madvise: tells the OS how you expect to read certain pages

mlock: tells the OS that memory ranges cannot be paged out

msync: tells the OS to flush memory ranges out to disk
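A minimal sketch of two of these hints, using only Python's standard `mmap` module: `madvise` is called when the platform exposes it, and `flush()` asks the OS to write the mapping back to the file (on Unix this corresponds to msync). The helper name `update_with_hints` is hypothetical, and mlock is left out because this sketch restricts itself to the `mmap` module.

```python
import mmap
import os
import tempfile


def update_with_hints(path, offset, data):
    """Overwrite bytes in a mapped file, hint the access pattern, then flush."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            # madvise and its constants only exist on platforms that support them.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)
            mm[offset:offset + len(data)] = data
            # Ask the OS to write the modified mapping back to the file.
            mm.flush()
```

Even with these calls, the program only gives hints and requests; the OS still decides when other dirty pages are written back.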

The DBMS (almost) always wants to control things itself and can do a better job at it:

-> Flushing dirty pages to disk in the correct order

-> Specialized prefetching

-> Buffer replacement policy

-> Thread/process scheduling

The OS is NOT your friend

Problem #1: How the DBMS represents the database in files on disk. <– this lecture

Problem #2: How the DBMS manages its memory and moves data back and forth from disk

TODAY’S AGENDA

File Storage

Page Layout

Tuple Layout

The DBMS stores a database as one or more files on disk.

  • The OS doesn’t know anything about the contents of these files.

Different DBMSs manage pages in files on disk in different ways

  • Heap file organization

A heap file is an unordered collection of pages where tuples are stored in random order

  • Create/Get/Write/Delete Page
  • Must also support iterating over all pages
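The operations above could be sketched as a small class over a single file of fixed-size pages. Everything here is an assumption for illustration: the 4096-byte `PAGE_SIZE`, the `HeapFile` class and its method names, and especially the in-memory set of deleted page ids, which a real system would have to persist.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed fixed page size


class HeapFile:
    """Unordered collection of fixed-size pages stored in one file."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist
        self.deleted = set()      # freed page ids (in memory only in this sketch)

    def num_pages(self):
        return os.path.getsize(self.path) // PAGE_SIZE

    def _check(self, page_id):
        if not 0 <= page_id < self.num_pages() or page_id in self.deleted:
            raise KeyError(page_id)

    def create_page(self):
        """Reuse a freed page if one exists, otherwise append a new one."""
        if self.deleted:
            page_id = self.deleted.pop()
            self.write_page(page_id, bytes(PAGE_SIZE))
            return page_id
        page_id = self.num_pages()
        with open(self.path, "ab") as f:
            f.write(bytes(PAGE_SIZE))
        return page_id

    def get_page(self, page_id):
        self._check(page_id)
        with open(self.path, "rb") as f:
            f.seek(page_id * PAGE_SIZE)
            return f.read(PAGE_SIZE)

    def write_page(self, page_id, data):
        self._check(page_id)
        if len(data) != PAGE_SIZE:
            raise ValueError("a page must be exactly PAGE_SIZE bytes")
        with open(self.path, "r+b") as f:
            f.seek(page_id * PAGE_SIZE)
            f.write(data)

    def delete_page(self, page_id):
        self._check(page_id)
        self.deleted.add(page_id)

    def iter_pages(self):
        """Yield (page_id, bytes) for every live page, in file order."""
        for page_id in range(self.num_pages()):
            if page_id not in self.deleted:
                yield page_id, self.get_page(page_id)
```

Because every page has the same size, a page id translates directly into a file offset (`page_id * PAGE_SIZE`), which is what keeps Get and Write simple here.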

Metadata is needed to keep track of which pages exist and which ones have free space.

  • Linked List
  • Page Directory

The DBMS maintains special pages that track the location of data pages in the database files.

The directory also records the number of free slots per page.

The DBMS has to make sure that the directory pages are in sync with the data pages.
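A page directory can be pictured as a map from page id to location and free-slot count. The sketch below keeps that map in memory only, whereas the notes describe it as living in special pages on disk; the `PageDirectory` class and its method names are hypothetical.

```python
class PageDirectory:
    """Tracks where each data page lives and how many free slots it has."""

    def __init__(self):
        self.entries = {}  # page_id -> [file_offset, free_slots]

    def register_page(self, page_id, file_offset, free_slots):
        self.entries[page_id] = [file_offset, free_slots]

    def locate(self, page_id):
        """Return the file offset of a data page."""
        return self.entries[page_id][0]

    def free_slots(self, page_id):
        return self.entries[page_id][1]

    def find_page_with_space(self):
        """Return the first registered page with a free slot, or None."""
        for page_id, (_, free) in self.entries.items():
            if free > 0:
                return page_id
        return None

    def record_insert(self, page_id):
        """Must be called whenever a tuple is added to the data page."""
        entry = self.entries[page_id]
        if entry[1] == 0:
            raise ValueError("page is full")
        entry[1] -= 1

    def record_delete(self, page_id):
        """Must be called whenever a tuple is removed from the data page."""
        self.entries[page_id][1] += 1
```

The `record_insert` and `record_delete` calls are where the synchronization requirement shows up: if a tuple is written to a data page without the matching directory update, the free-slot counts drift from reality.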

Page Layout

![](20221024-DatabseStorage.assets/屏幕截图 2022-10-25 150909.png)

Tuple Layout

![](20221024-DatabseStorage.assets/屏幕截图 2022-10-25 151039.png)