Database Storage I (CMU Database Systems Fall 2019)
Query Planning
Operator Execution
Access Methods
Buffer Pool Manager
Disk Manager
The DBMS assumes that the primary storage location of the database is on non-volatile disk.
The DBMS’s components manage the movement of data between non-volatile and volatile storage.
Storage hierarchy
- CPU registers
- CPU caches
- DRAM // Volatile: needs constant power
- SSD // Non-Volatile
- HDD
- network storage
Intel Optane : Non-Volatile Memory
- Is the future
- not widely available
System Design Goals
Allow the DBMS to manage databases that exceed the amount of memory available.
Reading from and writing to disk is expensive, so it must be managed carefully to avoid large stalls and performance degradation.
Why not use the OS?
- One can use memory mapping (mmap) to map the contents of a file into a process's address space
- The OS is responsible for moving the file's pages in and out of memory
What if we allow multiple threads to access the mmap files to hide page fault stalls?
This works well enough for read-only access. It gets complicated when there are multiple writers.
There are some solutions to this problem:
madvise: tell the OS how you expect to read certain pages
mlock: tell the OS that memory ranges cannot be paged out
msync: tell the OS to flush memory ranges out to disk
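The calls listed above can be tried out with a minimal sketch like the following. It uses Python's standard `mmap` module, where `mmap.flush()` plays the role of msync and `mmap.madvise()` (available on some platforms only, hence the guard) wraps madvise. As far as I know the `mmap` module does not expose mlock, so it is left out. The function name `demo_mmap` and the file layout are assumptions made for illustration.

```python
import mmap
import os
import tempfile

PAGE_SIZE = mmap.PAGESIZE  # OS page size; flush offsets must align to it


def demo_mmap(path, num_pages=4):
    """Map a file into memory, write through the mapping, flush it, re-read it."""
    # Create a file made of num_pages zero-filled pages.
    with open(path, "wb") as f:
        f.write(bytes(PAGE_SIZE * num_pages))

    with open(path, "r+b") as f:
        # mmap: the file's contents now appear in this process's address space.
        with mmap.mmap(f.fileno(), 0) as mm:
            # madvise: hint the expected access pattern to the OS.
            # Guarded because availability depends on platform and Python version.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)

            # This write only dirties the page in memory.
            mm[0:5] = b"hello"

            # msync equivalent: ask the OS to write the first page back to disk.
            mm.flush(0, PAGE_SIZE)

    # Read the bytes back through ordinary file I/O.
    with open(path, "rb") as f:
        return f.read(5)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_mmap(os.path.join(d, "demo.db")))
```

Note that the program decides only when to call flush. Which pages are resident, and when they get evicted, stays up to the OS, which is the loss of control these notes are concerned with.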
The DBMS (almost) always wants to control things itself and can do a better job at it:
-> flushing dirty pages to disk in the correct order
-> specialized prefetching
-> Buffer replacement policy
-> thread/process scheduling
The OS is NOT your friend
Problem #1: How the DBMS represents the database in files on disk. <– this lecture
Problem #2: How the DBMS manages its memory and moves data back and forth from disk.
TODAY’S AGENDA
File Storage
Page Layout
Tuple Layout
The DBMS stores a database as one or more files on disk.
- The OS doesn’t know anything about the contents of these files.
Different DBMSs manage pages in files on disk in different ways:
- Heap file organization
A heap file is an unordered collection of pages where tuples are stored in random order.
- Create/Get/Write/Delete Page
- Must also support iterating over all pages
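The heap file operations above (Create/Get/Write/Delete Page plus iteration) could be sketched roughly as follows. This is an illustration, not how any particular DBMS does it: the class name `HeapFile`, the 4096-byte `PAGE_SIZE`, and the in-memory set of deleted page ids are all assumptions. A real system would persist its free-page information.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed fixed page size in bytes


class HeapFile:
    """Unordered collection of fixed-size pages stored in a single file."""

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            open(path, "wb").close()
        # Simplification: deleted page ids live only in memory.
        self.deleted = set()

    def num_pages(self):
        return os.path.getsize(self.path) // PAGE_SIZE

    def _check(self, page_id):
        if not 0 <= page_id < self.num_pages() or page_id in self.deleted:
            raise KeyError(f"no such page: {page_id}")

    def create_page(self):
        """Append a zero-filled page and return its page id."""
        page_id = self.num_pages()
        with open(self.path, "ab") as f:
            f.write(bytes(PAGE_SIZE))
        return page_id

    def get_page(self, page_id):
        self._check(page_id)
        with open(self.path, "rb") as f:
            f.seek(page_id * PAGE_SIZE)  # page id -> byte offset
            return f.read(PAGE_SIZE)

    def write_page(self, page_id, data):
        self._check(page_id)
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page long")
        with open(self.path, "r+b") as f:
            f.seek(page_id * PAGE_SIZE)
            f.write(data)

    def delete_page(self, page_id):
        self._check(page_id)
        self.deleted.add(page_id)

    def __iter__(self):
        """Iterate over (page_id, bytes) for every live page."""
        for page_id in range(self.num_pages()):
            if page_id not in self.deleted:
                yield page_id, self.get_page(page_id)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        hf = HeapFile(os.path.join(d, "heap.db"))
        pid = hf.create_page()
        hf.write_page(pid, b"a" * PAGE_SIZE)
        print([page_id for page_id, _ in hf])
```

Because pages are fixed-size, a page id maps to a byte offset by simple multiplication. Finding a page that still has free space is another matter, and it is what the metadata structures below address.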
Need meta-data to keep track of what pages exist and which ones have free space.
- Linked List
- Page Directory
The DBMS maintains special pages that track the location of data pages in the database files.
The directory also records the number of free slots per page.
The DBMS has to make sure that the directory pages are in sync with the data pages.
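A toy version of the page directory's free-slot bookkeeping might look like the sketch below. It is kept in memory for brevity, whereas the directory described above lives in special pages on disk. The names `PageDirectory`, `SLOTS_PER_PAGE`, `on_insert`, and `on_delete` are hypothetical.

```python
from typing import Dict, Optional

SLOTS_PER_PAGE = 4  # hypothetical number of tuple slots per data page


class PageDirectory:
    """Tracks which data pages exist and how many free slots each one has."""

    def __init__(self):
        self.free_slots: Dict[int, int] = {}  # page_id -> free slot count

    def register_page(self, page_id: int) -> None:
        """Record a newly created, empty data page."""
        self.free_slots[page_id] = SLOTS_PER_PAGE

    def find_page_with_space(self) -> Optional[int]:
        """Return some page id with at least one free slot, or None."""
        for page_id, free in self.free_slots.items():
            if free > 0:
                return page_id
        return None

    # The two hooks below must run whenever a data page changes;
    # otherwise the directory drifts out of sync with the data pages.
    def on_insert(self, page_id: int) -> None:
        if self.free_slots[page_id] == 0:
            raise ValueError("page is full")
        self.free_slots[page_id] -= 1

    def on_delete(self, page_id: int) -> None:
        if self.free_slots[page_id] == SLOTS_PER_PAGE:
            raise ValueError("page is already empty")
        self.free_slots[page_id] += 1


if __name__ == "__main__":
    directory = PageDirectory()
    directory.register_page(0)
    directory.on_insert(0)
    print(directory.free_slots)
```

With the directory, finding a page that has room needs no scan over the data pages themselves. The cost is the extra bookkeeping on every insert and delete.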
Page Layout
![](20221024-DatabseStorage.assets/屏幕截图 2022-10-25 150909.png)
Tuple Layout
![](20221024-DatabseStorage.assets/屏幕截图 2022-10-25 151039.png)