Course Introduction Relational Model (CMU Databases Systems Fall 2019)
CommentTEXTBOOK: Database System Comcepts 7th Edition
Silberschatz, Korth, & Sudarshan
Database:
- Organized collextion of inter-related data that models some aspect of the real-world
E.g. Create a database that models a digital music store to keep track of artists and albums
Store our database as comma-separated value (CSV) files that we manage in out own code.
- Use a separate file per entity.
- The application has to parse the files each time they want to read / update records
Artist(name, year, country) Album(name, artist ,year)
1 | for line in file |
Flat files: data integrity
- How do we ensure that the artist is the same for each album entry?
- What if somebody overwrites the album year with an invalid string?
- How do we store that there are multiple artists on an album?
- array or string?
- How do you find a particular record?
- three lines pretty fast
- a Billion?
- What if we now want to create a new application that uses the same database?
- What if two threads try to write to the same file at the same time?
- first writes something, and second overwrites
- What if the machine crashes while our program is updating a record?
- What if we want to replicate the database on multiple machines for high availabiility?
DBMS is software that allows applications to store and analyze information in a database.
A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.
- reuse
- no reinventing wheels
There’s no courses bout browsers, but there is course about DB.
So it’s a kind of special software.
Early DBMSs
- Database application were difficult to build and maintain.
- Tight coupling between logical and physical layers.
- You have to (roughly) know what queries your app would execute before you deployed the database.
Ted Cotton: people are wasting time write DBMS again and again.
Database abstraction to avoid this maintenance:
- Store database in simple data structures
- Access data through high-level language.
- Physical storage left up to implementation.
Noway software can never produce query.
Physical storage of data is transparent to applications.
Data Models
- A data model is collecition of concepts for describing the data in a databasde.
- A schema is a description of a particular colection of data, using a given data model.
Data Model
- Relational <- SQL
- Key/Value <- NoSQL
- Graph
- Document
- Column-family
- Array/Matrix <- Machine Learning
- Hierarchical <- Obsolete / Rare
- Network
Relational Mods
- Structure: The definition of relations and their contents.
- Integrity: Ensure the database’s contents satisfy constraints.
- Manipulation: how to access and modify a database’s contents.
A relation is unordered set that contain the relation ship of attributes that represent entities
A tuple is a set of atribute values (also known as its domain) in the relation.
- Values are (normally) atomic / scalar.
- The special value NULL is a member of every domain.
A relation’s primary key uniquely identifies a single typle.
Some DBMSs automatically create an internal primarykley if you don’t define one.
A foreign key specifils that an attrebute from one relation has to map to a tyuple in another relation.
- Album -> Multiple Artists
- a separate cross reference table ArtistAlbum(artist_id, album_id), both are foreign keys.
The relational model is independent of any query language implementation.
SQL is the de facto standard.
1 | SELECT year FROM artists |