TEXTBOOK: Database System Comcepts 7th Edition

Silberschatz, Korth, & Sudarshan

Database:

  • Organized collextion of inter-related data that models some aspect of the real-world

E.g. Create a database that models a digital music store to keep track of artists and albums

Store our database as comma-separated value (CSV) files that we manage in out own code.

  • Use a separate file per entity.
  • The application has to parse the files each time they want to read / update records

Artist(name, year, country) Album(name, artist ,year)

1
2
3
4
for line in file
record = parse(line)
if "Ice Cube" == record[0]
print int(record(1))

Flat files: data integrity

  • How do we ensure that the artist is the same for each album entry?
  • What if somebody overwrites the album year with an invalid string?
  • How do we store that there are multiple artists on an album?
    • array or string?
  • How do you find a particular record?
    • three lines pretty fast
    • a Billion?
  • What if we now want to create a new application that uses the same database?
  • What if two threads try to write to the same file at the same time?
    • first writes something, and second overwrites
  • What if the machine crashes while our program is updating a record?
  • What if we want to replicate the database on multiple machines for high availabiility?

DBMS is software that allows applications to store and analyze information in a database.

A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.

  • reuse
  • no reinventing wheels

There’s no courses bout browsers, but there is course about DB.

So it’s a kind of special software.

Early DBMSs

  • Database application were difficult to build and maintain.
  • Tight coupling between logical and physical layers.
  • You have to (roughly) know what queries your app would execute before you deployed the database.

Ted Cotton: people are wasting time write DBMS again and again.

Database abstraction to avoid this maintenance:

  • Store database in simple data structures
  • Access data through high-level language.
  • Physical storage left up to implementation.

Noway software can never produce query.

Physical storage of data is transparent to applications.

Data Models

  • A data model is collecition of concepts for describing the data in a databasde.
  • A schema is a description of a particular colection of data, using a given data model.

Data Model

  • Relational <- SQL
  • Key/Value <- NoSQL
  • Graph
  • Document
  • Column-family
  • Array/Matrix <- Machine Learning
  • Hierarchical <- Obsolete / Rare
  • Network

Relational Mods

  • Structure: The definition of relations and their contents.
  • Integrity: Ensure the database’s contents satisfy constraints.
  • Manipulation: how to access and modify a database’s contents.

A relation is unordered set that contain the relation ship of attributes that represent entities

A tuple is a set of atribute values (also known as its domain) in the relation.

  • Values are (normally) atomic / scalar.
  • The special value NULL is a member of every domain.

A relation’s primary key uniquely identifies a single typle.

Some DBMSs automatically create an internal primarykley if you don’t define one.

A foreign key specifils that an attrebute from one relation has to map to a tyuple in another relation.

  • Album -> Multiple Artists
  • a separate cross reference table ArtistAlbum(artist_id, album_id), both are foreign keys.

The relational model is independent of any query language implementation.

SQL is the de facto standard.

1
2
SELECT year FROM artists
WHERE name = "Ice Cuve";