Advanced SQL (CMU Databases Systems Fall 2019)
CommentRelational Languages
- User only needs to specify the answer that they want, not how to compute it.
- The DBMS is responsible for efficient evaluation of the query.
- Query optimizer: re-orders operations and generates query plan
Originally SEQUEL from IBM’s System R prototype
- Structured English Query Language
IBM releases DB2 in 1983
ANSI Standard in 1986. ISO in 1987
- Structured Query Language
Important: SQL is base on bags (duplicates)
Aggregations + Group By
String / Date / Time Operations
Output Control + Redirection
Nested Queries
Common Table Expressions
Window Functions
Aggregates
1 | AVG(col) |
1 | select avg(s.gpa), e.cid |
Non-aggregated values in SELECT output clause must appear in GROUP BY clause
1 | # not aggr'd yet |
Strings
all strings have to be case sensitive and declared with single quotes ‘’
sqlite: “” is ok
mysql is case insensitive
Like is used for stirng matching.String-matching operators
- ‘%’ Matches any substring (including empty strings.)
- ‘_’ Match any one character
SQL-92 defines string functins
- Many DBMSs also have theri onw unique functions
Can be used ineither output and predicxates
1 | UBSTRING(name, 0, 5) |
SQL standartd says to use || operator to concatenate two or ,more stirng together.
1 | select 'an' 'dy ' 'pavlo'; |
Date / Time operations
- operations to manipulate and mudify date / time attributes.
- Can be used in either output and predicates
- Spport / syntax varies wildly…
1 | select NOW(); # current timestamp |
sqlite: no such function
1 | select current_timestamp; # keyword |
1 | select extract day(day from date(‘2018-08-29')); # only pavlo |
1 | # mysql |
1 | # sqlite |
!!! but sqlite is the most popular
Photoshop runs sqlite inside
cause there is no copyright on sqlite. But oracle has mysql
Output redirection
1 | select distinct cid into courseids |
Don’t take it out for me. Write it to this table.
Insert tuplesfrom query into another table:
- Inner select must generatge the same columns as the target table.
- dbmss have different options / syntax on what to do with duplicates.
1 | insert into courseids |
(implements will vary widely)
Output control
1 | select sid, grade from enrolled |
LIMIT
- limit the # of tuples returned in output.
- can set an offset rto return a “range”
1 | select sid, name from student |
Nested queries
- Queries containing other queries
- They are often difficult to optimize.
- Inner queries can appear (almost)anywhere in query.
1 | select name from student where |
WHERE
ALL
ANY
IN
EXISTS
1 | select name from student |
1 | select sid, name from student |
find all coursed that has no students enrolled in it.
1 | select * from course |
window functions
- Performs a calculation across a set of tuples that related to a single row.
- Like an aggregation but tuples are not grouped into a single output tuples.
Special window functions:
- row_number() -> # of the current row
- rank() -> order position of the current row
The over keyword specifies how to group together tuples when computing the window function.
Use partition by to specify group
1 | select cid, sid, |
Find the student with the highest grade fro each course.
1 | select * from ( |
postgres is the only one supports.
Common table expressions
- provides a way to write auxiliary statements for use in a larger query.
- think of it like a temp table just for one query.
Alternative to nested queries and views.
1 | with ctename (col1, col2) as ( |