Relational Languages

  • User only needs to specify the answer that they want, not how to compute it.
  • The DBMS is responsible for efficient evaluation of the query.
    • Query optimizer: re-orders operations and generates query plan

Originally SEQUEL from IBM’s System R prototype

  • Structured English Query Language

IBM releases DB2 in 1983

ANSI Standard in 1986. ISO in 1987

  • Structured Query Language

Important: SQL is base on bags (duplicates)

Aggregations + Group By

String / Date / Time Operations

Output Control + Redirection

Nested Queries

Common Table Expressions

Window Functions

Aggregates

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
AVG(col)
MIN(col)
MAX(col)
SUM(col)
COUNT(col)
select count(login) as cnt
from student where login like '%@cs'

// or
select count(*) as cnt
from student whare login like '%@cs'

// or
select count(1) as cnt
from student whare login like '%@cs'
1
2
3
4
select avg(s.gpa), e.cid
from enrolled as e, student as s
where e.sid = s.sid
group by e.cid

Non-aggregated values in SELECT output clause must appear in GROUP BY clause

1
2
3
4
5
6
7
8
9
10
11
12
13
# not aggr'd yet
select avg(s.gpa) as avg_gpa, e.cid
from enrolled as e, student as s
where e.sid = sid
and avg_gpa > 3.9
group by e.cid

# right
select avg(s.gpa) as avg_gpa, e.cid
from enrolled as e, student as s
where e.sid = sid
group by e.cid
having avg_gpa > 3.9

Strings

all strings have to be case sensitive and declared with single quotes ‘’

sqlite: “” is ok

mysql is case insensitive

Like is used for stirng matching.String-matching operators

  • ‘%’ Matches any substring (including empty strings.)
  • ‘_’ Match any one character

SQL-92 defines string functins

  • Many DBMSs also have theri onw unique functions

Can be used ineither output and predicxates

1
2
UBSTRING(name, 0, 5)
UPPER(e.name)

SQL standartd says to use || operator to concatenate two or ,more stirng together.

1
2
select 'an' 'dy ' 'pavlo';
> andy pavlo # only mysql

Date / Time operations

  • operations to manipulate and mudify date / time attributes.
  • Can be used in either output and predicates
  • Spport / syntax varies wildly…
1
2
3
select NOW();  # current timestamp
select current_timestamp();
select current_timestamp;

sqlite: no such function

1
select current_timestamp;  # keyword
1
2
select extract day(day from date(‘2018-08-29'));  # only pavlo
select date('2919-08-29') - date('2018-01-01') as days;
1
2
# mysql
select round((unix_timestamp(date()) - unix_timestamp(date())) / (60 * 60 * 24), 0) as days
1
2
# sqlite
select cast((julianday(current_timestamp) - julianday('2018-01-01')) as int) as days;

!!! but sqlite is the most popular

Photoshop runs sqlite inside

cause there is no copyright on sqlite. But oracle has mysql

Output redirection

1
2
3
4
5
select distinct cid into courseids
from enrolled;

create table courseids (
select distinct cid from enrolled);

Don’t take it out for me. Write it to this table.

Insert tuplesfrom query into another table:

  • Inner select must generatge the same columns as the target table.
  • dbmss have different options / syntax on what to do with duplicates.
1
2
insert into courseids
(select distinct cid from enrolled);

(implements will vary widely)

Output control

1
2
3
4
5
6
7
select sid, grade from enrolled
where cid = '15-721'
order by grade;

select sid from enrolled
whhere cid = '15-721'
order by grade desc, sid asc; # grade not in input but OK

LIMIT [offset]

  • limit the # of tuples returned in output.
  • can set an offset rto return a “range”
1
2
3
4
5
6
7
select sid, name from student
where login like '%@cs'
limit 10;

select sid, name from student
where login like '%@cs'
limit 20 offset 10; # pagination

Nested queries

  • Queries containing other queries
  • They are often difficult to optimize.
  • Inner queries can appear (almost)anywhere in query.
1
2
select name from student where
sid in (select sid from enrolled);

WHERE

  • ALL

  • ANY

  • IN

  • EXISTS

1
2
3
4
5
6
7
8
9
10
select name from student 
whiere sid = any(
select sid from enrolled
where cid = '15-445'
);

select (select s.name from stucent as s
where s.sid = e.sid) as sname
from enrolled as e
where cid = '15-445'
1
2
3
4
5
6
7
8
9
10
11
12
13
select sid, name from student
where sid => all(
select sid from enrolled
);
select sid, name from student
where sid in (
select max(sid) from enrolled
);
select sid, name from student
where sid in (
select sid from enrolled
order by sid desc limit 1
);

find all coursed that has no students enrolled in it.

1
2
3
4
5
select * from course
where not exists(
select * from enrolled
where course.cid = enrolled.cid
);

window functions

  • Performs a calculation across a set of tuples that related to a single row.
  • Like an aggregation but tuples are not grouped into a single output tuples.

Special window functions:

  • row_number() -> # of the current row
  • rank() -> order position of the current row

The over keyword specifies how to group together tuples when computing the window function.

Use partition by to specify group

1
2
3
4
select cid, sid,
row_number() over (partition by sid)
from enrolled
order by cid;

Find the student with the highest grade fro each course.

1
2
3
4
5
6
7
select * from (
select *,
rank() over (partition by cid
order bygrace asc)
as rank
from enrolled) as ranking
where ranking.rank = 1;

postgres is the only one supports.

Common table expressions

  • provides a way to write auxiliary statements for use in a larger query.
    • think of it like a temp table just for one query.

Alternative to nested queries and views.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
with ctename (col1, col2) as (
select 1, 2
)
select col1 + col2 from ctename;

with ctesource (maxid) as (
select max(sid) from enrolled
)
select name from student, ctesource
where student.sid = ctesource.max;


# recursion
with recursive ctesource (counter) as (
(select 1)
union all
(select counter + 1 from ctesource
where counter < 10)
)
select * from ctesource;