Announcements:
- homework due Fri
- return Quiz
- review index question from Fri
Okay, let's get started by populating some data. Copy the following into a sql script.
-- delete the tables if they already exist
drop table if exists events cascade;
drop table if exists venues cascade;
drop table if exists cities cascade;
drop table if exists countries cascade;
drop table if exists logs cascade;
drop extension if exists tablefunc;
-- crate the schema for our tables
create table countries (
country_code char(2) primary key,
country_name text unique
);
create table cities (
name text not null,
postal_code varchar(9) check (postal_code <> ''),
country_code char(2) references countries,
primary key (country_code, postal_code)
);
create table venues (
venue_id serial primary key,
name varchar(255),
street_address text,
type char(7)
check ( type in ('public', 'private') ) default 'public',
postal_code varchar(9),
country_code char(2),
foreign key (country_code, postal_code)
references cities (country_code, postal_code) match full
);
create table events (
event_id serial primary key,
title text,
starts timestamp,
ends timestamp,
venue_id integer references venues(venue_id)
);
-- create indices
create index events_title
on events using btree (title);
create index events_starts
on events using btree (starts);
-- populate some data
insert into countries (country_code, country_name) values
('us', 'United States'),
('mx', 'Mexico'),
('au', 'Australia'),
('gb', 'United Kingdom'),
('de', 'Germany');
insert into cities values
('Portland', '97206', 'us'),
('Bozeman', '59718', 'us');
insert into venues (name, postal_code, country_code) values
('Crystal Ballroom', '97206', 'us'),
('Voodoo Doughnut', '97206', 'us'),
('My Place', '59718', 'us');
insert into events (title, starts, ends, venue_id) values
('LARP Club', '2012-02-15 17:30', '2012-02-15 19:30',
(select venue_id from venues where name = 'Voodoo Doughnut')
),
('April Fools Day', '2012-04-01', '2012-04-01 23:59', null),
('Christmas Day', '2012-12-25', '2012-12-25 23:59', null),
('Moby', '2013-02-06 21:00', '2012-02-06 23:00',
(select venue_id from venues where name = 'Crystal Ballroom')
),
('Wedding', '2012-02-26 21:00', '2012-02-26 23:00',
(select venue_id from venues where name = 'Voodoo Doughnut')
),
('Dinner with Mom', '2012-02-26 18:00', '2012-02-26 20:30',
(select venue_id from venues where name = 'My Place')
),
('Valentine''s Day', '2012-02-14 10:00', '2012-02-14 23:59', null);
Let's run a query to check that we set up the data correctly.
QUESTION: Given events and table description from chapter 1 with data:
insert into venues (name, postal_code, country_code) values
('Crystal Ballroom', '97206', 'us'),
('Voodoo Doughnut', '97206', 'us'),
('My Place', '59718', 'us');
insert into events (title, starts, ends, venue_id) values
('Wedding', '2012-02-26 21:00', '2012-02-26 23:00',
(select venue_id from venues where name = 'Voodoo Doughnut')
),
('Dinner with Mom', '2012-02-26 18:00', '2012-02-26 20:30',
(select venue_id from venues where name = 'My Place')
),
('Valentine''s Day', '2012-02-14 10:00', '2012-02-14 23:59', null);
What is the query for checking that the events data is as expected:
title | starts | ends | venue
-----------------+---------------------+---------------------+-----------------
Wedding | 2012-02-26 21:00:00 | 2012-02-26 23:00:00 | Voodoo Doughnut
Dinner with Mom | 2012-02-26 18:00:00 | 2012-02-26 20:30:00 | My Place
Valentine's Day | 2012-02-14 10:00:00 | 2012-02-14 23:59:00 |
(3 rows)
HINT: Use a left join
Let's say that we had the following table
select
event_id as holiday_id,
title as name,
starts as date
from events
where title like '%Day%' and venue_id is null;
Question: Without running the query, what does it do?
What if we had users that were going to have to run the same query many many times? Just like we have abstractions in imperative language's so that we don't have to type the same code over and over, we can have abstractions in a declarative languages. A view provides a DRY abstraction. More explicitly, some reasons for views are:
- easier for user to access contents of query
- control access, e.g. allow access to data in view without giving access to underlying data.
We can create a "view" of our query with the following:
create view holidays as
select
event_id as holiday_id,
title as name,
starts as date
from events
where title like '%Day%' and venue_id is null;
Observe that the holiday performs the query
select * from holidays;
Views are nice abstractions, but they do not provide performance improvements. A materialized view creates a table and populates the contents and is only updated at refresh. Let's see and example to explore the difference. First, we will create a materialized view:
create materialized view m_holidays as
select
event_id as holiday_id,
title as name,
starts as date
from events
where title like '%Day%' and venue_id is null;
Now that the materialized view is created let's add an event that would be in the holdays view.
insert into events (title, starts, ends, venue_id) values
('Dave''s Birth Day', '2012-05-30 00:00', '2012-04-30 23:59', null);
Question: Without running the query, what do you expect the difference to be the view and the materialized view?
Let's try it out:
select * from holidays;
Observe that holiday has my birthday
But the materialized view of holiday does not
select * from m_holidays;
Let's refresh the materialized view.
refresh materialized view m_holidays;
And now lets look at the materialized view, and we should see my birthday
select * from m_holidays;
Lets add some colors to our events.
alter table events add colors text array;
create or replace view holidays as
select
event_id as holiday_id,
title as name,
starts as date,
colors
from events
where title like '%Day%' and venue_id is null;
Try to update the view.
Question What do you expect will happen?
update holidays set colors = '{"red", "green"}'
where name = 'Christmas Day';
It worked?! In general, you cannot update views, except under a very specific criterion. See section [Updatable Views]((https://www.postgresql.org/docs/9.5/static/sql-createview.html) for the criterion.
- Indexing Techniques (Hash Indexes, B-Trees, etc.)
- Aggregation functions
- Query planner