This lesson explores join statements in more depth. You do not need to know how to write JOIN statements for this section's independent project. However, it is important to understand how data from multiple tables can be gathered in a single query statement using matching keys on related tables.
Let's look at an example where a sponsor has many athletes. When we have a one-to-many relationship, it's straightforward to select all of the "many" that belongs to the "one." Here's some sample data:
table: athletes id | name | sponsor_id ----+-----------------------+---------------- 1 | Michael Jordan | 1 2 | Maria Sharapova | 1 3 | David Beckham | 2 table: sponsors id | organization ----+----------------- 1 | Nike 2 | Adidas
If we want to get all of the athletes that Nike sponsors, we can write the following SQL statement:
SELECT * FROM athletes WHERE sponsor_id = 1;
There's another way to get this data: using a SQL join statement. A join statement is different from a join table but performs a similar job. It allows us to join information from multiple tables. Here's how it works:
SELECT * FROM sponsors JOIN athletes ON (athletes.sponsor_id = sponsors.id) WHERE sponsors.id = 1;
Let's walk through this statement line by line: We SELECT all ( * ) of the records FROM the sponsors table. We JOIN the athletes table to the sponsors table ON the key that is the same for and links both tables: the sponsorid (which is athletes.sponsorid on the athletes table and sponsors.id on the sponsors table). We limit the JOIN to only those records WHERE the sponsors.id is 1.
So, in plain English, we're joining the
sponsors tables together wherever the
sponsor_id column of the
athletes table is equal to the
id column of the
sponsors table. Then, we're narrowing to just the records where the
sponsors table has an ID of 1.
The join statement is used to simplify queries across multiple tables. Again, it has nothing inherently to do with join tables. But, as we're about to see, you can use it to join together tables that include a join table.
Let's move to a more complex example. The spotlight on some athletes has grown, and now some athletes have many sponsors, and an organization can continue to sponsor many athletes. The data now looks like this:
athletes --------- id | name ----+-------- 1 | Michael Jordan 2 | Maria Sharapova 3 | David Beckham sponsors -------- id | name ----+----------------- 1 | Nike 2 | Adidas endorsements (join table) -------- id | athlete_id | sponsor_id ----+-----------+----------- 1 | 1 | 1 2 | 2 | 1 3 | 1 | 2
If we want to figure out who all of Nike's athletes are without using a join statement, we have to go through two steps. First, we select all of the entries in the join table,
SELECT athlete_id FROM endorsements WHERE sponsor_id = 1;
From this statement, we get the
athlete_ids 1 and 2 back. Now, we have to loop through each of these
athlete_ids and select the corresponding athlete, leading to an n+1 query:
SELECT * FROM athletes WHERE id = 1; SELECT * FROM athletes WHERE id = 2;
To do this more efficiently, we can use a 3-table join:
SELECT athletes.* FROM sponsors JOIN endorsements ON (sponsors.id = endorsements.sponsor_id) JOIN athletes ON (endorsements.athlete_id = athletes.id) WHERE sponsors.id = 1;
We're taking all of the sponsors with
id 1, joining all of the endorsements to that sponsor, and then joining all of the athletes to those endorsements, and finally selecting everything from those athletes.
Let's look at another join statement using a different example.
friends ------- id | name | phone ---+---------+----------- 1 | Mike | 9165551212 2 | Courtney | 3235551212 3 | Jake | 4155551212 cities ------ id | city | state ---+----------+------ 1 | Chicago | IL 2 | New York | NY 3 | Oakland | CA 4 | Portland | OR cities_friends (join table) -------------- id | city_id | friend_id ---+---------+---------- 1 | 1 | 1 2 | 1 | 2 3 | 2 | 3 4 | 2 | 1
If we want to figure out all of the cities that Mike has visited, we can use this join statement:
SELECT cities.* FROM friends JOIN cities_friends ON (friends.id = cities_friends.friend_id) JOIN cities ON (cities_friends.city_id = cities.id) WHERE friends.id = 1;
Sometimes it is easiest to look at join statements backwards.
We want our results to be centered around whatever friend has an
id = 1. So we are asking the database to look at the
friends table and the
cities_friends table together. We want it to return any entries where
friends.id is the same as
cities_friends.friend_id. In our case the
friend_id value in the
cities_friends table is
Just by looking at the
cities_friends table, we can see that there are two entries where
1. Then we are asking the database to look at whatever the
city_id is for those two entries (in this case
2). Then the database will find those same numbers in the
cities table (Chicago and New York) and return all (*) the information about those cities. In this case, that means the name of the city and the name of the state it's in.
Think of it like slowly filtering down results of our queries. We start with a lot of database entries, we filter the join table first for the correct friend, then we look for the IDs of the other table that match with the result of our first filtering. Then we ask the database to return everything about the entries on the other table that matched.
Lesson 23 of 29
Last updated August 7, 2022