Lesson Monday

When you have a one-to-many relationship - say, a circus trainer has many animals - it's straightforward to select all of a trainer's animals. Here's some sample data:

`````` id | animals | trainer_id
----+---------+-----------
1 | tiger   | 1
2 | lion    | 1
3 | gorilla | 2

id |     name
----+-----------------
1 | harry houdini
2 | missy mysterious
``````

If we want to get all of Harry's animals, we simply:

``````SELECT * FROM animals WHERE trainer_id = 1;
``````

There's another way to get this data: with a join statement. A join statement is totally different from a join table - don't get them confused! A join table holds data about the relationship between two tables, a join statement queries the join table and other tables to select relationship specific data. Here's how it works:

``````SELECT * FROM
animals JOIN trainers ON (animals.trainer_id = trainers.id)
WHERE trainers.id = 1;
``````

This honestly is more confusing than the first way, but I want you to take a moment to understand it so that when we get to a more complex example, you have this as a foundation. We're joining the `animals` and `trainers` tables together wherever the `trainer_id` column of the `animals` table is equal to the `id` column of the `trainers` table. Then, we're narrowing down to just the records where the `trainers` table has an ID of 1.

The join statement is used to simplify queries across multiple tables. Again, it has nothing inherently to do with join tables. But, as we're about to see, you can use it to join together tables that include a join table.

So, let's move to a more meaningful example. The circus has grown, and an animal can have many trainers, and a trainer can work with many animals. The data now looks like this:

``````animals
---------

id | species
----+--------
1 | tiger
2 | lion
3 | gorilla

trainers
--------

id |     name
----+-----------------
1 | harry houdini
2 | missy mysterious

lessons (join table)
--------

id | animal_id | trainer_id
----+-----------+-----------
1 |         1 |          1
2 |         2 |          1
3 |         1 |          2
``````

If we want to figure out who all of Harry's animals are without using a join statement, we have to go through two steps. First, we select all of the entries in the join table, `lessons`:

``````SELECT animal_id FROM lessons WHERE trainer_id = 1;
``````

From this statement, we get the `animal_id`s 1 and 2 back. Now, we have to loop through each of these `animal_id`s and select the corresponding animal:

``````SELECT * FROM animals WHERE id = 1;
SELECT * FROM animals WHERE id = 2;
``````

To do this more efficiently, we can use a 3-table join:

``````SELECT animals.* FROM
trainers JOIN lessons ON (trainers.id = lessons.trainer_id)
JOIN animals ON (lessons.animal_id = animals.id)
WHERE trainers.id = 1;
``````

Take a moment to digest this. It might be helpful to picture the tables left-to-right:

``````     trainers                         lessons                  animals
--------                         --------                 --------

id |     name              id | animal_id | trainer_id      id | species
----+------------------    ----+-----------+------------    ----+--------
1 | harry houdini          1 |         1 |          1       1 | tiger
2 | missy mysterious       2 |         2 |          1       2 | lion
3 |         1 |          2       3 | gorilla
``````

We're taking all of the trainers with ID 1, then joining all of the lessons to that trainer, and then joining all of the animals to those lessons, and finally selecting everything from those animals.

Let's look at another join statement using an example we have seen before.

``````friends
-------

id | name    | phone
---+---------+-----------
1  | Chris   | 9165551212
2  | Jessica | 3235551212
3  | Jake    | 4155551212

cities
------

id | city     | state
---+----------+------
1  | Chicago  | IL
2  | New York | NY
3  | Oakland  | CA
4  | Portland | OR

cities_friends (join table)
--------------

id | city_id | friend_id
---+---------+----------
1  | 1       | 1
2  | 1       | 2
3  | 2       | 3
4  | 2       | 1
``````

If we want to figure out all of the cities that Chris has visited, we can use this join statement:

``````SELECT cities.* FROM
friends JOIN cities_friends ON (friends.id = cities_friends.friend_id)
JOIN cities ON (cities_friends.city_id = cities.id)
WHERE friends.id = 1;
``````

Sometimes it is easiest to look at join statements backwards. We want our results to be centered around whatever friend has an `id = 1`. So we are asking the database to look at the `friends` table and the `cities_friends` table together and return any entries where `friends.id` is the same as `cities_friends.friend_id`, or in our case, where the `friend_id` value in the `cities_friends` table is `1`. Just by looking at the `cities_friends` table, we can see that there are two entries where `friend_id` is `1`. Then we are asking the database to use those two entries, look at whatever the `city_id` is for those two entries (in this case `1` and `2`), and then find those same numbers in the `cities` table (Chicago and New York) and return all (*) information about those cities.

I know that this can be a little confusing, but you can think of it like slowly filtering down results of your queries. You start with a lot of database entries, you filter the join table first for the the correct friend, then you look for the IDs of the other table that match with the result of your first filtering. Then you ask the database to return everything about the entries on the other table that matched.

Here are some blogs for you to look at that could make the concept of join statements a little easier to understand: