Lesson Wednesday

Not all queries are created equal. Some are faster and some are painfully slow. n+1 queries are the worst of all. We briefly covered n+1 queries in the last course section. Let's take a closer look at what they are and how we can avoid them.

n+1 Queries

Let's say we have an HR person that wants to be able to see the last five tasks a worker has completed. We did a similar query in the lesson on database scopes. Here's the query in the Rails console:

tasks = Task.order(created_at: :desc).limit(5)

The console returns:

Task Load (3.7ms)  SELECT  "tasks".* FROM "tasks"  ORDER BY "tasks"."created_at" DESC LIMIT 5

Now let's return the name of the lists associated with the tasks. We'll add this code in the console:

tasks.each do |task|

Here's what the console returns:

  Worker Load (0.3ms)  SELECT  "workers".* FROM "workers" WHERE "workers"."id" = $1 LIMIT 1  [["id", 97]]
  Worker Load (0.2ms)  SELECT  "workers".* FROM "workers" WHERE "workers"."id" = $1 LIMIT 1  [["id", 61]]
  Worker Load (0.2ms)  SELECT  "workers".* FROM "workers" WHERE "workers"."id" = $1 LIMIT 1  [["id", 15]]
  Worker Load (0.2ms)  SELECT  "workers".* FROM "workers" WHERE "workers"."id" = $1 LIMIT 1  [["id", 22]]
  Worker Load (0.2ms)  SELECT  "workers".* FROM "workers" WHERE "workers"."id" = $1 LIMIT 1  [["id", 51]]

We just made an n+1 query.

With the code above, we query the database to retrieve 5 tasks. We then run a loop that queries the database another 5 times in order to retrieve each task.list.name.

In other words, we query the database 1 time to retrieve n tasks, then we make n queries to retrieve the lists associated with those tasks. That adds up to n+1 queries.

Each call to the database has its own overhead, so making n+1 queries instead of 1 query will take much longer. This may not be noticeable with smaller applications but it quickly adds up with larger applications.

How can we fix this?

More Efficient Queries

For the example above, we could do the following:

tasks = Task.order(created_at: :desc).includes(:worker).limit(5)

Normally Active Record will lazy load records, which means it only loads the records that are needed. This is often a good thing. Why load extra records, increasing the overhead time of the query? However, sometimes we want those extra records.

In this case, we want to eager load the associated workers, making them available to our application immediately. This is exactly what includes does: it fetches the most recent tasks and the associated workers with the minimum required number of queries.

Note that our query includes(:worker) because tasks belong to a worker; if a task had many workers, we'd use (:workers) instead.

Here's what the console returns when we use includes:

Task Load (2.2ms)  SELECT  "tasks".* FROM "tasks"  ORDER BY "tasks"."created_at" DESC LIMIT 5
Worker Load (0.4ms)  SELECT "workers".* FROM "workers" WHERE "workers"."id" IN (97, 61, 15, 22, 51)

The first query returns tasks while the second returns the associated workers. When we run our loop to retrieve worker names, no further queries are necessary because we've already queried the database to retrieve the associated workers.

It may not seem like a big deal when we're just retrieving the five tasks and their associated workers, but what if we wanted to generate the worker names of all unfinished tasks? In this seeded database, there are 6,000 unfinished tasks. We could make 6,001 queries... or we could make just two. That's a huge difference in overhead.

You aren't expected to be an expert on n+1 queries at this point and you won't need to avoid n+1 queries for this section's code review. However, we should always think carefully when you query the database. Being aware of (and avoiding) n+1 queries will also help us gain a better understanding of making queries in general. There's even a gem called Bullet that can help us avoid n+1 queries. Bullet will automatically notify us of any n+1 queries in our applications.

Lesson 31 of 34
Last updated July 14, 2022