Doctrine, QueryBuilder::setMaxResults() and collections associations are on a boat


tip

This blog post is associated with some code that you can find here.

Today, we gonna talk about an issue with DBMS, Doctrine, QueryBuilder::setMaxResults() and collections associations. I've faced this issue while trying to optimize my code to reduce the number of SQL requests after identifying a N+1 query issue.

Some context...

I will try to simplify as much as possible, I can't leak code from my job, but imagine we have 2 entities:

  1. Pet, which contains 3 fields id, pictures (contains 0 or more Picture instances) and createdAt:
src/Core/Entity/Pet.php
 1<?php
 2
 3namespace App\Core\Entity;
 4
 5use Doctrine\Common\Collections\ArrayCollection;
 6use Doctrine\Common\Collections\Collection;
 7use Doctrine\ORM\Mapping as ORM;
 8
 9/**
10 * @ORM\Entity()
11 * @ORM\Table(name="adoption_pet")
12 */
13class Pet
14{
15    /**
16     * @ORM\Id
17     * @ORM\Column(type="integer")
18     * @ORM\GeneratedValue(strategy="AUTO")
19     */
20    protected ?int $id = null;
21
22    /**
23     * @ORM\OneToMany(targetEntity="Picture", mappedBy="pet", cascade={"all"}, orphanRemoval=true)
24     */
25    protected Collection $pictures;
26
27    /**
28     * @ORM\Column(type="datetime_immutable", name="created_at")
29     */
30    protected \DateTimeImmutable $createdAt;
31
32    public function __construct()
33    {
34        $this->pictures = new ArrayCollection();
35    }
36
37    /**
38     * @return int|null
39     */
40    public function getId(): ?int
41    {
42        return $this->id;
43    }
44
45    public function getPictures(): Collection
46    {
47        return $this->pictures;
48    }
49
50    public function addPicture(Picture $picture): void
51    {
52        if (!$this->pictures->contains($picture)) {
53            $this->pictures->add($picture);
54            $picture->setPet($this);
55        }
56    }
57
58    public function setCreatedAt(\DateTimeImmutable $createdAt): void
59    {
60        $this->createdAt = $createdAt;
61    }
62}
  1. Picture, which contains 2 fields id and pet (linked to 1 Pet):
src/Core/Entity/Picture.php
 1<?php
 2
 3namespace App\Core\Entity;
 4
 5use Doctrine\ORM\Mapping as ORM;
 6
 7/**
 8 * @ORM\Entity()
 9 * @ORM\Table(name="adoption_pet_picture")
10 */
11class Picture
12{
13    /**
14     * @ORM\Id
15     * @ORM\Column(type="integer")
16     * @ORM\GeneratedValue(strategy="AUTO")
17     */
18    protected ?int $id = null;
19
20    /**
21     * @ORM\ManyToOne(targetEntity="Pet", inversedBy="pictures")
22     * @ORM\JoinColumn(onDelete="CASCADE")
23     */
24    protected ?Pet $pet = null;
25
26    // ...
27
28    public function getId(): ?int
29    {
30        return $this->id;
31    }
32
33    public function setPet(?Pet $pet): void
34    {
35        $this->pet = $pet;
36    }
37
38    public function getPet(): ?Pet
39    {
40        return $this->pet;
41    }
42}

And we have a seeded database like this;

  • the 1st Pet has 4 Picture,
  • the 2nd Pet has 2 Picture,
  • the 3rd and last Pet has 0 Picture:
 1$entityManager = ...;
 2
 3$pet1 = new Pet();
 4$pet1->setCreatedAt(new \DateTimeImmutable('+1 min'));
 5$pet1->addPicture($picture1 = new Picture());
 6$pet1->addPicture($picture2 = new Picture());
 7$pet1->addPicture($picture3 = new Picture());
 8$pet1->addPicture($picture4 = new Picture());
 9$entityManager->persist($pet1);
10
11$pet2 = new Pet();
12$pet2->setCreatedAt(new \DateTimeImmutable('+2 min'));
13$pet2->addPicture($picture1 = new Picture());
14$pet2->addPicture($picture2 = new Picture());
15$entityManager->persist($pet2);
16
17$pet3 = new Pet();
18$pet3->setCreatedAt(new \DateTimeImmutable('+3 min'));
19$entityManager->persist($pet3);
20
21$entityManager->flush();

We are ready!

So what's the issue?

If we want to fetch the 1st created entity Pet from the database, we can do something like this:

 1$qb = $entityManager->createQueryBuilder()
 2    ->select('pet')
 3    ->from(Pet::class, 'pet')
 4    ->addOrderBy('pet.createdAt', 'asc')
 5    ->setMaxResults(1);
 6
 7$pet = $qb->getQuery()->getSingleResult();
 8
 9echo $pet->getId(); // outputs "1"
10echo count($pet->getPictures()); // outputs "4"

$pet is an instance of Pet, $pet->getPictures() returns 4 Picture, everything is fine... except we have a N+1 query issue when using $pet->getPictures(), because we didn't join the field pictures.

So let's join pictures and select it aswell:

 1$qb = $entityManager->createQueryBuilder()
 2    ->select('pet')
 3    ->addSelect('pictures')
 4    ->from(Pet::class, 'pet')
 5    ->leftJoin('pet.pictures', 'pictures')
 6    ->addOrderBy('pet.createdAt', 'asc')
 7    ->setMaxResults(1);
 8
 9$pet = $qb->getQuery()->getSingleResult();
10
11echo $pet->getId(); // outputs "1"
12echo count($pet->getPictures()); // outputs "1"... wait what?

$pet is still as instance of Pet, but $pet->getPictures() returns only 1 Picture instead of 4.

Why?

This is easily understandable if we take a look to the SQL query generated by Doctrine:

1SELECT
2       a0_.id AS id_0,
3       a0_.created_at AS created_at_1,
4       a1_.id AS id_2,
5       a1_.pet_id AS pet_id_3
6FROM adoption_pet a0_
7    LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
8ORDER BY a0_.created_at ASC
9LIMIT 1

The issue comes from the LIMIT 1 clause. It allows us to fetch only 1 Pet, but is also half-initialize our Pictures:

id_0 created_at_1 id_2 pet_id_3
1 2022-01-08 09:31:48 1 1

This is documented on Doctrine website, but I wasn't aware at all about this behaviour:

"If your query contains a fetch-joined collection specifying the result limit methods are not working as you would expect. Set Max Results restricts the number of database result rows, however in the case of fetch-joined collections one root entity might appear in many rows, effectively hydrating less than the specified number of results."

First and Max Result Items (DQL Query Only):

How to avoid that?

The Doctrine's Paginator

Doctrine provides a class Paginator which allows to use ->setMaxResults() and collections associations:

 1$qb = $entityManager->createQueryBuilder()
 2    ->select('pet')
 3    ->addSelect('pictures')
 4    ->from(Pet::class, 'pet')
 5    ->leftJoin('pet.pictures', 'pictures')
 6    ->addOrderBy('pet.createdAt', 'asc')
 7    ->setMaxResults(1);
 8
 9$paginator = new \Doctrine\ORM\Tools\Pagination\Paginator($qb->getQuery(), $fetchJoinCollection = true);
10$pet = $paginator->getIterator()->current();
11
12echo $pet->getId(); // outputs "1"
13echo count($pet->getPictures()); // outputs "4"

Internally, the Paginator clone your Query and use it for two SQL requests:

  1. Fetch a or multiple identifier(s) (if you used ->setMaxResults(>1)):
1SELECT DISTINCT id_0
2FROM (SELECT DISTINCT id_0, created_at_1
3      FROM (SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
4            FROM adoption_pet a0_
5                     LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id) dctrn_result_inner
6      ORDER BY created_at_1 ASC) dctrn_result
7LIMIT 1;

Which returns:

id_0
1
  1. And then, it uses this result to nicely fetch your entities with their collections associations:
1SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
2FROM adoption_pet a0_
3         LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
4WHERE a0_.id IN (?)
5ORDER BY a0_.created_at ASC

Which returns 4 rows (because our 1st Pet has 4 Picture):

id_0 created_at_1 id_2 pet_id_3
1 2022-01-08 09:31:48 1 1
1 2022-01-08 09:31:48 2 1
1 2022-01-08 09:31:48 3 1
1 2022-01-08 09:31:48 4 1

Keep the control

If you don't want to use the Doctrine's Paginator to keep the control (or if you don't find logical to use a paginator to fetch only 1 entity), then you can use 2 separate SQL queries:

 1$qb = $entityManager->createQueryBuilder()
 2    ->select('pet')
 3    ->addSelect('pictures')
 4    ->from(Pet::class, 'pet')
 5    ->leftJoin('pet.pictures', 'pictures')
 6    ->addOrderBy('pet.createdAt', 'asc');
 7    // ^ we removed the usage of `->setMaxResults(1)` here
 8
 9// Clone the previous QueryBuilder, we only select the identifier
10$qbId = (clone $qb)
11    ->select('pet.id')
12    ->setMaxResults(1); // select 1 result
13
14// Then we use the previously fetched identifier
15$pet = $qb
16    ->andWhere('pet.id = :id')
17    ->setParameter('id', $qbId->getQuery()->getSingleScalarResult())
18    ->getQuery()
19    ->getSingleResult();
20
21echo $pet->getId(); // outputs "1"
22echo count($pet->getPictures()); // outputs "4"

And if you need to fetch multiple results:

 1$qb = $entityManager->createQueryBuilder()
 2    ->select('pet')
 3    ->addSelect('pictures')
 4    ->from(Pet::class, 'pet')
 5    ->leftJoin('pet.pictures', 'pictures')
 6    ->addOrderBy('pet.createdAt', 'asc');
 7    // ^ we removed the usage of `->setMaxResults(1)` here
 8
 9// Clone the previous QueryBuilder, we only select the identifiers
10$qbId = (clone $qb)
11    ->select('pet.id')
12    ->setMaxResults(2); // select 2 results
13
14// Then we use the previously fetched identifiers
15$pets = $qb
16    ->andWhere('pet.id IN (:id)')
17    ->setParameter('id', array_column($qbId->getQuery()->getScalarResult(), 'id'));
18    ->getQuery()
19    ->getResult();
20
21echo $pets[0]->getId(); // outputs "1"
22echo count($pets[0]->getPictures()); // outputs "4"
23
24echo $pets[1]->getId(); // outputs "2"
25echo count($pets[1]->getPictures()); // outputs "2"

tip

Even if it looks like a good idea to keep the control, @Ocramius recommends to use the Doctrine's Paginator: