Doctrine, QueryBuilder::setMaxResults() and collections associations are on a boat
tip
This blog post is associated with some code that you can find here.
Today, we gonna talk about an issue with DBMS, Doctrine, QueryBuilder::setMaxResults()
and collections associations.
I've faced this issue while trying to optimize my code to reduce the number of SQL requests after identifying a N+1 query issue.
Some context...
I will try to simplify as much as possible, I can't leak code from my job, but imagine we have 2 entities:
-
Pet
, which contains 3 fieldsid
,pictures
(contains 0 or morePicture
instances) andcreatedAt
:
1// src/Core/Entity/Pet.php 2<?php 3 4namespace App\Core\Entity; 5 6use Doctrine\Common\Collections\ArrayCollection; 7use Doctrine\Common\Collections\Collection; 8use Doctrine\ORM\Mapping as ORM; 9 10/** 11 * @ORM\Entity() 12 * @ORM\Table(name="adoption_pet") 13 */ 14class Pet 15{ 16 /** 17 * @ORM\Id 18 * @ORM\Column(type="integer") 19 * @ORM\GeneratedValue(strategy="AUTO") 20 */ 21 protected ?int $id = null; 22 23 /** 24 * @ORM\OneToMany(targetEntity="Picture", mappedBy="pet", cascade={"all"}, orphanRemoval=true) 25 */ 26 protected Collection $pictures; 27 28 /** 29 * @ORM\Column(type="datetime_immutable", name="created_at") 30 */ 31 protected \DateTimeImmutable $createdAt; 32 33 public function __construct() 34 { 35 $this->pictures = new ArrayCollection(); 36 } 37 38 /** 39 * @return int|null 40 */ 41 public function getId(): ?int 42 { 43 return $this->id; 44 } 45 46 public function getPictures(): Collection 47 { 48 return $this->pictures; 49 } 50 51 public function addPicture(Picture $picture): void 52 { 53 if (!$this->pictures->contains($picture)) { 54 $this->pictures->add($picture); 55 $picture->setPet($this); 56 } 57 } 58 59 public function setCreatedAt(\DateTimeImmutable $createdAt): void 60 { 61 $this->createdAt = $createdAt; 62 } 63}
-
Picture
, which contains 2 fieldsid
andpet
(linked to 1Pet
):
1// src/Core/Entity/Picture.php 2<?php 3 4namespace App\Core\Entity; 5 6use Doctrine\ORM\Mapping as ORM; 7 8/** 9 * @ORM\Entity() 10 * @ORM\Table(name="adoption_pet_picture") 11 */ 12class Picture 13{ 14 /** 15 * @ORM\Id 16 * @ORM\Column(type="integer") 17 * @ORM\GeneratedValue(strategy="AUTO") 18 */ 19 protected ?int $id = null; 20 21 /** 22 * @ORM\ManyToOne(targetEntity="Pet", inversedBy="pictures") 23 * @ORM\JoinColumn(onDelete="CASCADE") 24 */ 25 protected ?Pet $pet = null; 26 27 // ... 28 29 public function getId(): ?int 30 { 31 return $this->id; 32 } 33 34 public function setPet(?Pet $pet): void 35 { 36 $this->pet = $pet; 37 } 38 39 public function getPet(): ?Pet 40 { 41 return $this->pet; 42 } 43}
And we have a seeded database like this;
- the 1st
Pet
has 4Picture
, - the 2nd
Pet
has 2Picture
, - the 3rd and last
Pet
has 0Picture
:
1$entityManager = ...; 2 3$pet1 = new Pet(); 4$pet1->setCreatedAt(new \DateTimeImmutable('+1 min')); 5$pet1->addPicture($picture1 = new Picture()); 6$pet1->addPicture($picture2 = new Picture()); 7$pet1->addPicture($picture3 = new Picture()); 8$pet1->addPicture($picture4 = new Picture()); 9$entityManager->persist($pet1); 10 11$pet2 = new Pet(); 12$pet2->setCreatedAt(new \DateTimeImmutable('+2 min')); 13$pet2->addPicture($picture1 = new Picture()); 14$pet2->addPicture($picture2 = new Picture()); 15$entityManager->persist($pet2); 16 17$pet3 = new Pet(); 18$pet3->setCreatedAt(new \DateTimeImmutable('+3 min')); 19$entityManager->persist($pet3); 20 21$entityManager->flush();
We are ready!
So what's the issue?
If we want to fetch the 1st created entity Pet
from the database, we can do something like this:
1$qb = $entityManager->createQueryBuilder() 2 ->select('pet') 3 ->from(Pet::class, 'pet') 4 ->addOrderBy('pet.createdAt', 'asc') 5 ->setMaxResults(1); 6 7$pet = $qb->getQuery()->getSingleResult(); 8 9echo $pet->getId(); // outputs "1" 10echo count($pet->getPictures()); // outputs "4"
$pet
is an instance of Pet
, $pet->getPictures()
returns 4 Picture
, everything is fine... except we have a N+1 query issue when using $pet->getPictures()
, because we didn't join the field pictures
.
So let's join pictures
and select it aswell:
1$qb = $entityManager->createQueryBuilder() 2 ->select('pet') 3 ->addSelect('pictures') 4 ->from(Pet::class, 'pet') 5 ->leftJoin('pet.pictures', 'pictures') 6 ->addOrderBy('pet.createdAt', 'asc') 7 ->setMaxResults(1); 8 9$pet = $qb->getQuery()->getSingleResult(); 10 11echo $pet->getId(); // outputs "1" 12echo count($pet->getPictures()); // outputs "1"... wait what?
$pet
is still as instance of Pet
, but $pet->getPictures()
returns only 1 Picture
instead of 4.
Why?
This is easily understandable if we take a look to the SQL query generated by Doctrine:
1SELECT 2 a0_.id AS id_0, 3 a0_.created_at AS created_at_1, 4 a1_.id AS id_2, 5 a1_.pet_id AS pet_id_3 6FROM adoption_pet a0_ 7 LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id 8ORDER BY a0_.created_at ASC 9LIMIT 1
The issue comes from the LIMIT 1
clause. It allows us to fetch only 1 Pet
, but is also half-initialize our Pictures
:
id_0 | created_at_1 | id_2 | pet_id_3 |
---|---|---|---|
1 | 2022-01-08 09:31:48 | 1 | 1 |
This is documented on Doctrine website, but I wasn't aware at all about this behaviour:
"If your query contains a fetch-joined collection specifying the result limit methods are not working as you would expect. Set Max Results restricts the number of database result rows, however in the case of fetch-joined collections one root entity might appear in many rows, effectively hydrating less than the specified number of results."
How to avoid that?
The Doctrine's Paginator
Doctrine provides a class Paginator
which allows to use ->setMaxResults()
and collections associations:
1$qb = $entityManager->createQueryBuilder() 2 ->select('pet') 3 ->addSelect('pictures') 4 ->from(Pet::class, 'pet') 5 ->leftJoin('pet.pictures', 'pictures') 6 ->addOrderBy('pet.createdAt', 'asc') 7 ->setMaxResults(1); 8 9$paginator = new \Doctrine\ORM\Tools\Pagination\Paginator($qb->getQuery(), $fetchJoinCollection = true); 10$pet = $paginator->getIterator()->current(); 11 12echo $pet->getId(); // outputs "1" 13echo count($pet->getPictures()); // outputs "4"
Internally, the Paginator
clone your Query
and use it for two SQL requests:
- Fetch a or multiple identifier(s) (if you used
->setMaxResults(>1)
):
1SELECT DISTINCT id_0 2FROM (SELECT DISTINCT id_0, created_at_1 3 FROM (SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3 4 FROM adoption_pet a0_ 5 LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id) dctrn_result_inner 6 ORDER BY created_at_1 ASC) dctrn_result 7LIMIT 1;
Which returns:
id_0 |
---|
1 |
- And then, it uses this result to nicely fetch your entities with their collections associations:
1SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3 2FROM adoption_pet a0_ 3 LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id 4WHERE a0_.id IN (?) 5ORDER BY a0_.created_at ASC
Which returns 4 rows (because our 1st Pet
has 4 Picture
):
id_0 | created_at_1 | id_2 | pet_id_3 |
---|---|---|---|
1 | 2022-01-08 09:31:48 | 1 | 1 |
1 | 2022-01-08 09:31:48 | 2 | 1 |
1 | 2022-01-08 09:31:48 | 3 | 1 |
1 | 2022-01-08 09:31:48 | 4 | 1 |
Keep the control
If you don't want to use the Doctrine's Paginator
to keep the control (or if you don't find logical to use a paginator to fetch only 1 entity), then you can use 2 separate SQL queries:
1$qb = $entityManager->createQueryBuilder() 2 ->select('pet') 3 ->addSelect('pictures') 4 ->from(Pet::class, 'pet') 5 ->leftJoin('pet.pictures', 'pictures') 6 ->addOrderBy('pet.createdAt', 'asc'); 7 // ^ we removed the usage of `->setMaxResults(1)` here 8 9// Clone the previous QueryBuilder, we only select the identifier 10$qbId = (clone $qb) 11 ->select('pet.id') 12 ->setMaxResults(1); // select 1 result 13 14// Then we use the previously fetched identifier 15$pet = $qb 16 ->andWhere('pet.id = :id') 17 ->setParameter('id', $qbId->getQuery()->getSingleScalarResult()) 18 ->getQuery() 19 ->getSingleResult(); 20 21echo $pet->getId(); // outputs "1" 22echo count($pet->getPictures()); // outputs "4"
And if you need to fetch multiple results:
1$qb = $entityManager->createQueryBuilder() 2 ->select('pet') 3 ->addSelect('pictures') 4 ->from(Pet::class, 'pet') 5 ->leftJoin('pet.pictures', 'pictures') 6 ->addOrderBy('pet.createdAt', 'asc'); 7 // ^ we removed the usage of `->setMaxResults(1)` here 8 9// Clone the previous QueryBuilder, we only select the identifiers 10$qbId = (clone $qb) 11 ->select('pet.id') 12 ->setMaxResults(2); // select 2 results 13 14// Then we use the previously fetched identifiers 15$pets = $qb 16 ->andWhere('pet.id IN (:id)') 17 ->setParameter('id', array_column($qbId->getQuery()->getScalarResult(), 'id')); 18 ->getQuery() 19 ->getResult(); 20 21echo $pets[0]->getId(); // outputs "1" 22echo count($pets[0]->getPictures()); // outputs "4" 23 24echo $pets[1]->getId(); // outputs "2" 25echo count($pets[1]->getPictures()); // outputs "2"
tip
Even if it looks like a good idea to keep the control, @Ocramius recommends to use the Doctrine's Paginator
:
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>What's important is that:
— @ocramius@mastodon.social 🇺🇦🍥 (@Ocramius) January 5, 2022
* if you do a fetch-join, **DO** use the paginator
* if you don't do a fetch-join, setMaxResults() should suffice