Doctrine, QueryBuilder::setMaxResults() and collections associations are on a boat
TIP
This blog post is associated with some code that you can find here.
Today, we gonna talk about an issue with DBMS, Doctrine, QueryBuilder::setMaxResults()
and collections associations. I've faced this issue while trying to optimize my code to reduce the number of SQL requests after identifying a N+1 query issue.
Some context...
I will try to simplify as much as possible, I can't leak code from my job, but imagine we have 2 entities:
Pet
, which contains 3 fieldsid
,pictures
(contains 0 or morePicture
instances) andcreatedAt
:
// src/Core/Entity/Pet.php
<?php
namespace App\Core\Entity;
use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;
/**
* @ORM\Entity()
* @ORM\Table(name="adoption_pet")
*/
class Pet
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue(strategy="AUTO")
*/
protected ?int $id = null;
/**
* @ORM\OneToMany(targetEntity="Picture", mappedBy="pet", cascade={"all"}, orphanRemoval=true)
*/
protected Collection $pictures;
/**
* @ORM\Column(type="datetime_immutable", name="created_at")
*/
protected \DateTimeImmutable $createdAt;
public function __construct()
{
$this->pictures = new ArrayCollection();
}
/**
* @return int|null
*/
public function getId(): ?int
{
return $this->id;
}
public function getPictures(): Collection
{
return $this->pictures;
}
public function addPicture(Picture $picture): void
{
if (!$this->pictures->contains($picture)) {
$this->pictures->add($picture);
$picture->setPet($this);
}
}
public function setCreatedAt(\DateTimeImmutable $createdAt): void
{
$this->createdAt = $createdAt;
}
}
// src/Core/Entity/Pet.php
<?php
namespace App\Core\Entity;
use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;
/**
* @ORM\Entity()
* @ORM\Table(name="adoption_pet")
*/
class Pet
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue(strategy="AUTO")
*/
protected ?int $id = null;
/**
* @ORM\OneToMany(targetEntity="Picture", mappedBy="pet", cascade={"all"}, orphanRemoval=true)
*/
protected Collection $pictures;
/**
* @ORM\Column(type="datetime_immutable", name="created_at")
*/
protected \DateTimeImmutable $createdAt;
public function __construct()
{
$this->pictures = new ArrayCollection();
}
/**
* @return int|null
*/
public function getId(): ?int
{
return $this->id;
}
public function getPictures(): Collection
{
return $this->pictures;
}
public function addPicture(Picture $picture): void
{
if (!$this->pictures->contains($picture)) {
$this->pictures->add($picture);
$picture->setPet($this);
}
}
public function setCreatedAt(\DateTimeImmutable $createdAt): void
{
$this->createdAt = $createdAt;
}
}
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
Picture
, which contains 2 fieldsid
andpet
(linked to 1Pet
):
// src/Core/Entity/Picture.php
<?php
namespace App\Core\Entity;
use Doctrine\ORM\Mapping as ORM;
/**
* @ORM\Entity()
* @ORM\Table(name="adoption_pet_picture")
*/
class Picture
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue(strategy="AUTO")
*/
protected ?int $id = null;
/**
* @ORM\ManyToOne(targetEntity="Pet", inversedBy="pictures")
* @ORM\JoinColumn(onDelete="CASCADE")
*/
protected ?Pet $pet = null;
// ...
public function getId(): ?int
{
return $this->id;
}
public function setPet(?Pet $pet): void
{
$this->pet = $pet;
}
public function getPet(): ?Pet
{
return $this->pet;
}
}
// src/Core/Entity/Picture.php
<?php
namespace App\Core\Entity;
use Doctrine\ORM\Mapping as ORM;
/**
* @ORM\Entity()
* @ORM\Table(name="adoption_pet_picture")
*/
class Picture
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue(strategy="AUTO")
*/
protected ?int $id = null;
/**
* @ORM\ManyToOne(targetEntity="Pet", inversedBy="pictures")
* @ORM\JoinColumn(onDelete="CASCADE")
*/
protected ?Pet $pet = null;
// ...
public function getId(): ?int
{
return $this->id;
}
public function setPet(?Pet $pet): void
{
$this->pet = $pet;
}
public function getPet(): ?Pet
{
return $this->pet;
}
}
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
And we have a seeded database like this;
- the 1st
Pet
has 4Picture
, - the 2nd
Pet
has 2Picture
, - the 3rd and last
Pet
has 0Picture
:
$entityManager = ...;
$pet1 = new Pet();
$pet1->setCreatedAt(new \DateTimeImmutable('+1 min'));
$pet1->addPicture($picture1 = new Picture());
$pet1->addPicture($picture2 = new Picture());
$pet1->addPicture($picture3 = new Picture());
$pet1->addPicture($picture4 = new Picture());
$entityManager->persist($pet1);
$pet2 = new Pet();
$pet2->setCreatedAt(new \DateTimeImmutable('+2 min'));
$pet2->addPicture($picture1 = new Picture());
$pet2->addPicture($picture2 = new Picture());
$entityManager->persist($pet2);
$pet3 = new Pet();
$pet3->setCreatedAt(new \DateTimeImmutable('+3 min'));
$entityManager->persist($pet3);
$entityManager->flush();
$entityManager = ...;
$pet1 = new Pet();
$pet1->setCreatedAt(new \DateTimeImmutable('+1 min'));
$pet1->addPicture($picture1 = new Picture());
$pet1->addPicture($picture2 = new Picture());
$pet1->addPicture($picture3 = new Picture());
$pet1->addPicture($picture4 = new Picture());
$entityManager->persist($pet1);
$pet2 = new Pet();
$pet2->setCreatedAt(new \DateTimeImmutable('+2 min'));
$pet2->addPicture($picture1 = new Picture());
$pet2->addPicture($picture2 = new Picture());
$entityManager->persist($pet2);
$pet3 = new Pet();
$pet3->setCreatedAt(new \DateTimeImmutable('+3 min'));
$entityManager->persist($pet3);
$entityManager->flush();
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
We are ready!
So what's the issue?
If we want to fetch the 1st created entity Pet
from the database, we can do something like this:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->from(Pet::class, 'pet')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$pet = $qb->getQuery()->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"
$qb = $entityManager->createQueryBuilder()
->select('pet')
->from(Pet::class, 'pet')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$pet = $qb->getQuery()->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"
2
3
4
5
6
7
8
9
10
$pet
is an instance of Pet
, $pet->getPictures()
returns 4 Picture
, everything is fine... except we have a N+1 query issue when using $pet->getPictures()
, because we didn't join the field pictures
.
So let's join pictures
and select it aswell:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$pet = $qb->getQuery()->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "1"... wait what?
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$pet = $qb->getQuery()->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "1"... wait what?
2
3
4
5
6
7
8
9
10
11
12
$pet
is still as instance of Pet
, but $pet->getPictures()
returns only 1 Picture
instead of 4.
Why?
This is easily understandable if we take a look to the SQL query generated by Doctrine:
SELECT
a0_.id AS id_0,
a0_.created_at AS created_at_1,
a1_.id AS id_2,
a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
ORDER BY a0_.created_at ASC
LIMIT 1
SELECT
a0_.id AS id_0,
a0_.created_at AS created_at_1,
a1_.id AS id_2,
a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
ORDER BY a0_.created_at ASC
LIMIT 1
2
3
4
5
6
7
8
9
The issue comes from the LIMIT 1
clause. It allows us to fetch only 1 Pet
, but is also half-initialize our Pictures
:
id_0 | created_at_1 | id_2 | pet_id_3 |
---|---|---|---|
1 | 2022-01-08 09:31:48 | 1 | 1 |
This is documented on Doctrine website, but I wasn't aware at all about this behaviour:
"If your query contains a fetch-joined collection specifying the result limit methods are not working as you would expect. Set Max Results restricts the number of database result rows, however in the case of fetch-joined collections one root entity might appear in many rows, effectively hydrating less than the specified number of results."
How to avoid that?
The Doctrine's Paginator
Doctrine provides a class Paginator
which allows to use ->setMaxResults()
and collections associations:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$paginator = new \Doctrine\ORM\Tools\Pagination\Paginator($qb->getQuery(), $fetchJoinCollection = true);
$pet = $paginator->getIterator()->current();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$paginator = new \Doctrine\ORM\Tools\Pagination\Paginator($qb->getQuery(), $fetchJoinCollection = true);
$pet = $paginator->getIterator()->current();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"
2
3
4
5
6
7
8
9
10
11
12
13
Internally, the Paginator
clone your Query
and use it for two SQL requests:
- Fetch a or multiple identifier(s) (if you used
->setMaxResults(>1)
):
SELECT DISTINCT id_0
FROM (SELECT DISTINCT id_0, created_at_1
FROM (SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id) dctrn_result_inner
ORDER BY created_at_1 ASC) dctrn_result
LIMIT 1;
SELECT DISTINCT id_0
FROM (SELECT DISTINCT id_0, created_at_1
FROM (SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id) dctrn_result_inner
ORDER BY created_at_1 ASC) dctrn_result
LIMIT 1;
2
3
4
5
6
7
Which returns:
id_0 |
---|
1 |
- And then, it uses this result to nicely fetch your entities with their collections associations:
SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
WHERE a0_.id IN (?)
ORDER BY a0_.created_at ASC
SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
WHERE a0_.id IN (?)
ORDER BY a0_.created_at ASC
2
3
4
5
Which returns 4 rows (because our 1st Pet
has 4 Picture
):
id_0 | created_at_1 | id_2 | pet_id_3 |
---|---|---|---|
1 | 2022-01-08 09:31:48 | 1 | 1 |
1 | 2022-01-08 09:31:48 | 2 | 1 |
1 | 2022-01-08 09:31:48 | 3 | 1 |
1 | 2022-01-08 09:31:48 | 4 | 1 |
Keep the control
If you don't want to use the Doctrine's Paginator
to keep the control (or if you don't find logical to use a paginator to fetch only 1 entity), then you can use 2 separate SQL queries:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc');
// ^ we removed the usage of `->setMaxResults(1)` here
// Clone the previous QueryBuilder, we only select the identifier
$qbId = (clone $qb)
->select('pet.id')
->setMaxResults(1); // select 1 result
// Then we use the previously fetched identifier
$pet = $qb
->andWhere('pet.id = :id')
->setParameter('id', $qbId->getQuery()->getSingleScalarResult())
->getQuery()
->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc');
// ^ we removed the usage of `->setMaxResults(1)` here
// Clone the previous QueryBuilder, we only select the identifier
$qbId = (clone $qb)
->select('pet.id')
->setMaxResults(1); // select 1 result
// Then we use the previously fetched identifier
$pet = $qb
->andWhere('pet.id = :id')
->setParameter('id', $qbId->getQuery()->getSingleScalarResult())
->getQuery()
->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
And if you need to fetch multiple results:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc');
// ^ we removed the usage of `->setMaxResults(1)` here
// Clone the previous QueryBuilder, we only select the identifiers
$qbId = (clone $qb)
->select('pet.id')
->setMaxResults(2); // select 2 results
// Then we use the previously fetched identifiers
$pets = $qb
->andWhere('pet.id IN (:id)')
->setParameter('id', array_column($qbId->getQuery()->getScalarResult(), 'id'));
->getQuery()
->getResult();
echo $pets[0]->getId(); // outputs "1"
echo count($pets[0]->getPictures()); // outputs "4"
echo $pets[1]->getId(); // outputs "2"
echo count($pets[1]->getPictures()); // outputs "2"
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc');
// ^ we removed the usage of `->setMaxResults(1)` here
// Clone the previous QueryBuilder, we only select the identifiers
$qbId = (clone $qb)
->select('pet.id')
->setMaxResults(2); // select 2 results
// Then we use the previously fetched identifiers
$pets = $qb
->andWhere('pet.id IN (:id)')
->setParameter('id', array_column($qbId->getQuery()->getScalarResult(), 'id'));
->getQuery()
->getResult();
echo $pets[0]->getId(); // outputs "1"
echo count($pets[0]->getPictures()); // outputs "4"
echo $pets[1]->getId(); // outputs "2"
echo count($pets[1]->getPictures()); // outputs "2"
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
TIP
Even if it looks like a good idea to keep the control, @Ocramius recommends to use the Doctrine's Paginator
: