Doctrine, QueryBuilder::setMaxResults() and collections associations are on a boat
This blog post is associated with some code that you can find here.
Today, we gonna talk about an issue with DBMS, Doctrine, QueryBuilder::setMaxResults() and collections associations.
I've faced this issue while trying to optimize my code to reduce the number of SQL requests after identifying a N+1 query issue.
Some context...
I will try to simplify as much as possible, I can't leak code from my job, but imagine we have 2 entities:
Pet, which contains 3 fieldsid,pictures(contains 0 or morePictureinstances) andcreatedAt:
<?php
namespace App\Core\Entity;
use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;
/**
* @ORM\Entity()
* @ORM\Table(name="adoption_pet")
*/
class Pet
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue(strategy="AUTO")
*/
protected ?int $id = null;
/**
* @ORM\OneToMany(targetEntity="Picture", mappedBy="pet", cascade={"all"}, orphanRemoval=true)
*/
protected Collection $pictures;
/**
* @ORM\Column(type="datetime_immutable", name="created_at")
*/
protected \DateTimeImmutable $createdAt;
public function __construct()
{
$this->pictures = new ArrayCollection();
}
/**
* @return int|null
*/
public function getId(): ?int
{
return $this->id;
}
public function getPictures(): Collection
{
return $this->pictures;
}
public function addPicture(Picture $picture): void
{
if (!$this->pictures->contains($picture)) {
$this->pictures->add($picture);
$picture->setPet($this);
}
}
public function setCreatedAt(\DateTimeImmutable $createdAt): void
{
$this->createdAt = $createdAt;
}
}Picture, which contains 2 fieldsidandpet(linked to 1Pet):
<?php
namespace App\Core\Entity;
use Doctrine\ORM\Mapping as ORM;
/**
* @ORM\Entity()
* @ORM\Table(name="adoption_pet_picture")
*/
class Picture
{
/**
* @ORM\Id
* @ORM\Column(type="integer")
* @ORM\GeneratedValue(strategy="AUTO")
*/
protected ?int $id = null;
/**
* @ORM\ManyToOne(targetEntity="Pet", inversedBy="pictures")
* @ORM\JoinColumn(onDelete="CASCADE")
*/
protected ?Pet $pet = null;
// ...
public function getId(): ?int
{
return $this->id;
}
public function setPet(?Pet $pet): void
{
$this->pet = $pet;
}
public function getPet(): ?Pet
{
return $this->pet;
}
}And we have a seeded database like this;
- the 1st
Pethas 4Picture, - the 2nd
Pethas 2Picture, - the 3rd and last
Pethas 0Picture:
$entityManager = ...;
$pet1 = new Pet();
$pet1->setCreatedAt(new \DateTimeImmutable('+1 min'));
$pet1->addPicture($picture1 = new Picture());
$pet1->addPicture($picture2 = new Picture());
$pet1->addPicture($picture3 = new Picture());
$pet1->addPicture($picture4 = new Picture());
$entityManager->persist($pet1);
$pet2 = new Pet();
$pet2->setCreatedAt(new \DateTimeImmutable('+2 min'));
$pet2->addPicture($picture1 = new Picture());
$pet2->addPicture($picture2 = new Picture());
$entityManager->persist($pet2);
$pet3 = new Pet();
$pet3->setCreatedAt(new \DateTimeImmutable('+3 min'));
$entityManager->persist($pet3);
$entityManager->flush();We are ready!
So what's the issue?
If we want to fetch the 1st created entity Pet from the database, we can do something like this:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->from(Pet::class, 'pet')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$pet = $qb->getQuery()->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"$pet is an instance of Pet, $pet->getPictures() returns 4 Picture, everything is fine... except we have a N+1 query issue when using $pet->getPictures(), because we didn't join the field pictures.
So let's join pictures and select it aswell:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$pet = $qb->getQuery()->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "1"... wait what?$pet is still as instance of Pet, but $pet->getPictures() returns only 1 Picture instead of 4.
Why?
This is easily understandable if we take a look to the SQL query generated by Doctrine:
SELECT
a0_.id AS id_0,
a0_.created_at AS created_at_1,
a1_.id AS id_2,
a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
ORDER BY a0_.created_at ASC
LIMIT 1The issue comes from the LIMIT 1 clause. It allows us to fetch only 1 Pet, but is also half-initialize our Pictures:
| id_0 | created_at_1 | id_2 | pet_id_3 |
|---|---|---|---|
| 1 | 2022-01-08 09:31:48 | 1 | 1 |
This is documented on Doctrine website, but I wasn't aware at all about this behaviour:
"If your query contains a fetch-joined collection specifying the result limit methods are not working as you would expect. Set Max Results restricts the number of database result rows, however in the case of fetch-joined collections one root entity might appear in many rows, effectively hydrating less than the specified number of results."
How to avoid that?
The Doctrine's Paginator
Doctrine provides a class Paginator which allows to use ->setMaxResults() and collections associations:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc')
->setMaxResults(1);
$paginator = new \Doctrine\ORM\Tools\Pagination\Paginator($qb->getQuery(), $fetchJoinCollection = true);
$pet = $paginator->getIterator()->current();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"Internally, the Paginator clone your Query and use it for two SQL requests:
- Fetch a or multiple identifier(s) (if you used
->setMaxResults(>1)):
SELECT DISTINCT id_0
FROM (SELECT DISTINCT id_0, created_at_1
FROM (SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id) dctrn_result_inner
ORDER BY created_at_1 ASC) dctrn_result
LIMIT 1;Which returns:
| id_0 |
|---|
| 1 |
- And then, it uses this result to nicely fetch your entities with their collections associations:
SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
WHERE a0_.id IN (?)
ORDER BY a0_.created_at ASCWhich returns 4 rows (because our 1st Pet has 4 Picture):
| id_0 | created_at_1 | id_2 | pet_id_3 |
|---|---|---|---|
| 1 | 2022-01-08 09:31:48 | 1 | 1 |
| 1 | 2022-01-08 09:31:48 | 2 | 1 |
| 1 | 2022-01-08 09:31:48 | 3 | 1 |
| 1 | 2022-01-08 09:31:48 | 4 | 1 |
Keep the control
If you don't want to use the Doctrine's Paginator to keep the control (or if you don't find logical to use a paginator to fetch only 1 entity), then you can use 2 separate SQL queries:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc');
// ^ we removed the usage of `->setMaxResults(1)` here
// Clone the previous QueryBuilder, we only select the identifier
$qbId = (clone $qb)
->select('pet.id')
->setMaxResults(1); // select 1 result
// Then we use the previously fetched identifier
$pet = $qb
->andWhere('pet.id = :id')
->setParameter('id', $qbId->getQuery()->getSingleScalarResult())
->getQuery()
->getSingleResult();
echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"And if you need to fetch multiple results:
$qb = $entityManager->createQueryBuilder()
->select('pet')
->addSelect('pictures')
->from(Pet::class, 'pet')
->leftJoin('pet.pictures', 'pictures')
->addOrderBy('pet.createdAt', 'asc');
// ^ we removed the usage of `->setMaxResults(1)` here
// Clone the previous QueryBuilder, we only select the identifiers
$qbId = (clone $qb)
->select('pet.id')
->setMaxResults(2); // select 2 results
// Then we use the previously fetched identifiers
$pets = $qb
->andWhere('pet.id IN (:id)')
->setParameter('id', array_column($qbId->getQuery()->getScalarResult(), 'id'));
->getQuery()
->getResult();
echo $pets[0]->getId(); // outputs "1"
echo count($pets[0]->getPictures()); // outputs "4"
echo $pets[1]->getId(); // outputs "2"
echo count($pets[1]->getPictures()); // outputs "2"Even if it looks like a good idea to keep the control, @Ocramius recommends to use the Doctrine's Paginator:
What's important is that:
— @ocramius@mastodon.social 🇺🇦🍥 (@Ocramius) January 5, 2022
* if you do a fetch-join, **DO** use the paginator
* if you don't do a fetch-join, setMaxResults() should suffice