Doctrine, QueryBuilder::setMaxResults() and collections associations are on a boat

TIP

This blog post is associated with some code that you can find here (opens new window).

Today, we gonna talk about an issue with DBMS, Doctrine (opens new window), QueryBuilder::setMaxResults() and collections associations. I've faced this issue while trying to optimize my code to reduce the number of SQL requests after identifying a N+1 query issue (opens new window).

# Some context...

I will try to simplify as much as possible, I can't leak code from my job, but imagine we have 2 entities:

  1. Pet, which contains 3 fields id, pictures (contains 0 or more Picture instances) and createdAt:
// src/Core/Entity/Pet.php
<?php

namespace App\Core\Entity;

use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity()
 * @ORM\Table(name="adoption_pet")
 */
class Pet
{
    /**
     * @ORM\Id
     * @ORM\Column(type="integer")
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    protected ?int $id = null;

    /**
     * @ORM\OneToMany(targetEntity="Picture", mappedBy="pet", cascade={"all"}, orphanRemoval=true)
     */
    protected Collection $pictures;

    /**
     * @ORM\Column(type="datetime_immutable", name="created_at")
     */
    protected \DateTimeImmutable $createdAt;

    public function __construct()
    {
        $this->pictures = new ArrayCollection();
    }

    /**
     * @return int|null
     */
    public function getId(): ?int
    {
        return $this->id;
    }

    public function getPictures(): Collection
    {
        return $this->pictures;
    }

    public function addPicture(Picture $picture): void
    {
        if (!$this->pictures->contains($picture)) {
            $this->pictures->add($picture);
            $picture->setPet($this);
        }
    }

    public function setCreatedAt(\DateTimeImmutable $createdAt): void
    {
        $this->createdAt = $createdAt;
    }
}
  1. Picture, which contains 2 fields id and pet (linked to 1 Pet):
// src/Core/Entity/Picture.php
<?php

namespace App\Core\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity()
 * @ORM\Table(name="adoption_pet_picture")
 */
class Picture
{
    /**
     * @ORM\Id
     * @ORM\Column(type="integer")
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    protected ?int $id = null;

    /**
     * @ORM\ManyToOne(targetEntity="Pet", inversedBy="pictures")
     * @ORM\JoinColumn(onDelete="CASCADE")
     */
    protected ?Pet $pet = null;

    // ...

    public function getId(): ?int
    {
        return $this->id;
    }

    public function setPet(?Pet $pet): void
    {
        $this->pet = $pet;
    }

    public function getPet(): ?Pet
    {
        return $this->pet;
    }
}

And we have a seeded database like this;

  • the 1st Pet has 4 Picture,
  • the 2nd Pet has 2 Picture,
  • the 3rd and last Pet has 0 Picture:
$entityManager = ...;

$pet1 = new Pet();
$pet1->setCreatedAt(new \DateTimeImmutable('+1 min'));
$pet1->addPicture($picture1 = new Picture());
$pet1->addPicture($picture2 = new Picture());
$pet1->addPicture($picture3 = new Picture());
$pet1->addPicture($picture4 = new Picture());
$entityManager->persist($pet1);

$pet2 = new Pet();
$pet2->setCreatedAt(new \DateTimeImmutable('+2 min'));
$pet2->addPicture($picture1 = new Picture());
$pet2->addPicture($picture2 = new Picture());
$entityManager->persist($pet2);

$pet3 = new Pet();
$pet3->setCreatedAt(new \DateTimeImmutable('+3 min'));
$entityManager->persist($pet3);

$entityManager->flush();

We are ready!

# So what's the issue?

If we want to fetch the 1st created entity Pet from the database, we can do something like this:

$qb = $entityManager->createQueryBuilder()
    ->select('pet')
    ->from(Pet::class, 'pet')
    ->addOrderBy('pet.createdAt', 'asc')
    ->setMaxResults(1);

$pet = $qb->getQuery()->getSingleResult();

echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"

$pet is an instance of Pet, $pet->getPictures() returns 4 Picture, everything is fine... except we have a N+1 query issue when using $pet->getPictures(), because we didn't join the field pictures.

So let's join pictures and select it aswell:



 

 








$qb = $entityManager->createQueryBuilder()
    ->select('pet')
    ->addSelect('pictures')
    ->from(Pet::class, 'pet')
    ->leftJoin('pet.pictures', 'pictures')
    ->addOrderBy('pet.createdAt', 'asc')
    ->setMaxResults(1);

$pet = $qb->getQuery()->getSingleResult();

echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "1"... wait what?

$pet is still as instance of Pet, but $pet->getPictures() returns only 1 Picture instead of 4.

Why?

This is easily understandable if we take a look to the SQL query generated by Doctrine:

SELECT 
       a0_.id AS id_0, 
       a0_.created_at AS created_at_1,
       a1_.id AS id_2,
       a1_.pet_id AS pet_id_3 
FROM adoption_pet a0_ 
    LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id 
ORDER BY a0_.created_at ASC 
LIMIT 1

The issue comes from the LIMIT 1 clause. It allows us to fetch only 1 Pet, but is also half-initialize our Pictures:

id_0 created_at_1 id_2 pet_id_3
1 2022-01-08 09:31:48 1 1

This is documented on Doctrine website, but I wasn't aware at all about this behaviour:

"If your query contains a fetch-joined collection specifying the result limit methods are not working as you would expect. Set Max Results restricts the number of database result rows, however in the case of fetch-joined collections one root entity might appear in many rows, effectively hydrating less than the specified number of results."

First and Max Result Items (DQL Query Only) (opens new window):

# How to avoid that?

# The Doctrine's Paginator

Doctrine provides a class Paginator (opens new window) which allows to use ->setMaxResults() and collections associations:









 
 




$qb = $entityManager->createQueryBuilder()
    ->select('pet')
    ->addSelect('pictures')
    ->from(Pet::class, 'pet')
    ->leftJoin('pet.pictures', 'pictures')
    ->addOrderBy('pet.createdAt', 'asc')
    ->setMaxResults(1);

$paginator = new \Doctrine\ORM\Tools\Pagination\Paginator($qb->getQuery(), $fetchJoinCollection = true);
$pet = $paginator->getIterator()->current();

echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"

Internally, the Paginator clone your Query and use it for two SQL requests:

  1. Fetch a or multiple identifier(s) (if you used ->setMaxResults(>1)):
SELECT DISTINCT id_0
FROM (SELECT DISTINCT id_0, created_at_1
      FROM (SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
            FROM adoption_pet a0_
                     LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id) dctrn_result_inner
      ORDER BY created_at_1 ASC) dctrn_result
LIMIT 1;

Which returns :

id_0
1
  1. And then, it uses this result to nicely fetch your entities with their collections associations:
SELECT a0_.id AS id_0, a0_.created_at AS created_at_1, a1_.id AS id_2, a1_.pet_id AS pet_id_3
FROM adoption_pet a0_
         LEFT JOIN adoption_pet_picture a1_ ON a0_.id = a1_.pet_id
WHERE a0_.id IN (?)
ORDER BY a0_.created_at ASC

Which returns 4 rows (because our 1st Pet has 4 Picture):

id_0 created_at_1 id_2 pet_id_3
1 2022-01-08 09:31:48 1 1
1 2022-01-08 09:31:48 2 1
1 2022-01-08 09:31:48 3 1
1 2022-01-08 09:31:48 4 1

# Keep the control

If you don't want to use the Doctrine's Paginator to keep the control (or if you don't find logical to use a paginator to fetch only 1 entity), then you can use 2 separate SQL queries:







 

 
 
 
 

 


 






$qb = $entityManager->createQueryBuilder()
    ->select('pet')
    ->addSelect('pictures')
    ->from(Pet::class, 'pet')
    ->leftJoin('pet.pictures', 'pictures')
    ->addOrderBy('pet.createdAt', 'asc');
    // ^ we removed the usage of `->setMaxResults(1)` here

// Clone the previous QueryBuilder, we only select the identifier
$qbId = (clone $qb)
    ->select('pet.id')
    ->setMaxResults(1); // select 1 result

// Then we use the previously fetched identifier
$pet = $qb
    ->andWhere('pet.id = :id')
    ->setParameter('id', $qbId->getQuery()->getSingleScalarResult())
    ->getQuery()
    ->getSingleResult();

echo $pet->getId(); // outputs "1"
echo count($pet->getPictures()); // outputs "4"

And if you need to fetch multiple results:







 

 
 
 
 

 


 









$qb = $entityManager->createQueryBuilder()
    ->select('pet')
    ->addSelect('pictures')
    ->from(Pet::class, 'pet')
    ->leftJoin('pet.pictures', 'pictures')
    ->addOrderBy('pet.createdAt', 'asc');
    // ^ we removed the usage of `->setMaxResults(1)` here

// Clone the previous QueryBuilder, we only select the identifiers
$qbId = (clone $qb)
    ->select('pet.id')
    ->setMaxResults(2); // select 2 results

// Then we use the previously fetched identifiers
$pets = $qb
    ->andWhere('pet.id IN (:id)')
    ->setParameter('id', array_column($qbId->getQuery()->getScalarResult(), 'id'));
    ->getQuery()
    ->getResult();

echo $pets[0]->getId(); // outputs "1"
echo count($pets[0]->getPictures()); // outputs "4"

echo $pets[1]->getId(); // outputs "2"
echo count($pets[1]->getPictures()); // outputs "2"

TIP

Even if it looks like a good idea to keep the control, @Ocramius (opens new window) recommends to use the Doctrine's Paginator: