Generate PDFs on Amazon AWS with PHP and Puppeteer: The Best Way
Warning: this article is a follow-up to Generate PDFs on Amazon AWS with PHP and Puppeteer; consider reading that one before going further.
Several months ago, I wrote my first article explaining how to use Browsershot and Puppeteer on AWS Lambda. We saw how to ship a brotli-fied Chrome with our lambda, how to un-brotlify Chrome at runtime, and how to use it with Browsershot.
But yesterday, I had to update the Chrome version and I faced many issues:
- I had to download the Chrome binary and SwiftShader libraries from chrome-aws-lambda and do the update manually.
- Since the binaries are not the same, I had to update the ChromiumFactory to handle the file swiftshader.tar.br.
- I had to update the Chrome flags list by using those from chrome-aws-lambda.
This is the first and the last time I want to do that.
Why should I do what chrome-aws-lambda already does well? Isn't it possible to use chrome-aws-lambda with Browsershot?
After many hours, I was able to use chrome-aws-lambda with a bridge between the Browsershot PHP class and Browsershot's bin/browser.js, thanks to the method Browsershot#setBinPath, which allows us to use a custom .js file.
Cleaning
First, let's clean a bunch of things:
- delete the chromium/ directory
- delete the Chromium and ChromiumFactory classes (and remove them from the Symfony services configuration)
- uninstall the dependency vdechenaux/brotli-bin-amd64: composer remove vdechenaux/brotli-bin-amd64
Installing chrome-aws-lambda
You can't install just any version of chrome-aws-lambda or puppeteer; the two must be compatible with each other. See chrome-aws-lambda's versioning table.
At the time of writing, I decided to go with chrome-aws-lambda@~2.0.0 (which uses Chrome 79):
    {
      ...
      "dependencies": {
    +   "chrome-aws-lambda": "~2.0.0",
        "puppeteer": "~2.0.0"
      }
    }
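As a quick guard against mismatched versions, here is a minimal sketch of a check you could run before deploying. It rests on an assumption: that a chrome-aws-lambda release and a puppeteer release are a compatible pair when their major.minor numbers match. The versioning table remains the authority, and the helper names `majorMinor` and `sameMajorMinor` are hypothetical.

```javascript
// Hypothetical helper: extracts "major.minor" from a version range such as "~2.0.0".
function majorMinor(range) {
  const match = /(\d+)\.(\d+)/.exec(range);
  if (!match) {
    throw new Error(`Cannot read a version from "${range}"`);
  }
  return `${match[1]}.${match[2]}`;
}

// Assumption: matching major.minor numbers mean the two packages belong together.
function sameMajorMinor(dependencies) {
  return (
    majorMinor(dependencies['chrome-aws-lambda']) ===
    majorMinor(dependencies['puppeteer'])
  );
}

// The dependencies declared in this article's package.json.
const dependencies = { 'chrome-aws-lambda': '~2.0.0', puppeteer: '~2.0.0' };
console.log(sameMajorMinor(dependencies)); // true
```

In a real project you would read `dependencies` from `package.json` instead of hard-coding it.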
Creating the bridge
The most important thing is to handle the input and the output the same way Browsershot does. That means:
- your binary must be able to handle the argument -f <file.json>, or JSON passed as the first argument
- your binary must output data in base64 when needed
It may sound hard, but in fact it's not.
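To make the two requirements concrete, here is a small standalone sketch of that input/output contract. It assumes, as the bridge script in this article does, that the `-f` flag and a `file://` URL arrive together in a single argument. The names `parseRequest` and `toBase64` are hypothetical, and `https://example.com` is only a placeholder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pathToFileURL } = require('url');

// Hypothetical helper: builds the request object from the CLI arguments.
function parseRequest(args) {
  const first = args[0];
  if (first.startsWith('-f ')) {
    // "-f <file URL>": the options were dumped to a temporary JSON file.
    return JSON.parse(fs.readFileSync(new URL(first.substring(3)), 'utf8'));
  }
  // Otherwise the first argument is the JSON dump itself.
  return JSON.parse(first);
}

// Hypothetical helper: binary results are printed as base64 text.
function toBase64(buffer) {
  return Buffer.from(buffer).toString('base64');
}

// Usage: JSON passed inline.
const inline = parseRequest([
  '{"url":"https://example.com","action":"screenshot","options":{}}',
]);

// Usage: JSON passed through a temporary file.
const tmpFile = path.join(os.tmpdir(), `browsershot-request-${process.pid}.json`);
fs.writeFileSync(
  tmpFile,
  JSON.stringify({ url: 'https://example.com', action: 'pdf', options: {} })
);
const fromFile = parseRequest([`-f ${pathToFileURL(tmpFile).href}`]);
fs.unlinkSync(tmpFile);

console.log(inline.action, fromFile.action); // screenshot pdf
console.log(toBase64('hello')); // aGVsbG8=
```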
I've created a bin/browser.js file which:
- gets the input (request) like Browsershot does (literally a copy/paste)
- updates this request with chrome-aws-lambda's data (Chrome path and flags)
- overrides process.argv[2] with the new JSON request
- and runs the original Browsershot JS file
    #!/usr/bin/env node

    const fs = require('fs');
    const chromium = require('chrome-aws-lambda');

    const [, , ...args] = process.argv;

    /**
     * There are two ways for Browsershot to communicate with puppeteer:
     * - By giving an options JSON dump as an argument
     * - Or by providing a temporary file with the options JSON dump,
     *   the path to this file is then given as an argument with the flag -f
     */
    const request = args[0].startsWith('-f ')
      ? JSON.parse(fs.readFileSync(new URL(args[0].substring(3))))
      : JSON.parse(args[0]);

    async function bridge() {
      // merge Browsershot options with chrome-aws-lambda options
      request.options.executablePath = await chromium.executablePath;
      request.options.args = [
        ...chromium.args,
        ...request.options.args,
        '--disable-dev-profile',
        '--user-data-dir=/dev/null',
      ];

      // override process arguments
      process.argv[2] = JSON.stringify(request);

      // then execute Browsershot's initial binary
      return require('../vendor/spatie/browsershot/bin/browser');
      // or, if you use Browsershot ^3.38 (see https://github.com/spatie/browsershot/pull/399),
      // replace the line above with:
      // return require('../vendor/spatie/browsershot/bin/browser').callBrowser(chromium.puppeteer);
    }

    bridge();
This is a real bridge between Browsershot's PHP class and Browsershot's JS script.
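If you want to test the merge step on your own machine without installing chrome-aws-lambda, one option is to pull it out into a pure function. The sketch below is an illustration and not part of the bridge. `withChromiumOptions` is a hypothetical name, and the executable path and flags are made-up stand-ins for what chrome-aws-lambda would provide.

```javascript
// Hypothetical pure version of the merge step: returns a new request
// instead of mutating the one received from Browsershot.
function withChromiumOptions(request, executablePath, chromiumArgs) {
  const options = request.options || {};
  return {
    ...request,
    options: {
      ...options,
      executablePath,
      // Chromium flags first, then the flags requested through Browsershot.
      args: [...chromiumArgs, ...(options.args || [])],
    },
  };
}

// Made-up values standing in for chromium.executablePath and chromium.args.
const original = {
  url: 'https://example.com',
  action: 'pdf',
  options: { args: ['--disable-dev-shm-usage'] },
};
const merged = withChromiumOptions(original, '/tmp/chromium', ['--no-sandbox']);

console.log(merged.options.executablePath); // /tmp/chromium
console.log(merged.options.args); // [ '--no-sandbox', '--disable-dev-shm-usage' ]
```

Returning a copy keeps the original request untouched, which makes it easier to compare the two in a unit test.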
Using the bridge
In your PHP code, when you use Browsershot:
    $myBrowsershotInstance->setBinPath('/path/to/bin/browser.js');
You can also run this file manually, the same way Browsershot does:
    $ PATH=$PATH:/usr/local/bin NODE_PATH=`npm root -g` node 'bin/browser.js' \
      '{"url":"https:\/\/google.fr\/","action":"screenshot","options":{"type":"png","args":["--disable-dev-shm-usage"],"viewport":{"width":1920,"height":1080},"ignoreHttpsErrors":true,"waitUntil":"domcontentloaded"}}'
If base64 output is printed, the bridge is working correctly!
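To go one step further than eyeballing the output, you can decode it and check that it starts with the PNG file signature, since the request above asks for a `png` screenshot. This is a minimal sketch. `looksLikeBase64Png` is a hypothetical name, and the sample data is hand-made, not real Browsershot output.

```javascript
// The 8-byte signature that every PNG file starts with.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decodes base64 text and compares its first bytes
// with the PNG signature.
function looksLikeBase64Png(output) {
  const bytes = Buffer.from(output.trim(), 'base64');
  return (
    bytes.length >= PNG_SIGNATURE.length &&
    bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)
  );
}

// Hand-made samples: a buffer that starts with the signature, and plain text.
const pngSample = Buffer.concat([PNG_SIGNATURE, Buffer.from('rest of the file')]).toString('base64');
const textSample = Buffer.from('not a png').toString('base64');

console.log(looksLikeBase64Png(pngSample)); // true
console.log(looksLikeBase64Png(textSample)); // false
```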