Generate PDFs on Amazon AWS with PHP and Puppeteer: The Best Way
This article is a follow-up to Generate PDFs on Amazon AWS with PHP and Puppeteer; you should read that one before going further.
Several months ago, I wrote my first article explaining how to use Browsershot and Puppeteer on AWS Lambda. We saw how to ship a brotli-fied Chrome with our lambda, how to un-brotlify Chrome at runtime, and how to use it with Browsershot.
But yesterday, I had to update the Chrome version and I faced many issues:
- I had to download the Chrome binary and SwiftShader libraries from chrome-aws-lambda and do the update manually
- Since the binaries are not the same, I had to update the ChromiumFactory to handle the file swiftshader.tar.br
- I had to update the Chrome flags list by using those from chrome-aws-lambda.
This is the first and the last time I want to do that.
Why should I redo what chrome-aws-lambda already does well? Isn't it possible to use chrome-aws-lambda with Browsershot?
After many hours, I was able to use chrome-aws-lambda through a bridge between the Browsershot PHP class and Browsershot's bin/browser.js, thanks to the method Browsershot#setBinPath, which allows us to use a custom .js file.
Cleaning
First, let's clean a bunch of things:
- delete the chromium/ directory
- delete the Chromium and ChromiumFactory classes (and remove them from the Symfony services configuration)
- uninstall the dependency vdechenaux/brotli-bin-amd64: composer remove vdechenaux/brotli-bin-amd64
Installing chrome-aws-lambda
You can't install just any version of chrome-aws-lambda or puppeteer: the two must be compatible with each other. See chrome-aws-lambda's versioning table.
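As a quick sanity check before deploying, the two version ranges can be compared programmatically. The sketch below assumes that a compatible pair shares the same major.minor numbers, which is the pattern followed by the versions chosen in this article; the versioning table remains the authority. The helper names `majorMinor` and `versionsLookCompatible` are hypothetical.

```javascript
// Hypothetical helper: extracts "major.minor" from a range such as "~2.0.0" or "^2.0.1".
function majorMinor(range) {
    const match = /(\d+)\.(\d+)/.exec(range);
    if (!match) {
        throw new Error(`Cannot read a version from "${range}"`);
    }
    return `${match[1]}.${match[2]}`;
}

// Assumption: the packages are treated as compatible when their major.minor numbers match.
function versionsLookCompatible(dependencies) {
    return majorMinor(dependencies['chrome-aws-lambda']) === majorMinor(dependencies['puppeteer']);
}

// Example with the "dependencies" section of a package.json file
console.log(versionsLookCompatible({
    'chrome-aws-lambda': '~2.0.0',
    'puppeteer': '~2.0.0',
})); // prints true
```

A check like this only catches obvious mismatches; it does not replace reading the table when upgrading.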
At the time of writing, I went with chrome-aws-lambda@~2.0.0 (which uses Chrome 79):
{
    ...
    "dependencies": {
        "chrome-aws-lambda": "~2.0.0",
        "puppeteer": "~2.0.0"
    }
}
Creating the bridge
The most important thing is to handle the input and the output the same way Browsershot does. It means:
- your binary must be able to handle the argument -f <file.json>, or JSON passed as the first argument
- your binary must output data in base64 when needed
It may sound hard, but in fact it's not.
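To make the input requirement concrete, here is a minimal sketch of a function that accepts both forms. The name `parseRequest` is hypothetical; the logic mirrors the way the request is read in the bridge script of this article, where the `-f` flag and the file URL arrive together as a single argument.

```javascript
const fs = require('fs');

// Hypothetical helper: turns the first CLI argument into a request object.
// - "-f file:///tmp/request.json" reads the JSON dump from the given file URL
// - anything else is treated as an inline JSON dump
function parseRequest(firstArgument) {
    if (firstArgument.startsWith('-f ')) {
        const fileUrl = new URL(firstArgument.substring(3));
        return JSON.parse(fs.readFileSync(fileUrl, 'utf8'));
    }
    return JSON.parse(firstArgument);
}

// Inline JSON form
const request = parseRequest('{"url":"https://example.com","action":"pdf","options":{}}');
console.log(request.action); // prints "pdf"
```

Isolating the parsing in a function keeps it testable without starting a browser.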
I've created a bin/browser.js file which:
- gets the input (the request) like Browsershot does (literally a copy/paste)
- updates this request with chrome-aws-lambda's data (Chrome path and flags)
- overrides process.argv[2] with the new JSON request
- and runs the original Browsershot JS file
#!/usr/bin/env node
const fs = require('fs');
const chromium = require('chrome-aws-lambda');
const [, , ...args] = process.argv;
/**
 * There are two ways for Browsershot to communicate with puppeteer:
 * - By giving an options JSON dump as an argument
 * - Or by providing a temporary file with the options JSON dump,
 *   the path to this file is then given as an argument with the flag -f
 */
const request = args[0].startsWith('-f ')
    ? JSON.parse(fs.readFileSync(new URL(args[0].substring(3))))
    : JSON.parse(args[0]);
async function bridge() {
    // merge Browsershot options with chrome-aws-lambda options
    request.options.executablePath = await chromium.executablePath;
    request.options.args = [
        ...chromium.args,
        ...request.options.args,
        '--disable-dev-profile',
        '--user-data-dir=/dev/null',
    ];
    // override process arguments
    process.argv[2] = JSON.stringify(request);
    // then execute Browsershot's initial binary
    return require('../vendor/spatie/browsershot/bin/browser');
    // or, if you use Browsershot ^3.38 (see https://github.com/spatie/browsershot/pull/399),
    // replace the return statement above with:
    // return require('../vendor/spatie/browsershot/bin/browser').callBrowser(chromium.puppeteer);
}
bridge();
This is a real bridge between the Browsershot PHP class and the Browsershot JS script.
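The option-merging step can also be isolated as a pure function, which makes it easy to check without launching Chrome. This is a minimal sketch: the name `mergeChromiumOptions` is hypothetical, the path and flags in the example are placeholder values, and defaulting a missing `options.args` to an empty list is an assumption (the bridge script itself expects Browsershot to always send it).

```javascript
// Hypothetical helper: returns a copy of the Browsershot request with the
// Chrome path and flags injected, leaving the original object untouched.
function mergeChromiumOptions(request, executablePath, chromiumArgs) {
    const options = request.options || {};
    return {
        ...request,
        options: {
            ...options,
            executablePath,
            args: [
                ...chromiumArgs,
                ...(options.args || []), // assumption: tolerate a missing args list
                '--disable-dev-profile',
                '--user-data-dir=/dev/null',
            ],
        },
    };
}

// Example with placeholder values standing in for chrome-aws-lambda's data
const merged = mergeChromiumOptions(
    { url: 'https://example.com', action: 'pdf', options: { args: ['--disable-dev-shm-usage'] } },
    '/tmp/chromium',
    ['--no-sandbox']
);
console.log(merged.options.args.length); // prints 4
```

In the bridge, the second and third arguments would come from `await chromium.executablePath` and `chromium.args`.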
Using the bridge
In your PHP code, when you use Browsershot:
$myBrowsershotInstance->setBinPath('/path/to/bin/browser.js');
It is also possible to run this file manually, as Browsershot does:
$ PATH=$PATH:/usr/local/bin NODE_PATH=`npm root -g` node 'bin/browser.js' \
'{"url":"https:\/\/google.fr\/","action":"screenshot","options":{"type":"png","args":["--disable-dev-shm-usage"],"viewport":{"width":1920,"height":1080},"ignoreHttpsErrors":true,"waitUntil":"domcontentloaded"}}'
If some base64 code is shown, then the bridge is working correctly!
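Rather than eyeballing the base64 output, it can be decoded and checked against the PNG file signature, since the command above requests a screenshot of type png. A minimal sketch, assuming the output of the command has been captured into a string; the function name `looksLikePng` is hypothetical.

```javascript
// The 8 bytes that every PNG file starts with
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decodes base64 text and checks for the PNG signature.
function looksLikePng(base64Output) {
    const bytes = Buffer.from(base64Output.trim(), 'base64');
    return bytes.length >= PNG_SIGNATURE.length
        && bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
}

// Example with a synthetic payload: the signature followed by arbitrary bytes
const sample = Buffer.concat([PNG_SIGNATURE, Buffer.from('data')]).toString('base64');
console.log(looksLikePng(sample)); // prints true
```

For a PDF action, the same idea applies with a different signature to look for.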