Capture screenshots of websites as a (host it yourself) API
This project is a wrapper around the capture-website library: https://github.com/sindresorhus/capture-website
docker pull robvanderleek/capture-website-api
docker run -it -p 8080:8080 robvanderleek/capture-website-api
curl 'localhost:8080/capture?url=https://news.ycombinator.com/' -o screenshot.png
git clone git@github.com:robvanderleek/capture-website-api.git && cd capture-website-api/standalone
docker build -t cwa .
docker run -it -p 8080:8080 cwa
curl 'localhost:8080/capture?url=https://www.youtube.com' -o screenshot.png
Run in a terminal:
git clone git@github.com:robvanderleek/capture-website-api.git && cd capture-website-api/standalone
yarn
yarn start
curl 'localhost:8080/capture?url=https://www.reddit.com' -o screenshot.png
Deploy and run on Vercel:
git clone git@github.com:robvanderleek/capture-website-api.git && cd capture-website-api/serverless
vercel deploy
vercel ls
curl "${SITE_URL}/api/capture?url=https://www.linkedin.com" -o screenshot.png
Call the /capture endpoint and pass the site URL using the query parameter url:
curl 'https://capture-website-api.vercel.app/api/capture?url=http://gmail.com' -o screenshot.png
Simple as that.
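A site URL that carries its own query string (containing ? or &) would be split into separate API parameters unless it is percent-encoded first. The following is a minimal sketch of one way to do that in plain shell. The helper names urlencode and capture_url are hypothetical and not part of this project, and the sketch assumes ASCII input and a server that decodes query parameters in the standard way.

```shell
# Hypothetical helper: percent-encode an ASCII string for use as a query value.
# Every character outside the RFC 3986 unreserved set is encoded.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# Hypothetical helper: $1 = capture endpoint, $2 = site URL to capture.
capture_url() {
  printf '%s?url=%s\n' "$1" "$(urlencode "$2")"
}

# Example (assumes a server listening on localhost:8080):
# curl "$(capture_url localhost:8080/capture 'https://example.com/item?id=1&p=2')" -o screenshot.png
```

If you would rather not carry a helper, curl can do the encoding itself with its -G and --data-urlencode options.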
Application configuration options can be set as environment variables or in a .env file in the root folder. There's an example .env file in the codebase: .env.example
Supported options are:
Name | Description | Default |
---|---|---|
TIMEOUT | Timeout in seconds for loading a web page | 20 |
CONCURRENCY | Number of captures that run in parallel; more memory allows more captures to run in parallel | 2 |
MAX_QUEUE_LENGTH | Requests that can't be handled directly are queued until the queue is full | 6 |
SHOW_RESULTS | Enable web endpoint to show latest capture | false |
SECRET | Secret string to prevent undesired usage on public endpoints | "" |
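As an illustration, a .env file that simply restates the defaults from the table might look like the sketch below. This is an assumption based on the table alone; the .env.example file shipped in the codebase may differ in content or layout.

```shell
# Sketch of a .env file using the defaults listed in the table (assumed layout).
TIMEOUT=20
CONCURRENCY=2
MAX_QUEUE_LENGTH=6
SHOW_RESULTS=false
# Empty by default; set a value when the endpoint is publicly reachable.
SECRET=
```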
Most of the configuration options from the wrapped capture-website library are supported using query parameters.
For example, to capture a site with a 650x350 viewport, no default background, and animations disabled, use:
curl 'https://capture-website-api.vercel.app/api/capture?url=http://amazon.com&width=650&height=350&scaleFactor=1&defaultBackground=false&disableAnimations=true&wait_before_screenshot_ms=300' -o screenshot.png
See https://github.com/sindresorhus/capture-website for a full list of options.
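When a request uses several options, it can be easier to list them one per argument than to maintain one long URL by hand. The sketch below joins key=value pairs into a request URL. The function name build_capture_url is hypothetical, and values are passed through unencoded, so it assumes they contain no characters that need escaping.

```shell
# Hypothetical helper: $1 = capture endpoint, remaining arguments = key=value options.
build_capture_url() (
  base=$1
  shift
  query=
  for pair in "$@"; do
    # Append the pair, separated by & from any pairs already collected.
    query="${query:+$query&}$pair"
  done
  printf '%s?%s\n' "$base" "$query"
)

# Example (assumes a server listening on localhost:8080):
# curl "$(build_capture_url localhost:8080/capture url=http://amazon.com \
#   width=650 height=350 disableAnimations=true)" -o screenshot.png
```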
You may need to wait for async requests or animations to finish before capturing the screenshot. There are two ways of doing this, both specified as query parameters:
- wait_before_screenshot_ms (in ms, defaults to 300) will wait before capturing a screenshot.
- The capture-website library's delay option (in seconds).

Sometimes the capture-website library has problems capturing sites. You can try to capture these sites with plain Puppeteer by supplying the query parameter plainPuppeteer=true.
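The two approaches can be combined in a small wrapper that tries a normal capture with a longer wait first and retries with plain Puppeteer only if that request fails. This is a sketch, not part of the project: the function name capture_with_fallback is hypothetical, the 1000 ms wait is an arbitrary example value, and it assumes the API answers a failed capture with an HTTP error status.

```shell
# Hypothetical helper: $1 = capture endpoint, $2 = site URL, $3 = output file.
capture_with_fallback() {
  # -f makes curl exit non-zero on HTTP error responses, which triggers the retry.
  curl -fsS "$1?url=$2&wait_before_screenshot_ms=1000" -o "$3" ||
    curl -fsS "$1?url=$2&plainPuppeteer=true" -o "$3"
}

# Example (assumes a server listening on localhost:8080):
# capture_with_fallback localhost:8080/capture https://www.reddit.com screenshot.png
```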
This app looks at two environment variables:
- SHOW_RESULTS: if true, the latest capture result can be viewed in the browser by browsing the base URL.
- SECRET: when set, all capture requests need to contain a query parameter secret whose value matches the value of this environment variable.

If you have suggestions for improvements, or want to report a bug, open an issue!
ISC © 2019 Rob van der Leek (https://twitter.com/robvanderleek)