russian-ira-facebook-ads-datasette

Explore 3,500 Facebook ads reported to have been bought by the Russian Internet Research Agency

Converting irads JSON to Datasette

The House Intelligence Committee released 3,517 Facebook ads that were reported to have been bought by the Russian Internet Research Agency as a set of redacted PDF files.

Companion blog post: Analyzing US Election Russian Facebook Ads

Ed Summers wrote a parser that converts those PDFs into a JSON file: https://github.com/umd-mith/irads

The script in this repository downloads that JSON file and converts it into a SQLite database for use with Datasette. Use it like this:

pip3 install sqlite-utils
python3 fetch_and_build_russian_ads.py \
    https://raw.githubusercontent.com/umd-mith/irads/master/site/index.json \
    russian-ads.db

This will produce a SQLite database called russian-ads.db. You can then explore it locally with Datasette like so:

pip3 install datasette
datasette russian-ads.db
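
The conversion step described above (a JSON file in, a SQLite database out) can be illustrated with a minimal sketch. This is not the code in fetch_and_build_russian_ads.py, which uses sqlite-utils; it is a standard-library-only illustration of the general idea. The function name `build_ads_db`, the `ads` table name, and the sample fields (`id`, `text`, `targeting`) are all hypothetical and are not taken from the irads JSON.

```python
import json
import sqlite3


def _to_sql_value(value):
    # Nested lists and dicts cannot be bound directly, so store them as JSON text.
    if isinstance(value, (list, dict)):
        return json.dumps(value)
    return value


def build_ads_db(ads, db_path):
    """Write a list of ad dicts into an `ads` table and return the row count.

    Assumes `ads` is non-empty and that keys contain no double quotes.
    """
    if not ads:
        raise ValueError("expected at least one ad record")
    # Collect every key that appears, so records with missing fields still fit.
    columns = sorted({key for ad in ads for key in ad})
    quoted = ", ".join('"{}"'.format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(_to_sql_value(ad.get(c)) for c in columns) for ad in ads]

    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS ads")
        conn.execute("CREATE TABLE ads ({})".format(quoted))
        conn.executemany(
            "INSERT INTO ads ({}) VALUES ({})".format(quoted, placeholders), rows
        )
    count = conn.execute("SELECT COUNT(*) FROM ads").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    # Hypothetical sample records; the real JSON has its own field names.
    sample_ads = [
        {"id": 1, "text": "Example ad", "targeting": ["US", "18-65+"]},
        {"id": 2, "text": "Another ad"},
    ]
    print(build_ads_db(sample_ads, "sample-ads.db"))
```

Storing nested values as JSON text keeps the table flat. A plugin such as datasette-json-html exists to render JSON values as HTML, though whether this repository's script stores data this way is an assumption.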

To see the full customized interface you will need to install a custom branch of Datasette plus a custom Datasette plugin. See the Dockerfile, or do this:

pip3 install https://github.com/simonw/datasette/archive/filter-plugin-hook.zip
pip3 install datasette-json-html pyyaml
python3 build_metadata.py
datasette russian-ads.db \
  -m russian-ads-metadata.json \
  --config default_page_size:50 --config sql_time_limit_ms:3000 \
  --config num_sql_threads:10 --config facet_time_limit_ms:3000 \
  --config allow_sql:off --config force_https_urls:1 \
  --plugins-dir=plugins --static static:static