A curated dataset of Academy Award nominations with IMDb unique identifiers.
BSD-2-CLAUSE License
The unique identifiers are key to disambiguating people and films with similar names.
Ceremony (int) - Ordinal of the ceremony the nomination was for (starting at 1).
Year (string) - Year(s) from which the films are honored.
Class (string) - A custom broad grouping for categories.
CanonicalCategory (string) - The category name with the variations in its exact wording over the years removed.
Category (string) - The precise category name according to Oscars.org.
NomId (uuid) - Unique string representing the IMDb Nomination ID.
Film (string) - The title of the film (optional).
FilmId (uuid) - Unique string representing the IMDb Title ID.
Name (string) - The precise text used for who is being nominated.
Nominees (comma separated strings) - The names of who is nominated in a comma separated list (without any extra text like "Written by").
NomineeIds (comma separated uuids) - Unique strings (or question marks) representing the IMDb Name ID.
Winner (bool) - True if the award was won.
Detail (string) - Detail about the nomination, which could be the character name, song title, etc.
Note (string) - Additional information provided about the award/nomination.
Citation (string) - Official text of the award statement, for Scientific/Technical/Honorary awards.
MultifilmNomination (bool) - Generally the data is one nomination per row, but for certain early nominations (Ceremonies 1, 2, 3 & 8), people were nominated for multiple films, and so one nomination could be spread over multiple rows.

Category (chron)
oscars_html/search_results.html
oscars_html/nominations.html
./parse_oscars_html.py -n
./parse_oscars_html.py
./add_fields_to_csv.py
./parse_citations.py
citations.yaml, and run parse_citations.py again as needed.
./scrape_imdb_html.py
./merge.py -w
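
As a rough illustration of the field types listed earlier, the following Python sketch converts a few of the fields from text into typed values. It makes several assumptions that are not confirmed here: that the data is tab-delimited text with a header row naming the fields, that Winner is stored as the text "True" or left blank, and that an unknown NomineeIds entry is written as one or more question marks. The two sample rows are invented placeholders, not real nominations, and the helper names (split_list, parse_row, load) are hypothetical.

```python
import csv
import io

# Invented placeholder rows (NOT real nominations).
# Assumption: tab-delimited text with a header row of field names.
SAMPLE = (
    "Ceremony\tYear\tCategory\tFilm\tFilmId\tNominees\tNomineeIds\tWinner\n"
    "1\t1927/28\tEXAMPLE CATEGORY\tExample Film\ttt0000000\t"
    "Person A, Person B\tnm0000001,?\tTrue\n"
    "1\t1927/28\tEXAMPLE CATEGORY\tOther Film\ttt0000001\t"
    "Person C\tnm0000002\t\n"
)


def split_list(value):
    """Split a comma separated field into stripped, non-empty parts."""
    return [part.strip() for part in value.split(",") if part.strip()]


def parse_row(row):
    """Convert one row of strings into typed values (hypothetical helper)."""
    # Assumption: an entry made up only of question marks means "ID unknown".
    ids = [None if not i.strip("?") else i for i in split_list(row["NomineeIds"])]
    return {
        "ceremony": int(row["Ceremony"]),
        "year": row["Year"],                # kept as a string, e.g. "1927/28"
        "film": row["Film"] or None,        # Film is optional
        "film_id": row["FilmId"] or None,
        "nominees": split_list(row["Nominees"]),
        "nominee_ids": ids,
        # Assumption: a win is stored as the text "True"; anything else is False.
        "winner": row["Winner"].strip().lower() == "true",
    }


def load(text):
    """Parse tab-delimited text into a list of typed records."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [parse_row(r) for r in reader]


records = load(SAMPLE)
```

Splitting on commas this way would mis-split any nominee name that itself contains a comma, so a real loader may need extra care for the Nominees field.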