Scripts to scrape quiz content off edX, remove the answers, and consolidate everything into a single HTML file for print-out, practice, or review purposes.
This project requires:
Get the source code onto your computer:
If you have git:
git clone git@github.com:weilu/towardmit.git
Otherwise, click the green "Clone or download" button on this page, then click "Download ZIP" and unzip the file you downloaded.
Then open the Terminal app (macOS) or the equivalent console on other Unix systems and execute the following commands:
cd towardmit
pip install -r requirements.txt
cp scrape_sample.sh scrape.sh
In the scrape.sh file, replace [your request headers] with the request headers obtained from your browser. The request headers are a series of '-H' options, which include your edX login details. The header needed to scrape the courses is the cookie header, which can be identified because it starts with the word Cookie (e.g. -H 'Cookie: __cf...').
These headers can be found by doing the following.
(These instructions are for the Chrome or Chromium web browser and were tested on Linux and macOS.)
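Once the '-H' options have been copied, a quick sanity check can confirm that a Cookie header is among them and that the placeholder has been replaced. The following is a minimal sketch, not part of this project: the helper names extract_headers and has_placeholder are hypothetical, and the sample header values are made-up placeholders rather than real edX cookies.

```python
import shlex

# The placeholder text that scrape.sh asks you to replace.
PLACEHOLDER = "[your request headers]"


def extract_headers(curl_options):
    """Parse a string of curl-style -H options into a {name: value} dict."""
    tokens = shlex.split(curl_options)  # honours the single quotes around each header
    headers = {}
    i = 0
    while i < len(tokens):
        if tokens[i] == "-H" and i + 1 < len(tokens):
            # Split "Name: value" on the first colon only.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        else:
            i += 1
    return headers


def has_placeholder(script_text):
    """Return True if the placeholder has not been replaced yet."""
    return PLACEHOLDER in script_text


# Made-up sample values, for illustration only.
sample = "-H 'Accept: text/html' -H 'Cookie: sessionid=PLACEHOLDER; csrftoken=PLACEHOLDER'"
headers = extract_headers(sample)
print("Cookie header present:", "Cookie" in headers)
print("Placeholder still present:", has_placeholder(sample))
```

To check your own file, pass the text of scrape.sh to has_placeholder and the pasted options to extract_headers. Avoid printing or sharing the cookie values themselves, since they contain your login details.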
You may need to edit the scrape.py file to match the version of the course you are enrolled in. This can be achieved by doing the following.
python scrape.py
The generated quiz HTML files can be found in your out directory.
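To confirm that the scrape produced output, the generated files can be listed with a few lines of Python. This is a minimal sketch under two assumptions not stated above: that the files carry a .html extension and that they sit directly inside out rather than in subdirectories. The function name list_quiz_files is hypothetical.

```python
from pathlib import Path


def list_quiz_files(out_dir="out"):
    """Return the sorted names of .html files directly inside out_dir."""
    directory = Path(out_dir)
    if not directory.is_dir():
        # Nothing has been generated yet, or the script ran from another folder.
        return []
    return sorted(p.name for p in directory.glob("*.html"))


print(list_quiz_files())
```

Run it from the towardmit directory. An empty list suggests that the scrape did not write any files, or that one of the assumptions above does not hold.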