A Python library for automating interaction with websites.
MIT License

.. image:: /assets/mechanical-soup-logo.png
   :alt: MechanicalSoup. A Python library for automating website interaction.

https://mechanicalsoup.readthedocs.io/
A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn't do JavaScript.

MechanicalSoup was created by `M Hickford
<https://github.com/hickford/>`__, who was a fond user of the
`Mechanize <https://github.com/jjlee/mechanize>`__ library.
Unfortunately, Mechanize was `incompatible with Python 3 until 2019
<https://github.com/python-mechanize/mechanize/issues/9>`__ and its development
stalled for several years. MechanicalSoup provides a similar API, built on Python
giants `Requests <http://docs.python-requests.org/en/latest/>`__ (for
HTTP sessions) and `BeautifulSoup
<https://www.crummy.com/software/BeautifulSoup/>`__ (for document
navigation). Since 2017 it has been actively maintained by a small
team including `@hemberger <https://github.com/hemberger>`__ and `@moy
<https://github.com/moy/>`__.

|Gitter Chat|
|Latest Version| |Supported Versions|
PyPy3 is also supported (and tested against).

Download and install the latest released version from `PyPI
<https://pypi.python.org/pypi/MechanicalSoup/>`__::

  pip install MechanicalSoup

Download and install the development version from `GitHub
<https://github.com/MechanicalSoup/MechanicalSoup>`__::

  pip install git+https://github.com/MechanicalSoup/MechanicalSoup

Installing from source (installs the version in the current working directory)::

  python setup.py install

(In all cases, add ``--user`` to the ``install`` command to
install in the current user's home directory.)


The full documentation is available on
https://mechanicalsoup.readthedocs.io/. You may want to jump directly to
the `automatically generated API documentation
<https://mechanicalsoup.readthedocs.io/en/stable/mechanicalsoup.html>`__.


From `<examples/expl_qwant.py>`__, code to get the results from
a Qwant search:


.. code:: python

    """Example usage of MechanicalSoup to get the results from the Qwant
    search engine.
    """

    import re
    import mechanicalsoup
    import urllib.parse

    # Connect to Qwant
    browser = mechanicalsoup.StatefulBrowser(user_agent='MechanicalSoup')
    browser.open("https://lite.qwant.com/")

    # Fill-in the search form
    browser.select_form('#search-form')
    browser["q"] = "MechanicalSoup"
    browser.submit_selected()

    # Display the results
    for link in browser.page.select('.result a'):
        # Qwant shows redirection links, not the actual URL, so extract
        # the actual URL from the redirect link:
        href = link.attrs['href']
        m = re.match(r"^/redirect/[^/]*/(.*)$", href)
        if m:
            href = urllib.parse.unquote(m.group(1))
        print(link.text, '->', href)

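
The redirect handling in the loop above can be tried offline, without contacting Qwant. The following is a minimal sketch and not part of MechanicalSoup: the helper name ``extract_target`` and the sample ``href`` values are hypothetical, and it assumes hrefs of the form ``/redirect/<token>/<percent-encoded URL>``, which is the shape the example's regular expression expects.

```python
import re
import urllib.parse

# Same pattern as in the Qwant example: skip one path segment after
# "/redirect/" and capture everything that follows it.
REDIRECT_RE = re.compile(r"^/redirect/[^/]*/(.*)$")


def extract_target(href):
    """Return the decoded target of a redirect-style href.

    Hrefs that do not match the redirect pattern are returned unchanged.
    """
    m = REDIRECT_RE.match(href)
    if m:
        return urllib.parse.unquote(m.group(1))
    return href


# Hypothetical hrefs, for illustration only.
print(extract_target("/redirect/abc123/https%3A%2F%2Fexample.com%2Fdocs"))
print(extract_target("https://example.com/direct"))
```

Keeping the extraction in a small function makes it straightforward to check the pattern against saved hrefs before running the full script against the live site.
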

More examples are available in `<examples/>`__.

For an example with a more complex form (checkboxes, radio buttons and
textareas), read `<tests/test_browser.py>`__
and `<tests/test_form.py>`__.

|Build Status| |Coverage Status| |Documentation Status| |CII Best Practices|

Instructions for building, testing and contributing to MechanicalSoup:
see `<CONTRIBUTING.rst>`__.

Read the `FAQ
<https://mechanicalsoup.readthedocs.io/en/stable/faq.html>`__.


.. |Latest Version| image:: https://img.shields.io/pypi/v/MechanicalSoup.svg
   :target: https://pypi.python.org/pypi/MechanicalSoup/
.. |Supported Versions| image:: https://img.shields.io/pypi/pyversions/mechanicalsoup.svg
   :target: https://pypi.python.org/pypi/MechanicalSoup/
.. |Build Status| image:: https://github.com/MechanicalSoup/MechanicalSoup/actions/workflows/python-package.yml/badge.svg?branch=main
   :target: https://github.com/MechanicalSoup/MechanicalSoup/actions/workflows/python-package.yml?query=branch%3Amain
.. |Coverage Status| image:: https://codecov.io/gh/MechanicalSoup/MechanicalSoup/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/MechanicalSoup/MechanicalSoup
.. |Documentation Status| image:: https://readthedocs.org/projects/mechanicalsoup/badge/?version=latest
   :target: https://mechanicalsoup.readthedocs.io/en/latest/?badge=latest
.. |CII Best Practices| image:: https://bestpractices.coreinfrastructure.org/projects/1334/badge
   :target: https://bestpractices.coreinfrastructure.org/projects/1334
.. |Gitter Chat| image:: https://badges.gitter.im/MechanicalSoup/MechanicalSoup.svg
   :target: https://gitter.im/MechanicalSoup/Lobby