Internubel-website-scraping

This repository contains a Puppeteer-based script for scraping product details from Internubel's website.

Internubel product data scraping script

This repository contains a Puppeteer-based script for scraping product data from Internubel's website (https://www.internubel.be).

The script logs into the site, navigates through product groups, and extracts product details including title, image, nutrition score, and nutritional information.
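The login and extraction steps described above could look roughly like the sketch below. It is not the code from internubel.js: the function names and every CSS selector are hypothetical placeholders, and only the base URL comes from this README. Both functions take a Puppeteer page object as an argument.

```javascript
// Sketch only: selectors and function names are assumptions, not the real ones.
const BASE_URL = 'https://www.internubel.be'; // base URL given in this README

// Log in through a form on the site using a Puppeteer Page object.
async function login(page, email, password) {
  await page.goto(BASE_URL, { waitUntil: 'networkidle2' });
  await page.type('#email', email);       // hypothetical selector
  await page.type('#password', password); // hypothetical selector
  // Start waiting for the navigation before clicking, to avoid a race.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle2' }),
    page.click('button[type="submit"]'),  // hypothetical selector
  ]);
}

// Read two of the product fields from a product detail page.
async function extractProduct(page) {
  const title = await page.$eval('h1', (el) => el.textContent.trim());
  const image = await page.$eval('img.product-image', (el) => el.src); // hypothetical selector
  return { title, image };
}
```

The nutrition score and nutritional information would be read the same way, with selectors matching the real page markup.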

Data is structured and saved into JSON files categorized by product groups, sub-groups, and sub-sub-groups.
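As an illustration of that layout, here is a minimal sketch that writes one JSON file per sub-sub-group inside nested group and sub-group folders. The helper names (slugify, saveProducts) and the exact folder and file naming are assumptions; the real script may organize its output differently.

```javascript
const fs = require('fs');
const path = require('path');

// Turn a group name into a safe file or directory name.
function slugify(name) {
  return name
    .toLowerCase()
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '') // strip accents left by NFD decomposition
    .replace(/[^a-z0-9]+/g, '-')     // collapse anything else into dashes
    .replace(/^-+|-+$/g, '');        // trim leading and trailing dashes
}

// groupPath is [group, subGroup, subSubGroup]; the last entry names the file.
function saveProducts(baseDir, groupPath, products) {
  if (groupPath.length === 0) {
    throw new Error('groupPath must contain at least one name');
  }
  const parts = groupPath.map(slugify);
  const fileName = `${parts.pop()}.json`;
  const dir = path.join(baseDir, ...parts);
  fs.mkdirSync(dir, { recursive: true });
  const filePath = path.join(dir, fileName);
  fs.writeFileSync(filePath, JSON.stringify(products, null, 2), 'utf8');
  return filePath;
}
```

Calling saveProducts('data', ['Fruits', 'Fruits frais', 'Pommes'], products) would write the products array to data/fruits/fruits-frais/pommes.json.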

Prerequisites

Requirements for the script:

Installation

  1. Clone the repository
    git clone https://github.com/Jihefel/Internubel-website-scraping.git
    
  2. Install NPM packages
    npm install
    
  3. Change line 14 of internubel.js to select the language you want to use.
    • Replace "francais" with one of the languages listed just above it (line 11) if needed: "nederlands", "english", "deutsch"
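If you would rather not edit the source for each run, the language choice could be validated in one place, as in the sketch below. This is not how internubel.js currently works: the resolveLanguage helper and the SCRAPE_LANGUAGE environment variable are hypothetical, and only the four language values come from this README.

```javascript
// The four values listed in this README.
const SUPPORTED_LANGUAGES = ['francais', 'nederlands', 'english', 'deutsch'];

// Return a normalized language name, or throw if it is not supported.
function resolveLanguage(value = 'francais') {
  const lang = String(value).trim().toLowerCase();
  if (!SUPPORTED_LANGUAGES.includes(lang)) {
    throw new Error(
      `Unsupported language "${value}". Use one of: ${SUPPORTED_LANGUAGES.join(', ')}`
    );
  }
  return lang;
}

// SCRAPE_LANGUAGE is a hypothetical variable; "francais" is used when it is unset.
const language = resolveLanguage(process.env.SCRAPE_LANGUAGE);
```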

Config .env variables

Create a .env configuration file in the root directory with the following contents to store your credentials:

LOGIN_EMAIL=your_email
LOGIN_PASSWORD=your_password

Replace your_email and your_password with your Internubel login credentials.
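A script can check early that both variables are present, so that a missing .env file fails with a clear message rather than a failed login. The sketch below assumes the .env file has already been loaded into process.env (for example by a package such as dotenv, which this README does not confirm); getCredentials is a hypothetical helper name.

```javascript
// Read the two variables described above and fail fast if either is missing.
// Assumes .env was already loaded into process.env by some loader.
function getCredentials(env = process.env) {
  const email = env.LOGIN_EMAIL;
  const password = env.LOGIN_PASSWORD;
  const missing = [];
  if (!email) missing.push('LOGIN_EMAIL');
  if (!password) missing.push('LOGIN_PASSWORD');
  if (missing.length > 0) {
    throw new Error(`Missing .env variable(s): ${missing.join(', ')}`);
  }
  return { email, password };
}
```

Passing the environment as a parameter keeps the function easy to test with a plain object instead of the real process.env.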

Usage

Run the scraping script using Node.js in your terminal:

node internubel.js

Then wait for the script to finish.

The script launches a Puppeteer-controlled browser, logs into Internubel using the credentials from your .env file, and scrapes product data into structured JSON files stored in the data directory.

Dependencies

License

Distributed under the MIT License. See MIT License for more information.