Code repo for Packt course I developed, "Beginning Data Wrangling with Python"
MIT License
“Data is the new oil,” and it shapes modern life through smart tools and transformative technologies. But oil does not come out of the rig in its final form; it has to be refined through a complex processing network. Similarly, data needs to be curated, massaged, and refined before it can be used in intelligent algorithms and consumer products. This is called “wrangling,” and (according to Forbes) good data scientists spend roughly 60-80% of their time on it, every day, on every project. It involves scraping raw data from multiple sources (including the web and database tables), then imputing, formatting, and transforming it: basically, making it ready to be used reliably in the modeling process.
This course aims to teach you the core ideas behind this process and to equip you with knowledge of the most popular tools and techniques in the domain. As the programming framework, we have chosen Python, the most widely used language for data science. We work through real-life examples, not toy datasets. By the end of this course, you will be confident handling a wide variety of sources to extract, clean, transform, and format your data for the machine learning app you are thinking of building. Hop on and be a part of this exciting journey.
For an optimal student experience, we recommend the following hardware configuration:
You’ll also need the following software installed in advance:
Browser: Google Chrome or Mozilla Firefox (latest version)
Python 3.4+ (preferably Python 3.6) installed (from https://python.org)
Python libraries as needed (Jupyter, NumPy, pandas, Matplotlib, BeautifulSoup4, and so on)
Notepad++/Sublime Text (latest version), Atom IDE (latest version) or other similar text editor applications.
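The software requirements above can be checked with a short script before starting. This is only a minimal sketch and not part of the course materials: the helper names `missing_modules` and `python_is_supported` are hypothetical, and the module list assumes the usual import names for the libraries mentioned (BeautifulSoup4 is imported as `bs4`).

```python
import sys
from importlib.util import find_spec

# Assumed import names for the libraries listed above
# (BeautifulSoup4 is imported as "bs4").
REQUIRED_MODULES = ["jupyter", "numpy", "pandas", "matplotlib", "bs4"]

# Minimum Python version stated in the requirements (3.4+).
MIN_PYTHON = (3, 4)


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.4 or newer is required; found", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries were found.")
```

The script only checks that each module can be located; it does not import the libraries or verify their versions.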
The following Python libraries installed: