advertools - online marketing productivity and analysis tools
=============================================================

License: MIT

.. image:: https://img.shields.io/pypi/v/advertools.svg
   :target: https://pypi.python.org/pypi/advertools

.. image:: https://readthedocs.org/projects/advertools/badge/?version=latest
   :target: https://advertools.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://static.pepy.tech/badge/advertools
   :target: http://pepy.tech/project/advertools
Announcing: `Data Science with Python for SEO course <https://bit.ly/dsseo-course>`_. A cohort-based course, interactive, with live coding.
advertools: productivity & analysis tools to scale your online marketing
========================================================================

| A digital marketer is a data scientist.
| Your job is to manage, manipulate, visualize, communicate, understand, and make decisions based on data.
You might be doing basic stuff, like copying and pasting text on spreadsheets; you might be running large-scale automated platforms with sophisticated algorithms; or you might be somewhere in between. In any case, your job is all about working with data.
As a data scientist, you don't spend most of your time producing cool visualizations or finding great insights. The majority of your time is spent wrangling with URLs, figuring out how to stitch together two tables, hoping that the dates won't break without you knowing, or trying to generate the next 124,538 keywords for an upcoming campaign, by the end of the week!
advertools is a Python package that can hopefully make that part of your job a little easier.
.. code:: bash

    python3 -m pip install advertools
It's very easy to learn how to use advertools. There are two main reasons for that.
First, it is essentially a set of independent functions that you can easily learn and use. There are no special data structures or additional concepts that you need to learn. With basic Python, and an understanding of the tasks that these functions help with, you should be able to pick it up fairly easily. In other words, if you know how to use an Excel formula, you can easily use any advertools function.
The second reason is that advertools follows the UNIX philosophy in its design and approach. Here is one of the various summaries of the UNIX philosophy, by Doug McIlroy:

    Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
Let's see how advertools follows that:
* **Do one thing and do it well:** Each function in advertools aims for that. There is a function that just extracts hashtags from a text list, another one to crawl websites, one to test which URLs are blocked by robots.txt files, and one for downloading XML sitemaps. Although they are designed to work together as a full pipeline, they can be run independently in whichever combination or sequence you want.
* **Write programs to work together:** Independence does not mean they are unrelated. The workflows are designed to aid the online marketing practitioner in various steps for understanding websites, SEO analysis, creating SEM campaigns, and more.
* **Write programs to handle text streams, because that is a universal interface:** In data science, the most used data structure that can be considered "universal" is the DataFrame. So most functions return either a DataFrame or a file that can be read into one. Once you have it, you have the full power of all other tools, like pandas for further manipulating the data, Plotly for visualization, or any machine learning library that can more easily handle tabular data.
This approach keeps advertools modular, as well as flexible and integrated.
As a next step, most of these functions are being converted to no-code `interactive apps <https://adver.tools>`_ for non-coders, taking them to the next level.
SEM
---

The most important thing to achieve in SEM is a proper mapping between the three main elements of a search campaign:

**Keywords** (the intention) -> **Ads** (your promise) -> **Landing Pages** (your delivery of the promise)

Once you have this done, you can focus on management and analysis. More importantly, once you know you can set this up in an easy way, you can focus on more strategic issues. In practical terms, you need two main tables to get started:
* **Keywords:** You can `generate keywords <https://advertools.readthedocs.io/en/master/advertools.kw_generate.html>`_ (note I didn't say research) with the ``kw_generate`` function.
* **Ads:** There are two approaches that you can use: check the `ad_create <https://advertools.readthedocs.io/en/master/advertools.ad_create.html>`_ function for creating ads from a template, and the `ad_from_string <https://advertools.readthedocs.io/en/master/advertools.ad_from_string.html>`_ function for splitting an existing piece of text into ads.
**Tutorials and additional resources**

* `Data Science for Digital Marketing and SEO/SEM <https://www.oncrawl.com/technical-seo/data-science-seo-digital-marketing-guide-beginners/>`_
* `Setting up a full SEM campaign <https://www.datacamp.com/community/tutorials/sem-data-science>`_, a tutorial on DataCamp's website
* `Generating SEM keywords with Python <https://www.datacamp.com/projects/400>`_ on DataCamp
* `Setting up SEM campaigns on a large scale <https://www.semrush.com/blog/setting-up-search-engine-marketing-campaigns-on-large-scale/>`_, a tutorial on SEMrush
* A `tool to generate keywords <https://www.dashboardom.com/advertools>`_ online, based on the ``kw_generate`` function

SEO
---

Probably the most comprehensive online marketing area that is both technical (crawling, indexing, rendering, redirects, etc.) and non-technical (content creation, link building, outreach, etc.). Here are some tools that can help with your SEO:
`SEO crawler <https://advertools.readthedocs.io/en/master/advertools.spider.html>`_
    A generic SEO crawler that can be customized, built with Scrapy, with several features.
`robots.txt downloader <https://advertools.readthedocs.io/en/master/advertools.sitemaps.html#advertools.sitemaps.robotstxt_to_df>`_
    A simple downloader of robots.txt files into a DataFrame format, so you can keep track of changes across crawls (if any), and check the rules, sitemaps, etc.
`XML sitemaps downloader / parser <https://advertools.readthedocs.io/en/master/advertools.sitemaps.html>`_
    An essential part of any SEO analysis is to check XML sitemaps. This is a simple function with which you can download one or more sitemaps (by providing the URL of a robots.txt file, a sitemap file, or a sitemap index file).
`SERP importer and parser for Google & YouTube <https://advertools.readthedocs.io/en/master/advertools.serp.html>`_
    Connect to Google's API and get the search data you want. Multiple search parameters are supported, all in one function call, and all results are returned in a DataFrame.
**Tutorials and additional resources**

* A tool based on the `serp_goog function to get SERP rankings on Google <https://www.dashboardom.com/google-serp>`_
* `Analyzing SERPs on a large scale with Python <https://www.semrush.com/blog/analyzing-search-engine-results-pages/>`_ on SEMrush
* `SERP datasets on Kaggle <https://www.kaggle.com/eliasdabbas/datasets?search=engine>`_ for practicing on different industries and use cases
* `SERP notebooks on Kaggle <https://www.kaggle.com/eliasdabbas/notebooks?sortBy=voteCount&group=everyone&pageSize=20&userId=484496&tagIds=1220>`_
* `Content Analysis with XML Sitemaps and Python <https://www.semrush.com/blog/content-analysis-xml-sitemaps-python/>`_
* XML sitemap datasets: `news sites <https://www.kaggle.com/eliasdabbas/news-sitemaps>`_, `Turkish news sites <https://www.kaggle.com/eliasdabbas/turk-haber-sitelerinin-site-haritalari>`_, and `Bloomberg news <https://www.kaggle.com/eliasdabbas/bloomberg-business-articles-urls>`_
_URLs, page titles, tweets, video descriptions, comments, hashtags are some
examples of the types of text we deal with. advertools
provides a few
options for text analysis
`Word frequency <https://advertools.readthedocs.io/en/master/advertools.word_frequency.html>`_
    Counting words in a text list is one of the most basic and important tasks in text mining. What is also important is counting those words while taking into consideration their relative weights in the dataset. ``word_frequency`` does just that.
`URL analysis <https://advertools.readthedocs.io/en/master/advertools.urlytics.html>`_
    We all have to handle many thousands of URLs in reports, crawls, social media extracts, XML sitemaps, and so on. ``url_to_df`` converts your URLs into easily readable DataFrames.
`Emoji <https://advertools.readthedocs.io/en/master/advertools.emoji.html>`_
    Produced with one click, extremely expressive, highly diverse (3k+ emoji), and very popular, it's important to capture what people are trying to communicate with emoji. Extracting emoji, and getting their names, groups, and sub-groups, is possible. The full emoji database is also available for convenience, as well as an ``emoji_search`` function in case you want some ideas for your next social media post or any other kind of communication.
`extract_ functions <https://advertools.readthedocs.io/en/master/advertools.extract.html>`_
    The text that we deal with contains many elements and entities that have their own special meaning and usage. There is a group of convenience functions to help in extracting, and getting basic statistics about, structured entities in text: emoji, hashtags, mentions, currency, numbers, URLs, questions, and more. You can also provide a special regex for your own needs.
`Stopwords <https://advertools.readthedocs.io/en/master/advertools.stopwords.html>`_
    A list of stopwords in forty different languages to help in text analysis.
**Tutorials and additional resources**

* A tutorial on DataCamp for creating the ``word_frequency`` function and explaining the importance of the difference between `absolute and weighted word frequency <https://www.datacamp.com/community/tutorials/absolute-weighted-word-frequency>`_
* `Text Analysis for Online Marketers <https://www.semrush.com/blog/text-analysis-for-online-marketers/>`_, an introductory article on SEMrush
Social Media
------------

In addition to the text analysis techniques provided, you can also connect to the Twitter and YouTube data APIs. The main benefits of using advertools for this:

* **Handles pagination and request limits:** typically, every API has a limited number of results that it returns per request. You have to handle pagination when you need more than the limit per request, which you typically do. This is handled by default.
* **DataFrame results:** APIs send you back data in formats that need to be parsed and cleaned so you can more easily start your analysis. This is also handled automatically.
* **Multiple requests:** in YouTube's case, you might want to request data for the same query across several countries, languages, channels, etc. You can specify them all in one request and get the product of all the requests in one response.
**Tutorials and additional resources**

* A visual tool to check `what is trending on Twitter <https://www.dashboardom.com/trending-twitter>`_ for all available locations
* `A Twitter data analysis dashboard <https://www.dashboardom.com/twitterdash>`_ with many options
* `How to use the Twitter data API with Python <https://www.kaggle.com/eliasdabbas/twitter-in-a-dataframe>`_
* `Extracting entities from social media posts <https://www.kaggle.com/eliasdabbas/extract-entities-from-social-media-posts>`_, a tutorial on Kaggle
* `Analyzing 131k tweets <https://www.kaggle.com/eliasdabbas/extract-entities-from-social-media-posts>`_ by European football clubs, a tutorial on Kaggle
* `An overview of the YouTube data API with Python <https://www.kaggle.com/eliasdabbas/youtube-data-api>`_
Conventions
-----------

Function names mostly start with the object you are working on, so you can use autocomplete to discover other options:

| ``kw_``: for keywords-related functions
| ``ad_``: for ad-related functions
| ``url_``: for URL tracking and generation
| ``extract_``: for extracting entities from social media posts (mentions, hashtags, emoji, etc.)
| ``emoji_``: for emoji-related functions and objects
| ``twitter``: a module for querying the Twitter API and getting results in a DataFrame
| ``youtube``: a module for querying the YouTube Data API and getting results in a DataFrame
| ``crawlytics``: a module for analyzing crawl data (compare, links, redirects, and more)
| ``serp_``: get search engine results pages in a DataFrame; currently available: Google and YouTube
| ``crawl``: a function you will probably use a lot if you do SEO
| ``*_to_df``: a set of convenience functions for converting to DataFrames (log files, XML sitemaps, robots.txt files, and lists of URLs)