Progression path for a GIS analyst who wants to become proficient in using Python for GIS: from apprentice to guru
MIT License
This is a work in progress
This is an attempt to provide a structured collection of resources that could help a GIS professional to learn how to use Python when working with spatial data management, mapping, and analysis. The resources are organized by progress category so basically everyone should be able to learn something new along the way.
The resources will include books, web pages and blog posts, online courses, videos, Q/A from GIS.SE, links to code snippets, and some bedtime readings.
The resources will be applicable both for Esri software users as well as open-source GIS professionals.
You should be able to write short, simple scripts in pure Python with no connection to GIS. To learn the basics of Python, you can find plenty of resources online, such as Codecademy, Learn Python the Hard Way, Dive into Python, A Whirlwind Tour of Python, and many other books listed on Python.org and in the Free programming books GitHub repo.
If you don't want to learn Python this way and would rather learn how Python can be used for GIS right away:
Going through these books may be sufficient to learn everything you may ever need if you are an Esri or an open-source GIS user, respectively.
Look for videos on the Esri Video web page: search for Python and sort by most recent. An example of URL.
At this point, you should be able to:
arcpy site-package or ogr/gdal/pyqgis libraries
arcpy.Result object returned) / ogr geometry methods / PyQGIS tools from Python code
arcpy / ogr listing functions
arcpy.da cursors or ogr data source methods
arcpy.Geometry() objects (accessing their properties and methods) or ogr.Feature()
arcpy.mapping module or pyqgis module
At this point, you should be familiar with:
for and while loops, if-elif-else blocks
import os)
return statement)
open function
.csv files using the csv module and unicodecsv module
This section contains examples of tasks that you might need to write at some point. Implementing these tasks in Python code would be a good sign that you have mastered the basics of Python for GIS.
.txt or a .csv file information about your GIS assets
Now, for getting started with Python development, Visual Studio Code with Python extension(s) is arguably the best choice. It's completely free, you can install it on any of your physical or virtual machines, and it has great support for Python development. Among commercial IDEs, Wing IDE or PyCharm would be a great choice.
Learn about VCS such as Git for managing the source code. Bitbucket by Atlassian, GitLab, and GitHub all provide free public and private repositories.
Git, read the Pro Git book for free online
Watch Python courses on training sites such as Pluralsight [Python Fundamentals](https://app.pluralsight.com/library/courses/python-fundamentals/table-of-contents), Enthought Python Foundation Series, or Safari Books Online
Watch Esri video Python: Useful Libraries for the GIS Professional
Learn about type hinting in Python. There is an excellent blog post on how type hints are used in PyCharm and a help page from Wing IDE people. Find out whether your Python IDE supports static code analysis and start using the type hints (with support both for Python 2.7 and 3.5+).
typing module in Python 3.6
Learn how to run other Python programs or executable files from your program using the subprocess module. This is handy when you need to run an .exe program in the middle of your Python program. This is often the case when you use arcpy / ogr code at the beginning of the script, but then need to run an ArcObjects console app or a compiled C++ app to get something done before you can proceed further.
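The subprocess step described above can be sketched as follows. This is a minimal illustration, not the only way to do it: the helper name `run_external` is hypothetical, the current Python interpreter stands in for a real .exe, and `capture_output` assumes Python 3.7 or later (so it would not run under the Python 2.7 shipped with ArcGIS Desktop).

```python
import subprocess
import sys


def run_external(args, timeout=60):
    """Run an external program and return its standard output as text.

    Raises subprocess.CalledProcessError if the program exits with a
    non-zero code, so the calling script stops instead of continuing
    with missing results.
    """
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise on a non-zero exit code
        timeout=timeout,      # give up if the program hangs
    )
    return completed.stdout.strip()


# Stand-in for an external .exe: the current interpreter printing one line.
output = run_external([sys.executable, "-c", "print('converted 42 features')"])
print(output)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces, which are common in GIS workspaces.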
arcpy in repo arcapi
arcpy.mapping (20+ tools)
virtualenv and venv for managing Python 2 and Python 3 environments, respectively
tox and retox to run your tests/programs on multiple Python installations (can be handy when building script tools to be used both in ArcGIS Desktop with Python 2.7 and ArcGIS Pro with Python 3.x or QGIS with Python 2.7 and QGIS with Python 3.x)
At this point, you should be able to:
arcpy.mapping with data-driven pages or pyqgis
.pdf files (e.g. re-ordering, merging, splitting) using arcpy or pure Python packages such as pypdf2
.png and .pdf
arcpy.ArcSDESQLExecute() or GDALDataset::ExecuteSQL()
FieldInfo, FieldMap, and FieldMappings classes from arcpy or ogr.FieldDefn() to manage data schema changes
ToolValidator class or build simple QGIS plugins
arcpy-driven code with the help of geoprocessing messages
JSON data in Python and arcpy and GeoJSON for ogr
xlrd Python package
xlsxwriter package or xlwt
arcpy.da.Walk() and os.walk() to traverse folders with GIS datasets recursively
At this point, you should be familiar with:
pip
PYTHONPATH environment variable and concept of paths and running Python programs from cmd
collections module data structures such as defaultdict, namedtuple, Counter
enumerate function
*args and **kwargs
SQLite from Python
try/except block
traceback module
ftplib module
cmd and a task scheduler
zipfile module for .zip files and tarfile for .tar and .tar.gz files)
Twilio
logging module) - handy to use instead of print statements
Learn how to use ArcObjects from Python:
Learn how to access ArcGIS Pro .NET libraries from Python:
pythonnet package
clr
Learn about other GIS packages:
Learn about how to build desktop GUI applications using Tkinter, WxPython, PyQt, PySide, or Kivy and then embed them into ArcGIS or just let them be aware of spatial datasets:
Learn about computational geometry and find out how it can help you in your work. Maybe you need a tool that is not present in your desktop GIS, or you are looking for something that performs faster. There are two main computational geometry libraries, Qhull (written in C) and CGAL (written in C++):
scipy.spatial module, an exceptional tool for anyone who deals with geometrical data.
SWIG. The CGAL is somewhat difficult to install and compile, but does provide much richer functionality.
qhull and CGAL, respectively.
Learn about using Python with FME Desktop:
Learn the ArcGIS REST API:
requests module
Learn about managing and processing larger spatial datasets as performance will matter:
cProfile)
multiprocessing module with ArcGIS at Esri blog post Multiprocessing with ArcGIS – Approaches and Considerations (Part 1)
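As a rough illustration of the profiling item above, the sketch below runs cProfile on a deliberately slow function and formats the result with pstats. The function `slow_distance_sum` and the sample points are invented for the example; in real work you would profile your own geoprocessing function instead.

```python
import cProfile
import io
import pstats


def slow_distance_sum(points):
    """Sum pairwise planar distances with a deliberately naive double loop."""
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            total += ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    return total


# Made-up sample coordinates, just to give the profiler some work to measure.
points = [(float(i), float(i % 7)) for i in range(200)]

profiler = cProfile.Profile()
profiler.enable()
result = slow_distance_sum(points)
profiler.disable()

# Format the ten most expensive calls, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Profiling before reaching for multiprocessing is usually worthwhile, because the report shows whether the time is spent in your own loops or inside library calls.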
Learn about using Python for Big Data management and analysis
PySpark that will let you use Spark in Python as well as various geospatial libraries that will let you do geospatial analysis using Spark: magellan, spatialspark, and GeoSpark
Hadoop and Presto for large data analysis at Uber: Query the planet: Geospatial big data analytics at Uber
Learn about Esri File Geodatabase C++ API with .NET bindings to be able to work with file geodatabases programmatically using C++ or .NET
ESRI File Geodatabase (OpenFileGDB) and ESRI File Geodatabase (FileGDB) GDAL drivers to connect to Esri file geodatabase programmatically or using open-source tools
GDALDataset::ExecuteSQL() in a PyQt desktop SQL editor GDBee
Learn how Python is used in the enterprise by watching the Enterprise Software with Python O'Reilly video course
Learn IPython and the concept of reproducible research:
Learn about using Python for web development:
flask and django. Start with flask and only then move to django
geodjango to serve spatial datasets on the web. Read through slides ArcGIS JavaScript Plus Django Equals Dynamic Web App
Watch Python – Beyond the Basics on Pluralsight
Learn about nltk Python package to work with human language data (e.g. parsing address data)
Learn about regex Python package to work with regular expressions in Python (e.g. finding addresses in a specific format)
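A small sketch of the address-finding idea follows. It uses the built-in `re` module rather than the third-party `regex` package so that it runs without extra installs. The address format (house number, street name, street type) and the name `find_addresses` are assumptions made for illustration only.

```python
import re

# Hypothetical address format: house number, capitalised street name, street type.
ADDRESS_PATTERN = re.compile(
    r"\b(?P<number>\d+[A-Za-z]?)\s+"
    r"(?P<street>[A-Z][a-z]+(?:\s[A-Z][a-z]+)*)\s+"
    r"(?P<type>Street|Avenue|Road|Lane)\b"
)


def find_addresses(text):
    """Return a list of dicts, one per address-like match found in the text."""
    return [match.groupdict() for match in ADDRESS_PATTERN.finditer(text)]


text = "Inspect the hydrant at 12 Main Street and the valve near 221B Baker Street."
addresses = find_addresses(text)
for address in addresses:
    print(address)
```

Named groups keep the parsed parts readable, which helps when the matches are later written into attribute fields of a dataset.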
Learn about difflib and Levenshtein C extension to do fuzzy string matching (e.g. finding the closest address string in the registry for an input address)
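The closest-address lookup mentioned above could look roughly like this with the standard-library difflib module. The registry contents, the helper name `closest_address`, and the 0.6 cutoff are illustrative assumptions; a suitable cutoff depends on your data.

```python
import difflib

# Made-up address registry for the example.
registry = [
    "12 Main Street",
    "14 Main Street",
    "221B Baker Street",
    "5 Harbour Road",
]


def closest_address(query, candidates, cutoff=0.6):
    """Return the most similar candidate, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(query, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


best = closest_address("221 Baker Str.", registry)
print(best)
```

The comparison is case-sensitive, so normalising both the input and the registry (for example with `str.lower()`) before matching is often a sensible first step.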
Learn Selenium Python package to be able to automate web app testing. Read the docs for Python bindings here
Learn about numerical computing and data science:
Anaconda and learn about conda. This is helpful as Python in ArcGIS Pro is implemented using a conda environment
scipy.spatial can do for your GIS work
Learn about connecting to various DBMS from Python:
pymssql
cx_Oracle
psycopg2 or sqlalchemy
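The database drivers listed above need a running server, so the sketch below uses the built-in sqlite3 module as a stand-in to show the general connect, execute, and fetch pattern. The table and its columns are invented for the example. Note that the parameter placeholder differs between drivers (sqlite3 uses `?`, while psycopg2, for instance, uses `%s`), so treat this as a pattern rather than code to copy unchanged.

```python
import sqlite3

# In-memory database as a stand-in for a real DBMS connection.
connection = sqlite3.connect(":memory:")
try:
    cursor = connection.cursor()
    cursor.execute(
        "CREATE TABLE hydrants (id INTEGER PRIMARY KEY, x REAL, y REAL, status TEXT)"
    )
    # Parameterised inserts keep values separate from the SQL text.
    cursor.executemany(
        "INSERT INTO hydrants (x, y, status) VALUES (?, ?, ?)",
        [(10.5, 59.9, "ok"), (10.7, 59.8, "broken"), (10.6, 60.0, "ok")],
    )
    connection.commit()

    cursor.execute(
        "SELECT id, x, y FROM hydrants WHERE status = ? ORDER BY id", ("ok",)
    )
    rows = cursor.fetchall()
finally:
    connection.close()

print(rows)
```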
Learn about using machine learning with Python:
Learn about using computer vision (CV) with Python to do image processing:
Learn about creating and parsing HTML:
BeautifulSoup. Having this skill would be handy when a web page should be searched for some information and loaded into a GIS dataset or when you are building HTML reports
registrant package reports information about the Esri geodatabase contents
Scrapy
Learn about creating and parsing XML:
.xml files using built-in xml.etree.ElementTree class and 3rd party package lxml
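A minimal sketch of parsing and creating XML with the built-in xml.etree.ElementTree module is shown below. The element and attribute names (`datasets`, `dataset`, `srid`) are hypothetical and do not follow any particular GIS metadata standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset inventory; element and attribute names are illustrative.
xml_text = (
    "<datasets>"
    '<dataset name="roads" type="polyline"><srid>4326</srid></dataset>'
    '<dataset name="parcels" type="polygon"><srid>3857</srid></dataset>'
    "</datasets>"
)

# Parsing: build a {dataset name: spatial reference id} lookup.
root = ET.fromstring(xml_text)
summary = {
    dataset.get("name"): int(dataset.findtext("srid"))
    for dataset in root.iter("dataset")
}
print(summary)

# Creating: append a new dataset element and serialise the tree back to text.
new_dataset = ET.SubElement(root, "dataset", name="rivers", type="polyline")
ET.SubElement(new_dataset, "srid").text = "4326"
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```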
Learn about source code testing, linting, and refactoring:
unittest built-in module and more advanced pytest framework
coverage.py module to create code coverage reports
Hypothesis for writing more powerful unit tests
pylint, flake8, and pyflakes to keep the code tidy
wemake-python-styleguide flake8 plugin; however, it combines violations from a lot of other flake8 plugins
yapf and autopep8 to automatically reformat the source code to conform to a style. It is best to run autopep8 with aggressive option enabled to reformat the code and then run yapf on the resulting code
SonarPython are implemented in wemake-python-styleguide
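To make the unittest item above concrete, here is a minimal sketch of a test case for a small geometry helper. The function `bbox` and the test names are invented for illustration; the same tests would also be collected by pytest without changes.

```python
import unittest


def bbox(points):
    """Return (xmin, ymin, xmax, ymax) for a non-empty sequence of (x, y) pairs."""
    if not points:
        raise ValueError("points must not be empty")
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)


class BboxTests(unittest.TestCase):
    def test_envelope_of_three_points(self):
        self.assertEqual(bbox([(0, 5), (2, 1), (-3, 4)]), (-3, 1, 2, 5))

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            bbox([])


# Run the tests programmatically; `python -m unittest` would discover them too.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(BboxTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping geometry logic in plain functions like this, separate from arcpy or ogr calls, makes it testable on machines that have no GIS software installed.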
Start looking for ways to do certain things outside of GIS applications in pure Python, for instance, using pandas
Learn best practices for organizing configuration and settings for a larger workflow where you need to keep the config values separately from the business logic (e.g. using json, ConfigParser or using OOP constructors)
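One possible way to keep settings apart from the business logic is sketched below with the standard-library configparser module (named ConfigParser in Python 2). The section names, keys, and paths are hypothetical; in practice the text would live in a separate .ini file read with `parser.read(path)`.

```python
import configparser

# Hypothetical settings; normally stored in a separate .ini file.
CONFIG_TEXT = """
[paths]
workspace = /data/gis/project.gdb
output_folder = /data/gis/output

[processing]
buffer_distance_m = 250
overwrite = yes
"""


def load_settings(text):
    """Parse INI-style text into a plain dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "workspace": parser.get("paths", "workspace"),
        "output_folder": parser.get("paths", "output_folder"),
        "buffer_distance_m": parser.getfloat("processing", "buffer_distance_m"),
        "overwrite": parser.getboolean("processing", "overwrite"),
    }


settings = load_settings(CONFIG_TEXT)
print(settings)
```

Converting values to their proper types in one place means the rest of the workflow never has to parse strings such as "yes" or "250".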
Learn about extending Python with C or C++:
.pyd compiled file that can be imported as a regular module into a Python module)
Boost, SWIG, native Python C API, and pybind11. pybind11 is the most user-friendly one
At this point, you should be able to:
comtypes library
xlsxwriter
.pdf files from scratch that would contain map images, custom charts, and tables using reportlab
.pdf documents using pypdf2
.pdf report files using ArcGIS report templates (.rlf) and arcpy
arcpy.Graph, arcpy.GraphTemplate with graph template files (.tee), and Make Graph GP tool
networkx (e.g. point-to-point routing)
Matplotlib (both vector and raster)
numpy and pandas for manipulating spatial dataset attribute table
requests and/or arcrest package to access ArcGIS Server site, ArcGIS Online / Portal organizations through the ArcGIS REST API
fmeobjects
At this point, you should be familiar with:
arcrest or geopandas reporting bugs or pulling in new functionality
conda environments and installing various packages into specific environments
.pyd) and write a .pyi interface file to provide the intellisense for your Python IDE
This section contains examples of tasks that you might need to write at some point. Implementing these tasks in Python code would be a good sign that you have mastered the advanced concepts of Python for GIS.
arcpy package and ArcObjects
networkx
rtree in PostGIS, SQL Server STContains, or shapely Python package
pandas
scikit-learn to mimic some of the ArcGIS Spatial Statistics tools
PyQt a GUI application for executing SQL queries against file geodatabases