Code and IPython notebooks for my final semester computing project, for a Master's in Information Technology degree at the University of Melbourne, under the supervision of Dr. Richard Sinnott.
The 'Walkability Index' of a suburb measures how 'walkable' the neighborhood is in terms of road connectivity, population density, and land use mix.
Using data from the Australian Urban Research Infrastructure Network (AURIN) e-infrastructure, this project explores statistical and spatial analysis of large-scale areas (in our case, Inner Melbourne) by breaking them down into smaller chunks and aggregating them during analysis. This approach allows us to calculate and present the walkability metric for intuitively understandable areas such as suburbs or greater regions.
The idea of breaking down and aggregating regions is based on the concept of Statistical Areas (SAs) in Australia. Defined by the Australian Bureau of Statistics, SAs can be represented as:
AURIN's 'Walkability Workflow' is the process used to generate the walkability data for a given area. Calculating the walkability of a neighborhood is a resource-intensive task whose cost, in time and computation, increases with the area of the neighborhood (say, an SA4). This is because the workflow involves computationally expensive tasks, including 'road network traversal' in all possible directions within the area.
At the time of this project, AURIN's system was unable to process large areas within an acceptable time.
By aggregating the SA1s contained within an SA2, we can represent that SA2 without extra computational cost on the AURIN systems. We can also group SA1s together to represent SA3s or even SA4s. Similarly, we can aggregate the results obtained from the walkability workflow to give a walkability score to larger regions.
For this project, we decided to take the mean of the walkability indices of the SA1s to represent the SA2s, SA3s, and SA4s that they belong to. It is a very basic approach, but it opens the door to large-scale analytics once the SA1s have been processed (in parallel if possible; this is a data analysis project, so those details are not covered here).
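The mean-based aggregation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the notebooks' actual code: the record layout, the area codes, and the walkability values are all made up, and the helper name `aggregate_walkability` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical SA1 records: (sa1_code, sa2_code, sa3_code, sa4_code, walkability).
# Codes and scores are invented for illustration; they are not project results.
sa1_records = [
    ("SA1-001", "SA2-A", "SA3-X", "SA4-M", 1.5),
    ("SA1-002", "SA2-A", "SA3-X", "SA4-M", 0.5),
    ("SA1-003", "SA2-B", "SA3-X", "SA4-M", -0.5),
    ("SA1-004", "SA2-B", "SA3-X", "SA4-M", 0.0),
    ("SA1-005", "SA2-C", "SA3-Y", "SA4-M", 2.0),
]

# Position of each parent-area code inside a record (an assumed layout).
SA2, SA3, SA4 = 1, 2, 3
WALKABILITY = 4


def aggregate_walkability(records, level_index):
    """Return {area_code: mean walkability of the SA1s inside that area}."""
    scores_by_area = defaultdict(list)
    for record in records:
        scores_by_area[record[level_index]].append(record[WALKABILITY])
    return {area: mean(scores) for area, scores in scores_by_area.items()}


sa2_scores = aggregate_walkability(sa1_records, SA2)
sa3_scores = aggregate_walkability(sa1_records, SA3)
sa4_scores = aggregate_walkability(sa1_records, SA4)
```

Each level is computed directly from the SA1 scores rather than from the level below it, which matches the description above and avoids averaging averages of unequally sized groups. In a notebook, the same step would typically be a pandas `groupby(...).mean()` on the parent-area column.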
As an added advantage, we can use AURIN's public health, transportation, and economic value data for SA2s, and correlate it with the walkability scores for more insight!
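As a rough sketch of that correlation step, the snippet below joins two SA2-level datasets on their area code and computes a Pearson correlation coefficient. The choice of Pearson's r is an assumption made for illustration (the notebooks may use a different measure), and the SA2 codes, the indicator name, and every value are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical SA2-level inputs; values are made up, not AURIN data.
walkability_by_sa2 = {"SA2-A": 1.0, "SA2-B": -0.25, "SA2-C": 2.0}
health_indicator_by_sa2 = {"SA2-A": 10.0, "SA2-B": 12.5, "SA2-C": 8.0, "SA2-D": 9.0}

# Keep only the SA2s present in both datasets, in a stable order.
shared_sa2s = sorted(walkability_by_sa2.keys() & health_indicator_by_sa2.keys())

r = pearson_r(
    [walkability_by_sa2[code] for code in shared_sa2s],
    [health_indicator_by_sa2[code] for code in shared_sa2s],
)
```

Joining on the SA2 code first matters because the two datasets need not cover the same set of areas; here the made-up `SA2-D` has no walkability score and is dropped before the coefficient is computed.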
The project includes the following notebooks (data manipulation and analyses):
Each notebook describes the code, including its results, relevant to a particular topic. It is recommended to view the notebooks in the above order to avoid missing out on code explanations.
The following technologies were used for this part of the project:
Note: Choropleth maps are available in the choropleth-map directory as HTML files, and should be downloaded to be viewed, as they are not rendered directly in an IPython notebook.