Factorio economic analysis
Factorio (homepage, Steam) is a fantastic top-down open-world RTS/factory game with an emphasis on automation and technical depth. There's no game I can't ruin by applying math!
The purpose of this project is to get some analytical data about the game's economics, and probably do some constrained linear programming optimization to try out some "computed economies" at scale.
The scripts don't interact with the game itself. Instead, they pull and scrape data from the wiki, an instance of MediaWiki. As luck would have it, MediaWiki has a pretty good REST API, and the API endpoint for the Factorio wiki is open. This means that we can:
The wiki has two instances - one documenting the experimental version of Factorio (0.17 as of this writing), and the other documenting the stable version (0.16). This project uses the stable version.
To download and save the game's item database, issue
./pull-items.py
To look at the resulting (large) JSON file, use xzless
.
The data from the wiki are a colourful mix of inaccurate, incomplete, inconsistently presented, and in a format not conducive to analysis. The preprocessing step attempts to fix that:
./preprocess.py
To look at the resulting (>5MiB) CSV, use xzless -S
, unwrapped lines being
crucial here. As you'll notice, the data are extremely sparse, but let's not
overcomplicate things by switching to a sparse format when xz
already has a
0.3% (!) compression ratio.
The CSV is for consumption by R. In addition, a NumPy zipped matrix (.npz) and
metadata JSON file are created, for consumption by analyse.py
.
analyse.r
is a stub; you can hack on it if you want to manipulate the recipe
matrix. Currently, though, I'm moving toward using analyse.py
and SciPy.
SciPy has a lot of options for numerical optimization. We're interested in linear programming via linprog :
Variables:
c
is a 1*n row vector of coefficients of the linear objective function.x
is an n*1 column vector of linear objective function variables to beA_ub
is a p*n matrix of upper bound constraints.b_ub
is a p*1 column vector of upper bound constraints.A_eq
is a q*n matrix of equality constraints.b_eq
is a q*1 column vector of equality constraints.bounds
is either one (min, max) tuple applying to all variables in x
, or an
(min, max) variables for each variable in x
.In our case:
n
is equal to 681, the number of recipesx
is a floating-point number representing the (potentiallymin
portion of bounds
must be zero; i.e., we cannot have a negativec
must be calculated based on the requested objective function. This can beA_ub
must take into account the total number of whole or partial manualTo calculate c
, the objective function coefficient 1n row vector, multiply an
expense r1 column vector by the recipe r*n matrix, r
being the number of
resources. The expense vector has one entry per resource.