Below is how to run my solution, along with the answers to various questions.
From the root directory:
To build the docker app image:
./build
To start the containers:
./start
To migrate the database:
./migrate
You might want to create a virtual env. Make sure that the requirements in requirements.txt are installed, e.g.
pip install -r app/requirements.txt
To start sending data:
./send_data
Visit http://localhost:8088/?pgsql=db&username=xbird&db=xbird&ns=public
I chose a relational SQL database. While NoSQL databases can be useful in certain situations, relational databases have been depended on and optimized for decades. The ability to define a schema, and to evolve that schema over time, makes it possible to reliably maintain data consistency. The data defined in the protobuf format maps easily onto a relational schema.
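As a sketch of that protobuf-to-relational mapping (the message, field names, and table name here are hypothetical examples, not the project's actual schema):

```python
# A hypothetical protobuf message:
#   message Sample { string device_id = 1; double value = 2; int64 ts = 3; }
# maps naturally to a table with one column per field.

# Illustrative protobuf scalar type -> Postgres column type mapping.
PROTO_TO_SQL = {
    "string": "TEXT",
    "double": "DOUBLE PRECISION",
    "int64": "BIGINT",
}

def create_table_ddl(table, fields):
    """Build a CREATE TABLE statement from (name, proto_type) pairs."""
    cols = ", ".join(f"{name} {PROTO_TO_SQL[ptype]}" for name, ptype in fields)
    return f"CREATE TABLE {table} ({cols});"

ddl = create_table_ddl(
    "samples",
    [("device_id", "string"), ("value", "double"), ("ts", "int64")],
)
print(ddl)
# CREATE TABLE samples (device_id TEXT, value DOUBLE PRECISION, ts BIGINT);
```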
I chose Postgres because it is open source and has a huge, active community. It also has great features such as JSON columns, which allow for schemaless data if needed. (I am also familiar with it!)
I would batch the samples and process them asynchronously using jobs and workers (e.g. Celery).
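A minimal sketch of that batching idea (plain Python here for illustration; in a real deployment the flush callback would be something like a Celery task's `.delay`, handing each batch to a worker):

```python
class SampleBatcher:
    """Accumulate incoming samples and flush them in fixed-size batches."""

    def __init__(self, batch_size, flush):
        self.batch_size = batch_size
        self.flush = flush          # callback invoked with each full batch
        self.pending = []

    def add(self, sample):
        self.pending.append(sample)
        if len(self.pending) >= self.batch_size:
            self.flush(self.pending)   # hand the batch off asynchronously
            self.pending = []          # start accumulating a fresh batch

# Demo: collect flushed batches in a list instead of dispatching to a worker.
batches = []
b = SampleBatcher(batch_size=3, flush=batches.append)
for i in range(7):
    b.add(i)
print(batches)    # [[0, 1, 2], [3, 4, 5]]
print(b.pending)  # [6]  -- last sample still waiting for its batch to fill
```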
I would use a log aggregator such as Papertrail and a metrics/monitoring system such as Prometheus. I would also have alerts set up to monitor things such as database CPU usage, etc.
As I have not used Python in a long time, I found this more time consuming than it should have been. It took me a while to set up database migrations, etc. Most of the tools I had to use were unfamiliar to me, so I did not have time to do many of the things I would have liked to, e.g.