mnist-flow

This project is a repository for solving the AI Engineer Party challenge. The dataset is MNIST, chosen to reduce GCE cost.

Architecture

Quick Start: GCE Configuration

  1. Set your GCE zone. Run this command with Python 2.7, because the gcloud CLI requires it. gcloud must already be authenticated against your GCE account; check with gcloud info.

     $ gcloud config set compute/zone asia-northeast1-a
    
  2. Install the Cloud Dataflow Python SDK.

    $ pip install --upgrade google-cloud-dataflow --user
    
  3. Create an IAM service account with the BigQuery and Google Storage Viewer roles, download its JSON key, and point GOOGLE_APPLICATION_CREDENTIALS at it.

    $ export GOOGLE_APPLICATION_CREDENTIALS="key.json"
    
  4. Preprocess tf.keras.datasets.mnist with script/make_data.py, which flattens each image, then upload the result to Google BigQuery as separate train/test tables (a sketch of the flattening idea follows the commands).

    $ python script/make_data.py
    $ gzip data/*.txt
    $ bq load --source_format=CSV -F":" mnist.train data/train.txt.gz \
        "key:integer,image:string,label:integer"
    $ bq load --source_format=CSV -F":" mnist.test data/test.txt.gz \
        "key:integer,image:string,label:integer"
    
  5. Install the BigQuery client libraries if you want to query in real time. Be careful: each call scans the volumes below, and Google BigQuery pricing is $0.035 per GB, so the dataset should be cached (see the sketch after this snippet).

     • 60,000-row MNIST train dataset: 151.4 MB
     • 10,000-row MNIST test dataset: 25.3 MB

     from google.cloud import bigquery

     client = bigquery.Client()
     query = "SELECT image, label FROM mnist.train"
     query_job = client.query(query)  # API request - starts the query
     for row in query_job:  # API request - fetches results
         print(row)
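
     A minimal caching sketch (not part of this repo; the cache path is arbitrary): fetch the table once, pickle the rows locally, and reload them on later runs instead of re-scanning BigQuery.

     # Sketch: query once, cache locally, reuse on later runs. Not repo code.
     import os
     import pickle
     from google.cloud import bigquery

     CACHE = "data/train_cache.pkl"  # arbitrary local cache path

     if os.path.exists(CACHE):
         with open(CACHE, "rb") as f:
             rows = pickle.load(f)
     else:
         client = bigquery.Client()
         query_job = client.query("SELECT image, label FROM mnist.train")
         rows = [(row["image"], row["label"]) for row in query_job]
         with open(CACHE, "wb") as f:
             pickle.dump(rows, f)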
    
  6. To train AdaNet, run train_adanet.py:

     $ python train_adanet.py \
         --RANDOM_SEED 42 \
         --NUM_CLASSES 10 \
         --TRAIN_STEPS 5000 \
         --BATCH_SIZE 64 \
         --LEARNING_RATE 0.001 \
         --FEATURES_KEY "images" \
         --ADANET_ITERATIONS 2 \
         --MODEL_DIR "./models" \
         --SAVED_DIR "./saved_model" \
         --logfile "log.txt"
    

     See the saved model directory structure below (a quick load test of the export follows it):

    models
    └── 1566790420
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index
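
     As a sanity check before uploading, something like this should load the export under TF 1.x (a sketch; the "images" input key is an assumption based on --FEATURES_KEY, and the timestamped directory name will differ per run):

     # Sketch: load the exported SavedModel and run one prediction (TF 1.x).
     # Assumptions: the serving signature takes a flattened image batch under
     # the "images" key; replace 1566790420 with your own export timestamp.
     import numpy as np
     import tensorflow as tf

     predict_fn = tf.contrib.predictor.from_saved_model("models/1566790420")
     blank = np.zeros((1, 784), dtype=np.float32)  # one blank 28x28 image
     print(predict_fn({"images": blank}))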
    
  7. Create a Google Storage bucket and upload the saved model folder.

    $ PROJECT_ID=$(gcloud config list project --format "value(core.project)")
    $ BUCKET="${PROJECT_ID}-ml"
    
    # create bucket
    $ gsutil mb -c regional -l asia-northeast1 gs://${BUCKET}
    
    # upload saved model
    $ gsutil -m cp -R ./saved_model gs://${BUCKET}
    
  8. Deploy the Google Cloud Function (a sketch of the handler's shape follows the command).

    $ cd gfunction
     $ gcloud functions deploy handler --runtime python37 \
         --trigger-http --memory 2048MB --region asia-northeast1
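
     The deployed entry point lives in gfunction/; the sketch below shows roughly what an HTTP handler could look like. The preprocessing details and the predict helper are assumptions for illustration, not the repo's actual code.

     # Sketch of an HTTP Cloud Function entry point; the real code is in
     # gfunction/. The predict helper is hypothetical (it would wrap the
     # SavedModel loaded above); preprocessing details are assumptions.
     import numpy as np
     from PIL import Image

     def handler(request):
         # Cloud Functions passes a flask.Request; the curl test in the next
         # step uploads the image as a multipart field named "file".
         upload = request.files["file"]
         image = Image.open(upload).convert("L").resize((28, 28))
         pixels = np.asarray(image, dtype=np.float32).reshape(1, -1)
         return str(predict(pixels))  # hypothetical helper, see note above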
    
  9. Test the API with curl, replacing the URL with your function's HTTP trigger.

     $ curl -F 'file=@testdata/0.png' '<your function URL>'
     Output: 0
    

pylint

$ pip install pylint
$ pylint **/*.py

In practice, I rely on the PEP 8 checks built into PyCharm.

Author

  • Tae Hwan Jung (Jeff Jung) @graykode, Kyung Hee Univ. CE (undergraduate)
  • Author email: [email protected]