TensorFlow as a Service (TFaaS), a general-purpose framework to serve TF models.
MIT License
A general-purpose framework (written in Go) to serve TensorFlow models. It provides a rich and flexible set of APIs to efficiently access your favorite TF models via an HTTP interface. TFaaS supports both JSON and ProtoBuffer data-formats.
The following set of APIs is provided:
- /upload to push your favorite TF model to TFaaS server, either via a Form or as a tar-ball bundle (see examples below)
- /delete to delete your TF model from TFaaS server
- /models to view existing TF models on TFaaS server
- /predict/json to serve TF model predictions in JSON data-format
- /predict/proto to serve TF model predictions in ProtoBuffer data-format
- /predict/image to serve TF model predictions for images in JPG/PNG formats

# run the TFaaS server via its Docker image
docker run --rm -h `hostname -f` -p 8083:8083 -i -t veknet/tfaas
# example of image based model upload
curl -X POST http://localhost:8083/upload \
    -F 'name=ImageModel' -F 'params=@/path/params.json' \
    -F 'model=@/path/tf_model.pb' -F 'labels=@/path/labels.txt'
# example of TF pb file upload
curl -s -X POST http://localhost:8083/upload \
-F 'name=vk' -F 'params=@/path/params.json' \
-F 'model=@/path/model.pb' -F 'labels=@/path/labels.txt'
# example of bundle upload produced with Keras TF
# here is our saved model area
ls model
assets saved_model.pb variables
# we can create tarball and upload it to TFaaS via bundle end-point
tar cfz model.tar.gz model
curl -X POST -H "Content-Encoding: gzip" \
-H "content-type: application/octet-stream" \
--data-binary @/path/model.tar.gz http://localhost:8083/upload
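The tar-ball step above can also be done programmatically. A minimal sketch in Python, assuming the SavedModel directory layout shown earlier (assets, saved_model.pb, variables); the directory and output paths here are illustrative:

```python
import os
import tarfile

def make_bundle(model_dir, bundle_path):
    """Package a SavedModel directory into a gzip tarball for the bundle end-point."""
    with tarfile.open(bundle_path, "w:gz") as tar:
        # arcname keeps only the top-level directory name inside the archive
        tar.add(model_dir, arcname=os.path.basename(model_dir))
    return bundle_path
```

The resulting tarball can then be uploaded with the curl command shown above.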
# obtain predictions from your ImageModel
curl http://localhost:8083/predict/image -F 'image=@/path/file.png' -F 'model=ImageModel'
# obtain predictions from your TF based model
cat input.json
{"keys": [...], "values": [...], "model":"model"}
# call to get predictions from the /predict/json end-point using input.json
curl -s -X POST -H "Content-type: application/json" \
    -d@/path/input.json http://localhost:8083/predict/json
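The same call can be made from a Python client. A minimal sketch using only the standard library; the payload shape mirrors input.json above, while the server URL and a running TFaaS instance are assumptions of this example:

```python
import json
import urllib.request

def make_payload(keys, values, model):
    """Build the JSON payload expected by the predictions end-point."""
    return json.dumps({"keys": keys, "values": values, "model": model})

def predict(url, keys, values, model):
    """POST the payload to a TFaaS server and return the decoded response."""
    req = urllib.request.Request(
        url,
        data=make_payload(keys, values, model).encode("utf-8"),
        headers={"Content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# usage (requires a running TFaaS server):
# predict("http://localhost:8083/predict/json", ["attr1"], [1.0], "model")
```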
For more information please visit the curl client page.
Clients communicate with TFaaS via the HTTP protocol. See examples for Curl, Python and C++ clients.
Benchmark results on CentOS (24 cores, 32GB of RAM) serving a DL NN with a 42x128x128x128x64x64x1x1 architecture show similar performance for the JSON and ProtoBuffer formats.
For more information please visit the benchmarks page.