⚡️ An easy-to-use and fast deep learning model deployment toolkit for ☁️ cloud, 📱 mobile and 📹 edge. Covers 20+ mainstream scenarios across image, video, text and audio, with 150+ SOTA models, end-to-end optimization, and multi-platform, multi-framework support.
APACHE-2.0 License
We are excited to announce the release of ⚡️FastDeploy 1.0.0! 🎉 FastDeploy supports high-performance, end-to-end deployment of over 150 AI models from PaddlePaddle and the open source community on multiple hardware platforms.
FastDeploy supports inference deployment on multiple hardware platforms with different backends. Each backend module can be flexibly compiled and integrated according to the developer's needs; see the FastDeploy compilation documentation.
Backend | Platform | Model Format | Supported Hardware in FastDeploy |
---|---|---|---|
Paddle Inference | Linux(x64)/Windows(x64) | Paddle | x86 CPU/NVIDIA GPU/Jetson/GraphCore IPU |
Paddle Lite | Linux(aarch64/armhf)/Android | Paddle | Arm CPU/Kunlun R200/RV1126 |
Poros | Linux(x64) | TorchScript | x86 CPU/NVIDIA GPU |
OpenVINO | Linux(x64)/Windows(x64)/OSX(x86) | Paddle/ONNX | x86 CPU/Intel GPU |
TensorRT | Linux(x64/aarch64)/Windows(x64) | Paddle/ONNX | NVIDIA GPU/Jetson |
ONNX Runtime | Linux(x64/aarch64)/Windows(x64)/OSX(x86/arm64) | Paddle/ONNX | x86 CPU/Arm CPU/NVIDIA GPU |
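The table above is essentially a compatibility lookup: given a model format and target hardware, only some backends apply. The following is an illustrative sketch of that lookup; it is not FastDeploy's actual API, just the table expressed as data for clarity.

```python
# Illustrative compatibility lookup mirroring the backend table above.
# NOT FastDeploy's real API -- purely a sketch of the selection logic.
BACKENDS = {
    "Paddle Inference": {"formats": {"Paddle"}, "hardware": {"x86 CPU", "NVIDIA GPU", "Jetson", "GraphCore IPU"}},
    "Paddle Lite":      {"formats": {"Paddle"}, "hardware": {"Arm CPU", "Kunlun R200", "RV1126"}},
    "Poros":            {"formats": {"TorchScript"}, "hardware": {"x86 CPU", "NVIDIA GPU"}},
    "OpenVINO":         {"formats": {"Paddle", "ONNX"}, "hardware": {"x86 CPU", "Intel GPU"}},
    "TensorRT":         {"formats": {"Paddle", "ONNX"}, "hardware": {"NVIDIA GPU", "Jetson"}},
    "ONNX Runtime":     {"formats": {"Paddle", "ONNX"}, "hardware": {"x86 CPU", "Arm CPU", "NVIDIA GPU"}},
}

def viable_backends(model_format: str, hardware: str) -> list[str]:
    """Return the backends that accept this model format on this hardware."""
    return [name for name, spec in BACKENDS.items()
            if model_format in spec["formats"] and hardware in spec["hardware"]]
```

For example, an ONNX model targeting an NVIDIA GPU can go through TensorRT or ONNX Runtime, while a TorchScript model on x86 CPU is served only by Poros.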
In addition, FastDeploy supports deploying models to web pages and mini programs based on Paddle.js; see Web Deployment for more details.
FastDeploy supports end-to-end deployment of models from the following PaddlePaddle model suites:
Beyond the PaddlePaddle suites, FastDeploy also supports popular deep learning models from the open source community, with over 150 models supported in release 1.0. The table below shows some of the key models; refer to the deployment examples for more details.
Task | Supported Models |
---|---|
Classification | ResNet/MobileNet/PP-LCNet/YOLOv5-Clas and other series models |
Object Detection | PP-YOLOE/PicoDet/RCNN/YOLOv5/YOLOv6/YOLOv7/YOLOX/NanoDet and other series models |
Segmentation | PP-LiteSeg/PP-HumanSeg/DeepLabv3p/UNet and other series models |
Image/Video Matting | PP-Matting/PP-Mattingv2/ModNet/RobustVideoMatting |
OCR | PP-OCRv2/PP-OCRv3 |
Video Super-Resolution | PP-MSVSR/BasicVSR/EDVR |
Object Tracking | PP-Tracking |
Pose/Keypoint Recognition | PP-TinyPose/HeadPose-FSANet |
Face Alignment | PFLD/FaceLandmark1000/PIPNet and other series models |
Face Detection | RetinaFace/UltraFace/YOLOv5-Face/SCRFD and other series models |
Face Recognition | ArcFace/CosFace/PartialFC/VPL/AdaFace and other series models |
Text-to-Speech | PaddleSpeech streaming speech synthesis models |
Semantic Representation | PaddleNLP ERNIE 3.0 Tiny series models |
Information Extraction | PaddleNLP Universal Information Extraction UIE model |
Text-to-Image Generation | Stable Diffusion |
⚡️FastDeploy provides a high-performance serving system for AI models based on Triton Inference Server, supporting fast serving deployment of Paddle/ONNX models on different hardware and different backends.
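Triton serves models over the KServe v2 inference protocol, so a serving client ultimately POSTs a JSON body to an endpoint like `/v2/models/<name>/infer`. As a hedged illustration (the tensor name `INPUT_0` and the values are placeholders; real names must match the served model's configuration), the request body can be built like this:

```python
import json

def build_infer_request(input_name, shape, data, datatype="FP32"):
    """Build a KServe-v2-style inference request body, as accepted by
    Triton Inference Server's HTTP endpoint /v2/models/<name>/infer.
    Tensor name, shape and datatype must match the served model config."""
    return json.dumps({
        "inputs": [{
            "name": input_name,
            "shape": list(shape),
            "datatype": datatype,
            "data": data,  # flattened row-major values
        }]
    })

# Example: a single 1x3 float input with placeholder values
body = build_infer_request("INPUT_0", (1, 3), [0.1, 0.2, 0.3])
```

The same payload shape is accepted regardless of which backend Triton dispatches to, which is what makes the serving layer backend-agnostic.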
FastDeploy provides a one-click quantization tool based on PaddleSlim; the following command quickly produces a compressed, accelerated model with near-lossless accuracy.
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml \
--method='PTQ' --save_dir='./yolov5s_ptq_model/'
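The `PTQ` method above stands for post-training quantization: float tensors are mapped to int8 using a scale calibrated from their observed dynamic range, with no retraining. As a minimal sketch of the idea (symmetric per-tensor quantization, not PaddleSlim's actual implementation):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor post-training quantization to int8.
    The scale is calibrated from the observed dynamic range of x."""
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the int8 representation."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.27, 0.0, 1.27], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)  # close to x, within one quantization step
```

Real PTQ calibrates scales over a small calibration dataset rather than a single tensor, which is why the config file above points at calibration settings.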
FastDeploy has completed adaptation testing of quantized models with the following backends:
Hardware/Deployment backend | ONNX Runtime | Paddle Inference | TensorRT | Paddle Inference TensorRT | Paddle Lite |
---|---|---|---|---|---|
CPU | Supported | Supported | - | - | Supported |
GPU | - | - | Supported | Supported | - |
RV1126 | - | - | - | - | Supported |
The table below compares accuracy and performance with auto-compression: accuracy is nearly lossless, and performance improves by up to 400%.
For more details and usage of the one-click quantization, see FastDeploy one-click quantization.
To support deploying models from multiple frameworks, FastDeploy integrates X2Paddle conversion capabilities. After installing FastDeploy, the following command converts a model so it can then be deployed with FastDeploy.
fastdeploy convert --framework onnx --model yolov5s.onnx --save_dir yolov5s_paddle_model
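After a conversion like the one above, it is good practice to check numerical parity between the original and converted models on the same inputs. The sketch below shows such a check in generic form; the two callables stand in for loaded runtimes (e.g. ONNX Runtime for the original model versus the converted Paddle model), and the toy stand-ins here are just placeholders so the check itself is runnable.

```python
import numpy as np

def outputs_match(run_original, run_converted, input_batch,
                  rtol=1e-4, atol=1e-5) -> bool:
    """Compare two inference callables on the same input within tolerance.
    run_original / run_converted stand in for loaded model runtimes."""
    return np.allclose(run_original(input_batch),
                       run_converted(input_batch), rtol=rtol, atol=atol)

# Toy stand-ins: the same matmul computed in float32 vs float64
w = np.random.default_rng(0).standard_normal((8, 4)).astype(np.float32)
f32 = lambda x: x @ w
f64 = lambda x: (x.astype(np.float64) @ w.astype(np.float64)).astype(np.float32)
x = np.ones((2, 8), dtype=np.float32)
assert outputs_match(f32, f64, x)
```

Small numerical differences are expected across runtimes; the tolerances bound how much drift the conversion is allowed to introduce.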
For more usage details, see FastDeploy Model Conversion.
FastDeploy focuses on the end-to-end deployment experience and performance for every model. In version 1.0, FastDeploy has made the following end-to-end optimizations:
Combined with the advantages of FastDeploy's multi-backend support, the end-to-end inference performance of all models is significantly improved over the original deployment code. The following table shows test data for some of the models.
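"End-to-end" here means preprocessing + inference + postprocessing, not just the forward pass. A hedged sketch of how such latency numbers are typically collected (the three callables are placeholders for a real pipeline's stages):

```python
import time
import statistics

def bench_end_to_end(preprocess, infer, postprocess, sample,
                     warmup=5, runs=50):
    """Median wall-clock latency (ms) of the full pipeline.
    Warmup iterations are discarded to exclude one-time setup cost
    (e.g. kernel compilation, memory allocation)."""
    for _ in range(warmup):
        postprocess(infer(preprocess(sample)))
    timings = []
    for _ in range(runs):
        t0 = time.perf_counter()
        postprocess(infer(preprocess(sample)))
        timings.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(timings)

# Toy pipeline stand-ins, just to show the measurement shape
latency_ms = bench_end_to_end(lambda s: s, lambda s: s, lambda s: s, 0)
```

Using the median rather than the mean makes the figure robust to occasional scheduler hiccups, which matters when comparing deployments across backends.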
Thanks to the following developers for their contributions to FastDeploy! Contributors List
@leiqing1 @jiangjiajun @DefTruth @joey12300 @felixhjh @ziqi-jin @yunyaoXYY @wjj19950828 @heliqi @ZeyuChen @ChaoII @Zheng-Bicheng @wang-xinyu @HexToString @yeliang2258 @WinterGeng @LDOUBLEV @rainyfly @czr-gc @chenqianhe @kiddyjinjin @Zeref996 @TrellixVulnTeam @D-DanielYang @totorolin @hguandl @ChrisKong93 @Xiue233 @jm12138 @triple-Mu @yingshengBD @GodIsBoom @PatchTester @onecatcn
Published by jiangjiajun almost 2 years ago
`batch_predict` deployment: https://github.com/PaddlePaddle/FastDeploy/pull/611
Image Classification | Object Detection | Semantic Segmentation | OCR | Face Detection |
---|---|---|---|---|
Project Code | Project Code | Project Code | Project Code | Project Code |
Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out | Scan the code or click on the link to install and try out |
Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.7.0...release/0.8.0
Published by jiangjiajun almost 2 years ago
- SCRFD: added RKNPU2 deployment support (deployment example)
- Stable Diffusion: added a model deployment example (deployment example)
- PaddleClas/PaddleDetection/YOLOv5: deployment code upgraded to support `predict` and `batch_predict`
- PaddleClas: added a serving deployment example (deployment example)
- `FDTensor`: added a `Pad` function operator to pad inputs during batch prediction
- `FDTensor`: added the Python API `to_dlpack`, supporting zero-copy transfer of `FDTensor` between different frameworks

Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.6.0...release/0.7.0
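The `to_dlpack` interface added in 0.7.0 builds on the DLPack protocol, which lets tensors move between frameworks without copying the underlying buffer. NumPy implements the same protocol, so a NumPy round trip (requires NumPy >= 1.22) is enough to demonstrate the zero-copy behaviour:

```python
import numpy as np

# DLPack exchanges a pointer to the existing buffer, not a copy,
# so the imported array aliases the source array's memory.
src = np.arange(4, dtype=np.float32)
view = np.from_dlpack(src)  # consumes src.__dlpack__(); no data copy
src[0] = 42.0
# The write through src is visible through the DLPack-imported view.
```

The same mechanism is what allows an `FDTensor` to be handed to another DLPack-aware framework without a device-to-device or host copy.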
Published by jiangjiajun almost 2 years ago
Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.4.0...release/0.6.0
Published by jiangjiajun almost 2 years ago
Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.4.0...release/0.5.0
Published by jiangjiajun about 2 years ago
Version 0.4.0 adds Android mobile deployment support!
Added the `EnablePinedMemory` interface: when running inference with Paddle Inference or TensorRT, pinned memory can be used to improve GPU-to-CPU data transfer performance. See PR https://github.com/PaddlePaddle/FastDeploy/pull/403
Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.3.0...release/0.4.0
Published by jiangjiajun about 2 years ago
Added a `max_workspace_size` setting interface; added `fastdeploy_init.sh` and `fastdeploy_init.bat` to help developers quickly import FastDeploy dependency libraries.
Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.2.1...release/0.3.1
Published by jiangjiajun about 2 years ago
`SetTrtInputShape` no longer needs to be called to set the input range; shapes are now set dynamically during inference by default.
Full Changelog: https://github.com/PaddlePaddle/FastDeploy/compare/release/0.2.0...release/0.2.1
Published by jiangjiajun about 2 years ago
Published by jiangjiajun over 2 years ago
⚡️FastDeploy v0.1.0 beta released! 🎉
💎 Released SDKs supporting 40 key models across 8 key software/hardware environments
😊 Available both as a web download and as a pip package