modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

APACHE-2.0 License

Downloads: 132.8K
Stars: 6.1K
Committers: 233

modelscope - v1.14.0 Latest Release

Published by zzhangpurdue 6 months ago

New models

No. Model-id and links
1 Qwen1.5-110B-Chat
2 CodeQwen1.5-7B-Chat
3 WizardLM-2-8x22B
4 c4ai-command-r-v01
5 Qwen1.5-32B-Chat
6 dbrx-instruct

Highlight features

  1. Dataset refactoring for compatibility with the HF datasets structure; datasets needs to be limited to <2.19.0 (Breaking Changes)
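
The `datasets<2.19.0` limit above can be checked in an environment before upgrading. The following is a minimal sketch using only the standard library; the helper names (`parse_release`, `satisfies_upper_bound`, `check_datasets`) are hypothetical and are not part of modelscope.

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Turn '2.18.0' into (2, 18, 0); stop at the first non-numeric part (e.g. 'rc1')."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:  # suffix such as 'rc1' found; ignore the rest
            break
    return tuple(parts)


def satisfies_upper_bound(installed: str, bound: str = "2.19.0") -> bool:
    """Return True when `installed` is strictly below `bound`."""
    return parse_release(installed) < parse_release(bound)


def check_datasets() -> str:
    """Report whether the installed `datasets` package respects the limit."""
    try:
        installed = metadata.version("datasets")
    except metadata.PackageNotFoundError:
        return "datasets is not installed"
    if satisfies_upper_bound(installed):
        return f"datasets {installed}: ok"
    return f"datasets {installed}: too new, pin datasets<2.19.0"


if __name__ == "__main__":
    print(check_datasets())
```

The comparison is deliberately simple (numeric release segments only); a project that already depends on `packaging` could use its version parser instead.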

What's Changed

New Contributors

Full Changelog: https://github.com/modelscope/modelscope/compare/v1.13.2...v1.14.0

modelscope - v1.13.2

Published by tastelikefeet 7 months ago

Highlight features

  1. Dataset refactoring, to be compatible with HF datasets structure. (Breaking Changes)
  2. Unified datasets storage and management with GIT. (Breaking Changes)

What's Changed

New Contributors

Full Changelog: https://github.com/modelscope/modelscope/compare/v1.13.1...v1.13.2

modelscope - v1.13.1

Published by tastelikefeet 7 months ago

New models

No. Model-id and links
1 GeoMVSNet: geometry-aware multi-view depth estimation
2 Res2Net speaker verification-Chinese-3D-Speaker-16k
3 ResNet34 speaker verification-Chinese-3D-Speaker-16k
4 Self-supervised depth completion

Highlight features

  1. Support importing AWQConfig from modelscope
  2. Support stream_generate for LLMPipeline

What's Changed

New Contributors

Full Changelog: https://github.com/modelscope/modelscope/compare/v1.12.0...v1.13.1

modelscope - v1.12.0 release

Published by liuyhwangyh 8 months ago

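Chinese Version

New Models Recommended

No. Model Name & Link
1 Support for the qwen1.5 model series
2 RIFE video frame interpolation
3 VFI-RAFT video frame interpolation
4 Lightweight fast image feature point matching

Highlight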

  • add rife-video-frame-interpolation and model (#685)
  • image normal estimation (#683)
  • add image matching fast model based on lightglue (#694)
  • Feature/LoFTR_image_local_feature_matching (#687)
  • support qwen1.5 models
  • upgrade funasr1.0

BugFix

  • fix anydoor pre-commit flake8 and isort errors (#707)
  • fix some ci case issue.

modelscope - v1.11.0 release

Published by lylalala 9 months ago

New Models Recommended

No Model Name & Link
0 Emu2-Gen
1 qanything_models
2 Emu2-Chat
3 Emu2
4 TinyLlama-1.1B-Chat-v1.0
5 notux-8x7b-v1
6 Machine_Mindset_en_ENFJ
7 Machine_Mindset_en_ENFP
8 Machine_Mindset_en_ENTJ
9 Machine_Mindset_en_ENTP
10 Machine_Mindset_en_ESFJ
11 Machine_Mindset_en_ESFP
12 Machine_Mindset_en_ESTJ
13 Machine_Mindset_en_ESTP
14 Machine_Mindset_en_INFJ
15 Machine_Mindset_en_INFP
16 Machine_Mindset_en_INTJ
17 Machine_Mindset_en_INTP
18 Machine_Mindset_en_ISFJ
19 Machine_Mindset_en_ISFP
20 Machine_Mindset_en_ISTJ
21 Machine_Mindset_en_ISTP
22 Machine_Mindset_zh_ENFJ
23 Machine_Mindset_zh_ENFP
24 Machine_Mindset_zh_ENTJ
25 Machine_Mindset_zh_ENTP
26 Machine_Mindset_zh_ESFJ
27 Machine_Mindset_zh_ESFP
28 Machine_Mindset_zh_ESTJ
29 Machine_Mindset_zh_ESTP
30 Machine_Mindset_zh_INFJ
31 Machine_Mindset_zh_INFP
32 Machine_Mindset_zh_INTJ
33 Machine_Mindset_zh_ISFJ
34 Machine_Mindset_zh_ISFP
35 Machine_Mindset_zh_ISTJ
36 Machine_Mindset_zh_ISTP
37 WavMark
38 speech_eres2net_large_200k_sv_zh-cn_16k-common
39 emotion2vec_base
40 QAnything
41 speech_fsmn_vad_zh-cn-8k-common-onnx
42 speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1-onnx
43 dolphin-2.6-mistral-7b
44 scepter_scedit
45 deepseek-moe-16b-base
46 deepseek-moe-16b-chat
47 phi-2
48 llava-internlm-7b
49 llava-v1.5-7b-xtuner
50 llava-v1.5-7b-xtuner-pretrain
51 llava-internlm-7b-pretrain
52 speech_ngram_lm_zh-cn-ai-wesp-fst-token8358
53 realisticVisionV51_v51VAE
54 THUDM_chatglm-6b
55 AnyDoor
56 cv_gaussian-splatting-recon_damo
57 AnimateDiff_ms
58 cv_omnidata_image-normal-estimation_normal
59 cv_rife_video-frame-interpolation
60 Qwen-7B-Chat-GGUF
61 stable-zero123
62 cv_adabins_image-depth-prediction_indoor
63 Qwen-14B-Chat-GGUF
64 wav2vec2-large-xlsr-53-english
65 speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
66 cv_resnet-transformer_local-feature-matching_outdoor-data
67 naturalspeech2_libritts
68 text_to_audio
69 valle_libritts
70 vits_ljspeech
71 hifigan_speech_bigdata
72 BigVGAN_singing_bigdata
73 singing_voice_conversion
74 Machine_Mindset_zh_INTP
75 bel_canto
76 music_genre
77 chest_falsetto
78 llava-v1.5-13b-xtuner
79 llava-v1.5-13b-xtuner-pretrain
80 Mistral-7B-Instruct-v0.2-GGUF
81 cv_transformer_image-matching_fast
82 cv_efficientsam-s_image-instance-segmentation_sa1b
83 Ziya-Visual-Lyrics-14B
84 dpo-sdxl-text2image-v1
85 mistral-ft-optimized-1218
86 IP-Adapter-FaceID
87 mask_refine
88 MindChat-Qwen-1_8B
89 AnyDoor_models
90 dolphin-2.5-mixtral-8x7b
91 SVHN-Recognition
92 cv_anytext_text_generation_editing
93 SOLAR-10.7B-Instruct-v1.0
94 insecta
95 pianos
96 kagentlms_baichuan2_13b_mat
97 kagentlms_qwen_7b_mat
98 cv_raft_dense-optical-flow_things
99 HEp2
100 cv_marigold_monocular-depth-estimation
101 OpenDalleV1.1
102 speech_sambert-hifigan_nsf_tts_emily_en-gb_24k
103 speech_sambert-hifigan_nsf_tts_eric_en-gb_24k
104 knowlm-13b-zhixi
105 RankingGPT-qwen-7b
106 hoyoGPT
107 RankingGPT-baichuan2-7b
108 RankingGPT-llama2-7b
109 RankingGPT-bloom-7b
110 RankingGPT-bloom-3b
111 RankingGPT-bloom-1b1
112 RankingGPT-bloom-560m
113 knowlm-13b-base-v1.0
114 knowlm-13b-ie
115 scepter_20240103-212734

Highlight

  • Add AnyDoor support (#688)
  • Add syncdreamer as an image-to-3d pipeline (#679)
  • Add audio codec and codec-based TTS model

Improvements

  • Update cuda to 12.1.0
  • Update transformers to 4.36.2
  • Update ckpt to general_v0.1 (#696)

BugFix

  • Fix embedding and inference device in faq question answering pipeline
  • Remove DOCKER_BUILDKIT=0 for cpu build issue
  • Fix mmcv-full issue

modelscope - v1.10.0 release

Published by liuyhwangyh 10 months ago

Chinese Version

New Models Recommended

No. Model Name & Link
0 Yi-34B-Chat-4bits
1 Yi-34B-Chat-8bits
2 Yi-6B-Chat
3 Yi-6B-Chat-4bits
4 Yi-6B-Chat-8bits
5 Yi-34B-Chat
6 Video-LLaVA-V1.5
7 Video-LLaVA-7B
8 LanguageBind_Video
9 LanguageBind_Video_FT
10 LanguageBind_Image
11 LanguageBind_Video_merge
12 LanguageBind_Audio
13 LanguageBind_Depth
14 LanguageBind_Thermal
15 Video-LLaVA-Pretrain-7B
16 Aquila2-70B-Expr
17 AquilaChat2-70B-Expr
18 AquilaChat2-34B-Int4-GPTQ
19 AquilaChat2-34B-16K
20 speech_rwkv_transducer_asr-en-16k-gigaspeech-vocab5001-pytorch-online
21 funasr-runtime-win-cpu-x64
22 ModelScope-Agent-14B
23 speech_sambert-hifigan_nsf_tts_donna_en-us_24k
24 speech_sambert-hifigan_nsf_tts_david_en-us_24k
25 MiniGPT-v2
26 speech_sambert-hifigan_tts_waan_Thai_16k
27 cv_background_generation_sd
28 speech_eres2net_base_250k_sv_zh-cn_16k-common
29 Pose-driven-image-generation-HumanSD
30 cv_stable-diffusion-v2_image-feature
31 nlp_minilm_ibkd_sentence-embedding_english-sts
32 nlp_minilm_ibkd_sentence-embedding_english-msmarco
33 speech_eres2net_large_mej_lre_16k_common
34 speech_eres2net_base_mej_lre_16k_common
35 Ziya2-13B-Base
36 SUS-Chat-34B
37 animatediff-motion-adapter-v1-5
38 animatediff-motion-adapter-v1-4
39 animatediff-motion-adapter-v1-5-2
40 animatediff-motion-lora-zoom-in
41 animatediff-motion-lora-pan-left
42 animatediff-motion-lora-tilt-up
43 animatediff-motion-lora-rolling-clockwise
44 animatediff-motion-lora-zoom-out
45 animatediff-motion-lora-pan-right
46 animatediff-motion-lora-tilt-down
47 animatediff-motion-lora-rolling-anticlockwise
48 Qwen-1_8B-Chat-Int4
49 Qwen-1_8B-Chat-Int8
50 Qwen-72B-Chat-Int4
51 Qwen-72B-Chat-Int8
52 Qwen-Audio-Chat
53 Qwen-72B-Chat
54 Qwen-72B
55 Qwen-1_8B
56 Qwen-1_8B-Chat
57 Qwen-Audio
58 cogvlm-chat
59 cogvlm-base-224
60 cogvlm-base-490
61 cogvlm-grounding-base
62 cogvlm-grounding-generalist
63 deepseek-llm-7b-base
64 deepseek-llm-67b-base
65 deepseek-llm-7b-chat
66 deepseek-llm-67b-chat
67 xlm-MLM-en-2048
68 OrionStar-Yi-34B-Chat
69 tigerbot-180b-base-v2
70 tigerbot-13b-chat-v5
71 tigerbot-13b-chat-v5-4bit-exl2
72 tigerbot-70b-chat-v4-4bit-exl2
73 tigerbot-180b-chat-v2
74 tigerbot-13b-chat-v5-4k
75 tigerbot-13b-base-v3
76 tigerbot-70b-chat-v4-4k
77 tigerbot-70b-base-v2
78 tigerbot-70b-chat-v4
79 BlueLM-7B-Base
80 BlueLM-7B-Chat-32K
81 BlueLM-7B-Chat-4bits
82 Sunsimiao-Qwen-7B
83 MindChat-Qwen-7B-v2-self_lora
84 jina-embeddings-v2-base-en
85 jina-embeddings-v2-small-en
86 qwen-chat-7B-ggml
87 qwen-chat-14B-ggml
88 bge-reranker-large
89 bge-reranker-base

Highlight

  • Support launching a local inference service for testing
  • Support vllm inference
  • LLMPipeline supports vllm
  • Official image upgraded to python3.10, pytorch 2.1.0, tensorflow 2.14.0, ubuntu22.04
  • upgrade to python3.10

Feature

  • Support VLLM in LLMPipeline (#604)
  • add bpemodel path in asr_trainer
  • add llm riddles (#621)
  • feat: deploy checker for swingdeploy

Improvements

  • python311 support for whl
  • llm pipeline support chatglm3 (#618)
  • Support transformers==4.35.0 (#633)

BugFix

  • Fix _set_gradient_checkpointing bug (#660)
  • fix test reliability issue (#657)
  • fix: DocumentGroundedDialogRetrievalModel qry_encoder.encoder.embeddings.position_ids error (#647)
  • fix asr paraformer finetune bug
  • fix uie trainer: eval failed (#617)
  • Fix vllm: change if condition (#607)
  • fix shop_segmentation to use old timm lib and bump version to 1.9.4rc2
  • fix the numpy bug for card detection correction
  • fix issues for 3dhuman models
  • fix logger: remove file handler for original user logging (#645)

English Version

Highlight

  • Launch a local inference server
  • Support vllm
  • LLMPipeline supports vllm
  • Image upgraded to python3.10, pytorch2.1.0, tensorflow2.14.0, ubuntu22.04

Breaking changes

Feature

  • Support VLLM in LLMPipeline (#604)
  • add bpemodel path in asr_trainer
  • add llm riddles (#621)
  • feat: deploy checker for swingdeploy

Improvements

  • python311 support for whl
  • llm pipeline support chatglm3 (#618)
  • Support transformers==4.35.0 (#633)

BugFix

  • Fix _set_gradient_checkpointing bug (#660)
  • fix test reliability issue (#657)
  • fix: DocumentGroundedDialogRetrievalModel qry_encoder.encoder.embeddings.position_ids error (#647)
  • fix asr paraformer finetune bug
  • fix uie trainer: eval failed (#617)
  • Fix vllm: change if condition (#607)
  • fix shop_segmentation to use old timm lib and bump version to 1.9.4rc2
  • fix the numpy bug for card detection correction
  • fix issues for 3dhuman models
  • fix logger: remove file handler for original user logging (#645)
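
modelscope - v1.9.4

Published by wenmengzhou 12 months ago

Chinese Version

Feature

  • Added sentence embedding models, supporting gte and bloom
  • Added the freeU method for stable diffusion
  • LLMPipeline supports Swift adapter model inference
  • Automatically upgrade funasr transformer to the latest version when building the image
  • Removed the forced venv dependency to better support Windows #575

bugfix

  • Fixed shop_segmentation pipeline compatibility with timm 0.5.2
  • Fixed huggingface position_ids compatibility issue
  • Fixed the missing sp_tokenizer attribute in chatglm
  • Fixed ofa model compatibility with newer transformers versions
  • Fixed the work_dir setting in trainer not taking effect #573
  • Fixed hf-related bugs #569 #567

New Models Recommended

No. Model Name & Link
1 GTE text embedding-Chinese-general domain-large
2 GTE text embedding-English-general domain-large
3 GTE text embedding-English-general domain-small
4 GTE text embedding-English-general domain-base
5 GTE text embedding-Chinese-general domain-small
6 X-vector speaker change point localization-two speakers-Chinese
7 Udever multilingual universal text representation model 3b
8 Udever multilingual universal text representation model 1b1
9 GTE text embedding-Chinese-general domain-base
10 Diffusion-based multi-view human generation model
11 Udever multilingual universal text representation model 560m
12 Udever multilingual universal text representation model 7b1
13 Qwen-14B-Chat-Int8
14 Qwen-7B-Chat-Int8
15 CT-Transformer punctuation-Chinese/English-general-large-onnx
16 CodeFuse-QWen-14B
17 ECAPA-TDNN speaker verification-Chinese-CNCeleb-16k
18 ECAPA-TDNN speaker verification-Chinese-3D-Speaker-16k
19 Chinese font style transfer model
20 Whisper speech recognition-English-small
21 Whisper speech recognition-multilingual-large
22 Chinese font generation base model
23 FreeU text-to-image model
24 Paraformer speech recognition-English-general-16k-offline-long audio version
25 Paraformer speaker-attributed speech recognition-Chinese-general
26 Paraformer speech recognition-English-general-16k-offline-large-onnx
27 PASDv2 image super-resolution
28 Transducer speech recognition-English-gigaspeech-16k-real-time
29 PMR-base
30 PMR-large
31 EQA-PMR-large
32 CodeFuse-StarCoder-15B
33 NER-finetuned machine reading comprehension model
34 CodeFuse-CodeLlama-34B-4bits
35 Zero-shot text classification-SSTuning-base-multilingual
36 Qwen-14B-Chat-Int4
37 Zero-shot text classification-SSTuning-base-English
38 Face detection and facial landmark localization
39 Multilingual Conformer Listener
40 SambertHifigan speech synthesis-multilingual-multi-speaker pretrained-16k
41 Audio quantization codec-freqcodec_magphase-English-libritts-16k-gr8nq32ds320-pytorch
42 Audio quantization codec-freqcodec_magphase-English-libritts-16k-gr1nq32ds320-pytorch
43 Audio quantization codec-Encodec-Chinese/English-general-16k-nq32ds640-pytorch
44 Audio quantization codec-Encodec-Chinese/English-general-16k-nq32ds320-pytorch
45 Audio quantization codec-Encodec-English-libritts-16k-nq32ds320-pytorch
46 ERes2Net-Large speaker diarization-conversational role separation-general
47 3DHuman-Syn 3D character animation
48 Audio quantization codec-Encodec-English-libritts-16k-nq32ds640-pytorch
49 Text-to-3D-head model
50 Text-guided model texture generation-3D vision
51 3DHuman-Syn generative 3D human model library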

English Version

Feature

  • Added sentence vector model, supporting gte and bloom.
  • Stable diffusion introduces a new freeU method.
  • LLMPipeline now supports Swift adapter model inference.
  • Automatically upgrade to the latest version of funasr transformer during image creation.
  • Forced venv dependency removed to better support Windows system. #575

bugfix

  • Fixed shop_segmentation pipeline compatibility with timm 0.5.2.
  • Resolved compatibility issues with huggingface position_ids.
  • Fixed the missing sp_tokenizer attribute in chatglm.
  • Addressed compatibility issues of ofa model with newer transformers version.
  • Resolved the issue where the work_dir setting in trainer was not taking effect. #573
  • Fixed hf-related bugs. #569 #567.

New Models Recommended

No Model Name & Link
1 nlp_gte_sentence-embedding_chinese-large
2 nlp_gte_sentence-embedding_english-large
3 nlp_gte_sentence-embedding_english-small
4 nlp_gte_sentence-embedding_english-base
5 nlp_gte_sentence-embedding_chinese-small
6 speech_xvector_transformer_scl_zh-cn_16k-common
7 udever-bloom-3b
8 udever-bloom-1b1
9 nlp_gte_sentence-embedding_chinese-base
10 multimodal_multiview_avatar_gen
11 udever-bloom-560m
12 udever-bloom-7b1
13 Qwen-14B-Chat-Int8
14 Qwen-7B-Chat-Int8
15 punc_ct-transformer_cn-en-common-vocab471067-large-onnx
16 CodeFuse-QWen-14B
17 speech_ecapa-tdnn_sv_zh-cn_cnceleb_16k
18 speech_ecapa-tdnn_sv_zh-cn_3dspeaker_16k
19 font_style_transfer_model
20 speech_whisper-small_asr_english
21 speech_whisper-large_asr_multilingual
22 font_generation_base_model
23 multi-modal_freeu_stable_diffusion
24 speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020
25 speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn
26 speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx
27 PASD_v2_image_super_resolutions
28 speech_conformer_transducer_asr-en-16k-gigaspeech-vocab5001-pytorch-online
29 PMR-base
30 PMR-large
31 EQA-PMR-large
32 CodeFuse-StarCoder-15B
33 NER-PMR-Large
34 CodeFuse-CodeLlama-34B-4bits
35 zero-shot-classify-SSTuning-XLM-R
36 Qwen-14B-Chat-Int4
37 zero-shot-classify-SSTuning-base
38 cv_face_detection_landmark
39 speech_conformer_larger_asr_multi_language-16k-common-vocab30392-pytorch
40 speech_sambert-hifigan_tts_multilingual_multisp_pretrain_16k
41 audio_codec-freqcodec_magphase-en-libritts-16k-gr8nq32ds320-pytorch
42 audio_codec-freqcodec_magphase-en-libritts-16k-gr1nq32ds320-pytorch
43 audio_codec-encodec-zh_en-general-16k-nq32ds640-pytorch
44 audio_codec-encodec-zh_en-general-16k-nq32ds320-pytorch
45 audio_codec-encodec-en-libritts-16k-nq32ds320-pytorch
46 speech_eres2net-large_speaker-diarization_common
47 cv_3d-human-animation
48 audio_codec-encodec-en-libritts-16k-nq32ds640-pytorch
49 cv_HRN_text-to-head
50 cv_diffuser_text-texture-generation
51 cv_3d-human-synthesis-library

modelscope - v1.9.3 release

Published by liuyhwangyh 12 months ago

Chinese Version

Highlight

  • Optimize ci
  • Compatible with transformers 4.34.0
  • Support int4 model for llm_pipeline

BugFix

  • fix merge error (#582)
  • move venv import from file level to class level to avoid import error… (#575)

English Version

Highlight

  • optimize ci
  • compatible with transformers 4.34.0
  • Support int4 model for llm_pipeline

BugFix

  • fix merge error (#582)
  • move venv import from file level to class level to avoid import error… (#575)
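
The venv fix above describes deferring an import so that merely importing a module cannot fail on platforms where the dependency is unavailable. A minimal sketch of that pattern follows; the class `LazyEnvBuilder` is hypothetical and only illustrates the idea, it is not the code changed in #575.

```python
import importlib


class LazyEnvBuilder:
    """Hypothetical helper: the module is imported on first use, not at file import time."""

    def __init__(self, module_name: str = "venv"):
        self.module_name = module_name
        self._module = None  # nothing is imported yet

    def _load(self):
        # Import lazily so an ImportError only surfaces when the feature is used.
        if self._module is None:
            self._module = importlib.import_module(self.module_name)
        return self._module

    def available(self) -> bool:
        """Return True if the underlying module can be imported."""
        try:
            self._load()
        except ImportError:
            return False
        return True

    def create(self, target_dir: str, with_pip: bool = False) -> None:
        """Create a virtual environment in `target_dir` using the stdlib venv API."""
        venv = self._load()
        venv.EnvBuilder(with_pip=with_pip).create(target_dir)
```

Callers that never touch `create` never trigger the import, which is the point of moving it out of file level.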

modelscope - v1.9.2 release

Published by liuyhwangyh about 1 year ago

Chinese Version

New Models Recommended

Highlight

  • Add image_control_3d_portrait model
  • Add 3dhuman render and animation models
  • Add LLMPipeline to support LLM inference

Feature

  • Support swift trainer and pipeline
  • Add image_control_3d_portrait model
  • Add 3dhuman render and animation models
  • Add model for card correction
  • Add head_reconstruction and text_to_head model
  • Add LLMPipeline to support LLM inference
  • Add onnx exporter for ocr recognition model
  • Add onnx exporter for ocr_detection db model

Improvements

BugFix

  • Fix compatibility issue with the new onnxruntime version
  • Fix huggingface compatibility issue
  • asr supports local models
  • Fix ci issue

English Version

New Model List and Quick Access

Highlight

  • Add 3dhuman render and animation models
  • Add image_control_3d_portrait
  • Add LLMPipeline to support LLM inference

Breaking changes

Feature

  • support swift trainer and pipeline (#547)
  • add image_control_3d_portrait
  • add 3dhuman render and animation models
  • add model for card correction
  • add head_reconstruction and text_to_head model
  • add LLMPipeline to support LLM inference
  • add onnx exporter for ocr recognition model
  • add onnx exporter for ocr_detection db model

Improvements

BugFix

  • Fix onnxruntime providers parameter compatibility issue
  • Fix hf bug (#569)
  • Support local asr models (#556)
  • Fix ci issue
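
modelscope -

Published by liuyhwangyh about 1 year ago

Chinese Version

New Models Recommended

Highlight

  • Add retry on failure for model downloads

Feature

Improvements

  • Support retrying failed model downloads

BugFix

  • Fix position_ids compatibility issue with new versions of transformers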

English Version

New Model List and Quick Access

Highlight

  • Retry model download on failure.

Breaking changes

Feature

  • Retry model download on failure.
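
The retry behaviour named above can be pictured as a bounded loop with a growing delay between attempts. The sketch below is an illustration only: `download_with_retry`, its parameters, and the exponential backoff are assumptions, not the implementation shipped in this release.

```python
import time


def download_with_retry(fetch, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `fetch()` until it succeeds or `max_retries` retries are used up.

    `fetch` is any zero-argument callable that raises OSError (which includes
    ConnectionError and TimeoutError) on a failed download. The delay doubles
    after every failure: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except OSError as err:
            last_error = err
            if attempt < max_retries:
                sleep(base_delay * (2 ** attempt))
    # All attempts failed: surface the last error to the caller.
    raise last_error
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting.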

Improvements

BugFix

  • Fix position_ids compatibility issue with the latest transformers.

modelscope - v1.9.0 release

Published by liuyhwangyh about 1 year ago

Chinese Version

New Models Recommended

No. Model Name & Link
1 Qwen-VL-Chat-Int4
2 t5-base
3 WizardMath-7B-V1.0
4 WizardCoder-3B-V1.0
5 WizardCoder-Python-13B-V1.0
6 WizardCoder-Python-34B-V1.0
7 WizardMath-13B-V1.0
8 WizardLM-30B-V1.0
9 WizardLM-7B-V1.0
10 CodeLlama-34b-Instruct-hf
11 CodeLlama-34b-Python-hf
12 CodeLlama-13b-Instruct-hf
13 CodeLlama-7b-hf
14 CodeLlama-7b-Python-hf
15 CodeLlama-34b-hf
16 CodeLlama-13b-hf
17 CodeLlama-13b-Python-hf
18 CodeLlama-7b-Instruct-hf
19 WizardCoder-15B-V1.0
20 WizardCoder-1B-V1.0
21 WizardMath-7B-V1.0
22 WizardLM-13B-V1.2
23 ERes2Net-Base language identification-Chinese/English/Cantonese/Japanese/Korean-8k
24 ERes2Net-large language identification-Chinese/English/Cantonese/Japanese/Korean-8k
25 Paraformer speech recognition-English-general-16k-offline-1B-pytorch
26 Regularized DINO speaker verification-Chinese-CNCeleb-16k

Highlight

  • video to video model support a10, v100
  • funasr supports the mossformer model
  • SDXL supports lora finetuning
  • stable diffusion supports fp16 training and inference

Breaking changes

  • Minimum supported version is python3.8
  • tensorflow in the image upgraded to 2.13.0
  • numpy and pandas versions upgraded

Feature

Improvements

  • Update qwen QA example
  • Add qwen QA langchain example

BugFix

  • Fix stable diffusion fp16 bug
  • Fix image colorization model loading issue
  • Fix missing chatglm2b rope_ratio config

English Version

New Model List and Quick Access

No Model Name & Link
1 qwen-VL-Chat-Int4
2 t5-base
3 WizardMath-7B-V1.0
4 WizardCoder-3B-V1.0
5 WizardCoder-Python-13B-V1.0
6 WizardCoder-Python-34B-V1.0
7 WizardMath-13B-V1.0
8 WizardLM-30B-V1.0
9 WizardLM-7B-V1.0
10 CodeLlama-34b-Instruct-hf
11 CodeLlama-34b-Python-hf
12 CodeLlama-13b-Instruct-hf
13 CodeLlama-7b-hf
14 CodeLlama-7b-Python-hf
15 CodeLlama-34b-hf
16 CodeLlama-13b-hf
17 CodeLlama-13b-Python-hf
18 CodeLlama-7b-Instruct-hf
19 WizardCoder-15B-V1.0
20 WizardCoder-1B-V1.0
21 WizardMath-7B-V1.0
22 WizardLM-13B-V1.2
23 ERes2Net-Base
24 ERes2Net-large
25 Paraformer
26 Regularized

Highlight

  • video-to-video model supports A10 and V100
  • funasr supports the mossformer model
  • Support sdxl finetuning with the lora method
  • Support float16 training and pipeline for stable diffusion

Breaking changes

  • Dropped python3.7 support (python3.8 is the minimum)
  • tensorflow upgrade to 2.13.0 in image
  • numpy, pandas version upgrade
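
A script that depends on this release can guard against the Python floor named in the breaking changes above. This is a small sketch; `python_supported` and `MIN_PYTHON` are hypothetical names chosen for illustration.

```python
import sys

# Minimum interpreter version taken from the breaking changes listed above.
MIN_PYTHON = (3, 8)


def python_supported(version_info=None, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) pair meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_supported():
        raise SystemExit("python3.8 or newer is required")
```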

Feature

Improvements

  • upgrade qwen QA sample

BugFix

  • Fix stable diffusion fp16 bugs
  • Fix image colorization model load issue
  • Fix missing chatglm2b rope_ratio config
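
modelscope - Release 1.8.2

Published by wenmengzhou about 1 year ago

Chinese Version

New models

VideoComposer: compositional video synthesis (#431)

Improvement

  • Remove restriction of numpy version <=1.22.0 (#453)
  • Change llama2 max_length default value (#452)
  • Support moving llama2 inputs to device in the generate function
  • Support loading datasets for llama
  • Update qwen qa example
  • Add readme and warning (#462)

Bugfix

  • Fix missing chatglm2b rope_ratio config parameter (#440)
  • Fix pipeline check error (#455)
  • Fix copytree bug on python37 (#464)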

English Version

New models

VideoComposer: Compositional Video Synthesis with Motion Controllability (#431)

Improvement

  • remove restriction of numpy version <=1.22.0 (#453)
  • change llama2 max_length default value (#452)
  • support moving llama2 inputs to device in the generate function
  • support loading datasets for llama
  • update qwen qa example
  • add readme and warning (#462)

Bugfix

  • fix missing chatglm2b rope_ratio config (#440)
  • fix pipeline check error (#455)
  • fix copytree python37 bug (#464)

modelscope - Release 1.8.1

Published by wenmengzhou about 1 year ago

bugfix for qwen

  • streamer pass error
  • check flash attention installation even if use_fast_att is set True
  • fix quantization model run failed
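
modelscope - Release 1.8.0

Published by wenmengzhou about 1 year ago

Chinese Version

New Models Recommended

No. Model Name & Link
1 Qwen-7B, Qwen-7B-chat
2 chatglm2-6b-32k
3 MDQE video instance segmentation
4 Speech synthesis-Vietnamese-general domain-24k-speaker tien
5 Speech synthesis-Malay-general domain-24k-speaker farah
6 stable-diffusion-xl-refiner-1.0
7 stable-diffusion-xl-base-1.0
8 PolyLM-intelligent assistant-text generation model-multilingual-13B
9 Pengcheng Pangu enhanced edition-2.6B-CPU
10 ERes2Net-Base language identification-Chinese/English-16k
11 ERes2Net-Large language identification-Chinese/English-16k
12 codegeex2-6b
13 openbuddy-llama2-13b-v8.1-fp16
14 CT-Transformer punctuation-Chinese/English-general-large
15 CAM++ language identification-Chinese/English-16k
16 zeroscope_v2_xl high-definition text-to-video
17 FreeWilly2
18 Beautiful-Realistic-Asians-v5
19 ProST: general video-text retrieval model
20 Realistic_Vision_V4.0
21 CAM++ speaker verification-Chinese-3DSpeaker-16k
22 Llama-2-70b-ms
23 Llama-2-13b-chat-ms
24 Llama-2-7b-ms
25 Llama-2-7b-chat-ms
26 Llama-2-13b-ms
27 PolyLM-instruction tuned-text generation model-multilingual-13B
28 LLaVA visual question answering model
29 Paraformer speech recognition-English-general-16k-offline-large-pytorch
30 speech_bert_semantic-spk-turn-detection-punc_speaker-diarization_chinese
31 Efficient tuning of generative diffusion models-Swift-LoRA
32 Efficient tuning of generative diffusion models-Swift-Adapter
33 Efficient tuning of generative diffusion models-Swift-Prompt
34 MindChat-7B
35 MindChat-6B
36 MindChat-Baichuan-13B
37 rwkv-4-music
38 RWKV-4-Raven-7B
39 rwkv-4-world
40 Monocular depth estimation for panoramic images on a sphere
41 AquilaChat-7B
42 Sunsimiao-6B-05M
43 Sunsimiao-InternLM-01M
44 4K ultra-high-definition NeRF reconstruction algorithm
45 Diffusion-based text-to-image-360 panorama generation model
46 speech_bert_dialogue-detetction_speaker-diarization_chinese
47 PolyLM-text generation model-multilingual-13B
48 stable-diffusion-xl-base-0.9
49 Baichuan 13B chat model
50 Baichuan 13B model
51 InternLM large model
52 internlm-chat-7b
53 StableSR image super-resolution
54 ERes2Net-Large speaker verification-Chinese-3D-Speaker-16k
55 ERes2Net-Base speaker verification-Chinese-3D-Speaker-16k
56 Vector-quantization-based neural radiance field compression
57 baichuan_agent
58 Regularized DINO speaker verification-Chinese-3D-Speaker-16k
59 CAAI-Hackathon
60 BAT speech recognition-Chinese-aishell1-16k-offline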
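Feature

  • Added a model version check when training with AutoModel.
  • Added support for safetensors weight pipeline.
  • Support for streaming output on transformers models.
  • Refined class wrapper.
  • Added support for huggingface transformers AutoModel, AutoConfig and AutoTokenizer.
  • Added full-parameter sft to llm.
  • Added Japanese README.
  • Added download_mode parameter to params and the load function.
  • Improved memory/GPU memory efficiency of attention computation with Xformers.

Improvements

  • Upgraded stable diffusion to the more powerful version 2.1.
  • Added custom stable diffusion finetuning.
  • Added stable diffusion swift tuner.
  • Support llama and lora finetuning without deepspeed.
  • Added lora_rank parameter for lora stable diffusion.
  • Optimized the torch1.11 and torch2.0.1 image build scripts.
  • Support getting labels from the dataset in sbert text classification, and building a dataset from file in chatglm-6b.
  • Updated speaker_verification_pipeline.py.
  • Updated the default test level.
  • Updated the default aliyuncs pip setting.
  • Updated the language identification task name.
  • Added height and width parameters for text-to-video.
  • Customized the diffusion pipeline.
  • Added download_mode parameter support for ASRDataset.
  • Updated to the new chatglm6b v2 version.
  • Improved how meta-csv cache paths are loaded.
  • Added an example/llm module.
  • Optimized comments and formatting.
  • Set the download_mode default value.
  • Updated the pipeline with num_inference_steps and guidance_scale parameters.
  • Added download_mode support in finetune_speech_recognition.py via params.download_mode.
  • Replaced MsDataset with ASRDataset in finetune_speech_recognition.py.
  • Updated ASRDataset with download_mode for re-downloading the dataset (e.g. when it is broken or corrupted).
  • Updated asr_dataset.py to support re-downloading data with download_mode.
  • Made text_in a required parameter.
  • Modified the parameter passing of the text_generation_pipeline class.
  • Added baichuan/chatglm2+lora+agent examples.
  • Added stable diffusion tutorial ipynb.

BugFix

  • Ignored http errors to prevent confusion during model check.
  • Fixed a device mismatch issue when loading checkpoints.
  • Fixed missing plugin module files.
  • Fixed image tag without cuda issue.
  • Fixed easycv CPU extension build issue.
  • Fixed device error.
  • Fixed amp and device_map issue.
  • Fixed pip install error by requiring pysptk >= 0.1.19.
  • Fixed ckpt output directory ignoring *.safetensors.
  • Fixed eval and sequence_length support for baichuan.
  • Fixed use of the cuda device in the document segmentation pipeline.
  • Removed Tsinghua-related hard-coding.
  • Fixed a stable diffusion pipeline bug where the "lora_scale" argument was not recognized.
  • Fixed bugs related to the chatglm2 module.
  • Fixed mPLUG-Owl generation length bug.
  • Corrected details of the speaker models.
  • Fixed a bug when loading a local stable diffusion dataset.
  • Fixed bugs in the chatglm pipeline.
  • Fixed chatglm6b 2 bugs.
  • Fixed chatglm2 evaluation error related to empty hypothesis.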

English Version

New Model List and Quick Access

No Model Name & Link
1 Qwen-7B, Qwen-7B-chat
2 chatglm2-6b-32k
3 MDQE video-instance-segmentation
4 speech_sambert-hifigan_nsf_tts_tien_Vietnamese_24k
5 speech_sambert-hifigan_nsf_tts_farah_Malay_24k
6 stable-diffusion-xl-refiner-1.0
7 stable-diffusion-xl-base-1.0
8 PolyLM-assistant_13b_text_generation
9 pangu-plus-2.6B-CPU
10 ERes2Net-Base-language identification-en-cn-16k
11 ERes2Net-Large-language identification-en-cn-16k
12 codegeex2-6b
13 openbuddy-llama2-13b-v8.1-fp16
14 CT-Transformer-punc-cn-en-common-large
15 CAM++-language identification-en-cn-16k
16 zeroscope_v2_xl high-definition text-to-video generation
17 FreeWilly2
18 Beautiful-Realistic-Asians-v5
19 ProST: retrieval model for video-text
20 Realistic_Vision_V4.0
21 CAM++-zh-cn-3DSpeaker-16k
22 Llama-2-70b-ms
23 Llama-2-13b-chat-ms
24 Llama-2-7b-ms
25 Llama-2-7b-chat-ms
26 Llama-2-13b-ms
27 PolyLM-multialpaca-text_generation-13B
28 LLaVA visual-question-answering
29 Paraformer-asr_nat-zh-cn-16k-large-pytorch
30 speech_bert_semantic-spk-turn-detection-punc_speaker-diarization_chinese
31 multi-modal_efficient-diffusion-tuning-Swift-LoRA
32 multi-modal_efficient-diffusion-tuning-Swift-Adapter
33 multi-modal_efficient-diffusion-tuning-Swift-Prompt
34 MindChat-7B
35 MindChat-6B
36 MindChat-Baichuan-13B
37 rwkv-4-music
38 RWKV-4-Raven-7B
39 rwkv-4-world
40 monocular depth estimation for panoramic images on a sphere
41 AquilaChat-7B
42 Sunsimiao-6B-05M
43 Sunsimiao-InternLM-01M
44 4K Ultra high definition NeRF 3d-reconstruction
45 diffusion_text-to-360panorama-image_generation
46 speech_bert_dialogue-detetction_speaker-diarization_chinese
47 PolyLM-text_generation-13B
48 stable-diffusion-xl-base-0.9
49 Baichuan-13B-Chat
50 Baichuan-13B-Base
51 internlm-chat-7b-8k
52 internlm-chat-7b
53 StableSR image-super-resolution
54 ERes2Net-Large large_sv_zh-cn_3dspeaker_16k
55 ERes2Net-Base base_sv_zh-cn_3dspeaker_16k
56 3d-reconstruction_vector-quantize-compression
57 baichuan_agent
58 Regularized DINO ecapa_tdnn -3D-Speaker-16k
59 CAAI-Hackathon
60 BAT-asr-zh-cn-aishell1-16k

Feature

  • Added a model version check when training with AutoModel.
  • Added support for safetensors weight pipeline.
  • Support for stream output on transformers model
  • Refined class wrapper
  • Added Support for AutoModel, AutoConfig and AutoTokenizer
  • Added full parameter sft to llm
  • Added Japanese README
  • Added download_mode param to params and load function.
  • Accelerated memory efficient attention with Xformers.

Improvements

  • Upgraded stable diffusion version to more powerful version 2.1.
  • Custom method for finetuning stable diffusion
  • Added stable diffusion swift tuner
  • Support for llama & lora finetune without deepspeed
  • Added lora_rank parameter for lora stable diffusion
  • Refactored torch1.11 and torch2.0.1 build script
  • Support getting labels from dataset in sbert text classification and building dataset from file in chatglm-6b
  • Updated speaker_verification_pipeline.py
  • Changed tests level
  • Set default aliyuncs pip
  • Added height and width parameters for text-to-video.
  • Customized the diffusion pipeline.
  • Added support for ASRDataset for download_mode parameters.
  • Updated to the new chatglm6b v2 version.
  • Improved how meta-csv cache paths are loaded.
  • Added an example/llm module.
  • Optimized comments and formatting.
  • Set the download_mode default value.
  • Updated the pipeline with num_inference_steps and guidance_scale parameters.
  • Added support for download_mode in finetune_speech_recognition.py with params.download_mode.
  • Replaced MsDataset with ASRDataset in finetune_speech_recognition.py.
  • Updated ASRDataset with download_mode for re-downloading the dataset if it is broken or corrupted.
  • Updated asr_dataset.py to support download_mode for re-downloading data.
  • Made text_in a required parameter.
  • Modified the parameter passing of the text_generation_pipeline class.
  • Added baichuan/chatglm2 +lora+agent examples.
  • Added stable diffusion tutorial ipynb.

BugFix

  • Ignored http error to prevent confusion during model check.
  • Fixed a device mismatch issue when loading checkpoints.
  • Fixed missing plugin python module files.
  • Fixed build tag no cuda issue.
  • Fixed easycv CPU extension build issue.
  • Fixed device error.
  • Fixed issue with amp and device_map
  • Fixed pip install error with pysptk>=0.1.19
  • Fixed issue with ckpt output directory ignoring *.safetensors
  • Fixed eval and sequence_length support for baichuan
  • Fixed issue with using cuda device in document segmentation pipeline inference
  • Removed hard-coded Tsinghua references
  • Fixed amp + device_map (#386)
  • Fixed bugs related to the Chinese stable diffusion pipeline not recognizing the 'lora_scale' argument.
  • Fixed bugs related to the chatglm2 module.
  • Fixed mPLUG-Owl generating length bug.
  • Fixed details of speaker models.
  • Fixed bugs related to loading local sd dataset.
  • Fixed bugs in the chatglm pipeline.
  • Fixed chatglm6b 2.
  • Fixed chatglm2 evaluation error related to empty hypothesis.

modelscope - Release 1.7.2

Published by wangxingjun778 about 1 year ago

Release 1.7.2
Fix some bugs

modelscope - v1.7.1 release

Published by wangxingjun778 over 1 year ago

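Chinese Version

New Features

  • Add lora inference for the baichuan model
  • Add baichuan and chatglm2 lora agent examples

BugFix

  • Fix history problem
  • Fix loading of local datasets in sd
  • Fix chatglm inference issue
  • Modify text_generation_pipeline parameter passing
  • Fix chatglm6b and chatglm6b 2
  • Fix evaluation error: hypothesis empty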

English Version

Features

  • Add lora_inference for baichuan
  • Add baichuan/chatglm2+lora+agent examples

BugFix

  • fix history problem
  • Fix a bug of loading local stable diffusion dataset
  • Fix chatglm pipeline
  • Modify the parameter passing of the text_generation_pipeline class
  • Fix chatglm6b 2
  • Fix chatglm6b
  • fix evaluation error: hypothesis empty
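
modelscope - v1.7.0 release

Published by wangxingjun778 over 1 year ago

Chinese Version

New Models Recommended

No. Model Name & Link
1 Duguang-text recognition-lightweight on-device recognition model-Chinese/English-general domain
2 Duguang-text detection-lightweight on-device DBNet line detection model-Chinese/English-general domain
3 CAM++ speaker change point localization-two speakers-Chinese

Highlight

  • Add lightweight on-device recognition model LightweightEdge
  • Add lightweight on-device DBNet line detection model
  • Add CAM++ speaker change point localization
  • llama model supports finetune and deepspeed
  • llama model supports lora
  • Support device_map for transformers models
  • Datasets support jsonl format
  • Support parallel download of large model files (dsw or eais environments)
  • Improve the download experience for the very large youku dataset

Feature

  • Add lightweight on-device recognition model LightweightEdge
  • Add lightweight on-device DBNet line detection model
  • Add flextrain ner example
  • Add model version to training args in text classification finetune
  • Add StreamingMixin
  • Support torch extension
  • Support llama model finetuning and deepspeed
  • Support third-party key in pipeline
  • Support speaker diarization pipeline
  • Add eres2net_aug v2 model
  • Support transformers model device_map
  • Support model weight diff
  • Datasets support jsonl format
  • Add Lora/Adapter/Prompt/Chatglm6b
  • Add teardown to some tests
  • Remove datasets package version restriction
  • Support loading from model id
  • llama model supports lora
  • Support parallel download of large model files

Improvements

  • Improve the download experience for the large mPLUG-youku dataset

BugFix

  • Fix DeepspeedHook.register_processor
  • Dockerfile compatibility changes (py37 and py38)
  • Fix extra_args
  • Fix ngpu bug and remove easyasr
  • Fix issues with downloading the very large mplug-youku dataset
  • Fix gpt3 finetune nan issue
  • Fix torch extension ci hang
  • Fix easycv lr hook error
  • Fix torch2.x compatibility issue
  • Fix diffusers version conflict
  • Fix eval RecursionError
  • Fix device_map issue for DiffusionForTextToImageSynthesis
  • Fix stable diffusion pipeline cpu inference issue
  • Fix llama lora issue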

English Version

New Model List and Quick Access

No Model Name & Link
1 cv_LightweightEdge_ocr-recognitoin-general_damo
2 cv_proxylessnas_ocr-detection-db-line-level_damo
3 speech_campplus-transformer_scl_zh-cn_16k-common

Highlight

  • Add new OCR recognition model (LightweightEdge) and some functions
  • Add ocr detection new model db-nas
  • Add CAM++ model
  • Support llama model finetune and deepspeed
  • Support lora for llama model
  • Support device_map for transformers
  • Support jsonl format in datasets
  • Support parallel download of large model files
  • Improve mPLUG-YOUKU dataset downloading experience
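
For readers unfamiliar with the jsonl format mentioned in the highlights above: each line of the file is one standalone JSON object. The sketch below reads such a stream with the standard library; `read_jsonl` and the sample records are hypothetical and do not reflect how modelscope's dataset loader is implemented.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_no}: invalid JSON ({err.msg})") from err


# In-memory sample standing in for a .jsonl file on disk.
sample = io.StringIO(
    '{"text": "hello", "label": 0}\n'
    '\n'
    '{"text": "world", "label": 1}\n'
)
records = list(read_jsonl(sample))
```

Because every record is parsed independently, a large file can be streamed without loading it into memory at once.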

Breaking changes

Feature

  • Add new OCR recognition model (LightweightEdge) and some functions
  • Add new OCR detection model db-nas
  • Add ner example for flextrain
  • Add model revision in training_args and modify dataset loading in finetune text classification
  • Add StreamingMixin
  • Support pre-building torch extensions in the build image; the first extension is megatron_util
  • Add llama finetune + deepspeed
  • Support third_party key in pipeline
  • Add speaker diarization pipeline and improve some speaker pipelines
  • Add eres2net_aug v2
  • Support device_map for transformers model
  • Add make diff & recover for model weights
  • Support jsonl format in meta data
  • Add Lora/Adapter/Prompt/Chatglm6b
  • Add teardown for tests
  • Unfreeze datasets version setting
  • Support load from model id
  • Support lora for llama
  • Support parallel download of large model files
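
A common way to download one large file in parallel is to split it into byte ranges and fetch each with its own HTTP Range request. The sketch below shows only the range arithmetic. Whether ModelScope's downloader works this way is an assumption, and `split_ranges` and `range_header` are hypothetical names, not modelscope functions.

```python
def split_ranges(total_size, parts):
    """Split [0, total_size) into at most `parts` contiguous byte ranges.

    Returns (start, end) tuples with inclusive ends, which matches the
    semantics of the HTTP Range header. Hypothetical helper for illustration.
    """
    if total_size <= 0 or parts <= 0:
        return []
    parts = min(parts, total_size)  # never create empty ranges
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` ranges take one additional byte each
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start, end):
    """Build the request header for one inclusive byte range."""
    return {"Range": f"bytes={start}-{end}"}


# Each worker would request one range and write it at offset `start`
chunks = split_ranges(10, 3)
headers = [range_header(s, e) for s, e in chunks]
```

The ranges are contiguous and non-overlapping, so the pieces can be written into a preallocated file at their start offsets in any completion order.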

Improvements

  • llama tuned model -> pipeline
  • improve youku dataset downloading experience

BugFix

  • Fix bug for DeepspeedHook.register_processor
  • Merge Dockerfile compatibility for py38 and py37
  • Fix extra_args
  • Fix ngpu bug and remove easyasr
  • Fix issues for downloading mplug-youku dataset
  • Fix gpt3 finetune nan
  • Fix ci hang when building torch extension
  • Fix easycv lr hook error
  • Fix torch 2.x compatibility issue
  • Fix diffusers version conflict between cv and multi-modal
  • Fix eval RecursionError
  • Fix device_map for DiffusionForTextToImageSynthesis
  • Fix cpu inference for stable diffusion pipeline
  • Fix llama lora bug
modelscope - v1.6.1 release

Published by wangxingjun778 over 1 year ago

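Chinese Version

Features

  • Support skipping the import of easycv third-party dependencies
  • Support Flextrain training args and push_to_hub
  • Support onnx export for domain_specific_object_detection

BugFix

  • Fix test_cli CI error
  • Fix merge hook
  • Fix NER tokenizer not accepting kwargs
  • Fix lineless_table_recognition crash on blank images
  • Fix private dataset authentication failure in some cases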

English Version

Feature

  • Add pattern to skip easycv.thirdparty
  • Support flex train feature (training args and push_to_hub adaptions)
  • Support onnx export for domain_specific_object_detection

BugFix

  • Fix CI: test merge dataset failed
  • Fix merge_hook
  • Fix NER tokenizer which won't accept kwargs
  • Fix lineless_table_recognition crash on blank input images
  • Fix private dataset auth issue
modelscope - v1.6.0 release

Published by wangxingjun778 over 1 year ago

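Chinese Version

This release adds 5 new models.

New Model List and Quick Access

Contributor Model Name Finetune Supported
DAMO Academy ERes2Net说话人确认-英文-VoxCeleb-16k-离线-pytorch (ERes2Net speaker verification, English, VoxCeleb, 16k, offline, pytorch)
DAMO Academy mPLUG-Owl-多模态对话-英文-7B (mPLUG-Owl multimodal dialogue, English, 7B)
DAMO Academy FastInst快速实例分割 (FastInst fast instance segmentation)
DAMO Academy TransFace人脸识别模型 (TransFace face recognition model)
DAMO Academy Regularized DINO说话人确认-英文-VoxCeleb-16k-离线-pytorch (Regularized DINO speaker verification, English, VoxCeleb, 16k, offline, pytorch)

Breaking changes

  • Support Python3.8
  • Remove demo check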

English Version

Highlight

  • Support Python3.8
  • Add mPLUG-Owl model
  • Add cvpr23 Fastinst model

Breaking changes

  • Support Python3.8
  • Remove demo check

Feature

  • Add ERes2Net for speaker verification
  • Add mPLUG-Owl model
  • Support FlexTrain and update the structure of trainer
  • Add cvpr23 fastinst model
  • Support Virgo MaxCompute datasource for Ali-cloud inner applications
  • Add clip_interrogator
  • Add gpt3 example
  • Add convert megatron ckpt script
  • Add trainer for UniTE
  • Add transface model
  • Add verification of whether the whl is installed
  • Support python3.8
  • Add ONNX exporter for ans dfsmn
  • Add rdino model

Improvements

  • Update multi_modal_embedding example
  • Refine easyasr
  • Pipeline input, output and parameter normalization.
  • Display hub error message
  • Remove easycv codes, plugin access

BugFix

  • Fix duplicated **kwargs bug in audio module
  • Fix distributed hook to use lazy import, and fix an import bug
  • Fix transformer examples
  • Add pop for base class parameters
  • Fix func update_local_model; change funasr version
  • Remove pai-easycv requirement
  • Fix hypotheses not initialized on cpu device; make fid_dialogue_test available
modelscope - v1.5.0 release

Published by Firmament-cyou over 1 year ago

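Chinese Version

Recommended New Models

No. Model Name & Quick Link
1 ResNet50行人结构化属性识别模型 (ResNet50 pedestrian structured attribute recognition model)
2 DamoFD人脸检测关键点模型-0.5G (DamoFD face detection and keypoint model, 0.5G)
3 CAM++说话人确认-英文-VoxCeleb-16k (CAM++ speaker verification, English, VoxCeleb, 16k)
4 一种具有自我评估能力的机器翻译-中英-通用领域-large (Machine translation with self-evaluation capability, Chinese-English, general domain, large)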

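Highlights

  • Support efficient tuning of generative diffusion models with lora
  • Add llama model
  • Support the ability to push to hub
  • Support the chat task for chatglm-6B-style models
  • Add cli call examples for common models and tasks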

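Features

  • Support splitting and merging checkpoints saved by megatron tensor-parallel models
  • Support efficient tuning of generative diffusion models with lora
  • Add pedestrian attribute recognition model
  • Add damofd series models
  • Add llama model
  • Support the ability to push to hub
  • Add speaker cam++ model
  • Add head support for XlmRoberta model
  • Add canmt translation model
  • Support the chat task for chatglm-6B-style models

Improvements

  • Update funasr to version 0.4.0, with support for running on mac
  • Plugin support for trainer
  • Add 3.7B model for fid_dialouge_pipeline
  • Add training example for the token classification task with the Mgeo model
  • Add training example for the text generation task with the PALM model
  • Add training example for the multi-modal embedding task with the CLIP model
  • Add gradient accumulation config to speech kws nearfield training
  • Refactor and optimize the face reconstruction model code
  • Update image colorization metric
  • Update github issue templates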

BugFix

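  • Fix generate error in text generation task models
  • Fix face reconstruction model pipeline error
  • Fix repeated warnings printed by pipeline
  • Fix error when plugin fails to import a package
  • Fix multi-gpu training error in speech kws nearfield
  • Fix missing spaces in English output of generation models
  • Fix jsonplus not supporting ndarray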

English Version

New Model List and Quick Access

No Model Name & Link
1 ResNet50 pedestrian-attribute-recognition image
2 DamoFD face-detection 0.5G
3 Speech cam++ English-VoxCeleb-16k
4 Canmt translation with self evaluation zh2en-large

Highlight

  • Add efficient tuner modules
  • Add llama to mslib from hf
  • Support the ability to push to hub
  • Add task chat for all chat models, like chatglm-6B
  • Add common models and tasks cli call example

Breaking changes

Feature

  • Support split and merge for megatron_base model
  • Add efficient tuner modules
  • Add pedestrian attribute recognition model
  • Add damofd model
  • Add llama to mslib from hf
  • Support the ability to push to hub
  • Add speaker model cam++ for speaker verification task
  • New head support for XlmRoberta model
  • Add canmt translation model
  • Add task chat for all chat models, like chatglm-6B

Improvements

  • Support funasr for mac
  • Plugin support trainer
  • Add 3.7B size model for fid_dialouge_pipeline
  • Add token classification example for MGeo
  • Add PALM finetune example
  • Add multi-modal embedding example for CLIP
  • Add gradient accumulation config to speech kws nearfield training
  • Update face reconstruction to HRN(CVPR2023)
  • Update image colorization metric
  • Update issue templates

BugFix

  • Fix generate for ModelForTextGeneration
  • Fix issues for face pipeline
  • Fix repeated warning printing in pipeline
  • Bug fixed in plugin
  • Fix speech kws nearfield training with multi-gpu
  • Fix missing spaces between English words in generated output
  • Fix jsonplus, support ndarray
Package Rankings
Top 6.75% on Proxy.golang.org
Top 2.53% on Pypi.org
Related Projects