The official Python client for the Hugging Face Hub.
Apache-2.0 License
Published by Wauplin 3 months ago
Fixed a bug in the chat completion URL to follow the OpenAI standard (https://github.com/huggingface/huggingface_hub/pull/2418). `InferenceClient` now works with URLs ending with `/`, `/v1` and `/v1/chat/completions`.
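For instance, all of the following base URLs are now normalized to the same chat completion route (a minimal sketch; the localhost TGI URL is a placeholder):

from huggingface_hub import InferenceClient

# Equivalent base_url values: "http://localhost:8080/",
# "http://localhost:8080/v1", "http://localhost:8080/v1/chat/completions"
client = InferenceClient(base_url="http://localhost:8080/v1")
output = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=50,
)
print(output.choices[0].message.content)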
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.24.2...v0.24.3
Published by Wauplin 3 months ago
See https://github.com/huggingface/huggingface_hub/pull/2413 for more details.
Creating an empty commit on a PR was failing due to a `revision` parameter being quoted twice. This patch release fixes it.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.24.1...v0.24.2
Published by Wauplin 3 months ago
This release fixes two things, including the handling of the `"[DONE]"` message in the chat stream (related to TGI update https://github.com/huggingface/text-generation-inference/pull/2221). See https://github.com/huggingface/huggingface_hub/pull/2410 for more details.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.24.0...v0.24.1
Published by Wauplin 3 months ago
The `InferenceClient`'s chat completion API is now fully compliant with the OpenAI client, making it a drop-in replacement in your script:
- from openai import OpenAI
+ from huggingface_hub import InferenceClient

- client = OpenAI(
+ client = InferenceClient(
    base_url=...,
    api_key=...,
)

output = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ],
    stream=True,
    max_tokens=1024,
)

for chunk in output:
    print(chunk.choices[0].delta.content)
Why switch to `InferenceClient` if you already use `OpenAI`? Because it's better integrated with HF services, such as the Serverless Inference API and Dedicated Endpoints. Check out the more detailed answer in this HF Post.
For more details about OpenAI compatibility, check out this guide's section.
InferenceClient improvements

Some new parameters have been added to the `InferenceClient`, following the latest changes in our Inference API:
- `prompt_name`, `truncate` and `normalize` in `feature_extraction`
- `model_id` and `response_format` in `chat_completion`
- `adapter_id` in `text_generation`
- `hypothesis_template` and `multi_labels` in `zero_shot_classification`
Of course, all of those changes are also available in the `AsyncInferenceClient` async equivalent 🤗
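For example, the new `feature_extraction` parameters can be used as follows (a minimal sketch; the model is only an example, and `truncate`/`normalize` support depends on the deployed server, typically TEI-powered endpoints):

from huggingface_hub import InferenceClient

client = InferenceClient()
# `truncate` and `normalize` are among the newly added parameters
embedding = client.feature_extraction(
    "Today is a sunny day.",
    model="sentence-transformers/all-MiniLM-L6-v2",
    normalize=True,
    truncate=True,
)
print(embedding.shape)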
- `prompt_name` to feature-extraction + update types by @Wauplin in #2363
- `adapter_id` (text-generation) and `response_format` (chat-completion) by @Wauplin in #2383

Added helpers for TGI servers:
- `get_endpoint_info` to get information about an endpoint (running model, framework, etc.). Only available on TGI/TEI-powered models.
- `health_check` to check the health status of the server. Only available on TGI/TEI-powered models and only for Inference Endpoints or local deployments. For the serverless Inference API, it's better to use `get_model_status`.

Other fixes:
- `image_to_text` output type has been fixed
- `wait-for-model` to avoid being rate limited while the model is not loaded
- `proxies` support

The serialization module introduced in `v0.22.x` has been improved to become the preferred way to serialize a torch model to disk. It handles out-of-the-box sharding and safe serialization (using `safetensors`) with subtleties to work with shared layers. This logic was previously scattered in libraries like `transformers`, `diffusers`, `accelerate` and `safetensors`. The goal of centralizing it in `huggingface_hub` is to allow any external library to safely benefit from the same naming convention, making it easier for end users to manage.
>>> from huggingface_hub import save_torch_model
>>> model = ... # A PyTorch model
# Save state dict to "path/to/folder". The model will be split into shards of 5GB each and saved as safetensors.
>>> save_torch_model(model, "path/to/folder")
# Or save the state dict manually
>>> from huggingface_hub import save_torch_state_dict
>>> save_torch_state_dict(model.state_dict(), "path/to/folder")
More details in the serialization package reference.
- `save_torch_state_dict` + add `save_torch_model` by @Wauplin in #2373

Some helpers related to serialization have been made public for reuse in external libraries:
- `get_torch_storage_id`
- `get_torch_storage_size`
- `max_shard_size` as string in `split_state_dict_into_shards_factory` by @SunMarc in #2286

The `HfFileSystem` has been improved to optimize calls, especially when listing files from a repo. This is especially useful for large datasets like HuggingFaceFW/fineweb, for faster processing and a reduced risk of being rate limited.
- `hf_file_system.py` by @lappemic in #2278
- `fs.walk()` by @lhoestq in #2346

Thanks to @lappemic, `HfFileSystem` methods are now properly documented. Check it out here!
- `HfFileSystem` methods by @lappemic in #2380

A new mechanism has been introduced to prevent empty commits if no changes have been detected. It is enabled by default in `upload_file`, `upload_folder`, `create_commit` and the `huggingface-cli upload` command. There is no way to force an empty commit.
Resource Groups allow organization administrators to group related repositories together and manage access to those repos. It is now possible to specify a resource group ID when creating a repo:
from huggingface_hub import create_repo
create_repo("my-secret-repo", private=True, resource_group_id="66670e5163145ca562cb1988")
- `resource_group_id` in `create_repo` by @Wauplin in #2324

Webhooks allow you to listen for new changes on specific repos or to all repos belonging to a particular set of users/organizations (not just your repos, but any repo). With the Webhooks API you can create, enable, disable, delete, update, and list webhooks from a script!
from huggingface_hub import create_webhook
# Example: Creating a webhook
webhook = create_webhook(
url="https://webhook.site/your-custom-url",
watched=[{"type": "user", "name": "your-username"}, {"type": "org", "name": "your-org-name"}],
domains=["repo", "discussion"],
secret="your-secret"
)
The search API has been slightly improved. It is now possible to pass the `expand` parameter to `model_info`/`list_models` (and similarly for datasets/Spaces). For example, you can ask the server to return `downloadsAllTime` for all models:

>>> from huggingface_hub import list_models
>>> for model in list_models(library="transformers", expand="downloadsAllTime", sort="downloads", limit=5):
... print(model.id, model.downloads_all_time)
MIT/ast-finetuned-audioset-10-10-0.4593 1676502301
sentence-transformers/all-MiniLM-L12-v2 115588145
sentence-transformers/all-MiniLM-L6-v2 250790748
google-bert/bert-base-uncased 1476913254
openai/clip-vit-large-patch14 590557280
- `expand` parameter in `xxx_info` and `list_xxxs` (model/dataset/Space) by @Wauplin in #2333

It is now possible to delete files from a repo using the command line:
Delete a folder:
>>> huggingface-cli repo-files Wauplin/my-cool-model delete folder/
Files correctly deleted from repo. Commit: https://huggingface.co/Wauplin/my-cool-mo...
Use Unix-style wildcards to delete sets of files:
>>> huggingface-cli repo-files Wauplin/my-cool-model delete *.txt folder/*.bin
Files correctly deleted from repo. Commit: https://huggingface.co/Wauplin/my-cool-mo...
- `repo_files` command, with recursive deletion, by @OlivierKessler01 in #2280

The `ModelHubMixin`, allowing for quick integration of external libraries with the Hub, has been updated to fix some existing bugs and ease its use. Learn how to integrate your library from this guide.
- `ModelHubMixin` siblings by @Wauplin in #2394

Efforts from the Korean-speaking community continued to translate guides and package references to KO! Check out the result here.
- `package_reference/cards.md` to Korean by @usr-bin-ksh in #2204
- `package_reference/community.md` to Korean by @seoulsky-field in #2183
- `guides/integrations.md` to Korean by @cjfghk5697 in #2256
- `package_reference/environment_variables.md` to Korean by @jungnerd in #2311
- `package_reference/webhooks_server.md` to Korean by @fabxoe in #2344
- `guides/manage-cache.md` to Korean by @cjfghk5697 in #2347

French documentation is also being updated, thanks to @JibrilEl!
A very nice illustration has been made by @severo to explain how `hf://` URLs work with the `HfFileSystem` object. Check it out here!
A few breaking changes have been introduced:
- `ModelFilter` and `DatasetFilter` are completely removed. You can now pass arguments directly to `list_models` and `list_datasets`. This removes one level of complexity for the same result.
- `organization` and `name` removed from `update_repo_visibility`. Please use a proper `repo_id` instead. This makes the method consistent with all other methods from `HfApi`.

These breaking changes have been announced with a regular deprecation cycle.
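As an illustration, the consolidated calls now look like this (a sketch; the repo name is a placeholder):

from huggingface_hub import HfApi

api = HfApi()
# Pass a full repo_id instead of the removed organization/name pair
api.update_repo_visibility(repo_id="my-username/my-cool-model", private=True)
# Filters are passed directly instead of wrapping them in a ModelFilter
models = api.list_models(library="transformers", language="zh", limit=5)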
The `legacy_cache_layout` parameter (in `hf_hub_download`/`snapshot_download`) as well as the `cached_download`, `filename_to_url` and `url_to_filename` helpers are now deprecated and will be removed in `huggingface_hub==0.26.x`. The proper way to download files is to use the current cache system with `hf_hub_download`/`snapshot_download`, which has been in place for 2 years already.
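If you still rely on `cached_download`, migrating is usually a one-liner (a small sketch using a public repo as example):

from huggingface_hub import hf_hub_download

# Replaces the deprecated `cached_download(url)`: the file is fetched
# through the current cache system and its local path is returned.
path = hf_hub_download(repo_id="gpt2", filename="config.json")
print(path)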
- `legacy_cache_layout` parameter in `hf_hub_download` by @Wauplin in #2317
- `.resume()` if Inference Endpoint is already running by @Wauplin in #2335
- `docs/README.md` by @lappemic in #2382
- `safetensors[torch]` by @qgallouedec in #2371

The following contributors have made significant changes to the library over the last release:
- `package_reference/cards.md` to Korean (#2204)
- `package_reference/community.md` to Korean (#2183)
- `hf_file_system.py` (#2278)
- `docs/README.md` (#2382)
- `HfFileSystem` methods (#2380)
- `repo_files` command, with recursive deletion (#2280)
- `guides/integrations.md` to Korean (#2256)
- `guides/manage-cache.md` to Korean (#2347)
- `package_reference/environment_variables.md` to Korean (#2311)
- `package_reference/webhooks_server.md` to Korean (#2344)

Published by Wauplin 3 months ago
See https://github.com/huggingface/huggingface_hub/pull/2394 for more details.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.23.4...v0.23.5
Published by Wauplin 4 months ago
Includes:
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.23.3...v0.23.4
Published by Wauplin 5 months ago
Release 0.23.0 introduced a breaking change in `InferenceClient.text_generation`: when `details=True` was passed, the `details` attribute in the output was always `None`. This patch release fixes it. See https://github.com/huggingface/huggingface_hub/pull/2316 for more details.
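A quick way to check the fixed behavior (a sketch; any TGI-powered model works, the one below is only an example):

from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
output = client.text_generation("The capital of France is", details=True, max_new_tokens=5)
# Before this patch, `output.details` was always None when details=True
print(output.details.finish_reason)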
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.23.2...v0.23.3
Published by Wauplin 5 months ago
`split_state_dict_into_shards_factory` now accepts string values as `max_shard_size` (e.g. `"5MB"`), in addition to integer values. Related PR: https://github.com/huggingface/huggingface_hub/pull/2286.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.23.1...v0.23.2
Published by Wauplin 5 months ago
See https://github.com/huggingface/huggingface_hub/pull/2271 for more details.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.23.0...v0.23.1
Published by Wauplin 6 months ago
The `0.23.0` release comes with a big revamp of the download process, especially when it comes to downloading to a local directory. Previously, the process still involved the cache directory and symlinks, which led to misconceptions and a suboptimal user experience. The new workflow involves a `.cache/huggingface/` folder, similar to the `.git/` one, that keeps track of the progress of a download. The main features are:
Example: download the q4 GGUF file for microsoft/Phi-3-mini-4k-instruct-gguf:

# Download q4 GGUF file from microsoft/Phi-3-mini-4k-instruct-gguf
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-gguf Phi-3-mini-4k-instruct-q4.gguf --local-dir=data/phi3
With this addition, interrupted downloads are now resumable! This applies to downloads in both local and cache directories, which should greatly improve UX for users with slow/unreliable connections. In this regard, the `resume_download` parameter is now deprecated (no longer relevant).
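The same download can be done from Python (a sketch mirroring the CLI command above):

from huggingface_hub import hf_hub_download

# Mirrors the CLI example above: no symlinks, download progress tracked
# in data/phi3/.cache/huggingface/, and interrupted downloads resume.
hf_hub_download(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="Phi-3-mini-4k-instruct-q4.gguf",
    local_dir="data/phi3",
)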
- `.huggingface/` folder to `.cache/huggingface/` by @Wauplin in #2262

InferenceClient

It is now possible to provide a list of tools when chatting with a model using the `InferenceClient`! This major improvement has been made possible thanks to TGI, which handles them natively.
>>> from huggingface_hub import InferenceClient
# Ask for weather in the next days using tools
>>> client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")
>>> messages = [
... {"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."},
... {"role": "user", "content": "What's the weather like the next 3 days in San Francisco, CA?"},
... ]
>>> tools = [
... {
... "type": "function",
... "function": {
... "name": "get_current_weather",
... "description": "Get the current weather",
... "parameters": {
... "type": "object",
... "properties": {
... "location": {
... "type": "string",
... "description": "The city and state, e.g. San Francisco, CA",
... },
... "format": {
... "type": "string",
... "enum": ["celsius", "fahrenheit"],
... "description": "The temperature unit to use. Infer this from the users location.",
... },
... },
... "required": ["location", "format"],
... },
... },
... },
... ...
... ]
>>> response = client.chat_completion(
... model="meta-llama/Meta-Llama-3-70B-Instruct",
... messages=messages,
... tools=tools,
... tool_choice="auto",
... max_tokens=500,
... )
>>> response.choices[0].message.tool_calls[0].function
ChatCompletionOutputFunctionDefinition(
arguments={
'location': 'San Francisco, CA',
'format': 'fahrenheit',
'num_days': 3
},
name='get_n_day_weather_forecast',
description=None
)
It is also possible to provide grammar rules to the `text_generation` task. This ensures that the output follows a precise JSON Schema specification or matches a regular expression. For more details about it, check out the Guidance guide from Text-Generation-Inference docs.
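A hedged sketch of what grammar-constrained generation can look like (the model is only an example; the grammar payload below follows TGI's Guidance format and is passed here as a raw dict, though the typed `TextGenerationInputGrammarType` dataclass may be used instead):

from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")
# Constrain the output to a JSON object matching this (example) schema
response = client.text_generation(
    "I saw a puppy, a cat and a raccoon during my bike ride in the park.",
    max_new_tokens=100,
    grammar={
        "type": "json",
        "value": {
            "type": "object",
            "properties": {"animals": {"type": "array", "items": {"type": "string"}}},
            "required": ["animals"],
        },
    },
)
print(response)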
Mention the `chat-completion` task more, instead of `conversational`, in the documentation.
- `chat_completion` and remove `conversational` from Inference guide by @Wauplin in #2215

`chat-completion` relies on server-side rendering in all cases, including when the model is `transformers`-backed. Previously this was only the case for TGI-backed models, and templates were rendered client-side otherwise. Improved logic to determine whether a model is served via TGI or `transformers`.
- Raise error in chat completion when unprocessable by @Wauplin in #2257
- Document more chat_completion by @Wauplin in #2260
The PseudoLab team is a non-profit dedicated to making AI more accessible to the Korean-speaking community. In the past few weeks, their team of contributors managed to translate (almost) the entire `huggingface_hub` documentation. Huge shout-out for the coordination on this task! Documentation can be accessed here.
- `guides/webhooks_server.md` to Korean by @nuatmochoi in #2145
- `reference/login.md` to Korean by @SeungAhSon in #2151
- `package_reference/tensorboard.md` to Korean by @fabxoe in #2173
- `package_reference/inference_client.md` to Korean by @cjfghk5697 in #2178
- `reference/inference_endpoints.md` to Korean by @harheem in #2180
- `package_reference/file_download.md` to Korean by @seoyoung-3060 in #2184
- `package_reference/cache.md` to Korean by @nuatmochoi in #2191
- `package_reference/collections.md` to Korean by @boyunJang in #2214
- `package_reference/inference_types.md` to Korean by @fabxoe in #2171
- `guides/upload.md` to Korean by @junejae in #2139
- `reference/repository.md` to Korean by @junejae in #2189
- `package_reference/space_runtime.md` to Korean by @boyunJang in #2213
- `guides/repository.md` to Korean by @cjfghk5697 in #2124
- `guides/model_cards.md` to Korean by @SeungAhSon in #2128
- `guides/community.md` to Korean by @seoulsky-field in #2126
- `guides/cli.md` to Korean by @harheem in #2131
- `guides/search.md` to Korean by @seoyoung-3060 in #2134
- `guides/inference.md` to Korean by @boyunJang in #2130
- `guides/manage-spaces.md` to Korean by @boyunJang in #2220
- `guides/hf_file_system.md` to Korean by @heuristicwave in #2146
- `package_reference/hf_api.md` to Korean by @fabxoe in #2165
- `package_reference/mixins.md` to Korean by @fabxoe in #2166
- `guides/inference_endpoints.md` to Korean by @usr-bin-ksh in #2164
- `package_reference/utilities.md` to Korean by @cjfghk5697 in #2196

@bilgehanertan added support for 2 new routes:
- `get_user_overview` to retrieve high-level information about a user: username, avatar, number of models/datasets/Spaces, number of likes and upvotes, number of interactions in discussions, etc.

@bilgehanertan added a new command to the CLI to handle tags. It is now possible to:
>>> huggingface-cli tag Wauplin/my-cool-model v1.0
You are about to create tag v1.0 on model Wauplin/my-cool-model
Tag v1.0 created on Wauplin/my-cool-model
>>> huggingface-cli tag Wauplin/gradio-space-ci -l --repo-type space
Tags for space Wauplin/gradio-space-ci:
0.2.2
0.2.1
0.2.0
0.1.2
0.0.2
0.0.1
>>> huggingface-cli tag -d Wauplin/my-cool-model v1.0
You are about to delete tag v1.0 on model Wauplin/my-cool-model
Proceed? [Y/n] y
Tag v1.0 deleted on Wauplin/my-cool-model
For more details, check out the CLI guide.
This `ModelHubMixin` got a set of nice improvements to generate model cards and handle custom data types in the `config.json` file. More info in the integration guide.
- `ModelHubMixin`: more metadata + arbitrary config types + proper guide by @Wauplin in #2230

In a shared environment, it is now possible to set a custom path `HF_TOKEN_PATH` as an environment variable so that each user of the cluster has their own access token.
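For example (a sketch; the token path is a placeholder, and the variable must be set before `huggingface_hub` is imported):

import os

# Point huggingface_hub to a per-user token file (placeholder path).
os.environ["HF_TOKEN_PATH"] = "/shared/cluster/tokens/alice"

from huggingface_hub import get_token
print(get_token())  # token is now read from the custom path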
- `HF_TOKEN_PATH` as environment variable by @Wauplin in #2185

Thanks to @Y4suyuki and @lappemic, most custom errors defined in `huggingface_hub` are now aggregated in the same module. This makes it very easy to import them with `from huggingface_hub.errors import ...`.
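For example (a sketch, assuming `RepositoryNotFoundError` is among the aggregated errors):

from huggingface_hub import model_info
from huggingface_hub.errors import RepositoryNotFoundError  # assumed to be aggregated here

try:
    model_info("a-user/a-repo-that-does-not-exist")
except RepositoryNotFoundError as err:
    print(f"Repository not found: {err}")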
Fixed `HFSummaryWriter` (class to seamlessly log tensorboard events to the Hub) to work with either the `tensorboardX` or `torch.utils` implementation, depending on the user setup.
The speed of listing files using `HfFileSystem` has been drastically improved, thanks to @awgr. The values returned from the cache are no longer deep-copied, which was unfortunately the part taking the most time in the process. If users want to modify values returned by `HfFileSystem`, they need to copy them beforehand. This is expected to be a very limited drawback.
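In practice this means copying cached values before mutating them (a sketch using a public repo):

import copy
from huggingface_hub import HfFileSystem

fs = HfFileSystem()
info = fs.info("gpt2/config.json")  # may be served from cache, no longer deep-copied
my_info = copy.deepcopy(info)       # copy before modifying to avoid corrupting the cache
my_info["note"] = "safe to edit the copy"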
Progress bars in `huggingface_hub` got some flexibility! It is now possible to provide a name to a tqdm bar (similar to `logging.getLogger`) and to enable/disable only some progress bars. More details in this guide.
>>> from huggingface_hub.utils import tqdm, disable_progress_bars
>>> disable_progress_bars("peft.foo")
# No progress bars for `peft.foo.bar`
>>> for _ in tqdm(range(5), name="peft.foo.bar"):
... pass
# But for `peft` yes
>>> for _ in tqdm(range(5), name="peft"):
... pass
100%|█████████████████| 5/5 [00:00<00:00, 117817.53it/s]
--local-dir-use-symlinks and --resume-download

As part of the download process revamp, some breaking changes have been introduced. However, we believe that the benefits outweigh the change cost. Breaking changes include:
- A `.cache/huggingface/` folder is now present at the root of the local dir. It only contains file locks, metadata and partially downloaded files. If you need to, you can safely delete this folder without corrupting the data inside the root folder. However, you should expect a longer recovery time if you try to re-run your download command.
- `--local-dir-use-symlinks` is not used anymore and will be ignored. It is no longer possible to symlink your local dir with the cache directory. Thanks to the `.cache/huggingface/` folder, it shouldn't be needed anyway.
- `--resume-download` has been deprecated and will be ignored. Resuming failed downloads is now activated by default all the time. If you need to force a new download, use `--force-download`.
As part of #2237 (Grammar and Tools support), we've updated the return values from `InferenceClient.chat_completion` and `InferenceClient.text_generation` to exactly match TGI's output. The attributes of the returned objects did not change, but the class definitions themselves did. Expect errors if you've previously had `from huggingface_hub import TextGenerationOutput` in your code. This is however not the common usage, since those objects are already instantiated by `huggingface_hub` directly.
Some other breaking changes were expected (and announced since 0.19.x):
- `list_files_info` is definitively removed in favor of `get_paths_info` and `list_repo_tree`
- `WebhookServer.run` is definitively removed in favor of `WebhookServer.launch`
- `api_endpoint` in the `ModelHubMixin` `push_to_hub` method is definitively removed in favor of the `HF_ENDPOINT` environment variable

Check #2156 for more details.
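For example, a listing previously done with `list_files_info` can be done like this (a small sketch using a public repo):

from huggingface_hub import list_repo_tree

# Replaces the removed `list_files_info`
for entry in list_repo_tree("gpt2"):
    # RepoFile entries have a size; RepoFolder entries don't
    print(entry.path, getattr(entry, "size", None))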
- `hf_file_system` by @Wauplin in #2253
- `updatedRefs` in WebhookPayload by @Wauplin in #2169
- `TestHfHubDownloadRelativePaths` + implicit delete folder is ok by @Wauplin in #2259

The following contributors have made significant changes to the library over the last release:
- `guides/community.md` to Korean (#2126)
- `guides/hf_file_system.md` to Korean (#2146)
- `guides/inference_endpoints.md` to Korean (#2164)

Published by Wauplin 7 months ago
Published by Wauplin 7 months ago
Fixed a bug breaking the SetFit integration.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.22.0...v0.22.1
Published by Wauplin 7 months ago
Discuss the release in our Community Tab. Feedback is welcome!! 🤗
Support for inference tools continues to improve in `huggingface_hub`. On the menu in this release? A new `chat_completion` API and fully typed inputs/outputs!
A long-awaited API has just landed in `huggingface_hub`! `InferenceClient.chat_completion` follows most of OpenAI's API, making it much easier to integrate with existing tools.

Technically speaking, it uses the same backend as the `text-generation` task but requires a preprocessing step to format the list of messages into a single text prompt. The chat template is rendered server-side when models are powered by TGI, which is the case for most LLMs: Llama, Zephyr, Mistral, Gemma, etc. Otherwise, the templating happens client-side, which requires the `minijinja` package to be installed. We are actively working on bridging this gap, aiming to render all templates server-side in the future.
>>> from huggingface_hub import InferenceClient
>>> messages = [{"role": "user", "content": "What is the capital of France?"}]
>>> client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
# Batch completion
>>> client.chat_completion(messages, max_tokens=100)
ChatCompletionOutput(
choices=[
ChatCompletionOutputChoice(
finish_reason='eos_token',
index=0,
message=ChatCompletionOutputChoiceMessage(
content='The capital of France is Paris. The official name of the city is "Ville de Paris" (City of Paris) and the name of the country\'s governing body, which is located in Paris, is "La République française" (The French Republic). \nI hope that helps! Let me know if you need any further information.'
)
)
],
created=1710498360
)
# Stream new tokens one by one
>>> for token in client.chat_completion(messages, max_tokens=10, stream=True):
... print(token)
ChatCompletionStreamOutput(choices=[ChatCompletionStreamOutputChoice(delta=ChatCompletionStreamOutputDelta(content='The', role='assistant'), index=0, finish_reason=None)], created=1710498504)
ChatCompletionStreamOutput(choices=[ChatCompletionStreamOutputChoice(delta=ChatCompletionStreamOutputDelta(content=' capital', role='assistant'), index=0, finish_reason=None)], created=1710498504)
(...)
ChatCompletionStreamOutput(choices=[ChatCompletionStreamOutputChoice(delta=ChatCompletionStreamOutputDelta(content=' may', role='assistant'), index=0, finish_reason=None)], created=1710498504)
ChatCompletionStreamOutput(choices=[ChatCompletionStreamOutputChoice(delta=ChatCompletionStreamOutputDelta(content=None, role=None), index=0, finish_reason='length')], created=1710498504)
- `InferenceClient.chat_completion` + use new types for text-generation by @Wauplin in #2094
We are currently working towards more consistency in tasks definitions across the Hugging Face ecosystem. This is no easy job but a major milestone has recently been achieved! All inputs and outputs of the main ML tasks are now fully specified as JSONschema objects. This is the first brick needed to have consistent expectations when running inference across our stack: transformers (Python), transformers.js (Typescript), Inference API (Python), Inference Endpoints (Python), Text Generation Inference (Rust), Text Embeddings Inference (Rust), InferenceClient (Python), Inference.js (Typescript), etc.
Integrating those definitions will require more work, but `huggingface_hub` is one of the first tools to integrate them. As a start, all `InferenceClient` return values are now typed dataclasses. Furthermore, typed dataclasses have been generated for all tasks' inputs and outputs. This means you can now integrate them in your own library to ensure consistency with the Hugging Face ecosystem. Specifications are open-source (see here), meaning anyone can access and contribute to them. Python's generated classes are documented here.
Here is a short example showcasing the new output types:
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()
>>> client.object_detection("people.jpg")
[
ObjectDetectionOutputElement(
score=0.9486683011054993,
label='person',
box=ObjectDetectionBoundingBox(xmin=59, ymin=39, xmax=420, ymax=510)
),
...
]
Note that those dataclasses are backward-compatible with the dict-based interface that was previously in use. In the example above, both `ObjectDetectionBoundingBox(...).xmin` and `ObjectDetectionBoundingBox(...)["xmin"]` are correct, even though the former should be the preferred solution from now on.
`ModelHubMixin` is an object that can be used as a parent class for the objects in your library in order to provide built-in serialization methods to upload and download pretrained models from the Hub. This mixin is adapted into a `PyTorchModelHubMixin` that can serialize and deserialize any PyTorch model. The 0.22 release brings its share of improvements to these classes:
- A model card is now automatically generated, with default tags (`model_hub_mixin`) and custom tags from the library. You can extend/modify this model card by overwriting the `generate_model_card` method.
method.>>> import torch
>>> import torch.nn as nn
>>> from huggingface_hub import PyTorchModelHubMixin
# Define your Pytorch model exactly the same way you are used to
>>> class MyModel(
... nn.Module,
... PyTorchModelHubMixin, # multiple inheritance
... library_name="keras-nlp",
... tags=["keras"],
... repo_url="https://github.com/keras-team/keras-nlp",
... docs_url="https://keras.io/keras_nlp/",
... # ^ optional metadata to generate model card
... ):
... def __init__(self, hidden_size: int = 512, vocab_size: int = 30000, output_size: int = 4):
... super().__init__()
... self.param = nn.Parameter(torch.rand(hidden_size, vocab_size))
...         self.linear = nn.Linear(vocab_size, output_size)  # maps vocab_size -> output_size so forward() is shape-consistent
... def forward(self, x):
... return self.linear(x + self.param)
# 1. Create model
>>> model = MyModel(hidden_size=128)
# Config is automatically created based on input + default values
>>> model._hub_mixin_config
{"hidden_size": 128, "vocab_size": 30000, "output_size": 4}
# 2. (optional) Save model to local directory
>>> model.save_pretrained("path/to/my-awesome-model")
# 3. Push model weights to the Hub
>>> model.push_to_hub("my-awesome-model")
# 4. Initialize model from the Hub => config has been preserved
>>> model = MyModel.from_pretrained("username/my-awesome-model")
>>> model._hub_mixin_config
{"hidden_size": 128, "vocab_size": 30000, "output_size": 4}
# Model card has been correctly populated
>>> from huggingface_hub import ModelCard
>>> card = ModelCard.load("username/my-awesome-model")
>>> card.data.tags
["keras", "pytorch_model_hub_mixin", "model_hub_mixin"]
>>> card.data.library_name
"keras-nlp"
For more details on how to integrate these classes, check out the integration guide.
- `ModelHubMixin`: pass config when `__init__` accepts **kwargs by @Wauplin in #2058
- `PytorchModelHubMixin` by @Wauplin in #2079
- `ModelHubMixin` by @Wauplin in #2080
`HfFileSystem` download speed was limited by some internal logic in `fsspec`. We've now updated the `get_file` and `read` implementations to improve their download speed to a level similar to `hf_hub_download`.
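A sketch of a download benefitting from the speedup (standard fsspec calls; the repo file below is only an example):

from huggingface_hub import HfFileSystem

fs = HfFileSystem()
# Standard fsspec calls now download at speeds similar to hf_hub_download
fs.get_file("gpt2/config.json", "config.json")  # copy a repo file locally
with fs.open("gpt2/config.json", "r") as f:     # or stream it directly
    print(f.read())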
We are aiming to move all errors raised by `huggingface_hub` into a single module, `huggingface_hub.errors`, to ease the developer experience. This work has been started as a community contribution from @Y4suyuki.
The `HfApi` class now accepts a `headers` parameter that is then passed to every HTTP call made to the Hub.
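For example (a sketch; the custom header below is a placeholder):

from huggingface_hub import HfApi

# The placeholder header is sent with every request made by this client
api = HfApi(headers={"X-My-Tracking-Header": "my-value"})
info = api.model_info("gpt2")  # the custom header is sent with this call too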
📚 More documentation in Korean!
- `package_reference/overview.md` to Korean by @jungnerd in #2113
The new types returned by `InferenceClient` methods should be backward compatible, especially to access values either as attributes (`.my_field`) or as items (i.e. `["my_field"]`). However, dataclasses and dicts do not always behave exactly the same, so you might notice some breaking changes. Those breaking changes should be very limited.
`ModelHubMixin` internals changed quite a bit, breaking some use cases. We don't think those use cases were in use, and changing them should really benefit 99% of integrations. If you witness any inconsistency or error in your integration, please let us know and we will do our best to mitigate the problem. One of the biggest changes is that the config values are no longer attached to the mixin instance as `instance.config` but as `instance._hub_mixin_config`. The `.config` attribute was mistakenly introduced in `0.20.x`, so we hope it has not been used much yet.
`huggingface_hub.file_download.http_user_agent` has been removed in favor of the officially documented `huggingface_hub.utils.build_hf_headers`. It had been deprecated since `0.18.x`.
The CI pipeline has been greatly improved, especially thanks to the efforts from @bmuskalla. Most tests now pass in under 3 minutes, against 8 to 10 minutes previously. Some long-running tests have been greatly simplified, and all tests now run in parallel with `pytest-xdist`, thanks to a complete decorrelation between them.

We are now also using the great `uv` installer instead of `pip` in our CI, which saves around 30-40s per pipeline.
- `pytest-xdist` on all tests by @bmuskalla in #2059
The following contributors have made significant changes to the library over the last release:
Published by Wauplin 8 months ago
Release v0.21 introduced a breaking change making it impossible to save a `PyTorchModelHubMixin`-based model that has shared tensors. This has been fixed in https://github.com/huggingface/huggingface_hub/pull/2086.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.21.3...v0.21.4
Published by Wauplin 8 months ago
More details in https://github.com/huggingface/huggingface_hub/pull/2058.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.21.2...v0.21.3
Published by Wauplin 8 months ago
See https://github.com/huggingface/huggingface_hub/pull/2056. (+https://github.com/huggingface/huggingface_hub/pull/2050 shipped as v0.21.1).
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.21.0...v0.21.2
Published by Wauplin 8 months ago
Discuss the release in our Community Tab. Feedback welcome!! 🤗
All objects returned by the `HfApi` client are now dataclasses! In the past, objects were either dataclasses, typed dictionaries, non-typed dictionaries, or even basic classes. This is now all harmonized, with the goal of improving the developer experience.

Kudos goes to the community for the implementation and testing of all the harmonization process. Thanks again for the contributions!
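For example, a model returned by `model_info` can now be inspected through attributes (a small sketch using a public model):

from huggingface_hub import HfApi

api = HfApi()
model = api.model_info("gpt2")
# Returned objects are dataclasses: attribute access with IDE completion
print(model.id, model.downloads, model.likes)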
The `HfFileSystem` class implements the `fsspec` interface to allow loading and writing files with a filesystem-like interface. The interface is heavily used by the `datasets` library, and this release will further improve the efficiency and robustness of the integration.
- `rm` on branch by @lhoestq in #1957
- `HfFileSystem` by @mariosasko in #1981
- `HfFileSystem.url` method by @mariosasko in #2027
The `PyTorchModelHubMixin` class lets you upload ANY PyTorch model to the Hub in a few lines of code. More precisely, it is a class that can be inherited in any `nn.Module` class to add the `from_pretrained`, `save_pretrained` and `push_to_hub` helpers to your class. It handles serialization and deserialization of weights and configs for you and enables download counts on the Hub.

With this release, we've fixed 2 pain points holding back users from using this lib:
- Models are now saved as `.safetensors` files instead of PyTorch pickles, for safety reasons. Loading from previous PyTorch pickles is still supported, but we are moving toward completely deprecating them (in a mid-to-long-term plan).
- `PyTorchModelHubMixin` by @bmuskalla in #2033
The audio-to-audio task is now supported by the `InferenceClient`!
>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()
>>> audio_output = client.audio_to_audio("audio.flac")
>>> for i, item in enumerate(audio_output):
...     with open(f"output_{i}.flac", "wb") as f:
...         f.write(item["blob"])
Also fixed a few things:
With the aim of harmonizing repo structures and file serialization on the Hub, we added a new module, `serialization`, with a first helper, `split_state_dict_into_shards`, that takes a state dict and splits it into shards. The code implementation is mostly taken from `transformers` and aims to be reused by other libraries in the ecosystem. It seamlessly supports `torch`, `tensorflow` and `numpy` weights, and can be easily extended to other frameworks.
This is a first step in the harmonization process and more loading/saving helpers will be added soon.
- `split_state_dict_into_shards` helper by @Wauplin in #1938
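A hedged sketch of the sharding helper in action (assuming the torch-specific variant `split_torch_state_dict_into_shards` built on this factory; attribute names are from the serialization reference):

import torch
from huggingface_hub import split_torch_state_dict_into_shards

state_dict = {
    "layer1.weight": torch.rand(1024, 1024),  # ~4MB in float32
    "layer2.weight": torch.rand(1024, 1024),  # ~4MB in float32
}
split = split_torch_state_dict_into_shards(state_dict, max_shard_size="5MB")
print(split.is_sharded)                 # True: both tensors don't fit in one 5MB shard
print(list(split.filename_to_tensors))  # mapping of shard files to tensor names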
The community is actively getting the job done to translate `huggingface_hub` into other languages. We now have docs available in Simplified Chinese (here) and in French (here) to help democratize good machine learning!
- `base_model` in modelcard metadata by @Wauplin in #1936
- `hf_transfer` extra into `setup.py` and `docs/` by @jamesbraza in #1970
- `download --repo-type` by @jamesbraza in #1986
- `get_safetensors_metadata` docstring by @Wauplin in #1951
Creating a commit with an invalid README will fail early instead of uploading all LFS files before failing to commit.
Added a `revision_exists` helper, working similarly to `repo_exists` and `file_exists`:
>>> from huggingface_hub import revision_exists
>>> revision_exists("google/gemma-7b", "float16")
True
>>> revision_exists("google/gemma-7b", "not-a-revision")
False
- `revision_exists` helper by @Wauplin in #2042

`InferenceEndpoint.wait(...)` now raises an error if the endpoint is in a failed state.
Improved progress bar when downloading a file.
Other stuff:
`ModelFilter` and `DatasetFilter` are deprecated when listing models and datasets, in favor of a simpler API that lets you pass the parameters directly to `list_models` and `list_datasets`.

>>> from huggingface_hub import list_models, ModelFilter
# use
>>> list_models(language="zh")
# instead of
>>> list_models(filter=ModelFilter(language="zh"))
Cleaner, right? `ModelFilter` and `DatasetFilter` will still be supported until the `v0.24` release.
`ModelStatus.compute_type` is not a string anymore but a dictionary with more detailed information (instance type + number of replicas). This breaking change reflects a server-side update.
- `warnings.warn` in repocard.py by @Wauplin in #1980
- `force_download=True` by @scruel in #1983
- `setup.cfg` to `pyproject.toml` by @jamesbraza in #1971
- `pre-commit` by @jamesbraza in #1987
- `toml-sort` tool by @jamesbraza in #1972
The following contributors have made significant changes to the library over the last release:
Published by Wauplin 9 months ago
This patch release fixes an issue when retrieving the locally saved token using `huggingface_hub.HfFolder.get_token`. For the record, this is a "planned to be deprecated" method, in favor of `huggingface_hub.get_token`, which is more robust and versatile. The issue came from a breaking change introduced in https://github.com/huggingface/huggingface_hub/pull/1895, meaning only `0.20.x` is affected.
For more details, please refer to https://github.com/huggingface/huggingface_hub/pull/1966.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.20.2...v0.20.3
Published by Wauplin 10 months ago
A concurrency issue when using `userdata.get` to retrieve the `HF_TOKEN` token led to deadlocks when downloading files in parallel. This hot-fix release fixes the issue by using a global lock before trying to get the token from the secrets vault. More details in https://github.com/huggingface/huggingface_hub/pull/1953.
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.20.1...v0.20.2
Published by Wauplin 10 months ago
This hot-fix release fixes a circular import error happening when importing the `login` or `logout` helpers from `huggingface_hub`.
Related PR: https://github.com/huggingface/huggingface_hub/pull/1930
Full Changelog: https://github.com/huggingface/huggingface_hub/compare/v0.20.0...v0.20.1