Efficient few-shot learning with Sentence Transformers
This is a patch release with two notable fixes and a feature:
* Warmup steps are now computed from the actual number of training steps rather than `args.max_steps` if `args.max_steps` > the number of steps. This prevents accidentally being in warm-up for longer than the desired warmup proportion.
* `SetFitModel.labels` is now automatically set based on the training dataset if this variable hasn't been defined yet.
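For reference, both knobs are exposed on the v1.x API. A minimal sketch (the checkpoint and label names are purely illustrative):

```python
from setfit import SetFitModel, TrainingArguments

# Labels can still be set explicitly; otherwise they are now filled in for you
# during training, as described above (checkpoint and labels are illustrative).
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    labels=["negative", "positive"],
)

# Warmup length is derived from `warmup_proportion`; `max_steps` caps training.
args = TrainingArguments(max_steps=100, warmup_proportion=0.1)
```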
Full Changelog: https://github.com/huggingface/setfit/compare/v1.0.2...v1.0.3
Published by tomaarsen 9 months ago
Full Changelog: https://github.com/huggingface/setfit/compare/v1.0.1...v1.0.2
Published by tomaarsen 11 months ago
This release heavily refactors the SetFit trainer and introduces several much-requested features.
Read the v1.0.0 Migration Guide in the documentation: https://hf.co/docs/setfit/how_to/v1.0.0_migration_guide
Read the more detailed release notes in the documentation: https://huggingface.co/docs/setfit/how_to/v1.0.0_migration_guide#v100-changelog
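For orientation, the refactor replaces `SetFitTrainer` with `Trainer` plus a separate `TrainingArguments` object. A minimal sketch, adapted from the pre-1.0 example further down (dataset, checkpoint and argument values are illustrative):

```python
from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Load and subsample a dataset (choices here are purely illustrative)
dataset = load_dataset("sst2")
train_dataset = dataset["train"].shuffle(seed=42).select(range(16))
eval_dataset = dataset["validation"]

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

# Training hyperparameters now live in a dedicated TrainingArguments object
args = TrainingArguments(batch_size=16, num_epochs=1)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric="accuracy",
    column_mapping={"sentence": "text", "label": "label"},
)
trainer.train()
metrics = trainer.evaluate()
```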
* `sample_dataset` by @grofte in https://github.com/huggingface/setfit/pull/396
* `trainer.evaluate()` by @grofte in https://github.com/huggingface/setfit/pull/402
* Refactor `Trainer` & `TrainingArguments`, add SetFit ABSA by @tomaarsen in https://github.com/huggingface/setfit/pull/265
* Pass `metric_kwargs` to custom metric callable by @tomaarsen in https://github.com/huggingface/setfit/pull/456
* `Trainer`, `TrainingArguments`, SetFitABSA, logging, evaluation during training, callbacks, docs by @tomaarsen in https://github.com/huggingface/setfit/pull/439
Full Changelog: https://github.com/huggingface/setfit/compare/v0.7.0...v1.0.0
Published by tomaarsen over 1 year ago
This release introduces numerous bug fixes, including critical ones for `push_to_hub`, `save_pretrained` and distillation training.
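For reference, the affected entry points are used roughly like this (a minimal sketch; the checkpoint and repository names are placeholders):

```python
from setfit import SetFitModel

# Load a fine-tuned SetFit model (placeholder checkpoint name)
model = SetFitModel.from_pretrained("lewtun/my-awesome-setfit-model")

# Save to a local directory, which goes through `_save_pretrained`
model.save_pretrained("my-setfit-model")

# Or push directly to the Hugging Face Hub (requires being logged in)
model.push_to_hub("my-awesome-setfit-model")
```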
* Fix `_save_pretrained`, resolve `TypeError: unsupported operand type(s) for +: 'PosixPath' and 'str'` by @tomaarsen in #332
* Use `cls` instead by @kobiche in #341
* Fix `model.predict` when using string labels by @tomaarsen in #331
* Pin `pandas` to <2 for compatibility tests by @tomaarsen in #350
* Update `Trainer.push_to_hub` to use `**kwargs` by @tomaarsen in #351
Published by lewtun over 1 year ago
To bring in the new year, this release comes with many bug fixes and quality-of-life improvements around using SetFit models. It also provides:
* See the `notebooks` folder for an example.
* A new model card when using `push_to_hub()`: https://huggingface.co/lewtun/setfit-new-model-card
* `sample_dataset` by @tomaarsen in #231
* `trainer.py` by @danielkorat in #243
* `scripts/setfit/run_fewshot.py` by @tomaarsen in #262
* Fix `SentenceTransformer` resetting devices after moving a `SetFitModel` by @tomaarsen in #283
* Add `run_zeroshot.py`; add functionality to `data.get_templated_dataset()` (formerly `add_templated_examples()`) by @danielkorat in #292 (usage sketch below)
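A rough sketch of the renamed helper from #292, assuming the `candidate_labels`, `template` and `sample_size` arguments from the current docs (the label set and template here are made up):

```python
from setfit.data import get_templated_dataset

# Build a small synthetic training set from label names and a prompt template
train_dataset = get_templated_dataset(
    candidate_labels=["negative", "positive"],
    template="This sentence is {}",
    sample_size=8,
)
print(train_dataset)
```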
Published by lewtun almost 2 years ago
This release comes with two main features:
* A `DistillationSetFitTrainer` class that allows users to use unlabeled data to significantly boost the performance of small models like MiniLM. See this workshop for an end-to-end example; a usage sketch also follows below.
* The ability to export `SetFit` model instances into ONNX graphs for downstream inference + optimisation. Check out the `notebooks` folder for an end-to-end example.
Kudos to @orenpereg and @nbertagnolli for implementing both of these features 🔥
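A usage sketch for the distillation trainer (the teacher/student checkpoints and the unlabeled subset are illustrative; the keyword arguments follow the README-style example):

```python
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import DistillationSetFitTrainer, SetFitModel

# Unlabeled data: here we simply take raw sentences from sst2 (illustrative)
unlabeled_train_dataset = load_dataset("sst2")["train"].shuffle(seed=0).select(range(500))

# Teacher: an already fine-tuned SetFit model; student: a small, fast model
teacher_model = SetFitModel.from_pretrained("lewtun/my-awesome-setfit-model")
student_model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-MiniLM-L3-v2")

distillation_trainer = DistillationSetFitTrainer(
    teacher_model=teacher_model,
    student_model=student_model,
    train_dataset=unlabeled_train_dataset,
    loss_class=CosineSimilarityLoss,
    batch_size=16,
    num_iterations=20,
)
distillation_trainer.train()
```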
* `num_iterations` by @PhilipMay in #215
* `scripts/setfit/run_fewshot.py`, add warning for class imbalance w. accuracy by @tomaarsen in #204
Published by lewtun almost 2 years ago
Fixes an issue on Google Colab, where the default Python version (3.7) is incompatible with the `Literal` type. See #162 for more details.
Published by lewtun almost 2 years ago
@blakechi has implemented a differentiable head in PyTorch for `SetFitModel` that enables the model to be trained end-to-end. The implementation is backwards compatible with the scikit-learn heads and can be activated by setting `use_differentiable_head=True` when loading `SetFitModel`. Here's a full example:
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, SetFitTrainer
# Load a dataset from the Hugging Face Hub
dataset = load_dataset("sst2")
# Simulate the few-shot regime by sampling 8 examples per class
num_classes = 2
train_dataset = dataset["train"].shuffle(seed=42).select(range(8 * num_classes))
eval_dataset = dataset["validation"]
# Load a SetFit model from Hub
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    use_differentiable_head=True,
    head_params={"out_features": num_classes},
)
# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss_class=CosineSimilarityLoss,
    metric="accuracy",
    batch_size=16,
    num_iterations=20,  # The number of text pairs to generate for contrastive learning
    num_epochs=1,  # The number of epochs to use for contrastive learning
    column_mapping={"sentence": "text", "label": "label"},  # Map dataset columns to text/label expected by trainer
)
# Train and evaluate
trainer.freeze() # Freeze the head
trainer.train() # Train only the body
# Unfreeze the head and freeze the body -> head-only training
trainer.unfreeze(keep_body_frozen=True)
# or
# Unfreeze the head and unfreeze the body -> end-to-end training
trainer.unfreeze(keep_body_frozen=False)
trainer.train(
    num_epochs=25,  # The number of epochs to train the head or the whole model (body and head)
    batch_size=16,
    body_learning_rate=1e-5,  # The body's learning rate
    learning_rate=1e-2,  # The head's learning rate
    l2_weight=0.0,  # Weight decay on **both** the body and head. If `None`, will use 0.01.
)
metrics = trainer.evaluate()
# Push model to the Hub
trainer.push_to_hub("my-awesome-setfit-model")
# Download from Hub and run inference
model = SetFitModel.from_pretrained("lewtun/my-awesome-setfit-model")
# Run inference
preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
* Fix the `compute()` method called by `trainer.evaluate()` by @mpangrazzi in #125
* Fix `loss_class` issue by @PhilipMay in #154
Published by lewtun about 2 years ago
This release includes improvements to the `hyperparameter_search()` function of `SetFitTrainer`, along with several small fixes in saving fine-tuned models.
Thanks to @sanderland, @bradleyfowler123 and @Mouhanedg56 for their contributions 🤗!
Published by lewtun about 2 years ago
This release comes with two main features:
* `optuna` integration to run hyperparameter search on both the `SetFitModel` head and the hyperparameters used during training (see the example sketch below).
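A sketch of the optuna-backed search, following the README-style usage (the dataset, search space and trial count are illustrative; `apply_hyperparameters` is the documented follow-up step in later versions):

```python
from datasets import load_dataset
from setfit import SetFitModel, SetFitTrainer

dataset = load_dataset("sst2")
train_dataset = dataset["train"].shuffle(seed=42).select(range(16))
eval_dataset = dataset["validation"]

# Hyperparameters returned by `hp_space` that are not training arguments are
# forwarded to `model_init`, so the classification head can be tuned as well.
def model_init(params):
    params = params or {}
    head_params = {"max_iter": params.get("max_iter", 100), "solver": params.get("solver", "liblinear")}
    return SetFitModel.from_pretrained(
        "sentence-transformers/paraphrase-mpnet-base-v2", head_params=head_params
    )

def hp_space(trial):  # `trial` is an optuna Trial object
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "num_epochs": trial.suggest_int("num_epochs", 1, 3),
        "batch_size": trial.suggest_categorical("batch_size", [16, 32]),
        "max_iter": trial.suggest_int("max_iter", 50, 300),
        "solver": trial.suggest_categorical("solver", ["lbfgs", "liblinear"]),
    }

trainer = SetFitTrainer(
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    model_init=model_init,
    column_mapping={"sentence": "text", "label": "label"},
)
best_run = trainer.hyperparameter_search(direction="maximize", hp_space=hp_space, n_trials=10)
trainer.apply_hyperparameters(best_run.hyperparameters, final_model=True)
trainer.train()
```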
Published by lewtun about 2 years ago
Fixes a bug where the column mapping checks threw an error when a column mapping wasn't provided for datasets with valid column names.
See #82 for more details.
Published by lewtun about 2 years ago
The `SetFitTrainer` assumes that the training and evaluation datasets contain `text` and `label` columns. Previously, this required users to manually rename their dataset columns before creating the trainer. In #75 we added support for users to specify the column mapping directly in the trainer.
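A minimal sketch of that option (dataset and checkpoint are illustrative; sst2's "sentence" column is mapped to the expected "text" column, while "label" already matches):

```python
from datasets import load_dataset
from setfit import SetFitModel, SetFitTrainer

dataset = load_dataset("sst2")
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

# No manual renaming needed: the trainer applies the mapping internally
trainer = SetFitTrainer(
    model=model,
    train_dataset=dataset["train"].shuffle(seed=42).select(range(16)),
    eval_dataset=dataset["validation"],
    column_mapping={"sentence": "text", "label": "label"},
)
```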