Core ML Tools contains supporting tools for Core ML model conversion, editing, and validation.
BSD-3-Clause License
Published by junpeiz about 1 month ago
Compared to 7.2 (including features from 8.0b1 and 8.0b2):
- Updated the `protobuf` python package dependency, which improves serialization latency.
- Added support for torch 2.4.0, numpy 2.0, and scikit-learn 1.5.
- Models exported by `torch.export` can now be converted, in addition to the existing `torch.jit.trace` path. Models compressed with `ct.optimize.torch` can also be exported by `torch.export` and then converted.
- Supports vector palettization (by setting `cluster_dim > 1`) and palettization with per-channel scale (by setting `enable_per_channel_scale=True`) in both `coremltools.optimize.coreml` and `coremltools.optimize.torch`.
- Supports conversion of torch models quantized with `torchao` (including the ops produced by torchao, such as `_weight_int4pack_mm`).
- Supports conversion of ops in the `quantized_decomposed` namespace, such as `embedding_4bit`, etc.
- New compression ops: `constexpr_blockwise_shift_scale`, `constexpr_lut_to_dense`, `constexpr_sparse_to_dense`, etc.
- Updates to the `scaled_dot_product_attention` and `clip` ops.
- Support for `optimizationHints`.
- New utilities in `coremltools.utils`: `coremltools.utils.MultiFunctionDescriptor` and `coremltools.utils.save_multifunction`, for creating an mlprogram with multiple functions in it that can share weights.
- `coremltools.models.utils.bisect_model` can break a large Core ML model into two smaller models with similar sizes.
- `coremltools.models.utils.materialize_dynamic_shape_mlmodel` can convert a flexible input shape model into a static input shape model.
Details on `coremltools.optimize.coreml`:
- By setting `cluster_dim > 1` in `coremltools.optimize.coreml.OpPalettizerConfig`, you can do vector palettization, where each entry in the lookup table is a vector of length `cluster_dim`.
- By setting `enable_per_channel_scale=True` in `coremltools.optimize.coreml.OpPalettizerConfig`, weights are normalized along the output channel using per-channel scales before being palettized.
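The `cluster_dim` semantics can be illustrated without coremltools. Below is a plain-numpy sketch (not the coremltools implementation) of a lookup table whose entries are vectors of length `cluster_dim`, which is what `cluster_dim > 1` produces:

```python
import numpy as np

rng = np.random.default_rng(0)
cluster_dim = 2
n_lut = 4                                  # a 2-bit LUT: 4 entries

weight = rng.standard_normal((8, 6))       # toy weight matrix
vectors = weight.reshape(-1, cluster_dim)  # group weights into length-2 vectors

# Toy LUT (real palettization would run k-means); each entry is a vector.
lut = vectors[rng.choice(len(vectors), n_lut, replace=False)]

# Palettize: each vector is replaced by the index of its nearest LUT entry.
dists = np.linalg.norm(vectors[:, None, :] - lut[None, :, :], axis=-1)
indices = dists.argmin(axis=1).astype(np.uint8)

# Decompress: look the vectors back up and restore the original shape.
decompressed = lut[indices].reshape(weight.shape)

# Storage becomes 2-bit indices plus a tiny LUT instead of float32 weights.
print(indices.shape, lut.shape)  # (24,) (4, 2)
```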
Details on `coremltools.optimize.torch`:
- Vector palettization and per-channel scale are supported in `SKMPalettizer`, `PostTrainingPalettizer`, and `DKMPalettizer`.
- Deprecated the `cluter_dtype` option in favor of `lut_dtype` in `ModuleDKMPalettizerConfig`.
- Added support for `ConvTranspose` modules with `PostTrainingQuantizer` and `LinearQuantizer`.
- Added support for `GPTQ`, including the `Conv2D` layer with per-block quantization.
- Updates to the `QAT` APIs.
- Increased `torch.export` conversion support (e.g. the `clip` op).

Convert a stateful `torch.export` model:

```python
import torch
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.register_buffer("state_1", torch.tensor([0.0, 0.0, 0.0]))

    def forward(self, x):
        # In-place update of the model state
        self.state_1.mul_(x)
        return self.state_1 + 1.0

source_model = Model()
source_model.eval()
example_inputs = (torch.tensor([1.0, 2.0, 3.0]),)
exported_model = torch.export.export(source_model, example_inputs)
coreml_model = ct.convert(exported_model, minimum_deployment_target=ct.target.iOS18)
```
Convert `torch.export` models with dynamic input shapes:

```python
import torch
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(3, 5)

    def forward(self, x):
        y = self.linear(x)
        return y

source_model = Model()
source_model.eval()
example_inputs = (torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),)
dynamic_shapes = {"x": {0: torch.export.Dim(name="batch_dim")}}
exported_model = torch.export.export(source_model, example_inputs, dynamic_shapes=dynamic_shapes)
coreml_model = ct.convert(exported_model)
```
Convert `torch.export` models with 4-bit weight compression:

```python
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(3, 5)

    def forward(self, x):
        y = self.linear(x)
        return y

source_model = Model()
source_model.eval()
example_inputs = (torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),)

pre_autograd_graph = capture_pre_autograd_graph(source_model, example_inputs)
quantization_config = get_symmetric_quantization_config(weight_qmin=-7, weight_qmax=8)
quantizer = XNNPACKQuantizer().set_global(quantization_config)
prepared_graph = prepare_pt2e(pre_autograd_graph, quantizer)
converted_graph = convert_pt2e(prepared_graph)

exported_model = torch.export.export(converted_graph, example_inputs)
coreml_model = ct.convert(exported_model, minimum_deployment_target=ct.target.iOS17)
```
Published by YifanShenSZ 4 months ago

For all the new features, find the updated documentation in the docs-guides.

- New utilities `coremltools.utils.MultiFunctionDescriptor()` and `coremltools.utils.save_multifunction`, for creating an `mlprogram` with multiple functions in it that can share weights. Updated the model loading API to load specific functions for prediction.
- `coremltools.optimize` updates (for `mlprogram`) pertaining to compression:
  - `coremltools.optimize.coreml`: new features, including `ct.optimize.coreml.experimental.linear_quantize_activations`.
  - `coremltools.optimize.torch`: new APIs (`PostTrainingPalettizer`, `PostTrainingQuantizer`, `SKMPalettizer` for the sensitive k-means palettization algorithm, `layerwise_compression` for the GPTQ/SparseGPT quantization/pruning algorithms).
  - Updated the `coremltools.convert` implementation so that converting torch models compressed with `ct.optimize.torch` no longer requires additional pass pipeline arguments.
- New compression ops: `constexpr_blockwise_shift_scale`, `constexpr_lut_to_dense`, `constexpr_sparse_to_dense`, etc.
- Updates to the `scaled_dot_product_attention` op.
- Increased `torch.export` conversion support. For example:

```python
import torch
import torchvision

import coremltools as ct

torch_model = torchvision.models.vit_b_16(weights="IMAGENET1K_V1")

x = torch.rand((1, 3, 224, 224))
example_inputs = (x,)
exported_program = torch.export.export(torch_model, example_inputs)
coreml_model = ct.convert(exported_program)
```
Known issues:
- Some models compressed with `ct.optimize.torch` will result in a torch model that is not correctly converted.
- `ct.optimize.coreml.OpPalettizerConfig` does not yet have all the arguments that are supported in the `cto.torch.palettization` APIs (e.g. `lut_dtype` to get an int8-dtyped LUT, `cluster_dim` to do vector palettization, `enable_per_channel_scale` to apply per-channel scale, etc.).
- `ct.optimize.torch.layerwise_compression.LayerwiseCompressor` will not produce the correct quantization scales, due to a known bug. This may lead to poor accuracy for the quantized model.

Special thanks to our external contributors for this release: @teelrabbit @igeni @Cyanosite
Published by YifanShenSZ 6 months ago

New conversion support:
- `torch.narrow`
- `torch.adaptive_avg_pool1d` and `torch.adaptive_max_pool1d`
- `torch.numpy_t` (i.e. the numpy-style transpose operator `.T`)
- `torch.clamp_min` for integer data types
- `torch.add` for complex data types
- `tf.math.top_k` when `k` is variable

Thanks to our ExecuTorch partners and our open-source community: @KrassCodes @M-Quadra @teelrabbit @minimalic @alealv @ChinChangYang @pcuenca
Published by DawerG 12 months ago
New Features:
Includes experimental support for the `torch.export` API, limited to the EDGE dialect.

Example usage:

```python
import torch
from torch.export import export
from executorch.exir import to_edge

import coremltools as ct

example_args = (torch.randn(*size), )
aten_dialect = export(AnyNNModule(), example_args)
edge_dialect = to_edge(aten_dialect).exported_program()
edge_dialect._dialect = "EDGE"
mlmodel = ct.convert(edge_dialect)
```
Enhancements:
- `ct.utils.make_pipeline` now allows specifying `compute_units`.

Bug Fixes:
- Various other bug fixes, enhancements, clean-ups, and optimizations.
Published by TobyRoseman about 1 year ago

- New submodule `coremltools.optimize` for model quantization and compression:
  - `coremltools.optimize.coreml` for compressing Core ML models in a data-free manner. The `coremltools.compression_utils.*` APIs have been moved here.
  - `coremltools.optimize.torch` for compressing torch models with training data and fine-tuning. The fine-tuned torch model can then be converted using `coremltools.convert`.
- The default model type produced is now `mlprogram`, for iOS15/macOS12. Previously, calling `coremltools.convert()` without providing the `convert_to` or `minimum_deployment_target` arguments used the lowest deployment target (iOS11/macOS10.13) and the `neuralnetwork` backend. Now the conversion process defaults to iOS15/macOS12 and the `mlprogram` backend. You can change this behavior by providing a `minimum_deployment_target` or `convert_to` value.
- New torch ops: `repeat_interleave`, `unflatten`, `col2im`, `view_as_real`, `rand`, `logical_not`, `fliplr`, `quantized_matmul`, `randn`, `randn_like`, `scaled_dot_product_attention`, `stft`, `tile`.
- A `pass_pipeline` parameter has been added to `coremltools.convert` to allow control over which optimizations are performed.
- Compiled model files (`.modelc`): get compiled model files from an `MLModel` instance, plus a Python API to explicitly compile a model.
- Weight metadata is available via `coremltools.optimize.coreml.get_weights_metadata`. This information can be used to customize optimization across ops when using the `coremltools.optimize.coreml` APIs.
- `coremltools.compression_utils` is deprecated.
- Behavior changes when the `mlprogram` backend is used:
  - If `RangeDim` is used and no upper bound is set (with a positive number), an exception will be raised.
  - If the `inputs` parameter leaves an undetermined dim in the input shape (for example, TF with "None" in an input placeholder), it will be sanitized to a finite number (default_size + 1) and a warning will be raised.

Special thanks to our external contributors for this release: @fukatani, @pcuenca, @KWiecko, @comeweber, @sercand, @mlaves, @cclauss, @smpanaro, @nikalra, @jszaday
Published by TobyRoseman about 1 year ago

- The default model type produced is now `mlprogram`, for iOS15/macOS12. Previously, calling `coremltools.convert()` without providing the `convert_to` or `minimum_deployment_target` arguments used the lowest deployment target (iOS11/macOS10.13) and the `neuralnetwork` backend. Now the conversion process defaults to iOS15/macOS12 and the `mlprogram` backend. You can change this behavior by providing a `minimum_deployment_target` or `convert_to` value.
- Behavior changes when the `mlprogram` backend is used:
  - If `RangeDim` is used and no upper bound is set (with a positive number), an exception will be raised.
  - If the `inputs` parameter leaves an undetermined dim in the input shape (for example, TF with "None" in an input placeholder), it will be sanitized to a finite number (default_size + 1) and a warning will be raised.
- Weight metadata is available via `coremltools.optimize.coreml.get_weights_metadata`. This information can be used to customize optimization across ops when using the `coremltools.optimize.coreml` APIs.
- New torch ops: `repeat_interleave` and `unflatten`.
- Updates to existing ops: `batch_norm`, `conv`, `conv_transpose`, `expand_dims`, `gru`, `instance_norm`, `inverse`, `l2_norm`, `layer_norm`, `linear`, `local_response_norm`, `log`, `lstm`, `matmul`, `reshape_like`, `resample`, `resize`, `reverse`, `reverse_sequence`, `rnn`, `rsqrt`, `slice_by_index`, `slice_by_size`, `sliding_windows`, `squeeze`, `transpose`.

Special thanks to our external contributors for this release: @fukatani, @pcuenca, @KWiecko, @comeweber and @sercand
Published by TobyRoseman over 1 year ago

- New submodule `coremltools.optimize` for model quantization and compression:
  - `coremltools.optimize.coreml` for compressing Core ML models in a data-free manner. The `coremltools.compression_utils.*` APIs have been moved here.
  - `coremltools.optimize.torch` for compressing torch models with training data and fine-tuning. The fine-tuned torch model can then be converted using `coremltools.convert`.
- A `pass_pipeline` parameter has been added to `coremltools.convert` to allow control over which optimizations are performed.
- New torch ops: `randn`, `randn_like`, `scaled_dot_product_attention`, `stft`, `tile`.
- `coremltools.models.ml_program.compression_utils` is deprecated.
- Core ML Tools 7.0 guide: https://coremltools.readme.io/v7.0/

Special thanks to our external contributors for this release: @fukatani, @pcuenca, @mlaves, @cclauss, @smpanaro, @nikalra, @jszaday
Published by junpeiz over 1 year ago

- Added the `pass_pipeline` parameter to `coremltools.convert`.
- Added `utils.make_pipeline`.
- Added `converters.mil.debugging_utils.extract_submodel`.

Special thanks to our external contributors for this release: @fukatani, @nikalra and @kevin-keraudren.
Published by junpeiz over 1 year ago

- Support for torch==1.13.1 and torchvision==0.14.1.
- New ops: `torch.fft`, `torchvision.ops.nms`, `torch.atan2`, `torch.bitwise_and`, `torch.numel`, `tf.signal`, `tf.tensor_scatter_nd_add`.
- Bug fix for the `clamp` op.
- Support for dynamic `topk` (`k` not determined during compile time).
- Support for `padding='valid'` in PyTorch convolution.

Special thanks to our external contributors for this release: @fukatani, @ChinChangYang, @danvargg, @bhushan23 and @cjblocker.
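As background for the `padding='valid'` item above: in PyTorch, `padding="valid"` is simply the string spelling of "no padding", which is what the converter now handles. A torch-only check of that equivalence (illustrative toy layer, not coremltools code):

```python
import torch

# padding="valid" is the string spelling of padding=0 (no padding).
conv_valid = torch.nn.Conv1d(1, 1, kernel_size=3, padding="valid")
conv_zero = torch.nn.Conv1d(1, 1, kernel_size=3, padding=0)
conv_zero.load_state_dict(conv_valid.state_dict())  # share the same weights

x = torch.rand(1, 1, 8)
with torch.no_grad():
    assert torch.equal(conv_valid(x), conv_zero(x))
    assert conv_valid(x).shape == (1, 1, 6)  # 8 - 3 + 1, nothing padded
```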
Published by jakesabathia2 almost 2 years ago

- Support for TensorFlow 2.10.
- New PyTorch ops: `baddbmm`, `glu`, `hstack`, `remainder`, `weight_norm`, `hann_window`, `randint`, `cross`, `trace`, and `reshape_as`.
- Fixes for the `repeat` and `expand` ops, and for the `where` op with only one input.
- Support for the einsum equation `'bhcq,bhck→bhqk'`.

Special thanks to our external contributors for this release: @fukatani, @piraka9011, @giorgiop, @hollance, @SangamSwadiK, @RobertRiachi, @waylybaye, @GaganNarula, and @sunnypurewal.
Published by TobyRoseman about 2 years ago

- Model compression utilities: `coremltools.compression_utils`.
- Support for the latest scikit-learn (1.1.2) and PyTorch (1.12.1), and for TensorFlow 2.8.
- Support for grayscale float16 images via `coremltools.ImageType` used with inputs.
- New compute unit `CPU_AND_NE` to restrict the model runtime to the Neural Engine and CPU.
- New ops: `full_like`, `resample`, `reshape_like`, `pixel_unshuffle`, `topk`.
- Bug fixes for ops: `crop_resize`, `gather`, `gather_nd`, `topk`, `upsample_bilinear`.
- Removed the `useCPUOnly` parameter from `coremltools.convert` and `coremltools.models.MLModel`. Use `coremltools.ComputeUnit` instead.

Published by TobyRoseman about 2 years ago
- New ops: `pixel_unshuffle`, `resample`, `topk`.
- New compute unit: `CPU_AND_NE`.
- New PyTorch ops: `AdaptiveAvgPool2d`, `cosine_similarity`, `eq`, `linalg.norm`, `linalg.matrix_norm`, `linalg.vector_norm`, `ne`, `PixelUnshuffle`.
- New TensorFlow op: `identity_n`.

Published by TobyRoseman over 2 years ago
- Model compression utilities: `coremltools.compression_utils`.
- Support for grayscale float16 images via `coremltools.ImageType` used with inputs.
- Removed the `useCPUOnly` parameter from `coremltools.convert` and `coremltools.models.MLModel`. Use `coremltools.ComputeUnit` instead.

Known issues:
- The `predict` API in coremltools can crash when either the input or output is of type grayscale float16.
- `MLComputeUnitsCPUAndNeuralEngine` is not available in coremltools yet.

Published by TobyRoseman over 2 years ago
- New PyTorch ops: `bitwise_not`, `dim`, `dot`, `eye`, `fill`, `hardswish`, `linspace`, `mv`, `new_full`, `new_zeros`, `rrelu`, `selu`.
- New TensorFlow ops: `DivNoNan`, `Log1p`, `SparseSoftmaxCrossEntropyWithLogits`.
Published by TobyRoseman almost 3 years ago
- New torch ops: `broadcast_tensors`, `frobenius_norm`, `full`, `norm`, and `scatter_add`.

Published by TobyRoseman about 3 years ago
- Use the `convert_to` argument with the unified converter API to indicate the model type of the Core ML model:
  - `coremltools.convert(..., convert_to="mlprogram")` converts to a Core ML model of type ML program.
  - `coremltools.convert(..., convert_to="neuralnetwork")` converts to a Core ML model of type neural network. "Neural network" is the older Core ML format and continues to be supported. Using just `coremltools.convert(...)` will default to producing a neural network Core ML model.
- Control the compute precision of an ML program with `ct.convert(..., convert_to="mlprogram", compute_precision=ct.precision.FLOAT32)` or `ct.convert(..., convert_to="mlprogram", compute_precision=ct.precision.FLOAT16)`.
- ML program models are saved with the `save` method. Simply use `model.save("<model_name>.mlpackage")` instead of the usual `model.save("<model_name>.mlmodel")`.
- Added a `compute_units` parameter to MLModel and coremltools.convert. This matches the `MLComputeUnits` in Swift and Objective-C. Use this parameter to specify where your models can run:
  - `ALL`: use all compute units available, including the neural engine.
  - `CPU_ONLY`: limit the model to only use the CPU.
  - `CPU_AND_GPU`: use both the CPU and GPU, but not the neural engine.
- Skip loading the model during conversion with `ct.convert(..., skip_model_load=True)`.
- The methods `convert_neural_network_weights_to_fp16()` and `convert_neural_network_spec_weights_to_fp16()`, which had been deprecated in coremltools 4, have been removed.
- The `useCPUOnly` parameter for MLModel and MLModel.predict has been deprecated. Instead, use the `compute_units` parameter for MLModel and coremltools.convert.
Published by TobyRoseman about 3 years ago
- Added the `torch_tensor_assign` op and the `index_put_` op. Fixed bugs in the translation of `expand` ops and `sort` ops.

Published by TobyRoseman about 3 years ago
Published by TobyRoseman about 3 years ago
- Added a `compute_units` parameter to MLModel and coremltools.convert. Use this to specify where your models can run:
  - `ALL`: use all compute units available, including the neural engine.
  - `CPU_ONLY`: limit the model to only use the CPU.
  - `CPU_AND_GPU`: use both the CPU and GPU, but not the neural engine.
- Deprecated the `useCPUOnly` parameter for MLModel and coremltools.convert.
- Added the `compute_precision` parameter to `coremltools.convert`.