LEIP Optimize

leip.optimize(model: CompilableModel, output_path: Path, options: Optional[OptimizeOptions]) → CompressResults

Optimizes a pre-trained model by quantizing and compiling it.

Parameters
  • model (CompilableModel) – The pre-trained model to optimize.

  • output_path (Path) – Directory path for generated artifacts.

  • options (Optional[OptimizeOptions]) – Options that configure the model optimization.

Returns

Results of the quantization, if any.

Return type

CompressResults
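
A minimal usage sketch. It assumes `model` is a CompilableModel that has already been loaded elsewhere in the SDK (the loading step is outside this reference) and that OptimizeOptions and OptimizeCompressOptions are available from the top-level leip package as documented below.

    from pathlib import Path

    import leip

    # Placeholder: obtain a CompilableModel through your usual loading workflow.
    model = ...  # CompilableModel

    # Default compression settings: asymmetric 8-bit quantization to uint8.
    options = leip.OptimizeOptions(compress=leip.OptimizeCompressOptions())

    # Quantize and compile, writing artifacts to ./optimized/.
    results = leip.optimize(model, Path("optimized"), options)  # returns CompressResults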

class leip.OptimizeOptions(*, compress: OptimizeCompressOptions, compile: CompileOptions = CompileOptions(layout='NCHW', target='llvm', target_host=None, crc_check=False, force_int8=False, legacy_artifacts=False, optimization=CompileOptimizations(kernel=KernelOptimization(level=3), cuda=CudaOptimization(enabled=False), graph=None)))

Options for model optimization.

Parameters
  • compress (OptimizeCompressOptions) – Options for the compression (quantization) stage.

  • compile (CompileOptions) – Options for the compilation stage.

Return type

None

compile: leip.core.operations.compile.options.CompileOptions

Options for the compilation stage of the optimization.

compress: leip.core.operations.optimize.options.OptimizeCompressOptions

Options for the compression (quantization) stage of the optimization.
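
A hedged sketch of how the two stages nest inside OptimizeOptions. The CompileOptions import path follows the type annotation above (leip.core.operations.compile.options); whether the class is also re-exported from the top-level leip package is not confirmed here.

    import leip
    from leip.core.operations.compile.options import CompileOptions

    options = leip.OptimizeOptions(
        # Compression (quantization) stage; defaults are asymmetric, 8 bits, uint8.
        compress=leip.OptimizeCompressOptions(quantize_output=False),
        # Compilation stage; these are the documented defaults, shown explicitly.
        compile=CompileOptions(layout="NCHW", target="llvm"),
    )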

class leip.OptimizeCompressOptions(*, quantizer: QuantizerType = QuantizerType.ASYMMETRIC, bits: PositiveInt = 8, data_type: DataType = DataType.UINT8, rep_dataset: FilePath = None, optimization: List[CompressOptimization] = [], use_legacy_quantizer: bool = False, quantize_input: bool = True, quantize_output: bool = True, wrap_quantizable_ops: bool = True)

Options for the compression stage of a model optimization.

Parameters
  • quantizer (QuantizerType) – Type of quantizer to use.

  • bits (PositiveInt) – Number of bits to quantize to.

  • data_type (DataType) – Data type to use when quantizing and casting.

  • rep_dataset (Optional[FilePath]) – Path to a representative dataset file used during calibration.

  • optimization (List[CompressOptimization]) – Additional compression optimizations to apply.

  • use_legacy_quantizer (bool) – Whether to use the legacy LEIP 1.x quantizer.

  • quantize_input (bool) – Whether to quantize the model input layers.

  • quantize_output (bool) – Whether to quantize the model output layers.

  • wrap_quantizable_ops (bool) – Whether to wrap unquantizable ops with dequantize/quantize ops.

Return type

None

bits: pydantic.types.PositiveInt

The number of bits to quantize to.

data_type: leip.core.models.enums.DataType

The data type to use when quantizing and casting. One of [float32, uint8, int8].

optimization: List[leip.core.models.enums.CompressOptimization]

Additional compression optimizations to apply. Choose from [tensor_splitting, bias_correction].

quantize_input: bool

Whether the model input layers will be quantized.

quantize_output: bool

Whether the model output layers will be quantized.

quantizer: leip.core.models.enums.QuantizerType

Type of quantizer. One of [asymmetric, symmetric, symmetricpc].

rep_dataset: Optional[pydantic.types.FilePath]

The path to a text file that specifies a representative dataset, used during calibration. The file should contain a newline-separated list of paths to the input instances.
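
For illustration, a rep_dataset file (e.g. calibration_list.txt, a placeholder name) is simply a newline-separated list of input paths:

    /data/calibration/img_0001.jpg
    /data/calibration/img_0002.jpg
    /data/calibration/img_0003.jpg

The option then points at that file, for example leip.OptimizeCompressOptions(rep_dataset="calibration_list.txt").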

use_legacy_quantizer: bool

Whether to use the legacy LEIP 1.x quantization method.

wrap_quantizable_ops: bool

Whether sequences of unquantizable ops (like exp and stack) should have dequantize ops inserted before them and quantize ops inserted after them.
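
Pulling the fields above together, a hedged configuration sketch. The enum member spellings SYMMETRIC, INT8, and BIAS_CORRECTION are inferred from the value lists above (only ASYMMETRIC and UINT8 appear verbatim in the defaults), so verify them against leip.core.models.enums before use.

    import leip
    from leip.core.models.enums import CompressOptimization, DataType, QuantizerType

    compress = leip.OptimizeCompressOptions(
        quantizer=QuantizerType.SYMMETRIC,                    # inferred member name
        bits=8,
        data_type=DataType.INT8,                              # inferred member name
        rep_dataset="calibration_list.txt",                   # newline-separated input paths
        optimization=[CompressOptimization.BIAS_CORRECTION],  # inferred member name
        quantize_input=True,
        quantize_output=False,                                # keep float outputs
    )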