LLamaQuantizer

Namespace: LLama

Provides static methods for quantizing a model file.

public static class LLamaQuantizer

Inheritance Object → LLamaQuantizer

Methods

Quantize(String, String, LLamaFtype, Int32, Boolean, Boolean)

Quantizes the model file and saves the result to the given path.

public static bool Quantize(string srcFileName, string dstFilename, LLamaFtype ftype, int nthread, bool allowRequantize, bool quantizeOutputTensor)

Parameters

srcFileName String

The model file to be quantized.

dstFilename String

The path to save the quantized model.

ftype LLamaFtype

The type of quantization.

nthread Int32

The number of threads to use during quantization. By default, this is the number of physical cores.

allowRequantize Boolean

Whether to allow quantizing tensors that are already quantized (not F32 or F16).

quantizeOutputTensor Boolean

Whether to quantize the output tensor as well.

Returns

Boolean

true if the quantization succeeded; otherwise, false.

Exceptions

ArgumentException
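
A minimal sketch of calling this overload, assuming the LLamaSharp package is referenced and that LLamaFtype lives in the LLama.Native namespace (an assumption; check your version). The QuantizeExample class, the TryQuantize helper, and the file-existence check are hypothetical and not part of the library. The quantization type is taken as a parameter because the enum member names are not listed on this page.

```csharp
using System;
using System.IO;
using LLama;
using LLama.Native; // assumed namespace of LLamaFtype

public static class QuantizeExample
{
    // Hypothetical helper: wraps the LLamaFtype overload and turns
    // an ArgumentException into a false return value.
    public static bool TryQuantize(string source, string destination, LLamaFtype ftype)
    {
        // Skip the call when the source file is missing (a choice made for this sketch).
        if (!File.Exists(source))
        {
            return false;
        }

        try
        {
            return LLamaQuantizer.Quantize(
                source,
                destination,
                ftype,
                Environment.ProcessorCount, // logical processor count, passed explicitly here
                allowRequantize: false,
                quantizeOutputTensor: true);
        }
        catch (ArgumentException ex)
        {
            Console.Error.WriteLine($"Quantize rejected the arguments: {ex.Message}");
            return false;
        }
    }
}
```

Passing the thread count explicitly keeps the sketch independent of whatever default the library applies to nthread.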

Quantize(String, String, String, Int32, Boolean, Boolean)

Quantizes the model file and saves the result to the given path, with the quantization type given as a string.

public static bool Quantize(string srcFileName, string dstFilename, string ftype, int nthread, bool allowRequantize, bool quantizeOutputTensor)

Parameters

srcFileName String

The model file to be quantized.

dstFilename String

The path to save the quantized model.

ftype String

The type of quantization, specified by name.

nthread Int32

The number of threads to use during quantization. By default, this is the number of physical cores.

allowRequantize Boolean

Whether to allow quantizing tensors that are already quantized (not F32 or F16).

quantizeOutputTensor Boolean

Whether to quantize the output tensor as well.

Returns

Boolean

true if the quantization succeeded; otherwise, false.

Exceptions

ArgumentException
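
The string overload can be sketched in the same way. The example below assumes the LLamaSharp package is referenced; the QuantizeByNameExample class, its helpers, and the ".gguf" file naming are hypothetical. The set of accepted ftype strings is not listed on this page, so the "q4_0" value in the usage note is an assumption.

```csharp
using System;
using System.IO;
using LLama;

public static class QuantizeByNameExample
{
    // Hypothetical helper: derives the output path from the source path
    // and the type name, e.g. "model.gguf" + "q4_0" -> "model.q4_0.gguf".
    public static string BuildDestination(string source, string ftype)
        => Path.ChangeExtension(source, $".{ftype}.gguf");

    // Calls the string overload and reports the outcome on the console.
    public static bool Run(string source, string ftype)
    {
        string destination = BuildDestination(source, ftype);

        bool ok = LLamaQuantizer.Quantize(
            source,
            destination,
            ftype,
            Environment.ProcessorCount, // logical processor count, passed explicitly here
            allowRequantize: false,
            quantizeOutputTensor: true);

        Console.WriteLine(ok ? $"Wrote {destination}" : "Quantization failed");
        return ok;
    }
}
```

A call might look like QuantizeByNameExample.Run("model.gguf", "q4_0"), assuming "q4_0" is a name this overload accepts. Since the method is documented to throw ArgumentException, callers may want to wrap the call in a try/catch.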