# LLamaQuantizer

Namespace: LLama

Quantizes a model.

```csharp
public static class LLamaQuantizer
```

Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [LLamaQuantizer](./llama.llamaquantizer.md)

## Methods

### **Quantize(String, String, LLamaFtype, Int32)**

Quantizes the model, with the quantization type given as a `LLamaFtype` value.

```csharp
public static bool Quantize(string srcFileName, string dstFilename, LLamaFtype ftype, int nthread)
```

#### Parameters

`srcFileName` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The model file to be quantized.

`dstFilename` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The path where the quantized model is saved.

`ftype` [LLamaFtype](./llama.native.llamaftype.md)<br>
The type of quantization.

`nthread` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The number of threads to use during quantization. Defaults to the number of physical cores.

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
`true` if the quantization succeeded; otherwise `false`.

#### Exceptions

[ArgumentException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentexception)<br>
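
The following is a minimal sketch of calling this overload defensively, assuming the LLamaSharp package is referenced. `QuantizeExample` and `TryQuantize` are hypothetical names introduced here for illustration and are not part of the library. The documentation lists `ArgumentException` without stating when it is thrown, so the sketch simply catches it and reports failure.

```csharp
using System;
using System.IO;
using LLama;
using LLama.Native;

// Hypothetical helper for illustration; not part of LLamaSharp.
public static class QuantizeExample
{
    public static bool TryQuantize(string src, string dst, LLamaFtype ftype, int nthread)
    {
        // Check the source file ourselves so a missing file never reaches the library.
        if (!File.Exists(src))
        {
            Console.Error.WriteLine($"Source model not found: {src}");
            return false;
        }

        try
        {
            // Documented signature: Quantize(string, string, LLamaFtype, int).
            return LLamaQuantizer.Quantize(src, dst, ftype, nthread);
        }
        catch (ArgumentException ex)
        {
            // The only exception type the documentation lists for this method.
            Console.Error.WriteLine($"Quantization rejected the arguments: {ex.Message}");
            return false;
        }
    }
}
```

A caller would pass whichever `LLamaFtype` member matches the desired quantization; see the linked `LLamaFtype` page for the available values.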
### **Quantize(String, String, String, Int32)**

Quantizes the model, with the quantization type given as a string.

```csharp
public static bool Quantize(string srcFileName, string dstFilename, string ftype, int nthread)
```

#### Parameters

`srcFileName` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The model file to be quantized.

`dstFilename` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The path where the quantized model is saved.

`ftype` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The type of quantization.

`nthread` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The number of threads to use during quantization. Defaults to the number of physical cores.

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
`true` if the quantization succeeded; otherwise `false`.

#### Exceptions

[ArgumentException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentexception)<br>
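
As a rough sketch of the string overload, assuming the LLamaSharp package is referenced: the type name `"q4_0"` is an assumption here (this page does not list the accepted strings), and `StringQuantizeExample` and `Run` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using LLama;

// Hypothetical wrapper for illustration; not part of LLamaSharp.
public static class StringQuantizeExample
{
    public static bool Run(string src, string dst, int nthread)
    {
        // Return early if there is nothing to quantize.
        if (!File.Exists(src))
        {
            Console.Error.WriteLine($"Source model not found: {src}");
            return false;
        }

        // "q4_0" is an assumed type name; check which strings your version accepts.
        const string ftype = "q4_0";

        try
        {
            // Documented signature: Quantize(string, string, string, int).
            bool ok = LLamaQuantizer.Quantize(src, dst, ftype, nthread);
            Console.WriteLine(ok ? $"Quantized model written to {dst}" : "Quantization failed.");
            return ok;
        }
        catch (ArgumentException ex)
        {
            // Listed in the documentation; the triggering condition is not stated.
            Console.Error.WriteLine($"Invalid argument: {ex.Message}");
            return false;
        }
    }
}
```

A call such as `StringQuantizeExample.Run(sourcePath, targetPath, 4)` fixes the thread count explicitly instead of relying on the default.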

An easy-to-use, high-performance LLM inference framework for C#/.NET, supporting the LLaMA and LLaVA families of models.