- Modified `llama_sample_token_mirostat` and `llama_sample_token_mirostat_v2` to take a `ref float` instead of a `float*`. Fewer pointers is always good.
- Modified `llama_sample_repetition_penalty` and `llama_sample_frequency_and_presence_penalties` to take pointers instead of arrays. This allows the use of non-allocating types (e.g. `Span`) instead of arrays
- Modified the higher-level API to accept `Memory<int>` instead of `int[]`, which can be used to reduce allocations at call sites
- Used those methods to add a `Clone` method to SafeLLamaContextHandle
- Simplified `LLamaContext` by using the new methods
- Sealed `LLamaContext` and `LLamaEmbedder`
- Equality, hashing etc all implemented automatically
- Default values are defined in just one place (the properties) instead of the constructor as well
- Added test to ensure that serialization works properly
- Removed some unnecessary `ToArray` calls
- Initial pass on `LLamaStatelessExecutor`. The context overflow management is broken, but I think I found where it's ported from
* Added a bool to sbyte Utils convertor
To avoid using any `MarshalAs` attribute (for .NET Framework support), this Utils method takes a bool value and returns an sbyte: 1 for true or 0 for false.
* Changed all bool "MarshalAs" types to sbytes
Changed all bool fields that previously carried "MarshalAs" attributes to sbytes, and changed their setters to use the `Utils.BoolToSignedByte()` convertor method.
* Fixed Utils bool convertor & added sbyte to bool
Improved the Utils bool convertor by simply casting to an sbyte value, which gets rid of the unneeded sbyte array. Also added an sbyte to bool convertor to convert back to a C# bool, assuming any positive value above 0 is true and no bools are packed into the single-byte integer.
* bool to & from sbyte conversions via properties
All 1-byte bools are now handled where they "sit", via public properties that perform the conversions, so external code can keep communicating with them as it did before.
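
For reviewers, the bool/sbyte handling described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the actual code in this change: `BoolByteSketch`, `SignedByteToBool`, `NativeParamsSketch`, and `UseMmap` are hypothetical names, and only `BoolToSignedByte` is named in the commit text.

```csharp
// Hypothetical helper class; the real methods live on Utils.
public static class BoolByteSketch
{
    // true -> 1, false -> 0, with no intermediate array.
    public static sbyte BoolToSignedByte(bool value) => value ? (sbyte)1 : (sbyte)0;

    // Any value above 0 counts as true; assumes no flags are packed into the byte.
    public static bool SignedByteToBool(sbyte value) => value > 0;
}

// Hypothetical stand-in for a native parameter struct.
public struct NativeParamsSketch
{
    // Stored as one signed byte, so no [MarshalAs] attribute is needed on the field.
    private sbyte _useMmap;

    // Callers keep seeing a plain C# bool; the conversion happens where the field "sits".
    public bool UseMmap
    {
        get => BoolByteSketch.SignedByteToBool(_useMmap);
        set => _useMmap = BoolByteSketch.BoolToSignedByte(value);
    }
}
```

Under the "above 0" rule, a negative byte coming back from native code would read as false in this sketch.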
- Added some more comments
- Modified `llama_tokenize` to not allocate
- Modified `llama_tokenize_native` to take a pointer instead of an array, allowing use with no allocations
- Removed GgmlInitParams (not used)