Martin Evans
73172bbaba
Merge pull request #438 from martindevans/cleanup_model_unnecessary_unsafe
Model Metadata Loading Cleanup
1 year ago
Martin Evans
ce1d302e7e
Moved some native methods into `SafeLlamaModelHandle`. These methods are all wrapped in safer accessors with no extra cost, so there is no need to expose them.
1 year ago
Martin Evans
1e86755071
- Removed unnecessary `unsafe` block in model metadata loading
- Clarified comments on native metadata loading methods
1 year ago
Martin Evans
de2b20aae5
- Added a specific exception for failing to load model weights.
- Checking if model is readable
1 year ago
Martin Evans
096e0e75f8
Check that the model file actually exists immediately before loading it. Improve #395
1 year ago
Martin Evans
2ea2048b78
- Added a test for tokenizing just a new line (reproduce issue https://github.com/SciSharp/LLamaSharp/issues/430 )
- Properly displaying `LLamaToken`
- Removed all tokenisation code in `SafeLLamaContextHandle` - just pass it all through to the `SafeLlamaModelHandle`
- Improved `SafeLlamaModelHandle` tokenisation:
  - Renting an array, for one less allocation
  - Not using `&tokens[0]` to take a pointer to an array; this is redundant and doesn't work on empty arrays
1 year ago
Martin Evans
98635a0d5a
Fixed decoding of large tokens (over 16 bytes) in streaming text decoder
1 year ago
Martin Evans
402a110a3a
Merge pull request #404 from martindevans/switched_to_LLamaToken_struct
LLamaToken Struct
1 year ago
Martin Evans
1e69e265b6
Moved some native methods to do with creating/destroying resources into their respective handles. There is **no** safe way to call most of these methods, everything must be done through handles.
1 year ago
Martin Evans
42be9b136d
Switched from using raw integers to a `LLamaToken` struct
1 year ago
Martin Evans
4e5e994dda
- directly returning a SafeLlamaModelHandle, instead of an IntPtr which is wrapped in a handle.
- made `llama_backend_init` private. This is called automatically, so there is no way it can correctly be used externally.
- made `llama_token_to_piece` safe (Span instead of pointer)
1 year ago
Martin Evans
c002642268
- Removed some `unsafe` where it wasn't necessary
- Wrapped some native functions which take (pointer, length) in functions which take a `span` instead.
1 year ago
Martin Evans
f860f88c36
Code cleanup driven by R# suggestions:
- Made `NativeApi` into a `static class` (it's not intended to be instantiated)
- Moved `LLamaTokenType` enum out into a separate file
- Made `LLamaSeqId` and `LLamaPos` into `record struct`, convenient to have equality etc
1 year ago
Martin Evans
6be3f62321
Fixed loading of very large metadata values (over 1kb)
1 year ago
Martin Evans
fb606c2488
Fixed incorrect values
1 year ago
Martin Evans
47e4fcef2a
Fixed GetString on netstandard2
1 year ago
Martin Evans
2a1e1b6183
Removed unused imports
1 year ago
Martin Evans
a2bae178fa
Added a `Metadata` property to `LLamaWeights`
1 year ago
Martin Evans
a03fe003de
Fixed decoding of text "accumulating" over time (never properly clearing buffer)
2 years ago
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2 years ago
Martin Evans
efdf3d630c
- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
- Built a new (hacky) `Detokenize` method which handles this
2 years ago
Martin Evans
1d0620e634
Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters
2 years ago
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2 years ago
Martin Evans
1f8c94e386
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )
2 years ago
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
614ba40948
- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
  - Minimal amount of characters converted
  - Allocation free
- Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
  - Allocation free
2 years ago
Martin Evans
2022b82947
Added binaries generated by this action: https://github.com/SciSharp/LLamaSharp/actions/runs/6002797872/job/16279896150
Based on this version: 6b73ef1201
2 years ago
Martin Evans
31287b5e6e
Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options:
- Just convert it to a `string`, nice and simple
- Write the bytes to a `Span<byte>` (no allocations)
- Write the chars to a `StringBuilder` (potentially no allocations)
2 years ago
Martin Evans
0c98ae1955
Passing ctx to `llama_token_nl(_ctx)`
2 years ago
Martin Evans
2056078aef
Initial changes required for GGUF support
2 years ago
Martin Evans
a911b77dec
Various minor changes, resolving about 100 ReSharper code quality warnings
2 years ago
Martin Evans
2830e5755c
- Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
- Deleted `NativeInfo` (internal class, not used anywhere)
2 years ago
Martin Evans
479ff57853
Renamed `EmbeddingCount` to `EmbeddingSize`
2 years ago
Martin Evans
d0a7a8fcd6
- Cleaned up disposal in LLamaContext
- sealed some classes not intended to be extended
2 years ago
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2 years ago
Martin Evans
2d811b2603
- Moved `GetLogits` into `SafeLLamaContextHandle`
- Added disposal check into `SafeLLamaContextHandle`
2 years ago
Martin Evans
6985d3ab60
Added comments on two properties
2 years ago
Martin Evans
c974c8429e
Removed leftover `using`
2 years ago
Martin Evans
afb9d24f3a
Added model `Tokenize` method
2 years ago
Martin Evans
369c915afe
Added TokenToString conversion on model handle
2 years ago
Martin Evans
b721072aa5
Exposed some extra model properties on safe handle
2 years ago
Martin Evans
44b1e93609
Moved LoRA loading into `SafeLlamaModelHandle`
2 years ago
Martin Evans
c95b14d8b3
- Fixed null check
- Additional comments
2 years ago
Martin Evans
f16aa58e12
Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.
It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2 years ago