Martin Evans
e89ca5cc17
Fixed a few minor warnings
2 years ago
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2 years ago
Martin Evans
1f8c94e386
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )
2 years ago
Martin Evans
9a0a0ae9fe
Removed cloning support
2 years ago
Martin Evans
0d40338692
Fixed out-of-context handling in stateless executor
2 years ago
Martin Evans
b306ac23dd
Added `Decode` method to `SafeLLamaContextHandle`
2 years ago
Martin Evans
ce1fc51163
Added some more native methods
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
daf09eae64
Skipping tokenization of empty strings (saves allocating an empty array every time)
2 years ago
Martin Evans
bba801f4b7
Added a property to get the KV cache size from a context
2 years ago
Martin Evans
31287b5e6e
Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options:
- Just convert it to a `string`, nice and simple
- Write the bytes to a `Span<byte>` no allocations
- Write the chars to a `StringBuilder` potentially no allocations
2 years ago
Martin Evans
ebacdb666d
- Moved the lower level state get/set methods onto SafeLLamaContextHandle
- Used those methods to add a `Clone` method to SafeLLamaContextHandle
- Simplified `LLamaContext` by using the new methods
- Sealed `LLamaContext` and `LLamaEmbedder`
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
ae8ef17a4a
- Added various convenience overloads to `LLamaContext.Eval`
- Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval, narrower type better represents what's really needed
2 years ago
Martin Evans
479ff57853
Renamed `EmbeddingCount` to `EmbeddingSize`
2 years ago
Martin Evans
d0a7a8fcd6
- Cleaned up disposal in LLamaContext
- sealed some classes not intended to be extended
2 years ago
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2 years ago
Martin Evans
2b2d3af26b
Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle`
2 years ago
Martin Evans
0e5e00e300
Moved `TokenToString` from Utils into `SafeLLamaContextHandle` (thin wrappers around the same method in `SafeLlamaModelHandle`)
2 years ago
Martin Evans
2d811b2603
- Moved `GetLogits` into `SafeLLamaContextHandle`
- Added disposal check into `SafeLLamaContextHandle`
2 years ago
Martin Evans
cd3cf2b77d
- Moved tokenization from `Utils.Tokenize` into `SafeLLamaContextHandle.Tokenize`, one less thing in `Utils`.
- Also refactored it to return an `int[]` instead of an `IEnumerable<int>`, solving the "multiple enumeration" problems at the source!
2 years ago
Martin Evans
f16aa58e12
Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.
It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2 years ago
Yaohui Liu
0958bbac2c
feat: add get-embedding api to LLamaModel.
2 years ago
Yaohui Liu
5a79edeb51
feat: add the framework and basic usages.
2 years ago