Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
614ba40948
- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
- Minimal amount of characters converted
- Allocation free
- Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
- Allocation free
2 years ago
Martin Evans
2022b82947
Added binaries generated by this action: https://github.com/SciSharp/LLamaSharp/actions/runs/6002797872/job/16279896150
Based on this version: 6b73ef1201
2 years ago
Martin Evans
31287b5e6e
Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options:
- Just convert it to a `string`, nice and simple
- Write the bytes to a `Span<byte>` no allocations
- Write the chars to a `StringBuilder` potentially no allocations
2 years ago
Martin Evans
0c98ae1955
Passing ctx to `llama_token_nl(_ctx)`
2 years ago
Martin Evans
2056078aef
Initial changes required for GGUF support
2 years ago
Martin Evans
a911b77dec
Various minor changes, resolving about 100 ReSharper code quality warnings
2 years ago
Martin Evans
2830e5755c
- Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
- Deleted `NativeInfo` (internal class, not used anywhere)
2 years ago
Martin Evans
479ff57853
Renamed `EmbeddingCount` to `EmbeddingSize`
2 years ago
Martin Evans
d0a7a8fcd6
- Cleaned up disposal in LLamaContext
- sealed some classes not intended to be extended
2 years ago
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2 years ago
Martin Evans
2d811b2603
- Moved `GetLogits` into `SafeLLamaContextHandle`
- Added disposal check into `SafeLLamaContextHandle`
2 years ago
Martin Evans
6985d3ab60
Added comments on two properties
2 years ago
Martin Evans
c974c8429e
Removed leftover `using`
2 years ago
Martin Evans
afb9d24f3a
Added model `Tokenize` method
2 years ago
Martin Evans
369c915afe
Added TokenToString conversion on model handle
2 years ago
Martin Evans
b721072aa5
Exposed some extra model properties on safe handle
2 years ago
Martin Evans
44b1e93609
Moved LoRA loading into `SafeLlamaModelHandle`
2 years ago
Martin Evans
c95b14d8b3
- Fixed null check
- Additional comments
2 years ago
Martin Evans
f16aa58e12
Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.
It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2 years ago