Martin Evans
a024d2242e
It works!
had to update binary to `b1426`
2 years ago
Martin Evans
36c71abcfb
Fixed `LLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLama` spam in all executors except Stateless.
2 years ago
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2 years ago
Martin Evans
efdf3d630c
- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
- Built a new (hacky) `Detokenize` method which handles this
2 years ago
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2 years ago
Martin Evans
d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
Major llama.cpp API Change
2 years ago
Martin Evans
1f8c94e386
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )
2 years ago
Martin Evans
669ae47ef7
- Split parameters into two interfaces
- params contains a list of loras, instead of just one
2 years ago
Martin Evans
9a0a0ae9fe
Removed cloning support
2 years ago
Martin Evans
0d40338692
Fixed out-of-context handling in stateless executor
2 years ago
Martin Evans
ce1fc51163
Added some more native methods
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
08f1615e60
- Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated.
- Added a "spinner" to the `StatelessModeExecute` demo, which spins while waiting for the next token (demonstrating that it's not blocked).
2 years ago
redthing1
b78044347c
fix opaque GetState ( fixes #176 )
2 years ago
Martin Evans
466722dcff
Merge pull request #165 from martindevans/better_instruct_antiprompt_checking
better_instruct_antiprompt_checking
2 years ago
Martin Evans
d08a125020
Using the `TokensEndsWithAnyString` extensions for antiprompt checking in instruct executor. Simpler and more efficient.
2 years ago
Martin Evans
bba801f4b7
Added a property to get the KV cache size from a context
2 years ago
Martin Evans
4dac142bd5
Merge pull request #160 from martindevans/GetState_fix
`GetState()` fix
2 years ago
Martin Evans
832bf7dbe0
Simplified implementation of `GetState` and fixed a memory leak (`bigMemory` was never freed)
2 years ago
Martin Evans
4f7b6ffdcc
Removed `GenerateResult` method that was only used in one place
2 years ago
sa_ddam213
949b0cde16
Replace ILLamaLogger for ILogger
2 years ago
Martin Evans
31287b5e6e
Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options:
- Just convert it to a `string`, nice and simple
- Write the bytes to a `Span<byte>` no allocations
- Write the chars to a `StringBuilder` potentially no allocations
2 years ago
Martin Evans
0c98ae1955
Passing ctx to `llama_token_nl(_ctx)`
2 years ago
Martin Evans
826c6aaec3
cleaned up higher level code using the sampling API:
- Fixed multiple enumeration
- Fixed newline penalisation
2 years ago
Martin Evans
a911b77dec
Various minor changes, resolving about 100 ReSharper code quality warnings
2 years ago
Martin Evans
5a6c6de0dc
Merge pull request #115 from martindevans/model_params_record
ModelsParams record class
2 years ago
Martin Evans
70be6c7368
Removed `virtual` method in newly sealed class
2 years ago
Martin Evans
ebacdb666d
- Moved the lower level state get/set methods onto SafeLLamaContextHandle
- Used those methods to add a `Clone` method to SafeLLamaContextHandle
- Simplified `LLamaContext` by using the new methods
- Sealed `LLamaContext` and `LLamaEmbedder`
2 years ago
Martin Evans
93f24f8a51
Switched to properly typed `Encoding` property
2 years ago
Martin Evans
759ae26f36
Merge branch 'master' into grammar_basics
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
4738c26299
- Reduced context size of test, to speed it up
- Removed some unnecessary `ToArray` calls
- Initial pass on LLamaStatelessExecutor, the context overflow management is broken but I think I found where it's ported from
2 years ago
Martin Evans
ae8ef17a4a
- Added various convenience overloads to `LLamaContext.Eval`
- Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval, narrower type better represents what's really needed
2 years ago
Martin Evans
64416ca23c
- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`)
- Integrated grammar into sampling
- Added a test for the grammar sampling
2 years ago
Martin Evans
f5a260926f
Renamed `EmbeddingCount` to `EmbeddingSize` in higher level class
2 years ago
Martin Evans
479ff57853
Renamed `EmbeddingCount` to `EmbeddingSize`
2 years ago
Martin Evans
d0a7a8fcd6
- Cleaned up disposal in LLamaContext
- sealed some classes not intended to be extended
2 years ago
Martin Evans
4d741d24f2
Marked old `LLamaContext` constructor obsolete
2 years ago
Martin Evans
20bdc2ec6f
- Apply LoRA in `LLamaWeights.LoadFromFile`
- Sanity checking that weights are not disposed when creating a context from them
- Further simplified `Utils.InitLLamaContextFromModelParams`
2 years ago
Martin Evans
e2fe08a9a2
Added a higher level `LLamaWeights` wrapper around `SafeLlamaModelHandle`
2 years ago
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2 years ago