Martin Evans
48ef3bb080
Added runtime checks that UseMemoryLock and UseMemorymap are actually supported.
1 year ago
Martin Evans
f860f88c36
Code cleanup driven by R# suggestions:
- Made `NativeApi` into a `static class` (it's not intended to be instantiated)
- Moved `LLamaTokenType` enum out into a separate file
- Made `LLamaSeqId` and `LLamaPos` into `record struct`, convenient to have equality etc
1 year ago
Martin Evans
3fc0f34cbe
Fixed some issues which were causing metadata overrides not to work (mostly importantly, converting the key was failing so all keys were null bytes and thus ignored).
1 year ago
Martin Evans
47e4fcef2a
Fixed GetString on netstandard2
1 year ago
Martin Evans
2f0deeadcd
Implemented serialization for `MetadataOverride`. Deserialization is broken (converter is never called)
1 year ago
Martin Evans
b868b056f7
Added metadata overrides to `IModelParams`
1 year ago
Martin Evans
b22d8b7495
- Added `GroupDisposable` to dispose a collection of items all together
- Renamed `LLamaModelKvOverride` to `LLamaModelMetadataOverride`
1 year ago
Martin Evans
439d14a061
Updated binaries:
- build run: https://github.com/SciSharp/LLamaSharp/actions/runs/7196891440
- commit: 9fb13f9584
1 year ago
Martin Evans
89fef05362
This commit ( 5fe721bdbe) accidentally removed a load of stuff that it shouldn't. Fixed that.
Originally from these PRs:
- https://github.com/SciSharp/LLamaSharp/pull/263
- https://github.com/SciSharp/LLamaSharp/pull/259
2 years ago
SignalRT
97006a214f
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
Martin Evans
31244ae691
Merge branch 'master' into YaRN_scaling_parameters
2 years ago
SignalRT
5fe721bdbe
Revert "Merge branch 'pr/268' into RuntimeDetection"
This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing
changes made to 9b2ca9cf8e .
2 years ago
Martin Evans
db1bc741b0
Modified `ContextSize` in parameters to be nullable. A null value means autodetect from the model.
2 years ago
Udayshankar Ravikumar
4071c1f5fc
Updated preprocessor directives
2 years ago
Udayshankar Ravikumar
df310e15da
Fixed preprocessor directives
2 years ago
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2 years ago
Udayshankar Ravikumar
1dad1ff834
Enhance framework compatibility
2 years ago
Martin Evans
529b06b35b
- Fixed rope frequency/base to use the values in the model by default, instead of always overriding them by default!
2 years ago
Martin Evans
321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
Multi GPU
2 years ago
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2 years ago
Martin Evans
6a4cd506bd
Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection
2 years ago
Martin Evans
15db194c17
Added multi GPU support
2 years ago
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2 years ago
Martin Evans
d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
Major llama.cpp API Change
2 years ago
Martin Evans
b8f0eff080
- Added `GetCharCountImpl` tests, fixed handling of empty strings
- Added ifdef to remove `Deconstruct` extension on everything except `NETSTANDARD2_0`
2 years ago
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2 years ago
Martin Evans
4e9b1f8cdc
- Split extension methods into separate files
2 years ago
Martin Evans
669ae47ef7
- Split parameters into two interfaces
- params contains a list of loras, instead of just one
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
08f1615e60
- Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated.
- Added a "spinner" to the `StatelessModeExecute` demo, which spins while waiting for the next token (demonstrating that it's not blocked).
2 years ago
Martin Evans
fe54f6764f
- Added unit tests for extension methods
- Removed unused `AddRangeSpan` extension
2 years ago
Martin Evans
d08a125020
Using the `TokensEndsWithAnyString` extensions for antiprompt checking in instruct executor. Simpler and more efficient.
2 years ago
Martin Evans
77bd090150
Simplified `LLamaInteractExecutor` antiprompt matching by using new extension method
2 years ago
Martin Evans
614ba40948
- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
- Minimal amount of characters converted
- Allocation free
- Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
- Allocation free
2 years ago
Rinne
4e83e48ad1
Merge pull request #122 from martindevans/gguf
Add GGUF support
2 years ago
Martin Evans
a70c7170dd
- Created a higher level `Grammar` class which is immutable and contains a list of grammar rules. This is the main "entry point" to the grammar system.
- Made all the mechanics of grammar parsing (GBNFGrammarParser, ParseState) internal. Just call `Grammar.Parse("whatever")`.
- Added a `GrammarRule` class which validates elements on construction (this allows constructing grammar without parsing GBNF).
- It should be impossible for a `GrammarRule` to represent an invalid rule.
2 years ago
Martin Evans
2056078aef
Initial changes required for GGUF support
2 years ago
Martin Evans
cf4754db44
Removed unnecessary parameters from some low level sampler methods
2 years ago
Martin Evans
4738c26299
- Reduced context size of test, to speed it up
- Removed some unnecessary `ToArray` calls
- Initial pass on LLamaStatelessExecutor, the context overflow management is broken but I think I found where it's ported from
2 years ago
Martin Evans
91bcefc852
comment on IModelParamsExtensions
2 years ago
Martin Evans
9cdc72aa67
Fixed `ToLlamaContextParams` using the wrong parameter for `use_mmap`
2 years ago
sa_ddam213
2d1269cae9
Access to IModelParamsExtensions
2 years ago
Martin Evans
f2499371ea
Pulled conversion of a `IModelParams` into a `LLamaContextParams` out into an extension method which can be used in other places.
2 years ago
Martin Evans
18462beb31
- Removed the `Update` and `GetOrDefault` extension methods (they were unused).
- Renamed `DictionaryExtensions` to `KeyValuePairExtensions`, since nothing in that file extends dictionary any more!
2 years ago
Yaohui Liu
00d91cf99e
refactor: some parts of code of LLamaModel.
2 years ago