Rinne
4f44e3b198
refactor: init some refactorings for experiment.
1 year ago
Martin Evans
b34f72a883
- Added `SamplingPipeline` to inference params which overrides all other options with an entirely custom pipeline.
- Added a `Sample` method to `LLamaContext` which uses a custom pipeline
- Modified all executors to use the custom pipeline if it exists
1 year ago
Martin Evans
e47431ed80
Modified `TensorSplitsCollection` so it accepts any number of splits, as long as it doesn't exceed the number of supported devices
2 years ago
Martin Evans
89fef05362
This commit (5fe721bdbe) accidentally removed a load of stuff that it shouldn't have. Fixed that.
Originally from these PRs:
- https://github.com/SciSharp/LLamaSharp/pull/263
- https://github.com/SciSharp/LLamaSharp/pull/259
2 years ago
Martin Evans
e3468d04f0
Merge pull request #277 from martindevans/feature/min_p
MinP Sampler
2 years ago
Martin Evans
d743516070
- Added support for the MinP sampler
- Cleaned up comments in implementations of `IInferenceParams`
- Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used
2 years ago
SignalRT
97006a214f
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
Martin Evans
31244ae691
Merge branch 'master' into YaRN_scaling_parameters
2 years ago
SignalRT
5fe721bdbe
Revert "Merge branch 'pr/268' into RuntimeDetection"
This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing
changes made to 9b2ca9cf8e.
2 years ago
Martin Evans
db1bc741b0
Modified `ContextSize` in parameters to be nullable. A null value means autodetect from the model.
2 years ago
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2 years ago
Martin Evans
529b06b35b
- Fixed rope frequency/base to use the values in the model by default, instead of always overriding them!
2 years ago
Martin Evans
c786fb0ec8
Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams`
2 years ago
Martin Evans
f621ec67e8
Fixed serialization
2 years ago
Martin Evans
768747c652
spelling
2 years ago
Martin Evans
b4e7f64e76
Added System.Text.Json serialization for `TensorSplitsCollectionConverter`
2 years ago
Martin Evans
6a4cd506bd
Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection
2 years ago
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2 years ago
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2 years ago
Martin Evans
669ae47ef7
- Split parameters into two interfaces
- params contains a list of loras, instead of just one
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
3f80190f85
Minimal changes required to remove non-async inference.
2 years ago
Martin Evans
b47977300a
Removed one more unused parameter
2 years ago
Martin Evans
a1b0349561
Removed `ModelAlias` property (unused)
2 years ago
Martin Evans
d79a6556a1
Removed 3 unused properties of `InferenceParams`
2 years ago
Martin Evans
2056078aef
Initial changes required for GGUF support
2 years ago
Martin Evans
a911b77dec
Various minor changes, resolving about 100 ReSharper code quality warnings
2 years ago
Martin Evans
93f24f8a51
Switched to properly typed `Encoding` property
2 years ago
Martin Evans
2830e5755c
- Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
- Deleted `NativeInfo` (internal class, not used anywhere)
2 years ago
Martin Evans
759ae26f36
Merge branch 'master' into grammar_basics
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
64416ca23c
- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`)
- Integrated grammar into sampling
- Added a test for the grammar sampling
2 years ago
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2 years ago
Martin Evans
685eb3b9c2
Replaced `nint` with `float[]?` in Model params, which is much more user friendly!
2 years ago
sa_ddam213
bac9cba01a
InferenceParams abstractions
2 years ago
sa_ddam213
2a04e31b7d
ModelParams abstraction
2 years ago
Martin Evans
2e76b79af6
Various minor XML docs fixes
2 years ago
Marcel
65925eac4f
Added documentation for the interfaces
2 years ago
Marcel
b911b2548b
move interfaces into abstractions folder
2 years ago
Yaohui Liu
3bf74ec9b9
feat: add chat session for refactored code.
2 years ago
Yaohui Liu
908b79e855
feat: add stateless executor.
2 years ago
Yaohui Liu
e603a09137
fix: state loading and saving not working.
2 years ago
Yaohui Liu
5679e08718
feat: add ILLamaExecutor.InferAsync.
2 years ago
Yaohui Liu
264fb9a706
refactor: LLamaModel and LLamaExecutor.
2 years ago