Martin Evans
b34f72a883
- Added `SamplingPipeline` to inference params which overrides all other options with an entirely custom pipeline.
- Added a `Sample` method to `LLamaContext` which uses a custom pipeline
- Modified all executors to use the custom pipeline if it exists
1 year ago
Martin Evans
89fef05362
This commit ( 5fe721bdbe) accidentally removed a load of stuff that it shouldn't. Fixed that.
Originally from these PRs:
- https://github.com/SciSharp/LLamaSharp/pull/263
- https://github.com/SciSharp/LLamaSharp/pull/259
2 years ago
Martin Evans
e3468d04f0
Merge pull request #277 from martindevans/feature/min_p
MinP Sampler
2 years ago
Martin Evans
d743516070
- Added support for the MinP sampler
- Cleaned up comments in implementations of `IInferenceParams`
- Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used
2 years ago
SignalRT
97006a214f
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
Martin Evans
31244ae691
Merge branch 'master' into YaRN_scaling_parameters
2 years ago
SignalRT
5fe721bdbe
Revert "Merge branch 'pr/268' into RuntimeDetection"
This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing
changes made to 9b2ca9cf8e .
2 years ago
Martin Evans
db1bc741b0
Modified `ContextSize` in parameters to be nullable. A null value means autodetect from the model.
2 years ago
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2 years ago
Martin Evans
529b06b35b
- Fixed rope frequency/base to use the values in the model by default, instead of always overriding them by default!
2 years ago
Martin Evans
c786fb0ec8
Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams`
2 years ago
Martin Evans
6a4cd506bd
Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection
2 years ago
sa_ddam213
b4b4000342
Merge branch 'master' into upstream_master
# Conflicts:
# LLama.Web/Common/ModelOptions.cs
# LLama.Web/Services/ConnectionSessionService.cs
# LLama/LLamaStatelessExecutor.cs
# LLama/LLamaWeights.cs
2 years ago
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2 years ago
sa_ddam213
e2a17d6b6f
Refactor conflicting object name SessionOptions
2 years ago
sa_ddam213
44f1b91c29
Update Web to support version 0.5.1
2 years ago
sa_ddam213
c9108f8311
Add service for managing Models and Model Contexts
2 years ago
Martin Evans
669ae47ef7
- Split parameters into two interfaces
- params contains a list of loras, instead of just one
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
2056078aef
Initial changes required for GGUF support
2 years ago
Martin Evans
93f24f8a51
Switched to properly typed `Encoding` property
2 years ago
Martin Evans
759ae26f36
Merge branch 'master' into grammar_basics
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
64416ca23c
- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`)
- Integrated grammar into sampling
- Added a test for the grammar sampling
2 years ago
Martin Evans
2c933c57a1
Fixed ModelOptions in Web project
2 years ago
sa_ddam213
bac9cba01a
InferenceParams abstractions
2 years ago
sa_ddam213
2a04e31b7d
ModelParams abstraction
2 years ago
sa_ddam213
3fec7a63c7
Add Instruct and Stateless support
2 years ago
sa_ddam213
d9fbd56f10
Strongly type connection status
2 years ago