Martin Evans
835958398c
- Removed the object wrappers and configurable pipeline, they can be better written in code.
- Added BaseSamplingPipeline which provides a base impl of `ISamplingPipeline`
- Added `DefaultSamplingPipeline` which mimics normal llama.cpp sampling
1 year ago
Martin Evans
33358124db
Initial pass at a new sampling pipeline
1 year ago
Rinne
1f97ad874b
Merge pull request #333 from AsakusaRinne/master
feat: allow customized search path for native library loading.
2 years ago
Rinne
ffc347a3f3
resolve comments.
2 years ago
Rinne
b05c3154f4
feat: allow customized search path for native library loading.
2 years ago
Rinne
934358a7b3
Merge branch 'master' of github.com:AsakusaRinne/LLamaSharp into fix_chinese
2 years ago
Rinne
217c67b757
fix: Chinese encoding error.
2 years ago
Martin Evans
a3614f6747
Added `native/` back into path prefix
2 years ago
Martin Evans
77003d763e
Added new symbols from llama.h
2 years ago
Martin Evans
37466956c7
Added new binaries.
- Built by this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6921572568
- commit: `e937066420b79a757bf80e9836eb12b88420a218`
- Rearranged paths
2 years ago
Martin Evans
48c5039054
Improved test coverage. Discovered some issues:
FixedSizeQueue:
- Enqueue would always stop one short of filling the capacity
- Fill would only _replace_ existing items. It was only used in a place where there were no existing items! Removed the method entirely.
LLamaGrammarElement:
- Converted into a `record` struct, removed all of the (now unnecessary) equality stuff.
2 years ago
Martin Evans
c517cc18a2
Merge pull request #304 from martindevans/obsolete_attribute_eval
Added Obsolete markings to all `Eval` overloads
2 years ago
Martin Evans
16ab33ba3c
Added Obsolete markings to all `Eval` overloads
2 years ago
Martin Evans
0e51badb38
Exposed `progress_callback` in `LLamaModelParams` (although not in the higher-level API)
2 years ago
Martin Evans
1970023ef4
Merge pull request #292 from martindevans/dotnet8.0
dotnet8.0
2 years ago
Martin Evans
89fef05362
Commit 5fe721bdbe accidentally removed a load of stuff that it shouldn't have. Fixed that.
Originally from these PRs:
- https://github.com/SciSharp/LLamaSharp/pull/263
- https://github.com/SciSharp/LLamaSharp/pull/259
2 years ago
Martin Evans
e9f5dbba89
Processing AVX512 branch on all dotnet versions
2 years ago
Martin Evans
e850115b5f
Added dotnet8.0 as a build target
2 years ago
Martin Evans
b44e780b0f
Merge pull request #281 from martindevans/NativeLibraryConfig_improvements
CPU Feature Detection 2
2 years ago
Martin Evans
e3468d04f0
Merge pull request #277 from martindevans/feature/min_p
MinP Sampler
2 years ago
Martin Evans
a9d1f6cb47
- Renamed `NativeLibraryConfig.Default` to `NativeLibraryConfig.Instance`. It's not default any more as soon as you call `WithX`!
- Using `Lazy<T>` to initialize it automatically.
- Added in `AVX512` support for all dotnet versions (but not autodetected).
- Added in AVX version auto detection.
2 years ago
Rinne
da6718c387
docs: adjust some descriptions.
2 years ago
Yaohui Liu
d7675f7936
Merge branch 'master' of github.com:AsakusaRinne/LLamaSharp into cuda_detection
2 years ago
Martin Evans
d743516070
- Added support for the MinP sampler
- Cleaned up comments in implementations of `IInferenceParams`
- Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used
2 years ago
Yaohui Liu
cb5fb210b1
feat: optimize apis for cuda feature detection.
2 years ago
SignalRT
97006a214f
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
Yaohui Liu
bbbfbd20b5
fix: cannot load library under some conditions.
2 years ago
Martin Evans
31244ae691
Merge branch 'master' into YaRN_scaling_parameters
2 years ago
SignalRT
7691f83516
Test build and nuget packages
2 years ago
Yaohui Liu
d03e1dbe30
feat: support cuda feature detection.
2 years ago
SignalRT
5fe721bdbe
Revert "Merge branch 'pr/268' into RuntimeDetection"
This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing
changes made to 9b2ca9cf8e.
2 years ago
SignalRT
200011e186
Revert "Merge feat: add detection template for cuda and avx. #268"
This reverts commit b4b3ea9d99.
2 years ago
SignalRT
b4b3ea9d99
Merge feat: add detection template for cuda and avx. #268
Just merge cuda and avx detection and change layout.
2 years ago
Yaohui Liu
b893c6f609
feat: add detection template for cuda and avx.
2 years ago
Martin Evans
db1bc741b0
Modified `ContextSize` in parameters to be nullable. A null value means autodetect from the model.
2 years ago
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2 years ago
SignalRT
46fb472d42
Align with llama.cpp b1488
2 years ago
Martin Evans
a03fdc4818
Using a reference to an array instead of pointer arithmetic. This means it will benefit from bounds checking on the array.
2 years ago
Martin Evans
08c29d52c5
Slightly refactored `SafeLLamaGrammarHandle.Create` to solve CodeQL warning about pointer arithmetic.
2 years ago
Martin Evans
b6d242193e
Debugging slowdown by removing some things:
- Removed all `record struct` uses in native code
- Removed usage of `readonly` in native structs
Minor fix:
- Added sequential layout to `LLamaModelQuantizeParams`
2 years ago
Martin Evans
51c292ebd8
Added a safe method for `llama_get_logits_ith`
2 years ago
Martin Evans
7e3cde4c13
Moved helper methods into `LLamaBatchSafeHandle`
2 years ago
Martin Evans
c7fdb9712c
Added binaries, built from `6961c4bd0b`
2 years ago
Martin Evans
e81b3023d5
Rewritten sampling API to be accessed through the `LLamaTokenDataArray` object
2 years ago
Martin Evans
3c5547b2b7
Reduced some uses of `NativeApi` in `BatchedDecoding` by adding some helper methods
2 years ago
Martin Evans
a024d2242e
It works!
Had to update binary to `b1426`
2 years ago
Martin Evans
8cd81251b4
initial setup
2 years ago
Martin Evans
321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
Multi GPU
2 years ago
Martin Evans
a03fe003de
Fixed decoding of text "accumulating" over time (never properly clearing buffer)
2 years ago
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2 years ago