Martin Evans
e3468d04f0
Merge pull request #277 from martindevans/feature/min_p
MinP Sampler
2 years ago
Rinne
da6718c387
docs: adjust some descriptions.
2 years ago
Yaohui Liu
d7675f7936
Merge branch 'master' of github.com:AsakusaRinne/LLamaSharp into cuda_detection
2 years ago
Martin Evans
d743516070
- Added support for the MinP sampler
- Cleaned up comments in implementations of `IInferenceParams`
- Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used
2 years ago
Yaohui Liu
4d2c5f1003
build: change nuget configuration for cuda detection.
2 years ago
Yaohui Liu
cb5fb210b1
feat: optimize apis for cuda feature detection.
2 years ago
SignalRT
97006a214f
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
Yaohui Liu
bbbfbd20b5
fix: cannot load library under some conditions.
2 years ago
Martin Evans
31244ae691
Merge branch 'master' into YaRN_scaling_parameters
2 years ago
SignalRT
7691f83516
Test build and nuget packages
2 years ago
Yaohui Liu
d03e1dbe30
feat: support cuda feature detection.
2 years ago
SignalRT
fb95bbb4e0
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
SignalRT
5fe721bdbe
Revert "Merge branch 'pr/268' into RuntimeDetection"
This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing
changes made to 9b2ca9cf8e.
2 years ago
SignalRT
200011e186
Revert "Merge feat: add detection template for cuda and avx. #268"
This reverts commit b4b3ea9d99.
2 years ago
Rinne
47e016743e
Merge pull request #266 from philippjbauer/master
Prevent duplication of user prompts / chat history in ChatSession.
2 years ago
SignalRT
b4b3ea9d99
Merge feat: add detection template for cuda and avx. #268
Just merge cuda and avx detection and change layout.
2 years ago
SignalRT
091b8d58b3
Merge branch 'pr/268' into RuntimeDetection
2 years ago
Yaohui Liu
b893c6f609
feat: add detection template for cuda and avx.
2 years ago
Philipp Bauer
d2b544afb8
Improved method return description
2 years ago
Philipp Bauer
6ea40d1546
Use full history only when the ChatSession runs the first time
2 years ago
SignalRT
0edbd92530
Change nuget backend packages
Delete Backend.Metal because it is no longer needed.
Do not include .metal in x86_64 binaries
2 years ago
Martin Evans
db1bc741b0
Modified `ContextSize` in parameters to be nullable. A null value means autodetect from the model.
2 years ago
Udayshankar Ravikumar
4071c1f5fc
Updated preprocessor directives
2 years ago
Philipp Bauer
a288e7c02b
Prevent duplication of user prompts / chat history in ChatSession.
The way ChatSession.ChatAsync was using the provided methods
from a IHistoryTransform interface implementation created unexpected
duplication of the chat history messages. It also prevented loading
previous history into the current session.
2 years ago
SignalRT
b67198c6ef
MacOS Intel Disable METAL
2 years ago
Udayshankar Ravikumar
df310e15da
Fixed preprocessor directives
2 years ago
SignalRT
e64b9057d7
Merge branch 'RuntimeDetection' of https://github.com/SignalRT/LLamaSharp into RuntimeDetection
2 years ago
SignalRT
d1244332ed
MacOS Runtime detection and classification
Create different paths to different MacOS platforms.
Dynamically load the right library
2 years ago
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2 years ago
Udayshankar Ravikumar
1dad1ff834
Enhance framework compatibility
2 years ago
SignalRT
e1a89a8b0a
Added all binaries from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6762323560
Add the MacOS binary from the same run
2 years ago
Martin Evans
11d8c55db7
Added all binaries from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6762323560 (132d25b8a6)
2 years ago
SignalRT
46fb472d42
Align with llama.cpp b1488
2 years ago
Martin Evans
a03fdc4818
Using a reference to an array instead of pointer arithmetic. This means it will benefit from bounds checking on the array.
2 years ago
Martin Evans
08c29d52c5
Slightly refactored `SafeLLamaGrammarHandle.Create` to solve CodeQL warning about pointer arithmetic.
2 years ago
Yaohui Liu
0e139d4ee2
fix: add arm binaries to cpu nuspec.
2 years ago
Yaohui Liu
7ee27d2f99
fix: binary not copied on Mac platform.
2 years ago
Martin Evans
db8f3980ea
New binaries from this commit: 207b51900e
Should fix the extreme speed loss.
2 years ago
Martin Evans
b6d242193e
Debugging slowdown by removing some things:
- Removed all `record struct` uses in native code
- Removed usage of `readonly` in native structs
Minor fix:
- Added sequential layout to `LLamaModelQuantizeParams`
2 years ago
Martin Evans
529b06b35b
- Fixed rope frequency/base to use the values in the model by default, instead of always overriding them by default!
2 years ago
Martin Evans
dcc82e582e
Fixed `Eval` on platforms < dotnet 5
2 years ago
Martin Evans
51c292ebd8
Added a safe method for `llama_get_logits_ith`
2 years ago
Martin Evans
7e3cde4c13
Moved helper methods into `LLamaBatchSafeHandle`
2 years ago
Martin Evans
ccb8afae46
Cleaned up stateless executor as preparation for changing it to use the new batched decoding system.
2 years ago
Martin Evans
c786fb0ec8
Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams`
2 years ago
Martin Evans
c7fdb9712c
Added binaries, built from `6961c4bd0b`
2 years ago
Martin Evans
e81b3023d5
Rewrote the sampling API so that it is accessed through the `LLamaTokenDataArray` object
2 years ago
Martin Evans
3c5547b2b7
Reduced some uses of `NativeApi` in `BatchedDecoding` by adding some helper methods
2 years ago
Martin Evans
b38e3f6fe2
binaries (avx512)
2 years ago
Martin Evans
a024d2242e
It works!
had to update binary to `b1426`
2 years ago