LLamaSharp

Commit Graph

Author	SHA1	Message	Date
Yaohui Liu	d03e1dbe30	feat: support cuda feature detection.	2 years ago
SignalRT	5fe721bdbe	Revert "Merge branch 'pr/268' into RuntimeDetection" This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing changes made to `9b2ca9cf8e`.	2 years ago
SignalRT	200011e186	Revert "Merge feat: add detection template for cuda and avx. #268" This reverts commit `b4b3ea9d99`.	2 years ago
SignalRT	b4b3ea9d99	Merge feat: add detection template for cuda and avx. #268 Just merge cuda and avx detection and change layout.	2 years ago
Yaohui Liu	b893c6f609	feat: add detection template for cuda and avx.	2 years ago
Martin Evans	c7fdb9712c	Added binaries, built from `6961c4bd0b`	2 years ago
Martin Evans	a024d2242e	It works! had to update binary to `b1426`	2 years ago
Martin Evans	8cd81251b4	initial setup	2 years ago
Martin Evans	15db194c17	Added multi GPU support	2 years ago
Martin Evans	e89ca5cc17	Fixed a few minor warnings	2 years ago
Martin Evans	1f8c94e386	Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )	2 years ago
Martin Evans	0d40338692	Fixed out-of-context handling in stateless executor	2 years ago
Martin Evans	9e958e896b	safe handle for batch	2 years ago
Martin Evans	ce1fc51163	Added some more native methods	2 years ago
Martin Evans	bca55eace0	Initial changes to match the llama.cpp changes	2 years ago
Haiping	10678a83d6	Merge pull request #65 from martindevans/alternative_dependency_loading CPU Feature Detection	2 years ago
sa_ddam213	09d8f434f2	Extract LLamaLogLevel, Remove Logger class	2 years ago
Martin Evans	8f58a40fb9	Added Linux dependency loading	2 years ago
Martin Evans	dd4957471f	Changed paths to match what the GitHub build action produces	2 years ago
Martin Evans	756a1ad0ba	Added a new way to load dependencies, performing CPU feature detection	2 years ago
Martin Evans	bcf06e2652	Added some comments on various native methods	2 years ago
Martin Evans	2022b82947	Added binaries generated by this action: https://github.com/SciSharp/LLamaSharp/actions/runs/6002797872/job/16279896150 Based on this version: `6b73ef1201`	2 years ago
Martin Evans	0c98ae1955	Passing ctx to `llama_token_nl(_ctx)`	2 years ago
Martin Evans	6ffa28f964	Removed `LLAMA_MAX_DEVICES` (not used)	2 years ago
Martin Evans	2056078aef	Initial changes required for GGUF support	2 years ago
Martin Evans	829f32b27d	- Added `Obsolete` attributes to the entire `OldVersion` namespace, so they can be removed in the future - Minor changes to cleanup some of the compiler warnings	2 years ago
Martin Evans	d7f971fc22	Improved `NativeApi` file a bit: - Added some more comments - Modified `llama_tokenize` to not allocate - Modified `llama_tokenize_native` to take a pointer instead of an array, allowing use with no allocations - Removed GgmlInitParams (not used)	2 years ago
sa_ddam213	726987b761	Add native logging output	2 years ago
Martin Evans	2b2d3af26b	Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle`	2 years ago
Martin Evans	2d811b2603	- Moved `GetLogits` into `SafeLLamaContextHandle` - Added disposal check into `SafeLLamaContextHandle`	2 years ago
Martin Evans	cd3cf2b77d	- Moved tokenization from `Utils.Tokenize` into `SafeLLamaContextHandle.Tokenize`, one less thing in `Utils`. - Also refactored it to return an `int[]` instead of an `IEnumerable<int>`, solving the "multiple enumeration" problems at the source!	2 years ago
Yaohui Liu	bb46a990d0	fix: add bug info for native api.	2 years ago
Martin Evans	afb9d24f3a	Added model `Tokenize` method	2 years ago
Martin Evans	369c915afe	Added TokenToString conversion on model handle	2 years ago
Martin Evans	b721072aa5	Exposed some extra model properties on safe handle	2 years ago
Martin Evans	f16aa58e12	Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts. This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction. It is built upon llama `b3f138d`, necessary DLLs are not included in this commit.	2 years ago
Rinne	c5e8b3eba2	Merge pull request #56 from martindevans/memory_mapped_save_loading_and_saving Memory Mapped LoadState/SaveState	2 years ago
Rinne	1b0523f630	Merge branch 'master' into master	2 years ago
Martin Evans	4d72420a04	Replaced `SaveState` and `LoadState` implementations. These new implementations map the file into memory and then pass the pointer directly into the native API. This improves things in two ways: - A C# array cannot exceed 2,147,483,591 bytes. In my own use of LlamaSharp I encountered this limit. - This saves an extra copy of the entire state data into a C# `byte[]`, so it should be faster. This does _not_ fix some other places where `GetStateData` is used. I'll look at those in a separate PR.	2 years ago
SignalRT	56a37a0d7d	Update to latest llama.cpp Adapt the interface change in llama_backend_init	2 years ago
unknown	dba866ffcf	Update API method name	2 years ago
Yaohui Liu	1062fe1a7e	feat: upgrade the native libraries.	2 years ago
Yaohui Liu	9850417a12	feat: update quantize native params.	2 years ago
Yaohui Liu	3bf74ec9b9	feat: add chat session for refactored code.	2 years ago
Yaohui Liu	264fb9a706	refactor: LLamaModel and LLamaExecutor.	2 years ago
Yaohui Liu	3a62f087fe	fix: encoding error when using other languages.	2 years ago
Yaohui Liu	18c2ff2395	refactor: instruct mode and examples.	2 years ago
Yaohui Liu	55d5a8ae51	fix: quantization error with fp16.	2 years ago
Yaohui Liu	19979f664a	feat: support loading and saving state.	2 years ago
Yaohui Liu	4314f64b9c	feat: add check for backend package.	2 years ago

56 Commits (d03e1dbe3008b84c4263dab3ffbb12134a515439)