* Updated binaries, using [this build](https://github.com/SciSharp/LLamaSharp/actions/runs/8654672719/job/23733195669) for llama.cpp commit `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`.
- Added all new functions.
  - Moved some functions (e.g. `SafeLlamaModelHandle`-specific functions) into `SafeLlamaModelHandle.cs`.
- Exposed tokens on `SafeLlamaModelHandle` and `LLamaWeights` through a `Tokens` property. As new special tokens are added in the future they can be added here.
  - Changed all token properties to return nullable tokens, to handle models which lack some tokens (see the sketch after this list).
  - Fixed `DefaultSamplingPipeline` to handle models that have no newline token.
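As a usage sketch of the nullable token properties (the `Tokens.Newline` property name is taken from these notes; treat the details as assumptions rather than a confirmed API surface):

```csharp
// Sketch: reading a (now nullable) special token through the new `Tokens` property.
using System;
using LLama;
using LLama.Native;

static class TokenInspection
{
    public static void PrintNewlineToken(LLamaWeights model)
    {
        // Nullable: some models simply do not define this token.
        LLamaToken? newline = model.Tokens.Newline;

        if (newline is null)
            Console.WriteLine("Model has no newline token; samplers must handle its absence.");
        else
            Console.WriteLine($"Newline token: {newline.Value}");
    }
}
```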
* Moved native methods to more specific locations.
  - Context-specific things have been moved into `SafeLLamaContextHandle.cs` and made private - they're already exposed through C# properties and methods.
- Checking that GPU layer count is zero if GPU offload is not supported.
- Moved methods for creating default structs (`llama_model_quantize_default_params` and `llama_context_default_params`) into relevant structs.
* Removed exception if `GpuLayerCount > 0` when GPU is not supported.
* Added low-level wrapper methods for new per-sequence state load/save in `SafeLLamaContextHandle`
  - Added high-level wrapper methods (save/load with `State` object or memory mapped file) in `LLamaContext` (see the sketch after this list)
- Moved native methods for per-sequence state load/save into `SafeLLamaContextHandle`
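A rough sketch of the high-level API; the exact overload shapes (e.g. `SaveState`/`LoadState` taking a `LLamaSeqId`) are assumptions inferred from these notes:

```csharp
// Sketch: save and restore the state of a single sequence.
// The overloads shown here (taking a LLamaSeqId) are assumed shapes.
using LLama;
using LLama.Native;

static class SequenceStateExample
{
    public static void SaveAndRestore(LLamaContext context, LLamaSeqId sequence)
    {
        // Save only this sequence's state to a memory mapped file...
        context.SaveState("sequence.state", sequence);

        // ...and later load it back into the same sequence.
        context.LoadState("sequence.state", sequence);
    }
}
```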
* Added update and defrag methods for KV cache in `SafeLLamaContextHandle`
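A sketch of how these might be called; the method names are assumptions mirroring the native `llama_kv_cache_defrag`/`llama_kv_cache_update` calls:

```csharp
// Sketch: schedule a KV cache defragmentation, then apply all pending cache
// operations in one update pass (method names assumed).
using LLama.Native;

static class KvCacheExample
{
    public static void Defragment(SafeLLamaContextHandle ctx)
    {
        ctx.KvCacheDefrag();  // mark the cache for defragmentation
        ctx.KvCacheUpdate();  // apply pending updates (defrag, shifts, etc.)
    }
}
```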
* Updated submodule to `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`
* Passing the sequence ID when saving a single sequence's state
* Added `NativeLogConfig` which allows overriding the llama.cpp log callback
- Delaying binding of this into llama.cpp until after `NativeLibraryConfig` has loaded
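A minimal sketch of registering the override (the exact `NativeLogConfig` overloads are assumed from these notes; the `ILogger` form is the redirection described further below):

```csharp
// Sketch: override the llama.cpp log callback before any native calls are made.
using LLama.Native;
using Microsoft.Extensions.Logging;

static class LoggingSetupExample
{
    public static void Configure(ILoggerFactory loggerFactory)
    {
        // Binding into llama.cpp itself is deferred until NativeLibraryConfig
        // has actually loaded the native library.
        NativeLogConfig.llama_log_set(loggerFactory.CreateLogger("llama.cpp"));
    }
}
```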
* Using the log callback to show log messages during loading.
* Registering log callbacks before any calls to llama.cpp except `llama_empty_call`; this method is specifically chosen because it does nothing and exists only to trigger DLL loading.
* Removed much of the complexity of logging from `NativeApi.Load`. It always calls whatever log callbacks you have registered.
  - Removed the alternative `ILogger` path in `NativeLibraryConfig`; instead it redirects by wrapping the logger in a delegate.
* Saving a GC handle to keep the log callback alive
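This is the standard .NET pattern for callbacks handed to native code: if nothing managed references the delegate, the GC may collect it while native code still holds the function pointer. A generic sketch (not the library's exact code):

```csharp
// Generic sketch: keep a native callback delegate alive with a GCHandle.
using System;
using System.Runtime.InteropServices;

static class CallbackHolder
{
    private static GCHandle _handle;

    public static void Hold(Delegate callback)
    {
        // Free any previously held callback before replacing it.
        if (_handle.IsAllocated)
            _handle.Free();

        // A normal (non-pinning) handle is enough: it keeps the delegate
        // reachable so the GC cannot collect it while native code uses it.
        _handle = GCHandle.Alloc(callback);
    }
}
```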
* Removed the message prefix; the logger should already add one.
* Buffering messages until a newline is encountered before passing the log message to the `ILogger`.
* Added trailing `\n` to log messages from loading.
- Using `ThreadLocal<StringBuilder>` to ensure messages from separate threads don't get mixed together.
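A self-contained sketch of that buffering technique (the real implementation may differ): each thread appends into its own `StringBuilder`, and only complete lines are forwarded to the `ILogger`.

```csharp
// Sketch: newline-buffered logging with per-thread buffers (assumed shape,
// not the library's exact implementation).
using System.Text;
using System.Threading;
using Microsoft.Extensions.Logging;

class BufferedLogWriter
{
    private readonly ILogger _logger;
    private readonly ThreadLocal<StringBuilder> _buffer = new(() => new StringBuilder());

    public BufferedLogWriter(ILogger logger) => _logger = logger;

    public void Write(string fragment)
    {
        var sb = _buffer.Value!;
        sb.Append(fragment);

        // Flush complete lines; keep any trailing partial line buffered.
        int newline;
        while ((newline = IndexOfNewline(sb)) >= 0)
        {
            _logger.LogInformation("{Message}", sb.ToString(0, newline));
            sb.Remove(0, newline + 1);
        }
    }

    private static int IndexOfNewline(StringBuilder sb)
    {
        for (var i = 0; i < sb.Length; i++)
            if (sb[i] == '\n')
                return i;
        return -1;
    }
}
```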
This commit was originally made by lcarrere in https://github.com/SciSharp/LLamaSharp/issues/180. I have confirmed this modification works on my Windows 11 laptop, and made this commit at the request of AsakusaRinne.
- Re-implemented `Rewind` as an extension method using `Modify` internally (see the sketch after this list)
- Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling.
- Starting the batch at epoch 1 ensures that conversations (which start at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.
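An illustrative sketch of `Rewind` built on `Modify`; the callback signature and the KV accessor's `Remove` shape here are assumptions based on these notes, not a confirmed API:

```csharp
// Illustrative sketch: Rewind as an extension method over a Modify-style API.
// The Modify callback signature and kv.Remove(start, count) are assumed shapes.
using LLama.Batched;

public static class ConversationRewindExtensions
{
    public static void Rewind(this Conversation conversation, int tokens)
    {
        conversation.Modify((end, kv) =>
        {
            // Drop the last `tokens` tokens from the KV cache
            // (start position, count - assumed signature)...
            kv.Remove(end.Value - tokens, tokens);

            // ...and report the new end position of the conversation.
            return end.Value - tokens;
        });
    }
}
```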
- Made `llama_backend_init` private. It is called automatically; there is no way it can correctly be used externally.
- Made `llama_token_to_piece` safe (takes a `Span` instead of a pointer); see the sketch after this list
- Made `NativeApi` into a `static class` (it's not intended to be instantiated)
- Moved `LLamaTokenType` enum out into a separate file
- Made `LLamaSeqId` and `LLamaPos` into `record struct`s, which conveniently provides equality comparisons etc.
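The Span-safe wrapper follows a common P/Invoke pattern; a generic sketch with simplified names and signature (not the library's exact code):

```csharp
// Generic sketch: wrapping a pointer-based native call with a safe Span overload.
using System;
using System.Runtime.InteropServices;

static unsafe class NativeWrapper
{
    // Raw native signature: writes up to `length` bytes into `buffer` and
    // returns the number of bytes written (illustrative, simplified).
    [DllImport("llama")]
    private static extern int llama_token_to_piece(IntPtr model, int token, byte* buffer, int length);

    // Safe overload: callers pass a Span instead of a raw pointer.
    public static int TokenToPiece(IntPtr model, int token, Span<byte> dest)
    {
        fixed (byte* ptr = dest)
            return llama_token_to_piece(model, token, ptr, dest.Length);
    }
}
```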