LLamaSharp

Commit Graph

Author	SHA1	Message	Date
Martin Evans	ce4de7d607	llama_decode lock (#595 ) * Added a lock object into `SafeLlamaModelHandle` which all calls to `llama_decode` (in the `SafeLLamaContextHandle`) lock first. This prevents two contexts from running inference on the same model at the same time, which seems to be unsafe in llama.cpp. * Modified the lock to be global over _all_ inferences. This seems to be necessary (at least with the CUDA backend).	1 year ago
Martin Evans	a8ba9f05b3	March Binary Update (#565 ) * Updated binaries to llama.cpp `3ab8b3a92ede46df88bc5a2dfca3777de4a2b2b6` (build run: https://github.com/SciSharp/LLamaSharp/actions/runs/8118890586) * Added abort callback * Added properties to get/set thread count on `LLamaContext` * Fixed LLamaLogLevel numbering	1 year ago
Martin Evans	8ac1634233	Removed `llama_eval`. It is going to be completely removed in the next version of llama.cpp (#553 )	1 year ago
Martin Evans	e9d9042576	Added `Divide` to `KvAccessor`	1 year ago
Martin Evans	949861a581	- Added a `Modify` method to `Conversation`. This grants temporary access to directly modify the KV cache. - Re-implmented `Rewind` as an extension method using `Modify` internally - Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling. - Starting batch at epoch 1, this ensures that conversations (starting at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.	1 year ago
Martin Evans	c5146bac23	- Exposed KV debug view through `SafeLLamaContextHandle` - Added `KvCacheSequenceDivide` - Moved count tokens/cells methods to `SafeLLamaContextHandle`	1 year ago
Martin Evans	15a98b36d8	Updated everything to work with llama.cpp `ce32060198`	1 year ago
Martin Evans	92b9bbe779	Added methods to `SafeLLamaContextHandle` for KV cache manipulation	1 year ago
Martin Evans	36a9335588	Removed `LLamaBatchSafeHandle` (using unmanaged memory, created by llama.cpp) and replaced it with a fully managed `LLamaBatch`. Modified the `BatchedDecoding` example to use new managed batch.	1 year ago
Martin Evans	2ea2048b78	- Added a test for tokenizing just a new line (reproduce issue https://github.com/SciSharp/LLamaSharp/issues/430 ) - Properly displaying `LLamaToken` - Removed all tokenisation code in `SafeLLamaContextHandle` - just pass it all through to the `SafeLlamaModelHandle` - Improved `SafeLlamaModelHandle` tokenisation: - Renting an array, for one less allocation - Not using `&tokens[0]` to take a pointer to an array, this is redundant and doesn't work on empty arrays	1 year ago
Martin Evans	98635a0d5a	Fixed decoding of large tokens (over 16 bytes) in streaming text decoder	1 year ago
Martin Evans	402a110a3a	Merge pull request #404 from martindevans/switched_to_LLamaToken_struct LLamaToken Struct	1 year ago
Martin Evans	1e69e265b6	Moved some native methods to do with creating/destroying resources into their respective handles. There is no safe way to call most of these methods, everything must be done through through handles.	1 year ago
Martin Evans	2eb52b1630	made casts to/from int explicit, fixed places affected	1 year ago
Martin Evans	42be9b136d	Switched form using raw integers, to a `LLamaToken` struct	1 year ago
Martin Evans	f860f88c36	Code cleanup driven by R# suggestions: - Made `NativeApi` into a `static class` (it's not intended to be instantiated) - Moved `LLamaTokenType` enum out into a separate file - Made `LLamaSeqId` and `LLamaPos` into `record struct`, convenient to have equality etc	1 year ago
Martin Evans	2df3e7617e	Added a method to set the RNG seed on the context	1 year ago
Martin Evans	16ab33ba3c	Added Obsolete markings to all `Eval` overloads	2 years ago
Martin Evans	51c292ebd8	Added a safe method for `llama_get_logits_ith`	2 years ago
Martin Evans	51d4411a58	Added two new classes for detokenization tasks: - `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected. - `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens. Added tests for these classes and updated StatelessExecutor to use them. Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).	2 years ago
Martin Evans	efdf3d630c	- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens). - Built a new (hacky) `Detokenize` method which handles this	2 years ago
Martin Evans	e89ca5cc17	Fixed a few minor warnings	2 years ago
Martin Evans	9daf586ba8	Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)	2 years ago
Martin Evans	1f8c94e386	Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )	2 years ago
Martin Evans	9a0a0ae9fe	Removed cloning support	2 years ago
Martin Evans	0d40338692	Fixed out-of-context handling in stateless executor	2 years ago
Martin Evans	b306ac23dd	Added `Decode` method to `SafeLLamaContextHandle`	2 years ago
Martin Evans	ce1fc51163	Added some more native methods	2 years ago
Martin Evans	bca55eace0	Initial changes to match the llama.cpp changes	2 years ago
Martin Evans	daf09eae64	Skipping tokenization of empty strings (saves allocating an empty array every time)	2 years ago
Martin Evans	bba801f4b7	Added a property to get the KV cache size from a context	2 years ago
Martin Evans	31287b5e6e	Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options: - Just convert it to a `string`, nice and simple - Write the bytes to a `Span<byte>` no allocations - Write the chars to a `StringBuilder` potentially no allocations	2 years ago
Martin Evans	ebacdb666d	- Moved the lower level state get/set methods onto SafeLLamaContextHandle - Used those methods to add a `Clone` method to SafeLLamaContextHandle - Simplified `LLamaContext` by using the new methods - Sealed `LLamaContext` and `LLamaEmbedder`	2 years ago
Martin Evans	a9e6f21ab8	- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring! - Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around	2 years ago
Martin Evans	ae8ef17a4a	- Added various convenience overloads to `LLamaContext.Eval` - Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval, narrower type better represents what's really needed	2 years ago
Martin Evans	479ff57853	Renamed `EmbeddingCount` to `EmbeddingSize`	2 years ago
Martin Evans	d0a7a8fcd6	- Cleaned up disposal in LLamaContext - sealed some classes not intended to be extended	2 years ago
Martin Evans	f3511e390f	WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving. The biggest single change is renaming `LLamaModel` to `LLamaContext`	2 years ago
Martin Evans	2b2d3af26b	Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle`	2 years ago
Martin Evans	0e5e00e300	Moved `TokenToString` from Utils into `SafeLLamaContextHandle` (thin wrappers around the same method in `SafeLlamaModelHandle`)	2 years ago
Martin Evans	2d811b2603	- Moved `GetLogits` into `SafeLLamaContextHandle` - Added disposal check into `SafeLLamaContextHandle`	2 years ago
Martin Evans	cd3cf2b77d	- Moved tokenization from `Utils.Tokenize` into `SafeLLamaContextHandle.Tokenize`, one less thing in `Utils`. - Also refactored it to return an `int[]` instead of an `IEnumerable<int>`, solving the "multiple enumeration" problems at the source!	2 years ago
Martin Evans	f16aa58e12	Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts. This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction. It is built upon llama `b3f138d`, necessary DLLs are not included in this commit.	2 years ago
Yaohui Liu	0958bbac2c	feat: add get-embedding api to LLamaModel.	2 years ago
Yaohui Liu	5a79edeb51	feat: add the framework and basic usages.	2 years ago

45 Commits (ec8f83236545a1989df2f75da4e1d8d0345b0407)