LLamaSharp

Commit Graph

Author	SHA1	Message	Date
jlsantiago	3b2836eac4	Llava api (#563) * Add llava_binaries, update all binaries to make the test * Llava API + LlavaTest Preliminary * First prototype of Load + Unit Test * Temporary run test on branch LlavaAPI * Disable Embed test to review the rest of the test * Restore Embedding test * Use BatchThread to eval image embeddings Test Threads default value to ensure it doesn't produce problems. * Rename test file * Update action versions * Test only one method, no release embeddings * Revert "Test only one method, no release embeddings" This reverts commit `264e176dcc`. * Correct API call * Only test llava related functionality * Cuda and CLBlast binaries * Restore build policy * Changes related to code review * Add SafeHandles * Set overwrite to upload-artifact@v4 * Revert to upload-artifact@v3 * revert to upload-artifact@v3	1 year ago
Martin Evans	ce4de7d607	llama_decode lock (#595) * Added a lock object into `SafeLlamaModelHandle` which all calls to `llama_decode` (in the `SafeLLamaContextHandle`) lock first. This prevents two contexts from running inference on the same model at the same time, which seems to be unsafe in llama.cpp. * Modified the lock to be global over _all_ inferences. This seems to be necessary (at least with the CUDA backend).	1 year ago
Clovis Henrique Ribeiro	d0f79814e9	Added conditional compilation code to progress_callback (in LlamaModelParams struct) so the struct plays nice with legacy .NET Framework 4.8 (#593)	1 year ago
Martin Evans	f0b0bbcbb7	Mutable Logits (#586) Modified LLamaBatch to not share tokens with other sequences if logits is true. This ensures that the logit span at the end is used by exactly one sequence - therefore it's safe to mutate. This removes the need for copying _very_ large arrays (vocab size) and simplifies sampling pipelines.	1 year ago
Martin Evans	a8ba9f05b3	March Binary Update (#565) * Updated binaries to llama.cpp `3ab8b3a92ede46df88bc5a2dfca3777de4a2b2b6` (build run: https://github.com/SciSharp/LLamaSharp/actions/runs/8118890586) * Added abort callback * Added properties to get/set thread count on `LLamaContext` * Fixed LLamaLogLevel numbering	1 year ago
Martin Evans	8ac1634233	Removed `llama_eval`. It is going to be completely removed in the next version of llama.cpp (#553)	1 year ago
Martin Evans	f0e7e7cc0a	Removed `SamplingApi`. It has been marked as Obsolete for a while, replaced by instance methods on `LLamaTokenDataArray` (#552)	1 year ago
Martin Evans	7d84625a67	Classifier Free Guidance (#536) * Added a `Guidance` method to `LLamaTokenDataArray` which applies classifier free guidance * Factored out a safer `llama_sample_apply_guidance` method based on spans * Created a guided sampling demo using the batched executor * fixed comment, "classifier free" not "context free" * Rebased onto master and fixed breakage due to changes in `BaseSamplingPipeline` * Asking user for guidance weight * Progress bar in batched fork demo * Improved fork example (using tree display) * Added proper disposal of resources in batched examples * Added some more comments in BatchedExecutorGuidance	1 year ago
Scott W Harden	a6394001a1	NativeLibraryConfig: WithLogs(LLamaLogLevel) (#529) Adds a NativeLibraryConfig.WithLogs() overload to let the user indicate the log level (with "info" as the default)	1 year ago
Martin Evans	c7d0dc915a	Assorted small changes to clean up some code warnings	1 year ago
Martin Evans	e9d9042576	Added `Divide` to `KvAccessor`	1 year ago
Martin Evans	949861a581	- Added a `Modify` method to `Conversation`. This grants temporary access to directly modify the KV cache. - Re-implemented `Rewind` as an extension method using `Modify` internally - Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling. - Starting batch at epoch 1, this ensures that conversations (starting at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.	1 year ago
Martin Evans	b0acecf080	Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix). Conversations can be "forked", to create a copy of a conversation at a given point. This allows e.g. prompting a conversation with a system prefix just once and then forking it again and again for each individual conversation. Conversations can also be "rewound" to an earlier state. Added two new examples, demonstrating forking and rewinding.	1 year ago
Martin Evans	90915c5a99	Added increment and decrement operators to `LLamaPos`	1 year ago
Martin Evans	c5146bac23	- Exposed KV debug view through `SafeLLamaContextHandle` - Added `KvCacheSequenceDivide` - Moved count tokens/cells methods to `SafeLLamaContextHandle`	1 year ago
Martin Evans	15a98b36d8	Updated everything to work with llama.cpp `ce32060198`	1 year ago
Martin Evans	5da2a2f64b	- Removed one of the constructors of `SafeLLamaHandleBase`, which implicitly states that memory is owned. Better to be explicit about this kind of thing! - Also fixed `ToString()` in `SafeLLamaHandleBase`	1 year ago
Jason Couture	ec59c5bf9e	Fix missing library name prefix for cuda	1 year ago
Jason Couture	443ce4fff4	While the dllimport changes work, manual path searching needed to be updated	1 year ago
Jason Couture	db7e1e88f8	Use llama instead of libllama in `[DllImport]` This results in windows users not needing to rename the DLL. This allows native llama builds to be dropped in, even on windows. I also took the time to update the documentation, removing references to renaming the files, since the names now match. Fixes #463	1 year ago
Martin Evans	92b9bbe779	Added methods to `SafeLLamaContextHandle` for KV cache manipulation	1 year ago
Martin Evans	96c26c25f5	Merge pull request #445 from martindevans/stateless_executor_llama_decode Swapped `StatelessExecutor` to use `llama_decode`!	1 year ago
Martin Evans	9fe878ae1f	- Fixed example - Growing more than double, if necessary	1 year ago
Martin Evans	9ede1bedc2	Automatically growing batch n_seq_max when exceeded. This means no parameters need to be picked when the batch is created.	1 year ago
Martin Evans	a2e29d393c	Swapped `StatelessExecutor` to use `llama_decode`! - Added `logits_i` argument to `Context.ApplyPenalty` - Added a new exception type for `llama_decode` return code	1 year ago
Martin Evans	99969e538e	- Removed some unused `eval` methods. - Added a `DecodeAsync` overload which runs the work in a task - Replaced some `NativeHandle` usage in `BatchedDecoding` with higher level equivalents. - Made the `LLamaBatch` grow when token capacity is exceeded, removing the need to manage token capacity externally.	1 year ago
Martin Evans	36a9335588	Removed `LLamaBatchSafeHandle` (using unmanaged memory, created by llama.cpp) and replaced it with a fully managed `LLamaBatch`. Modified the `BatchedDecoding` example to use new managed batch.	1 year ago
Martin Evans	1472704e12	Added a test with examples of troublesome strings from 0.9.1	1 year ago
Martin Evans	73172bbaba	Merge pull request #438 from martindevans/cleanup_model_unnecessary_unsafe Model Metadata Loading Cleanup	1 year ago
Martin Evans	ce1d302e7e	Moved some native methods into `SafeLlamaModelHandle`, these methods are all wrapped in safer accessors with no extra costs so there is no need to expose them.	1 year ago
Martin Evans	1e86755071	- Removed unnecessary `unsafe` block in model metadata loading - Clarified comments on native metadata loading methods	1 year ago
Martin Evans	de2b20aae5	- Added a specific exception for failing to load model weights. - Checking if model is readable	1 year ago
Martin Evans	096e0e75f8	Check that the model file actually exists immediately before loading it. Improve #395	1 year ago
Martin Evans	2ea2048b78	- Added a test for tokenizing just a new line (reproduce issue https://github.com/SciSharp/LLamaSharp/issues/430) - Properly displaying `LLamaToken` - Removed all tokenisation code in `SafeLLamaContextHandle` - just pass it all through to the `SafeLlamaModelHandle` - Improved `SafeLlamaModelHandle` tokenisation: - Renting an array, for one less allocation - Not using `&tokens[0]` to take a pointer to an array, this is redundant and doesn't work on empty arrays	1 year ago
Martin Evans	98635a0d5a	Fixed decoding of large tokens (over 16 bytes) in streaming text decoder	1 year ago
Martin Evans	402a110a3a	Merge pull request #404 from martindevans/switched_to_LLamaToken_struct LLamaToken Struct	1 year ago
Martin Evans	1e69e265b6	Moved some native methods to do with creating/destroying resources into their respective handles. There is no safe way to call most of these methods, everything must be done through handles.	1 year ago
Martin Evans	82727c4414	Removed collection expressions from test	1 year ago
Martin Evans	2eb52b1630	made casts to/from int explicit, fixed places affected	1 year ago
Martin Evans	42be9b136d	Switched from using raw integers, to a `LLamaToken` struct	1 year ago
Martin Evans	4e5e994dda	- directly returning a SafeLlamaModelHandle, instead of an IntPtr which is wrapped in a handle. - made `llama_backend_init` private. This is automatically called, there is no way it can correctly be used externally. - made `llama_token_to_piece` safe (Span instead of pointer)	1 year ago
Martin Evans	bac3e43498	Fixed handling of empty spans	1 year ago
Martin Evans	c002642268	- Removed some `unsafe` where it wasn't necessary - Wrapped some native functions which take (pointer, length) in functions which take a `span` instead.	1 year ago
Martin Evans	f860f88c36	Code cleanup driven by R# suggestions: - Made `NativeApi` into a `static class` (it's not intended to be instantiated) - Moved `LLamaTokenType` enum out into a separate file - Made `LLamaSeqId` and `LLamaPos` into `record struct`, convenient to have equality etc	1 year ago
Martin Evans	2cded1b296	Fixed alignment of value fields in `LLamaModelMetadataOverride`	1 year ago
Martin Evans	6be3f62321	Fixed loading of very large metadata values (over 1kb)	1 year ago
Martin Evans	fb606c2488	Fixed incorrect values	1 year ago
Martin Evans	47e4fcef2a	Fixed GetString on netstandard2	1 year ago
Martin Evans	2a1e1b6183	Removed unused imports	1 year ago
Martin Evans	a2bae178fa	Added a `Metadata` property to `LLamaWeights`	1 year ago


218 Commits (experimental_cpp)