LLamaSharp

Commit Graph

Author	SHA1	Message	Date
ksanchez	0bbbf171ed	Refactor executors	1 year ago
Rinne	495177fd0f	fix: typos.	1 year ago
Lyrcaxis	f01c13ee54	Made special tokens included in prompts tokenize as intended (#677)	1 year ago
Martin Evans	c325ac9127	April 2024 Binary Update (#662) * Updated binaries, using [this build](https://github.com/SciSharp/LLamaSharp/actions/runs/8654672719/job/23733195669) for llama.cpp commit `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`. - Added all new functions. - Moved some functions (e.g. `SafeLlamaModelHandle` specific functions) into `SafeLlamaModelHandle.cs` - Exposed tokens on `SafeLlamaModelHandle` and `LLamaWeights` through a `Tokens` property. As new special tokens are added in the future they can be added here. - Changed all token properties to return nullable tokens, to handle some models not having some tokens. - Fixed `DefaultSamplingPipeline` to handle no newline token in some models. * Moved native methods to more specific locations. - Context specific things have been moved into `SafeLLamaContextHandle.cs` and made private - they're exposed through C# properties and methods already. - Checking that GPU layer count is zero if GPU offload is not supported. - Moved methods for creating default structs (`llama_model_quantize_default_params` and `llama_context_default_params`) into relevant structs. * Removed exception if `GpuLayerCount > 0` when GPU is not supported. * - Added low level wrapper methods for new per-sequence state load/save in `SafeLLamaContextHandle` - Added high level wrapper methods (save/load with `State` object or memory mapped file) in `LLamaContext` - Moved native methods for per-sequence state load/save into `SafeLLamaContextHandle` * Added update and defrag methods for KV cache in `SafeLLamaContextHandle` * Updated submodule to `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7` * Passing the sequence ID when saving a single sequence state	1 year ago
Zoli Somogyi	f4fad825c7	Simplifying image handling	1 year ago
Zoli Somogyi	e991e631f9	Standardizing Image Data implementation	1 year ago
SignalRT	43677c511c	Change interface to support multiple images and add the capability to render the image in the console	1 year ago
SignalRT	e8732efadd	Example InteractiveExecutor Add an Example and modifications to the interactive executor to enable Llava Models. Just a preview / demo	1 year ago
Martin Evans	a8ba9f05b3	March Binary Update (#565) * Updated binaries to llama.cpp `3ab8b3a92ede46df88bc5a2dfca3777de4a2b2b6` (build run: https://github.com/SciSharp/LLamaSharp/actions/runs/8118890586) * Added abort callback * Added properties to get/set thread count on `LLamaContext` * Fixed LLamaLogLevel numbering	1 year ago
Martin Evans	8ac1634233	Removed `llama_eval`. It is going to be completely removed in the next version of llama.cpp (#553)	1 year ago
Martin Evans	a690db5d3e	Fixed build error caused by extra unnecessary parameter	1 year ago
Martin Evans	a2e29d393c	Swapped `StatelessExecutor` to use `llama_decode`! - Added `logits_i` argument to `Context.ApplyPenalty` - Added a new exception type for `llama_decode` return code	1 year ago
Martin Evans	f160fbd6d1	Added a check for EOS token in LLamaStatelessExecutor	1 year ago
Martin Evans	2eb52b1630	made casts to/from int explicit, fixed places affected	1 year ago
Martin Evans	42be9b136d	Switched from using raw integers, to a `LLamaToken` struct	1 year ago
Martin Evans	82d84afaea	Resetting the custom sampling pipeline in the stateless executor	1 year ago
Martin Evans	b34f72a883	- Added `SamplingPipeline` to inference params which overrides all other options with an entirely custom pipeline. - Added a `Sample` method to `LLamaContext` which uses a custom pipeline - Modified all executors to use the custom pipeline if it exists	1 year ago
Martin Evans	d743516070	- Added support for the MinP sampler - Cleaned up comments in implementations of `IInferenceParams` - Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used	2 years ago
Martin Evans	7e3cde4c13	Moved helper methods into `LLamaBatchSafeHandle`	2 years ago
Martin Evans	ccb8afae46	Cleaned up stateless executor as preparation for changing it to use the new batched decoding system.	2 years ago
Martin Evans	a03fe003de	Fixed decoding of text "accumulating" over time (never properly clearing buffer)	2 years ago
Martin Evans	51d4411a58	Added two new classes for detokenization tasks: - `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected. - `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens. Added tests for these classes and updated StatelessExecutor to use them. Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).	2 years ago
Martin Evans	efdf3d630c	- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens). - Built a new (hacky) `Detokenize` method which handles this	2 years ago
Martin Evans	f1e5a8f995	- Passing the `ILogger` through to every call of `CreateContext` - Passing `ILogger` into executors	2 years ago
sa_ddam213	4ec9aed47a	Revert LLamasSharp project changes	2 years ago
sa_ddam213	b4b4000342	Merge branch 'master' into upstream_master # Conflicts: # LLama.Web/Common/ModelOptions.cs # LLama.Web/Services/ConnectionSessionService.cs # LLama/LLamaStatelessExecutor.cs # LLama/LLamaWeights.cs	2 years ago
Martin Evans	d8434ea9d6	Merge pull request #185 from martindevans/wip_major_api_change Major llama.cpp API Change	2 years ago
Martin Evans	efb0664df0	- Added new binaries - Fixed stateless executor out-of-context handling - Fixed token tests	2 years ago
sa_ddam213	9b8de007dc	Propagate ILogger	2 years ago
Martin Evans	669ae47ef7	- Split parameters into two interfaces - params contains a list of loras, instead of just one	2 years ago
Martin Evans	0d40338692	Fixed out-of-context handling in stateless executor	2 years ago
Martin Evans	d58fcbbd13	Fixed antiprompt checking	2 years ago
Martin Evans	08f1615e60	- Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated. - Added a "spinner" to the `StatelessModeExecute` demo, which spins while waiting for the next token (demonstrating that it's not blocked).	2 years ago
Martin Evans	3f80190f85	Minimal changes required to remove non-async inference.	2 years ago
Martin Evans	77bd090150	Simplified `LLamaInteractExecutor` antiprompt matching by using new extension method	2 years ago
Martin Evans	614ba40948	- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings. - Minimal amount of characters converted - Allocation free - Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span - Allocation free	2 years ago
Martin Evans	93f24f8a51	Switched to properly typed `Encoding` property	2 years ago
Martin Evans	759ae26f36	Merge branch 'master' into grammar_basics	2 years ago
Martin Evans	a9e6f21ab8	- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring! - Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around	2 years ago
Martin Evans	e7b217f462	Fixed out of context logic	2 years ago
Martin Evans	4738c26299	- Reduced context size of test, to speed it up - Removed some unnecessary `ToArray` calls - Initial pass on LLamaStatelessExecutor, the context overflow management is broken but I think I found where it's ported from	2 years ago
Martin Evans	64416ca23c	- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`) - Integrated grammar into sampling - Added a test for the grammar sampling	2 years ago
Martin Evans	f3511e390f	WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving. The biggest single change is renaming `LLamaModel` to `LLamaContext`	2 years ago
Martin Evans	270c6d55ef	Merge pull request #88 from martindevans/fix_serialization_nan Fix serialization error due to NaN	2 years ago
Martin Evans	be52737488	Using a nullable float instead of NaN, this should fix the serialization issue reported in #85	2 years ago
Martin Evans	1fceeaf352	Applied fix from #84 (antiprompt does not work in stateless executor)	2 years ago
Yaohui Liu	d609b0e1d5	Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev	2 years ago
Yaohui Liu	b60c8bd285	fix: antiprompt does not work in stateless executor.	2 years ago
Martin Evans	2b2d3af26b	Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle`	2 years ago
Martin Evans	7fabcc1849	One last `TokenToString` case	2 years ago

62 Commits (6f9097f25bdb9726335a2aecf1f087cc6e2a4990)