Rinne
495177fd0f
fix: typos.
1 year ago
Martin Evans
c325ac9127
April 2024 Binary Update (#662)
* Updated binaries, using [this build](https://github.com/SciSharp/LLamaSharp/actions/runs/8654672719/job/23733195669) for llama.cpp commit `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`.
- Added all new functions.
- Moved some functions (e.g. `SafeLlamaModelHandle`-specific functions) into `SafeLlamaModelHandle.cs`
- Exposed tokens on `SafeLlamaModelHandle` and `LLamaWeights` through a `Tokens` property. As new special tokens are added in the future, they can be added here.
- Changed all token properties to return nullable tokens, to handle models that lack certain tokens.
- Fixed `DefaultSamplingPipeline` to handle models with no newline token.
* Moved native methods to more specific locations.
- Context-specific things have been moved into `SafeLLamaContextHandle.cs` and made private - they're exposed through C# properties and methods already.
- Checking that GPU layer count is zero if GPU offload is not supported.
- Moved methods for creating default structs (`llama_model_quantize_default_params` and `llama_context_default_params`) into relevant structs.
* Removed exception if `GpuLayerCount > 0` when GPU is not supported.
* Added low-level wrapper methods for new per-sequence state load/save in `SafeLLamaContextHandle`
- Added high-level wrapper methods (save/load with `State` object or memory-mapped file) in `LLamaContext`
- Moved native methods for per-sequence state load/save into `SafeLLamaContextHandle`
* Added update and defrag methods for KV cache in `SafeLLamaContextHandle`
* Updated submodule to `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`
* Passing the sequence ID when saving a single sequence state
1 year ago
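The per-sequence state save/load described in the entry above can be pictured with a toy model. This is a conceptual sketch only, with made-up names; it is not LLamaSharp's API: instead of serialising the whole context, only the entries belonging to one sequence ID are written out, and they can later be restored into a fresh context under any sequence ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only: a toy "context" whose state is keyed by sequence id,
// standing in for the real KV-cache state handled by SafeLLamaContextHandle.
public sealed class ToyContext
{
    private readonly List<(int SeqId, int Token)> _state = new();

    public void Append(int seqId, int token) => _state.Add((seqId, token));

    // Save only the entries that belong to a single sequence.
    public int[] SaveSequence(int seqId) =>
        _state.Where(e => e.SeqId == seqId).Select(e => e.Token).ToArray();

    // Restore a previously saved sequence into this context under a given id.
    public void LoadSequence(int seqId, int[] tokens)
    {
        foreach (var t in tokens)
            _state.Add((seqId, t));
    }
}

public static class PerSequenceDemo
{
    public static void Main()
    {
        var ctx = new ToyContext();
        ctx.Append(seqId: 0, token: 10);
        ctx.Append(seqId: 1, token: 20);

        var saved = ctx.SaveSequence(1);   // only sequence 1 is serialised

        var fresh = new ToyContext();
        fresh.LoadSequence(5, saved);      // restored under a new sequence id
        Console.WriteLine($"Restored {saved.Length} token(s) into sequence 5");
    }
}
```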
Martin Evans
a8ba9f05b3
March Binary Update (#565)
* Updated binaries to llama.cpp `3ab8b3a92ede46df88bc5a2dfca3777de4a2b2b6` (build run: https://github.com/SciSharp/LLamaSharp/actions/runs/8118890586 )
* Added abort callback
* Added properties to get/set thread count on `LLamaContext`
* Fixed LLamaLogLevel numbering
1 year ago
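The abort callback added in the commit above follows llama.cpp's general pattern of polling a user-supplied predicate during evaluation. A self-contained sketch of that pattern (all names here are illustrative, not the library's):

```csharp
using System;
using System.Threading;

// Illustrative sketch of the abort-callback pattern: a generation loop polls a
// user-supplied predicate between steps and stops early when it returns true.
public static class AbortDemo
{
    public static int Generate(int maxTokens, Func<bool> shouldAbort)
    {
        var produced = 0;
        for (var i = 0; i < maxTokens; i++)
        {
            if (shouldAbort())      // checked between evaluation steps
                break;

            Thread.Sleep(10);       // stand-in for one decode step
            produced++;
        }
        return produced;
    }

    public static void Main()
    {
        using var cts = new CancellationTokenSource(millisecondsDelay: 50);
        var count = Generate(1000, () => cts.IsCancellationRequested);
        Console.WriteLine($"Generated {count} tokens before the abort callback fired.");
    }
}
```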
Martin Evans
15a98b36d8
Updated everything to work with llama.cpp ce32060198
1 year ago
Martin Evans
42be9b136d
Switched from using raw integers to a `LLamaToken` struct
1 year ago
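Wrapping the raw token id in a struct gives type safety at no runtime cost. A minimal sketch of the idea (the real `LLamaToken` lives in LLamaSharp and may differ in detail):

```csharp
// Minimal sketch of a typed token id. The real LLamaToken may differ; the
// point is that a distinct struct stops token ids being mixed up with
// ordinary integers (lengths, counts, sequence ids, ...).
public readonly record struct Token(int Value)
{
    public static explicit operator int(Token t) => t.Value;   // conversion must be explicit
}

public static class TokenDemo
{
    public static void Accept(Token t) { /* consume a token */ }

    public static void Main()
    {
        var t = new Token(42);
        Accept(t);          // fine
        // Accept(42);      // no longer compiles: raw ints are not tokens
        System.Console.WriteLine((int)t);
    }
}
```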
Martin Evans
b868b056f7
Added metadata overrides to `IModelParams`
1 year ago
Martin Evans
439d14a061
Updated binaries:
- build run: https://github.com/SciSharp/LLamaSharp/actions/runs/7196891440
- commit: 9fb13f9584
1 year ago
Martin Evans
b34f72a883
- Added `SamplingPipeline` to inference params which overrides all other options with an entirely custom pipeline.
- Added a `Sample` method to `LLamaContext` which uses a custom pipeline
- Modified all executors to use the custom pipeline if it exists
1 year ago
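The behaviour described in the entry above (a custom pipeline that, when present, takes over from the individual sampling options) can be sketched as follows. `ISamplingPipeline` and the other types here are simplified stand-ins, not LLamaSharp's actual interfaces:

```csharp
using System;
using System.Linq;

// Simplified stand-ins: when SamplingPipeline is set, it overrides the
// individual sampling options entirely.
public interface ISamplingPipeline
{
    int Sample(ReadOnlySpan<float> logits);
}

public sealed class GreedyPipeline : ISamplingPipeline
{
    public int Sample(ReadOnlySpan<float> logits)
    {
        // Pick the highest-logit token (greedy sampling).
        var best = 0;
        for (var i = 1; i < logits.Length; i++)
            if (logits[i] > logits[best])
                best = i;
        return best;
    }
}

public sealed class InferenceOptions
{
    public float Temperature { get; init; } = 0.8f;
    public ISamplingPipeline? SamplingPipeline { get; init; }   // null => use Temperature etc.
}

public static class SamplingDemo
{
    public static int SampleNext(float[] logits, InferenceOptions opts)
    {
        if (opts.SamplingPipeline is { } pipeline)
            return pipeline.Sample(logits);     // custom pipeline wins outright

        // Fallback: the option-driven path (elided in this sketch).
        return Array.IndexOf(logits, logits.Max());
    }

    public static void Main()
    {
        var logits = new[] { 0.1f, 2.5f, 0.3f };
        var opts = new InferenceOptions { SamplingPipeline = new GreedyPipeline() };
        Console.WriteLine(SampleNext(logits, opts));   // prints 1
    }
}
```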
Martin Evans
89fef05362
This commit (5fe721bdbe) accidentally removed a lot of content that it shouldn't have. Fixed that.
Originally from these PRs:
- https://github.com/SciSharp/LLamaSharp/pull/263
- https://github.com/SciSharp/LLamaSharp/pull/259
2 years ago
SignalRT
46ace3ddd7
Add targets in Web project
This allows the binaries to be copied and makes the project work.
Update example model in appsettings
2 years ago
Martin Evans
e3468d04f0
Merge pull request #277 from martindevans/feature/min_p
MinP Sampler
2 years ago
Martin Evans
d743516070
- Added support for the MinP sampler
- Cleaned up comments in implementations of `IInferenceParams`
- Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used
2 years ago
SignalRT
97006a214f
Merge remote-tracking branch 'upstream/master' into RuntimeDetection
2 years ago
Martin Evans
31244ae691
Merge branch 'master' into YaRN_scaling_parameters
2 years ago
SignalRT
5fe721bdbe
Revert "Merge branch 'pr/268' into RuntimeDetection"
This reverts commit 091b8d58b3502a99b3bfbec9db457c92cc736beb, reversing
changes made to 9b2ca9cf8e.
2 years ago
Martin Evans
db1bc741b0
Modified `ContextSize` in parameters to be nullable. A null value means autodetect from the model.
2 years ago
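The nullable `ContextSize` described in the entry above resolves to the model's own value when left unset, roughly as below (illustrative names, not the real parameter classes):

```csharp
using System;

// Illustrative only: a nullable context size falls back to whatever the model
// itself was trained with when the caller does not specify one.
public sealed class ToyModelParams
{
    public uint? ContextSize { get; init; }   // null => autodetect from the model
}

public static class ContextSizeDemo
{
    public static uint Resolve(ToyModelParams p, uint trainedContextLength) =>
        p.ContextSize ?? trainedContextLength;

    public static void Main()
    {
        Console.WriteLine(Resolve(new ToyModelParams(), trainedContextLength: 4096));                      // 4096
        Console.WriteLine(Resolve(new ToyModelParams { ContextSize = 2048 }, trainedContextLength: 4096)); // 2048
    }
}
```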
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2 years ago
Martin Evans
529b06b35b
- Fixed rope frequency/base to use the values in the model by default, instead of always overriding them!
2 years ago
Martin Evans
c786fb0ec8
Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams`
2 years ago
Martin Evans
6a4cd506bd
Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection
2 years ago
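One plausible reading of a "safe" tensor-splits collection (an assumption about the design, not the actual implementation) is a fixed-length wrapper that cannot be resized or handed a wrong-sized array the way a raw `float[]` could:

```csharp
using System;

// Assumed design sketch: a fixed-length collection of per-device split weights.
// The length is pinned at construction, so callers cannot assign an array of
// the wrong size or a negative weight.
public sealed class ToyTensorSplits
{
    private readonly float[] _splits;

    public ToyTensorSplits(int deviceCount) => _splits = new float[deviceCount];

    public int Length => _splits.Length;

    public float this[int device]
    {
        get => _splits[device];
        set
        {
            if (value < 0)
                throw new ArgumentOutOfRangeException(nameof(value), "Split weights must be non-negative");
            _splits[device] = value;
        }
    }
}

public static class SplitsDemo
{
    public static void Main()
    {
        var splits = new ToyTensorSplits(deviceCount: 2);
        splits[0] = 0.7f;
        splits[1] = 0.3f;
        Console.WriteLine($"{splits.Length} devices, first split = {splits[0]}");
    }
}
```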
Martin Evans
18b15184ea
Added logger parameter to LLama.Web context creation
2 years ago
sa_ddam213
952e77f97b
Remove old parameter
2 years ago
sa_ddam213
b4b4000342
Merge branch 'master' into upstream_master
# Conflicts:
# LLama.Web/Common/ModelOptions.cs
# LLama.Web/Services/ConnectionSessionService.cs
# LLama/LLamaStatelessExecutor.cs
# LLama/LLamaWeights.cs
2 years ago
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2 years ago
sa_ddam213
a8a498dc12
Fix up issues found during testing
2 years ago
sa_ddam213
9b8de007dc
Propagate ILogger
2 years ago
sa_ddam213
e2a17d6b6f
Refactor conflicting object name SessionOptions
2 years ago
sa_ddam213
44f1b91c29
Update Web to support version 0.5.1
2 years ago
sa_ddam213
c9108f8311
Add service for managing Models and Model Contexts
2 years ago
Martin Evans
0f03e8f1a3
Added workaround to LLama.Web and LLama.WebAPI
2 years ago
Martin Evans
669ae47ef7
- Split parameters into two interfaces
- params contains a list of LoRAs, instead of just one
2 years ago
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2 years ago
Martin Evans
2056078aef
Initial changes required for GGUF support
2 years ago
Martin Evans
93f24f8a51
Switched to properly typed `Encoding` property
2 years ago
Martin Evans
759ae26f36
Merge branch 'master' into grammar_basics
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
64416ca23c
- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`)
- Integrated grammar into sampling
- Added a test for the grammar sampling
2 years ago
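The grammar shape described in the entry above (a grammar is a list of rules, each rule a list of elements) can be sketched with stand-in types; `LLamaGrammarElement`'s real layout mirrors llama.cpp's `llama_grammar_element` and is not reproduced here:

```csharp
using System;
using System.Collections.Generic;

// Stand-in element type with toy semantics, not real GBNF element encoding.
public enum ToyElementType { Char, RuleRef, End }
public readonly record struct ToyGrammarElement(ToyElementType Type, uint Value);

public static class GrammarDemo
{
    // A grammar is "a list of rules, each rule a list of elements" - the same
    // nested-list shape the commit above describes.
    public static IReadOnlyList<IReadOnlyList<ToyGrammarElement>> BuildYesNoGrammar()
    {
        var rule0 = new List<ToyGrammarElement>
        {
            new(ToyElementType.Char, 'y'),   // toy encoding of: root matches "y"
            new(ToyElementType.Char, 'n'),   // ... or "n"
            new(ToyElementType.End, 0),
        };
        return new List<IReadOnlyList<ToyGrammarElement>> { rule0 };
    }

    public static void Main()
    {
        var grammar = BuildYesNoGrammar();
        Console.WriteLine($"{grammar.Count} rule(s), {grammar[0].Count} element(s) in rule 0");
    }
}
```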
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2 years ago
Martin Evans
2c933c57a1
Fixed ModelOptions in Web project
2 years ago
sa_ddam213
bac9cba01a
InferenceParams abstractions
2 years ago
sa_ddam213
2a04e31b7d
ModelParams abstraction
2 years ago
sa_ddam213
3fec7a63c7
Add Instruct and Stateless support
2 years ago
sa_ddam213
a32a5e4ffe
Decouple connectionId from ModelSession
2 years ago
sa_ddam213
d9fbd56f10
Strongly type connection status
2 years ago
sa_ddam213
ef8cf0b283
Add RequestVerificationToken logic for ajax prefilter, tidy up js cancel logic
2 years ago
sa_ddam213
e574d89a40
Send prompt on Enter key
2 years ago
sa_ddam213
a139423581
Move session management to service, use ILLamaExecutor in session to make it more versatile, fix scroll bug
2 years ago
sa_ddam213
1ec59e120a
Move session management to service, add inference cancel support
2 years ago
sa_ddam213
fd215dce84
Update Readme
2 years ago
sa_ddam213
21b685649f
Add Readme
2 years ago