* Updated binaries, using [this build](https://github.com/SciSharp/LLamaSharp/actions/runs/8654672719/job/23733195669) for llama.cpp commit `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`.
- Added all new functions.
- Moved some functions (e.g. `SafeLlamaModelHandle`-specific functions) into `SafeLlamaModelHandle.cs`.
- Exposed tokens on `SafeLlamaModelHandle` and `LLamaWeights` through a `Tokens` property. As new special tokens are added in the future, they can be added here.
- Changed all token properties to return nullable tokens, since some models do not define every special token (see the sketch after this list).
- Fixed `DefaultSamplingPipeline` to handle models which have no newline token.
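A minimal sketch of how the nullable token properties might be consumed; the `Tokens.Newline` property name and the `ModelParams` setup here are assumptions, not the confirmed API:

```csharp
using System;
using LLama;
using LLama.Common;
using LLama.Native;

// Assumed setup: load weights from a GGUF file.
using var weights = LLamaWeights.LoadFromFile(new ModelParams("model.gguf"));

// The property returns null when the model does not define this token.
LLamaToken? newline = weights.Tokens.Newline; // assumed property name
if (newline.HasValue)
{
    // This model defines a newline token, so it is safe to use.
    Console.WriteLine($"Newline token: {newline.Value}");
}
else
{
    // e.g. DefaultSamplingPipeline must skip its newline handling here.
    Console.WriteLine("Model has no newline token.");
}
```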
* Moved native methods to more specific locations.
- Context-specific functions have been moved into `SafeLLamaContextHandle.cs` and made private; they're already exposed through C# properties and methods.
- Added a check that the GPU layer count is zero if GPU offload is not supported.
- Moved methods for creating default structs (`llama_model_quantize_default_params` and `llama_context_default_params`) into the relevant structs, as sketched below.
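A short sketch of the resulting factory-style usage, assuming the methods are named `Default()`:

```csharp
using LLama.Native;

// Assumed method names: the native default-params functions now live as
// static factories on the structs they initialise.
var contextParams = LLamaContextParams.Default();          // wraps llama_context_default_params
var quantizeParams = LLamaModelQuantizeParams.Default();   // wraps llama_model_quantize_default_params
```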
* Removed exception if `GpuLayerCount > 0` when GPU is not supported.
* Added low-level wrapper methods for the new per-sequence state load/save in `SafeLLamaContextHandle`
- Added high-level wrapper methods (save/load with a `State` object or a memory-mapped file) in `LLamaContext` (see the sketch after this list)
- Moved native methods for per-sequence state load/save into `SafeLLamaContextHandle`
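A hedged sketch of the high-level per-sequence save/load; the overloads taking a `LLamaSeqId` are assumptions based on the notes above, not confirmed signatures:

```csharp
using LLama;
using LLama.Native;

void SaveAndRestore(LLamaContext context, LLamaSeqId sequence)
{
    // Assumed overload: save only this sequence's state to a file
    // (memory-mapped internally).
    context.SaveState("seq.state", sequence);

    // ... continue generating, rewind, etc. ...

    // Assumed overload: restore the sequence exactly as it was when saved.
    context.LoadState("seq.state", sequence);
}
```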
* Added update and defrag methods for KV cache in `SafeLLamaContextHandle`
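A hedged sketch of the new KV cache calls; the C# method names are assumptions wrapping `llama_kv_cache_defrag` / `llama_kv_cache_update`:

```csharp
using LLama.Native;

void DefragKvCache(SafeLLamaContextHandle ctx)
{
    ctx.KvCacheDefrag();  // queue a defragmentation of the KV cache
    ctx.KvCacheUpdate();  // apply pending KV cache operations (e.g. the defrag)
}
```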
* Updated submodule to `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`
* Now passing the sequence ID when saving a single sequence state
Conversations can be "forked" to create a copy of a conversation at a given point. This allows, for example, prompting a conversation with a system prefix just once and then forking it for each individual conversation. Conversations can also be "rewound" to an earlier state.
Added two new examples demonstrating forking and rewinding.
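A hedged sketch of the fork/rewind flow with the `BatchedExecutor`; the prompting and inference calls are simplified, and some method names are assumptions:

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

var parameters = new ModelParams("model.gguf");
using var weights = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(weights, parameters);

// Prompt the shared system prefix exactly once.
using var root = executor.Create();
root.Prompt(executor.Context.Tokenize("You are a helpful assistant."));
// (A real program would run inference, e.g. await executor.Infer(),
//  before forking or sampling.)

// Fork cheap copies for each individual conversation; the prefix's
// KV cache work is not repeated.
using var chatA = root.Fork();
using var chatB = root.Fork();

// Rewind a conversation by discarding its last 10 tokens.
chatA.Rewind(10);
```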
`FixedSizeQueue`:
- `Enqueue` would always stop one item short of filling the queue to capacity (illustrated in the sketch below)
- `Fill` would only _replace_ existing items, but it was only ever used where there were no existing items, so the method has been removed entirely
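A minimal, illustrative reconstruction of the `Enqueue` off-by-one; the real `FixedSizeQueue` is internal to LLamaSharp, so this is not its actual code:

```csharp
using System.Collections.Generic;

public sealed class FixedSizeQueue<T>
{
    private readonly int _capacity;
    private readonly Queue<T> _items = new();

    public FixedSizeQueue(int capacity) => _capacity = capacity;

    public int Count => _items.Count;

    public void Enqueue(T item)
    {
        _items.Enqueue(item);

        // The buggy version evicted while Count >= capacity after adding,
        // which capped the queue at capacity - 1 items. The fix: only
        // evict once the count *exceeds* the capacity.
        while (_items.Count > _capacity)
            _items.Dequeue();
    }
}
```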
`LLamaGrammarElement`:
- Converted it into a `record` struct and removed all of the (now unnecessary) hand-written equality code, as sketched below
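An illustrative sketch of the conversion; a `record struct` gets compiler-generated value equality, `GetHashCode` and `ToString`, so the manual equality members can simply be deleted (the field names here mirror the llama.cpp grammar element and are assumptions):

```csharp
using LLama.Native;

// Positional record struct: Equals, ==, !=, GetHashCode and ToString
// are all generated by the compiler.
public record struct LLamaGrammarElement(LLamaGrammarElementType Type, uint Value);
```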
- Cleaned up comments in implementations of `IInferenceParams`
- Removed default values for all parameters in `LLamaContext.Sample` - they're never used and probably _shouldn't_ ever be used
- Just convert it to a `string`: nice and simple
- Write the bytes to a `Span<byte>`: no allocations
- Write the chars to a `StringBuilder`: potentially no allocations (these three shapes are sketched after this list)
- Removed some unnecessary `ToArray` calls
- Initial pass on `LLamaStatelessExecutor`; the context overflow management is broken, but I think I found where it was ported from
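A hedged sketch of the three token-to-text shapes listed above; the method names and overloads on `SafeLlamaModelHandle` are assumptions:

```csharp
using System;
using System.Text;
using LLama.Native;

void Demo(SafeLlamaModelHandle model, LLamaToken token)
{
    // 1. Simple: allocate and return a string.
    string text = model.TokenToString(token, Encoding.UTF8);

    // 2. Zero-allocation: write the raw token bytes into a caller-owned span.
    Span<byte> bytes = stackalloc byte[32];
    uint written = model.TokenToSpan(token, bytes);

    // 3. Append the decoded chars to an existing StringBuilder
    //    (no allocation if it already has enough capacity).
    var builder = new StringBuilder();
    model.TokenToString(token, Encoding.UTF8, builder);
}
```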