LLamaSharp

Commit Graph

Author	SHA1	Message	Date
Martin Evans	e89ca5cc17	Fixed a few minor warnings	2 years ago
Martin Evans	9daf586ba8	Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)	2 years ago
Martin Evans	1f8c94e386	Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )	2 years ago
Martin Evans	2a38808bca	- Added threads to context params, replaced all thread args with `uint?` - Replaced all binaries	2 years ago
Martin Evans	9a0a0ae9fe	Removed cloning support	2 years ago
Martin Evans	0d40338692	Fixed out-of-context handling in stateless executor	2 years ago
Martin Evans	b306ac23dd	Added `Decode` method to `SafeLLamaContextHandle`	2 years ago
Martin Evans	9e958e896b	safe handle for batch	2 years ago
Martin Evans	ce1fc51163	Added some more native methods	2 years ago
Martin Evans	bca55eace0	Initial changes to match the llama.cpp changes	2 years ago
Haiping	10678a83d6	Merge pull request #65 from martindevans/alternative_dependency_loading CPU Feature Detection	2 years ago
Martin Evans	daf09eae64	Skipping tokenization of empty strings (saves allocating an empty array every time)	2 years ago
Martin Evans	bba801f4b7	Added a property to get the KV cache size from a context	2 years ago
sa_ddam213	09d8f434f2	Extract LLamaLogLevel, Remove Logger class	2 years ago
Martin Evans	d3b8ee988c	Beam Search (#155) * Added the low level bindings to beam search.	2 years ago
Martin Evans	614ba40948	- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings. - Minimal amount of characters converted - Allocation free - Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span - Allocation free	2 years ago
Martin Evans	6a842014ac	Removed duplicate `llama_sample_classifier_free_guidance` method	2 years ago
Martin Evans	8f58a40fb9	Added Linux dependency loading	2 years ago
Martin Evans	dd4957471f	Changed paths to match what the GitHub build action produces	2 years ago
Martin Evans	756a1ad0ba	Added a new way to load dependencies, performing CPU feature detection	2 years ago
Rinne	4e83e48ad1	Merge pull request #122 from martindevans/gguf Add GGUF support	2 years ago
Martin Evans	bcf06e2652	Added some comments on various native methods	2 years ago
Martin Evans	a70c7170dd	- Created a higher level `Grammar` class which is immutable and contains a list of grammar rules. This is the main "entry point" to the grammar system. - Made all the mechanics of grammar parsing (GBNFGrammarParser, ParseState) internal. Just call `Grammar.Parse("whatever")`. - Added a `GrammarRule` class which validates elements on construction (this allows constructing grammar without parsing GBNF). - It should be impossible for a `GrammarRule` to represent an invalid rule.	2 years ago
Mihai	0bd495276b	Add initial tests + fix bugs. Still WIP since the test is failing.	2 years ago
Martin Evans	2022b82947	Added binaries generated by this action: https://github.com/SciSharp/LLamaSharp/actions/runs/6002797872/job/16279896150 Based on this version: `6b73ef1201`	2 years ago
Martin Evans	31287b5e6e	Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options: - Just convert it to a `string`, nice and simple - Write the bytes to a `Span<byte>` no allocations - Write the chars to a `StringBuilder` potentially no allocations	2 years ago
Martin Evans	0c98ae1955	Passing ctx to `llama_token_nl(_ctx)`	2 years ago
Martin Evans	6ffa28f964	Removed `LLAMA_MAX_DEVICES` (not used)	2 years ago
Martin Evans	2056078aef	Initial changes required for GGUF support	2 years ago
Martin Evans	cf4754db44	Removed unnecessary parameters from some low level sampler methods	2 years ago
Martin Evans	f70525fec2	Two small improvements to the native sampling API: - Modified `llama_sample_token_mirostat` and `llama_sample_token_mirostat_v2` to take `ref float` instead of as a `float*`. Less pointers is always good. - Modified `llama_sample_repetition_penalty` and `llama_sample_frequency_and_presence_penalties` to take pointers instead of arrays. This allows the use of non-allocating types (e.g. Span) instead of arrays - Modified higher level API to accept `Memory<int>` instead of `int[]`, which can be used to reduce allocations at call sites	2 years ago
Martin Evans	a911b77dec	Various minor changes, resolving about 100 ReSharper code quality warnings	2 years ago
Martin Evans	ebacdb666d	- Moved the lower level state get/set methods onto SafeLLamaContextHandle - Used those methods to add a `Clone` method to SafeLLamaContextHandle - Simplified `LLamaContext` by using the new methods - Sealed `LLamaContext` and `LLamaEmbedder`	2 years ago
Martin Evans	829f32b27d	- Added `Obsolete` attributes to the entire `OldVersion` namespace, so they can be removed in the future - Minor changes to cleanup some of the compiler warnings	2 years ago
zombieguy	45b01d5a78	Improved type conversion Type conversion is now done in the property rather than the utils class and uses the System.Convert class to ensure consistency.	2 years ago
Martin Evans	2830e5755c	- Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed. - Deleted `NativeInfo` (internal class, not used anywhere)	2 years ago
Martin Evans	4b7d718551	Added native symbol for CFG	2 years ago
Martin Evans	759ae26f36	Merge branch 'master' into grammar_basics	2 years ago
Martin Evans	a9e6f21ab8	- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring! - Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around	2 years ago
Martin Evans	ae8ef17a4a	- Added various convenience overloads to `LLamaContext.Eval` - Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval, narrower type better represents what's really needed	2 years ago
Martin Evans	64416ca23c	- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`) - Integrated grammar into sampling - Added a test for the grammar sampling	2 years ago
Martin Evans	0294bb1303	Some of the basics of the grammar API	2 years ago
Rinne	62331852bc	Merge pull request #90 from martindevans/proposal_multi_context Multi Context	2 years ago
zombieguy	10f88ebd0e	Potential fix for .Net Framework issues (#103) * Added a bool to sbyte Utils convertor As an attempt to avoid using any MarshalAs attribute for .Net Framework support this Utils method will take in a bool value and return a 1 for true or 0 for false sbyte. * Changed all bool "MarshalAs" types to sbytes Changed all previous BOOL types with "MarshalAs" attributes to SBYTEs and changed all the setters of them to use the Utils.BoolToSignedByte() convertor method. * Fixed Utils bool convertor & added sbyte to bool Improved the Utils bool convertor just casting an sbyte value to get rid of the unneeded sbyte array and added an sbyte to bool convertor to convert back the way to a C# bool assuming any positive value above 0 is a bool and no bools are packed in the single byte integer. * bool to & from sbyte conversions via properties All 1byte bools are now handled where they "sit", via public properties which perform the conversions to keep all external data able to communicate as it did before.	2 years ago
Martin Evans	6c84accce8	Added `llama_sample_classifier_free_guidance` method from native API	2 years ago
Martin Evans	479ff57853	Renamed `EmbeddingCount` to `EmbeddingSize`	2 years ago
Martin Evans	d0a7a8fcd6	- Cleaned up disposal in LLamaContext - sealed some classes not intended to be extended	2 years ago
Martin Evans	f3511e390f	WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving. The biggest single change is renaming `LLamaModel` to `LLamaContext`	2 years ago
Martin Evans	d7f971fc22	Improved `NativeApi` file a bit: - Added some more comments - Modified `llama_tokenize` to not allocate - Modified `llama_tokenize_native` to take a pointer instead of an array, allowing use with no allocations - Removed GgmlInitParams (not used)	2 years ago
Martin Evans	841cf88e3b	Merge pull request #96 from martindevans/minor_quantizer_improvements Minor quantizer improvements	2 years ago

100 Commits (18b15184ea6dceba07b9e6a6a73594e823714d40)