45 Commits (ec8f83236545a1989df2f75da4e1d8d0345b0407)

Author SHA1 Message Date
  Martin Evans ce4de7d607 llama_decode lock (#595) 1 year ago
  Martin Evans a8ba9f05b3 March Binary Update (#565) 1 year ago
  Martin Evans 8ac1634233 Removed `llama_eval`. It is going to be completely removed in the next version of llama.cpp (#553) 1 year ago
  Martin Evans e9d9042576 Added `Divide` to `KvAccessor` 1 year ago
  Martin Evans 949861a581 - Added a `Modify` method to `Conversation`. This grants **temporary** access to directly modify the KV cache. 1 year ago
  Martin Evans c5146bac23 - Exposed KV debug view through `SafeLLamaContextHandle` 1 year ago
  Martin Evans 15a98b36d8 Updated everything to work with llama.cpp ce32060198 1 year ago
  Martin Evans 92b9bbe779 Added methods to `SafeLLamaContextHandle` for KV cache manipulation 1 year ago
  Martin Evans 36a9335588 Removed `LLamaBatchSafeHandle` (which wrapped unmanaged memory created by llama.cpp) and replaced it with a fully managed `LLamaBatch`. Modified the `BatchedDecoding` example to use the new managed batch. 1 year ago
  Martin Evans 2ea2048b78 - Added a test for tokenizing just a new line (reproduces issue https://github.com/SciSharp/LLamaSharp/issues/430) 1 year ago
  Martin Evans 98635a0d5a Fixed decoding of large tokens (over 16 bytes) in streaming text decoder 1 year ago
  Martin Evans 402a110a3a Merge pull request #404 from martindevans/switched_to_LLamaToken_struct 1 year ago
  Martin Evans 1e69e265b6 Moved some native methods to do with creating/destroying resources into their respective handles. There is **no** safe way to call most of these methods; everything must be done through handles. 1 year ago
  Martin Evans 2eb52b1630 made casts to/from int explicit, fixed places affected 1 year ago
  Martin Evans 42be9b136d Switched from using raw integers to a `LLamaToken` struct 1 year ago
  Martin Evans f860f88c36 Code cleanup driven by R# suggestions: 1 year ago
  Martin Evans 2df3e7617e Added a method to set the RNG seed on the context 1 year ago
  Martin Evans 16ab33ba3c Added Obsolete markings to all `Eval` overloads 2 years ago
  Martin Evans 51c292ebd8 Added a safe method for `llama_get_logits_ith` 2 years ago
  Martin Evans 51d4411a58 Added two new classes for detokenization tasks: 2 years ago
  Martin Evans efdf3d630c - Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens). 2 years ago
  Martin Evans e89ca5cc17 Fixed a few minor warnings 2 years ago
  Martin Evans 9daf586ba8 Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc) 2 years ago
  Martin Evans 1f8c94e386 Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538) 2 years ago
  Martin Evans 9a0a0ae9fe Removed cloning support 2 years ago
  Martin Evans 0d40338692 Fixed out-of-context handling in stateless executor 2 years ago
  Martin Evans b306ac23dd Added `Decode` method to `SafeLLamaContextHandle` 2 years ago
  Martin Evans ce1fc51163 Added some more native methods 2 years ago
  Martin Evans bca55eace0 Initial changes to match the llama.cpp changes 2 years ago
  Martin Evans daf09eae64 Skipping tokenization of empty strings (saves allocating an empty array every time) 2 years ago
  Martin Evans bba801f4b7 Added a property to get the KV cache size from a context 2 years ago
  Martin Evans 31287b5e6e Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options: 2 years ago
  Martin Evans ebacdb666d - Moved the lower level state get/set methods onto SafeLLamaContextHandle 2 years ago
  Martin Evans a9e6f21ab8 - Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring! 2 years ago
  Martin Evans ae8ef17a4a - Added various convenience overloads to `LLamaContext.Eval` 2 years ago
  Martin Evans 479ff57853 Renamed `EmbeddingCount` to `EmbeddingSize` 2 years ago
  Martin Evans d0a7a8fcd6 - Cleaned up disposal in LLamaContext 2 years ago
  Martin Evans f3511e390f WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving. 2 years ago
  Martin Evans 2b2d3af26b Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle` 2 years ago
  Martin Evans 0e5e00e300 Moved `TokenToString` from Utils into `SafeLLamaContextHandle` (thin wrappers around the same method in `SafeLlamaModelHandle`) 2 years ago
  Martin Evans 2d811b2603 - Moved `GetLogits` into `SafeLLamaContextHandle` 2 years ago
  Martin Evans cd3cf2b77d - Moved tokenization from `Utils.Tokenize` into `SafeLLamaContextHandle.Tokenize`, one less thing in `Utils`. 2 years ago
  Martin Evans f16aa58e12 Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts. 2 years ago
  Yaohui Liu 0958bbac2c feat: add get-embedding api to LLamaModel. 2 years ago
  Yaohui Liu 5a79edeb51 feat: add the framework and basic usages. 2 years ago