68 Commits (6f9097f25bdb9726335a2aecf1f087cc6e2a4990)

Author SHA1 Message Date
  ksanchez 46a9d603f4 Add method to get BOS token. 1 year ago
  Rinne 495177fd0f fix: typos. 1 year ago
  Martin Evans c325ac9127 April 2024 Binary Update (#662) 1 year ago
  Rinne b677cdc6a3 Merge pull request #560 from eublefar/feature/chat-session-state-management 1 year ago
  Martin Evans ad682fbebd `BatchedExecutor.Create()` method (#613) 1 year ago
  eublefar a31391edd7 Polymorphic serialization for executor state and transforms 1 year ago
  Martin Evans a8ba9f05b3 March Binary Update (#565) 1 year ago
  eublefar e05d5d4e14 Remove state-resetting ops and make SessionState.ExecutorState and SessionState.ContextState non-nullable 1 year ago
  eublefar b2f7dbb39b AddPromptAsync method for stateful executors; chat session initialization from history and system-message processing methods for pre-processing prompts. Serialize executor state to JSON to prevent saved states from being updated by reference. 1 year ago
  eublefar 35153a77dd Chat session Get/Load in-memory state operations, reset state ops for stateful executors and context 1 year ago
  Martin Evans 8ac1634233 Removed `llama_eval`. It is going to be completely removed in the next version of llama.cpp (#553) 1 year ago
  Martin Evans 91a7967869 `ReadOnlySpan<float>` in ISamplingPipeline (#538) 1 year ago
  Martin Evans b0acecf080 Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix). 1 year ago
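The `BatchedExecutor` commit above explains that multiple conversations can be processed in a single inference batch, and that a shared prompt prefix (e.g. a common system prompt) makes this much faster. A hypothetical usage sketch of that shape — the method names below are assumptions for illustration, not verified signatures from the library:

```csharp
// Hypothetical sketch: two conversations driven through one batched executor.
// Because both prompts share a system-prompt prefix, the overlapping tokens
// can be evaluated once rather than once per conversation.
var executor = new BatchedExecutor(model, parameters);

var a = executor.Create();   // conversation A
var b = executor.Create();   // conversation B

a.Prompt("You are a helpful bot. User: hello");
b.Prompt("You are a helpful bot. User: goodbye");

// A single inference call evaluates both pending prompts together.
await executor.Infer();
```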
  Martin Evans 15a98b36d8 Updated everything to work with llama.cpp ce32060198 1 year ago
  Martin Evans 5da2a2f64b - Removed one of the constructors of `SafeLLamaHandleBase`, which implicitly states that memory is owned. Better to be explicit about this kind of thing! 1 year ago
  Martin Evans a2e29d393c Swapped `StatelessExecutor` to use `llama_decode`! 1 year ago
  Martin Evans 99969e538e - Removed some unused `eval` methods. 1 year ago
  Martin Evans 2eb52b1630 made casts to/from int explicit, fixed places affected 1 year ago
  Martin Evans 42be9b136d Switched from using raw integers to a `LLamaToken` struct 1 year ago
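The `LLamaToken` commit above (together with the explicit-cast commit just before it) replaces raw integer token ids with a dedicated struct. A minimal sketch of the pattern — not the library's actual definition:

```csharp
// Sketch of the pattern: a readonly wrapper over the raw token id with
// explicit casts in both directions, so mixing up plain ints and tokens
// becomes a compile-time error instead of a silent bug.
public readonly record struct Token(int Value)
{
    public static explicit operator int(Token t) => t.Value;
    public static explicit operator Token(int v) => new(v);
}

// Usage: both directions require an explicit cast.
Token t = (Token)42;
int raw = (int)t;
```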
  Martin Evans f860f88c36 Code cleanup driven by R# suggestions: 1 year ago
  Martin Evans dc8e5d88f7 Update LLama/LLamaContext.cs 1 year ago
  Martin Evans 2df3e7617e Added a method to set the RNG seed on the context 1 year ago
  Martin Evans b34f72a883 - Added `SamplingPipeline` to inference params which overrides all other options with an entirely custom pipeline. 1 year ago
  Martin Evans 16ab33ba3c Added Obsolete markings to all `Eval` overloads 2 years ago
  Martin Evans d743516070 - Added support for the MinP sampler 2 years ago
  Martin Evans dcc82e582e Fixed `Eval` on platforms < dotnet 5 2 years ago
  Martin Evans e81b3023d5 Rewritten sampling API to be accessed through the `LLamaTokenDataArray` object 2 years ago
  Martin Evans a024d2242e It works! 2 years ago
  Martin Evans 36c71abcfb Fixed `LLama.StreamingTokenDecoder` spam in all executors except Stateless. 2 years ago
  Martin Evans 51d4411a58 Added two new classes for detokenization tasks: 2 years ago
  Martin Evans efdf3d630c - Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens). 2 years ago
  Martin Evans 9daf586ba8 Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc) 2 years ago
  Martin Evans d8434ea9d6 Merge pull request #185 from martindevans/wip_major_api_change 2 years ago
  Martin Evans 1f8c94e386 Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538) 2 years ago
  Martin Evans 669ae47ef7 - Split parameters into two interfaces 2 years ago
  Martin Evans 9a0a0ae9fe Removed cloning support 2 years ago
  Martin Evans 0d40338692 Fixed out-of-context handling in stateless executor 2 years ago
  Martin Evans ce1fc51163 Added some more native methods 2 years ago
  Martin Evans bca55eace0 Initial changes to match the llama.cpp changes 2 years ago
  Martin Evans 08f1615e60 - Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated. 2 years ago
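The stateless-executor commit above moves blocking `Exec` calls into an awaited task so that async callers are not blocked while the model evaluates. A minimal sketch of that pattern, assuming a hypothetical blocking `EvalBlocking` method:

```csharp
// Sketch: offload the blocking native evaluation to the thread pool so the
// awaiting caller's thread stays free while the model runs.
public Task<int> EvalAsync(int[] tokens, CancellationToken ct = default)
    => Task.Run(() => EvalBlocking(tokens), ct);
```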
  redthing1 b78044347c fix opaque GetState (fixes #176) 2 years ago
  Martin Evans 466722dcff Merge pull request #165 from martindevans/better_instruct_antiprompt_checking 2 years ago
  Martin Evans d08a125020 Using the `TokensEndsWithAnyString` extensions for antiprompt checking in instruct executor. Simpler and more efficient. 2 years ago
  Martin Evans bba801f4b7 Added a property to get the KV cache size from a context 2 years ago
  Martin Evans 4dac142bd5 Merge pull request #160 from martindevans/GetState_fix 2 years ago
  Martin Evans 832bf7dbe0 Simplified implementation of `GetState` and fixed a memory leak (`bigMemory` was never freed) 2 years ago
  Martin Evans 4f7b6ffdcc Removed `GenerateResult` method that was only used in one place 2 years ago
  sa_ddam213 949b0cde16 Replace ILLamaLogger with ILogger 2 years ago
  Martin Evans 31287b5e6e Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options: 2 years ago
  Martin Evans 0c98ae1955 Passing ctx to `llama_token_nl(_ctx)` 2 years ago