8 Commits (eebe4cb1200fa858bafe72e204eb8abffc05f4c0)

Author SHA1 Message Date
  Martin Evans 51d4411a58 Added two new classes for detokenization tasks: 2 years ago
  Martin Evans efdf3d630c - Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens). 2 years ago
  Martin Evans 1d0620e634 Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters 2 years ago
  Martin Evans 1f8c94e386 Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538) 2 years ago
  Martin Evans fe54f6764f - Added unit tests for extension methods 2 years ago
  Martin Evans d0e57a8c92 sealed test class 2 years ago
  Martin Evans 3f082c6f2c Fixed naming in tests 2 years ago
  Martin Evans 614ba40948 - Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings. 2 years ago