Martin Evans
45118520fa
- Improved coverage of `GBNFGrammarParser` up to 96%
- Covered text transforms
- Removed unnecessary non-async transforms
2 years ago
Martin Evans
3f80190f85
Minimal changes required to remove non-async inference.
2 years ago
Martin Evans
0c98ae1955
Passing ctx to `llama_token_nl(_ctx)`
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
48bc0a6f8a
Do the same for the second test, hopefully fixing CI
2 years ago
Martin Evans
6f2ab8e039
Not asserting the answer, just that it didn't change
2 years ago
Martin Evans
e7b217f462
Fixed out of context logic
2 years ago
Martin Evans
4738c26299
- Reduced context size of test, to speed it up
- Removed some unnecessary `ToArray` calls
- Initial pass on `LLamaStatelessExecutor`; the context overflow management is broken, but I think I found where it's ported from
2 years ago
Martin Evans
4d0c044b9f
Added tests for the StatelessExecutor, one is currently failing
2 years ago