Martin Evans
b7379b7124
Moved spinner out to an extension, so it can easily be used in other examples
2 years ago
Martin Evans
08f1615e60
- Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated.
- Added a "spinner" to the `StatelessModeExecute` demo, which spins while waiting for the next token (demonstrating that it's not blocked).
2 years ago
Martin Evans
3f80190f85
Minimal changes required to remove non-async inference.
2 years ago
Martin Evans
29df14cd9c
Converted ModelParams into a `record` class. This has several advantages:
- Equality, hashing, etc. are all implemented automatically
- Default values are defined in just one place (the properties) instead of in the constructor as well
- Added test to ensure that serialization works properly
2 years ago
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2 years ago
Martin Evans
ae8ef17a4a
- Added various convenience overloads to `LLamaContext.Eval`
- Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval; the narrower type better represents what's really needed
2 years ago
Martin Evans
02a46fc363
Updated demos to use the new loading/multi context system
2 years ago
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`.
2 years ago
Yaohui Liu
2eb2d6df83
test: add 9 examples of the new version.
2 years ago