Conversations can be "forked" to create a copy of a conversation at a given point. This allows, for example, prompting a conversation with a system prefix just once and then forking it repeatedly, once for each individual conversation. Conversations can also be "rewound" to an earlier state.
Added two new examples demonstrating forking and rewinding.
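
To make the fork and rewind semantics concrete, here is a minimal, self-contained sketch. It is a toy model only: `ToyConversation`, `Prompt`, `Fork`, `Rewind`, and `TokenCount` are hypothetical names chosen for illustration and are not the library's API, and the toy simply copies a token list, which says nothing about how the real implementation stores state.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a conversation: a token history with fork/rewind.
// NOT the LLamaSharp API; every name here is hypothetical.
public sealed class ToyConversation
{
    private readonly List<int> _tokens;

    public ToyConversation() => _tokens = new List<int>();

    // Private copy constructor used by Fork().
    private ToyConversation(IEnumerable<int> tokens) => _tokens = new List<int>(tokens);

    public int TokenCount => _tokens.Count;

    // Append tokens to this conversation's history.
    public void Prompt(params int[] tokens) => _tokens.AddRange(tokens);

    // Create an independent copy of the conversation at its current point.
    public ToyConversation Fork() => new ToyConversation(_tokens);

    // Drop the last `count` tokens, returning to an earlier state.
    public void Rewind(int count)
    {
        if (count < 0 || count > _tokens.Count)
            throw new ArgumentOutOfRangeException(nameof(count));
        _tokens.RemoveRange(_tokens.Count - count, count);
    }
}

public static class Demo
{
    public static void Main()
    {
        var prefix = new ToyConversation();
        prefix.Prompt(1, 2, 3);          // shared system prefix, prompted once

        var a = prefix.Fork();           // each fork continues independently
        var b = prefix.Fork();
        a.Prompt(10, 11);
        b.Prompt(20);

        a.Rewind(1);                     // undo the last token of fork `a`

        // prints "3 4 4": the prefix is untouched by either fork
        Console.WriteLine($"{prefix.TokenCount} {a.TokenCount} {b.TokenCount}");
    }
}
```

The point of the sketch is that forks diverge without affecting the conversation they were created from, so the shared prefix only needs to be prompted once.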
- Added a `DecodeAsync` overload which runs the work in a task.
- Replaced some `NativeHandle` usage in `BatchedDecoding` with higher-level equivalents.
- Made `LLamaBatch` grow when its token capacity is exceeded, removing the need to manage token capacity externally.
- Refactored the chat completion implementation in `LLamaSharpChatCompletion.cs` to use `StatelessExecutor` instead of `InteractiveExecutor`.
- Updated the chat history prompt in `LLamaSharpChatCompletion.cs` to include a conversation between the assistant and the user.
- Modified the `HistoryTransform` class in `HistoryTransform.cs` to append the assistant role to the chat history prompt.
- Updated the constructor of `LLamaSharpChatCompletion` to accept optional parameters for `historyTransform` and `outputTransform`.
- Modified the `GetChatCompletionsAsync` and `GetChatCompletions` methods in `LLamaSharpChatCompletion.cs` to use the new `StatelessExecutor` and `outputTransform`.
- Updated the `ExtensionMethods.cs` file to include the assistant and system roles in the list of anti-prompts.
- Renamed files and updated namespaces in the `Examples` folder.
- Moved files from the `NewVersion` folder to the `Examples` folder.
- Removed the `TestRunner.cs` file.
- Updated `Runner.cs` to include the new examples.
- Updated `Program.cs` to use the new `Runner` class instead of `NewVersionTestRunner`.
- Updated the `LLama.Examples` namespace in `Program.cs`.
- Replaced `await NewVersionTestRunner.Run()` in `Program.cs` with `await Runner.Run()`.