LLamaSharp

Branch: master

Author	SHA1	Message	Date
Martin Evans	3ba49754b1	Removed (marked as obsolete) prompting with a string for `Conversation`. Tokenization requires extra parameters (e.g. addBos, special) which need special consideration. For now it's better to tokenize using other tools and pass the tokens directly.	1 year ago
Martin Evans	ccc49eb1e0	BatchedExecutor Save/Load (#681) * Added the ability to save and load individual conversations in a batched executor. - New example - Added `BatchedExecutor.Load(filepath)` method - Added `Conversation.Save(filepath)` method - Added new (currently internal) `SaveState`/`LoadState` methods in LLamaContext which can stash some extra binary data in the header * Added ability to save/load a `Conversation` to an in-memory state, instead of to file. * Moved the new save/load methods out to an extension class specifically for the batched executor. * Removed unnecessary spaces	1 year ago
Martin Evans	268f3a6b07	BatchedExecutor Fixed Forking (#621) * Previously when a conversation was forked this would result in both the parent and the child sharing exactly the same logits. Since sampling is allowed to modify logits this could lead to issues in sampling (e.g. one conversation is sampled and overwrites logits to be all zero, second conversation is sampled and generates nonsense). Fixed this by setting a "forked" flag, logits are copied if this flag is set. Flag is cleared next time the conversation is prompted so this extra copying only happens once after a fork occurs. * Removed finalizer from `BatchedExecutor`. This class does not directly own any unmanaged resources so it is not necessary.	1 year ago
Martin Evans	ad682fbebd	`BatchedExecutor.Create()` method (#613) Replaced `BatchedExecutor.Prompt(string)` method with `BatchedExecutor.Create()` method. This improves the API in two ways: - A conversation can be created, without immediately prompting it - Other prompting overloads (e.g. prompt with token list) can be used without duplicating all the overloads onto `BatchedExecutor` Added `BatchSize` property to `LLamaContext`	1 year ago
Martin Evans	c7d0dc915a	Assorted small changes to clean up some code warnings	1 year ago
Martin Evans	1cc463b9b7	Added a finalizer to `BatchedExecutor`	1 year ago
Martin Evans	949861a581	- Added a `Modify` method to `Conversation`. This grants temporary access to directly modify the KV cache. - Re-implemented `Rewind` as an extension method using `Modify` internally - Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling. - Starting batch at epoch 1, this ensures that conversations (starting at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.	1 year ago
Martin Evans	b0acecf080	Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix). Conversations can be "forked", to create a copy of a conversation at a given point. This allows e.g. prompting a conversation with a system prefix just once and then forking it again and again for each individual conversation. Conversations can also be "rewound" to an earlier state. Added two new examples, demonstrating forking and rewinding.	1 year ago

8 Commits (master)