# BatchedExecutor

Namespace: LLama.Batched

A batched executor that can infer multiple separate "conversations" simultaneously.

```csharp
public sealed class BatchedExecutor : System.IDisposable
```

Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [BatchedExecutor](./llama.batched.batchedexecutor.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
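
For orientation, a minimal sketch of the batched pattern. The model path is a placeholder, and the `Prompt` member used on the conversations lives on [Conversation](./llama.batched.conversation.md) rather than on this class, so treat its exact overload as an assumption:

```csharp
using System.Threading;
using LLama;
using LLama.Batched;
using LLama.Common;

// One executor wraps one context, but can run many logical conversations.
// ModelParams implements IContextParams, so it serves for both parameters.
var parameters = new ModelParams(@"models/model.gguf"); // placeholder path
using var model = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(model, parameters);

// Two independent conversations, evaluated together in a single batch.
using var first = executor.Create();
using var second = executor.Create();
first.Prompt("Once upon a time");        // assumed overload; see Conversation docs
second.Prompt("It was a dark night");

// One Infer call decodes the queued tokens of both conversations at once.
var result = await executor.Infer(CancellationToken.None);
```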
## Properties

### **Context**

The [LLamaContext](./llama.llamacontext.md) this executor is using.

```csharp
public LLamaContext Context { get; }
```

#### Property Value

[LLamaContext](./llama.llamacontext.md)<br>

### **Model**

The [LLamaWeights](./llama.llamaweights.md) this executor is using.

```csharp
public LLamaWeights Model { get; }
```

#### Property Value

[LLamaWeights](./llama.llamaweights.md)<br>
### **BatchedTokenCount**

The number of tokens currently queued in the batch, waiting for [BatchedExecutor.Infer(CancellationToken)](./llama.batched.batchedexecutor.md#infercancellationtoken) to be called.

```csharp
public int BatchedTokenCount { get; }
```

#### Property Value

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
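
As a small usage sketch (the helper name is hypothetical), this count can gate `Infer` so that empty batches are never decoded:

```csharp
using System.Threading;
using System.Threading.Tasks;
using LLama.Batched;

static async Task InferIfPendingAsync(BatchedExecutor executor, CancellationToken token)
{
    // No conversation has queued tokens, so a decode would do nothing.
    if (executor.BatchedTokenCount == 0)
        return;

    await executor.Infer(token);
}
```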
### **IsDisposed**

Check if this executor has been disposed.

```csharp
public bool IsDisposed { get; private set; }
```

#### Property Value

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

## Constructors

### **BatchedExecutor(LLamaWeights, IContextParams)**

Create a new batched executor.

```csharp
public BatchedExecutor(LLamaWeights model, IContextParams contextParams)
```

#### Parameters

`model` [LLamaWeights](./llama.llamaweights.md)<br>
The model to use

`contextParams` [IContextParams](./llama.abstractions.icontextparams.md)<br>
Parameters to create a new context
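
For example, a concrete `ModelParams` (which implements both the model and context parameter interfaces in LLamaSharp) can serve as the `contextParams` argument; the path and context size below are placeholder values:

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

// Load the weights once, then hand the same parameters object to the
// executor so it can create its internal context.
var parameters = new ModelParams(@"models/model.gguf") // placeholder path
{
    ContextSize = 4096, // placeholder; size the KV cache for your workload
};

using var model = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(model, parameters);
```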
## Methods

### **Prompt(String)**

#### Caution

Use BatchedExecutor.Create instead

---

Start a new [Conversation](./llama.batched.conversation.md) with the given prompt.

```csharp
public Conversation Prompt(string prompt)
```

#### Parameters

`prompt` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

#### Returns

[Conversation](./llama.batched.conversation.md)<br>

### **Create()**

Start a new [Conversation](./llama.batched.conversation.md).

```csharp
public Conversation Create()
```

#### Returns

[Conversation](./llama.batched.conversation.md)<br>
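
Each call starts an independent sequence that shares the executor's weights and context. A sketch, assuming an `executor` built as in the constructor example above (the prompting member is documented on [Conversation](./llama.batched.conversation.md)):

```csharp
using LLama.Batched;

// Every Create() call begins a separate sequence in the shared batch.
using Conversation a = executor.Create();
using Conversation b = executor.Create();

// Queue tokens on each conversation before calling Infer; the Prompt
// members are documented on the Conversation page (overload assumed here).
a.Prompt("Write a haiku about batching.");
b.Prompt("Write a haiku about caching.");
```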
### **Infer(CancellationToken)**

Run inference for all conversations in the batch which have pending tokens.

If the result is `NoKvSlot` then there is not enough memory in the KV cache for inference; try disposing some conversations and running inference again.

```csharp
public Task<DecodeResult> Infer(CancellationToken cancellation)
```

#### Parameters

`cancellation` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>

#### Returns

[Task<DecodeResult>](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>
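
A hedged sketch of the retry pattern the remark above suggests. The helper and the `conversations` list are hypothetical, and `DecodeResult` is assumed to live in `LLama.Native` alongside [LLamaSeqId](./llama.native.llamaseqid.md):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using LLama.Batched;
using LLama.Native; // assumed namespace for DecodeResult

// Hypothetical helper: retry inference, evicting conversations when the
// KV cache has no free slot, as the remarks above recommend.
static async Task<DecodeResult> InferWithEvictionAsync(
    BatchedExecutor executor,
    List<Conversation> conversations,
    CancellationToken token)
{
    var result = await executor.Infer(token);

    while (result == DecodeResult.NoKvSlot && conversations.Count > 0)
    {
        // Not enough KV cache memory: dispose one conversation to free its
        // slots, then try the remaining batch again.
        var victim = conversations[^1];
        conversations.RemoveAt(conversations.Count - 1);
        victim.Dispose();

        result = await executor.Infer(token);
    }

    return result;
}
```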
### **Dispose()**

```csharp
public void Dispose()
```

### **GetNextSequenceId()**

```csharp
internal LLamaSeqId GetNextSequenceId()
```

#### Returns

[LLamaSeqId](./llama.native.llamaseqid.md)<br>