# BatchedExecutor

Namespace: LLama.Batched

A batched executor that can infer multiple separate "conversations" simultaneously.

```csharp
public sealed class BatchedExecutor : System.IDisposable
```

Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [BatchedExecutor](./llama.batched.batchedexecutor.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
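
A typical lifecycle is to load a model, construct one executor over it, and then drive several [Conversation](./llama.batched.conversation.md) instances against the shared context. A minimal sketch — `ModelParams` and `LLamaWeights.LoadFromFile` are assumed from elsewhere in this library and are not documented on this page:

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

// Assumption: ModelParams implements IContextParams and
// LLamaWeights.LoadFromFile loads a GGUF model (not confirmed by this page).
var parameters = new ModelParams(@"path/to/model.gguf");
using var model = LLamaWeights.LoadFromFile(parameters);

// One executor, one context; multiple conversations share the same batch.
using var executor = new BatchedExecutor(model, parameters);
var left = executor.Create();
var right = executor.Create();
```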
## Properties

### **Context**

The [LLamaContext](./llama.llamacontext.md) this executor is using.

```csharp
public LLamaContext Context { get; }
```

#### Property Value

[LLamaContext](./llama.llamacontext.md)<br>

### **Model**

The [LLamaWeights](./llama.llamaweights.md) this executor is using.

```csharp
public LLamaWeights Model { get; }
```

#### Property Value

[LLamaWeights](./llama.llamaweights.md)<br>

### **BatchedTokenCount**

Get the number of tokens in the batch, waiting for [BatchedExecutor.Infer(CancellationToken)](./llama.batched.batchedexecutor.md#infercancellationtoken) to be called.

```csharp
public int BatchedTokenCount { get; }
```

#### Property Value

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **IsDisposed**

Check if this executor has been disposed.

```csharp
public bool IsDisposed { get; private set; }
```

#### Property Value

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
## Constructors

### **BatchedExecutor(LLamaWeights, IContextParams)**

Create a new batched executor.

```csharp
public BatchedExecutor(LLamaWeights model, IContextParams contextParams)
```

#### Parameters

`model` [LLamaWeights](./llama.llamaweights.md)<br>
The model to use

`contextParams` [IContextParams](./llama.abstractions.icontextparams.md)<br>
Parameters to create a new context
## Methods

### **Prompt(String)**

#### Caution

Use BatchedExecutor.Create instead

---

Start a new [Conversation](./llama.batched.conversation.md) with the given prompt.

```csharp
public Conversation Prompt(string prompt)
```

#### Parameters

`prompt` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

#### Returns

[Conversation](./llama.batched.conversation.md)<br>

### **Create()**

Start a new [Conversation](./llama.batched.conversation.md).

```csharp
public Conversation Create()
```

#### Returns

[Conversation](./llama.batched.conversation.md)<br>
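
Because `Prompt(String)` is deprecated, new code should start a conversation with `Create()` and supply the prompt through the returned [Conversation](./llama.batched.conversation.md). A sketch — the Conversation-side prompt method is an assumption, as it is not documented on this page:

```csharp
// Deprecated: var conversation = executor.Prompt("Hello");
// Preferred:
var conversation = executor.Create();
conversation.Prompt(promptTokens); // assumed Conversation API; see the Conversation docs
```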
### **Infer(CancellationToken)**

Run inference for all conversations in the batch which have pending tokens.

If the result is `NoKvSlot` then there is not enough memory for inference; try disposing some conversations and running inference again.

```csharp
public Task<DecodeResult> Infer(CancellationToken cancellation)
```

#### Parameters

`cancellation` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>

#### Returns

[Task&lt;DecodeResult&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>
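
The returned `DecodeResult` lets a caller recover from KV-cache exhaustion by freeing conversations and retrying, as the note above describes. A hedged sketch of that loop — the `conversations` list and the `DecodeResult.NoKvSlot` comparison are illustrative assumptions:

```csharp
// Run one inference step over all conversations with pending tokens.
var result = await executor.Infer(CancellationToken.None);

if (result == DecodeResult.NoKvSlot)
{
    // Not enough KV-cache memory: dispose a conversation to free its slots,
    // then retry inference for the remaining conversations.
    conversations[0].Dispose();
    conversations.RemoveAt(0);
    result = await executor.Infer(CancellationToken.None);
}
```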
### **Dispose()**

```csharp
public void Dispose()
```

### **GetNextSequenceId()**

```csharp
internal LLamaSeqId GetNextSequenceId()
```

#### Returns

[LLamaSeqId](./llama.native.llamaseqid.md)<br>