Namespace: LLama.Batched
A batched executor that can infer multiple separate "conversations" simultaneously.
public sealed class BatchedExecutor : System.IDisposable
Inheritance Object → BatchedExecutor
Implements IDisposable
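A rough usage sketch follows. The ModelParams type, LLamaWeights.LoadFromFile, and the token-based Conversation.Prompt overload come from the wider LLamaSharp API and are assumptions beyond what this page documents:

```csharp
using System.Threading;
using LLama;
using LLama.Batched;
using LLama.Common;

// Load the weights once; the executor creates its own context from the same parameters.
var parameters = new ModelParams("models/llama-2-7b.gguf");
using var weights = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(weights, parameters);

// Two independent conversations sharing one context and one batch.
using var left = executor.Create();
using var right = executor.Create();

left.Prompt(executor.Context.Tokenize("Translate to French: cheese"));
right.Prompt(executor.Context.Tokenize("Translate to German: cheese"));

// A single call evaluates the pending tokens of every conversation at once.
var result = await executor.Infer(CancellationToken.None);
```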
The LLamaContext this executor is using
public LLamaContext Context { get; }
The LLamaWeights this executor is using
public LLamaWeights Model { get; }
Get the number of tokens currently queued in the batch, waiting for BatchedExecutor.Infer(CancellationToken) to be called
public int BatchedTokenCount { get; }
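For example (a fragment assuming an executor whose conversations have already been prompted), a caller can skip a decode pass when nothing is queued:

```csharp
// Only pay for a decode when at least one conversation has queued tokens.
if (executor.BatchedTokenCount > 0)
    await executor.Infer(CancellationToken.None);
```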
Check if this executor has been disposed.
public bool IsDisposed { get; private set; }
Create a new batched executor
public BatchedExecutor(LLamaWeights model, IContextParams contextParams)
model LLamaWeights
The model to use
contextParams IContextParams
Parameters to create a new context
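A minimal construction sketch; ModelParams and its ContextSize property come from the wider LLamaSharp API and are used here only as one example of an IContextParams implementation:

```csharp
using System;
using LLama;
using LLama.Batched;
using LLama.Common;

// ModelParams implements IContextParams, so it can configure the executor's context.
var contextParams = new ModelParams("models/llama-2-7b.gguf") { ContextSize = 4096 };

using var weights = LLamaWeights.LoadFromFile(contextParams);
using var executor = new BatchedExecutor(weights, contextParams);

// The executor owns a single context built from these parameters; its KV cache
// is shared by every conversation created from this executor.
Console.WriteLine(executor.Context.ContextSize);
```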
Start a new Conversation with the given prompt
Caution: this method is obsolete; use BatchedExecutor.Create instead
public Conversation Prompt(string prompt)
prompt String
Start a new Conversation
public Conversation Create()
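A sketch of starting one conversation per prompt; StartConversations is a hypothetical helper, and the token-based Conversation.Prompt overload is assumed from the wider LLamaSharp API:

```csharp
using System.Collections.Generic;
using LLama.Batched;

// Hypothetical helper: one conversation (and one KV cache sequence) per prompt.
static List<Conversation> StartConversations(BatchedExecutor executor, IEnumerable<string> prompts)
{
    var conversations = new List<Conversation>();
    foreach (var prompt in prompts)
    {
        var conversation = executor.Create();
        conversation.Prompt(executor.Context.Tokenize(prompt));
        conversations.Add(conversation);
    }
    return conversations;
}
```

The queued prompt tokens are not evaluated until the next call to Infer.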
Run inference for all conversations in the batch which have pending tokens.
If the result is NoKvSlot then there is not enough KV cache memory for inference; try disposing some conversations and running inference again.
public Task<DecodeResult> Infer(CancellationToken cancellation)
cancellation CancellationToken
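A sketch of an inference loop that follows the remark above: when NoKvSlot is returned, dispose a conversation to free KV cache space and retry. InferWithRetryAsync is a hypothetical helper, and DecodeResult.Ok is assumed to be the success value:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using LLama.Batched;
using LLama.Native;

// Hypothetical helper: run one inference pass, freeing conversations on NoKvSlot.
static async Task InferWithRetryAsync(BatchedExecutor executor, List<Conversation> conversations, CancellationToken cancellation)
{
    while (executor.BatchedTokenCount > 0)
    {
        var result = await executor.Infer(cancellation);

        if (result == DecodeResult.Ok)
            return;

        if (result == DecodeResult.NoKvSlot && conversations.Count > 0)
        {
            // Not enough KV cache memory: dispose one conversation to release its
            // sequence, then run inference over the remaining batch again.
            var victim = conversations[^1];
            conversations.RemoveAt(conversations.Count - 1);
            victim.Dispose();
            continue;
        }

        throw new InvalidOperationException($"Inference failed: {result}");
    }
}
```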
public void Dispose()
internal LLamaSeqId GetNextSequenceId()