Namespace: LLama.Native
A reference to a set of llama model weights
public sealed class SafeLlamaModelHandle : SafeLLamaHandleBase, System.IDisposable
Inheritance Object → CriticalFinalizerObject → SafeHandle → SafeLLamaHandleBase → SafeLlamaModelHandle
Implements IDisposable
Total number of tokens in vocabulary of this model
public int VocabCount { get; }
Size of the context, in tokens, that this model was trained with
public int ContextSize { get; }
Get the rope frequency this model was trained with
public float RopeFrequency { get; }
Dimension of embedding vectors
public int EmbeddingSize { get; }
Get the size of this model in bytes
public ulong SizeInBytes { get; }
Get the number of parameters in this model
public ulong ParameterCount { get; }
Get a description of this model
public string Description { get; }
Get the number of metadata key/value pairs
public int MetadataCount { get; }
public bool IsInvalid { get; }
public bool IsClosed { get; }
public SafeLlamaModelHandle()
protected bool ReleaseHandle()
Load a model from the given file path into memory
public static SafeLlamaModelHandle LoadFromFile(string modelPath, LLamaModelParams lparams)
modelPath String
lparams LLamaModelParams
Apply a LoRA adapter to a loaded model
path_base_model is the path to a higher quality model to use as a base for
the layers modified by the adapter. It can be NULL to use the currently loaded model.
The model needs to be reloaded before applying a new adapter; otherwise the adapter
will be applied on top of the previous one.
public static int llama_model_apply_lora_from_file(SafeLlamaModelHandle model_ptr, string path_lora, float scale, string path_base_model, int n_threads)
model_ptr SafeLlamaModelHandle
path_lora String
scale Single
path_base_model String
n_threads Int32
Int32
Returns 0 on success
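The native function reports failure through its return value rather than through an exception. The following is a minimal sketch of turning that convention into an exception. `LoraSketch.ThrowIfFailed` is a hypothetical helper, not part of LLamaSharp, and the return code is passed in as a plain `int` so the snippet runs without the native library.

```csharp
using System;

static class LoraSketch
{
    // Hypothetical helper: treats any non-zero return code as a failure,
    // following the "Returns 0 on success" contract documented above.
    public static void ThrowIfFailed(int returnCode, string loraPath)
    {
        if (returnCode != 0)
        {
            throw new InvalidOperationException(
                $"Applying LoRA adapter '{loraPath}' failed with code {returnCode}.");
        }
    }
}
```

A caller would pass the value returned by `llama_model_apply_lora_from_file` together with the `path_lora` argument it used.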
Get metadata value as a string by key name
public static int llama_model_meta_val_str(SafeLlamaModelHandle model, Byte* key, Byte* buf, long buf_size)
model SafeLlamaModelHandle
key Byte*
buf Byte*
buf_size Int64
Int32
The length of the string on success, or -1 on failure
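The return value has to be checked before the buffer is read. The sketch below interprets a result and buffer pair. It assumes, without confirmation from this page, that the value is UTF-8 encoded and that a returned length equal to or greater than the buffer size means the value was truncated. `MetaValueSketch.TryDecode` is a hypothetical helper that works on a managed span, so it runs without the native library.

```csharp
#nullable enable
using System;
using System.Text;

static class MetaValueSketch
{
    // result: the value returned by the native call (-1 on failure).
    // buf:    the buffer that was passed to the native call.
    public static bool TryDecode(int result, ReadOnlySpan<byte> buf, out string? value)
    {
        value = null;

        // Failure, or (assumed) truncation because the buffer was too small.
        if (result < 0 || result >= buf.Length)
            return false;

        // Assumption: metadata strings are UTF-8.
        value = Encoding.UTF8.GetString(buf.Slice(0, result));
        return true;
    }
}
```

When `TryDecode` returns false for a non-negative result, one option is to retry the native call with a larger buffer.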
Apply a LoRA adapter to a loaded model
public void ApplyLoraFromFile(string lora, float scale, string modelBase, Nullable<int> threads)
lora String
scale Single
modelBase String
A path to a higher quality model to use as a base for the layers modified by the
adapter. Can be NULL to use the currently loaded model.
threads Nullable<Int32>
Convert a single llama token into bytes
public uint TokenToSpan(LLamaToken token, Span<byte> dest)
token LLamaToken
Token to decode
dest Span<Byte>
A span to attempt to write into. If this is too small, nothing will be written.
UInt32
The size of this token. Nothing will be written if this is larger than dest.
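Because the method returns the token's size even when nothing is written, a caller can try a small buffer first and retry with an exactly sized one. The sketch below shows that pattern. `FakeTokenToSpan` is a hypothetical stand-in that mimics the documented contract; it takes the token's bytes directly so the example runs without a loaded model.

```csharp
using System;

static class TokenDecodeSketch
{
    // Hypothetical stand-in for TokenToSpan: returns the token's size in bytes
    // and writes into dest only if the whole token fits.
    static uint FakeTokenToSpan(byte[] tokenBytes, Span<byte> dest)
    {
        if (tokenBytes.Length <= dest.Length)
            tokenBytes.CopyTo(dest);
        return (uint)tokenBytes.Length;
    }

    public static byte[] Decode(byte[] tokenBytes)
    {
        // First attempt: a small stack buffer that fits most tokens.
        Span<byte> small = stackalloc byte[4];
        uint size = FakeTokenToSpan(tokenBytes, small);
        if (size <= small.Length)
            return small.Slice(0, (int)size).ToArray();

        // Nothing was written; retry with a buffer of the reported size.
        var big = new byte[size];
        FakeTokenToSpan(tokenBytes, big);
        return big;
    }
}
```

With the real handle, the two `FakeTokenToSpan` calls would be replaced by `TokenToSpan(token, dest)` on the model.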
Use a StreamingTokenDecoder instead
Convert a sequence of tokens into characters.
internal Span<char> TokensToSpan(IReadOnlyList<LLamaToken> tokens, Span<char> dest, Encoding encoding)
tokens IReadOnlyList<LLamaToken>
dest Span<Char>
encoding Encoding
Span<Char>
The section of the span which has valid data in it.
If there was insufficient space in the output span, this will be
filled with as many characters as possible, starting from the last token.
Convert a string of text into tokens
public LLamaToken[] Tokenize(string text, bool add_bos, bool special, Encoding encoding)
text String
add_bos Boolean
special Boolean
Allow tokenizing special and/or control tokens, which are otherwise not exposed and are treated as plaintext.
encoding Encoding
Create a new context for this model
public SafeLLamaContextHandle CreateContext(LLamaContextParams params)
params LLamaContextParams
Get the metadata key for the given index
public Nullable<Memory<byte>> MetadataKeyByIndex(int index)
index Int32
The index to get
Nullable<Memory<Byte>>
The key, or null if there is no such key or if the buffer was too small
Get the metadata value for the given index
public Nullable<Memory<byte>> MetadataValueByIndex(int index)
index Int32
The index to get
Nullable<Memory<Byte>>
The value, or null if there is no such value or if the buffer was too small
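The two index-based accessors can be combined with MetadataCount to collect all metadata into a dictionary. The sketch below is one way to do that, not necessarily how the library does it. `MetadataSketch.ReadAll` is a hypothetical helper that takes the accessors as delegates so it runs without a model, and it assumes keys and values are UTF-8 encoded.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class MetadataSketch
{
    public static Dictionary<string, string> ReadAll(
        int count,
        Func<int, Memory<byte>?> keyByIndex,
        Func<int, Memory<byte>?> valueByIndex)
    {
        var result = new Dictionary<string, string>();
        for (int i = 0; i < count; i++)
        {
            Memory<byte>? key = keyByIndex(i);
            Memory<byte>? value = valueByIndex(i);

            // Skip entries where either lookup returned null
            // (no such entry, or the buffer was too small).
            if (key is null || value is null)
                continue;

            // Assumption: both key and value are UTF-8 text.
            result[Encoding.UTF8.GetString(key.Value.Span)] =
                Encoding.UTF8.GetString(value.Value.Span);
        }
        return result;
    }
}
```

With a loaded model, the call would look like `MetadataSketch.ReadAll(model.MetadataCount, model.MetadataKeyByIndex, model.MetadataValueByIndex)`, where `model` is a SafeLlamaModelHandle.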
internal IReadOnlyDictionary<string, string> ReadMetadata()
IReadOnlyDictionary<String, String>
internal static int <llama_model_meta_key_by_index>g__llama_model_meta_key_by_index_native|23_0(SafeLlamaModelHandle model, int index, Byte* buf, long buf_size)
model SafeLlamaModelHandle
index Int32
buf Byte*
buf_size Int64
internal static int <llama_model_meta_val_str_by_index>g__llama_model_meta_val_str_by_index_native|24_0(SafeLlamaModelHandle model, int index, Byte* buf, long buf_size)
model SafeLlamaModelHandle
index Int32
buf Byte*
buf_size Int64