# NativeApi

Namespace: LLama.Native

```csharp
public class NativeApi
```

Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [NativeApi](./llama.native.nativeapi.md)

## Constructors

### **NativeApi()**

```csharp
public NativeApi()
```

## Methods

### **llama_print_timings(SafeLLamaContextHandle)**

```csharp
public static void llama_print_timings(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

### **llama_reset_timings(SafeLLamaContextHandle)**

```csharp
public static void llama_reset_timings(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

### **llama_print_system_info()**

Print system information.

```csharp
public static IntPtr llama_print_system_info()
```

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_model_quantize(String, String, LLamaFtype, Int32)**

```csharp
public static int llama_model_quantize(string fname_inp, string fname_out, LLamaFtype ftype, int nthread)
```

#### Parameters

`fname_inp` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`fname_out` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`ftype` [LLamaFtype](./llama.native.llamaftype.md)<br>
`nthread` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
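A minimal sketch of a quantization call, assuming the file names below (they are illustrative, not part of this reference) and that `LLamaFtype` mirrors the upstream llama.cpp enum member names:

```csharp
using LLama.Native;

// Quantize an f16 model to 4-bit. File paths are hypothetical examples.
// The enum member name is assumed to follow llama.cpp's LLAMA_FTYPE_* naming.
int result = NativeApi.llama_model_quantize(
    "ggml-model-f16.bin",               // fname_inp: source model
    "ggml-model-q4_0.bin",              // fname_out: quantized output
    LLamaFtype.LLAMA_FTYPE_MOSTLY_Q4_0, // target quantization format
    nthread: 4);                        // worker threads
```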
### **llama_sample_repetition_penalty(SafeLLamaContextHandle, IntPtr, Int32[], UInt64, Single)**

Repetition penalty, as described in the CTRL paper (https://arxiv.org/abs/1909.05858), with a negative-logit fix.

```csharp
public static void llama_sample_repetition_penalty(SafeLLamaContextHandle ctx, IntPtr candidates, Int32[] last_tokens, ulong last_tokens_size, float penalty)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray
`last_tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`last_tokens_size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
`penalty` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

### **llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle, IntPtr, Int32[], UInt64, Single, Single)**

Frequency and presence penalties, as described in the OpenAI API documentation (https://platform.openai.com/docs/api-reference/parameter-details).

```csharp
public static void llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle ctx, IntPtr candidates, Int32[] last_tokens, ulong last_tokens_size, float alpha_frequency, float alpha_presence)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray
`last_tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`last_tokens_size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
`alpha_frequency` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`alpha_presence` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

### **llama_sample_softmax(SafeLLamaContextHandle, IntPtr)**

Sorts candidate tokens by their logits in descending order and calculates probabilities based on the logits.

```csharp
public static void llama_sample_softmax(SafeLLamaContextHandle ctx, IntPtr candidates)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray

### **llama_sample_top_k(SafeLLamaContextHandle, IntPtr, Int32, UInt64)**

Top-K sampling, described in the paper "The Curious Case of Neural Text Degeneration" (https://arxiv.org/abs/1904.09751).

```csharp
public static void llama_sample_top_k(SafeLLamaContextHandle ctx, IntPtr candidates, int k, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray
`k` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_top_p(SafeLLamaContextHandle, IntPtr, Single, UInt64)**

Nucleus sampling, described in the paper "The Curious Case of Neural Text Degeneration" (https://arxiv.org/abs/1904.09751).

```csharp
public static void llama_sample_top_p(SafeLLamaContextHandle ctx, IntPtr candidates, float p, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray
`p` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_tail_free(SafeLLamaContextHandle, IntPtr, Single, UInt64)**

Tail Free Sampling, described in https://www.trentonbricken.com/Tail-Free-Sampling/.

```csharp
public static void llama_sample_tail_free(SafeLLamaContextHandle ctx, IntPtr candidates, float z, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray
`z` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_typical(SafeLLamaContextHandle, IntPtr, Single, UInt64)**

Locally Typical Sampling, described in the paper https://arxiv.org/abs/2202.00666.

```csharp
public static void llama_sample_typical(SafeLLamaContextHandle ctx, IntPtr candidates, float p, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray
`p` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_temperature(SafeLLamaContextHandle, IntPtr, Single)**

```csharp
public static void llama_sample_temperature(SafeLLamaContextHandle ctx, IntPtr candidates, float temp)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
`temp` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

### **llama_sample_token_mirostat(SafeLLamaContextHandle, IntPtr, Single, Single, Int32, Single*)**

Mirostat 1.0 algorithm, described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.

```csharp
public static int llama_sample_token_mirostat(SafeLLamaContextHandle ctx, IntPtr candidates, float tau, float eta, int m, Single* mu)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
`tau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
`eta` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
`m` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The number of tokens considered in the estimation of `s_hat`. This is an arbitrary value that is used to calculate `s_hat`, which in turn helps to calculate the value of `k`. In the paper, they use `m = 100`, but you can experiment with different values to see how it affects the performance of the algorithm.
`mu` [Single*](https://docs.microsoft.com/en-us/dotnet/api/system.single*)<br>
Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
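As a sketch of how the parameters above fit together (assuming `ctx` is a valid context and `candidates` already points at a prepared `LLamaTokenDataArray`, whose construction is out of scope here), note that `mu` is passed by pointer because the algorithm updates it across calls:

```csharp
using System;
using LLama.Native;

// Sample one token with Mirostat 1.0. Values are the conventional
// starting points: tau = 5, eta = 0.1, m = 100 (as in the paper),
// and mu initialized to 2 * tau per the parameter docs above.
unsafe int SampleMirostat(SafeLLamaContextHandle ctx, IntPtr candidates)
{
    float tau = 5.0f;       // target surprise
    float eta = 0.1f;       // learning rate
    int m = 100;            // tokens used to estimate s_hat
    float mu = 2.0f * tau;  // updated in place by the native call
    return NativeApi.llama_sample_token_mirostat(ctx, candidates, tau, eta, m, &mu);
}
```

In a real generation loop, `mu` would be kept as state between iterations rather than re-initialized each call.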
### **llama_sample_token_mirostat_v2(SafeLLamaContextHandle, IntPtr, Single, Single, Single*)**

Mirostat 2.0 algorithm, described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.

```csharp
public static int llama_sample_token_mirostat_v2(SafeLLamaContextHandle ctx, IntPtr candidates, float tau, float eta, Single* mu)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
`tau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
`eta` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
`mu` [Single*](https://docs.microsoft.com/en-us/dotnet/api/system.single*)<br>
Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_sample_token_greedy(SafeLLamaContextHandle, IntPtr)**

Selects the token with the highest probability.

```csharp
public static int llama_sample_token_greedy(SafeLLamaContextHandle ctx, IntPtr candidates)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_sample_token(SafeLLamaContextHandle, IntPtr)**

Randomly selects a token from the candidates based on their probabilities.

```csharp
public static int llama_sample_token(SafeLLamaContextHandle ctx, IntPtr candidates)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to LLamaTokenDataArray

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
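The sampling functions above are designed to be chained: the filters mutate the candidate array in place, and a `llama_sample_token*` call picks the final token. A hedged sketch of a typical chain (the filter order and values follow common llama.cpp example code, and `candidates` is assumed to already point at a prepared `LLamaTokenDataArray`):

```csharp
using System;
using LLama.Native;

// One sampling step: penalize repeats, narrow the distribution,
// apply temperature, then draw a token. Parameter values are
// illustrative defaults, not prescriptions.
int Sample(SafeLLamaContextHandle ctx, IntPtr candidates, int[] lastTokens)
{
    NativeApi.llama_sample_repetition_penalty(
        ctx, candidates, lastTokens, (ulong)lastTokens.Length, penalty: 1.1f);
    NativeApi.llama_sample_top_k(ctx, candidates, k: 40, min_keep: 1);
    NativeApi.llama_sample_top_p(ctx, candidates, p: 0.95f, min_keep: 1);
    NativeApi.llama_sample_temperature(ctx, candidates, temp: 0.8f);
    return NativeApi.llama_sample_token(ctx, candidates);
}
```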
### **llama_empty_call()**

```csharp
public static bool llama_empty_call()
```

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

### **llama_context_default_params()**

```csharp
public static LLamaContextParams llama_context_default_params()
```

#### Returns

[LLamaContextParams](./llama.native.llamacontextparams.md)<br>

### **llama_mmap_supported()**

```csharp
public static bool llama_mmap_supported()
```

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

### **llama_mlock_supported()**

```csharp
public static bool llama_mlock_supported()
```

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

### **llama_init_from_file(String, LLamaContextParams)**

Loads a ggml LLaMA model from file, allocating (almost) all memory needed for the model. Returns NULL on failure.

```csharp
public static IntPtr llama_init_from_file(string path_model, LLamaContextParams params_)
```

#### Parameters

`path_model` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`params_` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
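Because this function returns a raw `IntPtr` rather than a safe handle, the caller is responsible for checking for NULL and for calling `llama_free`. A minimal sketch (the model path is a hypothetical example):

```csharp
using System;
using LLama.Native;

// Initialize the backend once, then load a model with default parameters.
NativeApi.llama_init_backend();
LLamaContextParams p = NativeApi.llama_context_default_params();
IntPtr ctx = NativeApi.llama_init_from_file("models/7B/ggml-model.bin", p);
if (ctx == IntPtr.Zero)
    throw new InvalidOperationException("Failed to load model");
try
{
    // ... run evaluation and sampling against ctx ...
}
finally
{
    NativeApi.llama_free(ctx); // release all memory owned by the context
}
```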
### **llama_init_backend()**

Not a great API - very likely to change. Initializes the llama + ggml backend. Call once at the start of the program.

```csharp
public static void llama_init_backend()
```

### **llama_free(IntPtr)**

Frees all allocated memory.

```csharp
public static void llama_free(IntPtr ctx)
```

#### Parameters

`ctx` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_apply_lora_from_file(SafeLLamaContextHandle, String, String, Int32)**

Apply a LoRA adapter to a loaded model. `path_base_model` is the path to a higher-quality model to use as a base for the layers modified by the adapter; it can be NULL to use the currently loaded model. The model needs to be reloaded before applying a new adapter, otherwise the adapter will be applied on top of the previous one.

```csharp
public static int llama_apply_lora_from_file(SafeLLamaContextHandle ctx, string path_lora, string path_base_model, int n_threads)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`path_lora` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`path_base_model` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns 0 on success
### **llama_get_kv_cache_token_count(SafeLLamaContextHandle)**

Returns the number of tokens in the KV cache.

```csharp
public static int llama_get_kv_cache_token_count(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_set_rng_seed(SafeLLamaContextHandle, Int32)**

Sets the current rng seed.

```csharp
public static void llama_set_rng_seed(SafeLLamaContextHandle ctx, int seed)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`seed` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_get_state_size(SafeLLamaContextHandle)**

Returns the maximum size in bytes of the state (rng, logits, embedding and kv_cache); the actual size will often be smaller after compacting tokens.

```csharp
public static ulong llama_get_state_size(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_copy_state_data(SafeLLamaContextHandle, Byte[])**

Copies the state to the specified destination address. The destination needs to have enough memory allocated. Returns the number of bytes copied.

```csharp
public static ulong llama_copy_state_data(SafeLLamaContextHandle ctx, Byte[] dest)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`dest` [Byte[]](https://docs.microsoft.com/en-us/dotnet/api/system.byte)<br>

#### Returns

[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
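Together with `llama_get_state_size`, this supports a simple snapshot pattern: size the buffer from the reported maximum, copy, then trim to the bytes actually written. A hedged sketch:

```csharp
using System;
using LLama.Native;

// Snapshot the context state into a managed byte array.
// llama_get_state_size returns an upper bound, so the buffer may be
// larger than what llama_copy_state_data actually fills.
byte[] SaveState(SafeLLamaContextHandle ctx)
{
    ulong size = NativeApi.llama_get_state_size(ctx);
    var buffer = new byte[size];
    ulong written = NativeApi.llama_copy_state_data(ctx, buffer);
    Array.Resize(ref buffer, (int)written); // keep only the copied bytes
    return buffer;
}
```

The snapshot can later be restored with `llama_set_state_data`.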
### **llama_set_state_data(SafeLLamaContextHandle, Byte[])**

Sets the state, reading from the specified address. Returns the number of bytes read.

```csharp
public static ulong llama_set_state_data(SafeLLamaContextHandle ctx, Byte[] src)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`src` [Byte[]](https://docs.microsoft.com/en-us/dotnet/api/system.byte)<br>

#### Returns

[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_load_session_file(SafeLLamaContextHandle, String, Int32[], UInt64, UInt64*)**

Load session file.

```csharp
public static bool llama_load_session_file(SafeLLamaContextHandle ctx, string path_session, Int32[] tokens_out, ulong n_token_capacity, UInt64* n_token_count_out)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`path_session` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`tokens_out` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_token_capacity` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
`n_token_count_out` [UInt64*](https://docs.microsoft.com/en-us/dotnet/api/system.uint64*)<br>

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

### **llama_save_session_file(SafeLLamaContextHandle, String, Int32[], UInt64)**

Save session file.

```csharp
public static bool llama_save_session_file(SafeLLamaContextHandle ctx, string path_session, Int32[] tokens, ulong n_token_count)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`path_session` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_token_count` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
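A hedged sketch of resuming a session from disk; the buffer capacity is an illustrative assumption, and `n_token_count_out` receives how many tokens the file actually contained:

```csharp
using System;
using LLama.Native;

// Try to restore a previously saved session. On success, `tokens`
// holds the token history that was stored with the session.
unsafe bool TryLoadSession(SafeLLamaContextHandle ctx, string path, out int[] tokens)
{
    var buffer = new Int32[2048]; // assumed capacity; must cover the stored count
    ulong count = 0;
    bool ok = NativeApi.llama_load_session_file(
        ctx, path, buffer, (ulong)buffer.Length, &count);
    tokens = ok ? buffer[..(int)count] : Array.Empty<int>();
    return ok;
}
```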
### **llama_eval(SafeLLamaContextHandle, Int32[], Int32, Int32, Int32)**

Run the llama inference to obtain the logits and probabilities for the next token. `tokens` + `n_tokens` is the provided batch of new tokens to process; `n_past` is the number of tokens to use from previous eval calls.

```csharp
public static int llama_eval(SafeLLamaContextHandle ctx, Int32[] tokens, int n_tokens, int n_past, int n_threads)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_past` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns 0 on success
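Since `n_past` counts tokens consumed by earlier calls, a caller typically advances it after each successful batch. A minimal sketch of that bookkeeping:

```csharp
using System;
using LLama.Native;

// Feed batches of tokens through the model, tracking n_past so each
// call continues from where the previous one left off.
int n_past = 0;
void Feed(SafeLLamaContextHandle ctx, int[] tokens)
{
    int rc = NativeApi.llama_eval(ctx, tokens, tokens.Length, n_past, n_threads: 4);
    if (rc != 0)
        throw new InvalidOperationException("llama_eval failed");
    n_past += tokens.Length; // subsequent calls build on these tokens
}
```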
### **llama_eval_with_pointer(SafeLLamaContextHandle, Int32*, Int32, Int32, Int32)**

```csharp
public static int llama_eval_with_pointer(SafeLLamaContextHandle ctx, Int32* tokens, int n_tokens, int n_past, int n_threads)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`tokens` [Int32*](https://docs.microsoft.com/en-us/dotnet/api/system.int32*)<br>
`n_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_past` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_tokenize(SafeLLamaContextHandle, String, Encoding, Int32[], Int32, Boolean)**

Convert the provided text into tokens. The `tokens` buffer must be large enough to hold the resulting tokens. Returns the number of tokens on success, no more than `n_max_tokens`; returns a negative number on failure - the number of tokens that would have been returned.

```csharp
public static int llama_tokenize(SafeLLamaContextHandle ctx, string text, Encoding encoding, Int32[] tokens, int n_max_tokens, bool add_bos)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_max_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
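The negative-on-failure convention enables a two-pass pattern: tokenize into a small buffer, and if the result is negative, retry with a buffer of exactly the required size. A hedged sketch:

```csharp
using System;
using System.Text;
using LLama.Native;

// Tokenize text, growing the buffer if the first attempt was too small.
// A negative return is the (negated) number of tokens that were needed.
int[] Tokenize(SafeLLamaContextHandle ctx, string text)
{
    var tokens = new Int32[64]; // initial guess
    int n = NativeApi.llama_tokenize(ctx, text, Encoding.UTF8, tokens, tokens.Length, add_bos: true);
    if (n < 0)
    {
        tokens = new Int32[-n]; // exact required size
        n = NativeApi.llama_tokenize(ctx, text, Encoding.UTF8, tokens, tokens.Length, add_bos: true);
    }
    return tokens[..n];
}
```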
### **llama_tokenize_native(SafeLLamaContextHandle, SByte[], Int32[], Int32, Boolean)**

```csharp
public static int llama_tokenize_native(SafeLLamaContextHandle ctx, SByte[] text, Int32[] tokens, int n_max_tokens, bool add_bos)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`text` [SByte[]](https://docs.microsoft.com/en-us/dotnet/api/system.sbyte)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_max_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_n_vocab(SafeLLamaContextHandle)**

```csharp
public static int llama_n_vocab(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_n_ctx(SafeLLamaContextHandle)**

```csharp
public static int llama_n_ctx(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_n_embd(SafeLLamaContextHandle)**

```csharp
public static int llama_n_embd(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_get_logits(SafeLLamaContextHandle)**

Token logits obtained from the last call to llama_eval(). The logits for the last token are stored in the last row. Can be mutated in order to change the probabilities of the next token. Rows: n_tokens. Cols: n_vocab.

```csharp
public static Single* llama_get_logits(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Single*](https://docs.microsoft.com/en-us/dotnet/api/system.single*)<br>
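Since the pointer refers to a row-major `[n_tokens, n_vocab]` buffer, reading the next-token logits means copying out the last row. A hedged sketch (assuming the previous `llama_eval` call processed a single token, so the last row is row 0):

```csharp
using LLama.Native;

// Copy the logits for the most recently evaluated token into a
// managed array. n_vocab gives the row width.
unsafe float[] LastLogits(SafeLLamaContextHandle ctx)
{
    int n_vocab = NativeApi.llama_n_vocab(ctx);
    float* logits = NativeApi.llama_get_logits(ctx);
    var row = new float[n_vocab];
    for (int i = 0; i < n_vocab; i++)
        row[i] = logits[i]; // with a multi-token batch, offset by (n_tokens - 1) * n_vocab instead
    return row;
}
```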
### **llama_get_embeddings(SafeLLamaContextHandle)**

Get the embeddings for the input. Shape: [n_embd] (1-dimensional).

```csharp
public static Single* llama_get_embeddings(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Single*](https://docs.microsoft.com/en-us/dotnet/api/system.single*)<br>

### **llama_token_to_str(SafeLLamaContextHandle, Int32)**

Token Id -> String. Uses the vocabulary in the provided context.

```csharp
public static IntPtr llama_token_to_str(SafeLLamaContextHandle ctx, int token)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to a string.
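The returned pointer must be marshalled into a managed string. A hedged sketch, assuming the native string is null-terminated UTF-8 (token pieces may also be partial multi-byte sequences, which a simple per-token conversion does not handle):

```csharp
using System;
using System.Runtime.InteropServices;
using LLama.Native;

// Convert a token id to its text piece.
string TokenToText(SafeLLamaContextHandle ctx, int token)
{
    IntPtr p = NativeApi.llama_token_to_str(ctx, token);
    return Marshal.PtrToStringUTF8(p) ?? string.Empty;
}
```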
### **llama_token_bos()**

```csharp
public static int llama_token_bos()
```

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_token_eos()**

```csharp
public static int llama_token_eos()
```

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_token_nl()**

```csharp
public static int llama_token_nl()
```

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
