

# NativeApi

Namespace: LLama.Native

Direct translation of the llama.cpp API

```csharp
public class NativeApi
```

Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [NativeApi](./llama.native.nativeapi.md)

## Constructors

### **NativeApi()**

```csharp
public NativeApi()
```

## Methods
### **llama_sample_token_mirostat(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Single, Single, Int32, Single&)**

Mirostat 1.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.

```csharp
public static int llama_sample_token_mirostat(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, float tau, float eta, int m, Single& mu)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.

`tau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.

`eta` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.

`m` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The number of tokens considered in the estimation of `s_hat`. This is an arbitrary value that is used to calculate `s_hat`, which in turn helps to calculate the value of `k`. In the paper, they use `m = 100`, but you can experiment with different values to see how it affects the performance of the algorithm.

`mu` [Single&](https://docs.microsoft.com/en-us/dotnet/api/system.single&)<br>
Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_sample_token_mirostat_v2(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Single, Single, Single&)**

Mirostat 2.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.

```csharp
public static int llama_sample_token_mirostat_v2(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, float tau, float eta, Single& mu)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.

`tau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.

`eta` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.

`mu` [Single&](https://docs.microsoft.com/en-us/dotnet/api/system.single&)<br>
Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
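As the `mu` description above notes, callers initialize `mu` to `2 * tau` and pass it by reference so the algorithm can update it between sampling steps. A minimal usage sketch (`ctx` and `candidates` are placeholders that must be prepared elsewhere - a valid context and the candidate logits for the current position):

```csharp
float tau = 5.0f;        // target surprise
float eta = 0.1f;        // learning rate
float mu  = 2.0f * tau;  // initialized to twice the target, as documented above

// `mu` is updated in place on every call, so the same variable should be
// reused across every step of the generation loop.
int token = NativeApi.llama_sample_token_mirostat_v2(ctx, ref candidates, tau, eta, ref mu);
```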
### **llama_sample_token_greedy(SafeLLamaContextHandle, LLamaTokenDataArrayNative&)**

Selects the token with the highest probability.

```csharp
public static int llama_sample_token_greedy(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_sample_token(SafeLLamaContextHandle, LLamaTokenDataArrayNative&)**

Randomly selects a token from the candidates based on their probabilities.

```csharp
public static int llama_sample_token(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
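These two methods are the terminal step of a sampling pipeline: after any filters have been applied to `candidates`, exactly one of them is called to pick the final token. A hedged sketch (`ctx` and `candidates` are placeholders):

```csharp
// Deterministic: always take the highest-probability candidate.
int tokenGreedy = NativeApi.llama_sample_token_greedy(ctx, ref candidates);

// Stochastic: draw from the candidate distribution using the context's RNG.
int tokenSampled = NativeApi.llama_sample_token(ctx, ref candidates);
```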
### **llama_token_to_str(SafeLLamaContextHandle, Int32)**

Token Id -> String. Uses the vocabulary in the provided context

```csharp
public static IntPtr llama_token_to_str(SafeLLamaContextHandle ctx, int token)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Pointer to a string.
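The returned IntPtr points at a native, null-terminated buffer owned by the library, so it should be marshalled to a managed string rather than freed. One way to do that (a sketch assuming UTF-8 text, which is what llama.cpp emits; `Marshal.PtrToStringUTF8` requires .NET Core 3.0 or later):

```csharp
IntPtr ptr = NativeApi.llama_token_to_str(ctx, token);
string? text = System.Runtime.InteropServices.Marshal.PtrToStringUTF8(ptr);
```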
### **llama_token_bos(SafeLLamaContextHandle)**

Get the "Beginning of sentence" token

```csharp
public static int llama_token_bos(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_token_eos(SafeLLamaContextHandle)**

Get the "End of sentence" token

```csharp
public static int llama_token_eos(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_token_nl(SafeLLamaContextHandle)**

Get the "new line" token

```csharp
public static int llama_token_nl(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_print_timings(SafeLLamaContextHandle)**

Print out timing information for this context

```csharp
public static void llama_print_timings(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

### **llama_reset_timings(SafeLLamaContextHandle)**

Reset all collected timing information for this context

```csharp
public static void llama_reset_timings(SafeLLamaContextHandle ctx)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

### **llama_print_system_info()**

Print system information

```csharp
public static IntPtr llama_print_system_info()
```

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_model_n_vocab(SafeLlamaModelHandle)**

Get the number of tokens in the model vocabulary

```csharp
public static int llama_model_n_vocab(SafeLlamaModelHandle model)
```

#### Parameters

`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_model_n_ctx(SafeLlamaModelHandle)**

Get the size of the context window for the model

```csharp
public static int llama_model_n_ctx(SafeLlamaModelHandle model)
```

#### Parameters

`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

### **llama_model_n_embd(SafeLlamaModelHandle)**

Get the dimension of embedding vectors from this model

```csharp
public static int llama_model_n_embd(SafeLlamaModelHandle model)
```

#### Parameters

`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_token_to_piece_with_model(SafeLlamaModelHandle, Int32, Byte*, Int32)**

Convert a single token into text

```csharp
public static int llama_token_to_piece_with_model(SafeLlamaModelHandle model, int llamaToken, Byte* buffer, int length)
```

#### Parameters

`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>

`llamaToken` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

`buffer` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>
buffer to write string into

`length` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
size of the buffer

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The length written, or, if the buffer is too small, a negative value that indicates the length required
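Because a negative return value reports the required size, callers can retry with a larger buffer. A sketch of that pattern (unsafe code; `model` and `token` are placeholders, and the UTF-8 decoding at the end is an assumption based on llama.cpp's vocabulary encoding):

```csharp
byte[] buffer = new byte[32];
int written;
unsafe
{
    fixed (byte* ptr = buffer)
        written = NativeApi.llama_token_to_piece_with_model(model, token, ptr, buffer.Length);

    if (written < 0)
    {
        // Buffer too small: the negated result is the required length.
        buffer = new byte[-written];
        fixed (byte* ptr = buffer)
            written = NativeApi.llama_token_to_piece_with_model(model, token, ptr, buffer.Length);
    }
}
string piece = System.Text.Encoding.UTF8.GetString(buffer, 0, written);
```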
### **llama_tokenize_with_model(SafeLlamaModelHandle, Byte*, Int32*, Int32, Boolean)**

Convert text into tokens

```csharp
public static int llama_tokenize_with_model(SafeLlamaModelHandle model, Byte* text, Int32* tokens, int n_max_tokens, bool add_bos)
```

#### Parameters

`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>

`text` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>

`tokens` [Int32*](https://docs.microsoft.com/en-us/dotnet/api/system.int32*)<br>

`n_max_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns the number of tokens on success, no more than n_max_tokens.
Returns a negative number on failure - the number of tokens that would have been returned
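That return convention supports a two-pass pattern: tokenize once into a guessed buffer, and if the result is negative, retry with exactly `-n` slots. A sketch (unsafe; `model` is a placeholder, and the trailing null terminator on the text is an assumption about what the native side expects from a C-style string):

```csharp
byte[] text = System.Text.Encoding.UTF8.GetBytes("Hello, world\0");
int[] tokens = new int[64];
int n;
unsafe
{
    fixed (byte* textPtr = text)
    fixed (int* tokenPtr = tokens)
        n = NativeApi.llama_tokenize_with_model(model, textPtr, tokenPtr, tokens.Length, true);

    if (n < 0)
    {
        // Too small: -n is the token count the call would have produced.
        tokens = new int[-n];
        fixed (byte* textPtr = text)
        fixed (int* tokenPtr = tokens)
            n = NativeApi.llama_tokenize_with_model(model, textPtr, tokenPtr, tokens.Length, true);
    }
}
```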
### **llama_log_set(LLamaLogCallback)**

Register a callback to receive llama log messages

```csharp
public static void llama_log_set(LLamaLogCallback logCallback)
```

#### Parameters

`logCallback` [LLamaLogCallback](./llama.native.llamalogcallback.md)<br>

### **llama_grammar_init(LLamaGrammarElement**, UInt64, UInt64)**

Create a new grammar from the given set of grammar rules

```csharp
public static IntPtr llama_grammar_init(LLamaGrammarElement** rules, ulong n_rules, ulong start_rule_index)
```

#### Parameters

`rules` [LLamaGrammarElement**](./llama.native.llamagrammarelement**.md)<br>

`n_rules` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

`start_rule_index` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_grammar_free(IntPtr)**

Free all memory from the given SafeLLamaGrammarHandle

```csharp
public static void llama_grammar_free(IntPtr grammar)
```

#### Parameters

`grammar` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_sample_grammar(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, SafeLLamaGrammarHandle)**

Apply constraints from grammar

```csharp
public static void llama_sample_grammar(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, SafeLLamaGrammarHandle grammar)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>

`grammar` [SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>

### **llama_grammar_accept_token(SafeLLamaContextHandle, SafeLLamaGrammarHandle, Int32)**

Accepts the sampled token into the grammar

```csharp
public static void llama_grammar_accept_token(SafeLLamaContextHandle ctx, SafeLLamaGrammarHandle grammar, int token)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`grammar` [SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>

`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
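Together, `llama_sample_grammar` and `llama_grammar_accept_token` form a constrained-sampling loop: filter the candidates down to what the grammar can accept, sample, then inform the grammar of the chosen token so its parser state advances. A sketch (all handles and `candidates` are placeholders prepared elsewhere):

```csharp
// Remove candidates the grammar cannot accept at the current parse position.
NativeApi.llama_sample_grammar(ctx, ref candidates, grammar);

// Pick a token from the remaining candidates.
int token = NativeApi.llama_sample_token(ctx, ref candidates);

// Advance the grammar's internal state past the chosen token.
NativeApi.llama_grammar_accept_token(ctx, grammar, token);
```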
### **llama_model_quantize(String, String, LLamaModelQuantizeParams*)**

Returns 0 on success

```csharp
public static int llama_model_quantize(string fname_inp, string fname_out, LLamaModelQuantizeParams* param)
```

#### Parameters

`fname_inp` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

`fname_out` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

`param` [LLamaModelQuantizeParams*](./llama.native.llamamodelquantizeparams*.md)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns 0 on success

**Remarks:**

not great API - very likely to change
### **llama_sample_classifier_free_guidance(SafeLLamaContextHandle, LLamaTokenDataArrayNative, SafeLLamaContextHandle, Single)**

Apply classifier-free guidance to the logits as described in academic paper "Stay on topic with Classifier-Free Guidance" https://arxiv.org/abs/2306.17806

```csharp
public static void llama_sample_classifier_free_guidance(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative candidates, SafeLLamaContextHandle guidanceCtx, float scale)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative](./llama.native.llamatokendataarraynative.md)<br>
A vector of `llama_token_data` containing the candidate tokens, the logits must be directly extracted from the original generation context without being sorted.

`guidanceCtx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
A separate context from the same model. Other than a negative prompt at the beginning, it should have all generated and user input tokens copied from the main context.

`scale` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
Guidance strength. 1.0f means no guidance. Higher values mean stronger guidance.

### **llama_sample_repetition_penalty(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Int32*, UInt64, Single)**

Repetition penalty described in CTRL academic paper https://arxiv.org/abs/1909.05858, with negative logit fix.

```csharp
public static void llama_sample_repetition_penalty(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, Int32* last_tokens, ulong last_tokens_size, float penalty)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

`last_tokens` [Int32*](https://docs.microsoft.com/en-us/dotnet/api/system.int32*)<br>

`last_tokens_size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

`penalty` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

### **llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Int32*, UInt64, Single, Single)**

Frequency and presence penalties described in OpenAI API https://platform.openai.com/docs/api-reference/parameter-details.

```csharp
public static void llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, Int32* last_tokens, ulong last_tokens_size, float alpha_frequency, float alpha_presence)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

`last_tokens` [Int32*](https://docs.microsoft.com/en-us/dotnet/api/system.int32*)<br>

`last_tokens_size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

`alpha_frequency` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

`alpha_presence` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

### **llama_sample_classifier_free_guidance(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, SafeLLamaContextHandle, Single)**

Apply classifier-free guidance to the logits as described in academic paper "Stay on topic with Classifier-Free Guidance" https://arxiv.org/abs/2306.17806

```csharp
public static void llama_sample_classifier_free_guidance(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, SafeLLamaContextHandle guidance_ctx, float scale)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
A vector of `llama_token_data` containing the candidate tokens, the logits must be directly extracted from the original generation context without being sorted.

`guidance_ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
A separate context from the same model. Other than a negative prompt at the beginning, it should have all generated and user input tokens copied from the main context.

`scale` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
Guidance strength. 1.0f means no guidance. Higher values mean stronger guidance.
### **llama_sample_softmax(SafeLLamaContextHandle, LLamaTokenDataArrayNative&)**

Sorts candidate tokens by their logits in descending order and calculates probabilities based on the logits.

```csharp
public static void llama_sample_softmax(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray
### **llama_sample_top_k(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Int32, UInt64)**

Top-K sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751

```csharp
public static void llama_sample_top_k(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, int k, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

`k` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>

`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_top_p(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Single, UInt64)**

Nucleus sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751

```csharp
public static void llama_sample_top_p(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, float p, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

`p` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_tail_free(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Single, UInt64)**

Tail Free Sampling described in https://www.trentonbricken.com/Tail-Free-Sampling/.

```csharp
public static void llama_sample_tail_free(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, float z, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

`z` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_typical(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Single, UInt64)**

Locally Typical Sampling implementation described in the paper https://arxiv.org/abs/2202.00666.

```csharp
public static void llama_sample_typical(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, float p, ulong min_keep)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Pointer to LLamaTokenDataArray

`p` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>

`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>

### **llama_sample_temperature(SafeLLamaContextHandle, LLamaTokenDataArrayNative&, Single)**

Modify logits by temperature

```csharp
public static void llama_sample_temperature(SafeLLamaContextHandle ctx, LLamaTokenDataArrayNative& candidates, float temp)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`candidates` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>

`temp` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
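These filters are designed to be chained: each one mutates `candidates` in place, narrowing or reweighting the distribution before the final token is drawn. A typical ordering, sketched with illustrative values loosely mirroring llama.cpp's example defaults (`ctx` and `candidates` are placeholders):

```csharp
NativeApi.llama_sample_top_k(ctx, ref candidates, 40, 1);
NativeApi.llama_sample_tail_free(ctx, ref candidates, 1.0f, 1);   // 1.0 = disabled
NativeApi.llama_sample_typical(ctx, ref candidates, 1.0f, 1);     // 1.0 = disabled
NativeApi.llama_sample_top_p(ctx, ref candidates, 0.95f, 1);
NativeApi.llama_sample_temperature(ctx, ref candidates, 0.8f);
int token = NativeApi.llama_sample_token(ctx, ref candidates);
```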
### **llama_empty_call()**

A method that does nothing. This is a native method; calling it will force the llama native dependencies to be loaded.

```csharp
public static bool llama_empty_call()
```

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **llama_context_default_params()**

Create a LLamaContextParams with default values

```csharp
public static LLamaContextParams llama_context_default_params()
```

#### Returns

[LLamaContextParams](./llama.native.llamacontextparams.md)<br>

### **llama_model_quantize_default_params()**

Create a LLamaModelQuantizeParams with default values

```csharp
public static LLamaModelQuantizeParams llama_model_quantize_default_params()
```

#### Returns

[LLamaModelQuantizeParams](./llama.native.llamamodelquantizeparams.md)<br>

### **llama_mmap_supported()**

Check if memory mapping is supported

```csharp
public static bool llama_mmap_supported()
```

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **llama_mlock_supported()**

Check if memory locking is supported

```csharp
public static bool llama_mlock_supported()
```

#### Returns

[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **llama_eval_export(SafeLLamaContextHandle, String)**

Export a static computation graph for context of 511 and batch size of 1.
NOTE: since this functionality is mostly for debugging and demonstration purposes, we hardcode these parameters here to keep things simple.
IMPORTANT: do not use for anything else other than debugging and testing!

```csharp
public static int llama_eval_export(SafeLLamaContextHandle ctx, string fname)
```

#### Parameters

`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>

`fname` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

#### Returns

[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_load_model_from_file(String, LLamaContextParams)**

Load a ggml llama model from file.
Allocates (almost) all memory needed for the model.
Returns NULL on failure.

```csharp
public static IntPtr llama_load_model_from_file(string path_model, LLamaContextParams params)
```

#### Parameters

`path_model` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

`params` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_new_context_with_model(SafeLlamaModelHandle, LLamaContextParams)**

Create a new llama_context with the given model.
Return value should always be wrapped in SafeLLamaContextHandle!

```csharp
public static IntPtr llama_new_context_with_model(SafeLlamaModelHandle model, LLamaContextParams params)
```

#### Parameters

`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>

`params` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>

#### Returns

[IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
### **llama_backend_init(Boolean)**

Initialize the llama + ggml backend.
Call once at the start of the program.
not great API - very likely to change

```csharp
public static void llama_backend_init(bool numa)
```

#### Parameters

`numa` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

### **llama_free(IntPtr)**

Frees all allocated memory in the given llama_context

```csharp
public static void llama_free(IntPtr ctx)
```

#### Parameters

`ctx` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>

### **llama_free_model(IntPtr)**

Frees all allocated memory associated with a model

```csharp
public static void llama_free_model(IntPtr model)
```

#### Parameters

`model` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
### **llama_model_apply_lora_from_file(SafeLlamaModelHandle, String, String, Int32)**
Apply a LoRA adapter to a loaded model.
path_base_model is the path to a higher quality model to use as a base for the layers modified by the adapter. It can be NULL to use the currently loaded model.
The model needs to be reloaded before applying a new adapter, otherwise the adapter will be applied on top of the previous one.
```csharp
public static int llama_model_apply_lora_from_file(SafeLlamaModelHandle model_ptr, string path_lora, string path_base_model, int n_threads)
```
#### Parameters
`model_ptr` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
`path_lora` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`path_base_model` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns 0 on success
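For example, applying an adapter right after loading (the class name `NativeApi`, the adapter path, and `modelHandle` are hypothetical; passing null for path_base_model uses the currently loaded model):

```csharp
// modelHandle is a previously created SafeLlamaModelHandle.
int result = NativeApi.llama_model_apply_lora_from_file(
    modelHandle, "adapter.bin", null, Environment.ProcessorCount);
if (result != 0)
    throw new InvalidOperationException("Failed to apply LoRA adapter");
```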
### **llama_get_kv_cache_token_count(SafeLLamaContextHandle)**
Returns the number of tokens in the KV cache
```csharp
public static int llama_get_kv_cache_token_count(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_set_rng_seed(SafeLLamaContextHandle, Int32)**
Sets the current rng seed.
```csharp
public static void llama_set_rng_seed(SafeLLamaContextHandle ctx, int seed)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`seed` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_get_state_size(SafeLLamaContextHandle)**
Returns the maximum size in bytes of the state (rng, logits, embedding and kv_cache) - the actual state will often be smaller after compacting tokens.
```csharp
public static ulong llama_get_state_size(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **llama_copy_state_data(SafeLLamaContextHandle, Byte*)**
Copies the state to the specified destination address.
The destination needs to have enough memory allocated (see llama_get_state_size).
```csharp
public static ulong llama_copy_state_data(SafeLLamaContextHandle ctx, Byte* dest)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`dest` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
the number of bytes copied
### **llama_copy_state_data(SafeLLamaContextHandle, Byte[])**
Copies the state to the specified destination buffer.
The destination needs to have enough memory allocated (see llama_get_state_size).
```csharp
public static ulong llama_copy_state_data(SafeLLamaContextHandle ctx, Byte[] dest)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`dest` [Byte[]](https://docs.microsoft.com/en-us/dotnet/api/system.byte)<br>
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
the number of bytes copied
### **llama_set_state_data(SafeLLamaContextHandle, Byte*)**
Set the state reading from the specified address
```csharp
public static ulong llama_set_state_data(SafeLLamaContextHandle ctx, Byte* src)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`src` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
the number of bytes read
### **llama_set_state_data(SafeLLamaContextHandle, Byte[])**
Set the state reading from the specified buffer
```csharp
public static ulong llama_set_state_data(SafeLLamaContextHandle ctx, Byte[] src)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`src` [Byte[]](https://docs.microsoft.com/en-us/dotnet/api/system.byte)<br>
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
the number of bytes read
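A sketch of snapshotting and restoring context state with the Byte[] overloads above (the class name `NativeApi` and the `ctx` handle are assumptions):

```csharp
// Snapshot the full context state into a managed buffer sized by the
// maximum state size; llama_copy_state_data reports the bytes actually used.
ulong maxSize = NativeApi.llama_get_state_size(ctx);
var buffer = new byte[maxSize];
ulong written = NativeApi.llama_copy_state_data(ctx, buffer);

// ...later, restore the saved state (e.g. into a context for the same model).
ulong read = NativeApi.llama_set_state_data(ctx, buffer);
```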
### **llama_load_session_file(SafeLLamaContextHandle, String, Int32[], UInt64, UInt64*)**
Load a session file
```csharp
public static bool llama_load_session_file(SafeLLamaContextHandle ctx, string path_session, Int32[] tokens_out, ulong n_token_capacity, UInt64* n_token_count_out)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`path_session` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`tokens_out` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_token_capacity` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
`n_token_count_out` [UInt64*](https://docs.microsoft.com/en-us/dotnet/api/system.uint64*)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **llama_save_session_file(SafeLLamaContextHandle, String, Int32[], UInt64)**
Save a session file
```csharp
public static bool llama_save_session_file(SafeLLamaContextHandle ctx, string path_session, Int32[] tokens, ulong n_token_count)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`path_session` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_token_count` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
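A round-trip sketch for the session functions above. The class name `NativeApi`, the file name, the token capacity, and the `ctx` handle are assumptions; the pointer parameter requires an `unsafe` block:

```csharp
// Try to restore a previous session (the file name is hypothetical).
var tokens = new Int32[2048];
ulong tokenCount = 0;
unsafe
{
    ulong loaded;
    if (NativeApi.llama_load_session_file(ctx, "session.bin", tokens,
                                          (ulong)tokens.Length, &loaded))
        tokenCount = loaded;
}

// After evaluating more tokens, persist the session again.
NativeApi.llama_save_session_file(ctx, "session.bin", tokens, tokenCount);
```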
### **llama_eval(SafeLLamaContextHandle, Int32[], Int32, Int32, Int32)**
Run the llama inference to obtain the logits and probabilities for the next token.
tokens + n_tokens is the provided batch of new tokens to process.
n_past is the number of tokens to use from previous eval calls.
```csharp
public static int llama_eval(SafeLLamaContextHandle ctx, Int32[] tokens, int n_tokens, int n_past, int n_threads)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_past` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns 0 on success
### **llama_eval_with_pointer(SafeLLamaContextHandle, Int32*, Int32, Int32, Int32)**
Run the llama inference to obtain the logits and probabilities for the next token.
tokens + n_tokens is the provided batch of new tokens to process.
n_past is the number of tokens to use from previous eval calls.
```csharp
public static int llama_eval_with_pointer(SafeLLamaContextHandle ctx, Int32* tokens, int n_tokens, int n_past, int n_threads)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`tokens` [Int32*](https://docs.microsoft.com/en-us/dotnet/api/system.int32*)<br>
`n_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_past` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns 0 on success
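A sketch of one eval step, assuming a `NativeApi` class, a tokenized batch in `tokens`, and running counters `nTokens`/`nPast` (the thread count is an arbitrary choice):

```csharp
// Evaluate the new batch; nPast tokens from earlier calls are reused
// from the KV cache.
int rc = NativeApi.llama_eval(ctx, tokens, nTokens, nPast,
                              Environment.ProcessorCount);
if (rc != 0)
    throw new InvalidOperationException("llama_eval failed");
nPast += nTokens;
```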
### **llama_tokenize(SafeLLamaContextHandle, String, Encoding, Int32[], Int32, Boolean)**
Convert the provided text into tokens.
```csharp
public static int llama_tokenize(SafeLLamaContextHandle ctx, string text, Encoding encoding, Int32[] tokens, int n_max_tokens, bool add_bos)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_max_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns the number of tokens on success, no more than n_max_tokens.
Returns a negative number on failure - the number of tokens that would have been returned
### **llama_tokenize_native(SafeLLamaContextHandle, Byte*, Int32*, Int32, Boolean)**
Convert the provided text into tokens.
```csharp
public static int llama_tokenize_native(SafeLLamaContextHandle ctx, Byte* text, Int32* tokens, int n_max_tokens, bool add_bos)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`text` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>
`tokens` [Int32*](https://docs.microsoft.com/en-us/dotnet/api/system.int32*)<br>
`n_max_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Returns the number of tokens on success, no more than n_max_tokens.
Returns a negative number on failure - the number of tokens that would have been returned
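The negative-return convention above supports a two-pass pattern: call once with a guessed capacity, and if the result is negative, retry with a buffer of the negated size. A sketch (the class name `NativeApi` and the `ctx` handle are assumptions):

```csharp
// First attempt with a guessed capacity.
var tokens = new Int32[64];
int n = NativeApi.llama_tokenize(ctx, "Hello world", Encoding.UTF8,
                                 tokens, tokens.Length, true);
if (n < 0)
{
    // The text needs more room; -n is the required token count.
    tokens = new Int32[-n];
    n = NativeApi.llama_tokenize(ctx, "Hello world", Encoding.UTF8,
                                 tokens, tokens.Length, true);
}
```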
### **llama_n_vocab(SafeLLamaContextHandle)**
Get the number of tokens in the model vocabulary for this context
```csharp
public static int llama_n_vocab(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_n_ctx(SafeLLamaContextHandle)**
Get the size of the context window for the model for this context
```csharp
public static int llama_n_ctx(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_n_embd(SafeLLamaContextHandle)**
Get the dimension of embedding vectors from the model for this context
```csharp
public static int llama_n_embd(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_get_logits(SafeLLamaContextHandle)**
Token logits obtained from the last call to llama_eval().
The logits for the last token are stored in the last row.
Can be mutated in order to change the probabilities of the next token.<br>
Rows: n_tokens<br>
Cols: n_vocab
```csharp
public static Single* llama_get_logits(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[Single*](https://docs.microsoft.com/en-us/dotnet/api/system.single*)<br>
### **llama_get_embeddings(SafeLLamaContextHandle)**
Get the embeddings for the input
shape: [n_embd] (1-dimensional)
```csharp
public static Single* llama_get_embeddings(SafeLLamaContextHandle ctx)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Returns
[Single*](https://docs.microsoft.com/en-us/dotnet/api/system.single*)<br>
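A sketch of copying one row of logits into managed memory. This assumes a `NativeApi` class, a `ctx` handle, `System.Runtime.InteropServices.Marshal`, and that the context was evaluated such that the returned pointer addresses the row for the token of interest:

```csharp
unsafe
{
    int nVocab = NativeApi.llama_n_vocab(ctx);
    float* logits = NativeApi.llama_get_logits(ctx);

    // Copy one row (n_vocab floats) into a managed array for sampling.
    var row = new float[nVocab];
    Marshal.Copy((IntPtr)logits, row, 0, nVocab);
}
```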
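Reading the embedding vector follows the same pattern (the class name `NativeApi` and the `ctx` handle are assumptions; the pointer may be null if the context was not configured to produce embeddings):

```csharp
unsafe
{
    int nEmbd = NativeApi.llama_n_embd(ctx);
    float* ptr = NativeApi.llama_get_embeddings(ctx);
    if (ptr != null)
    {
        // Copy the 1-dimensional [n_embd] vector into managed memory.
        var embedding = new float[nEmbd];
        Marshal.Copy((IntPtr)ptr, embedding, 0, nEmbd);
    }
}
```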