index.html 69 kB


  1. <!doctype html>
  2. <html lang="en" class="no-js">
  3. <head>
  4. <meta charset="utf-8">
  5. <meta name="viewport" content="width=device-width,initial-scale=1">
  6. <link rel="prev" href="../NativeLibraryConfig/">
  7. <link rel="next" href="../ChatSession/">
  8. <link rel="icon" href="../../assets/images/favicon.png">
  9. <meta name="generator" content="mkdocs-1.4.3, mkdocs-material-9.1.20">
  10. <title>Use executors - LLamaSharp Documentation</title>
  11. <link rel="stylesheet" href="../../assets/stylesheets/main.eebd395e.min.css">
  12. <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  13. <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,300i,400,400i,700,700i%7CRoboto+Mono:400,400i,700,700i&display=fallback">
  14. <style>:root{--md-text-font:"Roboto";--md-code-font:"Roboto Mono"}</style>
  15. <script>__md_scope=new URL("../..",location),__md_hash=e=>[...e].reduce((e,_)=>(e<<5)-e+_.charCodeAt(0),0),__md_get=(e,_=localStorage,t=__md_scope)=>JSON.parse(_.getItem(t.pathname+"."+e)),__md_set=(e,_,t=localStorage,a=__md_scope)=>{try{t.setItem(a.pathname+"."+e,JSON.stringify(_))}catch(e){}}</script>
  16. </head>
  17. <body dir="ltr">
  18. <script>var palette=__md_get("__palette");if(palette&&"object"==typeof palette.color)for(var key of Object.keys(palette.color))document.body.setAttribute("data-md-color-"+key,palette.color[key])</script>
  19. <input class="md-toggle" data-md-toggle="drawer" type="checkbox" id="__drawer" autocomplete="off">
  20. <input class="md-toggle" data-md-toggle="search" type="checkbox" id="__search" autocomplete="off">
  21. <label class="md-overlay" for="__drawer"></label>
  22. <div data-md-component="skip">
  23. <a href="#llamasharp-executors" class="md-skip">
  24. Skip to content
  25. </a>
  26. </div>
  27. <div data-md-component="announce">
  28. </div>
  29. <div data-md-color-scheme="default" data-md-component="outdated" hidden>
  30. </div>
  31. <header class="md-header md-header--shadow" data-md-component="header">
  32. <nav class="md-header__inner md-grid" aria-label="Header">
  33. <a href="../.." title="LLamaSharp Documentation" class="md-header__button md-logo" aria-label="LLamaSharp Documentation" data-md-component="logo">
  34. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54Z"/></svg>
  35. </a>
  36. <label class="md-header__button md-icon" for="__drawer">
  37. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M3 6h18v2H3V6m0 5h18v2H3v-2m0 5h18v2H3v-2Z"/></svg>
  38. </label>
  39. <div class="md-header__title" data-md-component="header-title">
  40. <div class="md-header__ellipsis">
  41. <div class="md-header__topic">
  42. <span class="md-ellipsis">
  43. LLamaSharp Documentation
  44. </span>
  45. </div>
  46. <div class="md-header__topic" data-md-component="header-topic">
  47. <span class="md-ellipsis">
  48. Use executors
  49. </span>
  50. </div>
  51. </div>
  52. </div>
  53. <label class="md-header__button md-icon" for="__search">
  54. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M9.5 3A6.5 6.5 0 0 1 16 9.5c0 1.61-.59 3.09-1.56 4.23l.27.27h.79l5 5-1.5 1.5-5-5v-.79l-.27-.27A6.516 6.516 0 0 1 9.5 16 6.5 6.5 0 0 1 3 9.5 6.5 6.5 0 0 1 9.5 3m0 2C7 5 5 7 5 9.5S7 14 9.5 14 14 12 14 9.5 12 5 9.5 5Z"/></svg>
  55. </label>
  56. <div class="md-search" data-md-component="search" role="dialog">
  57. <label class="md-search__overlay" for="__search"></label>
  58. <div class="md-search__inner" role="search">
  59. <form class="md-search__form" name="search">
  60. <input type="text" class="md-search__input" name="query" aria-label="Search" placeholder="Search" autocapitalize="off" autocorrect="off" autocomplete="off" spellcheck="false" data-md-component="search-query" required>
  61. <label class="md-search__icon md-icon" for="__search">
  62. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M9.5 3A6.5 6.5 0 0 1 16 9.5c0 1.61-.59 3.09-1.56 4.23l.27.27h.79l5 5-1.5 1.5-5-5v-.79l-.27-.27A6.516 6.516 0 0 1 9.5 16 6.5 6.5 0 0 1 3 9.5 6.5 6.5 0 0 1 9.5 3m0 2C7 5 5 7 5 9.5S7 14 9.5 14 14 12 14 9.5 12 5 9.5 5Z"/></svg>
  63. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M20 11v2H8l5.5 5.5-1.42 1.42L4.16 12l7.92-7.92L13.5 5.5 8 11h12Z"/></svg>
  64. </label>
  65. <nav class="md-search__options" aria-label="Search">
  66. <button type="reset" class="md-search__icon md-icon" title="Clear" aria-label="Clear" tabindex="-1">
  67. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M19 6.41 17.59 5 12 10.59 6.41 5 5 6.41 10.59 12 5 17.59 6.41 19 12 13.41 17.59 19 19 17.59 13.41 12 19 6.41Z"/></svg>
  68. </button>
  69. </nav>
  70. </form>
  71. <div class="md-search__output">
  72. <div class="md-search__scrollwrap" data-md-scrollfix>
  73. <div class="md-search-result" data-md-component="search-result">
  74. <div class="md-search-result__meta">
  75. Initializing search
  76. </div>
  77. <ol class="md-search-result__list" role="presentation"></ol>
  78. </div>
  79. </div>
  80. </div>
  81. </div>
  82. </div>
  83. </nav>
  84. </header>
  85. <div class="md-container" data-md-component="container">
  86. <main class="md-main" data-md-component="main">
  87. <div class="md-main__inner md-grid">
  88. <div class="md-sidebar md-sidebar--primary" data-md-component="sidebar" data-md-type="navigation" >
  89. <div class="md-sidebar__scrollwrap">
  90. <div class="md-sidebar__inner">
  91. <nav class="md-nav md-nav--primary" aria-label="Navigation" data-md-level="0">
  92. <label class="md-nav__title" for="__drawer">
  93. <a href="../.." title="LLamaSharp Documentation" class="md-nav__button md-logo" aria-label="LLamaSharp Documentation" data-md-component="logo">
  94. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54Z"/></svg>
  95. </a>
  96. LLamaSharp Documentation
  97. </label>
  98. <ul class="md-nav__list" data-md-scrollfix>
  99. <li class="md-nav__item">
  100. <a href="../.." class="md-nav__link">
  101. Overview
  102. </a>
  103. </li>
  104. <li class="md-nav__item">
  105. <a href="../../QuickStart/" class="md-nav__link">
  106. Quick Start
  107. </a>
  108. </li>
  109. <li class="md-nav__item">
  110. <a href="../../Architecture/" class="md-nav__link">
  111. Architecture
  112. </a>
  113. </li>
  114. <li class="md-nav__item">
  115. <a href="../../FAQ/" class="md-nav__link">
  116. FAQ
  117. </a>
  118. </li>
  119. <li class="md-nav__item">
  120. <a href="../../ContributingGuide/" class="md-nav__link">
  121. Contributing Guide
  122. </a>
  123. </li>
  124. <li class="md-nav__item md-nav__item--active md-nav__item--nested">
  125. <input class="md-nav__toggle md-toggle " type="checkbox" id="__nav_6" checked>
  126. <label class="md-nav__link" for="__nav_6" id="__nav_6_label" tabindex="0">
  127. Tutorials
  128. <span class="md-nav__icon md-icon"></span>
  129. </label>
  130. <nav class="md-nav" data-md-level="1" aria-labelledby="__nav_6_label" aria-expanded="true">
  131. <label class="md-nav__title" for="__nav_6">
  132. <span class="md-nav__icon md-icon"></span>
  133. Tutorials
  134. </label>
  135. <ul class="md-nav__list" data-md-scrollfix>
  136. <li class="md-nav__item">
  137. <a href="../NativeLibraryConfig/" class="md-nav__link">
  138. Customize the native library loading
  139. </a>
  140. </li>
  141. <li class="md-nav__item md-nav__item--active">
  142. <input class="md-nav__toggle md-toggle" type="checkbox" id="__toc">
  143. <label class="md-nav__link md-nav__link--active" for="__toc">
  144. Use executors
  145. <span class="md-nav__icon md-icon"></span>
  146. </label>
  147. <a href="./" class="md-nav__link md-nav__link--active">
  148. Use executors
  149. </a>
  150. <nav class="md-nav md-nav--secondary" aria-label="Table of contents">
  151. <label class="md-nav__title" for="__toc">
  152. <span class="md-nav__icon md-icon"></span>
  153. Table of contents
  154. </label>
  155. <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
  156. <li class="md-nav__item">
  157. <a href="#text-to-text-apis-of-the-executors" class="md-nav__link">
  158. Text-to-Text APIs of the executors
  159. </a>
  160. </li>
  161. <li class="md-nav__item">
  162. <a href="#interactiveexecutor-instructexecutor" class="md-nav__link">
  163. InteractiveExecutor &amp; InstructExecutor
  164. </a>
  165. </li>
  166. <li class="md-nav__item">
  167. <a href="#statelessexecutor" class="md-nav__link">
  168. StatelessExecutor.
  169. </a>
  170. </li>
  171. <li class="md-nav__item">
  172. <a href="#batchedexecutor" class="md-nav__link">
  173. BatchedExecutor
  174. </a>
  175. </li>
  176. <li class="md-nav__item">
  177. <a href="#inference-parameters" class="md-nav__link">
  178. Inference parameters
  179. </a>
  180. </li>
  181. <li class="md-nav__item">
  182. <a href="#save-and-load-executor-state" class="md-nav__link">
  183. Save and load executor state
  184. </a>
  185. </li>
  186. </ul>
  187. </nav>
  188. </li>
  189. <li class="md-nav__item">
  190. <a href="../ChatSession/" class="md-nav__link">
  191. Use ChatSession
  192. </a>
  193. </li>
  194. <li class="md-nav__item">
  195. <a href="../UnderstandLLamaContext/" class="md-nav__link">
  196. Understand LLamaContext
  197. </a>
  198. </li>
  199. <li class="md-nav__item">
  200. <a href="../GetEmbeddings/" class="md-nav__link">
  201. Get embeddings
  202. </a>
  203. </li>
  204. <li class="md-nav__item">
  205. <a href="../Quantization/" class="md-nav__link">
  206. Quantize the model
  207. </a>
  208. </li>
  209. </ul>
  210. </nav>
  211. </li>
  212. <li class="md-nav__item md-nav__item--nested">
  213. <input class="md-nav__toggle md-toggle " type="checkbox" id="__nav_7" >
  214. <label class="md-nav__link" for="__nav_7" id="__nav_7_label" tabindex="0">
  215. Integrations
  216. <span class="md-nav__icon md-icon"></span>
  217. </label>
  218. <nav class="md-nav" data-md-level="1" aria-labelledby="__nav_7_label" aria-expanded="false">
  219. <label class="md-nav__title" for="__nav_7">
  220. <span class="md-nav__icon md-icon"></span>
  221. Integrations
  222. </label>
  223. <ul class="md-nav__list" data-md-scrollfix>
  224. <li class="md-nav__item">
  225. <a href="../../Integrations/semantic-kernel/" class="md-nav__link">
  226. semantic-kernel integration
  227. </a>
  228. </li>
  229. <li class="md-nav__item">
  230. <a href="../../Integrations/kernel-memory/" class="md-nav__link">
  231. kernel-memory integration
  232. </a>
  233. </li>
  234. <li class="md-nav__item">
  235. <a href="../../Integrations/BotSharp.md" class="md-nav__link">
  236. BotSharp integration
  237. </a>
  238. </li>
  239. <li class="md-nav__item">
  240. <a href="../../Integrations/Langchain.md" class="md-nav__link">
  241. Langchain integration
  242. </a>
  243. </li>
  244. </ul>
  245. </nav>
  246. </li>
  247. <li class="md-nav__item md-nav__item--nested">
  248. <input class="md-nav__toggle md-toggle " type="checkbox" id="__nav_8" >
  249. <label class="md-nav__link" for="__nav_8" id="__nav_8_label" tabindex="0">
  250. Examples
  251. <span class="md-nav__icon md-icon"></span>
  252. </label>
  253. <nav class="md-nav" data-md-level="1" aria-labelledby="__nav_8_label" aria-expanded="false">
  254. <label class="md-nav__title" for="__nav_8">
  255. <span class="md-nav__icon md-icon"></span>
  256. Examples
  257. </label>
  258. <ul class="md-nav__list" data-md-scrollfix>
  259. <li class="md-nav__item">
  260. <a href="../../Examples/BatchedExecutorFork/" class="md-nav__link">
  261. Batched executor - multi-output to one input
  262. </a>
  263. </li>
  264. <li class="md-nav__item">
  265. <a href="../../Examples/BatchedExecutorGuidance/" class="md-nav__link">
  266. Batched executor - basic guidance
  267. </a>
  268. </li>
  269. <li class="md-nav__item">
  270. <a href="../../Examples/BatchedExecutorRewind/" class="md-nav__link">
  271. Batched executor - rewinding to an earlier state
  272. </a>
  273. </li>
  274. <li class="md-nav__item">
  275. <a href="../../Examples/ChatChineseGB2312/" class="md-nav__link">
  276. Chinese LLM - with GB2312 encoding
  277. </a>
  278. </li>
  279. <li class="md-nav__item">
  280. <a href="../../Examples/ChatSessionStripRoleName/" class="md-nav__link">
  281. ChatSession - stripping role names
  282. </a>
  283. </li>
  284. <li class="md-nav__item">
  285. <a href="../../Examples/ChatSessionWithHistory/" class="md-nav__link">
  286. ChatSession - with history
  287. </a>
  288. </li>
  289. <li class="md-nav__item">
  290. <a href="../../Examples/ChatSessionWithRestart/" class="md-nav__link">
  291. ChatSession - restarting
  292. </a>
  293. </li>
  294. <li class="md-nav__item">
  295. <a href="../../Examples/ChatSessionWithRoleName/" class="md-nav__link">
  296. ChatSession - Basic
  297. </a>
  298. </li>
  299. <li class="md-nav__item">
  300. <a href="../../Examples/CodingAssistant/" class="md-nav__link">
  301. Coding assistant
  302. </a>
  303. </li>
  304. <li class="md-nav__item">
  305. <a href="../../Examples/GetEmbeddings/" class="md-nav__link">
  306. Get embeddings
  307. </a>
  308. </li>
  309. <li class="md-nav__item">
  310. <a href="../../Examples/GrammarJsonResponse/" class="md-nav__link">
  311. Grammar - json response
  312. </a>
  313. </li>
  314. <li class="md-nav__item">
  315. <a href="../../Examples/InstructModeExecute/" class="md-nav__link">
  316. Instruct executor - basic
  317. </a>
  318. </li>
  319. <li class="md-nav__item">
  320. <a href="../../Examples/InteractiveModeExecute/" class="md-nav__link">
  321. Interactive executor - basic
  322. </a>
  323. </li>
  324. <li class="md-nav__item">
  325. <a href="../../Examples/KernelMemory/" class="md-nav__link">
  326. Kernel memory integration - basic
  327. </a>
  328. </li>
  329. <li class="md-nav__item">
  330. <a href="../../Examples/KernelMemorySaveAndLoad/" class="md-nav__link">
  331. Kernel-memory - save &amp; load
  332. </a>
  333. </li>
  334. <li class="md-nav__item">
  335. <a href="../../Examples/LLavaInteractiveModeExecute/" class="md-nav__link">
  336. LLaVA - basic
  337. </a>
  338. </li>
  339. <li class="md-nav__item">
  340. <a href="../../Examples/LoadAndSaveSession/" class="md-nav__link">
  341. ChatSession - load &amp; save
  342. </a>
  343. </li>
  344. <li class="md-nav__item">
  345. <a href="../../Examples/LoadAndSaveState/" class="md-nav__link">
  346. Executor - save/load state
  347. </a>
  348. </li>
  349. <li class="md-nav__item">
  350. <a href="../../Examples/QuantizeModel/" class="md-nav__link">
  351. Quantization
  352. </a>
  353. </li>
  354. <li class="md-nav__item">
  355. <a href="../../Examples/SemanticKernelChat/" class="md-nav__link">
  356. Semantic-kernel - chat
  357. </a>
  358. </li>
  359. <li class="md-nav__item">
  360. <a href="../../Examples/SemanticKernelMemory/" class="md-nav__link">
  361. Semantic-kernel - with kernel-memory
  362. </a>
  363. </li>
  364. <li class="md-nav__item">
  365. <a href="../../Examples/SemanticKernelPrompt/" class="md-nav__link">
  366. Semantic-kernel - basic
  367. </a>
  368. </li>
  369. <li class="md-nav__item">
  370. <a href="../../Examples/StatelessModeExecute/" class="md-nav__link">
  371. Stateless executor
  372. </a>
  373. </li>
  374. <li class="md-nav__item">
  375. <a href="../../Examples/TalkToYourself/" class="md-nav__link">
  376. Talk to yourself
  377. </a>
  378. </li>
  379. </ul>
  380. </nav>
  381. </li>
  382. <li class="md-nav__item md-nav__item--nested">
  383. <input class="md-nav__toggle md-toggle " type="checkbox" id="__nav_9" >
  384. <label class="md-nav__link" for="__nav_9" id="__nav_9_label" tabindex="0">
  385. API Reference
  386. <span class="md-nav__icon md-icon"></span>
  387. </label>
  388. <nav class="md-nav" data-md-level="1" aria-labelledby="__nav_9_label" aria-expanded="false">
  389. <label class="md-nav__title" for="__nav_9">
  390. <span class="md-nav__icon md-icon"></span>
  391. API Reference
  392. </label>
  393. <ul class="md-nav__list" data-md-scrollfix>
  394. <li class="md-nav__item">
  395. <a href="../../xmldocs/" class="md-nav__link">
  396. index
  397. </a>
  398. </li>
  399. <li class="md-nav__item">
  400. <a href="../../xmldocs/llama.abstractions.adaptercollection/" class="md-nav__link">
  401. llama.abstractions.adaptercollection
  402. </a>
  403. </li>
  404. <li class="md-nav__item">
  405. <a href="../../xmldocs/llama.abstractions.icontextparams/" class="md-nav__link">
  406. llama.abstractions.icontextparams
  407. </a>
  408. </li>
  409. <li class="md-nav__item">
  410. <a href="../../xmldocs/llama.abstractions.ihistorytransform/" class="md-nav__link">
  411. llama.abstractions.ihistorytransform
  412. </a>
  413. </li>
  414. <li class="md-nav__item">
  415. <a href="../../xmldocs/llama.abstractions.iinferenceparams/" class="md-nav__link">
  416. llama.abstractions.iinferenceparams
  417. </a>
  418. </li>
  419. <li class="md-nav__item">
  420. <a href="../../xmldocs/llama.abstractions.illamaexecutor/" class="md-nav__link">
  421. llama.abstractions.illamaexecutor
  422. </a>
  423. </li>
  424. <li class="md-nav__item">
  425. <a href="../../xmldocs/llama.abstractions.illamaparams/" class="md-nav__link">
  426. llama.abstractions.illamaparams
  427. </a>
  428. </li>
  429. <li class="md-nav__item">
  430. <a href="../../xmldocs/llama.abstractions.imodelparams/" class="md-nav__link">
  431. llama.abstractions.imodelparams
  432. </a>
  433. </li>
  434. <li class="md-nav__item">
  435. <a href="../../xmldocs/llama.abstractions.itextstreamtransform/" class="md-nav__link">
  436. llama.abstractions.itextstreamtransform
  437. </a>
  438. </li>
  439. <li class="md-nav__item">
  440. <a href="../../xmldocs/llama.abstractions.itexttransform/" class="md-nav__link">
  441. llama.abstractions.itexttransform
  442. </a>
  443. </li>
  444. <li class="md-nav__item">
  445. <a href="../../xmldocs/llama.abstractions.loraadapter/" class="md-nav__link">
  446. llama.abstractions.loraadapter
  447. </a>
  448. </li>
  449. <li class="md-nav__item">
  450. <a href="../../xmldocs/llama.abstractions.metadataoverride/" class="md-nav__link">
  451. llama.abstractions.metadataoverride
  452. </a>
  453. </li>
  454. <li class="md-nav__item">
  455. <a href="../../xmldocs/llama.abstractions.metadataoverrideconverter/" class="md-nav__link">
  456. llama.abstractions.metadataoverrideconverter
  457. </a>
  458. </li>
  459. <li class="md-nav__item">
  460. <a href="../../xmldocs/llama.abstractions.tensorsplitscollection/" class="md-nav__link">
  461. llama.abstractions.tensorsplitscollection
  462. </a>
  463. </li>
  464. <li class="md-nav__item">
  465. <a href="../../xmldocs/llama.abstractions.tensorsplitscollectionconverter/" class="md-nav__link">
  466. llama.abstractions.tensorsplitscollectionconverter
  467. </a>
  468. </li>
  469. <li class="md-nav__item">
  470. <a href="../../xmldocs/llama.antipromptprocessor/" class="md-nav__link">
  471. llama.antipromptprocessor
  472. </a>
  473. </li>
  474. <li class="md-nav__item">
  475. <a href="../../xmldocs/llama.batched.alreadypromptedconversationexception/" class="md-nav__link">
  476. llama.batched.alreadypromptedconversationexception
  477. </a>
  478. </li>
  479. <li class="md-nav__item">
  480. <a href="../../xmldocs/llama.batched.batchedexecutor/" class="md-nav__link">
  481. llama.batched.batchedexecutor
  482. </a>
  483. </li>
  484. <li class="md-nav__item">
  485. <a href="../../xmldocs/llama.batched.cannotforkwhilerequiresinferenceexception/" class="md-nav__link">
  486. llama.batched.cannotforkwhilerequiresinferenceexception
  487. </a>
  488. </li>
  489. <li class="md-nav__item">
  490. <a href="../../xmldocs/llama.batched.cannotmodifywhilerequiresinferenceexception/" class="md-nav__link">
  491. llama.batched.cannotmodifywhilerequiresinferenceexception
  492. </a>
  493. </li>
  494. <li class="md-nav__item">
  495. <a href="../../xmldocs/llama.batched.cannotsamplerequiresinferenceexception/" class="md-nav__link">
  496. llama.batched.cannotsamplerequiresinferenceexception
  497. </a>
  498. </li>
  499. <li class="md-nav__item">
  500. <a href="../../xmldocs/llama.batched.cannotsamplerequirespromptexception/" class="md-nav__link">
  501. llama.batched.cannotsamplerequirespromptexception
  502. </a>
  503. </li>
  504. <li class="md-nav__item">
  505. <a href="../../xmldocs/llama.batched.conversation/" class="md-nav__link">
  506. llama.batched.conversation
  507. </a>
  508. </li>
  509. <li class="md-nav__item">
  510. <a href="../../xmldocs/llama.batched.conversationextensions/" class="md-nav__link">
  511. llama.batched.conversationextensions
  512. </a>
  513. </li>
  514. <li class="md-nav__item">
  515. <a href="../../xmldocs/llama.batched.experimentalbatchedexecutorexception/" class="md-nav__link">
  516. llama.batched.experimentalbatchedexecutorexception
  517. </a>
  518. </li>
  519. <li class="md-nav__item">
  520. <a href="../../xmldocs/llama.chatsession-1/" class="md-nav__link">
  521. llama.chatsession-1
  522. </a>
  523. </li>
  524. <li class="md-nav__item">
  525. <a href="../../xmldocs/llama.chatsession/" class="md-nav__link">
  526. llama.chatsession
  527. </a>
  528. </li>
  529. <li class="md-nav__item">
  530. <a href="../../xmldocs/llama.common.authorrole/" class="md-nav__link">
  531. llama.common.authorrole
  532. </a>
  533. </li>
  534. <li class="md-nav__item">
  535. <a href="../../xmldocs/llama.common.chathistory/" class="md-nav__link">
  536. llama.common.chathistory
  537. </a>
  538. </li>
  539. <li class="md-nav__item">
  540. <a href="../../xmldocs/llama.common.fixedsizequeue-1/" class="md-nav__link">
  541. llama.common.fixedsizequeue-1
  542. </a>
  543. </li>
  544. <li class="md-nav__item">
  545. <a href="../../xmldocs/llama.common.inferenceparams/" class="md-nav__link">
  546. llama.common.inferenceparams
  547. </a>
  548. </li>
  549. <li class="md-nav__item">
  550. <a href="../../xmldocs/llama.common.mirostattype/" class="md-nav__link">
  551. llama.common.mirostattype
  552. </a>
  553. </li>
  554. <li class="md-nav__item">
  555. <a href="../../xmldocs/llama.common.modelparams/" class="md-nav__link">
  556. llama.common.modelparams
  557. </a>
  558. </li>
  559. <li class="md-nav__item">
  560. <a href="../../xmldocs/llama.exceptions.grammarexpectedname/" class="md-nav__link">
  561. llama.exceptions.grammarexpectedname
  562. </a>
  563. </li>
  564. <li class="md-nav__item">
  565. <a href="../../xmldocs/llama.exceptions.grammarexpectednext/" class="md-nav__link">
  566. llama.exceptions.grammarexpectednext
  567. </a>
  568. </li>
  569. <li class="md-nav__item">
  570. <a href="../../xmldocs/llama.exceptions.grammarexpectedprevious/" class="md-nav__link">
  571. llama.exceptions.grammarexpectedprevious
  572. </a>
  573. </li>
  574. <li class="md-nav__item">
  575. <a href="../../xmldocs/llama.exceptions.grammarformatexception/" class="md-nav__link">
  576. llama.exceptions.grammarformatexception
  577. </a>
  578. </li>
  579. <li class="md-nav__item">
  580. <a href="../../xmldocs/llama.exceptions.grammarunexpectedcharaltelement/" class="md-nav__link">
  581. llama.exceptions.grammarunexpectedcharaltelement
  582. </a>
  583. </li>
  584. <li class="md-nav__item">
  585. <a href="../../xmldocs/llama.exceptions.grammarunexpectedcharrngelement/" class="md-nav__link">
  586. llama.exceptions.grammarunexpectedcharrngelement
  587. </a>
  588. </li>
  589. <li class="md-nav__item">
  590. <a href="../../xmldocs/llama.exceptions.grammarunexpectedendelement/" class="md-nav__link">
  591. llama.exceptions.grammarunexpectedendelement
  592. </a>
  593. </li>
  594. <li class="md-nav__item">
  595. <a href="../../xmldocs/llama.exceptions.grammarunexpectedendofinput/" class="md-nav__link">
  596. llama.exceptions.grammarunexpectedendofinput
  597. </a>
  598. </li>
  599. <li class="md-nav__item">
  600. <a href="../../xmldocs/llama.exceptions.grammarunexpectedhexcharscount/" class="md-nav__link">
  601. llama.exceptions.grammarunexpectedhexcharscount
  602. </a>
  603. </li>
  604. <li class="md-nav__item">
  605. <a href="../../xmldocs/llama.exceptions.grammarunknownescapecharacter/" class="md-nav__link">
  606. llama.exceptions.grammarunknownescapecharacter
  607. </a>
  608. </li>
  609. <li class="md-nav__item">
  610. <a href="../../xmldocs/llama.exceptions.llamadecodeerror/" class="md-nav__link">
  611. llama.exceptions.llamadecodeerror
  612. </a>
  613. </li>
  614. <li class="md-nav__item">
  615. <a href="../../xmldocs/llama.exceptions.loadweightsfailedexception/" class="md-nav__link">
  616. llama.exceptions.loadweightsfailedexception
  617. </a>
  618. </li>
  619. <li class="md-nav__item">
  620. <a href="../../xmldocs/llama.exceptions.runtimeerror/" class="md-nav__link">
  621. llama.exceptions.runtimeerror
  622. </a>
  623. </li>
  624. <li class="md-nav__item">
  625. <a href="../../xmldocs/llama.extensions.icontextparamsextensions/" class="md-nav__link">
  626. llama.extensions.icontextparamsextensions
  627. </a>
  628. </li>
  629. <li class="md-nav__item">
  630. <a href="../../xmldocs/llama.extensions.imodelparamsextensions/" class="md-nav__link">
  631. llama.extensions.imodelparamsextensions
  632. </a>
  633. </li>
  634. <li class="md-nav__item">
  635. <a href="../../xmldocs/llama.grammars.grammar/" class="md-nav__link">
  636. llama.grammars.grammar
  637. </a>
  638. </li>
  639. <li class="md-nav__item">
  640. <a href="../../xmldocs/llama.grammars.grammarrule/" class="md-nav__link">
  641. llama.grammars.grammarrule
  642. </a>
  643. </li>
  644. <li class="md-nav__item">
  645. <a href="../../xmldocs/llama.ichatmodel/" class="md-nav__link">
  646. llama.ichatmodel
  647. </a>
  648. </li>
  649. <li class="md-nav__item">
  650. <a href="../../xmldocs/llama.llamacache/" class="md-nav__link">
  651. llama.llamacache
  652. </a>
  653. </li>
  654. <li class="md-nav__item">
  655. <a href="../../xmldocs/llama.llamaembedder/" class="md-nav__link">
  656. llama.llamaembedder
  657. </a>
  658. </li>
  659. <li class="md-nav__item">
  660. <a href="../../xmldocs/llama.llamamodel/" class="md-nav__link">
  661. llama.llamamodel
  662. </a>
  663. </li>
  664. <li class="md-nav__item">
  665. <a href="../../xmldocs/llama.llamamodelv1/" class="md-nav__link">
  666. llama.llamamodelv1
  667. </a>
  668. </li>
  669. <li class="md-nav__item">
  670. <a href="../../xmldocs/llama.llamaparams/" class="md-nav__link">
  671. llama.llamaparams
  672. </a>
  673. </li>
  674. <li class="md-nav__item">
  675. <a href="../../xmldocs/llama.llamaquantizer/" class="md-nav__link">
  676. llama.llamaquantizer
  677. </a>
  678. </li>
  679. <li class="md-nav__item">
  680. <a href="../../xmldocs/llama.llamastate/" class="md-nav__link">
  681. llama.llamastate
  682. </a>
  683. </li>
  684. <li class="md-nav__item">
  685. <a href="../../xmldocs/llama.llamatransforms/" class="md-nav__link">
  686. llama.llamatransforms
  687. </a>
  688. </li>
  689. <li class="md-nav__item">
  690. <a href="../../xmldocs/llama.llavaweights/" class="md-nav__link">
  691. llama.llavaweights
  692. </a>
  693. </li>
  694. <li class="md-nav__item">
  695. <a href="../../xmldocs/llama.native.decoderesult/" class="md-nav__link">
  696. llama.native.decoderesult
  697. </a>
  698. </li>
  699. <li class="md-nav__item">
  700. <a href="../../xmldocs/llama.native.ggmltype/" class="md-nav__link">
  701. llama.native.ggmltype
  702. </a>
  703. </li>
  704. <li class="md-nav__item">
  705. <a href="../../xmldocs/llama.native.gpusplitmode/" class="md-nav__link">
  706. llama.native.gpusplitmode
  707. </a>
  708. </li>
  709. <li class="md-nav__item">
  710. <a href="../../xmldocs/llama.native.llamabatch/" class="md-nav__link">
  711. llama.native.llamabatch
  712. </a>
  713. </li>
  714. <li class="md-nav__item">
  715. <a href="../../xmldocs/llama.native.llamabeamsstate/" class="md-nav__link">
  716. llama.native.llamabeamsstate
  717. </a>
  718. </li>
  719. <li class="md-nav__item">
  720. <a href="../../xmldocs/llama.native.llamabeamview/" class="md-nav__link">
  721. llama.native.llamabeamview
  722. </a>
  723. </li>
  724. <li class="md-nav__item">
  725. <a href="../../xmldocs/llama.native.llamachatmessage/" class="md-nav__link">
  726. llama.native.llamachatmessage
  727. </a>
  728. </li>
  729. <li class="md-nav__item">
  730. <a href="../../xmldocs/llama.native.llamacontextparams/" class="md-nav__link">
  731. llama.native.llamacontextparams
  732. </a>
  733. </li>
  734. <li class="md-nav__item">
  735. <a href="../../xmldocs/llama.native.llamaftype/" class="md-nav__link">
  736. llama.native.llamaftype
  737. </a>
  738. </li>
  739. <li class="md-nav__item">
  740. <a href="../../xmldocs/llama.native.llamagrammarelement/" class="md-nav__link">
  741. llama.native.llamagrammarelement
  742. </a>
  743. </li>
  744. <li class="md-nav__item">
  745. <a href="../../xmldocs/llama.native.llamagrammarelementtype/" class="md-nav__link">
  746. llama.native.llamagrammarelementtype
  747. </a>
  748. </li>
  749. <li class="md-nav__item">
  750. <a href="../../xmldocs/llama.native.llamakvcacheview/" class="md-nav__link">
  751. llama.native.llamakvcacheview
  752. </a>
  753. </li>
  754. <li class="md-nav__item">
  755. <a href="../../xmldocs/llama.native.llamakvcacheviewcell/" class="md-nav__link">
  756. llama.native.llamakvcacheviewcell
  757. </a>
  758. </li>
  759. <li class="md-nav__item">
  760. <a href="../../xmldocs/llama.native.llamakvcacheviewsafehandle/" class="md-nav__link">
  761. llama.native.llamakvcacheviewsafehandle
  762. </a>
  763. </li>
  764. <li class="md-nav__item">
  765. <a href="../../xmldocs/llama.native.llamaloglevel/" class="md-nav__link">
  766. llama.native.llamaloglevel
  767. </a>
  768. </li>
  769. <li class="md-nav__item">
  770. <a href="../../xmldocs/llama.native.llamamodelkvoverridetype/" class="md-nav__link">
  771. llama.native.llamamodelkvoverridetype
  772. </a>
  773. </li>
  774. <li class="md-nav__item">
  775. <a href="../../xmldocs/llama.native.llamamodelmetadataoverride/" class="md-nav__link">
  776. llama.native.llamamodelmetadataoverride
  777. </a>
  778. </li>
  779. <li class="md-nav__item">
  780. <a href="../../xmldocs/llama.native.llamamodelparams/" class="md-nav__link">
  781. llama.native.llamamodelparams
  782. </a>
  783. </li>
  784. <li class="md-nav__item">
  785. <a href="../../xmldocs/llama.native.llamamodelquantizeparams/" class="md-nav__link">
  786. llama.native.llamamodelquantizeparams
  787. </a>
  788. </li>
  789. <li class="md-nav__item">
  790. <a href="../../xmldocs/llama.native.llamanativebatch/" class="md-nav__link">
  791. llama.native.llamanativebatch
  792. </a>
  793. </li>
  794. <li class="md-nav__item">
  795. <a href="../../xmldocs/llama.native.llamapoolingtype/" class="md-nav__link">
  796. llama.native.llamapoolingtype
  797. </a>
  798. </li>
  799. <li class="md-nav__item">
  800. <a href="../../xmldocs/llama.native.llamapos/" class="md-nav__link">
  801. llama.native.llamapos
  802. </a>
  803. </li>
  804. <li class="md-nav__item">
  805. <a href="../../xmldocs/llama.native.llamaropetype/" class="md-nav__link">
  806. llama.native.llamaropetype
  807. </a>
  808. </li>
  809. <li class="md-nav__item">
  810. <a href="../../xmldocs/llama.native.llamaseqid/" class="md-nav__link">
  811. llama.native.llamaseqid
  812. </a>
  813. </li>
  814. <li class="md-nav__item">
  815. <a href="../../xmldocs/llama.native.llamatoken/" class="md-nav__link">
  816. llama.native.llamatoken
  817. </a>
  818. </li>
  819. <li class="md-nav__item">
  820. <a href="../../xmldocs/llama.native.llamatokendata/" class="md-nav__link">
  821. llama.native.llamatokendata
  822. </a>
  823. </li>
  824. <li class="md-nav__item">
  825. <a href="../../xmldocs/llama.native.llamatokendataarray/" class="md-nav__link">
  826. llama.native.llamatokendataarray
  827. </a>
  828. </li>
  829. <li class="md-nav__item">
  830. <a href="../../xmldocs/llama.native.llamatokendataarraynative/" class="md-nav__link">
  831. llama.native.llamatokendataarraynative
  832. </a>
  833. </li>
  834. <li class="md-nav__item">
  835. <a href="../../xmldocs/llama.native.llamatokentype/" class="md-nav__link">
  836. llama.native.llamatokentype
  837. </a>
  838. </li>
  839. <li class="md-nav__item">
  840. <a href="../../xmldocs/llama.native.llamavocabtype/" class="md-nav__link">
  841. llama.native.llamavocabtype
  842. </a>
  843. </li>
  844. <li class="md-nav__item">
  845. <a href="../../xmldocs/llama.native.llavaimageembed/" class="md-nav__link">
  846. llama.native.llavaimageembed
  847. </a>
  848. </li>
  849. <li class="md-nav__item">
  850. <a href="../../xmldocs/llama.native.nativeapi/" class="md-nav__link">
  851. llama.native.nativeapi
  852. </a>
  853. </li>
  854. <li class="md-nav__item">
  855. <a href="../../xmldocs/llama.native.nativelibraryconfig/" class="md-nav__link">
  856. llama.native.nativelibraryconfig
  857. </a>
  858. </li>
  859. <li class="md-nav__item">
  860. <a href="../../xmldocs/llama.native.ropescalingtype/" class="md-nav__link">
  861. llama.native.ropescalingtype
  862. </a>
  863. </li>
  864. <li class="md-nav__item">
  865. <a href="../../xmldocs/llama.native.safellamacontexthandle/" class="md-nav__link">
  866. llama.native.safellamacontexthandle
  867. </a>
  868. </li>
  869. <li class="md-nav__item">
  870. <a href="../../xmldocs/llama.native.safellamagrammarhandle/" class="md-nav__link">
  871. llama.native.safellamagrammarhandle
  872. </a>
  873. </li>
  874. <li class="md-nav__item">
  875. <a href="../../xmldocs/llama.native.safellamahandlebase/" class="md-nav__link">
  876. llama.native.safellamahandlebase
  877. </a>
  878. </li>
  879. <li class="md-nav__item">
  880. <a href="../../xmldocs/llama.native.safellamamodelhandle/" class="md-nav__link">
  881. llama.native.safellamamodelhandle
  882. </a>
  883. </li>
  884. <li class="md-nav__item">
  885. <a href="../../xmldocs/llama.native.safellavaimageembedhandle/" class="md-nav__link">
  886. llama.native.safellavaimageembedhandle
  887. </a>
  888. </li>
  889. <li class="md-nav__item">
  890. <a href="../../xmldocs/llama.native.safellavamodelhandle/" class="md-nav__link">
  891. llama.native.safellavamodelhandle
  892. </a>
  893. </li>
  894. <li class="md-nav__item">
  895. <a href="../../xmldocs/llama.quantizer/" class="md-nav__link">
  896. llama.quantizer
  897. </a>
  898. </li>
  899. <li class="md-nav__item">
  900. <a href="../../xmldocs/llama.sampling.basesamplingpipeline/" class="md-nav__link">
  901. llama.sampling.basesamplingpipeline
  902. </a>
  903. </li>
  904. <li class="md-nav__item">
  905. <a href="../../xmldocs/llama.sampling.defaultsamplingpipeline/" class="md-nav__link">
  906. llama.sampling.defaultsamplingpipeline
  907. </a>
  908. </li>
  909. <li class="md-nav__item">
  910. <a href="../../xmldocs/llama.sampling.greedysamplingpipeline/" class="md-nav__link">
  911. llama.sampling.greedysamplingpipeline
  912. </a>
  913. </li>
  914. <li class="md-nav__item">
  915. <a href="../../xmldocs/llama.sampling.isamplingpipeline/" class="md-nav__link">
  916. llama.sampling.isamplingpipeline
  917. </a>
  918. </li>
  919. <li class="md-nav__item">
  920. <a href="../../xmldocs/llama.sampling.isamplingpipelineextensions/" class="md-nav__link">
  921. llama.sampling.isamplingpipelineextensions
  922. </a>
  923. </li>
  924. <li class="md-nav__item">
  925. <a href="../../xmldocs/llama.sampling.mirostate2samplingpipeline/" class="md-nav__link">
  926. llama.sampling.mirostate2samplingpipeline
  927. </a>
  928. </li>
  929. <li class="md-nav__item">
  930. <a href="../../xmldocs/llama.sampling.mirostatesamplingpipeline/" class="md-nav__link">
  931. llama.sampling.mirostatesamplingpipeline
  932. </a>
  933. </li>
  934. <li class="md-nav__item">
  935. <a href="../../xmldocs/llama.sessionstate/" class="md-nav__link">
  936. llama.sessionstate
  937. </a>
  938. </li>
  939. <li class="md-nav__item">
  940. <a href="../../xmldocs/llama.streamingtokendecoder/" class="md-nav__link">
  941. llama.streamingtokendecoder
  942. </a>
  943. </li>
  944. <li class="md-nav__item">
  945. <a href="../../xmldocs/llama.types.chatcompletion/" class="md-nav__link">
  946. llama.types.chatcompletion
  947. </a>
  948. </li>
  949. <li class="md-nav__item">
  950. <a href="../../xmldocs/llama.types.chatcompletionchoice/" class="md-nav__link">
  951. llama.types.chatcompletionchoice
  952. </a>
  953. </li>
  954. <li class="md-nav__item">
  955. <a href="../../xmldocs/llama.types.chatcompletionchunk/" class="md-nav__link">
  956. llama.types.chatcompletionchunk
  957. </a>
  958. </li>
  959. <li class="md-nav__item">
  960. <a href="../../xmldocs/llama.types.chatcompletionchunkchoice/" class="md-nav__link">
  961. llama.types.chatcompletionchunkchoice
  962. </a>
  963. </li>
  964. <li class="md-nav__item">
  965. <a href="../../xmldocs/llama.types.chatcompletionchunkdelta/" class="md-nav__link">
  966. llama.types.chatcompletionchunkdelta
  967. </a>
  968. </li>
  969. <li class="md-nav__item">
  970. <a href="../../xmldocs/llama.types.chatcompletionmessage/" class="md-nav__link">
  971. llama.types.chatcompletionmessage
  972. </a>
  973. </li>
  974. <li class="md-nav__item">
  975. <a href="../../xmldocs/llama.types.chatmessagerecord/" class="md-nav__link">
  976. llama.types.chatmessagerecord
  977. </a>
  978. </li>
  979. <li class="md-nav__item">
  980. <a href="../../xmldocs/llama.types.chatrole/" class="md-nav__link">
  981. llama.types.chatrole
  982. </a>
  983. </li>
  984. <li class="md-nav__item">
  985. <a href="../../xmldocs/llama.types.completion/" class="md-nav__link">
  986. llama.types.completion
  987. </a>
  988. </li>
  989. <li class="md-nav__item">
  990. <a href="../../xmldocs/llama.types.completionchoice/" class="md-nav__link">
  991. llama.types.completionchoice
  992. </a>
  993. </li>
  994. <li class="md-nav__item">
  995. <a href="../../xmldocs/llama.types.completionchunk/" class="md-nav__link">
  996. llama.types.completionchunk
  997. </a>
  998. </li>
  999. <li class="md-nav__item">
  1000. <a href="../../xmldocs/llama.types.completionlogprobs/" class="md-nav__link">
  1001. llama.types.completionlogprobs
  1002. </a>
  1003. </li>
  1004. <li class="md-nav__item">
  1005. <a href="../../xmldocs/llama.types.completionusage/" class="md-nav__link">
  1006. llama.types.completionusage
  1007. </a>
  1008. </li>
  1009. <li class="md-nav__item">
  1010. <a href="../../xmldocs/llama.types.embedding/" class="md-nav__link">
  1011. llama.types.embedding
  1012. </a>
  1013. </li>
  1014. <li class="md-nav__item">
  1015. <a href="../../xmldocs/llama.types.embeddingdata/" class="md-nav__link">
  1016. llama.types.embeddingdata
  1017. </a>
  1018. </li>
  1019. <li class="md-nav__item">
  1020. <a href="../../xmldocs/llama.types.embeddingusage/" class="md-nav__link">
  1021. llama.types.embeddingusage
  1022. </a>
  1023. </li>
  1024. <li class="md-nav__item">
  1025. <a href="../../xmldocs/logger/" class="md-nav__link">
  1026. logger
  1027. </a>
  1028. </li>
  1029. </ul>
  1030. </nav>
  1031. </li>
  1032. </ul>
  1033. </nav>
  1034. </div>
  1035. </div>
  1036. </div>
  1037. <div class="md-sidebar md-sidebar--secondary" data-md-component="sidebar" data-md-type="toc" >
  1038. <div class="md-sidebar__scrollwrap">
  1039. <div class="md-sidebar__inner">
  1040. <nav class="md-nav md-nav--secondary" aria-label="Table of contents">
  1041. <label class="md-nav__title" for="__toc">
  1042. <span class="md-nav__icon md-icon"></span>
  1043. Table of contents
  1044. </label>
  1045. <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
  1046. <li class="md-nav__item">
  1047. <a href="#text-to-text-apis-of-the-executors" class="md-nav__link">
  1048. Text-to-Text APIs of the executors
  1049. </a>
  1050. </li>
  1051. <li class="md-nav__item">
  1052. <a href="#interactiveexecutor-instructexecutor" class="md-nav__link">
  1053. InteractiveExecutor &amp; InstructExecutor
  1054. </a>
  1055. </li>
  1056. <li class="md-nav__item">
  1057. <a href="#statelessexecutor" class="md-nav__link">
  1058. StatelessExecutor
  1059. </a>
  1060. </li>
  1061. <li class="md-nav__item">
  1062. <a href="#batchedexecutor" class="md-nav__link">
  1063. BatchedExecutor
  1064. </a>
  1065. </li>
  1066. <li class="md-nav__item">
  1067. <a href="#inference-parameters" class="md-nav__link">
  1068. Inference parameters
  1069. </a>
  1070. </li>
  1071. <li class="md-nav__item">
  1072. <a href="#save-and-load-executor-state" class="md-nav__link">
  1073. Save and load executor state
  1074. </a>
  1075. </li>
  1076. </ul>
  1077. </nav>
  1078. </div>
  1079. </div>
  1080. </div>
  1081. <div class="md-content" data-md-component="content">
  1082. <article class="md-content__inner md-typeset">
  1083. <h1 id="llamasharp-executors">LLamaSharp executors</h1>
  1084. <p>A LLamaSharp executor defines how the model behaves when it is called. There are currently four kinds of executors: <code>InteractiveExecutor</code>, <code>InstructExecutor</code>, <code>StatelessExecutor</code> and <code>BatchedExecutor</code>.</p>
  1085. <p>In short, <code>InteractiveExecutor</code> suits an ongoing conversation in which the LLM answers your questions one after another. <code>InstructExecutor</code> lets the LLM carry out your instructions, such as "continue writing". <code>StatelessExecutor</code> is best for one-off jobs, because previous inferences have no impact on the current one. <code>BatchedExecutor</code> accepts multiple inputs and generates outputs for several sessions at the same time, which can significantly improve throughput.</p>
  1086. <h2 id="text-to-text-apis-of-the-executors">Text-to-Text APIs of the executors</h2>
  1087. <p>The executors implement the <code>ILLamaExecutor</code> interface, which provides the <code>InferAsync</code> API for text-to-text tasks.</p>
  1088. <pre><code class="language-cs">public interface ILLamaExecutor
  1089. {
  1090. /// &lt;summary&gt;
  1091. /// The loaded context for this executor.
  1092. /// &lt;/summary&gt;
  1093. public LLamaContext Context { get; }
  1094. // LLava Section
  1095. //
  1096. /// &lt;summary&gt;
  1097. /// Identify if it's a multi-modal model and there is an image to process.
  1098. /// &lt;/summary&gt;
  1099. public bool IsMultiModal { get; }
  1100. /// &lt;summary&gt;
  1101. /// Multi-Modal Projections / Clip Model weights
  1102. /// &lt;/summary&gt;
  1103. public LLavaWeights? ClipModel { get; }
  1104. /// &lt;summary&gt;
  1105. /// List of images: Image filename and path (jpeg images).
  1106. /// &lt;/summary&gt;
  1107. public List&lt;string&gt; ImagePaths { get; set; }
  1108. /// &lt;summary&gt;
  1109. /// Asynchronously infers a response from the model.
  1110. /// &lt;/summary&gt;
  1111. /// &lt;param name=&quot;text&quot;&gt;Your prompt&lt;/param&gt;
  1112. /// &lt;param name=&quot;inferenceParams&quot;&gt;Any additional parameters&lt;/param&gt;
  1113. /// &lt;param name=&quot;token&quot;&gt;A cancellation token.&lt;/param&gt;
  1114. /// &lt;returns&gt;&lt;/returns&gt;
  1115. IAsyncEnumerable&lt;string&gt; InferAsync(string text, IInferenceParams? inferenceParams = null, CancellationToken token = default);
  1116. }
  1117. </code></pre>
  1118. <p>The output of <code>InferAsync</code> is a <strong>yield enumerable</strong> (<code>IAsyncEnumerable&lt;string&gt;</code>). When receiving the output, you can therefore use <code>await foreach</code> to act on each piece of text in order as it is generated, instead of waiting for the whole process to complete.</p>
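<p>The streaming behaviour described above can be sketched without loading a model. In this minimal example, <code>FakeInference</code> is a hypothetical stand-in for <code>executor.InferAsync(...)</code>, and <code>StreamDemo</code> and <code>CollectAsync</code> are illustrative names that are not part of LLamaSharp. Only standard .NET types are used.</p>
<pre><code class="language-cs">using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

public static class StreamDemo
{
    // Hypothetical stand-in for executor.InferAsync(...): yields text pieces one at a time.
    public static async IAsyncEnumerable&lt;string&gt; FakeInference()
    {
        foreach (var piece in new[] { &quot;Hello&quot;, &quot;, &quot;, &quot;world!&quot; })
        {
            await Task.Delay(10); // simulate generation latency
            yield return piece;
        }
    }

    // Prints each piece as soon as it arrives and returns the concatenated text.
    public static async Task&lt;string&gt; CollectAsync(IAsyncEnumerable&lt;string&gt; stream)
    {
        var sb = new StringBuilder();
        await foreach (var piece in stream)
        {
            Console.Write(piece);
            sb.Append(piece);
        }
        Console.WriteLine();
        return sb.ToString();
    }

    public static async Task Main()
    {
        var text = await CollectAsync(FakeInference());
        Console.WriteLine($&quot;Collected {text.Length} characters.&quot;);
    }
}
</code></pre>
<p>With a real executor, the same helper could presumably be called as <code>CollectAsync(executor.InferAsync(prompt))</code>, since <code>InferAsync</code> returns <code>IAsyncEnumerable&lt;string&gt;</code> according to the interface above.</p>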
  1119. <h2 id="interactiveexecutor-instructexecutor">InteractiveExecutor &amp; InstructExecutor</h2>
  1120. <p>Both of them generate a response by "completing the prompt". For example, if you input <code>Long long ago, there was a fox who wanted to make friends with humans. One day</code>, the LLM will continue writing the story.</p>
  1121. <p>In interactive mode, you play the role of the user and the LLM plays the role of the assistant, helping you with your questions or requests.</p>
  1122. <p>In instruct mode, you give the LLM instructions and it follows them.</p>
  1123. <p>Though the two modes sound similar, they can behave quite differently depending on your prompt. For example, <code>chat-with-bob</code> performs well in interactive mode, while <code>alpaca</code> does well in instruct mode.</p>
  1124. <pre><code>// chat-with-bob
  1125. Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
  1126. User: Hello, Bob.
  1127. Bob: Hello. How may I help you today?
  1128. User: Please tell me the largest city in Europe.
  1129. Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
  1130. User:
  1131. </code></pre>
  1132. <pre><code>// alpaca
  1133. Below is an instruction that describes a task. Write a response that appropriately completes the request.
  1134. </code></pre>
  1135. <p>Therefore, please adjust the prompt accordingly when switching from one mode to the other.</p>
  1136. <h2 id="statelessexecutor">StatelessExecutor</h2>
  1137. <p>Despite their differences, interactive mode and instruct mode are both stateful: your previous questions or instructions affect the current response from the LLM. The stateless executor, by contrast, has no such "memory". No matter how many times you talk to it, it only considers what you say in the current call. This is very useful when you want a clean context that is unaffected by previous inputs.</p>
  1138. <p>Since the stateless executor has no memory of earlier conversations, you need to include the whole prompt along with your question to get a better answer.</p>
  1139. <p>For example, if you feed <code>Q: Who is Trump? A:</code> to the stateless executor, it may give the following answer with the anti-prompt <code>Q:</code>.</p>
  1140. <pre><code>Donald J. Trump, born June 14, 1946, is an American businessman, television personality, politician and the 45th President of the United States (2017-2021). # Anexo:Torneo de Hamburgo 2022 (individual masculino)
  1141. ## Presentación previa
  1142. * Defensor del título: Daniil Medvédev
  1143. </code></pre>
  1144. <p>Things seem to go well at first. However, after answering the question itself, the LLM starts talking about unrelated things until the answer reaches the token count limit. The reason for this strange behavior is that the anti-prompt is never matched: given this input, the LLM cannot tell that it should append the string <code>Q:</code> at the end of the response.</p>
  1145. <p>As an improvement, let's take the following text as the input:</p>
  1146. <pre><code>Q: What is the capital of the USA? A: Washington. Q: What is the sum of 1 and 2? A: 3. Q: Who is Trump? A:
  1147. </code></pre>
  1148. <p>Then, I got the following answer with the anti-prompt <code>Q:</code>.</p>
  1149. <pre><code>45th president of the United States.
  1150. </code></pre>
  1151. <p>This time, because the prompt repeats the pattern <code>Q: xxx? A: xxx.</code>, the LLM outputs the anti-prompt we want after its answer, which tells the executor where to stop the generation.</p>
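<p>Building such a few-shot prompt by hand is error-prone, so here is a minimal sketch of a helper that assembles it. <code>FewShotPrompt</code>, <code>Build</code> and <code>TrimAtAntiPrompt</code> are hypothetical names, not part of LLamaSharp. The trimming step assumes the anti-prompt may appear at the end of the returned text.</p>
<pre><code class="language-cs">using System;
using System.Collections.Generic;
using System.Text;

public static class FewShotPrompt
{
    // Builds &quot;Q: ... A: ... Q: &lt;question&gt; A:&quot; so that the model sees the pattern
    // and is more likely to emit &quot;Q:&quot; (the anti-prompt) after its answer.
    public static string Build(IEnumerable&lt;(string Question, string Answer)&gt; examples, string question)
    {
        var sb = new StringBuilder();
        foreach (var (q, a) in examples)
            sb.Append($&quot;Q: {q} A: {a} &quot;);
        sb.Append($&quot;Q: {question} A:&quot;);
        return sb.ToString();
    }

    // Cuts the generated text at the first occurrence of the anti-prompt, if there is one.
    public static string TrimAtAntiPrompt(string output, string antiPrompt)
    {
        var index = output.IndexOf(antiPrompt, StringComparison.Ordinal);
        return (index &gt;= 0 ? output[..index] : output).Trim();
    }

    public static void Main()
    {
        var prompt = Build(new[]
        {
            (&quot;What is the capital of the USA?&quot;, &quot;Washington.&quot;),
            (&quot;What is the sum of 1 and 2?&quot;, &quot;3.&quot;),
        }, &quot;Who is Trump?&quot;);
        Console.WriteLine(prompt);
        Console.WriteLine(TrimAtAntiPrompt(&quot; 45th president of the United States. Q:&quot;, &quot;Q:&quot;));
    }
}
</code></pre>
<p>The string returned by <code>Build</code> is what would be passed to the stateless executor, with <code>Q:</code> set as the anti-prompt in the inference parameters.</p>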
  1152. <h2 id="batchedexecutor">BatchedExecutor</h2>
  1153. <p>Unlike the other executors, <code>BatchedExecutor</code> accepts multiple inputs from different sessions and generates outputs for them at the same time. Here is an example of how to use it.</p>
  1154. <pre><code class="language-cs">using LLama.Batched;
  1155. using LLama.Common;
  1156. using LLama.Native;
  1157. using LLama.Sampling;
  1158. using Spectre.Console;
  1159. namespace LLama.Examples.Examples;
  1160. /// &lt;summary&gt;
  1161. /// This demonstrates using a batch to generate two sequences and then using one
  1162. /// sequence as the negative guidance (&quot;classifier free guidance&quot;) for the other.
  1163. /// &lt;/summary&gt;
  1164. public class BatchedExecutorGuidance
  1165. {
  1166. private const int n_len = 32;
  1167. public static async Task Run()
  1168. {
  1169. string modelPath = UserSettings.GetModelPath();
  1170. var parameters = new ModelParams(modelPath);
  1171. using var model = LLamaWeights.LoadFromFile(parameters);
  1172. var positivePrompt = AnsiConsole.Ask(&quot;Positive Prompt (or ENTER for default):&quot;, &quot;My favourite colour is&quot;).Trim();
  1173. var negativePrompt = AnsiConsole.Ask(&quot;Negative Prompt (or ENTER for default):&quot;, &quot;I hate the colour red. My favourite colour is&quot;).Trim();
  1174. var weight = AnsiConsole.Ask(&quot;Guidance Weight (or ENTER for default):&quot;, 2.0f);
  1175. // Create an executor that can evaluate a batch of conversations together
  1176. using var executor = new BatchedExecutor(model, parameters);
  1177. // Print some info
  1178. var name = executor.Model.Metadata.GetValueOrDefault(&quot;general.name&quot;, &quot;unknown model name&quot;);
  1179. Console.WriteLine($&quot;Created executor with model: {name}&quot;);
  1180. // Load the two prompts into two conversations
  1181. using var guided = executor.Create();
  1182. guided.Prompt(positivePrompt);
  1183. using var guidance = executor.Create();
  1184. guidance.Prompt(negativePrompt);
  1185. // Run inference to evaluate prompts
  1186. await AnsiConsole
  1187. .Status()
  1188. .Spinner(Spinner.Known.Line)
  1189. .StartAsync(&quot;Evaluating Prompts...&quot;, _ =&gt; executor.Infer());
  1190. // Fork the &quot;guided&quot; conversation. We'll run this one without guidance for comparison
  1191. using var unguided = guided.Fork();
  1192. // Run inference loop
  1193. var unguidedSampler = new GuidedSampler(null, weight);
  1194. var unguidedDecoder = new StreamingTokenDecoder(executor.Context);
  1195. var guidedSampler = new GuidedSampler(guidance, weight);
  1196. var guidedDecoder = new StreamingTokenDecoder(executor.Context);
  1197. await AnsiConsole
  1198. .Progress()
  1199. .StartAsync(async progress =&gt;
  1200. {
  1201. var reporter = progress.AddTask(&quot;Running Inference&quot;, maxValue: n_len);
  1202. for (var i = 0; i &lt; n_len; i++)
  1203. {
  1204. if (i != 0)
  1205. await executor.Infer();
  1206. // Sample from the &quot;unguided&quot; conversation. This is just a conversation using the same prompt, without any
  1207. // guidance. This serves as a comparison to show the effect of guidance.
  1208. var u = unguidedSampler.Sample(executor.Context.NativeHandle, unguided.Sample(), Array.Empty&lt;LLamaToken&gt;());
  1209. unguidedDecoder.Add(u);
  1210. unguided.Prompt(u);
  1211. // Sample from the &quot;guided&quot; conversation. This sampler will internally use the &quot;guidance&quot; conversation
  1212. // to steer the conversation. See how this is done in GuidedSampler.ProcessLogits (bottom of this file).
  1213. var g = guidedSampler.Sample(executor.Context.NativeHandle, guided.Sample(), Array.Empty&lt;LLamaToken&gt;());
  1214. guidedDecoder.Add(g);
  1215. // Use this token to advance both guided _and_ guidance. Keeping them in sync (except for the initial prompt).
  1216. guided.Prompt(g);
  1217. guidance.Prompt(g);
  1218. // Early exit if we reach the natural end of the guided sentence
  1219. if (g == model.EndOfSentenceToken)
  1220. break;
  1221. // Update progress bar
  1222. reporter.Increment(1);
  1223. }
  1224. });
  1225. AnsiConsole.MarkupLine($&quot;[green]Unguided:[/][white]{unguidedDecoder.Read().ReplaceLineEndings(&quot; &quot;)}[/]&quot;);
  1226. AnsiConsole.MarkupLine($&quot;[green]Guided:[/][white]{guidedDecoder.Read().ReplaceLineEndings(&quot; &quot;)}[/]&quot;);
  1227. }
  1228. private class GuidedSampler(Conversation? guidance, float weight)
  1229. : BaseSamplingPipeline
  1230. {
  1231. public override void Accept(SafeLLamaContextHandle ctx, LLamaToken token)
  1232. {
  1233. }
  1234. public override ISamplingPipeline Clone()
  1235. {
  1236. throw new NotSupportedException();
  1237. }
  1238. protected override void ProcessLogits(SafeLLamaContextHandle ctx, Span&lt;float&gt; logits, ReadOnlySpan&lt;LLamaToken&gt; lastTokens)
  1239. {
  1240. if (guidance == null)
  1241. return;
  1242. // Get the logits generated by the guidance sequences
  1243. var guidanceLogits = guidance.Sample();
  1244. // Use those logits to guide this sequence
  1245. NativeApi.llama_sample_apply_guidance(ctx, logits, guidanceLogits, weight);
  1246. }
  1247. protected override LLamaToken ProcessTokenDataArray(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, ReadOnlySpan&lt;LLamaToken&gt; lastTokens)
  1248. {
  1249. candidates.Temperature(ctx, 0.8f);
  1250. candidates.TopK(ctx, 25);
  1251. return candidates.SampleToken(ctx);
  1252. }
  1253. }
  1254. }
  1255. </code></pre>
  1256. <h2 id="inference-parameters">Inference parameters</h2>
  1257. <p>Unlike context parameters, which are described in <a href="../UnderstandLLamaContext/">understand-llama-context</a>, inference parameters are passed to an executor each time you call its API to run inference. That means you can change them every time you ask the model to generate output.</p>
  1258. <p>Here are the parameters for LLamaSharp executors.</p>
  1259. <pre><code class="language-cs">/// &lt;summary&gt;
  1260. /// The parameters used for inference.
  1261. /// &lt;/summary&gt;
  1262. public record InferenceParams
  1263. : IInferenceParams
  1264. {
  1265. /// &lt;summary&gt;
  1266. /// number of tokens to keep from initial prompt
  1267. /// &lt;/summary&gt;
  1268. public int TokensKeep { get; set; } = 0;
  1269. /// &lt;summary&gt;
  1270. /// how many new tokens to predict (n_predict), set to -1 to generate the response indefinitely
  1271. /// until it completes.
  1272. /// &lt;/summary&gt;
  1273. public int MaxTokens { get; set; } = -1;
  1274. /// &lt;summary&gt;
  1275. /// logit bias for specific tokens
  1276. /// &lt;/summary&gt;
  1277. public Dictionary&lt;LLamaToken, float&gt;? LogitBias { get; set; } = null;
  1278. /// &lt;summary&gt;
  1279. /// Sequences where the model will stop generating further tokens.
  1280. /// &lt;/summary&gt;
  1281. public IReadOnlyList&lt;string&gt; AntiPrompts { get; set; } = Array.Empty&lt;string&gt;();
  1282. /// &lt;inheritdoc /&gt;
  1283. public int TopK { get; set; } = 40;
  1284. /// &lt;inheritdoc /&gt;
  1285. public float TopP { get; set; } = 0.95f;
  1286. /// &lt;inheritdoc /&gt;
  1287. public float MinP { get; set; } = 0.05f;
  1288. /// &lt;inheritdoc /&gt;
  1289. public float TfsZ { get; set; } = 1.0f;
  1290. /// &lt;inheritdoc /&gt;
  1291. public float TypicalP { get; set; } = 1.0f;
  1292. /// &lt;inheritdoc /&gt;
  1293. public float Temperature { get; set; } = 0.8f;
  1294. /// &lt;inheritdoc /&gt;
  1295. public float RepeatPenalty { get; set; } = 1.1f;
  1296. /// &lt;inheritdoc /&gt;
  1297. public int RepeatLastTokensCount { get; set; } = 64;
  1298. /// &lt;inheritdoc /&gt;
  1299. public float FrequencyPenalty { get; set; } = .0f;
  1300. /// &lt;inheritdoc /&gt;
  1301. public float PresencePenalty { get; set; } = .0f;
  1302. /// &lt;inheritdoc /&gt;
  1303. public MirostatType Mirostat { get; set; } = MirostatType.Disable;
  1304. /// &lt;inheritdoc /&gt;
  1305. public float MirostatTau { get; set; } = 5.0f;
  1306. /// &lt;inheritdoc /&gt;
  1307. public float MirostatEta { get; set; } = 0.1f;
  1308. /// &lt;inheritdoc /&gt;
  1309. public bool PenalizeNL { get; set; } = true;
  1310. /// &lt;inheritdoc /&gt;
  1311. public SafeLLamaGrammarHandle? Grammar { get; set; }
  1312. /// &lt;inheritdoc /&gt;
  1313. public ISamplingPipeline? SamplingPipeline { get; set; }
  1314. }
  1315. </code></pre>
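<p>Because these parameters are supplied per call, different requests can use different settings. The fragment below is a sketch that assumes the <code>InferenceParams</code> record exactly as listed above and a referenced LLamaSharp package. The variable names and the chosen values are illustrative only.</p>
<pre><code class="language-cs">using System;
using LLama.Common;

// Settings for short question answering: stop at the next &quot;Q:&quot; and keep the output short.
var qaParams = new InferenceParams
{
    MaxTokens = 64,
    AntiPrompts = new[] { &quot;Q:&quot; },
    Temperature = 0.2f,
};

// InferenceParams is a record, so a modified copy can be made with a `with` expression.
var storyParams = qaParams with
{
    MaxTokens = 256,
    Temperature = 0.9f,
    AntiPrompts = Array.Empty&lt;string&gt;(),
};
</code></pre>
<p>Either object can then be passed as the <code>inferenceParams</code> argument of <code>InferAsync</code>, so two consecutive calls on the same executor may run with different limits and sampling settings.</p>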
  1316. <h2 id="save-and-load-executor-state">Save and load executor state</h2>
  1317. <p>An executor also has state, which can be saved and loaded. This is very helpful when your application needs to restore a user's previous session.</p>
  1318. <p>The following code shows how to save and load the executor state.</p>
  1319. <pre><code class="language-cs">InteractiveExecutor executor = new InteractiveExecutor(model);
  1320. // do some things...
  1321. executor.SaveState(&quot;executor.st&quot;);
  1322. var stateData = executor.GetStateData();
  1323. InteractiveExecutor executor2 = new InteractiveExecutor(model);
  1324. executor2.LoadState(stateData);
  1325. // do some things...
  1326. InteractiveExecutor executor3 = new InteractiveExecutor(model);
  1327. executor3.LoadState(&quot;executor.st&quot;);
  1328. // do some things...
  1329. </code></pre>
  1330. </article>
  1331. </div>
  1332. </div>
  1333. </main>
  1334. <footer class="md-footer">
  1335. <div class="md-footer-meta md-typeset">
  1336. <div class="md-footer-meta__inner md-grid">
  1337. <div class="md-copyright">
  1338. Made with
  1339. <a href="https://squidfunk.github.io/mkdocs-material/" target="_blank" rel="noopener">
  1340. Material for MkDocs
  1341. </a>
  1342. </div>
  1343. </div>
  1344. </div>
  1345. </footer>
  1346. </div>
  1347. <div class="md-dialog" data-md-component="dialog">
  1348. <div class="md-dialog__inner md-typeset"></div>
  1349. </div>
  1350. <script id="__config" type="application/json">{"base": "../..", "features": [], "search": "../../assets/javascripts/workers/search.74e28a9f.min.js", "translations": {"clipboard.copied": "Copied to clipboard", "clipboard.copy": "Copy to clipboard", "search.result.more.one": "1 more on this page", "search.result.more.other": "# more on this page", "search.result.none": "No matching documents", "search.result.one": "1 matching document", "search.result.other": "# matching documents", "search.result.placeholder": "Type to start searching", "search.result.term.missing": "Missing", "select.version": "Select version"}, "version": {"provider": "mike"}}</script>
  1351. <script src="../../assets/javascripts/bundle.220ee61c.min.js"></script>
  1352. </body>
  1353. </html>

An easy-to-use, high-performance LLM inference framework for C#/.NET, supporting the LLaMA and LLaVA model families.
