# Get Started
## Install packages
First, search for `LLamaSharp` in the NuGet package manager and install it.
```
PM> Install-Package LLamaSharp
```
Then, search for and install one of the following backends:
```
LLamaSharp.Backend.Cpu
LLamaSharp.Backend.Cuda11
LLamaSharp.Backend.Cuda12
```
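If you prefer the command line over the package manager console, the same packages can be installed with the `dotnet` CLI (shown here with the CPU backend; substitute one of the CUDA packages if you have a supported GPU):
```
dotnet add package LLamaSharp
dotnet add package LLamaSharp.Backend.Cpu
```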
Here's the mapping between backend versions, LLamaSharp versions, and verified model resources. If you're not sure which models work with your version, please try our sample models.
| LLamaSharp.Backend | LLamaSharp | Verified Model Resources | llama.cpp commit id |
| - | - | -- | - |
| - | v0.2.0 | This version is not recommended for use. | - |
| - | v0.2.1 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama), [Vicuna (filenames with "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | - |
| v0.2.2 | v0.2.2, v0.2.3 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama_ggmlv2), [Vicuna (filenames without "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | 63d2046 |
| v0.3.0 | v0.3.0 | [LLamaSharpSamples v0.3.0](https://huggingface.co/AsakusaRinne/LLamaSharpSamples/tree/v0.3.0), [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main) | 7e4ea5b |
## Download a model
Any of the following models should work:
- LLaMA 🦙
- [Alpaca](https://github.com/ggerganov/llama.cpp#instruction-mode-with-alpaca)
- [GPT4All](https://github.com/ggerganov/llama.cpp#using-gpt4all)
- [Chinese LLaMA / Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)
- [Vigogne (French)](https://github.com/bofenghuang/vigogne)
- [Vicuna](https://github.com/ggerganov/llama.cpp/discussions/643#discussioncomment-5533894)
- [Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)
- [OpenBuddy 🐶 (Multilingual)](https://github.com/OpenBuddy/OpenBuddy)
- [Pygmalion 7B / Metharme 7B](#using-pygmalion-7b--metharme-7b)
- [WizardLM](https://github.com/nlpxucan/WizardLM)
**Note that because `llama.cpp` is under rapid development and often introduces breaking changes, some model weights on Hugging Face that work with one version may be invalid with another. If this is your first time configuring LLamaSharp, we suggest using the verified model weights from the table above.**
## Run the program
Please create a console program targeting .NET Standard 2.0 or higher (net6.0 or higher is recommended). Then, paste the following code into `Program.cs`:
```cs
using LLama.Common;
using LLama;

string modelPath = "<Your model path>"; // change it to your own model path
var prompt = "Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\r\n\r\nUser: Hello, Bob.\r\nBob: Hello. How may I help you today?\r\nUser: Please tell me the largest city in Europe.\r\nBob: Sure. The largest city in Europe is Moscow, the capital of Russia.\r\nUser:"; // use the "chat-with-bob" prompt here.

// Load the model
var parameters = new ModelParams(modelPath)
{
    ContextSize = 1024
};
using var model = LLamaWeights.LoadFromFile(parameters);

// Initialize a chat session
using var context = model.CreateContext(parameters);
var ex = new InteractiveExecutor(context);
ChatSession session = new ChatSession(ex);

// Show the prompt
Console.WriteLine();
Console.Write(prompt);

// Run the inference in a loop to chat with the LLM
while (true)
{
    await foreach (var text in session.ChatAsync(prompt, new InferenceParams() { Temperature = 0.6f, AntiPrompts = new List<string> { "User:" } }))
    {
        Console.Write(text);
    }
    Console.ForegroundColor = ConsoleColor.Green;
    prompt = Console.ReadLine();
    Console.ForegroundColor = ConsoleColor.White;
}
```
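In this loop, `Temperature` controls how random the sampling is (lower values make replies more deterministic), and `AntiPrompts` tells the executor to stop generating once it emits `"User:"`, handing control back to you to type the next message.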
After starting the program, you'll see output like the following:
```
Please input your model path: D:\development\llama\weights\wizard-vicuna-13B.ggmlv3.q4_1.bin
llama.cpp: loading model from D:\development\llama\weights\wizard-vicuna-13B.ggmlv3.q4_1.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 3 (mostly Q4_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 7759.48 MB
llama_model_load_internal: mem required = 9807.48 MB (+ 1608.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 800.00 MB
Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User:
```
Now, enjoy chatting with the LLM!
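By the way, if you installed one of the CUDA backends, you may be able to offload part of the model to your GPU when loading it. This is a minimal sketch, assuming your LLamaSharp version exposes a `GpuLayerCount` property on `ModelParams` (check the API of the version you installed):
```cs
using LLama;
using LLama.Common;

// Assumption: ModelParams exposes GpuLayerCount in your LLamaSharp version.
// It sets how many layers llama.cpp offloads to the GPU; 0 keeps everything on the CPU.
var parameters = new ModelParams("<Your model path>")
{
    ContextSize = 1024,
    GpuLayerCount = 20 // illustrative value; tune it to fit your VRAM
};
using var model = LLamaWeights.LoadFromFile(parameters);
```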