

# Get Started

## Install packages

First, search for `LLamaSharp` in the NuGet package manager and install it.

```
PM> Install-Package LLamaSharp
```

Then, search for and install one of the following backend packages:

```
LLamaSharp.Backend.Cpu
LLamaSharp.Backend.Cuda11
LLamaSharp.Backend.Cuda12
```
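
If you prefer the dotnet CLI, the equivalent commands are shown below (using the CPU backend as an example; substitute the backend that matches your hardware):

```
dotnet add package LLamaSharp
dotnet add package LLamaSharp.Backend.Cpu
```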
Here's the mapping between backend packages, `LLamaSharp` versions, and verified model weights. If you're not sure which model works with your version, please try one of our sample models.
| LLamaSharp.Backend | LLamaSharp | Verified Model Resources | llama.cpp commit id |
| --- | --- | --- | --- |
| - | v0.2.0 | This version is not recommended for use. | - |
| - | v0.2.1 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama), [Vicuna (filenames with "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | - |
| v0.2.2 | v0.2.2, v0.2.3 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama_ggmlv2), [Vicuna (filenames without "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | 63d2046 |
| v0.3.0 | v0.3.0 | [LLamaSharpSamples v0.3.0](https://huggingface.co/AsakusaRinne/LLamaSharpSamples/tree/v0.3.0), [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main) | 7e4ea5b |
## Download a model

Any of the following models should work:

- LLaMA 🦙
- [Alpaca](https://github.com/ggerganov/llama.cpp#instruction-mode-with-alpaca)
- [GPT4All](https://github.com/ggerganov/llama.cpp#using-gpt4all)
- [Chinese LLaMA / Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)
- [Vigogne (French)](https://github.com/bofenghuang/vigogne)
- [Vicuna](https://github.com/ggerganov/llama.cpp/discussions/643#discussioncomment-5533894)
- [Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)
- [OpenBuddy 🐶 (Multilingual)](https://github.com/OpenBuddy/OpenBuddy)
- [Pygmalion 7B / Metharme 7B](#using-pygmalion-7b--metharme-7b)
- [WizardLM](https://github.com/nlpxucan/WizardLM)

**Note that because `llama.cpp` is under rapid development and often introduces breaking changes, model weights on Hugging Face that work with one version may be invalid with another. If this is your first time configuring LLamaSharp, we suggest using the verified model weights from the table above.**
## Run the program

Create a console application targeting .NET Standard 2.0 or higher (net6.0 or later is recommended).
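
For reference, a minimal project file might look like the sketch below; the version numbers are placeholders, so pick the latest releases from NuGet and swap in the backend package that matches your hardware.

```
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <!-- Versions below are examples only; use the latest from NuGet. -->
    <PackageReference Include="LLamaSharp" Version="0.3.0" />
    <PackageReference Include="LLamaSharp.Backend.Cpu" Version="0.3.0" />
  </ItemGroup>

</Project>
```

Then, paste the following code into `Program.cs`: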
```cs
using System;
using System.Collections.Generic;
using LLama.Common;
using LLama;

string modelPath = "<Your model path>"; // change it to your own model path
var prompt = "Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\r\n\r\nUser: Hello, Bob.\r\nBob: Hello. How may I help you today?\r\nUser: Please tell me the largest city in Europe.\r\nBob: Sure. The largest city in Europe is Moscow, the capital of Russia.\r\nUser:"; // use the "chat-with-bob" prompt here.

// Initialize a chat session with an interactive executor.
var ex = new InteractiveExecutor(new LLamaModel(new ModelParams(modelPath, contextSize: 1024, seed: 1337, gpuLayerCount: 5)));
ChatSession session = new ChatSession(ex);

// Show the prompt.
Console.WriteLine();
Console.Write(prompt);

// Run the inference in a loop to chat with the LLM.
while (true)
{
    // Stream the model's reply token by token; generation stops at the "User:" anti-prompt.
    foreach (var text in session.Chat(prompt, new InferenceParams() { Temperature = 0.6f, AntiPrompts = new List<string> { "User:" } }))
    {
        Console.Write(text);
    }
    // Read the next user message (shown in green).
    Console.ForegroundColor = ConsoleColor.Green;
    prompt = Console.ReadLine();
    Console.ForegroundColor = ConsoleColor.White;
}
```
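
The sample output below begins with a `Please input your model path:` prompt that the code above does not print; it presumably comes from a variant of the program that reads the model path from the console instead of hard-coding it. A minimal sketch of that pattern (an illustration, not part of the original sample):

```cs
// Ask for the model path at runtime instead of hard-coding it.
Console.Write("Please input your model path: ");
string modelPath = Console.ReadLine();
```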
After starting the program, you'll see output similar to the following:
```
Please input your model path: D:\development\llama\weights\wizard-vicuna-13B.ggmlv3.q4_1.bin
llama.cpp: loading model from D:\development\llama\weights\wizard-vicuna-13B.ggmlv3.q4_1.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 3 (mostly Q4_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 7759.48 MB
llama_model_load_internal: mem required = 9807.48 MB (+ 1608.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 800.00 MB
Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.

User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User:
```
Now, enjoy chatting with the LLM! Because `AntiPrompts` contains `"User:"`, generation pauses whenever the model emits that text, and control returns to you for the next message.