diff --git a/README.md b/README.md
index 94c3c299..76dbb874 100644
--- a/README.md
+++ b/README.md
@@ -10,14 +10,19 @@
 [![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp.Backend.Cuda12?label=LLamaSharp.Backend.Cuda12)](https://www.nuget.org/packages/LLamaSharp.Backend.Cuda12)

-The C#/.NET binding of [llama.cpp](https://github.com/ggerganov/llama.cpp). It provides APIs to inference the LLaMa Models and deploy it on native environment or Web. It works on
-both Windows and Linux and does NOT require compiling llama.cpp yourself. Its performance is close to llama.cpp.
+**The C#/.NET binding of [llama.cpp](https://github.com/ggerganov/llama.cpp). It provides APIs to run inference with LLaMA models and deploy them locally. It works on
+Windows, Linux and macOS without requiring you to compile llama.cpp yourself. Its performance is close to llama.cpp.**

-- LLaMa models inference
-- APIs for chat session
-- Model quantization
-- Embedding generation, tokenization and detokenization
-- ASP.NET core integration
+**Furthermore, it integrates with other projects such as [BotSharp](https://github.com/SciSharp/BotSharp) to offer higher-level applications and UIs.**
+
+
+## Documentation
+
+- [Quick start](https://scisharp.github.io/LLamaSharp/0.4/GetStarted/)
+- [FAQ and tricks](https://scisharp.github.io/LLamaSharp/0.4/Tricks/)
+- [Full documentation](https://scisharp.github.io/LLamaSharp/0.4/)
+- [API reference](https://scisharp.github.io/LLamaSharp/0.4/xmldocs/)
+- [Examples](./LLama.Examples/NewVersion/)

 ## Installation

@@ -42,7 +47,7 @@ Here's the mapping of them and corresponding model samples provided by `LLamaSha
 | - | v0.2.0 | This version is not recommended to use. | - |
 | - | v0.2.1 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama), [Vicuna (filenames with "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | - |
 | v0.2.2 | v0.2.2, v0.2.3 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama_ggmlv2), [Vicuna (filenames without "old")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | 63d2046 |
-| v0.3.0 | v0.3.0 | [LLamaSharpSamples v0.3.0](https://huggingface.co/AsakusaRinne/LLamaSharpSamples/tree/v0.3.0), [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main) | 7e4ea5b |
+| v0.3.0, v0.3.1 | v0.3.0, v0.4.0 | [LLamaSharpSamples v0.3.0](https://huggingface.co/AsakusaRinne/LLamaSharpSamples/tree/v0.3.0), [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main) | 7e4ea5b |

 We publish the backend with cpu, cuda11 and cuda12 because they are the most popular ones. If none of them matches, please compile the [llama.cpp](https://github.com/ggerganov/llama.cpp)
 from source and put the `libllama` under your project's output path. When building from source, please add `-DBUILD_SHARED_LIBS=ON` to enable the library generation.

@@ -53,44 +58,40 @@
 2. Unsupported model: `llama.cpp` is under quick development and often has break changes. Please check the release date of the model and find a
 suitable version of LLamaSharp to install, or use the model we provide [on huggingface](https://huggingface.co/AsakusaRinne/LLamaSharpSamples).

-## Simple Benchmark
-
-Currently it's only a simple benchmark to indicate that the performance of `LLamaSharp` is close to `llama.cpp`. Experiments run on a computer
-with Intel i7-12700, 3060Ti with 7B model. Note that the benchmark uses `LLamaModel` instead of `LLamaModelV1`.
-
-#### Windows
-
-- llama.cpp: 2.98 words / second
-
-- LLamaSharp: 2.94 words / second


 ## Usages

 #### Model Inference and Chat Session

-Currently, `LLamaSharp` provides two kinds of model, `LLamaModelV1` and `LLamaModel`. Both of them works but `LLamaModel` is more recommended
-because it provides better alignment with the master branch of [llama.cpp](https://github.com/ggerganov/llama.cpp).
-
-Besides, `ChatSession` makes it easier to wrap your own chat bot. The code below is a simple example. For all examples, please refer to
-[Examples](./LLama.Examples).
+LLamaSharp provides two ways to run inference: `LLamaExecutor` and `ChatSession`. The chat session is a higher-level wrapper around the executor and the model. Here's a simple example of using a chat session.

 ```cs
+using LLama.Common;
+using LLama;
+
+string modelPath = ""; // change it to your own model path
+var prompt = "Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\r\n\r\nUser: Hello, Bob.\r\nBob: Hello. How may I help you today?\r\nUser: Please tell me the largest city in Europe.\r\nBob: Sure. The largest city in Europe is Moscow, the capital of Russia.\r\nUser:"; // use the "chat-with-bob" prompt here.
-var model = new LLamaModel(new LLamaParams(model: "", n_ctx: 512, repeat_penalty: 1.0f));
-var session = new ChatSession(model).WithPromptFile("")
-                .WithAntiprompt(new string[] { "User:" });
-Console.Write("\nUser:");
-while (true)
+// Initialize a chat session
+var ex = new InteractiveExecutor(new LLamaModel(new ModelParams(modelPath, contextSize: 1024, seed: 1337, gpuLayerCount: 5)));
+ChatSession session = new ChatSession(ex);
+
+// show the prompt
+Console.WriteLine();
+Console.Write(prompt);
+
+// run the inference in a loop to chat with LLM
+while (prompt != "stop")
 {
-    Console.ForegroundColor = ConsoleColor.Green;
-    var question = Console.ReadLine();
-    Console.ForegroundColor = ConsoleColor.White;
-    var outputs = session.Chat(question); // It's simple to use the chat API.
-    foreach (var output in outputs)
+    foreach (var text in session.Chat(prompt, new InferenceParams() { Temperature = 0.6f, AntiPrompts = new List<string> { "User:" } }))
     {
-        Console.Write(output);
+        Console.Write(text);
     }
+    prompt = Console.ReadLine();
 }
+
+// save the session
+session.SaveSession("SavedSessionPath");
 ```
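+
+The executor can also be used on its own when you don't need a chat session. The snippet below is only a minimal sketch of that second route: it reuses the `ModelParams` and `InferenceParams` shown above and assumes the executor's streaming `Infer` call, so please refer to [Examples](./LLama.Examples/NewVersion/) for complete and up-to-date code.
+
+```cs
+using System;
+using System.Collections.Generic;
+using LLama;
+using LLama.Common;
+
+string modelPath = ""; // change it to your own model path
+
+// load the model and wrap it with an interactive executor (no chat session)
+var executor = new InteractiveExecutor(new LLamaModel(new ModelParams(modelPath, contextSize: 1024, seed: 1337, gpuLayerCount: 5)));
+
+// stream the generated text piece by piece until the anti-prompt is produced
+var inferenceParams = new InferenceParams() { Temperature = 0.6f, AntiPrompts = new List<string> { "User:" } };
+foreach (var text in executor.Infer("User: What is the largest city in Europe?\nAssistant:", inferenceParams))
+{
+    Console.Write(text);
+}
+```
+
+In short, prefer `ChatSession` when you want the higher-level chat workflow and the session saving shown above, and use the executor directly when you only need to stream text for a single prompt.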
 
 #### Quantization
@@ -125,6 +126,12 @@ Since we are in short of hands, if you're familiar with ASP.NET core, we'll appr
 
 ## Roadmap
 
+---
+
+✅: completed. ⚠️: outdated but will be updated. 🔳: not completed yet.
+
+---
+
 ✅ LLaMa model inference
 
 ✅ Embeddings generation, tokenization and detokenization
@@ -135,7 +142,11 @@ Since we are in short of hands, if you're familiar with ASP.NET core, we'll appr
 
 ✅ State saving and loading
 
-✅ ASP.NET core Integration
+✅ BotSharp Integration
+
+⚠️ ASP.NET core Integration
+
+⚠️ Semantic-kernel Integration
 
 🔳 MAUI Integration
 
@@ -161,7 +172,7 @@ The prompts could be found below:
 
 ## Contributing
 
-Any contribution is welcomed! You can do one of the followings to help us make `LLamaSharp` better:
+Any contribution is welcome! Please read the [contributing guide](https://scisharp.github.io/LLamaSharp/0.4/ContributingGuide/). You can do any of the following to help us make `LLamaSharp` better:
 
 - Append a model link that is available for a version. (This is very important!)
 - Star and share `LLamaSharp` to let others know it.