# GPTtrace 🤖

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Actions Status](https://github.com/eunomia-bpf/GPTtrace/workflows/Pylint/badge.svg)](https://github.com/eunomia-bpf/GPTtrace/actions)
[![DeepSource](https://deepsource.io/gh/eunomia-bpf/eunomia-bpf.svg/?label=active+issues&show_trend=true&token=rcSI3J1-gpwLIgZWtKZC-N6C)](https://deepsource.io/gh/eunomia-bpf/eunomia-bpf/?ref=repository-badge)
[![CodeFactor](https://www.codefactor.io/repository/github/eunomia-bpf/eunomia-bpf/badge)](https://www.codefactor.io/repository/github/eunomia-bpf/eunomia-bpf)

Generate eBPF programs and trace your system with ChatGPT and natural language.
## Key Features 💡

### Interact with and trace your Linux system using natural language

GPTtrace can also explain how to write eBPF programs in the `BCC` and `libbpf` styles.

Example: tracing with the request "Count page faults by process".

![result](doc/result.gif)

### Generate eBPF programs with natural language

![generate](doc/generate.png)

For detailed documentation and tutorials on how we train ChatGPT to write eBPF programs, please refer to [`bpf-developer-tutorial`](https://github.com/eunomia-bpf/bpf-developer-tutorial) (a libbpf tool tutorial that teaches ChatGPT to write eBPF programs).

**Note that the `GPTtrace` tool is currently only a demo project to show how the approach works. The results may not be accurate, and it is not recommended for production use. We are working to make it more stable and complete!**
## Usage and Setup 🛠

```console
$ ./GPTtrace.py
usage: GPTtrace [-h] [-i | -v | -e TEXT | -g TEXT] [-u UUID] [-t ACCESS_TOKEN]

Use ChatGPT to write eBPF programs (bpftrace, etc.)

optional arguments:
  -h, --help            show this help message and exit
  -i, --info            Let ChatGPT explain what's eBPF
  -v, --verbose         Print the prompt and receive message
  -e TEXT, --execute TEXT
                        Generate commands using your input with ChatGPT, and run it
  -g TEXT, --generate TEXT
                        Generate eBPF programs using your input with ChatGPT
  -u UUID, --uuid UUID  Conversion UUID to use, or passed through environment variable `GPTTRACE_CONV_UUID`
  -t ACCESS_TOKEN, --access-token ACCESS_TOKEN
                        ChatGPT access token, see `https://chat.openai.com/api/auth/session` or passed through
                        `GPTTRACE_ACCESS_TOKEN`
```
### First: log in to ChatGPT

- Get the conversation ID from ChatGPT, then either set it in the environment variable `GPTTRACE_CONV_UUID` or pass it with the `-u` option. The conversation ID is the last part of the conversation URL. For example, the conversation ID of `https://chat.openai.com/conv/1a2b3c4d-0000-0000-0000-1k2l3m4n5o6p` is `1a2b3c4d-0000-0000-0000-1k2l3m4n5o6p` (an example only, not usable).
- Get the access token from ChatGPT, then either set it in the environment variable `GPTTRACE_ACCESS_TOKEN` or pass it with the `-t` option. See `https://chat.openai.com/api/auth/session` for the access token.
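
The two steps above amount to "take the last URL segment" and "prefer the command-line option, otherwise fall back to the environment variable". The following is a minimal Python sketch of that logic, not GPTtrace's actual code: the function names `conversation_id_from_url` and `resolve_setting` are hypothetical, and only the environment variable names come from this README.

```python
import os
from urllib.parse import urlparse


def conversation_id_from_url(url: str) -> str:
    """Return the last path segment of a conversation URL."""
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def resolve_setting(cli_value, env_name, environ=None):
    """Prefer an explicit CLI value, then fall back to the environment variable.

    `environ` is injectable for testing; it defaults to os.environ.
    This precedence order is an assumption, not confirmed by the README.
    """
    environ = os.environ if environ is None else environ
    return cli_value if cli_value else environ.get(env_name)


if __name__ == "__main__":
    url = "https://chat.openai.com/conv/1a2b3c4d-0000-0000-0000-1k2l3m4n5o6p"
    print(conversation_id_from_url(url))
    print(resolve_setting(None, "GPTTRACE_CONV_UUID"))
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.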
### Use prompts to teach ChatGPT to write eBPF programs

```console
$ ./GPTtrace.py --train
----------------------------
Training ChatGPT with `1.md`
----------------------------
....
Trained session: cbd73f64-64b8-4f1d-80d3-c5f4f2fe292e
```

This uses the material in the `prompts` directory to teach ChatGPT to write eBPF programs in the bpftrace, libbpf, and BCC styles. You can also do this manually by sending the prompts to ChatGPT through its web interface.
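
As a rough illustration of what such a training pass could look like, here is a minimal Python sketch that sends every prompt file in a directory to a chat session. It is not the tool's real implementation: `train_from_prompts` and the `send` callable are hypothetical, and the assumptions that prompts are `*.md` files and are sent in name order are inferred only from the `1.md` shown in the output above.

```python
from pathlib import Path
from typing import Callable, List


def train_from_prompts(prompt_dir: str, send: Callable[[str], str]) -> List[str]:
    """Send every Markdown file in prompt_dir to the chat session, in name order.

    `send` is a hypothetical callable that posts one message to the
    conversation and returns the reply. Returns the file names that were sent.
    """
    sent = []
    for path in sorted(Path(prompt_dir).glob("*.md")):
        print("-" * 28)
        print(f"Training ChatGPT with `{path.name}`")
        print("-" * 28)
        send(path.read_text(encoding="utf-8"))
        sent.append(path.name)
    return sent
```

Because `send` is passed in, the same loop works with any chat client, or with a stub when trying it out offline.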
### Start tracing! 🚀

For example:

```sh
./GPTtrace.py -e "Count page faults by process"
```

If the eBPF program cannot be loaded into the kernel, the error message is sent back to ChatGPT so it can correct the program, and the result is printed to the console.
## How it works

1. GPTtrace first primes ChatGPT with various eBPF development resources, holding multiple conversations to teach it how to write different types of eBPF programs and programs in the bpftrace DSL.
2. The user enters a request in natural language, and GPTtrace calls the ChatGPT API to generate an eBPF program. The generated program is then either executed via the shell or written to a file for compilation and execution.
3. If compilation or loading fails, the error is sent back to ChatGPT to generate a new eBPF program or command.
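
Steps 2 and 3 form a generate, run, and retry loop. The following Python sketch shows one way such a loop might be structured; it is an illustration under stated assumptions, not GPTtrace's code. `ask_model` is a hypothetical callable standing in for the ChatGPT API call, and the retry limit and the wording of the follow-up prompt are invented for the example.

```python
import subprocess
from typing import Callable, Optional, Tuple


def run_command(command: str, timeout: float = 30.0) -> Tuple[int, str]:
    """Run a shell command and return (exit code, combined stdout and stderr).

    Caution: this executes model-generated text in a shell.
    """
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return proc.returncode, proc.stdout + proc.stderr


def generate_and_run(request: str,
                     ask_model: Callable[[str], str],
                     run: Callable[[str], Tuple[int, str]] = run_command,
                     max_attempts: int = 3) -> Optional[str]:
    """Ask for a command, run it, and feed any error back until it succeeds.

    Returns the command output on success, or None after max_attempts failures.
    """
    prompt = request
    for _ in range(max_attempts):
        command = ask_model(prompt)
        code, output = run(command)
        if code == 0:
            return output
        # Hypothetical follow-up prompt: include the failing command and error.
        prompt = (f"The command `{command}` failed with:\n{output}\n"
                  f"Please fix it. Original request: {request}")
    return None
```

Many tracing commands run until interrupted, so a real tool would need to stream output and handle Ctrl-C rather than wait for the process to exit as this sketch does.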
## Room for improvement

There is still plenty of room for improvement, including:

1. Once ChatGPT can search online, the tool could fetch sample programs from the bcc/bpftrace repositories and learn from them, or consult sources such as Stack Overflow to see how eBPF programs are written, similar to the approach used in the new Bing search.
2. Providing more high-quality documentation and tutorials to improve the accuracy of the output and the quality of the code examples.
3. Making multiple calls to other tools to execute commands and return results. For example, GPTtrace could output a command, have bpftrace query the current kernel version and supported tracepoints, and return the output as part of the conversation.
4. Incorporating user feedback to improve the quality of the generated code and refine the natural language processing capabilities of the tool.

In addition, newer LLMs will likely produce more accurate output.
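
To make item 3 more concrete, here is a speculative Python sketch of how kernel context could be gathered and turned into text for the conversation. Nothing like this is described as existing in GPTtrace: `collect_context` and `run_lines` are hypothetical names, and restricting the listing to `tracepoint:syscalls:*` and capping it at 50 entries are arbitrary choices for the example. It relies on `uname -r` and `bpftrace -l`, which lists matching probes and usually requires root.

```python
import subprocess
from typing import Callable, List


def run_lines(argv: List[str]) -> List[str]:
    """Run a command without a shell and return its stdout as a list of lines."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=True)
    return proc.stdout.splitlines()


def collect_context(run: Callable[[List[str]], List[str]] = run_lines,
                    max_tracepoints: int = 50) -> str:
    """Build a text block describing the kernel version and some tracepoints.

    `run` is injectable so the function can be exercised without bpftrace.
    """
    kernel = run(["uname", "-r"])[0]
    tracepoints = run(["bpftrace", "-l", "tracepoint:syscalls:*"])[:max_tracepoints]
    lines = [f"Kernel version: {kernel}", "Available tracepoints:"]
    lines += [f"- {tp}" for tp in tracepoints]
    return "\n".join(lines)
```

The returned text could be prepended to the user's request so the model sees which tracepoints are available on the machine.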
## Installation 🔧

```sh
./install.sh
```
## Examples

- Files opened by process
- Syscall count by program
- Read bytes by process
- Read size distribution by process
- Show per-second syscall rates
- Trace disk size by process
- Count page faults by process
- Count LLC cache misses by process name and PID (uses PMCs)
- Profile user-level stacks at 99 Hertz, for PID 189
- Files opened, for processes in the root cgroup-v2
## LICENSE

MIT
## 🔗 Links

- Detailed documentation and tutorials on how we train ChatGPT to write eBPF programs: https://github.com/eunomia-bpf/bpf-developer-tutorial (an eBPF developer tutorial based on CO-RE (compile once, run everywhere) libbpf: learn eBPF step by step through 20 small tools, and an attempt to teach ChatGPT to write eBPF programs)
- bpftrace: https://github.com/iovisor/bpftrace
- ChatGPT: https://chat.openai.com/
- Python API: https://github.com/mmabrouk/chatgpt-wrapper
