Rinne
c641dbdb83
Merge pull request #69 from martindevans/fixed_mirostat_spelling
Fixed Spelling Mirostate -> Mirostat
2 years ago
Rinne
8d37abd787
Merge pull request #68 from martindevans/sampling_improvements
Fixed Memory pinning in Sampling API
2 years ago
Rinne
1d29b240b2
Merge pull request #64 from martindevans/new_llama_state_loading_mechanism
Low level new loading system
2 years ago
Martin Evans
add3d5528b
Removed `MarshalAs` on array
2 years ago
Martin Evans
2245b84906
Update LLamaContextParams.cs
2 years ago
Martin Evans
c64507cb41
Correctly passing through mu value to mirostat instead of resetting it every time.
2 years ago
Rinne
0e438e6303
Merge pull request #63 from martindevans/more_multi_enumeration_fixes
Fixed More Multiple Enumeration
2 years ago
Rinne
cd015055a8
Merge branch 'master' into more_multi_enumeration_fixes
2 years ago
Rinne
e5ca18fd86
Merge pull request #54 from martindevans/fix_multiple_enumeration
Fixed Multiple Enumeration
2 years ago
Martin Evans
23629a15f8
Merge pull request #1 from saddam213/new_llama_state_loading_mechanism
LLamaContextParams epsilon and tensor split changes
2 years ago
sa_ddam213
3e252c81f6
LLamaContextParams epsilon and tensor split changes
2 years ago
Martin Evans
36735f7908
Fixed spelling of "mirostat" instead of "mirostate"
2 years ago
Martin Evans
ec49bdd6eb
- Most importantly: Fixed issue in `SamplingApi`, `Memory` was pinned, but never unpinned!
- Moved repeated code to convert `LLamaTokenDataArray` into a `LLamaTokenDataArrayNative` into a helper method.
- Modified all call sites to dispose the `MemoryHandle`
- Saved one copy of the `List<LLamaTokenData>` into a `LLamaTokenData[]` in `LlamaModel`
2 years ago
Martin Evans
6985d3ab60
Added comments on two properties
2 years ago
Martin Evans
c974c8429e
Removed leftover `using`
2 years ago
Martin Evans
afb9d24f3a
Added model `Tokenize` method
2 years ago
Martin Evans
369c915afe
Added TokenToString conversion on model handle
2 years ago
Martin Evans
b721072aa5
Exposed some extra model properties on safe handle
2 years ago
Martin Evans
44b1e93609
Moved LoRA loading into `SafeLlamaModelHandle`
2 years ago
Martin Evans
c95b14d8b3
- Fixed null check
- Additional comments
2 years ago
Martin Evans
f16aa58e12
Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.
It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2 years ago
Martin Evans
8848fc6e3d
Fixed 2 more "multi enumeration" issues
2 years ago
Martin Evans
ad28a5acdb
Merge branch 'master' into fix_multiple_enumeration
2 years ago
Rinne
4d7d4f2bfe
Merge pull request #59 from saddam213/master
Instruct & Stateless web example implemented
2 years ago
Rinne
66d6b00b49
Merge pull request #57 from martindevans/larger_states
Larger states
2 years ago
Martin Evans
3d07721a00
Fixed eager count check
2 years ago
Rinne
c5e8b3eba2
Merge pull request #56 from martindevans/memory_mapped_save_loading_and_saving
Memory Mapped LoadState/SaveState
2 years ago
Rinne
dee9afc471
Merge pull request #55 from martindevans/removed_dictionary_extensions
Cleaned up unnecessary extension methods
2 years ago
Rinne
d17fa991cc
Merge pull request #53 from martindevans/xml_docs_fixes
XML docs fixes
2 years ago
Rinne
ae98fa19b1
Merge pull request #52 from martindevans/docs_spelling_and_grammar
Documentation Spelling/Grammar
2 years ago
sa_ddam213
3fec7a63c7
Add Instruct and Stateless support
2 years ago
sa_ddam213
a32a5e4ffe
Decouple connectionId from ModelSession
2 years ago
sa_ddam213
d9fbd56f10
Strongly type connection status
2 years ago
sa_ddam213
ef8cf0b283
Add RequestVerificationToken logic for ajax prefilter, Tidy up js cancel logic
2 years ago
sa_ddam213
e574d89a40
Send prompt on Enter key
2 years ago
Rinne
ac7f1865ee
Merge pull request #51 from fwaris/master
fix breaking change in llama.cpp; bind to latest version llama.cpp to…
2 years ago
Rinne
36ad09790c
Merge branch 'master' into master
2 years ago
Rinne
98825d8a9b
Merge pull request #48 from saddam213/master
Basic ASP.NET Core website example
2 years ago
Rinne
1b0523f630
Merge branch 'master' into master
2 years ago
Rinne
098d5b1544
Merge pull request #47 from SignalRT/master
MacOS metal support
2 years ago
SignalRT
e5d885050e
Align llama.cpp binaries
2 years ago
Martin Evans
f3fa73de2b
Implemented a new `LlamaModel.State` handle which internally stores the state as natively allocated memory. This allows it to exceed the 2GB limit on C# arrays.
2 years ago
Martin Evans
4d72420a04
Replaced `SaveState` and `LoadState` implementations. These new implementations map the file into memory and then pass the pointer directly into the native API. This improves things in two ways:
- A C# array cannot exceed 2,147,483,591 bytes. In my own use of LlamaSharp I encountered this limit.
- This saves an extra copy of the entire state data into a C# `byte[]`, so it should be faster.
This does _not_ fix some other places where `GetStateData` is used. I'll look at those in a separate PR.
2 years ago
Martin Evans
18462beb31
- Removed the `Update` and `GetOrDefault` extension methods (they were unused).
- Renamed `DictionaryExtensions` to `KeyValuePairExtensions`, since nothing in that file extends dictionary any more!
2 years ago
Martin Evans
7cf1f8ac28
Fixed multiple cases where an `IEnumerable<T>` was enumerated multiple times.
2 years ago
Martin Evans
2e76b79af6
Various minor XML docs fixes
2 years ago
Martin Evans
b39805dfcc
Fixed some spelling and grammar mistakes in the documentation.
2 years ago
Faisal Waris
17838bba49
fix breaking change in llama.cpp; bind to latest version llama.cpp to support new quantization method
2 years ago
sa_ddam213
a139423581
Move session management to service, Use ILLamaExecutor in session to make more versatile, scroll bug
2 years ago
SignalRT
a5c089e7b1
Update llama.cpp libraries
Keep update binaries
2 years ago