v0.6.2
- llama.cpp updated to b1256
- rwkv updated to 8db73b1
- added grammar sampling for llama models; put .gbnf files into the grammars directory (see the example grammar below)
- gpt-2 updated
- rwkv_eval_sequence: about 20% faster
- handle GGML_ASSERT failures
- starcoder (santacoder) with Metal and mmap support (GGUF)
- infinite text generation by resetting n_past (see the sketch below)
- new llama2 and saiga templates
- fixed mmap always being false
- various other bug fixes
- fix crash on n_tokens > context size
- fix llama mlock
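
For reference, here is a minimal .gbnf grammar (a hypothetical example, not one of the bundled files) that constrains the model to answer only "yes" or "no". A file like this placed in the grammars directory can be selected for grammar sampling:

```gbnf
# root is the grammar's entry point; the model may only emit "yes" or "no"
root ::= "yes" | "no"
```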
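
The n_past reset for infinite text generation roughly follows the context-shifting idea used in llama.cpp's examples. A rough sketch of that idea with hypothetical names, not the project's actual code:

```c
// Once the number of evaluated tokens reaches the context limit, keep the
// first n_keep prompt tokens and roll n_past back so that only the most
// recent half of the remaining window is re-evaluated before generation
// continues, instead of aborting when the context fills up.
static int reset_n_past(int n_past, int n_ctx, int n_keep) {
    if (n_past < n_ctx) {
        return n_past;              // still room in the context window
    }
    int n_left = n_past - n_keep;   // tokens beyond the kept prefix
    return n_keep + n_left / 2;     // discard the older half of them
}
```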