v0.6.2
- llama.cpp updated to b1256
- rwkv updated to 8db73b1
- added grammar sampling for llama models; put .gbnf files into the grammars directory (see the example grammar below)
- gpt-2 updated
- rwkv_eval_sequence: about 20% faster
- handle GGML_ASSERT failures
- starcoder (santacoder) with Metal and mmap support (GGUF)
- infinite text generation by resetting n_past (see the sketch below)
- new llama2 and saiga templates
- fixed mmap always being false
- various other bug fixes
- fix crash on n_tokens > context size
- fix llama mlock
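
For reference, here is a minimal .gbnf grammar (a hypothetical example, not one of the bundled files) that constrains the model to answer only "yes" or "no". A file like this placed in the grammars directory can be selected for grammar sampling:

```gbnf
# root is the grammar's entry point; the model may only emit "yes" or "no"
root ::= "yes" | "no"
```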
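
The n_past reset for infinite text generation roughly follows the context-shifting idea used in llama.cpp's examples. A rough sketch of that idea with hypothetical names, not the project's actual code:

```c
// Once the number of evaluated tokens reaches the context limit, keep the
// first n_keep prompt tokens and roll n_past back so that only the most
// recent half of the remaining window is re-evaluated before generation
// continues, instead of aborting when the context fills up.
static int reset_n_past(int n_past, int n_ctx, int n_keep) {
    if (n_past < n_ctx) {
        return n_past;              // still room in the context window
    }
    int n_left = n_past - n_keep;   // tokens beyond the kept prefix
    return n_keep + n_left / 2;     // discard the older half of them
}
```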