
v0.6.2

  • llama.cpp updated to b1256
  • rwkv updated to 8db73b1
  • added grammar sampling for llama models; put .gbnf files into the grammars directory (see the example grammar after this list)
  • gpt-2 updated
  • rwkv_eval_sequence: 20% speed increase
  • handle GGML_ASSERT
  • starcoder (santacoder) with Metal and mmap support (GGUF)
  • infinite text generation by resetting n_past
  • new llama2 and saiga templates
  • fixed mmap always being false
  • various other bug fixes
  • fix crash on n_tokens > context size
  • fix llama mlock
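
For reference, here is a minimal sketch of a grammar file in llama.cpp's GBNF format; the file name and the yes/no rule are purely illustrative, not part of this release. Dropping a file like this into the grammars directory and selecting it constrains the model's output to strings the grammar accepts.

```
# answer.gbnf — illustrative grammar: force the model to reply "yes" or "no"
root   ::= answer "\n"
answer ::= "yes" | "no"
```

The `root` rule is the start symbol; literals go in double quotes, `|` separates alternatives, and `#` starts a comment.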