v0.7.0

  • llama.cpp updated to b1396
  • added support for LoRA adapters (more about LoRA here)
  • added support for MPT and Bloom models
  • added Metal support for q5_0 and q5_1 quantization
  • GPT-2 now with Metal support
  • added special-token support for prompt templates like User: {{prompt}}
  • fixed a tokenizer bug that could crash the application
  • fixed Mirostat sampling for non-LLaMA models
  • fixed predictions completing prematurely
  • fixed many other bugs
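
To illustrate the prompt-template feature mentioned above, a minimal sketch of placeholder substitution: the `{{prompt}}` token in a template is replaced by the user's input before the text is sent to the model. The function name `apply_template` and the template string are hypothetical, not part of the actual implementation.

```python
# Hypothetical sketch of prompt-template substitution.
# The "{{prompt}}" placeholder comes from the release notes;
# everything else here is illustrative.
def apply_template(template: str, prompt: str) -> str:
    """Insert the user's text where the {{prompt}} placeholder appears."""
    return template.replace("{{prompt}}", prompt)

template = "User: {{prompt}}\nAssistant:"
print(apply_template(template, "What is LoRA?"))
# → User: What is LoRA?
#   Assistant:
```

Special tokens (such as role markers like `User:`) can then be tokenized with their dedicated token IDs rather than being split into ordinary subword pieces.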