v0.7.0
- llama.cpp updated to b1396
- added LoRA adapter support (more about LoRA here)
- added support for MPT and Bloom models
- added Metal support for q5_0 and q5_1 quantization
- GPT-2 now runs with Metal support
- added special token support for prompt templates such as `User: {{prompt}}`
- fixed a tokenizer bug that could crash the application
- fixed Mirostat sampling for non-LLaMA models
- fixed premature completion of predictions
- fixed many other errors
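The prompt-template entry above can be illustrated with a minimal sketch of how a `{{prompt}}` placeholder is substituted before the text reaches the model. This is an assumption for illustration only: the function name and the substitution logic are hypothetical, not the app's actual implementation.

```python
def apply_template(template: str, user_text: str) -> str:
    # Replace the {{prompt}} placeholder with the user's input.
    # Hypothetical helper, shown only to clarify the template syntax.
    return template.replace("{{prompt}}", user_text)

# Example using the template format mentioned in the release note:
formatted = apply_template("User: {{prompt}}", "Hello, world")
print(formatted)  # → User: Hello, world
```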