v1.3.0

  • llama.cpp updated to b3190
  • Added support for DeepseekV2, GPTNeoX (Pythia and others)
  • Added support for Markdown formatting
  • Added support for using history in Shortcuts
  • Added Flash Attention support
  • Added NPredict option
  • Metal and CPU inference improvements
  • Sampling and eval improvements
  • Some fixes for Phi-3 and MiniCPM
  • Fixed some errors
  • Added Qwen template