
Inference options

When a chat is created, a JSON file is generated in which you can specify additional model parameters. Chat files are stored in the "chats" directory.

| parameter | default | description |
|---|---|---|
| title | [model file name] | Chat title |
| icon | ava0 | Chat avatar, ava[0-7] |
| model | | Model file path |
| model_inference | auto | Inference backend: llama \| gptneox \| replit \| gpt2 |
| prompt_format | auto | Prompt template; example for stablelm: `<USER> {{prompt}} <ASSISTANT>` |
| numberOfThreads | 0 (max) | Number of threads |
| context | 1024 | Context size |
| n_batch | 512 | Batch size for prompt processing |
| temp | 0.8 | Temperature |
| top_k | 40 | Top-k sampling |
| top_p | 0.95 | Top-p sampling |
| tfs_z | 1.0 | Tail free sampling, parameter z |
| typical_p | 1.0 | Locally typical sampling, parameter p |
| repeat_penalty | 1.1 | Penalty for repeated sequences of tokens |
| repeat_last_n | 64 | Number of recent tokens considered for the repeat penalty |
| frequence_penalty | 0.0 | Repeat alpha frequency penalty |
| presence_penalty | 0.0 | Repeat alpha presence penalty |
| mirostat | 0 | Use Mirostat sampling |
| mirostat_tau | 5.0 | Mirostat target entropy, parameter tau |
| mirostat_eta | 0.1 | Mirostat learning rate, parameter eta |
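
As a concrete illustration, below is a minimal sketch of what a chat file in the "chats" directory might look like, assuming a flat JSON object keyed by the parameters in the table above. The file name, chat title, and model path are illustrative placeholders, not values from a real installation, and JSON does not allow comments, so the hedging lives here: parameters left out of the file are expected to fall back to the defaults listed in the table.

```json
{
    "title": "Example chat",
    "icon": "ava0",
    "model": "models/example-model.bin",
    "model_inference": "llama",
    "prompt_format": "<USER> {{prompt}} <ASSISTANT>",
    "numberOfThreads": 0,
    "context": 1024,
    "n_batch": 512,
    "temp": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "repeat_penalty": 1.1,
    "repeat_last_n": 64,
    "mirostat": 0
}
```

Editing this file by hand is a way to tune sampling parameters (for example, lowering `temp` for more deterministic output or enabling `mirostat`) for an existing chat without recreating it.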