Inference options
When you create a chat, a JSON file is generated in which you can specify additional model parameters. Chat files are stored in the "chats" directory.
parameter | default | description |
---|---|---|
title | [Model file name] | Chat title |
icon | ava0 | Chat icon, from ava0 to ava7 |
model | [Model file path] | Path to the selected model file |
model_inference | auto | Inference type: llama, gptneox, replit, or gpt2 |
prompt_format | auto | Example for StableLM: "<USER> {{prompt}} <ASSISTANT>" |
numberOfThreads | 0 (max) | number of threads |
context | 1024 | context size |
n_batch | 512 | batch size for prompt processing |
temp | 0.8 | temperature |
top_k | 40 | top-k sampling |
top_p | 0.95 | top-p sampling |
tfs_z | 1.0 | tail free sampling, parameter z |
typical_p | 1.0 | locally typical sampling, parameter p |
repeat_penalty | 1.1 | penalize repeated sequences of tokens |
repeat_last_n | 64 | last n tokens to consider for the repeat penalty |
frequence_penalty | 0.0 | repeat alpha frequency penalty |
presence_penalty | 0.0 | repeat alpha presence penalty |
mirostat | 0 | use Mirostat sampling (0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) |
mirostat_tau | 5.0 | Mirostat target entropy, parameter tau |
mirostat_eta | 0.1 | Mirostat learning rate, parameter eta |
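
For reference, here is a minimal sketch of what a chat file in the "chats" directory might look like, assuming a flat JSON object whose keys match the parameters in the table above. The title, model file name, and prompt format are illustrative only; any parameter you leave out falls back to its default.

```json
{
  "title": "StableLM chat",
  "icon": "ava0",
  "model": "stablelm-3b.Q4_K_M.gguf",
  "model_inference": "auto",
  "prompt_format": "<USER> {{prompt}} <ASSISTANT>",
  "numberOfThreads": 0,
  "context": 1024,
  "n_batch": 512,
  "temp": 0.8,
  "top_k": 40,
  "top_p": 0.95,
  "repeat_penalty": 1.1,
  "repeat_last_n": 64,
  "mirostat": 0
}
```

Other parameters such as tfs_z, typical_p, or the Mirostat settings can be added in the same way when you want to override their defaults.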