RAG Settings

The settings menu also contains a debugging menu, where you can test your settings and rebuild the search index with new parameters.

Settings

Chunk Size

The larger the chunk size, the more text each fragment contains, and the more text is sent to the LLM input.

Chunk Overlap

The larger the overlap, the more fragments are created and the more likely it is that one of them contains the correct answer. However, a larger overlap increases both the index creation time and the search time.
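The interaction between chunk size and chunk overlap can be illustrated with a minimal sliding-window sketch. The function `chunk_words` is hypothetical and measures sizes in words for simplicity; the app's splitters may count tokens or characters instead.

```python
def chunk_words(words, chunk_size, chunk_overlap):
    """Split a list of words into windows of `chunk_size` words,
    where consecutive windows share `chunk_overlap` words.

    Hypothetical helper for illustration only; units are words (an assumption).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


words = [f"w{i}" for i in range(100)]
no_overlap = chunk_words(words, chunk_size=20, chunk_overlap=0)
with_overlap = chunk_words(words, chunk_size=20, chunk_overlap=10)
print(len(no_overlap), len(with_overlap))
```

With the same chunk size, the overlapping run produces more fragments to embed and search, which is where the extra indexing and search time comes from.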

Embedding Model

Model	Use Case	Size	Source
`MiniLMAll`	Text similarity, fastest inference	46 MB	HuggingFace
`Distilbert`	Q&A search, highest accuracy	86 MB (quantized)	HuggingFace
`MiniLMMultiQA`	Q&A search, fastest inference	46 MB	HuggingFace

Similarity Metric

The metric used to find the nearest neighbors of a query embedding vector in a list of embedding vectors.

Metric	Description
`DotProduct`	Dot product is a similarity metric that measures the similarity between two vectors by summing the products of their corresponding components. It is well-suited for dense embeddings. Because the result grows with vector length, use it when the embeddings are normalized or when their magnitude should contribute to the similarity.
`CosineSimilarity`	Cosine similarity is a metric that measures the cosine of the angle between two vectors. It is well-suited for sparse embeddings. Because it compares direction only, use it when the magnitude of the embeddings should not impact the similarity.
`EuclideanDistance`	Euclidean distance is a metric that measures the distance between two points in a Euclidean space. It is well-suited for cases where the embeddings are well-distributed in the vector space and when magnitudes of the embeddings impact the similarity.
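The following minimal sketch shows how the three metrics can pick different nearest neighbors for the same query. The helper functions and the two-dimensional example vectors are hypothetical and only illustrate the math; they are not the app's implementation.

```python
import math


def dot_product(a, b):
    # Sum of the products of corresponding components.
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    # Straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
candidates = {
    "aligned": [3.0, 0.0],  # same direction as the query, but longer
    "large": [4.0, 3.0],    # longest vector, pointing off-axis
    "nearby": [1.0, 0.5],   # closest point in space
}

# Higher is better for the similarities, lower is better for the distance.
best_by_dot = max(candidates, key=lambda k: dot_product(query, candidates[k]))
best_by_cosine = max(candidates, key=lambda k: cosine_similarity(query, candidates[k]))
best_by_euclidean = min(candidates, key=lambda k: euclidean_distance(query, candidates[k]))

print(best_by_dot, best_by_cosine, best_by_euclidean)
```

In this example the dot product favors the longest vector, cosine similarity favors the vector with the same direction, and Euclidean distance favors the closest point. When all embeddings are normalized to unit length, the three metrics produce the same ranking.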

Text Splitter

This is used for splitting long documents into smaller chunks for embedding.

Splitter	Description
`token`	Encodes the input text and returns chunks based on the chunk size. Ideal for speed if you don't mind losing some information to unknown tokens during the encode/decode process.
`character`	Splits the text on a separator, then appends pieces to a chunk until the chunk size is reached. The default separator is character breaks.
`recursive`	Uses a progressively smaller set of text separators to try to fit the goal chunk size in tokens without going over. Ideal if you need to preserve punctuation or unknown tokens from the original text, because it doesn't decode the final text.
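As a rough illustration of the split-then-append behavior described for the `character` splitter, here is a minimal sketch. The function `character_split`, the `"\n\n"` separator, and measuring the chunk size in characters are all assumptions made for the example; the app's actual defaults and units may differ.

```python
def character_split(text, chunk_size, separator="\n\n"):
    """Split `text` on `separator`, then greedily merge neighboring pieces
    while the merged chunk stays within `chunk_size` characters.

    Hypothetical sketch: separator and character-based sizing are assumptions.
    """
    pieces = [p for p in text.split(separator) if p]
    chunks = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            # Start a new chunk, or keep appending while there is room.
            # A single oversized piece is kept whole rather than cut.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


text = "alpha\n\nbeta\n\ngamma delta\n\nepsilon"
chunks = character_split(text, chunk_size=12)
print(chunks)
```

Here the first two pieces fit together within 12 characters and are merged, while the remaining pieces each become their own chunk.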

Debug

Rebuild index

Build a new index with updated settings. If you do not do this, changing the settings will not affect the search results.

Load index

Load the index for testing.


LLM Farm