RAG Settings

The settings menu also contains a debugging menu, where you can test your settings and rebuild the search index with new parameters.

Settings

Chunk Size

The larger the chunk size, the more text each fragment contains and the more text is sent to the LLM as input.

Chunk Overlap

The larger the overlap, the more fragments are created and the more likely it is that one of them contains the correct answer. However, a larger overlap increases both the index creation time and the search time.
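The interaction between chunk size and chunk overlap can be illustrated with a minimal sketch. It assumes a simple sliding window measured in characters; the function name `split_with_overlap` is hypothetical, and the actual splitter may count tokens instead and behave differently at chunk boundaries.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a window of chunk_size characters over text.

    Each new window starts (chunk_size - chunk_overlap) characters after
    the previous one, so neighbouring chunks share chunk_overlap characters.
    This is an illustrative assumption, not the plugin's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


document = "x" * 1000
print(len(split_with_overlap(document, chunk_size=200, chunk_overlap=0)))    # 5
print(len(split_with_overlap(document, chunk_size=200, chunk_overlap=100)))  # 9
```

With the same chunk size, an overlap of half the chunk nearly doubles the number of fragments that have to be embedded and searched, which is where the extra indexing and search time comes from.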

Embedding Model

| Model | Use Case | Size | Source |
|---|---|---|---|
| MiniLMAll | Text similarity, fastest inference | 46 MB | HuggingFace |
| Distilbert | Q&A search, highest accuracy | 86 MB (quantized) | HuggingFace |
| MiniLMMultiQA | Q&A search, fastest inference | 46 MB | HuggingFace |

Similarity Metric

The metric used to find the nearest neighbors of a query embedding vector in a list of embedding vectors.

| Metric | Description |
|---|---|
| DotProduct | Measures the similarity between two vectors as the sum of the products of their corresponding components. It is well-suited for dense embeddings that are already normalized, because the result otherwise grows with the magnitude of the vectors. |
| CosineSimilarity | Measures the cosine of the angle between two vectors. It ignores vector magnitude, so it is well-suited for sparse embeddings and for cases where magnitudes vary but should not affect the ranking. |
| EuclideanDistance | Measures the straight-line distance between two points in Euclidean space. It is well-suited for cases where the embeddings are well-distributed in the vector space and where the magnitudes of the embeddings should affect the similarity. |
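The three metrics can rank the same embeddings differently. The following is a minimal sketch in plain Python, assuming small hand-made 2-D vectors instead of real model embeddings; the helper `nearest` is hypothetical and only reuses the metric names from the table as string keys.

```python
import math


def dot_product(a, b):
    # Sum of the products of corresponding components.
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


def euclidean_distance(a, b):
    # Straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, embeddings, metric, k=1):
    """Return the indices of the k embeddings closest to query."""
    if metric == "EuclideanDistance":
        # Smaller distance means closer.
        scored = sorted(
            (euclidean_distance(query, e), i) for i, e in enumerate(embeddings)
        )
    elif metric in ("DotProduct", "CosineSimilarity"):
        fn = dot_product if metric == "DotProduct" else cosine_similarity
        # Larger similarity means closer.
        scored = sorted(
            ((fn(query, e), i) for i, e in enumerate(embeddings)),
            key=lambda pair: pair[0],
            reverse=True,
        )
    else:
        raise ValueError(f"unknown metric: {metric}")
    return [i for _, i in scored[:k]]


query = [1.0, 0.0]
embeddings = [[10.0, 10.0], [1.0, 0.1], [0.5, 0.0]]

for name in ("DotProduct", "CosineSimilarity", "EuclideanDistance"):
    print(name, nearest(query, embeddings, name))
```

In this toy data, the dot product picks the long vector at index 0, cosine similarity picks the vector at index 2 that points in exactly the same direction as the query, and Euclidean distance picks the vector at index 1 that lies closest to the query point.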

Text Splitter

This is used for splitting long documents into smaller chunks for embedding.

| Splitter | Description |
|---|---|
| token | Encodes the input text into tokens and returns chunks based on the chunk size. Ideal for speed if you don't mind losing some information to unknown tokens during the encode/decode process. |
| character | Splits the text on a separator and appends pieces until the chunk size is reached. The default separator is character breaks. |
| recursive | Uses a progressively smaller set of text separators to try to fit the target chunk size in tokens without going over. Ideal if you need to preserve punctuation or unknown tokens from the original text, because it doesn't decode the final text. |
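The difference between the character and recursive strategies can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: sizes are measured in characters rather than tokens, and the function names, the default separator `"\n\n"`, and the separator order `("\n\n", "\n", " ", "")` are all hypothetical.

```python
def split_by_character(text, chunk_size, separator="\n\n"):
    """Split on one separator and append pieces until chunk_size is reached.

    A single piece longer than chunk_size is kept whole, so chunks
    can exceed the limit.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Try coarse separators first and fall back to finer ones for
    pieces that are still too long; "" means a hard cut."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(split_recursive(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "aaa bbb ccc\n\n" + "d" * 11 + " eee"
print(split_by_character(sample, chunk_size=7))
print(split_recursive(sample, chunk_size=7))
```

In this sketch the character splitter returns two chunks that both exceed the requested size, because it only knows one separator, while the recursive splitter keeps every chunk within the limit by falling back to spaces and, for the long run of `d` characters, to a hard cut.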

Debug

Rebuild index

Build a new index with updated settings. If you do not do this, changing the settings will not affect the search results.

Load index

Load the index for testing.