RAG Settings
The settings menu also contains a debugging menu, where you can test your settings and rebuild the search index with new parameters.
Settings
Chunk Size
The larger the chunk size, the more text from each fragment is sent to the LLM as input.
Chunk Overlap
The larger the overlap, the more fragments are created and the more likely it is that one of them contains the correct answer. However, a larger overlap also increases the index creation time and the search time for each response.
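The effect of these two settings can be illustrated with a minimal sketch. It is not the app's actual splitter: the function name `chunk_words` is hypothetical, and it counts whitespace-separated words as a stand-in for whatever unit the app measures chunk size in.

```python
def chunk_words(words, chunk_size, chunk_overlap):
    """Split a list of words into windows of chunk_size words,
    where consecutive windows share chunk_overlap words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Illustrative input: 100 placeholder "words" (assumption, not real data).
words = [f"w{i}" for i in range(100)]
no_overlap = chunk_words(words, chunk_size=20, chunk_overlap=0)
with_overlap = chunk_words(words, chunk_size=20, chunk_overlap=10)
print(len(no_overlap), len(with_overlap))
```

With the same chunk size, the overlapping run produces more fragments for the same text, which is why index creation and search take longer.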
Embedding Model
Model | Use Case | Size | Source |
---|---|---|---|
MiniLMAll | Text similarity, fastest inference | 46 MB | HuggingFace |
Distilbert | Q&A search, highest accuracy | 86 MB (quantized) | HuggingFace |
MiniLMMultiQA | Q&A search, fastest inference | 46 MB | HuggingFace |
Similarity Metric
The metric used to find the nearest neighbors of a query embedding vector in a list of embedding vectors.
Metric | Description |
---|---|
DotProduct | The dot product measures the similarity of two vectors as the sum of the products of their corresponding components. It is well-suited for dense embeddings and for cases where the magnitude of the embeddings should affect the similarity. For unit-length embeddings it ranks results the same way as cosine similarity. |
CosineSimilarity | Cosine similarity measures the cosine of the angle between two vectors. It is well-suited for sparse embeddings and for cases where the magnitude of the embeddings should not affect the similarity. |
EuclideanDistance | Euclidean distance is a metric that measures the distance between two points in a Euclidean space. It is well-suited for cases where the embeddings are well-distributed in the vector space and when magnitudes of the embeddings impact the similarity. |
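
To make the differences between the metrics concrete, here is a minimal pure-Python sketch. The function names and the toy two-dimensional vectors are illustrative assumptions, not the app's implementation. Note that for the dot product and cosine similarity a higher score means a closer match, while for Euclidean distance a lower value does.

```python
import math


def dot_product(a, b):
    # Sum of the products of corresponding components.
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product divided by both vector lengths, so magnitude cancels out.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    # Straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, embeddings, metric, top_k=2, higher_is_better=True):
    """Return the indices of the top_k embeddings closest to the query."""
    scored = [(metric(query, e), i) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[0], reverse=higher_is_better)
    return [i for _, i in scored[:top_k]]


query = [1.0, 0.0]
embeddings = [[0.9, 0.1], [5.0, 5.0], [0.0, 1.0]]  # toy vectors

by_dot = nearest_neighbors(query, embeddings, dot_product)
by_cosine = nearest_neighbors(query, embeddings, cosine_similarity)
by_distance = nearest_neighbors(query, embeddings, euclidean_distance,
                                higher_is_better=False)
print(by_dot, by_cosine, by_distance)
```

In this toy example the long vector `[5.0, 5.0]` ranks first under the dot product because of its magnitude, while cosine similarity and Euclidean distance both prefer `[0.9, 0.1]`, which points in nearly the same direction as the query.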
Text Splitter
This is used for splitting long documents into smaller chunks for embedding.
Splitter | Description |
---|---|
token | Encodes the input text and returns chunks based on the chunk size. Ideal for speed if you don't mind losing some information to unknown tokens during the encode/decode process. |
character | Splits the text on a separator and appends pieces until the chunk size is reached. Default separator is character breaks. |
recursive | Uses a progressively smaller set of text separators to try to fit the goal chunk size in tokens without going over. Ideal if you need to preserve punctuation or unknown tokens from the original text, because it doesn't decode the final text. |
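
The idea behind the character splitter can be sketched roughly as follows. This is an illustration under stated assumptions rather than the app's code: `character_split` is a hypothetical name, the chunk size is measured in characters, and a newline is used as the separator purely for the example.

```python
def character_split(text, chunk_size, separator="\n"):
    """Split text on separator, then merge pieces until chunk_size is reached."""
    pieces = [p for p in text.split(separator) if p]  # drop empty pieces
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Still fits, or this is the first piece of a new chunk.
            # A single piece longer than chunk_size is kept whole.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\nbbbb\ncccc\ndddd"  # toy input
chunks = character_split(sample, chunk_size=9)
print(chunks)
```

A recursive splitter would apply the same merging step repeatedly, retrying any oversized piece with the next, finer separator in its list.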
Debug
Rebuild index
Build a new index with updated settings. If you do not do this, changing the settings will not affect the search results.
Load index
Load the index for testing.