Frontend-only live semantic search with transformers.js. GitHub

Semantic search right in your browser! Embeddings and cosine similarity are computed client-side, with no server-side inference. Your data is private and stays in your browser.
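As a rough illustration of the client-side scoring step, the sketch below ranks text chunks by cosine similarity to a query embedding. It assumes the embeddings have already been computed (for example by a transformers.js feature-extraction model); the names `cosineSimilarity`, `rankChunks` and `EmbeddedChunk` are hypothetical and not taken from the app's source.

```typescript
// Hypothetical shape of an indexed chunk: its text plus a precomputed embedding.
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

interface ScoredChunk {
  text: string;
  score: number;
}

// Cosine similarity between two equal-length vectors.
// Returns 0 if either vector has zero length (norm), to avoid dividing by zero.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query embedding and keep the topK best matches.
function rankChunks(
  queryEmbedding: number[],
  chunks: EmbeddedChunk[],
  topK: number
): ScoredChunk[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Because everything here is plain arithmetic on arrays, it can run entirely in the browser, which is what keeps the data local.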
Just copy & paste any text into the text area, or load one or more PDFs in the advanced settings, and hit Find. Set a different chunk size for finer or coarser search.
Large books can be indexed too and searched in less than 2 seconds!
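To make the effect of the chunk size concrete, here is a minimal sketch of character-based chunking. It is an assumption for illustration only: the helper `chunkByChars` is hypothetical, and the app may split text differently (for example by words or sentences).

```typescript
// Split text into consecutive chunks of at most `chunkSize` characters.
// Note: lengths are counted in UTF-16 code units, as with String.prototype.slice.
// Chunks are trimmed, and chunks that are empty after trimming are dropped.
function chunkByChars(text: string, chunkSize: number): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new Error("chunkSize must be a positive integer");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize) {
    const chunk = text.slice(start, start + chunkSize).trim();
    if (chunk.length > 0) {
      chunks.push(chunk);
    }
  }
  return chunks;
}
```

A smaller chunk size produces more, shorter chunks (finer search, more embeddings to compute); a larger one produces fewer, longer chunks (coarser search, faster indexing).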
Examples: The Bible (en), Les Misérables (fr), Das Kapital (de), Don Quijote (es), Divina Commedia (it), Iliad (gr), IPCC Report 2023 (en). A full catalogue of pre-indexed examples is available on Hugging Face. Contribute the indices of documents you have indexed, or open a request on GitHub with a source URL.

Model Selection
Chunking Settings
App Settings
Include Words
Exclude Words
Import one or multiple PDF File(s)
Import Remote PDF File(s) space-separated using corsproxy.io
Import Local Index File
Import Remote Index File (Examples)
Export Index File
Style Preferences
Experimental Expert Settings (best left at defaults)

    Dimensionality Reduction (New🔥)

    Run a search as usual or load an index, then hit "Dim-Reduction" in the advanced settings. More iterations yield better results but take longer to compute. If the points are too small, increase the radius. Uses a fast WebAssembly implementation of Barnes-Hut t-SNE (wasm-bhtSNE).


    Enter a question to be answered and use the placeholders SEARCH_RESULTS or FULL_TEXT for context (Retrieval Augmented Generation, RAG).
    If you encounter errors, the input is probably too long (too many results, results that are too long, or a prompt that is too long). Also make sure to select the right prompting style! Xenova/Qwen1.5-1.8B-Chat is by far the best quantized model currently available and delivers good results. At some point Falcon & Mistral/Zephyr models will probably become available here.
    Attention: Loads very large models with more than 1.5 GB (!) of resources.
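The placeholder mechanism can be pictured as simple text substitution. The sketch below is a hypothetical illustration, not the app's actual code: `buildPrompt` and its `maxContextChars` parameter are assumed names, and the truncation step is just one possible way to keep the input short enough for the model.

```typescript
// Fill the SEARCH_RESULTS and FULL_TEXT placeholders in a prompt template.
// Both context strings are cut to at most `maxContextChars` characters
// (an assumed safeguard against overly long inputs).
function buildPrompt(
  template: string,
  searchResults: string[],
  fullText: string,
  maxContextChars: number = 2000
): string {
  const joinedResults = searchResults.join("\n").slice(0, maxContextChars);
  const truncatedFullText = fullText.slice(0, maxContextChars);
  // Single-pass replacement, so placeholder names that happen to occur inside
  // the inserted context are not substituted a second time.
  return template.replace(/SEARCH_RESULTS|FULL_TEXT/g, (match) =>
    match === "SEARCH_RESULTS" ? joinedResults : truncatedFullText
  );
}

// Example usage with made-up content:
const examplePrompt = buildPrompt(
  "Answer the question using this context: SEARCH_RESULTS\nQuestion: Who is the narrator?",
  ["First matching chunk.", "Second matching chunk."],
  ""
);
```

Using SEARCH_RESULTS keeps the prompt short because only the top-ranked chunks are included; FULL_TEXT passes the whole document and is more likely to exceed the model's input limit.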


    Ollama Chat Integration (New🔥)

    Enter a question to be answered and use the placeholders SEARCH_RESULTS or FULL_TEXT for context.
    Install Ollama locally on macOS, Linux or Windows and connect your server (currently only the default http://localhost:11434 is supported).
    Make sure to set the OLLAMA_ORIGINS environment variable so that requests from SemanticFinder are allowed:
    - on Windows PowerShell: $env:OLLAMA_ORIGINS="https://do-me.github.io"; ollama serve
    - on Ubuntu: OLLAMA_ORIGINS="https://do-me.github.io" ollama serve
    Due to CORS issues, this currently works only in Chromium-based browsers such as Chrome and Edge.
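For orientation, a request from the browser to a local Ollama server could look roughly like the sketch below. It assumes Ollama's `/api/chat` endpoint with streaming disabled; whether SemanticFinder uses this endpoint or another one is not stated here, and the helper names `buildChatRequest` and `askOllama` are hypothetical. The model name is supplied by the caller and must already be available in your Ollama installation.

```typescript
interface OllamaMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface OllamaChatRequest {
  model: string;
  messages: OllamaMessage[];
  stream: boolean;
}

// Build the JSON body for a single-turn, non-streaming chat request.
function buildChatRequest(model: string, prompt: string): OllamaChatRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    stream: false,
  };
}

// Send the prompt to a local Ollama server and return the reply text.
// The default base URL matches the only server address mentioned above.
async function askOllama(
  model: string,
  prompt: string,
  baseUrl: string = "http://localhost:11434"
): Promise<string> {
  const response = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(model, prompt)),
  });
  if (!response.ok) {
    throw new Error(`Ollama request failed with status ${response.status}`);
  }
  const data = await response.json();
  // With stream: false, the reply text is expected under message.content.
  return data.message.content;
}
```

If the OLLAMA_ORIGINS variable is not set as described above, the browser will block a call like this with a CORS error before it reaches the server.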

    Summary (Retrieval Augmented Generation, RAG)

    Summarizes the top search results. Works best with non-fiction texts and longer text chunks (more than 200 characters).
    Attention: Loads very large models with hundreds of MB!