Introduction

InfiSearch is a client-side search solution for static sites. It consists of a search UI and a search library, both of which rely on a pre-built index generated by a CLI tool.

Features

  • Relevant Search 🔍: spelling correction, automatic prefix search, boolean and phrase queries, BM25 scoring, proximity scoring, facet filters and more…

  • Speedy 🏇: WebAssembly & WebWorker powered, enabling efficient, non-blocking query processing. Also includes persistent caching to minimize network requests, and a multi-threaded CLI indexer powered by Rust.

  • Semi-Scalable: the index can optionally be split into tiny morsels, and incremental indexing is supported.

  • A customisable, accessible user interface 🖥️

  • Support for multiple file formats (.json, .csv, .pdf, .html) to satisfy more custom data requirements.

Search Features

A little more about some of InfiSearch’s search features.

Blazing Fast

Powered by WebAssembly and WebWorkers, InfiSearch blazes through searches on tens of thousands of documents. Index downloads are persistently cached using the Cache API, the same storage that backs service workers, but without the hassle of setting one up. Users will never download the same data twice.

Some efficient, high-return compression schemes are also employed, so you get all these features without much of a size penalty. This documentation, for example, which has all features enabled, generates a main index file of just 20KB and a dictionary of 9KB.

Scalable

A monolithic index is built by default to reduce network latency, which suffices for 90% of use cases. However, you also have the option of splitting up the index so that users retrieve only what is necessary, greatly improving client-side search scalability.

Ranking Model & Query Refinement

InfiSearch adopts industry-standard scoring schemes. Documents are first scored using the BM25 model, then a soft disjunctive maximum of each document's field scores is taken. By default, <title>, <h1>, <h2-6>, then other texts are indexed as four separate fields.
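To make the two scoring steps concrete, here is a minimal sketch of textbook BM25 followed by a soft disjunctive maximum over per-field scores. It is not InfiSearch's actual implementation: the function names, the `K1` and `B` constants, and the `tieBreaker` weight are all assumptions chosen for illustration.

```typescript
// Common textbook defaults; the values InfiSearch uses are not stated here.
const K1 = 1.2;
const B = 0.75;

/** BM25 contribution of one query term within one field of one document. */
function bm25Term(
  termFreq: number,    // occurrences of the term in this field
  fieldLen: number,    // number of tokens in this field
  avgFieldLen: number, // average length of this field across all documents
  numDocs: number,     // total number of documents
  docFreq: number      // number of documents containing the term
): number {
  // Rare terms get a higher inverse document frequency.
  const idf = Math.log(1 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
  // Longer-than-average fields are penalised through the B parameter.
  const norm = termFreq + K1 * (1 - B + B * (fieldLen / avgFieldLen));
  return (idf * termFreq * (K1 + 1)) / norm;
}

/** Soft disjunctive maximum: the best field score plus a damped share of the others. */
function softDisMax(fieldScores: number[], tieBreaker: number = 0.3): number {
  let best = 0;
  let sum = 0;
  for (const score of fieldScores) {
    best = Math.max(best, score);
    sum += score;
  }
  return best + tieBreaker * (sum - best);
}
```

With `tieBreaker` set to 0 this reduces to a plain maximum over the fields, and with 1 it becomes a plain sum; values in between let the strongest field dominate while still rewarding matches in several fields.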

Query term proximity ranking is InfiSearch's highlight here, and is enabled by default. Result scores are scaled according to how close the search expressions are to one another, greatly improving the contextual relevance of results.
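One common way to realise this kind of proximity scaling is to find the smallest window of token positions that contains every query term, then boost documents where that window is tight. The sketch below illustrates the idea only. The names `minWindow` and `proximityScale`, the `maxBoost` parameter, and the scaling formula are assumptions, not InfiSearch's actual algorithm.

```typescript
interface Occurrence {
  pos: number;  // token position within the document
  term: number; // index of the query term found at that position
}

/** Smallest span of positions containing at least one occurrence of every term. */
function minWindow(positions: number[][]): number {
  if (positions.length === 0) return Infinity;
  const merged: Occurrence[] = [];
  const counts: number[] = [];
  for (let term = 0; term < positions.length; term++) {
    if (positions[term].length === 0) return Infinity; // a term is missing entirely
    counts.push(0);
    for (const pos of positions[term]) merged.push({ pos, term });
  }
  merged.sort((a, b) => a.pos - b.pos);

  // Sliding window over the merged occurrences, shrinking from the left
  // whenever every term is covered.
  let covered = 0;
  let left = 0;
  let best = Infinity;
  for (let right = 0; right < merged.length; right++) {
    if (counts[merged[right].term]++ === 0) covered++;
    while (covered === positions.length) {
      best = Math.min(best, merged[right].pos - merged[left].pos + 1);
      if (--counts[merged[left].term] === 0) covered--;
      left++;
    }
  }
  return best;
}

/** Hypothetical scaling factor: 1 (no boost) up to maxBoost (terms adjacent). */
function proximityScale(positions: number[][], maxBoost: number = 2): number {
  if (positions.length < 2) return 1; // proximity is meaningless for one term
  const span = minWindow(positions);
  if (!isFinite(span)) return 1;
  const tightness = Math.min(1, positions.length / span);
  return 1 + (maxBoost - 1) * tightness;
}
```

Here `positions[i]` holds the token positions of the i-th query term in one document. Two terms at positions 1 and 2 receive the full boost, while the same terms ten tokens apart receive only a small one.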

InfiSearch also gives searchers a powerful boolean query syntax, made known to them through an advanced search tips icon. You also have the option of setting up custom facet filters such as multi-select checkboxes, numeric filters, and date-time filters for ease of use.

How it Works

InfiSearch depends on a static, pre-built index that is a collection of various files.

  1. The CLI indexer tool first generates:
    • Binary index chunk(s)
    • JSON field store(s) containing raw document texts
    • Supporting metadata, for example the search dictionary
  2. The search UI:
    1. Figures out which index files are needed from the query
    2. Retrieves the files from cache/memory/network requests
    3. Obtains and ranks the result set
    4. Lastly, retrieves field stores from cache/memory/network requests progressively to generate result previews
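The first two steps of the search UI's flow can be sketched as follows, for the case where the index has been split into several chunks. This is an illustration under assumptions rather than InfiSearch's real data layout: the `Dictionary` shape (a term mapped to a chunk id) and the function names `chunksForQuery` and `chunksToFetch` are hypothetical.

```typescript
// Hypothetical dictionary shape: term -> id of the index chunk holding its postings.
type Dictionary = { [term: string]: number };

/** Step 1: work out which index chunks the query terms need. */
function chunksForQuery(terms: string[], dictionary: Dictionary): number[] {
  const needed: number[] = [];
  for (const term of terms) {
    // Terms absent from the dictionary cannot match anything, so they need no chunk.
    if (!Object.prototype.hasOwnProperty.call(dictionary, term)) continue;
    const chunk = dictionary[term];
    if (needed.indexOf(chunk) === -1) needed.push(chunk);
  }
  return needed.sort((a, b) => a - b);
}

/** Step 2 (memory tier only): keep the chunks that are not already loaded. */
function chunksToFetch(needed: number[], inMemory: number[]): number[] {
  return needed.filter((chunk) => inMemory.indexOf(chunk) === -1);
}

// Example: two known terms in different chunks, one unknown term.
const exampleDictionary: Dictionary = { search: 0, static: 0, wasm: 3, worker: 7 };
const exampleNeeded = chunksForQuery(["wasm", "search", "missing"], exampleDictionary); // [0, 3]
const exampleToFetch = chunksToFetch(exampleNeeded, [0]); // [3]
```

Chunks returned by `chunksToFetch` would then be looked up in the persistent cache and requested over the network only on a cache miss. With the default monolithic index there is a single file, so this selection step is trivial.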