This is a benchmark of Postgres FTS versus other solutions: MeiliSearch, Typesense, OpenSearch, and SQLite.
To run the tests, please ensure you have the following installed on your machine:

- `gunzip` (part of the `gzip` software distribution)
- Docker
- NodeJS (`node` and `npm`)
- `pnpm` (i.e. `npm install -g pnpm`)
- `sqlite`

To set up testing data and run the full benchmark with all FTS engines:

    make # equivalent to `make setup run-all`

To run only a single benchmark (in this case, with Postgres FTS):

    FTS_ENGINE=pg make setup run

(`FTS_ENGINE` = `'pg'` | `'meilisearch'` | `'typesense'` | `'opensearch'` | `'sqlite-disk'`)

To only install dependencies:

    make setup

The benchmark in this repository uses a public domain movie dataset:

- On HuggingFace, in particular the following columns:
  - `homepage`
  - `title`
  - `original_title`
  - `overview`
  - `production_companies`
  - `spoken_languages`
  - `tagline`

Data is processed from CSV into newline delimited JSON (see movies.ndjson.json.gz).
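
The CSV-to-NDJSON step can be pictured with the sketch below. It is only an illustration under stated assumptions, not the repository's actual conversion code: `parseCsv` and `csvToNdjson` are hypothetical helpers, and the assumption is that each output line is a JSON object holding just the columns listed above.

```javascript
// Columns the README lists as relevant to the benchmark.
const COLUMNS = [
  "homepage", "title", "original_title", "overview",
  "production_companies", "spoken_languages", "tagline",
];

// Hypothetical helper: split CSV text into rows of fields, honoring
// double-quoted fields (embedded commas, newlines, and "" escapes).
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') { inQuotes = false; }
      else { field += ch; }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field); field = "";
    } else if (ch === "\n") {
      row.push(field); rows.push(row); row = []; field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush a final record that has no trailing newline.
  if (field !== "" || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}

// Hypothetical helper: convert CSV text to NDJSON, keeping only the
// selected columns (columns missing from the header are skipped).
function csvToNdjson(text, columns = COLUMNS) {
  const [header, ...records] = parseCsv(text);
  if (!header) return "";
  return records
    .map((record) => {
      const doc = {};
      for (const name of columns) {
        const idx = header.indexOf(name);
        if (idx !== -1) doc[name] = record[idx] ?? "";
      }
      return JSON.stringify(doc);
    })
    .join("\n");
}
```

For a data set of this size a streaming parser would likely be preferable; the in-memory version is kept here only because it is short enough to read in one pass.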
| ENV Variable | Default | Example | Description |
|---|---|---|---|
| `FTS_ENGINE` | N/A | `pg` | The FTS engine to use |
| `DEBUG` | N/A | `true` | Enable debug mode |
| `TIMING` | N/A | `true` | Enable timing information display |
| `DATA_MOVIES_CSV_ZIPPED_PATH` | `./movies.csv.gz` | `/path/to/movies.csv.gz` | Path to the movie data set |
| `DATA_MOVIES_CSV_PATH` | `./movies.csv` | `/path/to/movies.csv` | Path to the movie data set, uncompressed |
| `DATA_MOVIES_NDJSON_PATH` | `./movies.ndjson.json` | `/path/to/movies.ndjson.json` | Path to the newline delimited JSON data for movies |
| `SEARCH_PHRASES_NDJSON_PATH` | `./search-phrases.ndjson.json` | `/path/to/search-phrases.ndjson.json` | Path to search phrases to use as newline delimited JSON |

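
As a rough illustration of how a script might consume these variables, here is a minimal sketch. The `loadConfig` function is hypothetical and not taken from the repository; treating `DEBUG` and `TIMING` as enabled only when set to the string `true` is an assumption based on the example column.

```javascript
// Defaults copied from the table above; variables listed as N/A have none.
const DEFAULTS = {
  DATA_MOVIES_CSV_ZIPPED_PATH: "./movies.csv.gz",
  DATA_MOVIES_CSV_PATH: "./movies.csv",
  DATA_MOVIES_NDJSON_PATH: "./movies.ndjson.json",
  SEARCH_PHRASES_NDJSON_PATH: "./search-phrases.ndjson.json",
};

// Hypothetical helper: resolve settings from an environment object,
// falling back to the documented defaults when a variable is unset.
function loadConfig(env = process.env) {
  const config = {};
  for (const [name, fallback] of Object.entries(DEFAULTS)) {
    config[name] = env[name] || fallback;
  }
  config.FTS_ENGINE = env.FTS_ENGINE; // undefined when unset (no default)
  // Assumption: the flags are switched on by the literal string "true".
  config.DEBUG = env.DEBUG === "true";
  config.TIMING = env.TIMING === "true";
  return config;
}
```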
Some variables are used per-run and are normally set by more ergonomic top-level Makefile targets:
| ENV Variable | Default | Example | Description |
|---|---|---|---|
| `INPUT_CSV_PATH` | `$(DATA_MOVIES_CSV_ZIPPED_PATH)` | `/path/to/movies2.csv.gz` | Path to compressed CSV (normally unzipped by Makefile target) |
| `OP` | N/A | `ingest` | Operation to perform |
| `SQLITE_DISK_DB_PATH` | `./fts-sqlite-disk-db.sqlite` | `:memory:` | SQLite DB path |
| `PG_URL` | `postgres://$(PG_USER):$(PG_PASSWORD)@$(PG_HOST):$(PG_PORT)/$(PG_DB)` | `postgres://localhost` | Postgres DB path |
| `TYPESENSE_HOST` | `localhost` | `typesense.domain.tld` | Hostname for Typesense server |
| `TYPESENSE_PORT` | `8108` | `8109` | Port for Typesense server |
| `TYPESENSE_API_KEY` | `badtypesenseapikey` | `tttttttttttttttt` | API key for Typesense server |
| `MEILI_HOST` | `localhost` | `meili.domain.tld` | Hostname for MeiliSearch server |
| `MEILI_PORT` | `7700` | `7701` | Port for MeiliSearch |
| `MEILI_URL` | `http://$(MEILI_HOST):$(MEILI_PORT)` | `https://meili.domain.tld` | Full URL to use when accessing MeiliSearch |
| `MEILI_API_KEY` | `$(MEILI_MASTER_KEY)` | `xxxxxxxxxxxxxxxxxxx` | MeiliSearch API key |
| `OPENSEARCH_PROTOCOL` | `http` | `https` | Protocol to use when accessing OpenSearch service |
| `OPENSEARCH_HOST` | `localhost` | `opensearch.domain.tld` | Host for OpenSearch server |
| `OPENSEARCH_PORT` | `9200` | `9201` | Port for OpenSearch server |
| `OPENSEARCH_AUTH_USERNAME` | `admin` | `admin` | Admin username for OpenSearch server |
| `OPENSEARCH_AUTH_PASSWORD` | `admin` | `hunter2` | Admin password for OpenSearch server |

See Makefile for the code and other variables that might be excluded here.
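
Several defaults in the table are composed from other variables (`PG_URL`, `MEILI_URL`). The sketch below mirrors that composition in JavaScript purely for illustration. `buildPgUrl` and `buildMeiliUrl` are hypothetical helpers, and percent-encoding the Postgres credentials is an addition that the Makefile template does not show.

```javascript
// Hypothetical helper: compose a Postgres URL following the template
// postgres://$(PG_USER):$(PG_PASSWORD)@$(PG_HOST):$(PG_PORT)/$(PG_DB).
// Credentials are percent-encoded so characters like "@" stay unambiguous
// (an addition of this sketch, not something the table specifies).
function buildPgUrl(env) {
  const { PG_USER, PG_PASSWORD, PG_HOST, PG_PORT, PG_DB } = env;
  const user = encodeURIComponent(PG_USER);
  const password = encodeURIComponent(PG_PASSWORD);
  return `postgres://${user}:${password}@${PG_HOST}:${PG_PORT}/${PG_DB}`;
}

// Hypothetical helper: an explicit MEILI_URL wins; otherwise fall back to
// http://$(MEILI_HOST):$(MEILI_PORT) with the documented host and port defaults.
function buildMeiliUrl(env) {
  const host = env.MEILI_HOST || "localhost";
  const port = env.MEILI_PORT || "7700";
  return env.MEILI_URL || `http://${host}:${port}`;
}
```

For example, `buildMeiliUrl({ MEILI_PORT: "7701" })` yields `http://localhost:7701`, while setting `MEILI_URL` bypasses the host and port entirely.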
A single benchmark can be run with the following command:

    FTS_ENGINE=<engine> make setup run

Options for `FTS_ENGINE`:

- `pg`
- `meilisearch`
- `typesense`
- `sqlite`

To run the ingest & query tests with Postgres:

    TIMING=true FTS_ENGINE=pg make run

If an error occurs during setup, consider tearing down the existing `FTS_ENGINE`:

    FTS_ENGINE=pg make engine-stop

To control the setup/teardown of a single backing service, use the `engine-start` and `engine-stop` top-level targets.

For example, if you wanted to start MeiliSearch and poke around on the instance:

    FTS_ENGINE=meilisearch make engine-start

After this command returns, you should have an instance of MeiliSearch running with a stable name (`fts-$(FTS_ENGINE)`):

    $ docker ps
    CONTAINER ID   IMAGE                          COMMAND                  CREATED         STATUS         PORTS                      NAMES
    4d7c0efdf5cf   getmeili/meilisearch:v0.28.1   "tini -- /bin/meilis…"   7 seconds ago   Up 6 seconds   127.0.0.1:7700->7700/tcp   fts-meili

To stop the service:

    FTS_ENGINE=meilisearch make engine-stop

Each solution ingests data differently, and the code for each can be found under `src/driver/<engine>.mjs`. For example, `src/driver/pg.mjs` contains the code that ingests documents into Postgres.
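
To make the idea of a per-engine driver concrete, here is a minimal sketch of what such a module could look like, together with a timing wrapper of the kind `TIMING=true` suggests. Everything in it is an assumption for illustration: the `ingest`/`query` method names, the in-memory engine, and the `timed` helper are hypothetical and are not taken from the modules under `src/driver/`.

```javascript
// Hypothetical driver shape; the real modules under src/driver/ may differ.
// An in-memory stand-in is used so the sketch runs without any service.
function createMemoryDriver() {
  const docs = [];
  return {
    // Store a batch of documents and return the running total.
    async ingest(batch) {
      docs.push(...batch);
      return docs.length;
    },
    // Naive case-insensitive substring match on the title field.
    async query(phrase) {
      const needle = phrase.toLowerCase();
      return docs.filter((doc) =>
        String(doc.title ?? "").toLowerCase().includes(needle)
      );
    },
  };
}

// Hypothetical helper: run one operation and report its duration in milliseconds.
async function timed(label, fn) {
  const start = performance.now();
  const result = await fn();
  const ms = performance.now() - start;
  return { label, ms, result };
}
```

A benchmark loop could then call `timed("ingest", () => driver.ingest(batch))` once per engine and compare the reported durations.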
Queries to be performed in the test are specified as newline delimited JSON and stored in `search-phrases.ndjson.json`.
This file is read by the automation and related scripts.
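
Reading such a file amounts to parsing one JSON value per line. The sketch below shows one way to do that. `parseNdjson` is a hypothetical helper, and the `phrase` field in the usage note is an assumed record shape, since the actual schema of `search-phrases.ndjson.json` is not described here.

```javascript
// Hypothetical helper: parse newline delimited JSON text into an array,
// skipping blank lines and reporting which record failed to parse.
function parseNdjson(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line, index) => {
      try {
        return JSON.parse(line);
      } catch (err) {
        throw new Error(`Invalid JSON in record ${index + 1}: ${err.message}`);
      }
    });
}
```

With file contents such as `{"phrase":"space"}` on one line and `{"phrase":"heist"}` on the next, `parseNdjson` returns a two-element array of objects.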
To clear all the data between runs:

    sudo make clean # sudo is likely needed to clear docker container data folders