
Getting Started with Lucenia in Five Easy Steps

· 5 min read
Dr. Nicholas Knize
Co-founder & CTO

At Lucenia, our mission is to make advanced search capabilities accessible to everyone. Lucenia 0.1.0 embodies this philosophy by bringing PhD-level science and analytics into everyday hybrid search use cases, all while ensuring enterprise-grade Role-Based and Attribute-Based Access Control (RBAC and ABAC) security by default. If you missed our exclusive sneak peek, be sure to check it out. This post marks the first in a series of tutorial blog posts that will guide you from getting started with Lucenia to mastering some of its most sophisticated features.

Prerequisites

Before diving in, ensure you have satisfied the following prerequisites:

  1. Java Installation: You need Java 19 or later, but we strongly recommend using Java 21 or above to take full advantage of search optimizations, including Project Panama and SIMD.
  2. Docker Installation: While not a hard requirement, Docker is used in this post to run Lucenia. If you don't have Docker installed, follow Docker's official installation instructions. Other ways to install and run Lucenia are covered in the installation guide in the Lucenia documentation.
  3. Lucenia License: Obtain a Lucenia license by registering at https://cloud.lucenia.io. This license gives you full access to all features for 30 days. After that, you can either purchase a yearly license or stay tuned for the launch of our free developer license, which allows for a minimally scalable production cluster at no cost.
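
The Java and Docker prerequisites can be checked from a terminal before going further. The sketch below is one way to do that. The function names `java_major`, `java_verdict`, and `check_prereqs` are made up for this example, and the thresholds (19 minimum, 21 recommended) are taken from the list above.

```shell
# Extract the major Java version from a version string such as
# "21.0.2" (modern scheme) or "1.8.0_292" (legacy scheme, major = 8).
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Compare a major version against the thresholds from the prerequisites.
java_verdict() {
  if [ "$1" -ge 21 ]; then
    echo "Java $1: recommended"
  elif [ "$1" -ge 19 ]; then
    echo "Java $1: supported, but 21 or later is recommended"
  else
    echo "Java $1: too old, 19 or later is required"
  fi
}

# Report on both tools without aborting if one is missing.
check_prereqs() {
  if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; the version is the first quoted token.
    version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    java_verdict "$(java_major "$version")"
  else
    echo "java not found on PATH"
  fi
  if command -v docker >/dev/null 2>&1; then
    echo "docker found"
  else
    echo "docker not found (optional, but used in this post)"
  fi
}
```

Running `check_prereqs` after sourcing these functions prints one line per tool.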

Step 1: Spinning Up Lucenia

To get started, clone the Lucenia Tutorials repository, change into the getting-started tutorial directory, load the environment settings, copy your trial license file into the node configuration, and start the containers:

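# Clone the tutorials repository (this SSH URL requires a GitHub SSH key)
git clone git@github.com:lucenia/lucenia-tutorials
cd lucenia-tutorials/1_getting-started
# Load the tutorial's environment settings into the current shell
source env.sh
# Copy the license file into the node configuration directory
cp ~/Downloads/trial.crt node/config
# Start Lucenia
docker compose up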

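Verify that Lucenia is running. The --insecure flag tells curl to skip TLS certificate verification, which is acceptable only for a local tutorial cluster:

curl "https://localhost:9200" \
  -u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
  --insecure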
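
Because `docker compose up` can take a while before the node answers requests, it may help to poll the endpoint rather than retry by hand. The following is a minimal sketch, not part of the tutorial repository: `wait_for_lucenia` and the `WAIT_SECONDS` variable are hypothetical names, and the URL and credentials simply mirror the verification request above.

```shell
# Poll the root endpoint until it responds successfully or attempts run out.
# Arguments: URL (default https://localhost:9200), max attempts (default 30).
wait_for_lucenia() {
  url="${1:-https://localhost:9200}"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors such as 401 or 503.
    if curl --silent --fail --insecure \
        -u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
        "$url" >/dev/null; then
      echo "Lucenia is up after $i attempt(s)"
      return 0
    fi
    sleep "${WAIT_SECONDS:-2}"
    i=$((i + 1))
  done
  echo "Lucenia did not respond after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_lucenia` with no arguments checks the local tutorial cluster every two seconds, up to 30 times.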

Step 2: Creating the Index

Now that Lucenia is up and running, the next step is to create an index to store the data. The request below creates an index named nyc_taxis using the field mappings in mappings.json from the tutorial directory:

curl -XPUT "https://localhost:9200/nyc_taxis" \
  -u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
  --insecure \
  --header 'Content-Type: application/json' \
  --data-binary "@mappings.json"
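
For readers curious about what a mappings file looks like, here is a hypothetical, trimmed-down sketch covering only the two fields queried later in this post. It is not the repository's actual mappings.json. The file name `example-mappings.json`, the `date` format string, and the assumption that Lucenia accepts OpenSearch-style mapping syntax are all illustrative.

```shell
# Write a hypothetical two-field mapping to a separate file so the
# tutorial's real mappings.json is left untouched.
cat > example-mappings.json <<'EOF'
{
  "mappings": {
    "properties": {
      "dropoff_datetime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "dropoff_location": {
        "type": "geo_point"
      }
    }
  }
}
EOF
echo "wrote example-mappings.json"
```

A date field with an explicit format is what allows timestamps such as "2016-03-03 00:00:00" to be used in range queries, and a geo_point field is what distance-based aggregations operate on.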

Step 3: Bulk Indexing Data

With the index created, it's time to load some data. The index_data.sh script bulk indexes 1 million NYC taxi documents from the compressed bulk-data.json.bz2 file:

sh index_data.sh bulk-data.json.bz2
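
The internals of index_data.sh are not shown here, but bulk requests in OpenSearch-style APIs use a newline-delimited format: one action line followed by one document line per document. The sketch below builds a tiny payload in that shape, assuming Lucenia follows the same convention. The function name `make_bulk_payload`, the index name `bulk_demo`, and the two sample documents are invented for illustration.

```shell
# Print a two-document bulk payload: an action line naming the target
# index, then the document itself, repeated for each document.
make_bulk_payload() {
  index="$1"
  printf '{"index":{"_index":"%s"}}\n' "$index"
  printf '{"dropoff_datetime":"2016-03-04 08:15:00","dropoff_location":[-73.98,40.75]}\n'
  printf '{"index":{"_index":"%s"}}\n' "$index"
  printf '{"dropoff_datetime":"2016-03-05 17:40:00","dropoff_location":[-73.95,40.78]}\n'
}

# Show the payload for a scratch index (hypothetical name).
make_bulk_payload bulk_demo
```

Such a payload would typically be piped to curl with `--data-binary @-` against a `_bulk` endpoint. A scratch index keeps the sample documents out of nyc_taxis.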

Step 4: Querying the Data

Now that the data is indexed, let's run a query. The range query below returns trips whose dropoff_datetime falls between 2016-03-03 and 2016-03-08, with both bounds included (gte and lte):

curl -XGET "https://localhost:9200/nyc_taxis/_search?pretty" \
  -u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
  --insecure \
  --header 'Content-Type: application/json' \
  -d '{
    "query": {
      "range": {
        "dropoff_datetime": {
          "gte": "2016-03-03 00:00:00",
          "lte": "2016-03-08 00:00:00"
        }
      }
    }
  }'
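
To try other date windows without editing the JSON by hand, the request body can be generated from arguments. This is a small sketch only: `range_query` is a hypothetical helper, and it performs no JSON escaping, so it is suitable only for simple values like the timestamps used above.

```shell
# Print a range query body for the given field and inclusive bounds.
# Arguments: field name, lower bound (gte), upper bound (lte).
range_query() {
  printf '{"query":{"range":{"%s":{"gte":"%s","lte":"%s"}}}}\n' "$1" "$2" "$3"
}

# Reproduce the body of the request above.
range_query dropoff_datetime "2016-03-03 00:00:00" "2016-03-08 00:00:00"
```

The output can be passed to curl with `-d "$(range_query ...)"` in place of the inline JSON.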

Step 5: Querying with Analytics

Now let's run a geo-distance aggregation that groups taxi drop-offs into concentric rings (0 to 1, 1 to 3, and 3 to 5 miles) around a point in Manhattan:

curl -XGET "https://localhost:9200/nyc_taxis/_search?pretty" \
  -u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
  --insecure \
  --header 'Content-Type: application/json' \
  -d '{
    "aggs": {
      "manhattan_rings": {
        "geo_distance": {
          "field": "dropoff_location",
          "origin": "POINT (-73.971321 40.776676)",
          "unit": "mi",
          "ranges": [{"to": 1}, {"from": 1, "to": 3}, {"from": 3, "to": 5}]
        }
      }
    }
  }'
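
The ranges array is easy to get wrong when adding more rings, since each ring must start where the previous one ends. As an illustration, the hypothetical helper below (`ring_ranges` is not part of Lucenia or the tutorial repository) derives the array from a list of outer edges.

```shell
# Build a contiguous "ranges" array from a list of ring outer edges.
# The first ring has no "from"; each later ring starts at the previous edge.
ring_ranges() {
  prev=""
  out=""
  for edge in "$@"; do
    if [ -z "$prev" ]; then
      item="{\"to\": $edge}"
    else
      item="{\"from\": $prev, \"to\": $edge}"
    fi
    if [ -z "$out" ]; then
      out="$item"
    else
      out="$out, $item"
    fi
    prev="$edge"
  done
  printf '[%s]\n' "$out"
}

# Reproduce the rings used in the request above.
ring_ranges 1 3 5
```

Note that no ring is open-ended here, so drop-offs more than 5 miles from the origin fall outside every bucket.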

Conclusion

In this tutorial, we covered how to spin up Lucenia, create an index, bulk load data, and run queries with analytics. Lucenia 0.1.0 offers 17% performance improvements over existing platforms, alongside reduced index sizes and resource consumption.

Stay tuned for more tutorials in this series as we dive deeper into Lucenia's advanced features including geospatial search, vector search, and security configurations.