Dr. Nicholas Knize
Co-founder & CTO

Reducing Cloud Storage for Generative AI: Lucenia's Approach to Vector Search

· 7 min read
Dr. Nicholas Knize
Co-founder & CTO

As the hype around Generative AI and Retrieval Augmented Generation (RAG) grows, the number of vector dimensions a database product supports is often used as a measuring stick to compare solutions. Consequently, most search vendors prioritize increasing vector dimensions over more critical factors like signal quality or recall performance. This focus drives up costs for end users, as storing and managing high-dimensional data in the cloud becomes increasingly expensive. Vector compression techniques offer a promising solution to this problem, but they are only one piece of the larger puzzle in the quest for efficient cloud data consolidation.
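As a rough illustration of the kind of compression involved (a hypothetical sketch, not Lucenia's actual implementation), scalar quantization maps each 32-bit float component of an embedding onto an 8-bit integer, cutting vector storage roughly 4x at some cost in precision:

```rust
// Hypothetical scalar quantization sketch: compress f32 embedding
// components to u8 for ~4x storage savings. Production engines layer on
// refinements (per-segment ranges, re-ranking) not shown here.
fn quantize(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let scale = if max > min { 255.0 / (max - min) } else { 0.0 };
    let q = v.iter().map(|&x| ((x - min) * scale).round() as u8).collect();
    (q, min, max) // keep min/max so values can be restored at query time
}

fn dequantize(q: &[u8], min: f32, max: f32) -> Vec<f32> {
    let scale = (max - min) / 255.0;
    q.iter().map(|&b| min + b as f32 * scale).collect()
}

fn main() {
    let embedding = vec![0.0_f32, 0.5, 1.0];
    let (q, min, max) = quantize(&embedding);
    println!(
        "{:?} -> {:?} ({} bytes instead of {})",
        embedding, q, q.len(), embedding.len() * 4
    );
    println!("restored: {:?}", dequantize(&q, min, max));
}
```

The trade-off is visible even in this toy: the restored values are close to, but not exactly, the originals, which is why recall quality matters as much as raw dimension counts.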

Beyond Geo Coordinates: Revolutionizing Spatial Search with Natural Language and Gen AI

· 7 min read
Dr. Nicholas Knize
Co-founder & CTO

In the realm of natural language processing (NLP), Large Language Models (LLMs) have emerged as powerful tools for understanding and generating human-like text. When it comes to geospatial search and location-based queries, however, LLMs rely heavily on location names (e.g., city, state, country) derived from text data rather than grounding queries in physical locations (latitude, longitude) in real-world space. In this blog post, we dive into the limitations of LLMs in geospatial search and explore the need for integrating traditional Geographic Information Science (GIS) techniques to enhance geospatial search for Generative AI applications.
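To make the distinction concrete, answering a query like "coffee shops within 5 km" requires computing over actual coordinates rather than place names. The haversine formula, a staple GIS computation (shown here as an illustrative sketch, not a sample of Lucenia's code), gives the great-circle distance between two latitude/longitude points:

```rust
// Great-circle distance between two (lat, lon) points in kilometers,
// via the haversine formula -- a classic GIS computation that reasoning
// over place names alone cannot replace.
fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    const R: f64 = 6371.0; // mean Earth radius in km
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dp = (lat2 - lat1).to_radians();
    let dl = (lon2 - lon1).to_radians();
    let a = (dp / 2.0).sin().powi(2)
        + p1.cos() * p2.cos() * (dl / 2.0).sin().powi(2);
    2.0 * R * a.sqrt().atan2((1.0 - a).sqrt())
}

fn main() {
    // Orlando, FL to Washington, DC -- two named places an LLM "knows",
    // but only coordinates yield a usable distance.
    let d = haversine_km(28.5384, -81.3789, 38.9072, -77.0369);
    println!("Orlando -> DC: {:.0} km", d);
}
```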

Hidden Cost of Running a Search Cluster

· 9 min read
Dr. Nicholas Knize
Co-founder & CTO

The cost of running a search cluster varies significantly depending on its deployment method. For cloud-based clusters, expenses are incurred through usage of a vendor's services; on-premise clusters, by contrast, require investment in both hardware and software infrastructure. This blog compares the considerations of self-hosting a search cluster with using a managed service or a self-managed cloud deployment. It also offers insights into Lucenia's hybrid approach, designed to decrease cloud expenditures without increasing the level of effort or maintenance required.

Fortify and Thrive: Lucenia's Cyber Security Advantages

· 5 min read
Dr. Nicholas Knize
Co-founder & CTO

Part 1: Memory-Safe Serverless Search Architecture

In software development for multi-user distributed applications, the choice of programming language can significantly impact the security and performance of the application or service. A fundamental consideration in this regard is the distinction between memory-safe and memory-unsafe languages. Memory-safe languages provide built-in mechanisms to prevent common memory-related vulnerabilities, whereas unsafe languages can offer faster performance but are more susceptible to memory corruption and exploitation. In this first part of a two-part blog series, we touch on memory safety, what it means for software security in light of the government's push for heightened security, and how Lucenia prioritizes memory safety and security in its serverless autoscale microservice design.
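As a small illustration of the difference (a generic sketch in Rust, a memory-safe language, not an excerpt from Lucenia's codebase), an out-of-bounds read that would be undefined behavior in a memory-unsafe language is surfaced safely at the language level:

```rust
// In a memory-unsafe language, reading past the end of a buffer is
// undefined behavior and a classic exploitation vector. In Rust, checked
// access returns None, and direct indexing panics deterministically
// rather than silently reading adjacent memory.
fn main() {
    let buffer = vec![10, 20, 30];

    // Safe, checked access: an out-of-range read is observable, not exploitable.
    match buffer.get(99) {
        Some(v) => println!("value: {}", v),
        None => println!("index 99 is out of bounds -- handled safely"),
    }

    // `buffer[99]` would panic with a clear error at runtime,
    // never leak whatever bytes happen to sit past the allocation.
}
```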

Meet Lucenia Live at the 20th Annual GeoINT Symposium

· 2 min read
Dr. Nicholas Knize
Co-founder & CTO

GeoINT Symposium 2024

Ready to save on cloud spend? Book a meeting during the event.


Lucenia is thrilled to announce our participation in the milestone 20th anniversary Geospatial Intelligence Symposium, taking place May 5th to 8th at the prestigious Gaylord Palms in Orlando, Florida. With the theme "GEOINT 2024: Essential in All Dimensions and Domains," this event promises to be a grand convergence of geospatial intelligence expertise across the public and private sectors.

Generative AI and the Curse of Dimensionality: Lucenia's Vector Compression

· 5 min read
Dr. Nicholas Knize
Co-founder & CTO

Generative AI and the Curse of Dimensionality

In AI and Machine Learning applications, the "dimensionality" of a data set refers to the number of features, often called input variables, that represent some real-world phenomenon (e.g., physical objects, conceptual meanings). These features are typically the columns in a data table: the more columns, the more features available to describe the phenomenon. The "samples", or rows in the table, represent the specific objects or phenomena in the real world. These samples typically serve as a "training" set used to produce a new data set that captures the contextual meaning and/or relationships between the objects in the original training data. The resulting data set is referred to as vector embeddings. These vector embeddings serve as numerical representations in a multi-dimensional space and are pivotal in capturing the semantic essence of textual information and real-world relationships. Each vector, typically high-dimensional, encapsulates the features that enable a Generative AI model to comprehend and process language intelligently. In modern Generative AI applications (e.g., OpenAI), vector embeddings commonly range from 256 to 3072 dimensions and often grow with each new model version.
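As a toy illustration of these ideas (the vectors and values below are hypothetical, and far smaller than the 256-3072 dimensions discussed), each vector is one sample's embedding and each position one learned feature; semantic relatedness is then measured geometrically, commonly with cosine similarity:

```rust
// Toy 4-dimensional "embeddings" (real models use hundreds to thousands
// of dimensions). Each vector is one sample; each position one feature.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

fn main() {
    // Hypothetical feature values chosen so related concepts sit close
    // together in the vector space.
    let king = vec![0.9_f32, 0.8, 0.1, 0.3];
    let queen = vec![0.8_f32, 0.9, 0.2, 0.3];
    let banana = vec![0.1_f32, 0.0, 0.9, 0.8];

    // Related concepts score near 1.0; unrelated ones much lower.
    println!("king~queen:  {:.3}", cosine_similarity(&king, &queen));
    println!("king~banana: {:.3}", cosine_similarity(&king, &banana));
}
```

Note that every added dimension adds four bytes per vector per document at float precision, which is exactly why dimensionality drives storage cost at scale.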

Revolutionizing Search Economics & Technology with Self-Hosted to Cloud Autoscaling

· 4 min read
Dr. Nicholas Knize
Co-founder & CTO

Revolutionizing Search Economics

In today's data-driven landscape, the efficiency and cost-effectiveness of search and analytics solutions can make or break a business. Imagine a world where your organization can harness the power of search without being tethered to the cloud, reingesting data every time you adopt a new technology, or succumbing to rigid business models dictated by technology providers. Welcome to the future of search economics – where flexibility, control, and substantial cost savings converge.