Introducing the Ellipse Ingest Processor with Lucenia 0.2+ for Telecom Cellular Data Analysis in Five Steps

Nick Knize

This tutorial will guide you through ingesting and querying cellular tower data using Lucenia's ellipse processing, making it easier than ever to model complex geospatial data in practical, actionable ways.

Accurate modeling and analysis of spatial data is crucial for making well-informed, impactful decisions. Ellipse analysis is especially valuable in fields like military intelligence, weather disaster response, and telecommunications, where directional coverage and reach often vary. For telecommunication applications, ellipses more accurately represent cellular coverage areas, accounting for real-world conditions where signal strength isn’t uniform in all directions. With Lucenia 0.2.0, the first-to-market solution with Lucene 10 support, organizations can ingest, index, and analyze ellipses using Lucenia’s powerful Ellipse Ingest Processor. This technology enables telecom providers to precisely map and analyze cellular tower coverage, assess maintenance needs, and identify overlapping coverage areas in various regions. 

Prerequisites:

  1. Java: Lucenia 0.2+, with Lucene 10 support, now requires JDK 21 or higher.
  2. Docker: Ensure Docker is installed. Follow Docker Installation Instructions.
  3. Lucenia License: Obtain a trial license at Lucenia Cloud or via AWS Marketplace.

Step 1: Start Lucenia 0.2

a. Clone the tutorial repository and source the environment variables

git clone git@github.com:lucenia/lucenia-tutorials && cd lucenia-tutorials/4_ellipse-demo && source env.sh

b. Copy your Lucenia license

cp ~/Downloads/trial.crt node/config

c. Launch Lucenia 0.2

docker compose up

Step 2: Create the Index and Ellipse Ingest Processor

a. Create the cellular_towers index with the proper mappings

To store our cellular tower data with the proper structure, we’ll create an index using the field mappings defined in mappings.json:

curl -XPUT "https://localhost:9200/cellular_towers" \
-u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
--insecure \
--header 'Content-Type: application/json' \
--data-binary "@mappings.json"
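The contents of mappings.json are not reproduced in this tutorial. As a rough sketch only, a minimal mapping consistent with the fields queried later (an ellipse geo_shape field and a maintenance.installation_date date field) might look like the following. The exact structure is an assumption; the file shipped in the tutorial repository is authoritative and likely contains more fields.

```python
import json

# Hypothetical minimal mapping, inferred from the fields used in Steps 4 and 5.
# The real mappings.json in the tutorial repository may differ.
mappings = {
    "mappings": {
        "properties": {
            # Polygon approximation written by the ellipse processor
            "ellipse": {"type": "geo_shape"},
            "maintenance": {
                "properties": {
                    # Used by the range filter and the date_histogram aggregation
                    "installation_date": {"type": "date"}
                }
            },
        }
    }
}

print(json.dumps(mappings, indent=2))
```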

b. Create the Ellipse Ingest Processor

The Ellipse Ingest Processor parses ellipses from WKT or GeoJSON and approximates them as polygons for efficient spatial indexing. Here, we create an ingest pipeline called ellipse_ingest that contains the processor; with error_distance set to 0.01, it indexes ellipses with centimeter accuracy:

curl -XPUT "https://localhost:9200/_ingest/pipeline/ellipse_ingest" \
-u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
--insecure -H 'Content-Type: application/json' \
-d '{
  "description" : "Index ellipses from WKT or GeoJSON",
  "processors" : [
     {
       "ellipse" : {
         "field" : "ellipse",
         "error_distance" : 0.01,
         "shape_type" : "geo_shape"
       }
     }
   ]
 }'

To speed up indexing and save space, increase the processor's error_distance from 0.01 (1 centimeter) to a larger distance, such as 0.1 (10 centimeters). Distances with units can also be used (e.g., “1m”).
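To see why a larger error distance saves space, it helps to estimate how many polygon vertices a given tolerance requires. The sketch below is not Lucenia's implementation; it is a planar approximation that assumes the ellipse is sampled at evenly spaced parametric angles and uses the circle of radius equal to the semi-major axis as a conservative error bound. The function names segments_for_error and ellipse_ring are hypothetical.

```python
import math


def segments_for_error(semi_major_m: float, error_m: float) -> int:
    """Segments needed so the polygon deviates from the curve by at most error_m.

    A regular n-gon inscribed in a circle of radius r deviates from the
    circle by at most r * (1 - cos(pi / n)). Using r = semi-major axis is
    conservative for an ellipse.
    """
    if not 0 < error_m < semi_major_m:
        raise ValueError("error_m must be positive and smaller than semi_major_m")
    half_angle = math.acos(1.0 - error_m / semi_major_m)
    return max(3, math.ceil(math.pi / half_angle))


def ellipse_ring(cx, cy, semi_major_m, semi_minor_m, rotation_deg, n):
    """Closed ring of n + 1 planar (x, y) points, in meters."""
    rot = math.radians(rotation_deg)
    ring = []
    for i in range(n):
        t = 2.0 * math.pi * i / n
        x = semi_major_m * math.cos(t)
        y = semi_minor_m * math.sin(t)
        # Rotate about the center, then translate
        ring.append((cx + x * math.cos(rot) - y * math.sin(rot),
                     cy + x * math.sin(rot) + y * math.cos(rot)))
    ring.append(ring[0])  # close the ring
    return ring


# Compare vertex counts for a 1 km semi-major axis at two tolerances
for err in (0.01, 0.1, 1.0):
    print(err, segments_for_error(1000.0, err))
```

Under these assumptions, relaxing the tolerance by a factor of ten cuts the vertex count by roughly a factor of three, which is where the indexing speed and storage savings come from.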

Step 3: Bulk Index Cellular Tower Data

Using the Ellipse Ingest Processor, we’ll index a dataset of 142,100 cellular towers, each with an ellipse coverage area, location, and metadata.

a. Bulk load the data:

sh index_data.sh bulk-data.json.bz2

b. Verify all documents are indexed

curl "https://localhost:9200/cellular_towers/_count?pretty" \
-u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
--insecure

This command returns the number of indexed documents; it should match the 142,100 towers in the dataset. The image below shows an example of an indexed ellipse.

Example ellipse for visualization purposes only
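The count check from step 3b can also be scripted. The sketch below assumes the _count response is a JSON object with a top-level "count" field, as in OpenSearch-style APIs; the sample response string is illustrative, not captured output, and the helper name all_indexed is hypothetical.

```python
import json

EXPECTED_TOWERS = 142_100  # dataset size stated in this tutorial


def all_indexed(count_response: str, expected: int = EXPECTED_TOWERS) -> bool:
    """Return True when the reported document count matches the expected total."""
    body = json.loads(count_response)
    return body.get("count") == expected


# Illustrative response body (assumed shape, not real output)
sample = '{"count": 142100, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}'
print(all_indexed(sample))
```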

Step 4: Query Tower Data by Overlapping Coverage Area

Next, we’ll use a geo_shape query with multi-polygon boundaries to identify towers whose coverage overlaps Hill County, Texas and Tarrant County, Texas, filtering those installed within the last two years.

curl "https://localhost:9200/cellular_towers/_search?pretty" \
-u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
--insecure --header 'Content-Type: application/json' \
-d '{
  "_source": false,
  "query": {
    "bool": {
      "must": [
        {
          "geo_shape": {
            "ellipse": {
              "shape": {
                "type": "multipolygon",
                "coordinates": [
                  [
                    [
                      [-97.382, 32.897],
                      [-97.382, 32.396],
                      [-96.682, 32.396],
                      [-96.682, 32.897],
                      [-97.382, 32.897]
                    ]
                  ],
                  [
                    [
                      [-97.382, 32.897],
                      [-97.382, 32.396],
                      [-96.682, 32.396],
                      [-96.682, 32.897],
                      [-97.382, 32.897]
                    ]
                  ]
                ]
              },
              "relation": "intersects"
            }
          }
        }
      ],
      "filter": {
        "range": {
          "maintenance.installation_date": {
            "gte": "now-2y"
          }
        }
      }
    }
  }
}'

This query identifies towers with coverage areas overlapping Hill and Tarrant Counties that were installed in the last two years, a valuable insight for maintenance scheduling and planning.
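When the search regions change often, it can be easier to build the request body programmatically. The sketch below assembles the same kind of query: a geo_shape multipolygon combined with a date range that keeps towers installed within the last two years ("gte": "now-2y"). The helper names bbox_ring and coverage_query are hypothetical, and the rectangle is only a stand-in for a real county boundary.

```python
import json


def bbox_ring(min_lon, min_lat, max_lon, max_lat):
    """Closed, counter-clockwise [lon, lat] ring for a bounding box."""
    return [
        [min_lon, max_lat],
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
    ]


def coverage_query(polygons, installed_since="now-2y"):
    """Search body: coverage intersects any polygon AND installed on/after the date."""
    return {
        "_source": False,
        "query": {
            "bool": {
                "must": [
                    {
                        "geo_shape": {
                            "ellipse": {
                                "shape": {"type": "multipolygon", "coordinates": polygons},
                                "relation": "intersects",
                            }
                        }
                    }
                ],
                "filter": {
                    "range": {"maintenance.installation_date": {"gte": installed_since}}
                },
            }
        },
    }


# Each polygon is a list of rings; the first ring is the outer boundary
region = [bbox_ring(-97.382, 32.396, -96.682, 32.897)]
print(json.dumps(coverage_query([region]), indent=2))
```

The printed JSON can be passed to curl with -d, in the same way as the hand-written body.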

Step 5: Analyze Infrastructure by Installation Year

Finally, to understand the distribution of tower installations over time, we’ll aggregate the towers matched by a geo_shape query on maintenance.installation_date to see the number of installations per year.

curl "https://localhost:9200/cellular_towers/_search?pretty" \
-u "admin:${LUCENIA_INITIAL_ADMIN_PASSWORD}" \
--insecure --header 'Content-Type: application/json' \
-d '{
  "_source": false,
  "query": {
    "geo_shape": {
      "ellipse": {
        "shape": {
          "type": "multipolygon",
          "coordinates": [
            [
              [
                [-97.382, 32.397],
                [-97.202, 32.292],
                [-97.132, 31.892],
                [-96.982, 31.702],
                [-97.382, 31.902],
                [-97.382, 32.397]
              ]
            ],
            [
              [
                [-97.580, 33.046],
                [-97.430, 32.946],
                [-97.180, 32.896],
                [-97.280, 32.796],
                [-97.580, 32.946],
                [-97.580, 33.046]
              ]
            ]
          ]
        },
        "relation": "intersects"
      }
    }
  },
  "aggs": {
    "installations_by_year": {
      "date_histogram": {
        "field": "maintenance.installation_date",
        "calendar_interval": "year"
      }
    }
  }
}'

This query provides a breakdown of installations by year, giving insights into the infrastructure's age and guiding future maintenance and upgrade planning.
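The aggregation result can be reduced to a simple year-to-count table. The sketch below assumes the response follows the usual date_histogram layout, in which each bucket carries a "key" in epoch milliseconds and a "doc_count". The sample response and its counts are made up for illustration, and the helper name installations_per_year is hypothetical.

```python
from datetime import datetime, timezone


def installations_per_year(response: dict) -> dict:
    """Map each bucket's year (from its epoch-millisecond key) to its doc_count."""
    buckets = response["aggregations"]["installations_by_year"]["buckets"]
    return {
        datetime.fromtimestamp(b["key"] / 1000, tz=timezone.utc).year: b["doc_count"]
        for b in buckets
    }


# Illustrative response fragment; the counts are invented, not query results
sample_response = {
    "aggregations": {
        "installations_by_year": {
            "buckets": [
                {"key_as_string": "2022-01-01T00:00:00.000Z", "key": 1640995200000, "doc_count": 12},
                {"key_as_string": "2023-01-01T00:00:00.000Z", "key": 1672531200000, "doc_count": 7},
            ]
        }
    }
}

for year, count in sorted(installations_per_year(sample_response).items()):
    print(year, count)
```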

All set! 

Ellipse analysis with Lucenia enables organizations to gain a deeper understanding of spatial data, empowering them to make targeted decisions in network planning, maintenance, and resource allocation. In this tutorial, we set up Lucenia 0.2+, created an ingest pipeline with the new Ellipse Ingest Processor, and ran combined spatial and date-range queries and aggregations to identify and analyze cellular tower coverage in specific regions. These steps provide a solid foundation for further exploration, such as integrating real-time analytics to monitor network performance or automating maintenance scheduling based on coverage data. Lucenia is designed to scale with your needs, offering a comprehensive suite of tools for geospatial analysis.

For those looking to expand their capabilities, Lucenia also provides additional tools for integrating with other spatial data, enabling deeper analyses such as overlap with land usage, population density, or environmental factors. Ready to see how Lucenia can drive your data insights? Lucenia is now available on AWS Marketplace for streamlined access and deployment. Experience the power of first-to-market Lucene 10 technology for spatial analysis today!