Mastering search: The semantic edge

January 19, 2024

Primer

This post is archived.

References & links may be out of date

The US government has a massive search challenge. It generates billions of pages every year, from tax audits and budget reports to the president’s daily intelligence brief. Classified data alone is growing at an estimated rate of 50 million documents per year.[1]

When searching for information, no one has a tougher job than intelligence analysts. Crucial information is scattered among data silos, each with its own search tools. Locating the right information is the most time-intensive part of an analyst's day, leaving only a thin sliver of time to read, synthesize, and write.

Artificial intelligence can help analysts find the information they need, particularly with a new technique called semantic search. But to understand the analysts’ dilemma, one must understand the tool they use today: boolean search.

A Search Nightmare

Crafting a good boolean query is a painfully learned art. Consider this boolean query used to find relevant documents about Ukraine’s relationship with NATO member states:

In boolean search, an analyst can’t simply search for “Ukraine foreign relations” because that will retrieve only documents with those exact words. Intelligence analysts spend hours honing their boolean searches over and over until, finally, they get the results they need. Good booleans get passed down from analyst to analyst like treasured family heirlooms.
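To make the exact-word problem concrete, here is a minimal sketch of keyword matching. It assumes the simplest possible boolean behavior, an implicit AND across every query term, and the two sample documents are invented for illustration. Real boolean search tools offer far more operators, so treat this as a toy rather than a description of any analyst's actual system.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_and_search(query, documents):
    """Return the documents that contain every query term as an exact token."""
    terms = tokenize(query)
    return [doc for doc in documents if terms <= tokenize(doc)]


# Hypothetical documents, made up for this example.
documents = [
    "Ukraine foreign relations were discussed at the summit.",
    "Kyiv strengthened diplomatic ties with Poland and the Baltic states.",
]

hits = boolean_and_search("Ukraine foreign relations", documents)
for doc in hits:
    print(doc)
```

Only the first document comes back. The second is arguably just as relevant, but it never uses the words "Ukraine", "foreign", or "relations", so keyword matching cannot see it.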

Dream Search

What if our analyst could search with a plain-language query? She types “What are Ukraine’s relationships with NATO member states?” and the documents that come back are about exactly that, regardless of whether they include those exact words. Even better, the most relevant parts of the retrieved documents are highlighted.

This is semantic search. Rather than having the user define what counts as relevant, the search system “understands” the meaning of the user’s query and retrieves documents with truly relevant content.

In semantic search, every chunk of text in the documents, along with the analyst’s query, is assigned an address in a high-dimensional “semantic space”. The system then gathers up the documents that sit in the same semantic neighborhood as her query. The closer a document is to the query in the semantic space, the higher its relevance. The breakthrough that makes this possible is an AI tool called a language model.
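The idea of a semantic neighborhood can be sketched in a few lines. The three-dimensional vectors below are hand-written stand-ins; in a real system a language model would produce the addresses, with hundreds or thousands of dimensions. Cosine similarity is used here as the measure of closeness, which is a common choice but an assumption on our part, not a statement about any particular product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, chunk_vecs):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in chunk_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy addresses, invented for illustration. A language model would
# normally compute these from the text itself.
TOY_EMBEDDINGS = {
    "Kyiv deepens diplomatic ties with Poland": [0.9, 0.8, 0.1],
    "NATO summit discusses Ukraine membership": [0.8, 0.9, 0.2],
    "Quarterly budget report for the tax office": [0.1, 0.0, 0.9],
}

# Stand-in address for the analyst's question about Ukraine and NATO.
query_vec = [0.85, 0.85, 0.1]

ranking = rank_by_similarity(query_vec, TOY_EMBEDDINGS)
for text, score in ranking:
    print(f"{score:.3f}  {text}")
```

The two diplomacy chunks land close to the query and the budget report lands far away, even though none of the chunks was matched on keywords.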

Voilà! Now you understand the core concept behind semantic search.

Even Better Search

What if instead of search, our analyst could just ask the question that motivated her search in the first place?

This is semantic question-answering. The trick is to add a large language model to the end of the search flow. Taking the user’s question and search results, the model generates a direct answer to the user’s question, including in-line citations to sources.
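As a rough sketch of that last step, the code below numbers the retrieved passages, places them in a prompt alongside the question, and hands the prompt to a model. Everything here is hypothetical: the function names, the prompt wording, and the `generate` callable are illustrative assumptions, and `stub_model` is a placeholder that lets the example run without a real language model.

```python
def build_prompt(question, passages):
    """Combine the question and numbered passages into a single prompt."""
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources in-line as [n].\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_question(question, passages, generate):
    """Send the assembled prompt to `generate`, any callable str -> str."""
    return generate(build_prompt(question, passages))


def stub_model(prompt):
    """Placeholder for a large language model; returns a canned string."""
    return "Stub answer citing [1]."


# Hypothetical search results, made up for this example.
passages = [
    "Kyiv deepens diplomatic ties with Poland",
    "NATO summit discusses Ukraine membership",
]
question = "What are Ukraine's relationships with NATO member states?"

print(build_prompt(question, passages))
print(answer_question(question, passages, stub_model))
```

Numbering the sources in the prompt is what lets the model's answer point back to specific passages, so the analyst can check each claim against its source.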

Primer has semantic question-answering deployed in products serving US Defense and IC customers. Below is a screenshot showing it in action.

With the help of AI, analysts can now engage with the vast corpus of government data in a more natural way, unburdened by keyword limitations. This translates to faster analysis and faster decision-making.

[1] https://www.nytimes.com/2023/01/27/briefing/classified-documents-government.html

