Process all of your documents, emails, PDFs, text messages, and social media to find the information that matters most. Primer Extract uses cutting-edge machine learning tools to help you explore your data quickly, and at scale. Going beyond keyword search, Extract also gives you translation, OCR, and image recognition capabilities.
Parse all available data sources, including emails, PDFs, Word documents, text messages, social media, and even handwritten notes. Quickly extract valuable information, no matter how large the volume of data.
Use customizable machine learning models to instantly pinpoint the people, places, topics, or other details that are critical to your intelligence operations.
Find relevant information across languages, as well as in image, video, or audio files, and add it to your knowledge base as searchable English text.
Multiple users can work on the same data set at the same time and get to insights faster. Benefit from the collective intelligence of your team as you train models together, share information, and support each other.
Extract people, organizations, locations, or dozens of other types of information from your text sources and use them to return comprehensive search results.
U.S. Special Operations forces capture an enormous amount of material during enemy raids, including video, audio, and text files. Intelligence analysts must explore and analyze this data as quickly as possible in order to surface useful intelligence. With a skyrocketing volume of captured data, analysts can exploit only a small portion of the material using manual processes. Primer Extract enables analysts to find hidden intelligence within vast troves of messy multimedia data pulled from laptops, hard drives, and mobile phones. Deployed with Primer Automate, Extract also provides rapid data ingestion. Analysts who are not data scientists can train machine learning models on the fly or choose from a menu of prebuilt options, then rapidly triage material with integrated language translation, image recognition, speech-to-text, and handwriting recognition.
One of the first steps in addressing a data breach is to determine the extent of exposure. Say that a hundred thousand company emails containing confidential information were leaked online. Primer Extract can help the company’s cybersecurity team quickly parse millions of websites, emails, text messages, and social platform messages to determine what data was leaked and where, so they can take the appropriate next steps.
Many large, long-established organizations maintain vast quantities of records that span many years of operations. These records are often kept in different formats and systems, in the cloud and on-premises, making it hard to locate specific data easily. Primer Extract helps employees or external individuals like journalists or auditors search through the organization’s entire collection of files to find specific topics of interest in legacy documents. The entire process takes only seconds, rather than the hours or days it would take with manual research processes.
Extract allows you to upload, search, and understand any kind of unstructured data, including text documents, images, PDFs, emails, and video. It uses Primer’s NLP engines and integrated computer vision, audio, and video machine learning models to process the data.
Extract can translate text written in any detectable language into English so that English-speaking users can search through and understand the data. It works with over 109 languages, including Farsi, Chinese, French, Russian, Spanish, and Portuguese.
Yes. Extract’s integrated object detection turns objects depicted in images into searchable text. For example, you can find all images that contain weapons by searching for “weapon” on the “Explore” page.
Individual projects can consist of hundreds of thousands, or even millions, of documents and files. Search models can be transferred between projects of various sizes, so an unlimited amount of data can be searched through using the same tools.
Absolutely. Extract has been optimized to work in low-resource environments, making it easy to “bring it with you” to wherever you need to search through document caches.
Yes. Extract has integrated, world-class Optical Character Recognition (OCR) technology to read most handwritten and scanned documents across multiple languages.
Primer Extract’s “Explore” page allows you to specify search criteria using filters associated with over 20 different data attributes. Some filters are as simple as “document creator,” “document language(s),” and “date created.” Powered by machine learning, filters can detect entity types and the complex relationships between them, such as the people, places, and topics being discussed in the document.
Yes. Primer Extract allows you to specify whether a project should be shared with your team or be private. Team members can also share and reuse their custom NLP models.
Yes. You can easily prioritize and export a list of documents on Extract’s “Review” page, allowing you to create a quick compilation of key information in your project.
Our services team can help you connect Extract to your existing database or other internal system and upload extracted data.
Yes. Extract keeps track of conversations and messaging threads while maintaining parent-child relationships between documents.
In addition to using Extract’s out-of-the-box exploration and search tools, you can build a lightweight machine learning model that prioritizes the documents you need to review. It’s as easy as defining what you’re looking for, manually reviewing 30 to 60 examples, and letting the model check the remaining tens of thousands of files for matches.
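To give a feel for the "review a few examples, then let a model rank the rest" workflow, here is a minimal sketch in Python. It is an illustration only and does not show how Extract works internally: the choice of a Naive Bayes classifier, the names `tokenize`, `RelevanceModel`, and `prioritize`, and the sample documents are all assumptions made for this example. In practice you would label 30 to 60 documents rather than four.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class RelevanceModel:
    """Hypothetical stand-in: multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, labeled_docs):
        """Learn from (text, is_relevant) pairs produced by manual review."""
        for text, relevant in labeled_docs:
            tokens = tokenize(text)
            self.word_counts[relevant].update(tokens)
            self.doc_counts[relevant] += 1
            self.vocab.update(tokens)

    def score(self, text):
        """Return the log-odds that a document is relevant (higher = review sooner)."""
        total_docs = self.doc_counts[True] + self.doc_counts[False]
        # Smoothed class priors.
        log_odds = math.log((self.doc_counts[True] + 1) / (total_docs + 2))
        log_odds -= math.log((self.doc_counts[False] + 1) / (total_docs + 2))
        vocab_size = len(self.vocab)
        n_relevant = sum(self.word_counts[True].values())
        n_irrelevant = sum(self.word_counts[False].values())
        for token in tokenize(text):
            if token not in self.vocab:
                continue  # Words never seen during review carry no evidence.
            p_relevant = (self.word_counts[True][token] + 1) / (n_relevant + vocab_size)
            p_irrelevant = (self.word_counts[False][token] + 1) / (n_irrelevant + vocab_size)
            log_odds += math.log(p_relevant) - math.log(p_irrelevant)
        return log_odds


def prioritize(model, docs):
    """Sort unreviewed documents so the most likely matches come first."""
    return sorted(docs, key=model.score, reverse=True)


# Made-up sample data: a handful of manually reviewed documents.
reviewed = [
    ("shipment of weapons arrives at the port on friday", True),
    ("weapons cache moved to the warehouse", True),
    ("lunch menu for the office party", False),
    ("reminder: office closed on friday", False),
]
unreviewed = [
    "office party photos attached",
    "new weapons shipment expected at the warehouse",
    "weather forecast for the weekend",
]

model = RelevanceModel()
model.train(reviewed)
ranked = prioritize(model, unreviewed)
for doc in ranked:
    print(doc)
```

The ranking is a review queue, not a final decision: documents near the top are the ones most worth a human look first, and labeling more examples refines the order.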