An Inside Look at Primer’s R&D

Primer’s R&D and Engineering teams are working to solve some of the most complex and challenging problems in machine learning. Here’s where Primer stands apart—and what’s coming next in a rapidly advancing field.

For machine learning engineering teams trying to implement NLP, selecting the right tools for the organization is a challenge. What does it look like on the ground to solve problems at a deep learning NLP company, one where you’re sometimes building your own tools? Here’s how we solve problems at Primer, and where we’re focusing our research and development next.

Read More: The 5 Predictions for NLP

1. Zero-shot entity recognition

Natural language processing is built on models that recognize entities – for example, people’s names, or specific classes like “tanks,” “trees,” or “fruits.” The classic ML process is to gather a dataset labeled with those entities and then train the model to recognize them. A zero-shot entity recognition model, by contrast, can recognize arbitrary entities without further training. In other words, with a zero-shot model you essentially skip training: you hand users the model and it works – in “zero shots.” The closest thing in machine learning today is the “few-shot model,” which does the same with a small amount of training.

Still in R&D at Primer, zero-shot entity recognition is something no company has pulled off publicly to date. Primer’s R&D engineers recently created a model that can theoretically perform zero-shot recognition – essentially a “Version 0” zero-shot model. The architecture and parameters are in place, but performance isn’t yet high enough to deploy, prompting additional R&D. The current zero-shot model is trained on thousands of entity classes and documents.

Much like a human can infer the meaning of a new word from context, a zero-shot model should, in theory, be able to recognize any word in the English language, even one it was never trained on. In short, the zero-shot model is trained on so many entities that it can generalize to entities it hasn’t seen yet. As for the impact? Months of work building a model could be reduced to just a few hours.
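To make the idea concrete, here is a minimal sketch of zero-shot entity typing using an off-the-shelf NLI-based zero-shot classifier from Hugging Face. This is not Primer’s unreleased model; the candidate span, label set, and checkpoint are illustrative assumptions.

```python
# A rough illustration of the zero-shot idea: the label set is chosen at
# inference time, with no task-specific training ("zero shots"). Uses the
# public facebook/bart-large-mnli checkpoint, not Primer's model.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

# Hypothetical candidate span extracted from a document.
span = "tanks"
labels = ["military vehicle", "tree", "fruit", "person"]

result = classifier(span, candidate_labels=labels)
print(result["labels"][0])  # highest-scoring entity type
```

A production system would also need to propose candidate spans from raw text; this snippet only shows the typing step, where the arbitrary labels come in.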

2. Inference triage

GPU usage is a major part of a company’s spend when investing in ML and NLP. But what if that cost could be reduced by up to 85% simply by changing the way a model is deployed? 

To solve this problem of expensive GPU run time, Primer created inference triage: reducing computational load by training a cheaper, faster ML model that outsources tasks to a larger, more compute-hungry model only when necessary. That’s the key – tasks are outsourced only when necessary. Let’s break down how this works.

At Primer, we like to have fun along the way, so we named this project “BabyBear” – that’s the cheaper, faster, “younger” model. BabyBear defers to another model, named “MamaBear,” for questions it cannot answer on its own. The current version of the inference triage algorithm works on classification and entity recognition tasks. In this algorithm, predictions from the MamaBear model are treated as the gold-labeled training dataset for the BabyBear model. For each input, if BabyBear is confident in its prediction, that prediction is final; otherwise, MamaBear is called.
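Here is a minimal sketch of that cascade. The `baby_bear` and `mama_bear` callables and the confidence threshold are illustrative assumptions, not Primer’s published implementation.

```python
# Sketch of the inference-triage pattern: answer with the cheap model
# when it is confident, and fall back to the expensive model otherwise.
from dataclasses import dataclass

@dataclass
class Prediction:
    label: str
    confidence: float

def triage_predict(text, baby_bear, mama_bear, threshold=0.9):
    """Return BabyBear's label when confident; else call MamaBear."""
    cheap = baby_bear(text)
    if cheap.confidence >= threshold:
        return cheap.label           # GPU-heavy call avoided
    return mama_bear(text).label     # outsource the hard case

def fit_baby_bear(texts, train_baby_bear, mama_bear):
    """MamaBear's outputs serve as gold labels for BabyBear, as described above."""
    gold = [(t, mama_bear(t).label) for t in texts]
    return train_baby_bear(gold)
```

The savings come from how often the threshold check short-circuits: every confident BabyBear answer is one fewer call to the large model.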

Depending on the task, GPU run time was reduced by up to 85% in Primer’s model testing. 

Reducing GPU needs has profound effects on cost in ML, and it also increases hardware flexibility for on-prem deployments. More efficient models mean AI can run on a wider range of hardware with a smaller footprint.

Primer has already started deploying this approach on Named Entity Recognition (NER), but it could be applied to a much wider variety of text tasks. Even better? Reducing GPU usage not only cuts run time and cost, it also significantly reduces the carbon footprint of running the models – a significant need in NLP.

3. Custom summarization  

The ability of ML models to not just read and write, but truly understand and synthesize information, is one of the most desired aspects of NLP. 

In partnership with Summari, Primer’s forward-deployed engineering team used Primer’s Platform to train and deploy a customized text-to-text summarization model. Coupled with Primer’s data ingestion pipeline, the custom model instantly delivers human-quality summaries of any long-form article on the internet. Faster delivery times allowed Summari to expand its offerings from a few dozen publications to the entire internet. The market’s reception has been strong, earning Summari the top spot on Product Hunt. It’s a process that could be repeated with other models.
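Primer’s custom model and Platform APIs aren’t public, but as a rough sketch of the task, here is what a text-to-text summarization call looks like with an open-source checkpoint; the model choice and naive truncation are assumptions.

```python
# Illustrative only: summarize a long-form article with a public
# text-to-text model standing in for the custom model described above.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = open("article.txt").read()
# BART's input window is limited (~1024 tokens); real pipelines chunk
# long articles, but a character cutoff keeps this sketch simple.
summary = summarizer(article[:3500], max_length=130, min_length=40,
                     do_sample=False)
print(summary[0]["summary_text"])
```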

It’s a breakthrough not just for Summari, but for NLP technology itself. An instant, human-quality summary of a massive amount of information displays the power of NLP.

4. Topic modeling 

Topic modeling is a feature Primer developed for a leading business publication. The publication has a large volume of content, but customers struggled to search it and find what they needed, and the publication wanted to explore how ML pipelines could help.

Primer’s team built a custom text-to-text model to tag content against the publication’s enterprise taxonomy of 700 terms. The result was a model that helped automate the tagging of articles and cases, producing a better, cleaner taxonomy with clear hierarchy and categorization. The same solution could apply to any large business with a corpus of data, both saving the labor time and cost of manual curation and providing better metadata for a better search and user experience.
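A minimal sketch of text-to-text tagging, assuming a T5-style model fine-tuned to emit comma-separated taxonomy terms; the checkpoint name and prompt format here are hypothetical, not the model Primer built.

```python
# Hypothetical: a seq2seq model fine-tuned so its generated text is a
# list of taxonomy terms. "example-org/taxonomy-tagger" is a stand-in
# for a privately fine-tuned checkpoint, not a real model.
from transformers import pipeline

tagger = pipeline("text2text-generation",
                  model="example-org/taxonomy-tagger")

article = ("The board approved the cross-border merger after a lengthy "
           "antitrust review by regulators in both markets.")
output = tagger("tag: " + article, max_length=32)[0]["generated_text"]
print(output.split(", "))  # e.g. ["mergers & acquisitions", "antitrust"]
```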

5. Synthetic text detection

Detecting the spread of disinformation is a top priority for many NLP providers. How can analysts make sense of their world at scale without accurate detection of the networks and bots that seek to manipulate data? The implications are immense. Bots are often detectable because they post the same message over and over, and a bot’s profile often has “tells”: imperfect use of language, a single recurring theme across posts, and numerous bot followers. But these tells are getting increasingly difficult to spot as synthetic text generation advances. Primer Command can already detect synthetic text and disinformation within specific data sources. Even more valuable, Primer’s R&D engineers are developing an advanced version of this technology that could be deployed on any available data.
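Primer Command’s detector is proprietary, but as a rough illustration of the task, here is how a public synthetic-text classifier scores a post; the checkpoint and its “Real”/“Fake” labels follow the open GPT-2 output detector’s model card and are assumptions about what a production system would look like.

```python
# Illustrative only: score a post with the public GPT-2 output detector,
# a RoBERTa classifier fine-tuned to flag machine-generated text.
# Production detectors must cover far newer generators than GPT-2.
from transformers import pipeline

detector = pipeline("text-classification",
                    model="openai-community/roberta-base-openai-detector")

post = ("Officials have confirmed the incident was entirely staged by "
        "foreign actors seeking to destabilize the region.")
print(detector(post)[0])  # e.g. {'label': 'Fake', 'score': 0.93}
```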

6. An end-to-end, full-stack platform for natural language processing

Doing all of this R&D is a lot of work, and the last thing data scientists want to do is set up the infrastructure around it. Primer is building infrastructure that lets its data scientists and forward-deployed team build tools, models, and systems faster. That means instead of using one vendor to ingest data, another to label data, and yet another to build and deploy models, each of these steps lives together on a unified platform.

We also built Data Map to help data scientists find mislabeled examples automatically. Spending less time labeling and wrangling data means more time creating ML models.
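Data Map’s internals aren’t described here, but a common approach to automated mislabel detection is to flag examples where a trained model confidently disagrees with the given label. The sketch below shows that general technique, as an assumption about how such tools work rather than a description of Data Map itself.

```python
# Hypothetical sketch: flag likely mislabeled examples by comparing a
# model's confident predictions against the dataset's labels.
import numpy as np

def find_suspect_labels(probs, labels, threshold=0.9):
    """probs: (n_examples, n_classes) out-of-fold predicted probabilities.
    labels: (n_examples,) integer labels as given in the dataset.
    Returns indices where the model confidently predicts another class."""
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    return np.where((pred != labels) & (conf >= threshold))[0]

# Example: the second row is labeled 0, but the model says class 1
# with 95% confidence, so index 1 is flagged for review.
probs = np.array([[0.8, 0.2], [0.05, 0.95], [0.6, 0.4]])
labels = np.array([0, 0, 1])
print(find_suspect_labels(probs, labels))  # -> [1]
```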

It also means ML teams don’t have to build and maintain this complex toolchain – one so complex that many projects fail. Engineering and data science teams now have the full set of tools to do this for any use case or data source.

For more on Primer’s products and infrastructure, visit the resource library.