Applying machine learning at Primer to extract all the people, places, and things named in Harry Potter fan fiction books
Welcome to the age of machine-generated headlines.
NLP technologies are also eroding the tradeoff analysts have historically had to make between timely judgments and judgments based on a comprehensive analysis of available intelligence. These technologies enable analysts to read in each morning in a fraction of the time and to interact with all of the reports hitting their inboxes each day, not just those flagged as highest priority or from the most prominent press outlets. The effect of these algorithms goes beyond increasing the speed and scale at which individual analysts can operate; they also mitigate the hitherto unavoidable analytic biases that come from relying on a narrow set of sources. This lowers the cost analysts face for pursuing hunches and exploring new angles on vexing issues, and it creates time for them to learn about new issues.
Enterprise software has typically been a challenge to get into the hands of customers, especially when it involves integration with their own infrastructure. By leveraging cloud-native technologies, Primer has reduced implementation time and effort for its customers.
Human-generated knowledge bases like Wikipedia have excellent precision but poor recall. To help the humans, we created a self-updating knowledge base that can describe itself in natural language. We call it Quicksilver.
We’re thrilled to be recognized by the World Economic Forum today as a 2018 Technology Pioneer. Primer is one of the 61 companies selected from around the world that are building technologies with a significant impact on both business and society.
It has become standard practice in the Natural Language Processing (NLP) community: release a well-optimized English corpus model, and then procedurally apply it to dozens (or even hundreds) of additional foreign languages. These secondary language models are usually trained in a fully unsupervised manner. They're published on arXiv a few months after the initial English version, and it all makes a big splash in the tech press. But can English-trained models be naively extended to supplementary non-English languages, or is some native-level understanding of a language required prior to a model update?
Let's get our hands dirty and train a state-of-the-art deep learning model to summarize news articles. We'll introduce TensorFlow, discuss how to set up the training task, and present some tips for implementing a seq-to-seq model using RNNs. Once we have a working model, we'll dive into some insights on how to train the model much more quickly (decreasing training time from three days to under half a day).
We analyzed the things journalists described with color in 33 million English-language news articles published in 2017. This was the year of black holes, white supremacists, and pink hats.
We love getting more information: from news, social media, and even blog posts. Given the bottleneck of reading long-form text, wouldn't it be amazing if we could immediately grasp its main ideas?
Algorithmically generating human-level summaries is a challenging task: it requires identifying entities, concepts, and relationships, and converting learned information into grammatical sentences. In this post, we'll look at how the latest deep learning methods, using recurrent neural networks and attention mechanisms, tackle these tasks to bring you smarter summaries.
At Primer we deploy natural language processing (NLP) pipelines that need to support many different languages, including English, Russian, and Chinese. One powerful NLP approach is to apply machine learning techniques to raw text by representing words as vectors. In this post, we look at how to encode Chinese words as vectors. But are algorithms developed for English NLP effective on Chinese text? How can we take advantage of the unique linguistic features of the Chinese language?
What would a map of the world's attention look like? What if you compared Russian vs. English speakers? We chose the topic of terrorism to test out our first prototype of this visualization. We call it a diff map.
We are Primer, a machine intelligence company based in San Francisco.