When you read the news, you are taking part in a massive experiment. It plays out day by day, minute by minute. The output is the answer to a question: Among all the events unfolding around the world, where do we pay attention?
No one decides. The answer emerges from the activity of millions of journalists and commentators competing for the attention of billions of people. The news media shines its spotlights across a vast, dark landscape. A breaking news report suddenly focuses attention on a far-flung location. Other news sources join in, and that location’s share of the world’s attention grows. For a highly significant event, the bright focus might last a week. But usually the spotlight moves on by the end of the day.
While our attention was focused on this one event in one place, what were we missing? Other highly relevant events were surely unfolding. Most were barely noticed, not for lack of significance but simply because human attention is limited.
Russian vs. English-language media attention on terrorism on 25 May 2016
If you want to see what you missed, you need a map of the world’s attention in time and space. Here’s how to build one. For any given topic, identify all of the relevant events: not just the big ones that dominate the media but also the tiny ones covered by local beat reporters. You don’t have a prayer of reading them all, so you need to teach a machine to read them. Then you must teach the machine to identify the real-world events that the human authors describe. Next you need a good algorithm for measuring attention, one that accounts for the biases and uncertainties in your data. Finally, to map the distribution of the world’s attention, teach the machine to extract the locations relevant to each event and generate a time series. Now you’re ready.
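To make the last two steps concrete, here is a minimal sketch in Python of turning location-tagged mentions into a daily attention time series. It is an illustration only, not the algorithm used for the map: the function name `attention_shares` and the input format (one `(day, location)` pair per article mention) are assumptions, and raw mention counts are a naive stand-in for an attention measure, with none of the corrections for bias and uncertainty described above.

```python
from collections import Counter, defaultdict


def attention_shares(mentions):
    """Return {day: {location: share}} from (day, location) mention pairs.

    Hypothetical helper: a location's "attention" on a given day is simply
    its fraction of all mentions recorded for that day.
    """
    counts = defaultdict(Counter)
    for day, location in mentions:
        counts[day][location] += 1

    shares = {}
    for day, counter in counts.items():
        total = sum(counter.values())
        shares[day] = {loc: n / total for loc, n in counter.items()}
    return shares


# Made-up example data: four mentions on one day, two on the next.
mentions = [
    ("2016-05-25", "Baghdad"),
    ("2016-05-25", "Baghdad"),
    ("2016-05-25", "Baghdad"),
    ("2016-05-25", "Brussels"),
    ("2016-05-26", "Brussels"),
    ("2016-05-26", "Baghdad"),
]
series = attention_shares(mentions)
```

Because each day's shares sum to 1, the values can be compared across days even when the total volume of coverage changes.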
No such map existed. So we built one. For our first prototype, we chose terrorism as the topic.
We started by mapping the attention paid to terrorism-related events in English-language media. Our natural language processing algorithms also work for Russian, so we built the same map for Russian-language media. We were fascinated by the differences between the two. Then we realized what we really needed to build: a diff map.
In computer programming, diffing is a technique that reveals the difference between two files, or how a single file has changed over time. We’re diffing across corpuses of millions of documents in multiple languages. Of course, we could diff on other topics, other languages, or indeed any segmentation of the data.
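As a rough illustration of what diffing two attention maps could look like, the sketch below subtracts one corpus's attention shares from another's, location by location. The function name `diff_attention`, the dictionary format, and the numbers are all hypothetical; the real comparison may well be normalized or weighted differently.

```python
def diff_attention(shares_a, shares_b):
    """Return {location: share_a - share_b} over every location in either map.

    Hypothetical helper: a location missing from one corpus counts as a
    share of 0.0 there. Positive values mean corpus A pays relatively more
    attention to the location; negative values mean corpus B does.
    """
    locations = set(shares_a) | set(shares_b)
    return {
        loc: shares_a.get(loc, 0.0) - shares_b.get(loc, 0.0)
        for loc in locations
    }


# Made-up shares for a single day in two language corpora.
english = {"Paris": 0.5, "Baghdad": 0.5}
russian = {"Paris": 0.25, "Grozny": 0.75}
diff = diff_attention(english, russian)
```

Plotted on a map, the sign of each value would give the color (which corpus is paying more attention) and the magnitude would give the intensity.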
Are you curious how liberal vs. conservative media attention will differ on next year’s US congressional elections? How about media attention on product releases and recalls? Russian vs. Chinese attention on events related to climate change?
So are we. Stay tuned.