The Path from Structured to Unstructured Data

Technological advances in deep learning models have made it easier to use unstructured data for a whole new class of ML tasks.

Explaining natural language processing (NLP) and what it can do for an enterprise can be a daunting task, because the technology is complex. Simply put, NLP can quickly “read” and process massive volumes of text data and find insights that would be almost impossible for humans to surface at scale. Thanks to advances in deep learning models, the ability of NLP to scan, process, and analyze unstructured data is improving rapidly.

What is unstructured data? 

Understanding the difference between structured and unstructured data makes it easier to appreciate the power of NLP. Fortunately, the distinction isn’t difficult. Structured data is likely what you already picture when you think of data analysis. For instance, if you wanted answers about a local housing market, much of the data you would want analyzed can be easily categorized and placed in a spreadsheet. Numbers relating to cost, size, days on market, number of bedrooms and bathrooms, parking spaces, and so forth would help someone find insights into this topic.

The unstructured data that is available isn’t nearly as neat and orderly. Using the same example of a local housing market, if you also wanted to add data regarding public perception of neighborhoods, municipal services, and quality of local schools and businesses, the data points aren’t exactly spreadsheet-ready. 

What’s exciting in NLP is that it is gaining the ability to accurately analyze more abstract data, like comments posted on internet review sites, tweets, or relevant news and video about a location. These aren’t numbers you can add to a table like square footage or property taxes. Rather, machine learning is developing the ability to parse human sentiment, positive and negative attitudes, sarcasm, hashtag meaning, and other intangible perceptions and add the results to the analysis of simpler, structured data.
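To make the idea concrete, here is a minimal sketch of adding a text-derived signal to structured housing data. Everything in it is an illustrative assumption: the listings, the comments, the word lists, and the helper names `sentiment_score` and `enrich` are made up, and the simple word-counting scorer is only a stand-in for the deep learning models discussed above.

```python
# Toy illustration only: listings, comments, and word lists are invented.
from statistics import mean

# Structured data: values that already fit rows and columns.
listings = [
    {"neighborhood": "Riverside", "price": 450_000, "bedrooms": 3},
    {"neighborhood": "Hilltop", "price": 380_000, "bedrooms": 2},
]

# Unstructured data: free-text comments about each neighborhood.
comments = {
    "Riverside": ["Great schools and friendly neighbors", "Love the quiet parks"],
    "Hilltop": ["Terrible traffic and noisy streets", "Great views but poor parking"],
}

# Hypothetical word lists; a real system would use a trained model instead.
POSITIVE = {"great", "friendly", "love", "quiet"}
NEGATIVE = {"terrible", "noisy", "poor"}


def sentiment_score(text):
    """Count positive words minus negative words in one comment."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def enrich(listings, comments):
    """Attach an average sentiment score to each structured listing row."""
    enriched = []
    for row in listings:
        scores = [sentiment_score(c) for c in comments.get(row["neighborhood"], [])]
        enriched.append({**row, "avg_sentiment": mean(scores) if scores else None})
    return enriched


for row in enrich(listings, comments):
    print(row)
```

A word-counting scorer like this one cannot recognize sarcasm or context, which is exactly the gap that more capable language models are meant to close.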

For a more detailed look at the topic, download the white paper “The Path from Structured to Unstructured Data.” In it, you’ll find information on:

  • Unstructured data use in the enterprise
  • The effect of advanced deep learning models
  • Data infrastructure requirements for unstructured data
  • How distributive and generative tasks make sense of both kinds of data


For more information about Primer and to access product demos, contact Primer here.