What’s in the data?

“User data” is pretty much everything that a particular company knows about you. It’s often broken down into three main categories: explicit/declared data, implicit/inferred data, and third-party data.

Explicit data is anything that a user gives a company voluntarily. Generally, this is done when a customer signs up for a service or creates some kind of profile, and it can include everything from your name (though it’s often anonymized), age, and location to your hobbies and personality type. These details are just basic user data, and since they’re so widespread and easy to obtain, the most common data points aren’t worth very much on their own.

Implicit data is data that is collected about you without needing your direct input. Your browsing habits, how long you stay on a webpage, the ads you click on, your mouse movements, your playlists – almost anything you do online could theoretically be collected and sent to a database for analysis. Implicit data is often also mixed in with inferred data, which is produced by analyzing a profile and making guesses about the person it represents. By looking at the information it’s given, an algorithm can decide whether to label you as a candidate for a bicycle ad or a cheeseburger ad, for instance.

Once it’s collected and analyzed, though, your data doesn’t just sit around – it will likely travel on epic journeys between companies that specialize in finding, updating, and selling off user data to companies that need it. These third-party data brokers and data exchange centers are a big reason why the data industry is growing by billions of dollars per year.

How is it used?

User data collection may have acquired a bad name, but modern companies really wouldn’t survive without it. At a minimum, they have to analyze user behavior to follow market trends and figure out consumer preferences. If they can also guarantee advertisers an audience that will click on their ads, then they can make money, which allows them to stay in business without charging for their services. The advertising model that basically runs the modern Internet is probably the biggest driver behind the market for user data. Figuring out ways to identify who is on your site, look up their advertising profile, and serve them relevant ads is a lot better than just randomly firing content-based banners at whoever browses by. Well-targeted ads (but not too targeted, or users find them creepy) make real money, and they’re worth all the hassle of getting user data. Unsurprisingly, there’s a lot of money involved in getting the right ads to the right people, which means that there’s also no shortage of data breaches and privacy issues surrounding the industry.

But data isn’t all about invading people’s privacy to get them to buy new smartphones – it’s also massively helpful for companies doing market research, trying to comply with regulations, or working to improve their products. Many of them do keep user information confidential and take the proper precautions to ensure that they protect user privacy, and even in advertising the information making the rounds often isn’t personally identifiable.

What is it worth?

There wouldn’t be so much of a fuss about user data if it weren’t worth something. When it comes down to it, much of the Internet is about getting users to give their money to companies. As long as users keep doing that, advertisers will pay websites for their ad space, and if users spend more, advertisers will pay more. This means that different users have different values. A high-income frequent traveler from the U.S. is a much juicier target than a Canadian university student, simply because a shot at selling a Rolex is better than a shot at selling ramen.

How much exactly is your data worth, then? It turns out to be a very subjective calculation, and it gets even harder when you take black market/under-the-table data deals into consideration. Depending on your demographics and how much data a source has on you, your data could be bought and sold for anywhere between a few cents and hundreds of dollars every year. This calculator from the Financial Times gives you a pretty good idea of what affects your data’s value.

Data is forever

Data is the new oil in more ways than one: it runs the engines of modern e-commerce, contributes to the development of new products and technologies, is controlled by a large network of possibly untrustworthy companies, and frequently gets spilled into the wild. Overall, it’s probably been a net positive for human progress, as it provides lots of helpful human insights and allows a vast array of technologies to be distributed for free. Nonetheless, there are myriad issues surrounding the vast trove of user data that exists. As a more serious dialogue develops, users may eventually be granted greater transparency and control. As long as data remains valuable, though, it will reliably be misused.

Image credit: Photo.iep