Is data the oil of Artificial Intelligence? Should they pay you for yours?

Last month Facebook announced that its facial recognition feature, which notifies people when their profile picture is used by someone else, is now available to all of its users. It also announced that it had removed tag suggestions, which used facial recognition to prompt users to tag their friends in photographs.

These announcements could be related to a US judge's decision to deny Facebook's request to dismiss a class action it faces, which could cost the company up to $5 billion. The lawsuit concerns the unauthorized use of users' images in tag suggestions, on the grounds that users did not authorize third parties to identify them in photographs.

Meanwhile, Microsoft joined other institutions that have withdrawn databases with millions of faces that had been available online for research purposes, and Uber faces a lawsuit in the UK in which a group of drivers demands access to statistical information about their work.


These conflicts reflect a growing concern about the privacy and security of personal data. Although the subject is already legislated in Mexico and other countries, it has gained momentum from news about the misuse of data and social networks to try to manipulate elections, digitally undress people in photographs or bombard people with unsolicited advertising, all through the use of artificial intelligence (AI) and machine learning (ML) tools.

This concern has led some governments, such as the European Union and California, to issue very strict laws that give people control over their data. These laws rest on the principle that people own their data, so they should give express and detailed consent to each of the uses that will be made of the data they provide to companies and institutions in their daily dealings.

In addition to the topic of control over personal information, there is a discussion about the legitimacy of the profits that companies derive from the use of the data they get in their relationship with people. This has even led to proposals for wealth redistribution schemes such as the payment of a data dividend, similar to that in Alaska, where citizens receive an annual payment from the profits of oil exploitation.


A recent study estimates that data usage generated $227 billion in 2018. That figure includes social media and targeted internet advertising providers, brokers who aggregate and analyze data, and companies that use data to save costs or generate revenue in their operations. The amount is projected to reach $400 billion by 2025, given the growth trend in data generation.

Who can deny that the billions of dollars that Google, Facebook or Uber raised in their IPOs are due to the exploitation of data obtained from people with vague consent? This was of course done with systems and technologies the companies developed themselves, but none of these would be worth much without data. This argument, together with the gigantic amounts of money being produced, has created high expectations of redistribution schemes in some sectors.

The importance of the privacy and security of personal data is undeniable, and it is not a new issue. It is also legitimate to think about wealth distribution when huge profits are generated with customer data, as has always been done, but now on a gigantic scale. However, some points must be taken into account in order to maintain realistic expectations, focus on substantive issues, and avoid creating an environment that restricts the development of AI for the benefit of society.


a) Data is valuable in large quantities; individual data is not worth much

The $227 billion generated by data exploitation in 2018 is an impressive amount, but it comes from the information of billions of people, so the value of any one individual's data is small. Consider the following:

  • The global average of the value of data generated by an individual internet user is $1.18 per month (23.60 Mexican pesos at an exchange rate of 20 pesos per dollar).
  • Facebook's global annual revenue per user is $25 (500 pesos), equivalent to $2.08 per month (41.67 pesos).

These amounts are company revenues, so if a global dividend were established, which is highly unlikely, it would amount to only a small fraction of them.
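For readers who want to check the arithmetic, here is a minimal Python sketch that reproduces the conversions above. The revenue figures and the exchange rate of 20 pesos per dollar come from the text. The 10% dividend share is purely a hypothetical assumption for illustration, as are the function and constant names; no actual proposal is being described.

```python
# Figures quoted in the post.
PESOS_PER_DOLLAR = 20.0                    # exchange rate used in the post
GLOBAL_VALUE_PER_USER_MONTH_USD = 1.18     # global average data value per internet user
FACEBOOK_REVENUE_PER_USER_YEAR_USD = 25.0  # Facebook's global annual revenue per user

# Hypothetical assumption, not from the post: share of revenue paid out as a dividend.
ASSUMED_DIVIDEND_SHARE = 0.10


def usd_to_pesos(amount_usd, rate=PESOS_PER_DOLLAR):
    """Convert US dollars to Mexican pesos at the given rate."""
    return amount_usd * rate


def monthly_from_annual(amount_per_year):
    """Spread an annual amount evenly over twelve months."""
    return amount_per_year / 12


def monthly_dividend_pesos(monthly_revenue_usd, share=ASSUMED_DIVIDEND_SHARE):
    """Pesos per month a user would receive if `share` of revenue were paid out."""
    return usd_to_pesos(monthly_revenue_usd * share)


if __name__ == "__main__":
    fb_monthly_usd = monthly_from_annual(FACEBOOK_REVENUE_PER_USER_YEAR_USD)
    print(f"Global average: {usd_to_pesos(GLOBAL_VALUE_PER_USER_MONTH_USD):.2f} pesos/month")
    print(f"Facebook: {fb_monthly_usd:.2f} USD/month, {usd_to_pesos(fb_monthly_usd):.2f} pesos/month")
    print(f"Assumed 10% dividend: {monthly_dividend_pesos(fb_monthly_usd):.2f} pesos/month")
```

Under that assumed 10% share, the Facebook figure would translate into only a few pesos per user per month, which is the point of this section: the sums are large in aggregate and tiny per person.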

b) Making it difficult for companies to obtain data can benefit the big ones

Large companies have already established their position and their relationship with users, which makes it easier for them to obtain consent for the use of data: users who do not grant it lose access to the applications and the benefits they already receive. Moreover, unless there is a mass movement, the decision of a few users to unsubscribe has no impact. This gives large companies a great advantage, since other companies must either build their databases by collecting individual consent, work the large companies never had to do, or pay the large companies to use their data.

c) Improper use of data is what must be penalized

When personal data is used to try to influence an election, discredit a person or harass them with unsolicited advertising, the evil lies not in the availability of data or in the use of AI or ML to extract information from it, but in the commission of acts that are illegal regardless of the technology involved. Legislation should therefore preserve people's privacy and control over their data without trying to eliminate illegal uses by restricting the availability of data. Making it difficult to sell cell phones or cars is not the way to prevent extortion and mugging.

d) Positive uses of data should be promoted and facilitated

AI and ML have the potential to generate extraordinary advances in disease diagnosis and treatment, environmental conservation and the fight against poverty and inequality. However, like any other application of this kind, they need large amounts of information to achieve a sufficient degree of reliability for deployment. Achieving some of these advances can be considered a matter of public interest, so at some point there should be a discussion of how to balance it with the individual interest in privacy.

Data is the oil of AI only in the sense that it is the input for the extraordinary applications that are revolutionizing economies and industries. However, the situation of the owner of oil-bearing land in countries where such land is private property has nothing in common with that of the owner of data. The value generated lies in the innovative application of algorithms to produce information that can be of great benefit to society. Therefore, the very important work of ensuring the privacy and security of our data must be balanced with the need to create the conditions that make those benefits a reality.
