“The Terminator,” “The Matrix,” “I, Robot.”
These are all movies in which machines become self-aware and try to take over the world (or at least kill all humans). It’s a popular plot line because it speaks to our deep-seated fears about technology. Will our devices and the data they collect be used against us as we move to Web3?
It’s not just Hollywood paranoia. In recent years, we’ve seen mounting evidence that our data is being used in ways we never intended or expected. The Cambridge Analytica scandal showed how Facebook data was harvested and used to target voters in the 2016 US presidential election.
Google has been fined for collecting data from children without parental consent. And facial recognition technology is used by law enforcement and companies with little regulation or oversight.
In this article, we take a closer look at the dangers of opaque data pipelines and how blockchain technology, especially as we move towards Web3, can reduce the opacity of the algorithmic black box.
The world runs on algorithms
We live in a time when algorithms are making more and more decisions for us. They determine what we see on social media, which advertisements we are shown and who gets a loan and who doesn’t.
Algorithms can be simple, such as the one that determines the order in which results are displayed in a search engine. Or they can be more complex, like the ones used by social media companies to decide which posts to show in our news feeds.
Some of these algorithms are designed to be at least partly transparent. For example, Google publicly documents the broad principles behind its search ranking. But many others are opaque, meaning we don’t know how they work or what data they use to make decisions.
This lack of transparency is worrying for a number of reasons. First, it can lead to biased decisions. If an algorithm uses race or gender as a factor in making decisions, that bias is reflected in the results.
Second, opaque algorithms can be gamed. If we don’t know how an algorithm works, we can’t tell when it is being gamed, or by whom. Secrecy is often a deliberate choice, too: many companies keep their algorithms secret because they don’t want people to manipulate the system.
Finally, opaque algorithms are difficult to hold to account. If an algorithm makes a mistake, it can be difficult to figure out why or how to fix it. This lack of accountability is especially problematic when algorithms are used to make important decisions, such as whether someone gets a loan or a job.
The dangers of data pipelines
The problem with algorithms is that they are only as good as the data they use. If the data is biased, the algorithm will be biased. If the data is incomplete, the algorithm will make inaccurate predictions.
And often the data that algorithms use is far from perfect. It comes from a variety of sources, including social media, sensors, and government databases. This data is then collected and processed by various companies before reaching the algorithm.
Each step in this process introduces potential errors and biases. For example, data from social media is often not representative of the population as a whole. And sensors can be inaccurate. The result is a data pipeline that is often opaque, biased and difficult to hold to account.
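The point about unrepresentative data can be made concrete with a small sketch. Everything here is a made-up illustration: the two groups, their shares and their approval rates are assumptions chosen for the example, not real figures.

```python
from statistics import mean

# Hypothetical population: 70% of people are in group "A" with an approval
# rate of 0.2, and 30% are in group "B" with an approval rate of 0.8.
population = [("A", 0.2)] * 70 + [("B", 0.8)] * 30

# Hypothetical social media sample that over-represents group "B".
social_sample = [("A", 0.2)] * 20 + [("B", 0.8)] * 80


def estimate(records):
    """Average the rate over whatever records the pipeline delivers."""
    return mean(rate for _, rate in records)


true_value = estimate(population)      # about 0.38
skewed_value = estimate(social_sample) # about 0.68

print(f"population: {true_value:.2f}, social media sample: {skewed_value:.2f}")
```

The estimator itself is identical in both cases. The gap comes entirely from which records reached it, which is why an opaque pipeline makes this kind of error hard to spot.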
Admittedly, killer robots are a bit far-fetched, but there are subtler ways in which your data can be misused by the powers that be. What will your data be used for? Here are some possibilities:
How your data is used
- To score your political views and manipulate you during election season
- To sell you products you don’t need
- To track your location and movements
- To target you with ads
- To deny you a job or a loan
These are just a few examples – the list goes on. And it’s not just companies that do this. Government agencies use data to track citizens, predict crime and even fight wars.
In short, data is used in various ways to control and manipulate people. And often these applications are hidden from the people who are affected.
Web3 and the potential of blockchain-based data markets
One possible solution to the data pipeline problems is a blockchain-based data market. In this type of market, data would be collected and stored on a decentralized network.
This would have a number of advantages. First, it would make the data pipeline more transparent. We would know where the data comes from and how it was collected. Second, it would make the data more reliable. If the data is stored on a decentralized network, it would be much more difficult to manipulate. This type of data storage could become an even more important concept as we move towards Web3.
Finally, it would make the data more accessible. Anyone could access the data and use it to build algorithms. Privacy could still be protected, provided the data is anonymized and mechanisms are in place to prevent abuse.
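The claim that such data would be harder to manipulate rests on hash linking: each record commits to the one before it, so a silent edit breaks the chain. Below is a minimal single-process sketch of that idea only. It is not a blockchain (there is no network, consensus or signatures), and the function names and sample records are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(payload, prev_hash):
    """Hash a record together with the hash of the record before it."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain, payload):
    """Add a record that commits to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": record_hash(payload, prev_hash),
    })


def verify(chain):
    """Recompute every hash; any edited record or broken link fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["payload"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append(chain, {"source": "sensor-17", "reading": 21.5})
append(chain, {"source": "survey", "reading": 19.0})
intact = verify(chain)

# Quietly change an old record without recomputing any hashes.
chain[0]["payload"]["reading"] = 99.9
still_valid = verify(chain)

print(intact, still_valid)
```

In this sketch a single party could still rewrite every hash after tampering. On a decentralized network, many independent parties hold copies of the chain, which is what would make that kind of rewrite difficult.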
For example, Ocean Protocol is a decentralized data exchange protocol that enables data sharing while preserving data privacy. It is built on the Ethereum blockchain and uses smart contracts to ensure that data is only shared with parties who have permission to use it.
The Ocean Protocol can be used to create a data marketplace where data is collected, stored and distributed in a transparent and reliable manner. This allows data to be used more efficiently and may resolve some issues with the current data pipeline.
This is one of Web3’s biggest frontiers, since data is the lifeblood of the new internet.
Overcoming the challenges of blockchain-based data markets
It is important to note that a blockchain-based data market is not a perfect solution. There are still some challenges that need to be addressed. For example, it is not clear how data would be priced in such a market.
The blockchain community should be proactive in participating in and developing data markets. Otherwise, it risks being left behind as the centralized platforms continue to dominate.
People can view their data as an asset, but first the right infrastructure must be in place. One way to do this is to develop data wallets that allow people to manage their data and get compensation for sharing it.
The uPort platform – now split into Serto and Veramo – is an example of a data wallet being developed on the Ethereum blockchain. uPort allows users to manage their identity, personal information and data. It also allows them to share this information with others in a secure and decentralized way.
Data quality is imperative
Another challenge is data quality. In a centralized system, data is managed by a single entity. This means that the data is likely to be more accurate and of higher quality.
In a decentralized system, however, there is not a single source of truth. This means that the quality of data can vary greatly. Data quality is an important issue that needs to be addressed for Web3 and blockchain-based data markets to be successful.
One possible solution to the data quality problem is the use of data curation markets. In these markets, people would be incentivized to provide accurate and high-quality data. The iGrant data wallet platform is an example of a data curation market being developed on the Ethereum blockchain.
These are just some of the challenges that need to be addressed. If they can be overcome, blockchain-based data markets have the potential to revolutionize the way data is collected, stored and distributed.
The next article concludes our Web3 series, bringing together everything we’ve talked about to see the big picture of how data, crypto, blockchain and Web3 will shape the internet, and the world, in the years to come. Stay tuned!
Daniel Saito is CEO and co-founder of StrongNode.