Web3 offers a more private alternative to Web2 because it doesn't require users to submit personal information. The challenge, however, is that much of the data generated when using Web3 apps is stored transparently and publicly on blockchains. So while a user’s personal data might be safe, their account balances and transaction history are visible to the public. We’ll explore that tension further, but first, let’s back up and give some context.
The Problems with Data Privacy in Web2
Today, data is the oil on which the machinery of the digital economy runs. Many Web2 businesses monitor and collect user data as part of surveillance capitalism. To use Web2 products and services, users must provide information about their identity when they sign up. After signing up, their activities across these services generate large amounts of data. Those services can then package that data and sell it to advertisers, who pay to target users with ads based on it. In exchange, users get free (or discounted) products. That’s the way Web2 often works, and it comes with problems.
Users Have Little Control Over Their Data
Large companies like Facebook and Google have ultimate control over the data generated by their users. Of course, not all data collection is bad. Businesses may need to collect data to know how to improve their services, and personalization makes for a better user experience.
The challenge is the lack of transparency. Users have little visibility into what data is being collected, how it is used, and who can access it. For example, these companies often sell data to third parties, and their users don’t even know it. This monetization of data is part of the reason why users see hyper-targeted personalized ads.
Web2 services use algorithms to process user data and make predictions about users’ interests or behavior, and those models may be biased by the data they are built on. Users may be presented with ads or content that reinforces existing cultural biases, or even be discriminated against based on their race, gender, or other personal characteristics.
It’s hard to distinguish between fair and intrusive data collection because Web2 platforms often have opaque data collection practices. The collection is typically authorized by lengthy terms of service agreements or privacy policies that users are unlikely to read or understand. When’s the last time you read a terms of service document and thought to yourself, “OK, I get it”? Other platforms collect data through third-party trackers, which can be invisible to users and difficult to identify.
Centralized Data Has Single Points of Failure
The client-server architecture and centralized databases of Web2 also often create a single point of failure. If the central entity that stores the data is compromised, all of the data stored within it is at risk. The company could suffer data loss or service downtime and, even worse, user data could be exposed, leaving users vulnerable to identity theft or blackmail.
In 2022, the U.S. recorded 1,802 data breaches, which compromised the data of more than 422 million people. That same year, the average total cost of a data breach was $4.35 million, and 60% of companies that suffered data breaches passed the cost on to their customers through increased prices.
The worst part is that users may not know their data is compromised until criminals have sold or used their data. Criminals could use the stolen data to create new IDs, rack up credit card debt, or commit other crimes in the name of unsuspecting people. In 2022, many tech companies, including Nvidia and Samsung, suffered data breaches without disclosing the extent and impact of the breaches.
Web2 platforms claim to offer some privacy protection through security software and privacy policies. Privacy-focused regulations, like GDPR, are also designed to protect users' privacy. However, scandals such as Cambridge Analytica's unauthorized access to users' private data on Facebook show that these measures are not always sufficient.
Data Privacy in the Web3 Industry
Web3 has gained traction in part because it offers an alternative to centralized tech companies that handle data irresponsibly. Web3 enhances data privacy by eliminating the need for users to share private data before using digital services. All users need to do is connect to an app or service via a pseudonymous wallet: a digital wallet that uses a pseudonym or alias (a string of letters and numbers, in this case) to identify the wallet owner. No more submitting personal info, be it a name, email address, phone number, or something else, in order to use an app.
This means that if a Web3 service is compromised, the attacker won't find any personal information of users, such as home addresses or dates of birth, because users never gave that information to the service in the first place. In addition, users can create as many wallets as they like to increase the number of aliases they use for transactions.
Web3 also enhances data privacy by giving users control over their data. They can choose what data to share or who to share data with. Web3 services such as Streamr allow users to opt into the data economy to monetize their data through Data Unions. Other Web3 services without a data economy often have business models that let users earn a share of the economic returns.
However, Web3 apps also face some data privacy challenges arising from the differences between on-chain and off-chain data. On-chain data refers to data that is stored directly on the blockchain (this data is not private!). Off-chain data refers to data stored outside of the blockchain, such as on centralized servers or decentralized file storage systems like IPFS. Off-chain data can also be vulnerable to the centralization risks found in Web2.
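One common way to work with this split is to keep the sensitive record off-chain and publish only a hash of it on-chain. The sketch below illustrates that idea under simplifying assumptions: `off_chain_store` and `on_chain_log` are hypothetical in-memory stand-ins for a storage service and a blockchain, not real APIs, and the record contents are made up.

```python
import hashlib
import json

# Hypothetical in-memory stand-ins: a real app would use a storage service
# (or IPFS) and a smart contract instead of these two containers.
off_chain_store = {}   # private: holds the full record
on_chain_log = []      # public: anyone can read every entry


def save_record(record: dict) -> str:
    """Store the record off-chain and publish only its SHA-256 digest."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    off_chain_store[digest] = payload
    on_chain_log.append(digest)
    return digest


def verify_record(digest: str) -> bool:
    """Check that the off-chain copy still matches the public digest."""
    payload = off_chain_store.get(digest)
    if payload is None or digest not in on_chain_log:
        return False
    return hashlib.sha256(payload).hexdigest() == digest


record_id = save_record({"name": "Alice", "email": "alice@example.com"})
print(on_chain_log)              # only a hex digest is public
print(verify_record(record_id))  # True while the off-chain copy is intact
```

Two caveats apply. A bare hash of low-entropy data can be guessed by brute force, so a real design would add a random salt. And the off-chain store is still exposed to the centralization risks described above.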
The Tension Between Transparency and Web3 Privacy
In its earliest days, crypto acquired a reputation as the currency of choice for people who wanted to hide their financial activities. This reputation led many to believe that crypto enabled financial privacy.
However, blockchains are not private; the whole point is that they are public databases. Public blockchains are permissionless, open ledgers: anyone can run a node to keep a copy of the ledger, and everybody can see all of the recorded transactions. You can think of a blockchain as a Twitter feed of your bank account, where every transaction is broadcast like a tweet across the entire network.
Importantly, crypto transactions are pseudonymous, not anonymous: they happen through wallets that identify users with “aliases” rather than their official IDs. It is still possible to connect wallets or transactions to their owners through blockchain forensic analysis or cross-chain graphing, among other methods. Services such as Chainalysis have found ways to peel off the layer of pseudonymity and link crypto transactions to real-world entities.
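To see why pseudonymity is fragile, consider a toy version of transaction tracing. The sketch below follows public payments from a pseudonymous wallet until it reaches an address whose owner is already known, for example an exchange account opened with identity checks. The ledger, the addresses, and the `trace_to_known_owner` helper are all hypothetical. Real forensic tools use far richer heuristics, and a payment path only suggests a link rather than proving ownership.

```python
from collections import deque

# Hypothetical public ledger: (sender, receiver, amount). Addresses and
# amounts are made-up placeholders.
ledger = [
    ("0xaaa1", "0xbbb2", 5.0),
    ("0xbbb2", "0xccc3", 4.9),
    ("0xccc3", "0xdep7", 4.8),
    ("0xddd4", "0xeee5", 1.0),
]

# Off-chain knowledge an analyst might hold, such as an exchange deposit
# address tied to a verified customer.
known_owners = {"0xdep7": "Alice Example"}


def trace_to_known_owner(start, ledger, known_owners):
    """Follow outgoing payments breadth-first until a labeled address is hit.

    Returns (owner, path) or None if no labeled address is reachable.
    """
    outgoing = {}
    for sender, receiver, _amount in ledger:
        outgoing.setdefault(sender, []).append(receiver)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in known_owners:
            return known_owners[current], path
        for nxt in outgoing.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(trace_to_known_owner("0xaaa1", ledger, known_owners))
# ('Alice Example', ['0xaaa1', '0xbbb2', '0xccc3', '0xdep7'])
print(trace_to_known_owner("0xddd4", ledger, known_owners))
# None
```

Because every transfer is public, anyone can run this kind of analysis; the only private ingredient is the label on the final address.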
Beyond transparency, the immutability of blockchains also impacts Web3 privacy. It is challenging to alter or delete data written on a blockchain, if it can be done at all (it usually can’t). While this permanence enhances data integrity and security, it becomes an issue if there's a need to delete on-chain data. For example, how should we think about GDPR considerations, such as the right to be forgotten, with on-chain data?
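The reason deletion is so hard can be shown with a minimal hash chain. This is a simplified sketch, not how any particular blockchain is implemented: each block stores the hash of the previous block, so editing or erasing one entry invalidates that block's stored hash. The `build_chain` and `is_valid` helpers and the sample entries are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash a block's contents together with the previous block's hash."""
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()


def build_chain(entries):
    """Link entries so that each block commits to everything before it."""
    chain = []
    prev = GENESIS
    for data in entries:
        current = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": current})
        prev = current
    return chain


def is_valid(chain) -> bool:
    """Recompute every hash and check the links between blocks."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(prev, block["data"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
print(is_valid(chain))   # True

chain[0]["data"] = "[deleted]"   # try to erase the first entry
print(is_valid(chain))   # False: the stored hash no longer matches
```

Repairing the chain after an edit would mean recomputing every later block, and on a real network every other node would still hold the original copy.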
The State of Web3 Privacy Today
Web3 privacy is still a work in progress, as the industry tries to balance transparency with data privacy. Several solutions under development could enhance user privacy. Cryptographic techniques such as RingCT (Ring Confidential Transactions) and zero-knowledge proofs can obscure transaction amounts and origins. Several Web3 projects, such as Aztec and Zcash, already use zero-knowledge proofs to offer increased privacy.
Privacy coins such as Monero and Dash also aim to protect users against financial surveillance. They use techniques such as ring signatures and stealth addresses (Monero) or coin mixing (Dash) to obfuscate transfers and make it harder to identify the parties in a trade.
There are also decentralized identity projects, such as Verida and Polygon ID, among others, that are working on Web3 privacy. They combine DIDs (Decentralized Identifiers), which enable a more user-centric approach to identity management, with zero-knowledge proofs, so that users can prove facts about their identity without revealing or sharing the underlying details.
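To make the idea of a zero-knowledge proof more concrete, here is a minimal sketch of a Schnorr-style proof of knowledge, a classic building block: the prover convinces a verifier that they know a secret number behind a public value without revealing the secret. The parameters are toy values chosen for readability and are not secure, and the `prove`, `verify`, and `challenge` helpers are hypothetical. Production systems such as Zcash use far more elaborate proof systems (zk-SNARKs), and this sketch does not by itself hide transaction amounts or origins.

```python
import hashlib
import secrets

# Toy parameters for illustration only; NOT secure for real use.
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # base element
ORDER = P - 1    # exponents can be reduced modulo P - 1 (Fermat's little theorem)


def challenge(*values: int) -> int:
    """Derive the challenge by hashing the public values (Fiat-Shamir)."""
    data = "|".join(str(v) for v in values).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def prove(secret: int):
    """Prove knowledge of `secret` where public = G**secret mod P."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(ORDER - 1) + 1   # fresh randomness per proof
    commitment = pow(G, nonce, P)
    c = challenge(G, public, commitment)
    response = (nonce + c * secret) % ORDER
    return public, commitment, response


def verify(public: int, commitment: int, response: int) -> bool:
    """Check G**response == commitment * public**c (mod P)."""
    c = challenge(G, public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P


secret = secrets.randbelow(ORDER - 1) + 1
public, commitment, response = prove(secret)
print(verify(public, commitment, response))                 # True
print(verify(public, commitment, (response + 1) % ORDER))   # False
```

The verifier only ever sees `public`, `commitment`, and `response`. The random nonce masks the secret inside `response`, which is why a fresh nonce must be used for every proof.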
Strengthening Web3 privacy without sacrificing transparency is still a challenge, but these promising developments (and others) offer hope for a more secure and private digital future.
Start Building Apps That Can Enhance User Privacy
Privacy is a normal human desire. People have things they want to keep private, like healthcare records or their contact information. Because Web3 apps don’t need to collect personal data, Web3 is better positioned than Web2 to meet the desire for online privacy. However, the fact that transactions are public introduces its own privacy complications, which are still being researched and worked on.
If you'd like to know how to build apps that are better positioned to meet users' privacy needs, our comprehensive guide to Web3 development is a great starting point. The guide discusses many ideas, including the basics of smart contracts and how to connect your dApp to a blockchain.