Data protection plays a major role in many blockchain-based project ideas, and presents issues which
are not always easy to resolve.
Blockchain is a technology which enables the protection of data against manipulation. So, in this
sense, it increases the security of data. However, simply put, this security is achieved by making the
records saved in the blockchain transparent and immutable; and this, in turn, is achieved through
the redundant and distributed storage of each record at multiple nodes throughout a large network.
If we consider the requirements of the EU General Data Protection Regulation (GDPR), the very
essence of the security of blockchain is therefore in contradiction with the privacy required for the
protection of personal data. As a result, the development of a blockchain project needs to include
careful examination of what kind of data is being stored, and whether that data could be considered
to be personal data.
Applicability of data protection law to blockchain
When we consider the basic applicability of data protection law in a blockchain environment, the
first question we need to ask is whether data processing is involved. As a rule, the answer is yes. The
second question is then whether personal data comes into play. If it is possible to answer this
question with a no, then there are no further privacy issues. You can tick all the boxes, and know
that the project is not subject to data protection law.
However, if you answer this question with a yes, then the next step is to look at whether the project
has a relevance to the EU – as a rule, this is the case for projects that are founded and based in the
EU as well as for projects that collect data from within the EU.
If so, you are within the scope of the GDPR, and data processing is forbidden unless you have a legal
basis for the processing of data. It is important to understand that the GDPR rests on the
fundamental principle that data processing is generally prohibited unless certain exceptions make it
permissible.
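The sequence of questions above can be sketched as a small decision procedure. This is only an illustrative simplification: the class name, field names, and result strings are hypothetical, and each yes/no flag stands in for a legal assessment that in practice takes careful analysis.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical summary of a blockchain project; every field is an assumption."""
    processes_data: bool          # any automated or structured handling of data
    involves_personal_data: bool  # data on an identified or identifiable person
    based_in_eu: bool             # project is founded and based in the EU
    collects_data_from_eu: bool   # project collects data from within the EU
    has_legal_basis: bool         # a legal basis for the processing exists


def gdpr_assessment(p: Project) -> str:
    """Ask the applicability questions in the order the text asks them."""
    if not p.processes_data or not p.involves_personal_data:
        return "outside data protection law"
    if not (p.based_in_eu or p.collects_data_from_eu):
        return "personal data, but no EU relevance"
    if not p.has_legal_basis:
        return "in GDPR scope: processing prohibited"
    return "in GDPR scope: processing permitted"


# Example: a project outside the EU that collects data from EU users
# and has not yet established a legal basis.
example = Project(True, True, False, True, False)
print(gdpr_assessment(example))
```

The ordering matters: the legal basis is only examined once personal data and EU relevance have both been established, which mirrors the "prohibited unless permitted" structure described above.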
What is considered data processing?
When are data personal, and who is then responsible for the data processing? These are questions
that are not so easy to answer when we look at blockchain.
If we break these questions down somewhat, it is perhaps best to initially consider what data
processing entails, because the term in itself is incredibly broad: every handling of data which occurs
automatically or in a structured form is to be included in the term data processing. The only thing
that is not considered data processing is when I talk to you face-to-face. This effectively means that
every flow of data in the operation of a blockchain is included in the term “data processing”. This is
the case for the submission of transactions through nodes; it’s the case for the storing of hashes, the
transferal of transaction data by the miners and the verification process, and also in the case of the
saving and synchronization to the blockchain as a whole. The blockchain must be regularly updated
among all participants of the network, so that everyone has the most up-to-date version of the
complete database – this update of the database across the network is also considered data
processing. This means that all participants in the operation of a blockchain – not just the
transaction partners and the miners, but also the nodes – are involved in the data processing.
When is data personal?
Now it starts getting a bit more difficult. According to the GDPR, personal data includes all
information that refers to an identified or identifiable natural person. An identified person is
relatively simple: my name, or an email address that includes my name; a fingerprint, perhaps a
photo of the face, and so on – these are immediate identifiers. Identifiable is a bit more complicated.
Here, the immediate identifiability is set aside, and information that third parties have becomes
relevant. To understand what kind of third-party knowledge falls within this scope, the question is
whether the identity can be ascertained with a proportionate amount of effort with the means
available to the processing party or any other person. Factors for this include the cost of
identification, the time required for available technologies, and technological development, which is
always changing.
This could, for example, be the IP address. Are IP addresses personal data? The European Court of
Justice has now answered this question, maintaining that an attribution is possible for an ISP, given
that, at least for a short period of time, there is the possibility of attributing an IP address to a
customer via that customer account.
For Cookie IDs, the question has now also been answered. The French Data Protection Authority
recently issued a decision concerning an ad tech company. This company had collected
location data via mobile phones and used the Mobile Advertising IDs built into mobile devices to
achieve this. As an application case, this is very similar to Cookies. Therefore, we can now deduce
that Cookie IDs that are enriched with further information – traffic data or metadata – are also
personal data.
The key question which arises in relation to blockchain relates, in the first place, to public keys. Take
public keys in Bitcoin: do they entail “personal reference”?
There are so many people who, for example, publish their public key on their Facebook profile and
ask for donations in Bitcoin. In this case, of course, there is immediately a connection to the
Facebook profile. And given that I will not be able to check every single public key, and I cannot
exclude the possibility that one of the owners has made theirs public at some stage in the past, I
need to assume that all public keys represent personal data.
When do blockchains have personal reference?
A further differentiation we can make is to look at personal reference in terms of the type of
blockchain. In the case of public blockchain, we have the example of the published public key from
Bitcoin – and in this case, that will be classified as personal data. In the case of a private blockchain,
it looks a bit different. Here, of course, we have a limited user group. It is always necessary to look at
what role the participant currently plays. If I am internal – so, if I’m a participant in this private
blockchain – then it depends on the use case. If I am an external participant, this is not necessarily
the case. If I, as an external third party, look at this private blockchain, then I cannot necessarily
assume that it involves personal data. And then there is also the payload in the blockchain. I can
transfer data as payload in the blockchain, and here it depends on what this payload is – and the
payload can of course be personal data. If I save this in the database unencrypted, it is open for all. If
I save it in encrypted form, then whether or not I have to assess this as personal data is a question of
how easy it is for me to link this back to the actual person.
Here, we can make a differentiation between the roles. Miners must in some circumstances be
judged differently from the simple user who is looking in from the outside. Intermediaries – the
virtual currency exchange, for example, or the surrounding systems that are needed for the
operation of a blockchain – often have a connection. Such a currency exchange has KYC (know your
customer) obligations, and it is also possible that they will have to fulfill further obligations regarding
money laundering. As a result, before I can register for an account with the currency exchange, I
must produce some kind of ID, such as my passport. In this case, the possibility to connect a
blockchain identity with a real identity definitely exists. So, from this perspective, we will always
need to say that even pseudonymous data within a blockchain represents personal data.
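Why a single piece of outside knowledge is enough to turn pseudonymous records into personal data can be shown with a small sketch. The ledger entries, addresses, and the name below are all invented for illustration; the mapping stands in for whatever a third party might hold, such as a publicly posted donation address or an exchange's KYC record.

```python
# Hypothetical public ledger: (sender address, recipient address, amount).
ledger = [
    ("addr_A", "addr_B", 0.5),
    ("addr_C", "addr_B", 1.2),
    ("addr_B", "addr_D", 0.7),
]

# Hypothetical outside knowledge linking one address to a real person.
known_identities = {"addr_B": "Alice Example"}


def attribute(ledger, known_identities):
    """Replace every address that outside knowledge can resolve with a name."""
    def resolve(address):
        return known_identities.get(address, address)

    return [(resolve(s), resolve(r), amount) for s, r, amount in ledger]


def transactions_of(name, ledger, known_identities):
    """Return all ledger entries in which the named person sent or received."""
    return [tx for tx in attribute(ledger, known_identities) if name in tx[:2]]


for tx in transactions_of("Alice Example", ledger, known_identities):
    print(tx)
```

Because the ledger is public and permanent, one link exposes the person's entire transaction history under that address, past and future.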
Who is the actual Data Controller for the data processing?
The next series of questions concerns who is actually responsible for the data processing, who
therefore needs to prove that they have legal grounds, and who is responsible for ensuring
compliance. This leads to the following subset of questions: Who needs to maintain the
documentation? Who potentially needs to conclude data processing contracts with contractors?
Who needs to make contracts with data recipients? And so on. This includes a whole range of
obligations – among others, the appointment of a Data Protection Officer, if you fulfill certain
prerequisites.
If we look at the different roles in a public blockchain, we can consider who is engaged in data
processing, and who will take responsibility as a data controller:
• The developer? No, the developer is not involved in data processing. The developer simply
produces the code that we can use, which still needs to be brought to life.
• The initiator of a transaction? Yes, this is someone who processes data. Making use of a
destination address for a Bitcoin transaction, for example, is an act of data processing.
• For the miner and the node operator, it is a little contentious. Some assume that this is a
form of contracted data processing, whilst others assert that these roles are synonymous
with being data controllers – that is, they are themselves responsible, because they are all
doing what they do for their own business purposes.
This eventuality is provided for in the GDPR. It makes use of the term “joint data controller” and
attaches to this concept the legal consequence of needing to conclude a multilateral contract
between all joint controllers. When you imagine Bitcoin from this perspective, it becomes a little
absurd. We have literally hundreds of thousands of participants, there are thousands of nodes, and
countless miners. They would all need to be connected through a multilateral contract which
regulates who has which roles and which responsibilities with regards to the data processing of one
individual player. It is hard to imagine, but it could perhaps be incorporated into T&Cs that are simply
accepted with a click when an individual gets involved as a miner or a node. But it nevertheless
leaves us with a lot of legal problems.
For the private blockchain it is somewhat simpler. Again, the developer is not involved in data
processing. The governance structure will act as the data controller. Here also, the initiator of a
transaction is a data controller, and the nodes – who then simply support the infrastructure on
behalf of this blockchain – would clearly be contracted data processors.
Types of permissibility
As already mentioned, we always need a legal basis, or else the processing of personal data is
prohibited. What grounds do we have for permissibility? In principle, we have three possible types of
permissibility that can be considered: Consent, contract fulfilment, and legitimate interests, which
requires a balancing of interests of the party processing the data and the person whose data is
processed.
We really won’t get far with consent in the blockchain environment. The declaration of consent
requires that the data subject is informed about to whom the data will be transferred. Let’s just look
at one example where we can see that this is doomed to failure: I cannot identify to which nodes,
and to which miners my transaction data will perhaps be transferred. It simply cannot be foreseen
who in future may be involved in the network. I can also not predict who may simply have a look at
the transaction data via a view function in the interface. Consent is probably not a path we can
negotiate for public blockchain.
Contract fulfilment could be a possibility, at least for certain participants, namely for the follow-on
transactions that this initial transaction triggers, and for those that receive it. Here, for example, the
processing of data can be justified on the grounds of a sales contract to be paid in Bitcoin – and then
of course the recipient of the transaction is permitted to see the corresponding data, and the
initiator of a transaction is also allowed to. But this does not include the nodes or the miners.
As a result, in almost all cases, we will be dependent on demonstrating legitimate interests. Here,
the balancing of interests is always required. And as long as only the data of participants is
processed, we can go a long way with this. Here there are also obligations to inform which, although
they may not be so easy to fulfil, are nonetheless manageable. But, to re-emphasize, this only
applies in instances where purely the data of the participants is processed. As soon as I designate as
payload third-party data from people that have nothing to do with the operation of the blockchain
or the transaction, it becomes more difficult to argue legitimate interests. From a data protection
perspective, this is certainly not a trivial issue.
Data minimization
The principle of data minimization means that data can only be processed as long as it is appropriate
for the purpose and it must be limited to the minimum necessary for this purpose. There are rules
called Privacy by Design and Privacy by Default. Privacy by Design means that, already during the
development of a system, it must be ensured that as little data as possible is required, and that this
data is handled in as data-protection friendly a manner as possible. Privacy by Default, the
data-protection friendly default setting, means that the system must be built so that anything not
absolutely required from the user is processed only if the user actively consents.
Here, we have the potential for conflict with blockchain. The redundant data storage in a distributed
network fundamentally contradicts the principle of data minimization. The principle of blockchain is
to distribute as widely as possible. This is a security aspect for blockchain. This is not particularly
welcome from a data protection perspective. Openness and transparency are also problems for
Privacy by Default and Privacy by Design, which represent quite the opposite precepts.
There are a range of ways of approaching the situation, depending on the use case. Under some
circumstances, it is possible to build systems so that no identifying data is in the blockchain. It is
possible, for example, to build in access limitations such as those in payment channels. Here,
multiple transactions are merged off blockchain, and then only one overall transaction is written into
the blockchain, so that attribution of the individual transactions is no longer possible. For consortium
blockchains, certain permissions can be given about who can read or write what, and when.
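The netting idea behind payment channels can be sketched as follows. This is a deliberately reduced model under stated assumptions: the function name and party names are hypothetical, amounts are integers in the smallest currency unit, and the signatures and funding steps that real payment channels rely on are left out entirely.

```python
def settle_two_party_channel(payments, party_a, party_b):
    """Net off-chain payments between two parties into one settlement.

    payments is a list of (payer, payee, amount) tuples exchanged off-chain.
    Returns a single (payer, payee, amount) tuple to be written on-chain,
    or None if the payments cancel out exactly.
    """
    balance = 0  # positive: party_a owes party_b; negative: the reverse
    for payer, payee, amount in payments:
        if payer == party_a and payee == party_b:
            balance += amount
        elif payer == party_b and payee == party_a:
            balance -= amount
        else:
            raise ValueError("payment does not belong to this channel")
    if balance > 0:
        return (party_a, party_b, balance)
    if balance < 0:
        return (party_b, party_a, -balance)
    return None


off_chain = [("alice", "bob", 300), ("bob", "alice", 100), ("alice", "bob", 50)]
print(settle_two_party_channel(off_chain, "alice", "bob"))
```

Only the net result reaches the blockchain. The number, size, and timing of the individual payments stay with the two parties, which is what makes this attractive from a data minimization perspective.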
But what we must remember is that the blockchain use case does not in itself provide justification
for the distribution of data. This always needs to be viewed in conjunction with the potential risks.
And here, I mean the risk for the data subject – so the data protection risk. This means that if you
have a use case that contains medical data, justification for using a public blockchain solution will be
more difficult than a relatively trivial use case like cryptocurrency transactions.
The right to deletion or the right to be forgotten
With the right to deletion and the right to be forgotten, we have a major problem with blockchain.
Blockchain simply does not provide for the option that anything should be deleted – this is also a
security feature of blockchain. We specifically do not want anything to be deleted, because that
would represent a manipulation. Data protection law sees this differently: Everyone must have the
right to be deleted from publicly accessible registers, databases, and so on.
The immutability of blockchain, as I mentioned, is one of its security mechanisms. So how can deletion be
achieved? Zero-knowledge proofs can be one method. If only assignment data, that is, data that merely
points to a record held elsewhere, is written into the blockchain, and the link to the off-chain data is
later broken, this is a good argument that something like deletion has taken place. As soon as the
personal data itself is written into the blockchain, for example in the form of public keys, it is problematic.
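One way to picture the "break the link" approach is a salted hash reference: only the hash goes on-chain, while the personal data and the salt sit in an ordinary, deletable store. The sketch below uses this technique rather than a zero-knowledge proof, and everything in it is an assumption made for illustration: the list standing in for the blockchain, the dictionary standing in for the off-chain store, and the function names. Whether an orphaned hash still counts as personal data is a legal question the code does not settle.

```python
import hashlib
import secrets

chain = []      # append-only list standing in for the blockchain
off_chain = {}  # deletable store: reference -> (salt, personal data)


def record(personal_data: str) -> str:
    """Store the data off-chain and append only a salted hash to the chain."""
    salt = secrets.token_bytes(16)  # random salt prevents guessing the input
    reference = hashlib.sha256(salt + personal_data.encode("utf-8")).hexdigest()
    chain.append(reference)
    off_chain[reference] = (salt, personal_data)
    return reference


def verify(reference: str) -> bool:
    """Check that the off-chain record still matches the on-chain reference."""
    entry = off_chain.get(reference)
    if entry is None:
        return False  # link is broken: nothing left to match against
    salt, personal_data = entry
    digest = hashlib.sha256(salt + personal_data.encode("utf-8")).hexdigest()
    return digest == reference


def erase(reference: str) -> None:
    """Delete the off-chain data and salt; the chain itself is never modified."""
    off_chain.pop(reference, None)


ref = record("Alice Example, born 1980-01-01")
print(verify(ref))   # the link between chain and data exists
erase(ref)
print(verify(ref))   # the reference remains on-chain but points to nothing
```

The chain stays intact and verifiable as a sequence of references, while the information needed to tie a reference back to a person is gone once the salt and the record are deleted.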
Transfers to third states
Then, we have transfers to third states, which is also a big topic for public blockchain and which I will
just briefly touch on here. Anyone can participate, which means that people from outside the EU are
also involved. If we assume that a public key is personal data, then – again using Bitcoin as an
example – the operation of or participation in the blockchain is always connected with an
international data transfer. Data protection law requires that such a transfer be
safeguarded. There are several possibilities: There
are the standard contractual clauses of the European Commission. It may be possible to implement
these as T&Cs, so no signature is needed. Then, we have the EU-US Privacy Shield. If the transfer is
going to the USA, then the recipient in the USA must be registered and must have committed to
these principles, so this also represents legal uncertainty. Then you would need to use the T&Cs or
some kind of standard contract to obligate all participating nodes and miners to agree to certain
contractual commitments.
Achieving data protection compliance in blockchain projects
As we have seen, there are a number of issues when we combine blockchain and potentially
personal data. Although blockchain technology offers the advantages of transparency and
immutability, it is exactly these characteristics that can lead to conflicts with data protection law.
The developers of blockchain projects should therefore carefully analyse the kind of data intended
to be stored in the blockchain, and weigh up the advantages and disadvantages of the type of
blockchain to be used. Certainly, the principles of data minimization and mechanisms for ensuring
the anonymization of personal data are essential elements to consider.