Over the last few months, we have been creating our own privacy-preserving software license, the Memri Privacy Preserving License, based on the Mozilla Public License (MPL). The license adds a privacy clause to protect our users and to make sure that we take responsibility for the software that we put out in the world. This is an early version of the license, and we are actively looking for feedback on both its moral basis and the implementation choices made so far.
You can find the latest version as a diff to the MPL here. To facilitate a fruitful discussion, we describe below the reasoning behind our biggest additions to the MPL.
As background: in general, a software license is an agreement between two parties: the original creator of the software, and a party extending or using the software in its own products. In this license we refer to the latter party as “You”. Moreover, we introduce a third party: the “Subject”, which is the end user of the software, who may interact with the software and thereby expose their personal data to it.
1.1. Adequately Encrypted | Explanation |
means encrypted using encryption standards generally accepted for encrypting sensitive (personal) information, whereby the decryption keys may only be available to Subject. | We make a distinction between encrypted and unencrypted information. The main idea here is that only the Subject (read: the user) has access to the keys. Third parties may still provide services like storage to users, as long as these parties cannot access the plaintext data. |
1.2. Aggregate Information | Explanation |
means all information that is generated by combining Personal Information about a group of individuals, that no longer directly contains such Personal Information, and which was created for another purpose than extracting Personal Information from that information. |
Before explaining what this is, let's make this clear: users can always opt out.
Many applications need to compute user statistics, grouped crash reports and other aggregated information. Making sure that aggregated information does not contain any information about users in the widest sense is a very hard problem, and proving from an information-theoretic perspective that no information is leaked at all can be impossible in practice. Therefore, we allow parties to generate aggregate information. As we will see later, what we do not allow is parties actively trying to extract information about individuals from this aggregate information. |
1.15. Personal Information | Explanation |
means all information related to or generated by a Subject, generated by the interaction between a Subject and a computer system by means of input devices and/or sensors, or generated by Processing Personal Information, excluding any Aggregate Information. | Many definitions of personal information are too narrow in our eyes. They often address direct sources of information, but forget to include information that can be deduced about you based on your personal information. An example is all the information that Google includes in its so-called "shadow text" (see The Age of Surveillance Capitalism), which you therefore do not receive through the GDPR. Our aim here was to have the broadest definition of personal information possible. |
1.16. Process / Processing | Explanation |
means any action, whether performed by a human or by a computing device, involving Personal Information. Such actions include, but are not limited to, storing, retrieving, viewing, displaying, copying, removing, editing, and showing. | Here we aim to define everything third parties may want to do with your data. We are not going into restrictions just yet. |
1.19. Static Data Set | Explanation |
means a fixed amount of data that is provided at one, immediate point in time. | Definition of a fixed-size dataset (e.g. "these 10 photos"), as opposed to data that streams in over time (e.g. "all my e-mails, including incoming ones"). |
1.20. Subject | Explanation |
means the person that is a user of Your Covered Software. | This defines the user we are trying to protect with the license. |
1.21. Unencrypted | Explanation |
means not Adequately Encrypted. | Definition of unencrypted data. Note that data that is encrypted, but not adequately, counts as unencrypted. |
2.1. Grants | Explanation |
Provided that You comply with all the terms of this License, each Contributor hereby grants You a world-wide, royalty-free, non-exclusive license: | This is an obvious but essential part of our license. It establishes that anyone who uses or extends the software covered by this license has to comply with the privacy restrictions that the license describes for its users. |
4.1. Protecting Personal Information | Explanation |
Personal Information is not subject to ownership, by either You or the Subject. Instead, privacy is a fundamental human right. Determining if and how Personal Information is Processed is an essential part of privacy. This clause aims to guarantee that You respect this fundamental human right.
You must ensure that Personal Information is Adequately Encrypted wherever possible. This clause 4 does not apply if and in so far as You are the Subject. |
This is the start of our privacy preserving restrictions. First of all, we are trying to set the tone. We believe that as a user, your data should always be yours. You have control over it, and no one else does. If other parties want to use your data, it should be encrypted where possible.
This definition is not airtight from a legal perspective. Especially the “where possible” leaves some room for interpretation. Our aim here was to have a definition that makes sense from both 1) a practical perspective, and 2) a privacy perspective. It aims to ensure that third parties do everything that can be expected from them to protect your data, but it also makes sure users don’t expect the impossible. If some encryption algorithm has a bug, it is unreasonable to expect parties to immediately know about that. |
4.2. Processing Adequately Encrypted Personal Information | Explanation |
You may Process Personal Information by means of Covered Software, or by means of a Larger Work, without obtaining the prior authorization of clause 4.3, only if and in so far as necessary for providing Your service to the Subject, and provided that this Personal Information is Adequately Encrypted. In all other cases, the terms of clause 4.3 apply. | This section establishes that parties may provide services, such as storage, on your data as long as it is in encrypted form. |
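The split between clause 4.2 and clause 4.3 can be read as a small decision rule. The sketch below is only our illustration of that reading and carries no legal weight; the function name and its parameters are hypothetical and do not appear in the license.

```python
def requires_prior_authorization(
    adequately_encrypted: bool,
    necessary_for_service: bool,
    you_are_the_subject: bool = False,
) -> bool:
    """Return True if Processing needs the authorization of clause 4.3.

    Hypothetical helper: the three flags are our simplification of the
    conditions named in clauses 4.1 and 4.2.
    """
    # Clause 4.1: clause 4 does not apply if and in so far as You are the Subject.
    if you_are_the_subject:
        return False
    # Clause 4.2: no prior authorization only when the data is Adequately
    # Encrypted AND the Processing is necessary for providing the service.
    # In all other cases, the terms of clause 4.3 apply.
    return not (adequately_encrypted and necessary_for_service)
```

For example, a storage provider that only holds ciphertext for the Subject falls under clause 4.2, while using that same ciphertext for any purpose other than providing the service falls back to clause 4.3.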
4.3. Processing Unencrypted Personal Information | Explanation |
You may not Process Unencrypted Personal Information by means of Covered Software or by means of a Larger Work, unless You obtain prior authorization from the Subject. You will ensure that such authorization is always:
(a) For a specific period of time;
(b) For a specific, pre-determined, communicated and unchangeable
(c) Based on clear and understandable information about the specific
(d) Based on a separate agreement, and not concealed in general terms
(e) Accompanied by an offer to fairly compensate the Subject for the
The specific period of time listed under clause 4.3 (a) may be an unlimited period of time only if and in so far as the Personal Information is a Static Data Set. If the Personal Information is not a Static Data Set, the specific period of time may not be longer than two years, which term may be extended for another period of two years. Such extension may not occur automatically, and requires renewed authorization in accordance with this clause 4.3. The terms of this clause 4.3 apply to Processing Adequately Encrypted Personal Information for other purposes than the purposes described in clause 4.2 as well. |
With this section we aim to restrict what parties can do with your data if you allow them to use it in unencrypted form. We do not want to protect users against themselves by saying that no one can ever use your unencrypted data. If you want to lend out or give away your information, for instance for a good cause, that should be your choice. At the same time, we do want to prevent situations that are not acceptable under any circumstances. Our aim here has been to provide a set of clear rules that cannot easily be misused.
In all cases, when parties use your unencrypted data, it should be clearly communicated what the data is used for and for how long it is used. Also, users should be “fairly” compensated for the value that is derived from their data. Again, we leave some room for interpretation here, because calculating this value exactly is very subjective in practice. However, it makes situations less likely where companies earn billions of dollars by selling products that are built on user data while users get nothing. This section has an exception for fixed datasets in terms of time constraints. This allows parties to do research or create models (aggregate information) over data while maintaining reproducibility of their work. This exception applies only to the time constraint; the other constraints still apply. |
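To make the time constraint of clause 4.3 (a) concrete, here is a minimal sketch of how the period of a single authorization could be checked. It covers only the time rule, not the other criteria of clause 4.3. The `Authorization` class, its fields, and the helper names are our own assumptions for illustration and are not part of the license.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MAX_YEARS = 2  # clause 4.3: limit for data that is not a Static Data Set


@dataclass
class Authorization:
    """Hypothetical record of one authorization given by a Subject."""
    start: date
    end: Optional[date]      # None models "an unlimited period of time"
    static_data_set: bool    # True if the data is a Static Data Set (1.19)


def add_years(d: date, years: int) -> date:
    """Same calendar day `years` later; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def period_is_allowed(auth: Authorization) -> bool:
    """Check only the time constraint of clause 4.3 (a)."""
    if auth.end is None:
        # An unlimited period is allowed only for a Static Data Set.
        return auth.static_data_set
    if auth.end <= auth.start:
        return False  # not a meaningful period
    if auth.static_data_set:
        return True
    # Streaming data: the period may not be longer than two years.
    return auth.end <= add_years(auth.start, MAX_YEARS)
```

In this sketch an extension is modelled as a new `Authorization` that the Subject explicitly grants, which matches the requirement that extensions never happen automatically.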
4.4. Requirements for third party access | Explanation |
If You want to provide access to Personal Information to a third party,
then You must ensure that the Subject is a party to the transaction
(whether that transaction is for a fee, or not) with regard to their
Personal Information, so that the Subject can authorize that access.
The criteria for authorization listed in clause 4.3 apply in full. In addition, you must ensure that the third party complies with the obligations of clause 4.3 and 4.4, even if the third party does not use Covered Software or a Larger Work to Process the acquired Personal Information. This Clause does not apply if and in so far as the Subject initiates the process of providing access to such third party, for instance by sending a limited Static Data Set to a recipient of their choice. |
This section defines what constraints apply when parties want to sell your data to other parties. This is allowed if the user authorises it, but the same privacy constraints apply to the buying party. |
4.5. Revoking authorization | Explanation |
If a Subject does not extend the authorization previously given, You will delete all their Personal Information and after deletion You must send proof of deletion to the Subject and subsequently delete such proof. | This section establishes that, by default, access is revoked once the period for which users lent out their information ends without being extended, and it ensures that parties delete the user's data when access is revoked. |
4.7 Restrictions for Processing Aggregate Information | Explanation |
If you want to Process Personal Information to generate Aggregate Information, the terms of clause 4.3 apply. In addition:
You may only Process Personal Information to generate Aggregate Information if and in so far as it’s reasonable to assume that the chance of extracting Personal Information from the Aggregate Information is negligible. Creating Aggregate Information may not serve the purpose of obtaining more Personal Information about a Subject or extracting Personal Information from such Aggregate Information. Furthermore, You must do everything in Your power to prevent anyone, including Yourself, from trying to extract Personal Information about a Subject from Your Aggregate Information or Your Inferences. If you want to provide access to or sell Aggregate Information to a third party, You must ensure that the terms of clause 4.7 (b) and (c) apply in all material respects to such third party, even if the third party does not use Covered Software or a Larger Work to Process the Aggregate Information. Provided You comply with the terms of this clause 4.7, you may use Aggregate Information to generate Inferences. |
This section restricts how parties can generate and use Aggregate Information. The main point here is that parties can make models and statistics over user data, but the purpose of doing that can never be to learn more about individual users. |
4.8 General Restrictions on Processing Personal Information | Explanation |
You may not Process Personal Information by means of Covered Software or by means of a Larger Work, if the purpose of such Processing is:
(a) Surveillance; (b) Tracking; (c) Influencing or recording political views; (d) identifying, or obtaining information about, other persons than the Subject. |
This section constrains what the purposes of processing personal information may be. We exclude some obvious activities that do not align with any privacy values. |