GDPR & Your website - Guide to privacy and peace of mind

Updated: May 30, 2018

One day, you're a carefree blogger. The next, you're suddenly dealing with this big, looming thing called GDPR. The EU has introduced a new privacy-focused regulation, GDPR, and it dictates important privacy, security and data transparency requirements for websites handling personal data. You're asking yourself, does this affect me? And you're worried. Today, this article will help you better understand who, what, when and how, and hopefully give you both the knowledge and the tools to become a carefree blogger once again AND be merrily compliant.

Now, the one extra question that you may be asking yourself is: why am I publishing this only now, AFTER the regulation came into effect? Well, the answer is, believe it or not, most tools and services out there released GDPR-compliant updates only in the past week or so, and that finally allowed me to put this guide together. Let's see what gives.



Table of contents


  1. Disclaimer
  2. GDPR in 60 seconds
    1. What if you are processing personal data?
    2. It's not about the fines, either
  3. Follow the data flow
    1. Connection to a website
    2. Loading a page
  4. Google Analytics
    1. Google Analytics & personal data
    2. Google Analytics & local storage (cookies)
    3. IP address anonymization
    4. What if you want to collect some data (or use cookies)?
    5. Additional Google Analytics reading
  5. Google Custom Search Engine
  6. Google Adsense
  7. ShareThis buttons
  8. Matched content
  9. Other types of data
    1. E-commerce forms and buttons (PayPal)
    2. User registration
    3. Comments
    4. Email (and mailing lists)
    5. Embedded videos (Youtube)
    6. Embedded scripts
  10. Content Management Systems (CMS): WordPress
    1. Data collection = plugins
    2. Google Analytics
    3. Comments (disqus)
      1. Disqus & WordPress database integration
    4. General script control
      1. Wp_enqueue_script
      2. Wp_deregister_script
    5. Session cookies
    6. WordPress 4.9.6 - GDPR changes
  11. Data backups
  12. Encryption tools
  13. Documentation
  14. Conclusion
  15. Resources

Disclaimer

Let's start with some important clarifications:

I am not a lawyer, and you should not interpret this article as legal advice any which way.

I might be wrong or deluded, so don't just blindly trust the information below.

This article is primarily intended for individuals running websites and blogs, not companies.

GDPR in 60 seconds

If you haven't already heard or read about GDPR, here's the briefest of versions. General Data Protection Regulation (GDPR) is a new EU regulation on data protection and privacy. It came into effect on May 25, 2018. It is designed to provide additional transparency and choice around personal data handling of EU/EEA citizens and residents. The regulation requires businesses to introduce additional safeguards and clarifications on how they handle, store and use personal data. The emphasis is on personal. Not personal = not a problem.

GDPR applies to anyone handling, storing or using personal data. So if you're about to ask whether the regulation applies to you (you, the individual with a small blog on cooking, travels or IT repairs), the answer lies in the following question: do you handle, store or use personal data of EU/EEA citizens and residents?

The answer to that question is NOT trivial.

What if you are processing personal data?

Before we can say yes or no, let's assume a quick yes for a moment.

GDPR does not prevent data processing - it requires transparency and additional protection. In other words, you need to be able to justify your reasons for using personal data, explain why and how you do that, and have mechanisms in place to protect the data. These requirements can be broadly defined as:

  1. Need to know - if you're handling data, there should be a reason.
  2. Privacy - if you are handling data, try to make it less personal.
  3. Security - if you are handling data, it needs to be secure.

Reading online, I've noticed a lot of people getting really upset around the data handling. They immediately assume that data handling is wrong and that it should be avoided. The right approach is - excessive and unnecessary data handling should be avoided.

If you need certain pieces of personal data, you need them. That's fine. If you have legitimate business interests, you exercise them. The thing is, GDPR came into being because a lot of businesses were playing with personal data WITHOUT really having a true, genuine, core business need - hence all these privacy scandals you've been reading about. Say you have a plumbing business. Do you need to know the age or social interests of your customers? No. But then, if you sell T-shirts, do you need to know their sex and possibly age, and their physical address to ship the merchandise? Yes, you do.

If you do handle the data, then make sure you are transparent about it (need to know). And then, protect it. In this article, I will use concepts like anonymization (making personal stuff non-personal) and encryption (making personal data less visible/accessible).

It's not about the fines, either

This is another hot element of the regulation. Hefty fines. The numbers are scary. But focusing on them is the wrong approach. You do not check the law and the associated punishment and then decide whether you should do something right or wrong. You don't abstain from theft just because the prison sentences might be long. You don't do it, because it's wrong. Same here. You should not obey GDPR because it comes with fines. Look at it as an opportunity to create a better, transparent online presence.

That process can be painful and costly, so it all goes back to our original question - personal data. Are you handling it?

Follow the data flow

There are several levels of clarification needed to answer the question above:

  1. What classifies as personal data?
  2. Does your website have mechanisms in place that collect personal data?

The answer to the first sub-question is also not trivial. Personal data is anything that can identify a person. Like most legal terminology, the definition is vague. The best answer is, if you are not sure, or if you have a doubt, it's best to assume that a certain "ambiguous" piece of information can indeed be used as personal data.

A good example is the IP address of a computer. In technical terms, this is like your phone number. It's a number (address) that allows computers to communicate with one another, and it uniquely identifies your device. So yes, it can be, under certain circumstances, used to identify a person, the same way your phone number can be traced to your contract or your bill or similar.

It is also an inseparable piece of Internet communication. And this is where the battle between technology and law comes into play. Which is also why privacy regulations are (or need to be) ambiguous. This creates a headache and a worry, especially among less tech-savvy people. One day, someone just wanted to share their food recipes. The next, they need to think about IP addresses.

This brings us to the second sub-question. Some (maybe most) bloggers will not have set up their own infrastructure, and will be using a platform created by someone else - a blogging platform, a website built by a contractor, etc. They may not even know what is happening in the background.

And this is why we must follow the data trail.

If you want to figure out whether GDPR applies to your site/blog - and if it does, what you need to do to make sure you are compliant - then let's follow the journey of Internet traffic - from a curious user to your site.

Connection to a website

Without going into technical details, when you go to a website, say dedoimedo.com, what happens is, your browser uses the Internet address book (called DNS), to figure out where dedoimedo.com can be found on the wider Internet, and then contacts this address. If everything is configured ok, a piece of software called Web server will accept this contact request and return information back in the form of web pages, like the one you are reading right now.

When you initiate a connection to a Web server, your browser sends some necessary information, including your IP address, your operating system, your browser version, and some additional data. Each browser has its own so-called signature (user agent), which helps the Web server provide the best content.

Web servers log connections into their data logs. Quite often, this is a must for: 1) security reasons, because you need a trail in case of hacking or fraud; 2) technical problems; 3) Web traffic balancing; 4) compliance and auditing.

We mentioned the IP address. So we know there's already one piece of personal information that is being logged with the server, so GDPR comes into play. But wait. We need to fully understand if this really applies. Hence, the following questions:

The reason for these questions is that there are broadly TWO types of Web services available to people. They could have their own servers or they could be using someone else's servers. The latter is known as hosting, and you could be paying money to a hosting provider to set up and maintain your Web server for you. Very often, these are shared services. You could be hosting with GoDaddy, Digital Ocean, Media Temple, etc.

If you have your own server, things are more complicated. But it also probably means you are a tech-savvy person, because setting up a Web server is not a trivial thing, at all.

If you are paying money to a hosting company to give you the infrastructure to set up your own blog, then the full responsibility for GDPR is not just yours. Your hosting provider also has a responsibility, because they have control over the infrastructure.

If you have no access to the server logs, this is not your concern. If you do have access - some hosting providers do allow limited access to their users - do you have any control? Quite likely, you will not be able to remove any data from these logs or change what gets logged and how. Again, this is primarily for security reasons, and that's fine.

But then, are you using the Web server logs?

You may not be able to create server logs - or decide what goes into them -  but your hosting provider might have given you read access. Some bloggers use automated tools like AWStats to analyze Web server statistics. This open-source log analyzer can read and parse data from Web server logs to create hourly, daily, weekly, monthly traffic graphs and charts, show the number of visits and visit duration, domains, countries, top-hitting traffic, etc. Among other things, AWStats will also read and parse IP addresses.

If you are doing so, then you are processing personal data - and so you need to be GDPR compliant.

AWStats is a good example, because it is a popular tool. But it also opens a whole range of follow-up questions. If you are processing personal data, what safeguards do you have in place around data protection? Do you share your site statistics publicly?

AWStats is simple and friendly - but it stores data as plain text, for instance. This can be a problem, because GDPR stipulates enhanced privacy and data protection mechanisms. We mentioned anonymization and encryption, and AWStats does not satisfy these two.

So if you're using a tool like AWStats (or similar), you would need to see if it has a (relatively easily) configurable option to de-select certain types of information, like the IP addresses, and whether the parsed data can be kept in encrypted files. If not, you should NOT analyze the Web server logs, even if you have access to them, because you are then moving part of the GDPR responsibility from the hands of your hosting provider into your own. Not fun, but then we will talk about other types of site statistics a little later on.
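To make the "de-select the IP addresses" idea a bit more tangible, here is a minimal sketch of aggregate log statistics that never keep the client address. This is not an AWStats feature. It assumes the common Apache/Nginx "combined" log layout, and the function name countHitsByPath and the sample lines are made up for illustration.

```javascript
// Hypothetical helper: aggregate page hits from access-log lines while
// discarding the client IP address (assumes the "combined" log format).
function countHitsByPath(logLines) {
  // Matches: <ip> <ident> <user> [<time>] "<METHOD> <path> <protocol>" ...
  const pattern = /^\S+ \S+ \S+ \[[^\]]+\] "[A-Z]+ (\S+) [^"]*"/;
  const hits = {};
  for (const line of logLines) {
    const match = pattern.exec(line);
    if (!match) continue; // skip lines that do not look like requests
    const path = match[1];
    hits[path] = (hits[path] || 0) + 1; // only the path is kept, never the IP
  }
  return hits;
}

// Made-up sample lines (addresses are from documentation-only ranges).
const sample = [
  '203.0.113.7 - - [30/May/2018:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
  '198.51.100.23 - - [30/May/2018:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
  '203.0.113.7 - - [30/May/2018:10:00:09 +0000] "GET /about.html HTTP/1.1" 200 300 "-" "Mozilla/5.0"',
];
console.log(countHitsByPath(sample));
```

The result only contains paths and counts. Whether that is enough for compliance in your particular case is a separate question, but at least no address survives the parsing step.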

Loading a page

Once you're connected to a server, you will get the page and it will display in your browser. Now look at it from the other direction. A user will connect to your site, and it will show in their browser. If you do not know what your site is doing, you can now work top down, to try to figure out what's happening in the background.

Every displayed Web page has a visible portion (what you see) and an invisible portion - Web language directives and declarations and various scripts. Web pages are written in HTML/CSS and they often load scripts written in Javascript.

You can follow the loading of any which page in your browser using advanced developer tools in modern browsers. For instance, both Firefox and Chrome offer these tools. Right-click > Inspect. This will open a separate page full of tech stuff, including a console that can show the sequence of loaded resources. This is the backend of your pages, and it is completely different from the visible part.

Web console, sources
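If you prefer a list to clicking through the developer tools, the following sketch groups loaded resources by the host they came from. The helper name thirdPartyHosts and the sample URLs are made up; in a real browser console, the URLs would come from the standard Resource Timing API, as shown in the comment.

```javascript
// Hypothetical helper: given resource URLs and your own hostname, return the
// distinct third-party hostnames that the page contacted while loading.
function thirdPartyHosts(resourceUrls, ownHostname) {
  const hosts = new Set();
  for (const url of resourceUrls) {
    const hostname = new URL(url).hostname;
    if (hostname !== ownHostname) hosts.add(hostname);
  }
  return Array.from(hosts).sort();
}

// In the browser console, the URLs can come from the Resource Timing API:
//   thirdPartyHosts(
//     performance.getEntriesByType('resource').map(function (e) { return e.name; }),
//     location.hostname
//   );

// Stand-alone demonstration with made-up URLs:
console.log(thirdPartyHosts([
  'https://www.example.com/style.css',
  'https://www.google-analytics.com/analytics.js',
  'https://platform-api.sharethis.com/js/sharethis.js',
  'https://www.google-analytics.com/collect?v=1',
], 'www.example.com'));
```

Every hostname on that list is a third party that may be receiving data from your visitors, which makes it a reasonable starting point for the data trail.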

Please note: if you have never, ever done this, the task of deciphering your site's execution flow is not going to be easy. If you're not tech-savvy, you will struggle. However, if you do want to be GDPR-compliant, you do need to know what your site is doing to decide whether the regulation applies to you. If you already know, that's great, because then you can focus on the actual data.

I will give you my own website as an example. It's not the best or most representative example, because dedoimedo.com is largely a static site with no user interaction. I do not have any registered users, I do not use comments, and the data exchange portion is very limited. Still, I believe this is a useful exercise, because even a seemingly innocent site may have several data-collection vectors that need to be taken into account.

Looking at my page loading, normally, the flow (both visible and invisible elements) will be something like this (don't worry about specific details for the moment, we will talk about relevant ones in detail):

Google Analytics

Remember Web server logs? Well, some people may or may not have access, some may or may not have the technical knowledge needed to set up manual analysis tools. Which is where Google Analytics comes in. This is probably the most popular site traffic statistics tool out there. The configuration is simple - you just add a piece of code to every web page, and done! You win, and Google wins, because they like data.

Google Analytics offers several types of code. In the past, Google Analytics was based on a tool called Urchin, and the snippet of code you added to your pages reflected this. Then, several years ago, Google deprecated Urchin, although the functionality remains active (for now) for backward compatibility with websites still using it. The successor to Urchin is Universal Analytics. More recently, Google also introduced Global Site Tag, which offers additional features and site measurement capabilities. Then, there are some other methods, too. The three main pieces of code look as follows.

Urchin (urchin.js):

<script src="https://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
_uacct = "UA-xxxxxx";
urchinTracker();
</script>

Universal Analytics (analytics.js):

<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-xxxxxx', 'auto');
ga('send', 'pageview');
</script>

Global Site Tag (gtag.js):

<script async src="https://www.googletagmanager.com/gtag/js?id=UA-xxxxxx"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());

gtag('config', 'UA-xxxxxx');
</script>

These are the default snippets of code, and to make them work, you need to replace the xxxxxx letters with your Google Analytics code. By default, Google Analytics sets cookies and collects some information. So we need to understand: a) what are cookies? b) is the collected information personal?

To answer that question, we need to focus on two things: the code you run and the stuff that happens inside Google Analytics. The simple answer is: by default, Google Analytics is not configured for any analysis of personal data apart from the IP address. Indeed, Google has done a lot of work on making their tools GDPR-compliant, and this is reflected in the options available inside the Google Analytics admin dashboard. Let's start with that.

Personal data collection comes across several vectors - and if you use these, you need to notify your EU users and ask for their permission (consent) to collect the data. This means that the execution of the analytics script is dependent on user choice. In other words, if someone does not want their personal data collected, the analytics script will not run, and you will not have collected ANY statistics on that particular user.

So you have a choice:

  1. Collect personal data pending user consent; possibly collect less (theoretically as much as nothing from EU).
  2. Not collect personal data - in which case GDPR clauses do not apply.

Google Analytics & personal data

You can run Google Analytics as a non-personal site traffic statistics tool. This means you will have information on the number of visits and pages, real-time traffic and some other metrics, but you will not have anything on your users. Let's see what you need to do online in the Google Analytics dashboard first. Google has a very detailed FAQ that explains what you can and cannot do - and must do.

Google Analytics allows you to use advertising and remarketing = personal data. You can turn these off.

GA & remarketing

You need to set a retention policy (old data will be deleted). This means you won't be able to make analysis based on old historic data (before the set retention period), but again, this helps minimize the amount of data processing, which is what we want (not relevant anyway if you follow the next step below). Still, it's always best to err on the side of caution, and to be privacy conscious.

GA & retention

Then, there's also the User-ID step. Google already prohibits using any data that can identify people, but if you do use this feature, you will need to ask for consent. Hence, this feature needs to be set to off.

GA & User-ID

You should also make sure you don't have any custom dimensions and metrics that may potentially identify people. Because if you do have those, then you're moving from non-personal space into personal space. Again, it is fine if you have those, but that's a separate topic. For now, we're focusing on anonymous, non-personal data collection using Google Analytics.

Custom metrics

This covers the online part in the dashboard. But we still haven't talked about the IP address and the cookies. So, the former is clear. Now, cookies ...

Cookies are text files stored locally on your computer and used by websites to create continuity in your online activity. Think of cookies as small log files that tell websites about your last visit, your login, etc. It's a way for sites to be able to remember you, like when you come back and you're still logged in, for instance. Websites can set cookies, and they will have an expiration date, after which the cookies will be deleted and their contents erased, effectively erasing your activity history against a website.
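If you want to see what this looks like in practice, the sketch below turns a cookie string into a readable list. The helper name parseCookieString is made up, and the sample values are invented; they merely imitate the shape of Google Analytics cookies. Note that document.cookie only exposes names and values to scripts, not expiration dates.

```javascript
// Hypothetical helper: turn a "name=value; name2=value2" cookie string
// (the format returned by document.cookie) into a plain object.
function parseCookieString(cookieString) {
  const cookies = {};
  for (const pair of cookieString.split(';')) {
    const trimmed = pair.trim();
    if (!trimmed) continue; // ignore empty fragments
    const eq = trimmed.indexOf('=');
    const name = eq === -1 ? trimmed : trimmed.slice(0, eq);
    const value = eq === -1 ? '' : trimmed.slice(eq + 1);
    cookies[name] = value;
  }
  return cookies;
}

// In a browser console: parseCookieString(document.cookie)
// Stand-alone demonstration with made-up values:
console.log(parseCookieString('_ga=GA1.2.111.222; _gid=GA1.2.333.444'));
```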

Technically, some information stored in cookies could potentially identify people, which means, you need to ask users for consent to set cookies. If they do not agree, the cookies must not be set.

Google Analytics sets several cookies. So you need to ask the user for permission. However, it's a little more complicated than that. Even if you ask for consent for the initial set of cookies, when the website page loads, if Google Analytics is allowed to run, it will set cookies during the browsing session, so you actually do need to ask users for consent to not only set cookies but also run Google Analytics in the first place. This goes back to site statistics. If users don't consent, their visits won't be counted (even the non-personal part), and you will see less (EU) traffic counted in your statistics.

So if you want to be able to run Google Analytics without any personal data collection, you need to disable cookies, and we also need to handle the IP address part - in addition to all the online work.

Google Analytics & local storage (cookies)

The developer tools section for Google Analytics contains a wealth of information, including the settings to disable local storage, which means no cookies will be set at all. Indeed, for Universal Analytics, you need to change the following line:

ga('create', 'UA-xxxxxx', 'auto');

To this:

ga('create', 'UA-xxxxxx', { 'storage': 'none' });

And then, no cookies are set.

IP address anonymization

Google Analytics captures IP addresses = personal information. But you can anonymize the IP address. By doing this, the IP address, which looks something like aaa.bbb.ccc.ddd, will become aaa.bbb.ccc.0. Without going into technical details, this makes the address indistinguishable from all the other addresses that share the same first three segments (octets), so the user's Internet presence via IP address, as seen by Google Analytics, is anonymous. This is done by adding a line before the Google Analytics event is sent to the server. Add the following line after the 'create' line and before the 'send' line (Universal Analytics):

ga('set', 'anonymizeIp', true);

Something like this:

ga('create', 'UA-xxxxxx', { 'storage': 'none' });
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');

And now you have anonymous, non-personal data collection via Google Analytics.
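To picture what the anonymization amounts to, here is a tiny illustration of the aaa.bbb.ccc.ddd to aaa.bbb.ccc.0 transformation. This is only a sketch: the real masking is done by Google on their side, and the function name anonymizeIpv4 is made up for this example.

```javascript
// Illustration only: zero the last octet of an IPv4 address, mirroring the
// aaa.bbb.ccc.ddd -> aaa.bbb.ccc.0 transformation described above.
function anonymizeIpv4(address) {
  const octets = address.split('.');
  if (octets.length !== 4) {
    throw new Error('Expected a dotted IPv4 address: ' + address);
  }
  octets[3] = '0';
  return octets.join('.');
}

console.log(anonymizeIpv4('203.0.113.77'));  // 203.0.113.0
console.log(anonymizeIpv4('203.0.113.200')); // 203.0.113.0 (same result)
```

Two different visitors on the same network segment end up with the same recorded address, which is the whole point.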

The downside of this method is that you will have no knowledge of your user loyalty. Your bounce rate (how many people come to your site for just a single page and then leave) will go up to 100%, and the average session duration (time spent on site) will drop to 0 seconds. But that's fine, because that's EXACTLY what we want. We want aggregate, non-personal traffic data only.

GA, bounce rate & session duration
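Why 100% and 0 seconds? The sketch below uses a deliberately simplified model, which is my assumption and not Google's actual algorithm: hits that share a client ID form one session, a session with a single hit is a bounce, and session duration is the time between the first and last hit. Without local storage, nothing ties one page view to the next, so every page view looks like a brand new visitor. The function name and the sample data are made up.

```javascript
// Simplified model (an assumption, not Google's actual algorithm): hits that
// share a client ID form one session; a one-hit session counts as a bounce;
// session duration is the time between the first and last hit.
function summarizeSessions(hits) {
  const sessions = new Map();
  for (const hit of hits) {
    if (!sessions.has(hit.clientId)) sessions.set(hit.clientId, []);
    sessions.get(hit.clientId).push(hit.time);
  }
  let bounces = 0;
  let totalDuration = 0;
  for (const times of sessions.values()) {
    if (times.length === 1) bounces += 1;
    totalDuration += Math.max(...times) - Math.min(...times);
  }
  return {
    sessions: sessions.size,
    bounceRate: (100 * bounces) / sessions.size,
    averageDuration: totalDuration / sessions.size,
  };
}

// One visitor reading three pages, times in seconds.
// With a cookie, every hit carries the same client ID:
const withCookie = [
  { clientId: 'A', time: 0 },
  { clientId: 'A', time: 60 },
  { clientId: 'A', time: 150 },
];
// Without local storage, each page view gets a fresh, unrelated ID:
const withoutCookie = [
  { clientId: 'r1', time: 0 },
  { clientId: 'r2', time: 60 },
  { clientId: 'r3', time: 150 },
];

console.log(summarizeSessions(withCookie));    // 1 session, 0% bounce, 150 s
console.log(summarizeSessions(withoutCookie)); // 3 sessions, 100% bounce, 0 s
```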

What if you want to collect some data (or use cookies)?

In that case, you need to make the loading of the Google Analytics script dependent on explicit user consent. And for that purpose, you need some kind of a tool (another script really) that will block the execution of Google Analytics and the setting of cookies until (and if) the user has consented.

I decided to implement Civic Cookie Control, a tool specifically designed to offer cookie compliance that meets GDPR requirements. This tool, as well as several others, is listed on Google Cookie Choices. Having tested some of the available solutions, I decided this was the most adequate tool for my needs.

What I liked (apart from the obvious technical bits) about the Cookie Control is that the PRO version allows you to filter out non-EU countries, so the applet is not shown to users who are not affected by GDPR. In other words, you can selectively show your users the cookie consent applet. While a free version exists, it's more limited, and the PRO version is not cheap, which goes back to what I mentioned in the beginning: GDPR compliance can cost money and time, and require expertise that most individual bloggers or site owners may not have.

Cookie Control

I implemented Cookie Control on every page on my site - you can choose the wording, the styling, and create necessary categories. Most importantly, for each category, you have two functions: onAccept and onRevoke. The former will run some code (and set cookies) ONLY if the user has consented. And the latter will delete cookies and possibly stop scripts if there's no consent. The exact configuration of this particular tool is beyond the scope of this article.

So if you want to run Google Analytics + cookies on every page, then with a tool like Cookie Control, in the EU, the declaration would look something like (using Universal Analytics as our example):

onAccept : function(){
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-xxxxxx', 'auto');
ga('send', 'pageview');
},
onRevoke: function(){
window['ga-disable-UA-xxxxxx'] = true;
}

This also shows that we have the option to block scripts, if we need to, in addition to cookie control. We will see this example later on, when we discuss Content Management Systems (CMS) like WordPress.
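The "delete cookies" half of an onRevoke handler can be sketched as follows. This is a generic browser technique, not something specific to Cookie Control: a cookie is removed by setting it again with an expiration date in the past. The helper name expireCookieString and the .example.com domain are made up, and the cookie names assume the Universal Analytics defaults (_ga, _gid, _gat).

```javascript
// Hypothetical helper: build the string that, when assigned to
// document.cookie, expires (deletes) the named cookie.
function expireCookieString(name, domain) {
  let cookie = name + '=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/';
  if (domain) cookie += '; domain=' + domain;
  return cookie;
}

// Inside an onRevoke handler in the browser, assuming the default
// Universal Analytics cookie names (_ga, _gid, _gat):
//   ['_ga', '_gid', '_gat'].forEach(function (name) {
//     document.cookie = expireCookieString(name, '.example.com');
//   });

console.log(expireCookieString('_ga', '.example.com'));
```

The path and domain need to match the ones used when the cookie was set, otherwise the browser treats it as a different cookie and the original stays in place.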

The Cookie Control script is declared on every page on my site - it controls other third-party content, not Google Analytics, as I've opted for the anonymized, non-personal route. The added code on each page looks something like this (I put the code into a single line, because it makes for easier search & replace if needed):

<script src="https://cc.cdn.civiccomputing.com/8.0/cookieControl-8.0.min.js"></script>
<script>var config = { apiKey: '3d26a534992cc77791b5baf9525828de5a1213b9', product: 'PRO_MULTISITE', text: { title: 'This site uses cookies.', intro: 'Dedoimedo.com has been designed for ...

LINES REMOVED FOR BREVITY

... onRevoke: function(){}, thirdPartyCookies: [{"name": "ShareThis", "optOutLink": "https://www.sharethis.com/privacy/"}]}], initialState: "OPEN", position: 'RIGHT', theme: 'LIGHT'}; CookieControl.load( config );</script>

Additional Google Analytics reading

There's more in the resources section, but here are some additional things that you may want to consider. And by consider, I mean inform your users that they have the choice to make their browsing more private and to opt-out of data collection, not just on your site, but in general.

Google Analytics Opt-Out Browser Add-on

Google Custom Search Engine

Google Custom Search Engine (CSE) is a piece of code that you can add to your pages, allowing people to search for content on your site and/or globally. Visually, it will create a search box, and people can use it like any other Google search. Here, we need to ask ourselves: is there any personal data collection involved?

<script>
(function() {
var cx = 'partner-pub-123456789:123456789';
var gcse = document.createElement('script');
gcse.type = 'text/javascript';
gcse.async = true;
gcse.src = 'https://cse.google.com/cse.js?cx=' + cx;
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(gcse, s);
})();
</script>
<div class="gcse-searchbox-only"></div>

Now, the tricky part: Google CSE can also display ads - and this ties into Google Adsense, which we still haven't talked about. But in general, Google CSE can be considered an ad feature, because it is actually generated through the Google Adsense dashboard. So there's definitely a personal data aspect - and cookies. We know how to handle cookies. We need to talk about ads.

Google Adsense

Google Adsense is probably the most popular advertisement network in the world. Again, like most other Google tools, it's very simple to use. You just add snippets of code to your pages, and when users come by and visit, they may see ads displayed. Again, we need to ask, are we handling personal data?

There are two types of ads - non-personal (just random stuff) and personal (specifically targeted to a user based on their previous behavior and collected metrics). If you allow personal ads via Google Adsense on your site, by proxy, you handle personal data. But making Google Adsense scripts dependent on consent is much more difficult than doing the same for Google Analytics. One, the placement of ads is important, unlike analytics, where you just place the code anywhere, and the action is invisible to the user. Two, if the user does not consent, the ads do not run and show, and consent effectively becomes an ad blocker, which means less revenue for the publisher.

Google realized this, so they figured they needed to provide a CENTRAL tool that allows their users to configure how ads are served to the EU users. Indeed, shortly before the May 25 deadline for GDPR enforcement, Google introduced a separate section in the Adsense dashboard, which allows serving only non-personal ads to the EU users. Ideally, this is how EVERY tool and service should handle it; alas, it is not always possible.

Adsense, non-personal ads

As the EU user consent tab explains, you still need to let users consent to cookies. This also covers the search engine implementation, because some of the cookies are shared, since CSE can display ads. So going back to our question, we use non-personal ads in the EU, and we ask users for consent on cookies, which covers this section.
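For completeness: as far as I know, Google's publisher ad tags also document a per-page flag, requestNonPersonalizedAds, which asks for non-personalized ads regardless of the dashboard setting. I use the dashboard option, so treat the following as an untested sketch rather than my actual setup. The win variable is only a stand-in so the snippet can also run outside a browser; on a real page you would use window directly, before the ad code runs.

```javascript
// Sketch, assuming this runs before the AdSense ad code on the page.
// Outside a browser there is no window object, so a stand-in is created
// purely so the snippet can run as-is.
var win = (typeof window !== 'undefined') ? window : {};

// Create the ad queue if it does not exist yet, then request
// non-personalized ads (1 = non-personalized, 0 = personalized).
win.adsbygoogle = win.adsbygoogle || [];
win.adsbygoogle.requestNonPersonalizedAds = 1;

console.log(win.adsbygoogle.requestNonPersonalizedAds);
```

Cookies are still set for non-personalized ads, so the cookie consent part remains as described above.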

ShareThis buttons

When I visually revamped my website in early 2018, I added the ShareThis inline buttons script to my pages, as a way of increasing user engagement around content. Essentially, the functionality comes in two pieces, an HTML element that shows the buttons and the script that runs in the background:

ShareThis placement

<script src="//platform-api.sharethis.com/js/sharethis.js#property=5a3a29849d192f001374331f&product=inline-share-buttons"></script>

Unfortunately - and we will see this throughout the article - ShareThis only released an update for GDPR compliance on May 25, 2018, and it was too late for me. I could not wait till then. I decided to implement the functionality beforehand, and I did this using conditional loading via Cookie Control, similar to my Google Analytics (with cookies) example. So if a user consents, the buttons will be shown and cookies set, and if not, then they won't. I lose some sharing/engagement traffic this way, but the implementation is compliant.

Placing the script as is above into the onAccept function won't work. The syntax you need is slightly different, but it is applicable to any Javascript script really:

onAccept : function(){
    var file=document.createElement('script');
    file.setAttribute("type","text/javascript");
    file.setAttribute("src", "//platform-api.sharethis.com/js/sharethis.js#property=5a3a29849d192f001374331f&product=inline-share-buttons");
    document.getElementsByTagName("head")[0].appendChild(file);
},

In a more abstract manner:

function(){
    var file=document.createElement('script');
    file.setAttribute("type","text/javascript");
    file.setAttribute("src", "SCRIPT HERE");
    document.getElementsByTagName("head")[0].appendChild(file);
}

I also tested the new ShareThis functionality. I found it not to be ideal, because it comes with its own overlay window (which clashes with my cookie applet), and the granularity of control given to the user for consent is not sufficient (in my opinion). I am willing to accept the loss of sharing traffic for the sake of compliance. But this is another good example where the new regulation may interfere with your ongoing activities, like the bounce rate and session duration when you use Google Analytics without cookies.

ShareThis compliance settings

Matched content

This is very similar to the CSE and Adsense. Again, the code for matched content can be generated in the Google Adsense dashboard, so the same rules apply. You need to use non-personal ads in the EU and ask users for their consent for cookies. We're all set here.

Other types of data

My examples above only illustrate a small number of possible types of embedded content and scripts. Indeed, you may also have a shopping cart. And this is the next type of data that I'd like to discuss.

You may be interacting with your users in ways other than just showing them words and images. We've seen invisible scripts (they run in the background), we've seen cookies, and we also handled visible third-party content like search and ads and sharing buttons, but we didn't really process any deliberately provided personal information per se.

E-commerce forms and buttons (PayPal)

My example in this section will be PayPal. Indeed, PayPal is one of the most popular online payment services. The integration with web pages is dead simple. You copy and paste code onto your pages, and then users can use the Buy button or Donate button to send you money. They authenticate with their username and password, and a transaction is made. Data and money are exchanged between parties.

PayPal form

In GDPR terms, we have a lot of things to consider:

Compliance wise, the short answer is yes. Moreover, people need to agree to PayPal terms and conditions, which means that if they use the payment form or button on your pages, they have already agreed for PayPal to have access to some of their information. You are not privy to any of this data UNLESS a transaction is made, and then, only a small part of this data is shared with you - which you need for obvious business reasons, including tax declarations. The need is not in dispute (no different than any receipt or invoice), but then we will touch some more on the data handling.

Another thing that needs to be mentioned is that PayPal is a financial institution, so some data collection is mandatory, for security and fraud-prevention reasons. And even if you use their services (or any other e-commerce platform), you do not really have much say in the matter.

Once the form or button is loaded, a script will run - and this script will most likely log some important security-related information. The form will also set a cookie - not for your own domain but for paypal.com. And here, we have a non-optional situation. So far, we could ask the user for consent or use our scripts in a non-personal manner. But here, we cannot control this. There is a necessary cookie that is required for the form to work properly, and for PayPal to do their thing. Your only options are to allow the form to load (without any tampering or changes) or not to have it at all, but the latter may also mean losing all business.

This constitutes a situation where necessary data collection is done - for security reasons. Cookie control tools will also not be able to delete the PayPal objects even if you want to. The way browsers work, for security reasons, domains can only delete cookies for themselves and nothing else. So if you open a site page, it cannot tamper with cookies that belong to other sites. Makes sense, because you don't want anyone to be able to steal your session login data.
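
To see why, it helps to look at how a page "deletes" a cookie in the first place: it overwrites the cookie with an expiry date in the past. A small sketch follows. The helper name is made up for illustration, and the string it builds is what a page would assign to document.cookie in a browser:

```javascript
// Hypothetical helper: builds the string a page would assign to
// document.cookie in order to expire one of its OWN cookies.
function expireCookieString(name, path) {
    return encodeURIComponent(name) +
        '=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=' + (path || '/');
}
```

Writing document.cookie = expireCookieString('my_cookie') only affects cookies that the current site is allowed to set. Browsers ignore a Domain attribute that names an unrelated site such as paypal.com, so there is nothing a cookie tool on your page can write that would remove the PayPal cookies.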

So what do you do in a situation like this? Well, be transparent about it. You still use a service, but explain to your users how you do that, what data is collected - and best yet, refer them to the relevant privacy policy so they can read and understand what happens.

We answered two of the questions. The last one is - you now have some personal data, directly related to financial transactions performed on your site. This is necessary data (like necessary cookies set by the e-commerce platform), and you will need to use this data when you submit tax reports and such. In other words, you will be transferring this data from PayPal, which has its own security measures in place to safeguard data, to your own machine, probably a desktop, a laptop or a phone of some kind. What now?

Remember we talked about two major aspects related to data collection: anonymization and encryption. The first is around minimizing data collection (to the point of making it non-personal). We did that throughout most of the steps above, but we do end up with some data related to core business. We cannot anonymize this any further (that would probably raise a flag with tax authorities, for instance). No problem. So we need to protect this data.

Once you download a form or a report from PayPal (or any other e-commerce platform), you become responsible for it. So you must make sure the data is always safe. This means, if you accidentally lose it, or it gets stolen, you must make any retrieval of personal information difficult. This is where encryption comes into play.

Encryption is a mathematical process that renders human-readable data into random bits of text that have no meaning and that cannot be deciphered without a proper encryption key. Good encryption is near impossible to break with conventional computing methods. Moreover, to make things GDPR-compliant, you need to be the sole owner of the encryption key so that only you can open the data. In other words, if someone else comes into possession of this data, it will be useless.
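
To make the idea concrete, here is a rough sketch of passphrase-based encryption using the crypto module built into Node.js (so this runs on your own machine, not in a web page). The function names and the way the output is packed (salt, then nonce, then authentication tag, then ciphertext) are choices made for this example only. For real data, a ready-made tool is the safer bet.

```javascript
// Sketch only: AES-256-GCM with a key derived from a passphrase.
const crypto = require('crypto');

function encryptWithPassphrase(plaintext, passphrase) {
    const salt = crypto.randomBytes(16);                 // random per file
    const key = crypto.scryptSync(passphrase, salt, 32); // 32 bytes = AES-256 key
    const iv = crypto.randomBytes(12);                   // nonce for GCM mode
    const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
    const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
    const tag = cipher.getAuthTag();                     // 16 bytes, detects tampering
    return Buffer.concat([salt, iv, tag, data]);
}

function decryptWithPassphrase(blob, passphrase) {
    const salt = blob.subarray(0, 16);
    const iv = blob.subarray(16, 28);
    const tag = blob.subarray(28, 44);
    const data = blob.subarray(44);
    const key = crypto.scryptSync(passphrase, salt, 32);
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
    decipher.setAuthTag(tag);
    // final() throws if the passphrase is wrong or the data was altered.
    return Buffer.concat([decipher.update(data), decipher.final()]);
}
```

Only someone who knows the passphrase can turn the output back into the original bytes, which is the "sole owner of the key" point made above.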

Bottom line: if you keep an offline copy of data outside its original source (like PayPal online console), then encrypt that data. Open it only for the purpose of necessary exchange, like with an accountant, tax authorities, etc. The last step will sometimes require that you send personal data via email or physical mail. Again, this is necessary for legal purposes, and it is unavoidable. But that's fine, because the other businesses and entities handling personal data will also have their piece on GDPR compliance, and this includes your email provider, your accountant, your government. The idea is to minimize such actions and maintain high security (this is a vague term, but we will elaborate more on that). See the section on encryption further below.

User registration

Your site may have a user registration form. Normally, such forms will have at least a name and an email field. Data entered into the registration form will then be saved into some form of database. If you have this functionality on your site, you need to ask yourselves the following questions:

We go back to what we already discussed - anonymization and encryption. If you can make user data you handle less personal (without hurting your blog/site activity), then do it. If you keep this data somewhere, try to encrypt it. For most people, the answer to how one goes about encrypting a SQL database hosted with a shared provider is very complex. But. You are not alone in this game. Your provider has its own responsibility when it comes to their services: they need to be robust and secure.

Your additional GDPR responsibility comes into play when you handle user data. In other words:

If you struggle with this, you will need to hire a professional to do this for you (money, trust, beware scams). Otherwise, you may not be in the best position regarding GDPR compliance. It sounds tough and cruel, but think about it: if you're clueless around how your user data handling mechanisms work, would you really want people to use your services? Or put yourself in their shoes. Would you trust a site that has no 'idea' what they are doing with user data?

Then, the last piece: data transfer. If you possess personal data, you bear responsibility for its safekeeping. Selling data or giving it to other entities without an explicit permission from your users is a big no-no. The worst thing here is, you could 'accidentally' give away data without being aware. A service on your site might crawl or index your user database, and you wouldn't even know it.

The aspect of active user data collection is definitely more complex than just passive services you have enabled in the background. It requires far more effort. We will talk about this some more when we discuss Content Management Systems (CMS).

Comments

A similar logic applies to comments as to user registration. In essence, the two functionalities are identical. You will have users providing personal information. True, it's their choice to register and write comments, but it is your responsibility to let them know what you intend to do with the data. We will revisit this topic in detail a little later on.

Email (and mailing lists)

This is a very interesting aspect of data privacy, for many, many reasons. First, email is almost unavoidable when it comes to online sites/blogs. You will definitely have some kind of contact information, and people will use it to reach you. And with every email you receive, you will add new data points to your collection.

Emails contain a lot of personal data, too. A typical email will include:

Moreover, emails are normally not encrypted, so the message itself is readable if you possess the actual mail message. Wait. Don't panic. This sounds ominous, but it is not. While the messages are written in plain text, so to speak, most mail systems use additional encryption to establish the connection and send data. We will discuss this in a jiffy.

Okay, so what do we do now?

We will need to address two main topics: 1) receiving and sending email 2) storing email information.

First, you have no control over some of the data you receive in an email. Mail protocols dictate certain basic fields (for functionality and security). Furthermore, if you have a publicly available/accessible email address, you have no control over who contacts you, or why. That constitutes consent on behalf of the user.

The GDPR compliance comes into place when you initiate contact with other people via email. You need consent from your users to do so. If you have user registration and comments, then you will inevitably end up with a number of registered users, and you will have access to their email addresses.

The possession of this data does NOT give you the right to send your users information (most often marketing, advertising and promotion) unless they approve. Think about it. If someone signs in to write a comment, that does not mean they want you to spam them with your newsletter. They need to agree to the newsletter separately.

To make it simple: you need to use email for the intended purpose only:

And that's the black magic of email exchange. You can use mailing lists. You can use newsletters. You need to build these lists with your explicit user approval. Simple.
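
In practice, "explicit approval" means recording what each user agreed to and checking it before every send. Here is a minimal sketch. The data shape (a consents list per user) and the purpose names are assumptions made for this example, not something any particular mailing tool prescribes:

```javascript
// Hypothetical data shape: each user record lists the purposes
// the user has explicitly agreed to.
function mayEmail(user, purpose) {
    return Array.isArray(user.consents) && user.consents.includes(purpose);
}

// Returns only the addresses that opted in to the given purpose.
function recipientsFor(users, purpose) {
    return users.filter((u) => mayEmail(u, purpose)).map((u) => u.email);
}
```

Someone who registered only to comment would have no 'newsletter' entry in their consents list, so they would never show up in the newsletter recipients.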

The second part is: data handling. Email data is like any other personal data. You need to make sure that you keep it safe. It's very similar to e-commerce data. You will most likely have to keep emails, for legal and accounting reasons. Make sure you do that in a secure and auditable manner:

To sum it up:

Embedded videos (Youtube)

This is an interesting one. Youtube allows you to embed videos into your pages. Not bad. But there's a whole range of things that come into play here. Youtube videos combine various elements: there might be an element of tracking, which allows Youtube to offer video recommendations; there might be ads shown to users, again based on various preferences; there might be cookies; and more.

If you choose to embed videos, then you must understand in what way these videos could potentially be used to profile the users viewing them, and if there's an element of personal data involved, then you need to ask users for consent. Moreover, broadly speaking, there are two types of embedded videos - Flash-based clips (older) and native HTML5 videos (newer). This also makes a big difference.

A Flash-based video might look something like this:

<object width="680" height="412">
<param value="http://www.youtube.com/v/EWrJqUtYW7Y" name="movie" />
<param value="window" name="wmode" />
<param value="true" name="allowFullScreen" />
<embed width="680" height="412" wmode="window" allowfullscreen="true" type="application/x-shockwave-flash" src="http://www.youtube.com/v/EWrJqUtYW7Y"></embed>
</object>

An HTML version will look something like this:

<iframe width="560" height="315" src="https://www.youtube.com/embed/KHG6fXEba0A" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>

So the question is - how to make videos GDPR-compliant?

First, regarding Flash clips (specifically Adobe Flash Player). In general, Flash as a technology is probably not the best way moving forward, at least when it comes to online videos. For over a decade, this technology was hugely popular as the preferred method for embedding interactive multimedia content into web pages, but it has been gradually phased out in recent years, both because of security issues associated with the Flash player and because all modern browsers support native HTML5 video.

Flash clips may also set cookies - not just ordinary cookies - Flash cookies. They are similar in nature to regular Web cookies, but they are stored separately and used exclusively by Adobe Flash Player. And you may not have a straightforward way of controlling them using cookie control tools. So what you should do:

Then, this leaves us with the familiar aspects of data collection and cookies. Luckily, Youtube offers an enhanced privacy feature for sharing. When you click on the share option for any which video online, you have the option to embed the video. Then, you need to tick the box that reads Enable privacy-enhanced mode. This will generate a different embed code, and it will set no cookies.

Youtube, enhanced privacy mode

Privacy-enhanced code then looks like this:

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/KHG6fXEba0A" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>

And this way - given the fact Youtube (Google) no longer uses third-party ad serving and pixel tracking as part of the GDPR compliance, and we use the privacy-enhanced mode with no cookies, we now have user-friendly embedded videos. Otherwise, you need to ask for consent as before.
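
If you have many existing embeds, switching the host name by hand gets tedious. Here is a small sketch of the rewrite. The helper name is made up for this example, and it only touches URLs that look like standard Youtube embeds:

```javascript
// Hypothetical helper: rewrites a standard Youtube embed URL to the
// privacy-enhanced host; any other URL is returned unchanged.
function toPrivacyEnhancedEmbed(src) {
    const url = new URL(src);
    const isYoutube = url.hostname === 'www.youtube.com' || url.hostname === 'youtube.com';
    if (isYoutube && url.pathname.startsWith('/embed/')) {
        url.protocol = 'https:';
        url.hostname = 'www.youtube-nocookie.com';
    }
    return url.toString();
}
```

The result goes into the src attribute of the iframe, same as in the privacy-enhanced snippet above.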

Lastly, you may also have other non-standard, non-Youtube Flash clips on your pages. For instance:

<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://active.macromedia.com/flash5/cabs/swflash.cab#version=5,0,0,0" height="550" width="690">
<param name="movie" value="video.swf">
<param name="play" value="true">
<param name="loop" value="false">
<param name="quality" value="low">
<embed src="video.swf" quality="low" loop="false" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash" height="550" width="690">
</object>

There is no easy solution for these. I am not aware of any trivial ways to limit Flash clips or disable cookies by design. If they need to be available publicly, the best recommendation is to upload these videos to Youtube and then offer them with the new, privacy-enhanced embed options. It is possible that these videos cannot be used on Youtube or similar sharing platforms for various legal/content reasons. In that case, you may want to have a separate disclaimer, or remove such clips altogether.

Embedded scripts

You may also embed other content onto your pages, other than videos. For instance, polls, maps, etc. These will come in some form of script, most often Javascript (something.js). The best way to handle these, if you want or need them, is to ask users for consent, in a way similar to what we did with ShareThis buttons. Whatever tool you use, the onAccept function will include:

function(){
    var file=document.createElement('script');
    file.setAttribute("type","text/javascript");
    file.setAttribute("src", "SCRIPT HERE");
    document.getElementsByTagName("head")[0].appendChild(file);
}

Content Management Systems (CMS): WordPress

This section will probably be of most interest to most of the people reading this article. WordPress is the most popular website and blogging management system in the world. It is very easy to use, relatively easy to configure, and it offers a wealth of flexible, useful plugins. I will indeed demonstrate with WordPress, but do note that other CMS exist, like Drupal, Joomla!, Magento, phpWiki, and many more.

WordPress released its first GDPR-ready version, 4.9.6, on May 17, 2018. Wpbeginner.com released a very useful guide on GDPR on May 23, 2018, only two days before the regulation came into effect. My own testing and implementation of the necessary changes shows certain differences from this guide, and so I'd like to share in detail the changes and settings that I've made to bring my book-writing site, The Lost Words, in line with the regulation.

The Lost Words is a fairly simple WordPress site - no user registration. I had comments enabled using the Disqus plugin (instead of organic comments) and collected data analytics using the Google Analytics for WordPress by MonsterInsights plugin (which is also mentioned in the guide above). I disabled both these plugins prior to the GDPR enforcement date and implemented alternative methods for compliance, and I will explain exactly what and how.

So, the journey is the same as what we did earlier - only we will now be doing this with WordPress. Establish a connection to your site. Try to understand the data flow. Open the dashboard and examine your installed and activated plugins.

Data collection = plugins

If you're using one or more WordPress plugins, it is quite likely you are collecting data, possibly personal data. The simplest way to verify if your plugins are collecting data is to read their privacy policy, if one exists. Then, you may need a little bit of tech knowledge to understand better what happens. And do remember, the fact core WordPress is GDPR-compliant does not mean your plugins - or your activities - are.

In my case, I have several plugins installed. Some of these do not collect any personal data whatsoever. These are mostly security hardening tools, designed to make the website more robust. Some plugins have a configurable option to collect data (this is similar to Web server logs we discussed and Google Analytics options). Some definitely do collect personal data. So we have three groups:

Google Analytics

The Website traffic statistics collection via Google Analytics falls into the second group. We have already seen how to use Google Analytics in an anonymous, non-personal data collection way. Here, I will show you how to run the script without personal data but with cookies. We will combine the two.

I decided to implement the conditional loading of Google Analytics based on user consent via Cookie Control, as I've shown you earlier. Basically, we ask users to consent, and if they do, the script will run and the cookies will be set. We still use IP address anonymization. Cookie Control, the tool of my choice, actually has a helpful example for exactly this purpose, so this should be easy to set up. See earlier for the actual copy & pastable snippet of code.

WP, GA setup

Cookie control overlay

Why not MonsterInsights?

A simple matter of practicality and timelines. Originally, I actually had the Google Analytics for WordPress by MonsterInsights plugin configured for non-personal data collection (except cookies), even before they released their GDPR-ready update. Indeed, the plugin now comes with a separate EU Compliance tool that will configure everything in this regard (automatically). This addon costs money - and I've already invested money in a different cookie control solution (which also handles scripts).

EU compliance tool

Reading up on what it does, the plugin handles IP anonymization, disables UserID tracking and disables Author tracking (this is a custom dimension). You can handle these manually, BUT the advantage of doing this through the plugin is that the changes will only apply to your EU visitors. If you make the changes through your Google Analytics account, they will be global. Google Analytics for WordPress by MonsterInsights also integrates with cookie plugins. I do not know if there's any conditional loading of the script, though.

For me, the introduced changes were a little late - like most GDPR tools, they only happened a short week before the enforcement date, which is hardly enough to properly test. Moreover, given that I had to ask users for consent for cookies (although I could use no local storage option), then I could also load Google Analytics conditionally via Cookie Control, and I've already purchased the multi-site license for this.

So it's the matter of practicality really. You can use MonsterInsights (with some extra cost) and integrate with several cookie control plugins. Or you could invest your money in a different tool (like I did), and handle it this way, with the advantage that you can control additional scripts and cookies and not specifically Google Analytics, like say ShareThis buttons, as we've already seen.

Comments (disqus)

I used to run Disqus for comments. Unfortunately, Disqus released an update only on May 25, 2018, and this was too late for me. At this point, I had disabled comments on the blog. I decided I would re-introduce them at a later stage, after I have ascertained that either WordPress comments or Disqus are compliant.

Before the GDPR update for Disqus, I did try to 'intercept' Disqus using Cookie Control and its onAccept and onRevoke functions. This was a valuable lesson that we will discuss shortly. Now, Disqus has introduced the necessary GDPR measures. If you want to log in and post a comment, you first need to agree to their new privacy policy. If you're already logged in, you will not see the comments field until you agree. And you also have the option to opt out of tracking. As a user (someone using Disqus), the downside of this choice is that you will have only temporary session logins, and you will need to log in anew each time.

Disqus, new policy

Disqus & WordPress database integration

Do note that by default, Disqus will not write user comments to your database. You will need to manually configure that. If you do, the comments will be added to your database, and if you have had a non-personal database until now, it will now include personal data, and so you will need to treat it accordingly. For example, if you used to have non-encrypted database backups, now they will have to be encrypted. We will address this in the backups section below.

General script control

You may have additional plugins or scripts on your WordPress site, which may not have any GDPR accessories. If these plugins or scripts collect personal data, you will need to block their execution until the user has consented. This is similar to what I've outlined earlier with the ShareThis buttons. To that end, we need to talk about two WordPress functions - wp_enqueue_script() and wp_deregister_script().

Wp_enqueue_script

This function allows you to run scripts on demand. The usage is far from trivial, but if you know what script needs to run, you can wrap it in wp_enqueue_script and then place it into the onAccept section of your cookie control tool, like we did earlier.

The difficult part is actually finding the script that you need to run. This goes back to checking the Web console in a browser, and trying to figure out what your site is doing. There's no easy way around this, especially if you have plugins that behave in an odd, undocumented way. Justin Tadlock discussed this on his site way back in 2009, and it's an extremely valuable piece of advice. Your conditional script may look something like:

wp_enqueue_script( 'contact-form-7', wpcf7_plugin_url( 'contact-form-7.js' ), array('jquery', 'jquery-form'), WPCF7_VERSION, $in_footer );

Wp_deregister_script

This function does the opposite of enqueue; it removes/stops a script. Something like:

add_action( 'wp_print_scripts', 'my_deregister_javascript', 100 );

function my_deregister_javascript() {
    wp_deregister_script( 'contact-form-7' );
}
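
Both functions run in PHP on the server, while onAccept runs in the visitor's browser, so the consent decision has to travel between the two somehow. One common way to do that (an assumption here, not something the tools above prescribe) is a consent cookie that both sides can read. Below is a sketch of the browser-side check. The cookie name cookie_consent and the value accepted are made up for illustration:

```javascript
// Parses a cookie string of the form "a=1; b=2" (the format of
// document.cookie in a browser) into a plain object.
function parseCookies(cookieString) {
    const result = {};
    for (const part of cookieString.split(';')) {
        const index = part.indexOf('=');
        if (index === -1) continue; // skip empty or malformed pieces
        const name = part.slice(0, index).trim();
        result[name] = part.slice(index + 1).trim();
    }
    return result;
}

// 'cookie_consent' and 'accepted' are hypothetical names.
function hasConsent(cookieString) {
    return parseCookies(cookieString).cookie_consent === 'accepted';
}
```

In a browser you would call hasConsent(document.cookie) before injecting a script that was deregistered on the server side. Check what your cookie control tool actually stores, because the real cookie name and value will differ.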

Session cookies

WordPress (and your plugins and themes) may set session cookies. Again, we need consent, like before.

WordPress 4.9.6 - GDPR changes

There's no point repeating what's been said a million times before. Very briefly, WordPress now has additional tools that can help you manage user data. If you have registered users, you will be able to delete them, and there's also a template for privacy policy.

Privacy policy template

Erase user data

Data backups

We talked about backups earlier - around emails and databases. In general, there's a fairly good chance you will be backing up your site data (and your user data along with it). This is a healthy, recommended practice. You should have backups. By all means. What you need to make sure is: anonymization and encryption.

You should have covered the anonymization part already (by design). Encryption is the next step. Any which data backup you make should go to: 

The best way to achieve this is with encryption - and I'm not talking about the fact some backup providers (mostly cloud) offer encryption. Example: You might be backing up data to Dropbox. The service is secure, it offers 2FA and encryption. So if someone actually hacks into the Dropbox systems, they will not be able to read any data they might steal. But that is not enough. Because you do not control the encryption method.

The solution is to encrypt the backups with your own tool and key (password if you will), and then, for whatever reason, if someone has access to this data, they will not be able to access the actual contents. And we can finally discuss this technology called encryption.

Encryption tools

Well, we talked a lot about encryption above. Let's put it to some good use.

There are many ways to encrypt data. I will now elaborate on a few possible tools. Please note that the quality and security of encryption tools may differ. Also, different data types may require different types of encryption and standards. In some cases, you may have to use specific methods to be compliant. For ordinary bloggers, most tools should suffice. Some level of technical knowledge is required.

The most basic encryption method is to create a password-protected ZIP archive. This isn't foolproof, but a tool like 7-Zip can create encrypted files in its native 7z format (including filename encryption) using the AES-256 method, which is an encryption standard adopted by the US government. In Linux, you can use GPG (GnuPG), a free implementation of the OpenPGP standard, to encrypt files as well as emails. Several frontend tools are available.

You can also create encrypted containers - large files that you can then use as virtual folders to store many different files inside. To a casual observer, an encrypted container looks like a random file, something like Documents.bin, with a size of 603 MB, for instance. When you mount it (open it in a special program capable of reading and decrypting the encrypted format), it will be shown as a folder or a separate drive in your file manager, and you can interact with it like any other location on your disk.

Friendly GUI tools that can create encrypted containers include TrueCrypt and VeraCrypt, the latter being a fork of TrueCrypt, which was discontinued in 2014. This last piece of information may give you pause, as you may assume there's an inherent security risk in this. However, an independent audit of the program found no significant security risks (for now). Both these programs use several encryption standards, including AES.

VeraCrypt

Another Linux tool is Vault, available in Plasma-based distributions like KDE neon or Kubuntu, for instance. This program is integrated into the desktop and lets you create encrypted containers on the fly. The encryption backend is CryFS, which also uses AES-256 plus another cipher.

Vault

Documentation

The last piece, once you've made your site compliant, is to wrap things up. Document everything, and that also means writing a privacy policy. It does not have to be long or fancy. In fact, short and sweet is the best way forward. Be transparent about what you do and why, and explain the changes. If in doubt, always err on the side of caution. Give your users a choice rather than assume that it's ok. In the end, it's about transparency of your actions, anonymization of user data (if possible) and secure handling (encryption).

Finally, audits. Website layout and content can change over time. In fact, they always, inevitably do. You need to make sure you stay compliant. Every now and then, be it once a month, once a quarter or whatever cadence suits you, take a look at your Web stack. If things have changed, document them, and if they affect your users, reflect them in the privacy policy and/or the relevant tools and services your users interact with.
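
A periodic audit can start with something as simple as listing which outside hosts your pages pull scripts from. Here is a rough sketch. The function name is made up, and a regular expression is a crude stand-in for a proper HTML parser, so treat the output as a starting point only:

```javascript
// Rough audit aid: lists hosts of external scripts referenced in a page's
// HTML that do not belong to your own site.
function thirdPartyScriptHosts(html, ownHost) {
    const hosts = new Set();
    const pattern = /<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
    let match;
    while ((match = pattern.exec(html)) !== null) {
        // Resolve relative and protocol-relative URLs against your own site.
        const host = new URL(match[1], 'https://' + ownHost).hostname;
        if (host !== ownHost) hosts.add(host);
    }
    return Array.from(hosts).sort();
}
```

Feed it the saved source of a page and your own host name, then compare the list with the previous audit. Scripts that get injected later by other scripts will not show up in the static HTML, so the Web console in a browser remains the more complete check.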

Conclusion

And here we are, the end. I would like to believe this article has been useful. Simple, clear and practical. I tried to cover as much material as possible, avoid vague language, and provide real-life, relatable examples from my own blogging adventures. GDPR compliance is NOT a trivial thing. It will require mental and technical effort, it will require hours of work, and it may even cost you money, both in direct investment and post-compliance changes to your traffic. But you will have a nice, tidy site that is friendly and fun to your users.

There are no shortcuts. The most important thing in this journey is education. Ignorance breeds fear. But if you take time to understand what you're doing - and your site is doing - you will both enjoy your work and the outcome. This means being careful around quick promises and silver-bullet solutions out there. Don't rush, take your time, try to figure out the best, most elegant way to make your site compliant. Finally, if you have any feedback, suggestions or requests around this topic, feel free to contact me, and I will try to update this guide. Take care, children of the Internet.

Resources

In order of appearance and relevance (most also linked throughout the document):

GDPR topics:

EU GDPR Information Portal

Wikipedia article on GDPR

Website statistics and analytics tools:

AWStats

Google Analytics

Google Analytics Measurement options for Web pages

Policy requirements for Google Analytics Advertising Features

Google Universal Analytics IP Anonymization

Google Analytics Opt-Out Browser Add-on

Cookie control tools:

Google Cookie Choices

Civic Cookie Control

Types of cookies used by Google

Google Analytics Cookie Usage on Website

Google Universal Analytics Cookies and User Identification

Google Adsense, Custom Search Engine & Matched Content:

Google Adsense

Google CSE

Sharing buttons:

ShareThis opt-out page

E-commerce:

PayPal privacy policy

Embedded videos & Flash player:

Wiki page on Local Shared Objects (Flash cookies)

Adobe Flash Player settings manager

Youtube Embed videos & playlists

WordPress:

WordPress home page

WordPress version 4.9.6

The Ultimate guide to WordPress and GDPR compliance

Google Analytics for WordPress by MonsterInsights EU compliance plugin add-on

Disqus opt-out policy

How to disable WordPress scripts and styles

WordPress developers tools reference: wp_enqueue_script

WordPress developers tools reference: wp_deregister_script

Encryption tools:

7-Zip homepage

The GNU Privacy Guard

OpenPGP encryption standard

Wiki page on Advanced Encryption Standard (AES)

TrueCrypt (GRC page)

TrueCrypt audit report

VeraCrypt
