In response to the book So You Think You Have Nothing to Hide? (currently available only in Dutch) by Maurits Martijn and Dimitri Tokmetzis, I wrote a piece last September on making De Correspondent even more privacy-friendly (in Dutch only), in which I went over our own website with a fine-toothed comb.
In that article, I explained why De Correspondent uses services that track your data and what we could do about it. (Third parties use trackers, meaning software and other technologies, to monitor your online behavior. Cookies are one kind of tracker.) Nearly four months later, it’s time for an update, because a lot has changed.
Goodbye, Google Analytics
The most important step we’ve taken is that we’ve said goodbye to Google Analytics. We used this popular free service (free in the sense that we didn’t pay for it, although by using it we did give Google information about our visitors’ surfing behavior) to see which pages and features on our website work well and which ones need improvement, so that our readers and contributors get more out of the experience.
After searching for alternatives, we decided on the open-source Piwik platform, most importantly because we can host it on our own servers and thus have full control over the data. Because we run the software on our own servers, we own the data, which keeps it out of third-party hands. Our statistics are hosted on our own domain: stats.decorrespondent.nl. That means Google is no longer receiving information about you, our visitors.
Our first test with Piwik seemed to go well, but it soon became clear the platform couldn’t handle large numbers of visitors. The server hosting the software crashed at the drop of a hat. Farming it out to a hosting provider familiar with Piwik (Piwik PRO, the open-source company’s for-profit division, provides full-service configuration, setup, and management) would cost us about €15,000 per year. That’s based on our current website traffic, but our reader and member base is growing by the day, so the cost would quickly run higher. We couldn’t justify spending this amount.
Another thing that delayed the transition was that we’d been spoiled by Google Analytics’ features and reporting options. Piwik is a fine package, but not nearly as deluxe as Google Analytics. For a while, we tried to extract the same information from Piwik that Google always gave us.
Until we realized, thanks in part to an illuminating conversation with someone at Piwik PRO, that a whole lot of that information is superfluous. Instead of measuring everything but the kitchen sink and then seeing if we could find patterns to help us improve De Correspondent, we stopped and asked ourselves what it is we really want to know.
Since then, things have moved quickly. Earlier this week we put an end to our use of Google Analytics. We’re now running our own Piwik system that scales itself up when traffic gets busier and scales back down once the peak has passed. That keeps our costs down, yet still enables us to handle periods of heavy traffic. The bill for the switch to Piwik: including the cost of training to master the software and all its quirks, we’ve spent €7,000 up front and will spend between €200 and €300 per month going forward. (Every software package has a few bugs or oddly implemented features that we run into as developers. We often end up asking the notorious question “Is it a bug, or is it a feature?”, which makes every day of work an interesting one, to put it mildly.) That’s not a bad price when you consider that we no longer have to pay the “bill” with your data and are thus able to protect your privacy a little better.
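To put those figures side by side, here is a small back-of-the-envelope calculation. It uses only the amounts quoted above and makes two simplifying assumptions that the article does not confirm: that the monthly bill stays flat within the €200 to €300 range, and that the managed-hosting quote stays at €15,000 per year (in reality it was expected to rise with traffic).

```python
# Figures quoted in the article (euros).
UPFRONT_EUR = 7_000                      # one-off cost, including training
MONTHLY_EUR_LOW, MONTHLY_EUR_HIGH = 200, 300
MANAGED_EUR_PER_YEAR = 15_000            # quote at the traffic level of the time


def self_hosted_total(years: int, monthly_eur: int) -> int:
    """Cumulative self-hosted cost; assumes the monthly bill stays flat."""
    return UPFRONT_EUR + 12 * years * monthly_eur


def managed_total(years: int) -> int:
    """Cumulative managed-hosting cost; assumes the quote never rises."""
    return MANAGED_EUR_PER_YEAR * years


for years in (1, 3):
    low = self_hosted_total(years, MONTHLY_EUR_LOW)
    high = self_hosted_total(years, MONTHLY_EUR_HIGH)
    print(f"{years} year(s): self-hosted EUR {low}-{high}, "
          f"managed EUR {managed_total(years)}")
```

Under these assumptions the first year comes to between €9,400 and €10,600 self-hosted, against €15,000 for managed hosting, and the gap widens in later years because the up-front cost is paid only once.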
More control over trackers
At De Correspondent, we use YouTube, Vimeo, and SoundCloud to play videos and podcasts for you. These services make use of trackers, tiny programs that follow you around the Internet.
There, too, we’ve found a solution: we only enable those trackers when you start the video or podcast. Normally, a video (or podcast) is loaded along with the page, and the trackers it’s linked to can instantly start analyzing your behavior, whether you watch the video or not. On our site, when you load an article, we display a preview of the media element instead. We host this preview on our own servers, which means we don’t yet have to connect to YouTube, Vimeo, or SoundCloud (so you can’t be tracked yet). The associated service, and with it its tracker, isn’t loaded until you watch the video or listen to the podcast. The choice to allow that or not is up to you. We’ve since implemented this solution on all the articles we’ve published at De Correspondent. There are still a few pages on our website, such as our public home page and our About page, left to convert.
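The click-to-load approach described above can be sketched on the server side as follows. This is a minimal illustration, not De Correspondent’s actual code: the function name, the CSS class, and the `data-embed-url` attribute are all hypothetical. The point is that the markup sent with the article contains only a self-hosted preview image and no `<iframe>`, so the browser has no reason to contact the media service until the reader clicks.

```python
from html import escape


def render_media_placeholder(embed_url: str, preview_path: str, title: str) -> str:
    """Return HTML for a click-to-load placeholder (hypothetical markup).

    The third-party URL is stored in a data attribute only; nothing in the
    returned markup makes the browser request it on page load.
    """
    # Only accept site-relative preview paths, so the image cannot
    # accidentally be served by the third party itself.
    if not preview_path.startswith("/") or preview_path.startswith("//"):
        raise ValueError("preview must be hosted on our own domain")

    return (
        '<button type="button" class="media-placeholder" '
        f'data-embed-url="{escape(embed_url, quote=True)}">'
        f'<img src="{escape(preview_path, quote=True)}" '
        f'alt="{escape(title, quote=True)}">'
        "<span>Play (loads content from an external service)</span>"
        "</button>"
    )


html = render_media_placeholder(
    "https://www.youtube.com/embed/abc",  # placeholder video ID
    "/previews/abc.jpg",
    "Demo video",
)
print(html)
```

A small client-side script (not shown) would then replace the button with the real embed, using the stored URL, once the reader clicks it.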
What about other trackers? To enhance our articles with visualizations, we use LocalFocus. This program also uses Google Analytics. So we sat down with the folks at LocalFocus and came up with a solution that doesn’t use trackers.
That leaves just New Relic, which we use to monitor our servers. So far we haven’t found a good alternative for that. A good alternative for us would be a package we could host ourselves, the way we do with Piwik. Suggestions are welcome! Let us know in the comments.
In the coming months, we’re going to share as much as we can about how we’re wriggling free of the data collection ecosystem we’ve grown so used to in recent years.
We’ll be sharing articles about our experiences during this process, tips on what worked well for us and for others, examples of open-source code that other website developers can use, and more. One example is the way we’ve implemented a solution in which a server cluster automatically swaps in more servers when traffic increases, and swaps them out when things quiet down. This solution can ultimately save a lot of money, yet still ensure you can handle peaks in visitor traffic.
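To give a sense of what such a scaling rule might look like, here is a minimal sketch. It is not our production setup: the function name, the capacity per server, and the minimum and maximum cluster sizes are all made-up values for illustration. In practice, a rule like this usually lives in a hosting provider’s auto-scaling configuration rather than in application code.

```python
import math


def desired_servers(
    requests_per_second: float,
    capacity_per_server: float = 50.0,  # hypothetical: load one server can handle
    min_servers: int = 1,               # hypothetical floor, keeps the site up
    max_servers: int = 10,              # hypothetical ceiling, caps the bill
) -> int:
    """Return how many servers the cluster should run for the current load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Clamp between the floor and the ceiling.
    return max(min_servers, min(max_servers, needed))


for load in (0, 120, 10_000):
    print(load, "req/s ->", desired_servers(load), "server(s)")
```

The floor keeps at least one server running when traffic is quiet, and the ceiling puts an upper bound on costs during extreme peaks.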
In the coming year, we’d also like to open our doors and discuss further improvements with everyone who’s interested, because privacy is and always will be one of the fundamental values that inform our choices when we’re thinking up new functionality for De Correspondent.
So we’d like to enable you to exercise more control over the data we collect. In the new privacy manifesto we’ll be launching, we’ll be as transparent as we can about what data we collect and why, and we’ll give you the option to turn it on or off. For example, we keep track of which articles a member has read so they can easily find them again in their personal menu. There’s also data we’re legally required to collect, such as information about membership payment. And when a new website feature requires us to store privacy-sensitive data so it performs better, we’ll do that on an opt-in basis wherever we can. Most online services use the opt-out principle, which means they automatically turn new features on for you without explicitly asking permission. We’d rather let you choose for yourself what you do and don’t want to use, and that’s why we use opt-in.
That makes it our job to convince you that turning on new features like these (for example, suggesting other articles based on the articles you’ve already read) will improve your reading experience.
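The difference between opt-in and opt-out comes down to the default. The sketch below shows one way to model that, and it is an illustration only: the class, the feature names, and the set of legally required data are hypothetical and do not describe how De Correspondent stores preferences.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical: data that must be kept by law and so cannot be switched off.
LEGALLY_REQUIRED = frozenset({"membership_payment"})


@dataclass
class PrivacyPreferences:
    """Per-member settings; every optional feature starts switched off."""

    enabled: Set[str] = field(default_factory=set)

    def opt_in(self, feature: str) -> None:
        self.enabled.add(feature)

    def opt_out(self, feature: str) -> None:
        if feature in LEGALLY_REQUIRED:
            raise ValueError(f"{feature} is legally required")
        self.enabled.discard(feature)

    def allows(self, feature: str) -> bool:
        return feature in LEGALLY_REQUIRED or feature in self.enabled


prefs = PrivacyPreferences()
print(prefs.allows("article_suggestions"))  # off until the member opts in
prefs.opt_in("article_suggestions")
print(prefs.allows("article_suggestions"))  # on after an explicit choice
```

Under an opt-out model, the `enabled` set would instead start out filled with every feature, and the member would have to remove the ones they do not want.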
What I’d like to ask of you
If you manage a website or directly (or indirectly) influence your organization’s guidelines on collecting visitor data, please share your expertise with us. What were your considerations, and what were the deciding factors in determining the data you collect? Not all data collection is bad (some benign examples: newsletters, analytics, cookies, and A/B testing tools), as long as there’s a good, deliberate reason behind it and you have respect for the people whose data you’re recording. (A list of privacy-related questions your organization might ask itself is available, in Dutch only.) How did those conversations go? Have you noticed any resistance due to conflicting interests? How does your organization deal with that?
I’d love to connect with you and discuss what solutions that respect privacy without endangering an organization’s objectives might look like, and how we might make them publicly available so that other websites can easily implement them.
—Translated from Dutch by Grayson Morris