If you think cookie consents are messed up, I recommend checking out the advertising industry's IAB consent framework, which is their attempt to self-regulate in order to avoid, or at least deflect, external regulation. Its purpose is to enable them to share your personal data between hundreds or thousands of ad networks.
We develop an open-source consent manager (https://klaro.kiprotect.com) and considered implementing the IAB framework; after seeing how it works, we decided against it, as we don't want to support privacy-invasive approaches.
That said, players like Google also use privacy as a strategic tool to force other players out of the ad market. Since Google can easily get first-party cookies into >95% of all browsers, they won't be impacted by the restriction of third-party cookies. As most users go through their sites many times per day, they also have enough other signals to track people without relying on cookies (user agent, IP address and a little Bayesian statistics are sufficient if you see enough traffic). They will probably continue to remove tracking capabilities they no longer need from Chrome, slowly suffocating any remaining competitors that still rely on them.
What makes our consent manager stand out is that it records all attempted tracking events and can replay them later once consent is given. This helps site owners capture the entire user journey through their site.
That looks really nice. It is 118 kB uncompressed though, which is a lot of code. What is in it besides storing preferences and conditionally executing the scripts?
It uses Preact and a few JS helper libraries, and currently includes translations for more than 10 languages. It also includes styles by default (though you can get a version without them). The compressed size is 37 kB, which I think is reasonable; smaller is of course better. We haven't prioritized size so far, as it's comparable to a PNG/JPEG and usually negligible on most sites, but we might look into this in the future.
The parsing cost of 100+ kB of JavaScript is not negligible. Sure, the transfer over the network may be comparable to an image file, but you can't stop measuring there. And claiming "Lightweight" on the homepage for something this size is a bit of a stretch. Lightweight compared to what?
Yes, I guess we should update the site. When we started it was less than 20 kB (compressed); it seems we have almost doubled in size! We added some functionality and additional languages over time. I will check the bundle to see what takes up so much space; it seems high to me as well. We're developing this pro bono right now and don't intend to monetize it, hence it does not always get the attention it might deserve.
> If you think cookie consents are messed up, I recommend checking out the advertising industry's IAB consent framework, which is their attempt to self-regulate in order to avoid, or at least deflect, external regulation. Its purpose is to enable them to share your personal data between hundreds or thousands of ad networks.
I always suspected this and mistrusted those opt-out tools, but I never had any evidence for it. Would you mind elaborating?
Ultimately the consent framework is a list of common identifiers for entities (including, as of 2.0, what they actually do with the data) and a standard way to send that list between all the companies involved. The idea is that your popup records that you consent to data sharing with e.g. DoubleVerify and 1000 others when you hit agree. The publisher then interprets this as consent to include DoubleVerify code on the page, which may track you, and can send that list onwards to their other partners.
E.g. if AppNexus is also on the page and has a deal with DoubleVerify, AppNexus can read the list of companies you've agreed to be tracked by that the publisher sent them, see DoubleVerify in that list, and assume they can (in the opinion of the IAB) share the data given to them by the publisher or your browser with DoubleVerify.
The presumed transitive consent to 1000s of entities is, I assume, the bit the parent poster is objecting to, especially when so many of those modals are "default-accept" or hide the reject button three child links deep in the middle of legal text. But I'm not sure why the IAB's in particular is worse than others, such as the parent poster's.
Is there any way you see that we could move toward a healthier (less monopolistic) situation, such as going back to content-based ads? G has been closing alternate ad channels for a long while now, and the few remaining portals such as github.com are being taken over (with most of the HN crowd not realizing what's going on, IMO). I can say that my web usage has changed quite drastically with GDPR banners popping up everywhere in the EU: I basically don't visit those sites anymore, never go to featured stories on medium.com, etc.
If we'd never had cookies, or if we banned them now, we'd be left with basic auth over HTTPS. That's equivalent to a first-party cookie, and it would be enough to build an e-commerce site with a cart and a recommendation engine. Every single site would have the same login form, managed by the browser. I guess we'd end up with an <auth> or <input type="auth"> tag to embed it inside the page and style it.
Another consequence: there would be more server-to-server communication, because the first-party server could send all sorts of information to the servers that set third-party cookies in pages now. The focus of privacy policies would shift there.
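For context, basic auth is just a header the browser re-sends on every request to the same realm, which is exactly what makes it behave like a first-party identifier. A minimal sketch (function name is illustrative):

```python
import base64

def basic_auth_header(user, password):
    """HTTP Basic auth: base64("user:password"), resent by the browser on
    every request to the realm. It is identical on every request, which is
    what lets a server use it as a stable first-party identifier."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Authorization: Basic {token}"
```

Since the raw credential rides along on every request, the browser rather than the site ends up managing the identity, as the comment above suggests.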
Or with the session ID in the URL. It's a valid mechanism: it works reliably, is fast, and doesn't require special support in browsers. Remember the '?PHPSESSID=...' littered all over the early internet? Yeah, that one.
Of course there are some caveats, which made cookies win in the end:
* it must be applied to every internal link in the website
* it's the equivalent of a "session cookie" (no persistence after the browser is closed)
* it's a privacy risk when sharing the URL via copy/paste
* it's subject to 'session fixation' attack (related to the previous caveat)
Presumably the last two could be fixed by an extra HTTP header or special in-browser handling of a designated parameter, similarly to how CORS got handled.
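For the curious, the mechanism is simple enough to sketch in a few lines; the 'sid' parameter and function names below are made up for illustration:

```python
import secrets
import urllib.parse

SESSIONS = {}  # session id -> per-session state (e.g. a shopping cart)

def get_session(url):
    """Look up the session from a '?sid=' query parameter, creating a fresh
    one if the id is missing or unknown."""
    params = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
    sid = params.get("sid", [None])[0]
    if sid not in SESSIONS:
        sid = secrets.token_hex(16)
        SESSIONS[sid] = {}
    return sid, SESSIONS[sid]

def link_with_session(link, sid):
    """The first caveat above: every internal link must be rewritten to
    carry the session id, or the session is lost on the next click."""
    sep = "&" if "?" in link else "?"
    return f"{link}{sep}sid={sid}"
```

The remaining caveats follow directly: the id lives only in URLs, so it dies with the window, leaks through copy/paste, and can be fixated by handing someone a pre-made link.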
One problem: there have been at least three patent cases where this mechanism was claimed to be covered by a patent. The litigants won one and lost the other two (there may be more of these; I am only aware of three).
How am I aware of them? I was a fact witness in these cases, and the reason I was a fact witness is that my employment agreement with Cadabra Inc., later to become Amazon.com, contains an addendum describing this technique and specifying that the company cannot attempt to patent it, since I had already implemented it while working at UWashington CS&E.
Apparently this was still not sufficient for the jury in one of the cases. Edit: none of the cases were solely about a session identifier, so to be a little fair to the final jury, it's not quite that simple.
You can also mitigate it by including in the session token something derived server-side from parts of the HTTP request other than the query string, such as a hash of the client's user-agent.
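A minimal sketch of that mitigation, assuming an HMAC over the User-Agent (the names here are illustrative, not from any particular framework):

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # per-server secret, never sent to clients

def issue_token(user_agent):
    """Session token = random id plus a MAC binding it to the client's
    User-Agent, a request property that is not part of the URL."""
    sid = secrets.token_hex(16)
    mac = hmac.new(SERVER_KEY, f"{sid}:{user_agent}".encode(),
                   hashlib.sha256).hexdigest()[:16]
    return f"{sid}.{mac}"

def verify_token(token, user_agent):
    """A token copy/pasted to a client with a different User-Agent fails,
    blunting accidental URL sharing and simple session fixation."""
    sid, mac = token.split(".")
    expected = hmac.new(SERVER_KEY, f"{sid}:{user_agent}".encode(),
                        hashlib.sha256).hexdigest()[:16]
    return hmac.compare_digest(mac, expected)
```

The User-Agent is of course attacker-controlled too, so this only raises the bar; it doesn't make URL-borne tokens safe.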
I did something similar to this in the early 2000s, when setting a cookie lasting longer than 24h required approval from SECDEF: [0]
This was a cookie-less FTP-over-HTTP browser, where you had to agree to the terms of service, and copying the link to someone else wouldn't mean they agreed to the terms of service. It was just a prototype.
Intranet web apps didn't even require a manual login. Now we've regressed: we have more web apps, frequently less AD authentication (cloud apps), and more random passwords written on post-it notes.
AD (and Kerberos) require the client to have a ticket from a central authority, and also require the server to be trusted by that same authority. It won't really work for the "wild" internet.
That's basically Kerberos, which AD is "built" on and extends. I believe both Kerberos and AD require the server to be known to the AD or Kerberos master.
> we'd be left with basic auth over https. That's equivalent to a first party cookie. It would be enough to create an ecommerce site with a cart
Not exactly: a site can't set basic auth for you, which means you would have to log in before adding something to your cart, whereas with cookies you can add to your cart before having to authenticate in any way. As long as JavaScript is enabled there are other ways to handle that, such as local storage, but that is relatively recent.
The way it was done before without cookies was including the user's session ID as a query parameter in the URL. Adding things to your cart was also done through query parameters.
It was later realized that session IDs in the URL led to lots of security problems, but it's possible we would've come up with ways to mitigate those if cookies hadn't become widespread.
Preferring POST form submissions over GET links might have helped some, perhaps combined with proxies. Early forum software implementing "disabled cookies" support would tie the session ID to other information known about your session, such as your IP address and browser identifier, and there was often a time-out on the session identifier that reset every time you visited another page on the site. That somewhat limited how easily you could accidentally share your session with someone else. Today this would be less practical, as folks switch between wifi and cellular frequently. And it was still a workaround even then; developers still preferred a session cookie where available, for simplicity. (Annoyingly, the session cookie historically often came with the same IP-address and user-agent matching restrictions, though these weren't as necessary given the different technology employed.)
I do recall that in early implementations of cookies in web browsers, they were disabled by default. But when a website wanted to set a cookie, a prompt would appear. Actually, this might have just been the configuration of some of the shared computers I was using at the time at school and in other places.
I might be thinking of the cookies prompts in Links and Lynx text mode browsers, as I tended to use those more often back then over dialup. The cookie prompts in general were terrible because you never knew what part of the site would be enabled or disabled before interacting with the site. To that end, Safari’s approach is quite reasonable for a default.
Someone would have ended up inventing some X-Session-Id HTTP header that the server would send to the user agent, and user agents supporting the "HTTP Sessions extension" would send back an X-Session-Id with the same content to the same origin. A special empty value would have been used to terminate the session, from either the client or the server.
Another HTTP header, X-Session-Expiration, would have been added to expire sessions.
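A hypothetical exchange with these invented headers (to be clear, neither header is a real standard) might have looked like:

```
C: GET /cart HTTP/1.1
C: Host: shop.example

S: HTTP/1.1 200 OK
S: X-Session-Id: 7d8fa1c2e4b35f60
S: X-Session-Expiration: Wed, 01 Apr 2020 12:00:00 GMT

C: GET /checkout HTTP/1.1
C: Host: shop.example
C: X-Session-Id: 7d8fa1c2e4b35f60

(later, either side terminates the session with the empty value)
S: HTTP/1.1 200 OK
S: X-Session-Id:
```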
... and then we'd have essentially a single cookie per domain with wholly different higher-level semantics (user control through a password manager.) That ... actually sounds preferable?
Banning cookies wouldn't do much against tracking: modern browsers leak enough information that you can fingerprint individual users by e.g. screen resolution, available fonts, etc.
Which would actually allow doing away with cookies: redirect users to $hash_of_fingerprint.mysite.com and discard the hash subdomain during routing. This would keep their session alive. If they share a URL, that's fine: the hash is irreversible, so their info is not leaked, and anyone else visiting that URL gets redirected to their own hash subdomain anyway, so the usual security concerns don't apply. If the hashing includes mysite.com in its input, the fingerprint will differ between domains, so it can't be used to track people across sites either. The IP address likely needs to be included as well. A username and password could be used to tie several hash subdomains together.
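A sketch of the scheme as described, assuming SHA-256 as the "irreversible" hash; the signal names and parameters are illustrative:

```python
import hashlib

def fingerprint_subdomain(screen, fonts, user_agent, ip, site):
    """Derive a stable per-user subdomain from fingerprint signals.
    Mixing the site name into the hash input keeps the identifier from
    matching across different sites, as proposed above."""
    material = "|".join([screen, ",".join(sorted(fonts)), user_agent, ip, site])
    digest = hashlib.sha256(material.encode()).hexdigest()[:16]
    return f"{digest}.{site}"
```

Note the stability assumption is doing a lot of work here: if any input signal changes (new browser version, new IP), the "session" silently resets.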
How exactly are you going to hash my fingerprint if it's my first visit to the site and I'm using someone else's link? You wouldn't know until my second visit whether I was running something to change the value anyway, and if I am running interference on the request, you're kinda boned. For that matter, what keeps me from just hijacking someone else's token in its entirety?
Also, how does putting a fingerprint (something already capable of being deanonymized) through a trapdoor function (something used universally for purposes of authentication) arrive at a mechanism that won't itself be hijacked for tracking purposes? That's where you run into the fundamental problem of web traffic analysis.
As soon as you have enough information to reasonably assure you you're talking to the same person, someone else will come up out of nowhere to beg you to share it with them. You can make tons of money sharing this info, but by and large people object to you being loose-lipped about their business with strangers.
This is what makes me shake my head in wonder at the direction the Web went...
We already knew there was going to be an issue. We knew, because advertising/marketing/big-data-style info sharing did not exist; and where it did, it was generally in the customer's face, they had something to gain and not much to lose, and it was big, ponderous and unwieldy to do.
By letting advertising drive the direction the Web developed, we've turned the entire apparatus into the most effective surveillance device known to man.
You didn't have police looking up people's catalog or financial records before. You did have landlines being tapped on a case-by-case basis, but that's a big difference compared to Google's capacity to provide LE with suspect lists via geofenced warrants.
It saddens me sometimes we never got to see a Web without "user" abstractions. In the beginning it may have been close, but since the late nineties it's been far too close to a liability in terms of abusability by well positioned players for me to feel super happy about it.
And yet I've been developing stuff for it for 10+ years without pulling Stallman levels of sticking to my ideals... Hooray for being part of the problem.
Removing an entire front in the War on Privacy wouldn't help? There are many things to fix if we want to reclaim privacy. There is certainly no single magic bullet and it certainly isn't all going to happen at once.
It wouldn't help, it would just push it underground and make it harder to deal with.
The opposite would work better. Create a standard for tracking people then legally enforce that all tracking conforms to that standard. Then users can opt out.
Yeah, I do think nothing would change. The vast majority of people I know have complete apathy over 3rd party tracking. It's just not something on their minds (for both devs and non-devs)
It would be of more help if GDPR was enforced in a really painful way (to offenders). GDPR is not about cookies. It's about tracking without explicit consent. If offenders were fined into oblivion it would (hopefully) go away. Fingerprinting relies heavily on client-side Javascript, so attribution shouldn't be a problem.
If the server redirects the browser to the identity provider, and then goes back to the browser with the token, and the browser gives that to the server, that doesn't seem to require third-party cookies.
It would have needed some extension for a session ID, so the browser isn't just sending the password with each request. Even with HTTPS I feel this is an unnecessary risk, since if you compromise the server you don't need to wait for login requests, just any kind of request. Other than that, I'd actually consider using this for side projects (should they ever be web-based and require authentication).
The session (or any auth) cookie is essentially the same as sending your password. Security wise there's no big difference between sending a password or a session cookie over https (other than the session cookie is short lived, which may have its merits).
I remember the original ASP before .NET would simply emulate cookies via Server State or whatever it was called. It would encrypt some server data and send it to the client.
Any website could implement cookies in a single window session this way, as long as you didn’t open more windows lol.
Yeah, this would actually be the ideal world. Shame. I don't even know how to programmatically sign into a Google account anymore. It used to be easy enough with a few curl calls and I could retrieve my AdSense reports, but now it's just impossible. Their login process is too convoluted now and involves so much JavaScript that I doubt it's possible with anything but headless Chrome.
Recently I tried to write a web scraper. The website had several methods of authentication (Facebook, Google, some proprietary one), and I couldn't make it work in one afternoon using any of them.
I remember writing scrapers in the late 90s/early 00s, and it was trivial.
It's not allowed to keep state of the user without asking (under GDPR).
No technical hack changes that, not even an SPA or running a webpage in an iframe.
If you have state for one purpose, you are not allowed to share state for another purpose.
I know that's bad. I don't know how many of the things we are used to would work in such an environment, where users are trained NOT to consent to anything, because consent is only needed for bad things they do NOT want.
You are not even allowed to tell a user: this will not work without consent.
> It's not allowed to keep state of the user without asking (under GDPR).
Why would that be? The GDPR only covers personally identifiable data, a todo list that stores everything in local storage can keep state without any problems. You can store things like language settings, dark mode theme, etc. perfectly fine.
You can also use state for multiple purposes as long as you clearly list and identify those beforehand. You can't gather personal data and then suddenly sell or analyze it if you didn't tell your customers you'd be doing that with the data. However, saying "we use this email address for (a) sending you newsletters (b) letting you recover your password" is perfectly fine.
From my reading of the GDPR, you can even gather personal data without explicit informed consent if the data is absolutely necessary for your system to work. You do need to provide ways to update, delete or obtain all information in human-readable form, but data that anyone can understand is strictly required for the thing to work can be collected without explicit consent. You can keep track of the contents of a shopping cart on a web shop, for example, but you can't submit the contents of that cart to your analytics backend without consent. You can, however, track the cart contents in your backend and link them to the user's account; only when you start processing the data in a way that's not strictly necessary do you need the user to provide informed consent.
The problem with GDPR is that most people encounter it in the form of tracking cookies and advertising, both of which are not absolutely necessary for any application to work, which is why they need informed consent. People think all cookies are now banned until further notice and that the mere existence of a database is now punishable by law, which is not the case. GDPR sucks, but only if you're in the business of collecting a lot of extraneous information about your customers and/or selling it (through analytics or ads, for example). Which, in my opinion, is a good thing.
> The GDPR only covers personally identifiable data
Some data protection officers think any two linked clicks are personally identifiable.
I have read the full cookie ruling; in some passages it's about "saving" (in all senses) any data without consent. Yes, it sometimes talks about personally identifiable data, but the "saving" part doesn't care.
To be clear, I don't think that, but it's hard to make our service comply if the customer's (think webshop) data protection officer follows that semi-official guideline.
GDPR is about personal data. A user name (not an email) and a password are not personal data, so they're not in the domain of GDPR. A cart full of products is not personal data. It becomes personal data when we add a street address for delivery, an email or phone number for sending alerts, or a credit card number for payment. However, if delivery is to a PO box (or an Amazon locker) and the credit card and customer name never touch the e-commerce site (a third party authorizes the transaction), then fully anonymous e-commerce becomes possible. No GDPR, and yet it keeps state and tracks orders.
> It's not allowed to keep state of the user without asking
I don't think this claim is correct. GDPR requires complete transparency with how personal data is used and stored. Asking for permission just happens to be the safest/laziest way to be compliant.
There's no way to get consent without asking (informed, freely given, opt-in affirmative action), but consent is just one of the paragraphs that give a legal basis for processing private data. It's also legitimate to use data that's needed to fulfil a contract (e.g. the address for delivering goods), to comply with legal obligations (e.g. KYC in banking), for legitimate interests that don't conflict with the freedoms of the user, for the public interest (e.g. news reporting), etc. (see Article 6, https://gdpr-info.eu/art-6-gdpr/), and all these use cases can be carried out without the user's consent. However, in some cases where the data processor could assert some other basis for processing, it may be simpler to just ask for consent.
But for the particular case of sharing data with thousands of third parties so that they can use it to target ads, consent seems to be the only one that applies. Direct marketing is one of the very few use cases explicitly listed in the GDPR, e.g. in the 21.2 'the right to object' - "Where personal data are processed for direct marketing purposes, the data subject shall have the right to object at any time to processing of personal data concerning him or her for such marketing, which includes profiling to the extent that it is related to such direct marketing."
You realize that's an un-question, right? It doesn't even parse out.
You can't have consent without the assumption that there is a choice to be made, and that the choice is not yours to make, and that the one giving consent is aware of the choice. Anything with the characteristic of producing an implied voluntary choice without the chooser being aware the choice is being made is very specifically not consent.
Sure, possibly with a contract signed before using the web service. Example: I sign a contract with a utility, maybe on paper in a shop in a mall, then I use their website to check my bills. If that's included in the contract I don't have to give consent again in the website.
> It's not allowed to keep state of the user without asking (under GDPR).
False. You're not allowed to store data about the user for any amount of time longer than is required to provide the session/service.
Session cookies are explicitly allowed without consent. Even though many cookie consent popups imply that they are not ("cookies are required for this site to function"), that is a lie by the adtech industry (and ignorant webdevelopers).
Remember: They are NOT asking for permission to store session cookies--they don't need it. Every cookie consent popup is literally asking for permission to share tracking data with third parties. Even if they word it obscurely.
Ironically, Google actually puts a huge amount of engineering effort into thinking through privacy concerns because they're a huge target and they stand to lose a lot if they lose consumer goodwill.
... but their thought process does generally include "Google and their data stores are a trusted repository."
Yeah, it's very important to remember this point. Google is an ad company first and foremost. Everything else they do just drives ads or user information to show ads more effectively.
Meh, that's reductionist. Takeout doesn't drive ads. Neither does ngram search. Or G Suite.
Google's primary revenue model is ads, and yes; that implies a lot of products they make hook up through that revenue model. But the incentive structure isn't as simple as "ALL FOR ADS." Google still styles itself a search company (but search doesn't pay for itself) and, increasingly, a cloud service company.
Still, when push comes to shove, Google will always do whatever works best for their ad system. Their browser is good, but it will never natively support the kind of anti-ad, anti-cookie measures that Firefox or Safari might.
Holy shit. Indeed it could. I wouldn't be surprised if I saw this text as a HN comment in 2020.
My favourite pieces:
> But there is a compelling argument that compensation models based on clickthroughs or more are flawed because it encourages the content developer to focus exclusively on pushing people through the ad, rather than actually providing useful content. I'm sure many content sites would refuse to partake in this.
Yet here we are. I guess it was hard to predict, 23 years ago, just how insanely profitable this business model would become.
>> The bottom line is that WWW advertising is based on a business model which didn't even exist a couple of years ago. It is based on leveraging technology which is new and evolving. I am confident that it will evolve in response to improvements to the technology..
> The business model of sponsored content has been around forever. So have these privacy issues (...) Looking to technology for a solution to the privacy problems is as misguided as blaming technology for causing them.
The kind of nonsense Brian replied to here, that this business model is new and innovative and thus somehow worth exploring... I swear I see these kinds of arguments regularly even today.
This is why older computer scientists get jaded. It's the same problems again and again, which you foresee and try to make sure don't happen, but you can't make the world do the right thing. So you say your warning, people laugh at the "paranoid nerd" and give up their privacy for a handful of cents.
The amount of time I lose trying to fight against advertisers is worrying.
I've set up several firefox containers for websites.
I'm now using two firefox profiles. My main profile has javascript disabled by default through ublock, except for some websites.
My second profile has javascript enabled by default, but I enabled a custom tougher blocklist with ublock (can't remember which). I use this second firefox profile for websites I don't really trust.
Even with all this, with the websites I'm using, I'm quite certain it's not enough.
Before the virus, I was at my mother's, and I admit that watching TV again (in Europe) was not that bad. I'm starting to think that getting the news from the internet is now as bad as watching TV.
About firefox containers: why can't firefox automatically compartmentalize cookies and such, by default? I've never understood why website A can read the cookies from website B. Ideally, there should be a setting to prevent such thing, even if it breaks some websites.
> why can't firefox automatically compartmentalize cookies and such, by default?
It breaks some websites according to Mozilla, so Firefox does not block them by default. Firefox's term for this is third-party cookies, and you can change your Firefox settings to block them (without using containers) - see https://support.mozilla.org/en-US/kb/disable-third-party-coo...
edit: the default settings block "cross-site tracking cookies" which is presumably different from "3rd party cookies", but I'm not exactly sure how.
I'm guessing they have a blacklist, probably a community one maintained by some ad blocking project. That strikes me as the only feasible way of telling apart good and bad uses of the same technology, but it unfortunately requires unending gruntwork.
Nice extension that creates a temporary container for each new tab:
https://addons.mozilla.org/en-US/firefox/addon/temporary-con...
It also still works with persistent containers (like for work, banking, etc.). Someone here on HN pointed out this extension.
You can still be tracked via third-party images and CSS, with fingerprinting and your IP address...
Of course it doesn't matter once JavaScript is allowed: YouTube tracked me fine without cookies. Private browsing and uBlock can only help so much.
About Firefox containers: why is there no version for those who do not care about broken sites? It is certainly possible (the code is there: https://hg.mozilla.org/mozilla-central/). Would you use it? Would you build it?
> Before the virus, I was at my mother's, and I admit that watching TV again (in europe) was not that bad. I'm starting to think that getting the news from the internet is now equally bad than watching TV.
I think TV ads are inherently less annoying than most web ads because, perhaps counterintuitively, they stop you from watching your show while the ad plays.
If I'm watching a TV show on say Hulu and an ad comes on the show is completely replaced by the ad and a little countdown timer appears in the upper left corner telling me how long until the show resumes.
I glance at the timer and then find something else to do for that amount of time. I might pick up my iPad and work on a New York Times crossword puzzle until the ad finishes, or if the weather is good dash outside to toss some peanuts to the squirrels. As long as the ad break came at some natural scene transition in the show this is not very distracting.
On a web site the ad is there with the site content. If it's animated or has audio it distracts while you try to read the content. Often it keeps on doing this for the entire time you are reading the content.
If an ad is visible while you are reading content, I think it really needs to be completely static, or at least static until you actually click it or at least hover over it.
First, why are you keeping cookies in the first place?
Install Forget Me Not and deny/whitelist/clean up cookies automatically. I deny all cookies except the ones I need to log in to a few sites. Another way to go about it is to just automatically delete all cookies when you close a page.
Secondly, why aren't you using Temporary Containers?
Each domain gets its own temporary container and there is no cross-contamination. Close the page and it's all gone.
First Party Isolation is part of their quest to re-import privacy features from the Tor browser. It basically completely isolates every domain. You can look up the about:config values to enable it (they're at the bottom of the article too).
IIRC I had it enabled for about a year with no breakage, but about a month ago it suddenly broke my university's login system. But I just turned it back on and that login is working OK again.
Also probably would be good to enable browser.cache.cache_isolation.
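If memory serves, the relevant about:config prefs are these (worth double-checking against your Firefox version, as pref names change):

```
privacy.firstparty.isolate = true
browser.cache.cache_isolation = true
```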
There should be profiles, the default one should have every website in its own container. Then there should be a google profile, which shares the google cookies so they can be used. Etc.
Ideally, when following a link, the container should change as well.
From article:
"This was great. In fact, the invention of the cookie was the single most important thing we have ever added to the internet. It changed the internet from just a passive place for us to read things, to an interactive place."
There were several interactive protocols, like telnet or IRC.
Maybe that cookie implementation is one reason why people confuse the web with the Internet?
To most people, the distinction is an inconsequential technical detail and so they conflate the two, but if someone is teaching the specification history of an implementation detail, they really ought to know the fundamentals.
They were influential back then. In an alternative reality where the web stayed in its original form, we would probably have some other solution, one designed for more specific problems.
Back in 1999 or so I did some research to check cookie support on the main browsers. There were a few odd ones, but mozilla and IE4 were the big ones at the time.
The mozilla cookie spec used language like "up to" for everything. Microsoft released a copy of the specification on MSDN with all the "up to" replaced with "at least". So IE4+ would support URLs of "at least" 1000 chars, cookies of "at least" whatever limit they were supposed to have, etc.
They basically one-upped mozilla by offering web devs infinite resources that would, by definition, break on any browser following mozilla's sane specification. While, also by definition, supporting everything made under the mozilla specification.
And all the users and webmasters of that time are directly responsible for the privacy mess we are in today. And sadly, they are repeating everything again by adopting Google Chrome and its breakneck speed in setting "standards" that self-serve Google.
This trumping of standards for convenience is the ultimate incarnation of embrace-extend-extinguish and we can't stop falling for it!
Comment widgets like Disqus. You log into one once and you have your account on any website that embeds this widget. Of course, the tradeoff is that now Disqus can track the hell out of you.
Ah, yes. I've always found that a tricky case. A good request blocker can at least make sure you're only tracked on the sites where you use it, but that's probably going to remain outside the amount of configurability browsers offer.
Natural persons may be associated with online identifiers […] such as internet protocol addresses, cookie identifiers or other identifiers […]. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.
As long as you are not building profiles of people, cookies are fine and you don't even have to ask for permission to use them.
Stop collecting data you don't need, and if it is part of your business model, maybe ask a lawyer what part of GDPR actually impacts your business instead of doing what everyone else is doing or relying on internet comment sections.
And as long as you aren't including any advertisers on your site which are building profiles of people (and practically all of them are). Or using Wordpress plugins that do the same, probably without telling you. Etc.
The article is not that wrong, though. The GDPR is quite an abstract piece of legislation and it's no wonder specific technical terms aren't used. To be precise, the article author should have perhaps talked rather about GDPR-based interpretations, consensus and enforcement.
The original NSA tracking systems were meant to be anonymous too.
I don't have a source, but this came out around the time of Snowden in 2013. Basically, the story was that the original programmer had the idea to encrypt the info of the people being tracked, so that the operators would always see an alias and the underlying info would only be revealed to people with the necessary clearance. I also assume they would only reveal the information to the operators when the targeted people became clear suspects rather than just being anyone in the general populace.
A long while ago, browsers were compliant, in that they only allowed first-party cookies by default, and users had to opt-in (by going into the browser's settings) to allow third party cookies to work.
This was difficult to do for most, but not many sites were unusable without third party cookies enabled, and there weren't that many sites to begin with, so it was "fine": not many users even touched that setting if they weren't affected.
Once there were more sites in total, and once more of them were nigh unusable without third party cookies enabled... there was a problem. A big one. Users had to be told how to configure their user-agent to make the site work. Instructions had to be provided for all common browsers. It was a mess.
User-agents could've solved this by making the UI/UX better: the user could have an easy-to-use interface to enable third-party cookies per-session, per-site, per-something...
... but instead they ended up implementing the easiest way out, and browsers (no longer user-agents now...) started defaulting to third party cookies being allowed by default.
I guess using something like lynx, today, gets one a glimpse into how it was, and how it could have been.
I actually recall options as: deny all cookies, allow first party cookies, allow all cookies, ask. (The latter one soon became unrealistic, but it was definitely in some of the earlier Netscape browsers.) Default was "allow first party cookies".
Edit: Just rechecked, Netscape Communicator 4.7 (as of Nov. 1997) had the exclusive options (radio buttons)
() Accept all cookies
() Accept only cookies that get sent back to the originating server
() Do not accept cookies
and an additional checkbox
[] Warn me before accepting cookies
(the latter triggering a prompt for any attempt to set a cookie not already denied by the previous options).
Edit 2: I got it (NS4.7) to display a cookie warning:
The server 192.168.1.2 wishes to set a cookie
that will be sent back to itself. The name
and value of the cookie are:
test=ABC
This cookie will persist until Jun 7 10:42:41 2020
Do you wish to allow the cookie to be set?
[Cancel] [OK]
Mind (a) that all vital data is included in the message and (b) the kind of general competence expected from an average user, back in the day (long before URLs were deemed too complicated).
That's not what I remember. From my memory, largely based on my experience with Phoenix/Firefox, the option to block third-party cookies alternated between being exposed in the GUI, hidden, and exposed but defunct.
I think the transition happened a little before Phoenix/Firefox started. The option "ask" had certainly become unusable some years before, and blocking third party cookies altogether already made many sites unusable at that time.
One of my family members sometimes creates wordpress pages as a freelancer. He doesn't really know all the technical details behind cookies and GDPR, but he knew he needed to do something.
Him: I need to install some kind of cookie banner plugin.
Me: Alright, why do you think that? Which cookies does your site place?
Him: I don't know...
This is the entire problem. He didn't know what his site does, and he wanted to add a cookie banner "just in case".
Both are fine with GDPR as long as you disclose them somewhere. That can of course be a problem if you have insufficient technical knowledge.
Does WordPress come with a page disclosing this by default?
Of course, the server access log is not really their business, just the opt-in email. So I guess the webhoster needs to educate its customers if they store server logs?
This can all get kind of complicated if you don't know what you are doing.
> The original cookie specification did not allow third-party cookies.
No, the original spec did not allow cross-domain cookies, so you cannot set a cookie for a third party from your own domain. But you could, then as now, connect to a third-party server from your page and have it serve third-party cookies (say, like the Facebook pixel).
One detail I found striking was about the effect of that 1996 Financial Times article:
> ...this article had a big impact. Soon after many other newspapers started looking into this, which led to two Federal Trade Commission hearings, and the web industry felt pressured to respond.
Feels like we're living in much different times now. There is a lot more noise across our media outlets today, plus the general public seems way more complacent about privacy issues as a whole.
Would an equivalent impact be possible in today's world based off of one well written article? Certainly on some topics yes, but when it comes to user privacy?
Before 3rd party cookies, companies would proxy advertising traffic through the primary domain.
It was annoying to setup, and technically complicated, but it worked.
If 3rd party cookies had never been enabled, that's what we would be doing today, and you would never even know your traffic was being sent to a 3rd party.
There's a reason they enabled 3rd party cookies: Because blocking them did not help at all.
Today such proxying would be even easier. Don't promote banning 3rd party cookies, you'll just make privacy even worse.
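For illustration, the kind of first-party proxying described here can be as simple as a URL rewrite plus a server-side forwarder; this is a sketch with hypothetical domain names and path conventions, not any particular ad network's scheme:

```javascript
// Rewrite a tracker URL so the browser only ever talks to the
// publisher's own domain. "/t/" is an arbitrary proxy path prefix.
function toFirstPartyUrl(trackerUrl, ownDomain) {
  const u = new URL(trackerUrl);
  // e.g. https://tracker.example/pixel?id=1
  //   -> https://news.example/t/tracker.example/pixel?id=1
  return `https://${ownDomain}/t/${u.host}${u.pathname}${u.search}`;
}
```

A server-side handler mounted at `/t/*` would then forward each request (plus whatever identifiers the site chooses to attach) to the original tracker, invisibly to the client.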
Proxying does not allow a third party to track beyond the customer domain boundary. So yes, ads will be served, like TV ads, but they will not be based on your behavior at other sites.
And I think GP's idea is that if companies can't send over your identifying data from client side to advertising companies then they are more than happy to send it over from server side.
I see. I was talking about the fact that the cookies are visible to the ad/tracking company.
In my opinion (and I think this is common) ad/tracking companies are third parties, and when a company proxies data to one of them it is then by definition residing with a third party, even if they don't share it with their other customers.
An advertising company knows I travelled to wikivoyage and spent lots of time reading about Thailand.
They then know that I (same cookie) visited cnn and read about Ebola
They then know I visited a cycle magazine and read about a new bike.
With proxying they wouldn’t be able to link those three visits together (with cookie data)
When I visit a fourth site, it's a brand new visit - there is no way for the advertising company to know I'm the same person who visited the previous sites.
Proxying increases the cost and effort of tracking. It's more involved than simply adding a bit of JS to a page. Something like the Facebook 'like' button is a tracker that people adding it to their page won't even realise is a tracker, and thus they would be less likely to put in the effort to proxy it to Facebook.
For sites that make their money from advertising adding proxying would probably be a requirement and be worth the effort, but for most other sites that are trackers by accident it wouldn't be.
Restricting third party cookies won't prevent all tracking but it will increase the cost and reduce the number of sites doing tracking without knowing it.
I doubt it is the same when you consider some sites have more than 100 "partners". GDPR is not targeted at cookies or at any tech, it applies even if you use pen and paper, so proxying will still be illegal.
> If 3rd party cookies had never been enabled, that's what we would be doing today, and you would never even know your traffic was being sent to a 3rd party.
Fortunately the GDPR also makes this illegal without consent, it legislates the conditions necessary to collect data and the permission you need to share it with third parties. It's not in any way specific to cookies.
And long before cookies the session id was stored in the query string.
But it isn’t compliant with Directive 2009/136/EC (which is the directive that created all those cookie banners to begin with), in the sense that it still allows for implementation of cookies that aren’t “strictly necessary”. Meaning even if all browsers implemented cookies strictly according to that spec, all of the distinctions that directive makes between different types of cookies would still be valid, and we’d still have the same amount of cookie dialogs.
"Without cookies, it would not be able to update without you logging in every single time. Meaning every time you wanted to see a new tweet or post a reply, you would have to login again. Every time your browser had to load anything, it forgot who you were."
That's not true. User agents can remember the credentials you use for a domain. HTTP authentication was never really adopted and sites preferred using web forms and cookies for authentication.
Not surprising, as cookies were not meant for tracking, but for simple e-commerce purposes (like keeping the shopping cart). What really surprises me is that every time someone complains about advertising and the like: it's always the cookies.
Cookies are not bad. And in fact, there are a lot of technologies to replace tracking-cookies and make tracking even more effective.
Seems like cookies are just the scapegoat in this whole story.
People are far too focused on cookies! Any number of tricks can be used to track sessions and to track users across sites. If cookies were outlawed, trackers would just shift to browser fingerprinting.
GDPR is about what you are actually allowed to do with the obtained information.
> Cookies today have a really bad reputation, and many people are saying we shouldn't even have cookies at all.
The bad reputation is qualified because cookies have achieved 100% technological obsolescence. Here is the previous comment where I mentioned this and people wanted but failed to prove it wrong: https://news.ycombinator.com/item?id=23098196
The reliance on JavaScript, which a reasonable minority disable, is a valid rebuttal.
Your reply that cookies can also be disabled fails to address this as client side JavaScript and cookies provide overlapping, not equivalent, capabilities.
Not exactly. It says either technology can be equally invalidated by the user with the same level of effort, which makes the “disabled JavaScript” rebuttal particularly weak. In practice, though, what percentage of users disable JavaScript where cookies are used?
Self-destructing cookies keep sites usable but deny them an easy tracking tool. It would be nice if Firefox added this as an optional built-in feature to complement the third-party cookie blocking. 99% of the cookies you receive don't need to be persistent, so why keep them around?
Obsolescence just means the technology in question is antiquated or fully capable of replacement by something of greater preference. As a functional term it has no bearing on frequency of use or popularity.
> The primary reason cookies are still preferred is due to developer ignorance.
And furthermore:
> localStorage would be preferred by end users since there is a distinctive performance difference
I would think cookies would be more performant in the most common case of exchanging a session ID with the server. How would you even achieve that with localStorage besides grabbing all dynamic data with XHR requests after the page loads?
I am not asking you to do research. You said, "If you don't know, don't guess". You said that there is a measurable performance difference. So surely you know the numbers, and you're not just guessing, right?
I’m guessing you’re a frontend guy. The two are only comparable from the frontend, if you’re developing a traditional app and want to use LS instead of cookies, it is certainly embarrassingly more complex as you’re forced into a JS intermediary to channel info between LS and your backend.
I am a fullstack developer. It is the same level of complexity because there is an intermediary either way. Web servers do provide authentication capabilities, but commercial web servers have not relied upon that since the late 90s / early 00s. Most web servers use something equivalent to Spring MVC to maintain both authentication and session management largely because the authentication capability supplied by the web server are only there to protect access to the server and thus directly doesn't account for most common security considerations, example OWASP Top 10.
Authentication and session management across a network can be accomplished in a variety of ways. Operating systems prefer Kerberos for remote authentication. Cookies are preferred by developers who fear JavaScript precisely because they can be written and transmitted without need for JavaScript. The same end result, authentication and session management, can be accomplished with the same complexity without cookies. Complexity refers to the number of steps involved and not the challenge imposed upon the developer.
The only reason to prefer an out-dated functionally obsolete technology is fear of disruption. I have explained all this already, please see the comment I linked to from above: https://news.ycombinator.com/item?id=23098196
That’s not how it works. You can’t just gloss over needing to use JS as if it were merely a different language that compiled to the same bytecode. I’m not talking about complexity in terms of the number of frameworks or libraries you need to install. I’m talking about actual, measurable complexity in any objective metric you choose, such as the number of distinct systems, layers, hops, components, etc you need to go through. There’s no comparing reading an http header included in the first lines of each and every request with the extremely brittle and objectively complex approach you are talking about.
JavaScript is absolutely not intended to function as a core part of merely making authenticated web requests between the browser and the server. I’m saddened by how often I need to explain this.
That is not what I said. What I said was: The only reason to prefer an out-dated functionally obsolete technology is fear of disruption. If the reason for using an obsolete technology exists for selfishness more than the needs of the business, then yes, it is a form of incompetence even when that distinction is indistinguishable.
JavaScript is a general purpose high level language that happens to natively run in the browser. My original point is that cookies are functionally obsolete and still in frequent use because developers are generally afraid of JavaScript. The more this conversation divests from authentication towards JavaScript the more credibility you lend my original opinion.
It’s the same level of complexity either way because whether you use cookies or some other storage mechanism the business requirements around authentication remain the same and the steps to accomplish those requirements are not so drastically different.
I’m sorry, but as I have proven the two are not equal in complexity (among other things) then any future inference based on that assumption is void, most especially your assertion that the only reason people aren’t using LocalStorage over Cookies is because they’re somehow “afraid” of JavaScript.
It’s like saying “the reason people don’t use apple juice to make mimosas and continue to use obsolete oranges is because they’re irrationally afraid of apple peelers.”
Do you see how many things are wrong in that statement?
You provide a vague list of terms, such as hops, but there was no proof provided. An example of a proof would be a measured comparison, for example: Do cookies require fewer network hops than an alternative storage mechanism? If so, how many fewer?
Example: Providing an identifier to a server. Cookies require one request: The identifier is automatically passed along with the initial request.
Using localStorage and JavaScript: The first request is anonymous; the server returns some JavaScript which reads the identifier from localStorage and passes it back to the server in a second request.
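That second step can be sketched in a few lines; the "session" key and the Authorization bearer scheme are illustrative choices here, not a standard, and the cookie flow needs none of this client-side code:

```javascript
// Sketch of the localStorage flow's client half. `storage` is injected
// so the helper works both with window.localStorage and in tests.
function buildAuthHeaders(storage) {
  const token = storage.getItem("session");
  return token ? { Authorization: `Bearer ${token}` } : {};
}

// The second request, carrying the token that the first (anonymous)
// response's bootstrap script read out of localStorage:
//   fetch("/api/me", { headers: buildAuthHeaders(window.localStorage) })
```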
> Cookies are preferred by developers who fear JavaScript precisely because because they can be written and transmitted without need for JavaScript.
You seem to believe that developers should always choose the most complex and intricate solution possible, and that the only reason someone would choose a simpler solution is incompetence or ignorance of the more complex solution.
This is completely opposite of all good engineering practice. Complexity is not a goal, it is a liability. You want to have the minimum amount of complexity which solves the problem.
I have defined complexity for you several times. Complexity is not hard just as simplicity is not easy. Just because some framework running on the server hides the challenges from you does not mean the complexity is lower. The complexity is dictated by the business requirements (the problem state).
Consider when you are visiting a page where you have been before, and you have received some identifier token.
Using cookies, the browser automatically sends the identifier token along with the initial request, and the server can serve the initial response based on your identity.
With localStorage, the browser initially sends an anonymous request to the server, the server then has to return some bootstrap JavaScript which reads the identity token from localStorage and passes it to the server in the next request, and only then can the server respond based on your identity.
By any reasonable definition, the second approach is more complex, not to mention slower and more fragile.
You’re not getting it. You don’t need any intermediary framework or library to store state in cookies from the backend. You can do it without executing anything on the client. With LS, you are forced architecturally to execute code on the backend and in the browser just to accomplish the same thing. They’re incomparable.
But you don't even need to use the API to use cookies. You set them on the server, and the browser sends them back. It works entirely without any JavaScript.
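As a sketch of that server-side half (the attribute choices below are just sensible defaults, not required by the cookie spec):

```javascript
// Build a Set-Cookie response header value. The browser stores it and
// attaches "Cookie: session=..." to every later request automatically,
// with no client-side JavaScript involved.
function buildSetCookie(name, value, maxAgeSeconds) {
  return `${name}=${encodeURIComponent(value)}` +
    `; Max-Age=${maxAgeSeconds}; Path=/; HttpOnly; Secure; SameSite=Lax`;
}

// e.g. in a Node http handler:
//   res.setHeader("Set-Cookie", buildSetCookie("session", "abc123", 3600));
```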
In the early 2000s I was using Opera and for a while I had set it to block 3rd party cookies. Unfortunately Yahoo Mail was one of the few sites that stopped working, so I had to disable the setting again.
While it's nice that the original cookie specification is GDPR compliant, ultimately it is better to have the actual GDPR. The article mentions browser fingerprinting, and it seems to me very hard to define a technical specification that completely blocks any kind of fingerprinting. Whereas the GDPR basically says "no unnecessary data collection without consent", sidestepping the whole technical aspect and going straight for the intent.
> The browsers messed up too!
I think this is an interesting remark, because what ultimately happened is that adtech captured (a giant chunk of) the browser industry with Chrome. And I'm not sure whether IE (historically) lacking tracking protection was adtech influenced, or a case of "the browsers messed up", or something else.
The spec was too restrictive for the use cases site developers wanted, and the feature set was expanded to accommodate them.
The alternative was they were building sites in Flash when they wanted more complicated client-server interactions, which was much more expensive (development time, processor time, bandwidth) but supported all the features they wanted for the user experience.
"This also applies to other things, like browser-fingerprinting, or how Google will now try to identify people using machine learning without actually using cookies."
Even if only a User-Agent string is sent from a "browser" that has no Javascript engine and does no CSS processing or auto-loading of any resources, the online ad services folks, not to mention the "web security" folks, will still try to create a "fingerprint".
There are companies whose entire business is dependent on a demand for lists/databases of user agent strings. For example,
Fingerprinting based on user agent seems like a rather easy problem to solve. Aside from not sending the header or every user sending the same one, one idea is to keep changing the string.
The following is from a file called "mosaic-spoof-agents" and dates back to at least 1996.
#
# Agent Spoofing -- Who will you be today?
#
# "I don't think [t]he[y]'ll be too keen" -- Originally from Monty Python, run
# into the ground by Tommy Reilly
#
# NOTE! There is a hard limit of 50 agents! Mosaic will not read anymore!
#
# This file should be located in your HOME directory under the filename:
# .mosaic-spoof-agents
#
# This file consists of lines that begin with a # (comment), a - (a seperator
# for the sub-menu off options), a + (this is special...it denotes that the
# agent following it [no spaces] should be the default agent to use) and
# anything else (considered an agent spoof line).
#
- This is a seperator...as long as the first thing is a dash
# This is a comment line
#
# This is the "example" agent. If you don't like it...comment it out.
# if you want it to be your default, put a + infront of it.
#
Bond_James_Bond/007 (BMW; Z3 Roadster) DOHC_inline_16-valve_4-cylinder
Gandalf/White (Shadowfax; Wind) White_Staff
Elric/Emperor_of_Melnibone (Hellsword; StormBringer - Devourer of Souls)
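The rotation idea could be sketched like this, with the agent strings borrowed from the file above (purely illustrative; note that browsers generally won't let page scripts override their own User-Agent, so this only makes sense in a script or non-browser client):

```javascript
// Pick a different spoofed User-Agent per request, in the spirit of
// the mosaic-spoof-agents file.
const SPOOF_AGENTS = [
  "Bond_James_Bond/007 (BMW; Z3 Roadster) DOHC_inline_16-valve_4-cylinder",
  "Gandalf/White (Shadowfax; Wind) White_Staff",
  "Elric/Emperor_of_Melnibone (Hellsword; StormBringer - Devourer of Souls)",
];

function randomAgent(agents = SPOOF_AGENTS) {
  return agents[Math.floor(Math.random() * agents.length)];
}

// e.g. in a Node client:
//   fetch(url, { headers: { "User-Agent": randomAgent() } })
```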
Today tech companies will argue that if any user agent string is an aberration from what they expect, then "something is wrong". Users might get blocked, or they might be alerted their "browser is not supported" and/or be advised to "upgrade". They might receive a "security" warning that someone logged in to their account. All based on simply changing the user agent string. The assumption today is that no netizen would ever do that.
Today, browsers allow changing the user agent string using some "toolbar" or "console" but if the name given to it is any indication, this is only meant for "developers", not "dumb [users]", to borrow from the infamous Zuckerberg quote.
One of the authors of Mosaic is now a Silicon Valley VC who posts on HN.
The "Who will you be today?" is probably a play on the Windows 95 slogan "Where do you want to go today?" A web browser was a new thing in Windows back then, the web being an idea Gates was not easily sold on, and I always thought the slogan seemed to suggest someone thought web access would be a selling point to users.
I really don't get what's with people wanting to both have and eat the cake, for free. It's obnoxious and toxic. Don't use or go to pages if you don't want to give them your information. Period.