
Google Analytics expert here.

This is missing some fundamental ones: gclid and dclid. Those are the parameters that identify a specific click from a specific user on a specific ad placement. They are the keys that Google uses on the back-end to join Google Analytics data with Google Ads (formerly AdWords) and DV360 (formerly DoubleClick) data.

The utm_ parameters are tame. They are very coarse-grained, usually representing "which budget did this ad come from" rather than anything about a user. They're ugly, which is enough reason to strip them, but gclid is also ugly, and much more identifying.

This is a bit of a fringe opinion, but I consider tools that strip utm_ parameters but leave gclid to be a net decrease in privacy. A lot of people misconfigure their Google Analytics so that utm_ params break the gclid join. Stripping the utm_ params lets that join happen after all.
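For anyone rolling their own cleaner, here is a minimal Python sketch of stripping the click IDs along with the utm_ parameters. The function name `strip_tracking` and the two constants are made up for illustration, and the parameter list only covers the ones named in this comment, so treat it as a starting point rather than a complete rule set.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative list only: the click IDs and prefix mentioned above.
CLICK_ID_PARAMS = {"gclid", "dclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return url with click IDs and utm_ parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in CLICK_ID_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # urlencode re-encodes the surviving pairs; the fragment is left alone.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_tracking(
        "https://example.com/page?id=7&utm_source=news&gclid=abc123#top"
    ))
```

The point of removing both groups in one pass is that you never end up in the "utm_ gone, gclid still there" state described above.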



> This is missing some fundamental ones: gclid and dclid.

I always preferred ClearURLs as it has an auto-update feature, which pulls down this JSON file: https://gitlab.com/KevinRoebert/ClearUrls/raw/master/data/da...

https://gitlab.com/KevinRoebert/ClearUrls/-/wikis/Technical-...

ClearURLs does include both gclid and dclid. From memory, the other URL-cleaning add-ons required a new version of the add-on to update their rules.


ClearURLs is my choice as well; it seems to catch a lot more stuff, and as you say it auto-updates its definitions (like uBlock Origin etc.).


I mean, maybe you're in it for the privacy benefits, but I just want to auto-normalize URLs (i.e. to throw away anything that isn't necessary to get you to the page) for rendering permalinks in journal citations, in printed ad-copy, etc. I'd guess a lot of these libraries have the same goal. And in that context, they probably mostly seek to automate the stripping of the particular parameters that the author themselves has encountered, been annoyed by, and manually stripped before; rather than seeking to strip parameters based on the particular meaning or information they encode.


From a privacy perspective, that's even more horrifying.

You leave in the unique identifier, and then publish it under your real name, allowing all of those clicks to be attributed to your actual identity.


From a privacy perspective, is that really that bad? You've got a bunch of other people's clicks messing up your marketing profile. This might even be a good thing, because it pollutes the dataset that the adtech people are using, which makes the trackers less useful.

So here's a feature suggestion: instead of removing URL parameters, replace them with a randomly-generated value. Be careful to use the same length and character set (e.g. replace a hex id with only hex digits).

And also, set DNT=1.

That gives the digital marketing people an incentive to respect the DNT header, and an easy way to maintain data quality.

I think that this is a better strategy than blocking trackers and "cleaning" URLs. It shifts the economics of web tracking, which short of legislation, is the only way things are going to change.
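A rough Python sketch of the replace-instead-of-remove idea, assuming gclid and dclid are the parameters to scramble. The names `randomize_value` and `randomize_tracking` are hypothetical, and the character-set detection is a simple heuristic (hex stays hex, digits stay digits, letters keep their case), not anything a real extension is known to do.

```python
import secrets
import string
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the parameters worth scrambling.
TRACKING_PARAMS = {"gclid", "dclid"}


def randomize_value(value: str) -> str:
    """Return a random string with the same length and character classes."""
    looks_hex = (
        all(c in string.hexdigits for c in value)
        and any(c.isalpha() for c in value)
    )
    if looks_hex:
        letters = "abcdef" if value.islower() else "ABCDEF"
        alphabet = string.digits + letters
        return "".join(secrets.choice(alphabet) for _ in value)
    out = []
    for c in value:
        if c in string.digits:
            out.append(secrets.choice(string.digits))
        elif c in string.ascii_lowercase:
            out.append(secrets.choice(string.ascii_lowercase))
        elif c in string.ascii_uppercase:
            out.append(secrets.choice(string.ascii_uppercase))
        else:
            out.append(c)  # keep separators such as '-' or '_' in place
    return "".join(out)


def randomize_tracking(url: str) -> str:
    """Swap tracking parameter values for random look-alikes."""
    parts = urlsplit(url)
    pairs = [
        (k, randomize_value(v) if k.lower() in TRACKING_PARAMS else v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


if __name__ == "__main__":
    print(randomize_tracking("https://example.com/?id=7&gclid=deadbeef01"))
```

Setting DNT is a browser or HTTP-client setting rather than a URL rewrite, so it is left out of the sketch.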


I've added these parameters to my configuration, thanks!

Do you know of any other parameters like those, that can be safely removed? Maybe someone else here can list a few?


Most users of this would never actually click on an ad (they are probably already using an ad blocker, or are completely ad-blind), so that omission doesn't seem that bad.



