Same story for various Wordpress plugins and widgety things that live in site footers.
Google has turned into a cesspool. Half the time I find myself having to do ridiculous search contortions to get somewhat useful results - appending site:.edu or site:.gov to search strings, searching by time period to eliminate new "articles" that have been SEOed to the hilt, or excluding Yelp and other chronic abusers that hijack local business results.
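Concretely, the kind of contortions I mean (the operators are Google's documented ones; the query topics are just made-up examples):

```
router keeps dropping connections site:.edu
best air purifier -site:yelp.com -site:pinterest.com
pixel battery drain before:2019
```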
Also phone problems: Google a problem with a phone and the top hits will be a whole bunch of churned-out articles with generic copy on the cause ("sometimes there are bugs in the software, so reboot your phone").
Any technical issue, really. There's a ton of autogenerated content out there with low-effort troubleshooting tips. A lot of it is used as lead generation for scammy antivirus/antimalware/"cleaner" software, paid tech support, or outright tech support scams.
These results are incredibly frustrating. Google should de-rank these autogenerated tech troubleshooting sites.
Yes, I clicked the link because it exactly referenced my issue. But it's not helpful to just see the same 5 tips copy pasted from elsewhere by an algorithm.
>These results are incredibly frustrating. Google should de-rank these autogenerated tech troubleshooting sites.
Why? Google makes money from advertisements either way, it's not in their interest to improve search results. If anything, terrible search results make users more likely to click on ads, which now look better by comparison.
Google became very popular very quickly because it gave much better results much faster. The more that Google allows quality to decline, the faster they approach a non-recoverable tipping point. Just ask Yahoo how quickly that can happen. Google may seem entrenched, but they have a shaky hold on search that is only as strong as its result quality. They are entrenched in advertising, but only because that's where searchers go to search.
Users may be entrenched in other Google products-- Gmail, gcal, docs, etc-- but not search. Someone using all those other Google products could change their default search engine and have zero impact on the rest of their digital life.
I'm shopping around for a preferred alternative right now, I just haven't settled yet.
Yep, not disagreeing. My point is that a short-term pursuit of money over at least a reasonable quality of search will destroy what they have built very quickly, if quality gets low enough to make it easy for an upstart rival to have obviously better search results. And the evidence for that is in the history of their own rise to search dominance.
>The more that Google allows quality to decline, the faster they approach a non-recoverable tipping point. Just ask Yahoo how quickly that can happen.
Do you think we're in the same situation now as we were fully 20 years ago? I don't. Facebook killed MySpace, but Facebook is now too big to be disrupted, same with Google. The word "google" is a verb now. This is why the quality of their search results doesn't matter, people are too entrenched to switch now, which was not true in 2001.
With respect to getting users to switch, Facebook and MySpace are much more complicated services in terms of user interactions and the need for network effects. Search is literally a text box you type into, and its usefulness does not directly depend on how many other people use it.
In that respect, not much has changed in 20 years. Switching your search bar is a very low friction activity, and if the quality of results drops too low, people will look elsewhere. There are only so many times someone will tolerate seeing the exact same copy/pasted useless answers to questions filling most of the first page of results.
-#-#-#-#-#-#-
In General:
The tech industry is filled with examples of companies that had an entrenched product end up failing very rapidly. I think Google probably understands this well enough to ensure search quality remains better than a scrappy underfunded startup can accomplish, but then again Google achieved search dominance by coming up with a different way to determine results, relevancy, etc. There's no reason to believe that someone couldn't come up with something superior now either.
I think the most significant threat to that possibility is 1) FAANG companies buying up many of the most talented people. 2) If a competitor did come along, buying them up as well.
But it's also hard to predict the antitrust future. Internet Explorer had an extremely long run as the dominant web browser, longer than Chrome has held that crown, but it got knocked down very quickly. I doubt that would have happened as easily if not for Microsoft's antitrust issues. Of course it doesn't help that IE grew into a slow, bloated mess, but in that respect, refer back to what I said about search quality: Microsoft was entrenched, if sliding, in the browser space even after its antitrust issues, but it let its quality slip too much for users to accept. Given viable options, users switched.
That switch was truly remarkable given the much higher friction. IE still came bundled with Windows; Chrome did not. Every home computer running Chrome represents a user who ignored the option sitting right in front of them and chose Chrome instead. Now just think about how much easier it is to use a different search engine.
I'm not saying Google is doomed, but 20 years of market dominance guarantees nothing. The "big 3" US automakers owned the market for longer than Google's founders have been alive, but those days are now just another cautionary tale of poor quality and unassailable arrogance.
The last few weeks I've started noticing a very specific type of SEO that pops up when I'm doing technical searches, where the first result will be a Stack Overflow post, and the 3rd or 4th result will be from some content farm, copy-pasted from SO, sometimes machine-translated into French.
It can be worse than that when those sites get a full multi-line result billing whereas the original stackoverflow answer gets a single-line subheading under some other SO result.
If you start getting a little esoteric in your searches you’ll get tons of results that are clearly crawled from personal blogs, and hosted on personal-blog-looking domains that redirect to godawful garbage. Especially bad on mobile because Google truncates the URLs.
I haven't gotten real SO as a Google result in years, only those content farms, constantly. Nowadays the same even happens for GitHub issues; they're also mostly outranked by content farms copying from them.
If I search on mobile, often all my results are these content farms. (Google used in English from Germany)
That's why I append reddit, stackoverflow, superuser when I search for technical solutions. At least those sites are still full of user-generated content with good answers upvoted to the top.
You know, I was joking the last few times the subject came up, but I'm getting seriously worried that the more people mention using that kind of trick on HN, the faster advertisers will catch on and start building reddit-based SEO strategies.
Reddit has been gamed by guerilla advertisers for years, everyone knows it, and the admins there don't seem to care/are unable to do anything about it.
r/HailCorporate used to be about calling out stealth marketing/advertising but it's morphed into just discussing how things can inadvertently act as an advertisement aka society is full of branding and consumerism. It's a shame because it used to be a very high quality sub.
Prefer resources that have some governance and aren’t entirely crowdsourced. For example if I’m looking for web tech answers my first search is ‘[whatever topic] mdn’.
Oh, it's no secret. Google's autocomplete will actually suggest appending "reddit" to certain queries. For example, let's take one of the most SEO-spammy queries imaginable, "best mattress 2021". Google will suggest:
- best mattress 2021
- best mattress 2021 consumer reports
- best mattress 2021 reddit
- best mattress 2021 for back pain
- best mattress 2021 wirecutter
etc.
But of course Reddit is already rife with shills. Not sure about CR.
I remember in the late 2000's I had a CR account. I had two weeks left on the period I had paid for. But when I cancelled the account... poof. My access was revoked immediately. Very much not consumer friendly. I was done enough with their crap that I didn't even bother with an email.
FWIW I signed up for CR recently when I was car shopping, and I canceled my subscription within the first month. They assured me that I would still have access for the remainder of the period. Of course, you're forced to subscribe rather than buy access for a set period, and they sent me a couple dozen emails during the time I was signed up, so they're not completely innocent... but at least that part felt reasonable.
I've been trying to unsubscribe from CR email spam for months now to no avail. Looking at the browser tools, it seems that their api can't handle the fact that I registered with a single letter first/last name so therefore my attempts to unsubscribe silently fail. There also appears to be no way to change my name since the api for that also fails on the single letter first/last name. I wish ungood things to happen to the people who 'designed' this Kafkaesque rubbish and in the meantime, thank GMail's mark-as-spam feature for throwing away their unrelenting pablum to the memory hole. This experience has led to me canceling my print subscription to CR plus my donations to their organization.
I keep getting results to a site 'gitmemory.com' which is just GitHub issues scraped. Super annoying that they outrank the actual GitHub issues they've taken the content from.
How is this not just spam and duplicate content? I remember when I was punished by G for duplicate content on my very small private blog, back when I was using Jekyll and had the Markdown sources and the code stored on GitHub. I didn't know of the canonical tag back then and was punished because the GitHub domain had more trust.
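For anyone else who hadn't run into it, the fix is a one-line tag in the page's head (the URL here is a placeholder):

```html
<!-- Tells crawlers which URL is the authoritative copy when the same
     content is reachable from more than one place (e.g. blog + GitHub) -->
<link rel="canonical" href="https://example.com/posts/my-article/">
```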
It is sad, but nowadays I often just jump directly to page 3 on Google, or use other "tricks" to get okayish results.
That's a bit harsh but I agree that it is starting to fail to live up to the expectations I had with Google when it came out and destroyed Altavista in a spectacular shower of sparks.
Could I tender "uBlacklist" as a stopgap, amongst others, as we await Google being given a right old kicking?
Despite being a staunch Arch Linux user I have to deal with rather a lot of MS Windows related stuff. Being able to filter out that bloody awful Microsoft Social thing gets me closer to decent results. The majority of the next 10-100 results will be CnP clones of someone's blog, but a human is able to pick out the real thing reasonably quickly. I'm toying with blocking Stack Overflow and other cough stalwarts to see if results get better for me.
In my opinion: the www has hit a crossroads or perhaps a Spaghetti Junction or a Magic Roundabout for the last five years or so and continuing. However the exits are connected to the entrances on these road systems (take a look at them - they are real junctions. The MR is particularly terrifying but it works really well.)
I still won't use words like cesspool for this, but I am increasingly losing my patience over the standard of results from Google. Those featured things (not the Ads - that's fine) at the top, which add #blah_blah to the URL to colour search terms yellow, are not working for me. The quality of the returns featured in a box is often rubbish too. It would be nice to be able to turn all that stuff off.
I understand that Google are trying to "be" the internet to try and keep the stock ticker pointing north but there seems to be a point when they have overreached themselves and I think that was passed several years ago. I also increasingly feel that Google thinks that it knows best and has removed many choices from their various UIs - that comes across as a bit arrogant.
Many years ago I left Altavista behind for Google. I will move again if I feel I have to. Of course that's not much in the grand scheme of things and I'll probably only take around 100,000 people with me but they have friends - still probably not a big deal.
I appreciate a lot of what you're saying in this comment but I disagree with this sentiment:
> not the Ads - that's fine
In my strongly held opinion, push advertising is not fine and it's the root cause of all the problems you are discussing. We will only exit this mess that the web has become when everyone blocks push advertising by default. People should only see advertising when they are interested in being advertised to, e.g. sites you consciously choose to go to that advertise products & services, like the old Yellow Pages phonebooks.
We were already there when Google was the hot thing all the nerds loved. At the time their search was a way to cut through that, not the primary window into it. The cesspool isn’t Google, now it’s just hosted by them.
There was already SEO stuff going on back then; people were just less aware of it. I can remember during the height of the Iraq war people manipulated Google to display George Bush as the top result for "miserable failure", and there were other exercises like that happening.
It's hard for me to pick a sweet spot for the internet in many ways I feel like I've grown up with it.
I can remember the web of circa 1995 to 1997, with GIFs that wouldn't render properly in Internet Explorer, HTML marquee scrolling text, and the dreaded blink tag being used everywhere. You needed to play search engine bingo with AltaVista, Metacrawler, Yahoo, Infoseek, Lycos, etc. And it was a crapshoot whether search engines would give you useful results.
I can remember the web of 1998 to 2000, where every web developer seemed to discover HTML frames at the same time. We had good search with Google, but pop-up ads were so rife that the internet was borderline unusable. I can remember all the free webmail sites like Hotmail, Yahoo, etc. ICQ chat was massive (whatever happened to that? It was a staple of my teen internet).
In the early 2000s Firefox came along and saved the internet by virtue of its built-in popup blocking. But there was a mishmash of "applets" and "plugins" everywhere: Flash Player, Java applets, RealPlayer, etc. Video (and audio) on the web was terrible: half the time it would complain about missing codecs, it would buffer forever, and if something did load it would be the size of a postage stamp and look pixelated as all hell. I remember Gmail came out and everyone went gaga over its interface.
The last period that really stands out is the mid-to-late 00s, with the development of the big social media sites: Facebook, Twitter, YouTube, etc. The web got more and more JavaScript heavy. Web video streaming finally became usable. Google Chrome came out, and Flash Player finally began to die, despite Microsoft trying to compete with Silverlight.
I kind of feel like this last 10 years are a continuation with increased surveillance and tracking.
No, the keyword ignoring stems more from catering to the majority of people who don't know how to logically formulate a search for a search engine that expects every word to match. Most people will intuitively just try to ask the search engine a question (even if not literally phrased as such), and so Google has adapted to fill that need. Which even for those of us who would prefer something a bit more clear cut, is honestly handy a lot of the time.
I think using +plus +before +keywords still works for situations when you don't want any words ignored?
Certainly agree it seems like they could do a better job of burying auto-generated sites though. (Although I'm sure it's a difficult problem!)
In general I had more relevant results on my first search query than I do now. Admittedly that's hard to prove, as I can't rerun the searches side by side for a comparison now.
Additionally, ads were firmly separated from the actual results in a colored box.
As mentioned, I think rose-colored glasses won't put lipstick on this pig. Google Search (and I'm not sure how Bing or similar would do better, barring their censorship problems) is increasingly a minefield...
This is the same problem with something like WoW classic... you can get the game that existed 15 years ago. But even if it is the exact same game, the world itself isn't. Online walkthroughs, videos, modding knowledge, theory crafting, etc. Those things are much more fleshed out today so even if the system didn't change 1 bit, WoW Original vs WoW Classic are really two separate games.
Likewise... if you dropped Google Original down today? I'd love to see how fast it would get owned by these sorts of operations, which have had a decade+ of practice in skills like SEO that didn't exist in 2010.
You had more relevant results? That wouldn't change because companies live and die off of SEO now and didn't then. Highlighted ads are such a small thing on the website when compared to getting a full front page of the same Stack Overflow answers in 20 different websites that all have SO cloned and reskinned.
Yandex.com is 2010 Google search, IMHO. It's not filtered at all and seems to have that pure PageRank feel of the old Google search engine, while modern Google seems to be hand-tweaked quite a bit to only quote "authoritative sources". Search for politically controversial topics all you want on Google and your first couple of pages will be little but debunking or fact-check sites. Compare Google's search results for "who is zhengli shi" vs. the Yandex.com results, for example. You can even find Putin scandals and "Tank Man" on there, even though it's a search engine based in Russia.
I think we have loads of tools to play with but fundamentally there is a problem when you are fighting with your search engine to find stuff you want to find.
My laptop (Arch) still has Chromium as default with uBlock Origin, Privacy Badger, uBlacklist and a few others running. I will be moving back to FF and running a sync server because I am that pissed off and able to do so. I'll also take a few others with me (between 2 and rather more)
When I say move back to FF, I'm talking about something like reverting a 10-15 years change.
I've always had FF available, but it fell short back in the day for long enough for me to move to the Google thing. Now I think I'll go back.
No one at G will lament their loss; I'm not even a rounding error. I'm sure that all is fine there.
If you're referring to user-curated search result blocking, that's very easy with DuckDuckGo and uBlock Origin (just block elements like [data-domain="w3schools.com"]; see my comment to the GP). I don't know of any large extant lists like this though.
That won't do much if every result on the first page is blocked. Ideally a filter list like this could be pushed to the server side as a per-user preference to go with your query, so that if e.g. the top 10000 results were all filtered out, then you wouldn't have to click through (or infinite-scroll autoload) 100 empty pages before getting anything.
DDG will add more results, if enough are hidden. If I search "w3schools" with my filter, there are only two results on the first page that are not hidden, so it immediately displays the second page below. It seems that they planned for this use case.
I used to have an automatic google search-domain blocker. It was just front-end though so if a page would have website domains that were useless, it would only have 1 or 2 results on it unfortunately. Something a little better integrated would be nicer.
Comparing Google now to AltaVista is not very helpful. They don't get to rest on their laurels. Search is less helpful now, and it's not clear to me that they care enough to do something about it.
You're giving their entire search budget credit for dealing with spam results? My observation is that it's bad and has been for some time. They are either unable or unwilling to solve the problem.
Anecdotally DuckDuckGo seems to have fewer sponsored sites than Google. DDG also makes it easy to block low-quality sites because it adds a data-domain attribute to the root of every search result. I recently started this mini uBlock Origin filter list for that (suggestions welcome!):
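For example, a starter list might look like this (the domains here are just examples; the selector relies on that data-domain attribute DDG adds to each result):

```
! uBlock Origin cosmetic filters: hide DDG results from given domains
duckduckgo.com##[data-domain="w3schools.com"]
duckduckgo.com##[data-domain="gitmemory.com"]
duckduckgo.com##[data-domain="pinterest.com"]
```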
The reason for that is actually rational: when Amit Singhal was in charge the search rules were written by hand. Once he was fired, the Search Quality team switched to machine learning. The ML was better in many ways: it produced higher quality results with a lot less effort. It just had one possibly fatal flaw: if some result was wrong there was no recourse. And that's what you are observing now: search quality is good or excellent most of the time while sometimes it's very bad and G can't fix it.
I wouldn't call that rational. There is no reason you can't apply human weighting on top of ML.
Honestly, I don't believe for a minute they "can't fix it." They do this sort of thing all the time, for instance when ML shows dark skinned people for a search for gorilla, they obviously have recourse.
I’m confused. I read that article and it has this:
> But, as a new report from Wired shows, nearly three years on and Google hasn’t really fixed anything. The company has simply blocked its image recognition algorithms from identifying gorillas altogether — preferring, presumably, to limit the service rather than risk another miscategorization.
Is that not an example of human intervention in ML?
Yes, they can. They should simply stop measuring only positives and start measuring negatives - e.g. people who press the back button of their browser, or who click the second, third, or fourth result afterwards - which should hint to the ML classifiers that the first result was total crap in the first place.
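As a toy sketch (emphatically not Google's actual pipeline; the log format, domains, and dwell threshold are all made up), aggregating that kind of "pogo-sticking" signal is almost trivial:

```python
# Toy sketch: treat a click followed by a quick return to the results
# page as a negative vote for the clicked result.
from collections import defaultdict

# Hypothetical click log: (query, clicked_url, dwell_seconds, returned_to_serp)
CLICK_LOG = [
    ("fix wifi", "contentfarm.example/5-tips", 8, True),
    ("fix wifi", "contentfarm.example/5-tips", 5, True),
    ("fix wifi", "superuser.example/q/1234", 180, False),
    ("fix wifi", "contentfarm.example/5-tips", 12, True),
]

DWELL_THRESHOLD = 30  # seconds; a faster bounce suggests a useless result

def bounce_rates(log):
    """Return, per URL, the fraction of clicks that bounced back quickly."""
    clicks = defaultdict(int)
    bounces = defaultdict(int)
    for _query, url, dwell, returned in log:
        clicks[url] += 1
        if returned and dwell < DWELL_THRESHOLD:
            bounces[url] += 1
    return {url: bounces[url] / clicks[url] for url in clicks}

rates = bounce_rates(CLICK_LOG)
# The content farm bounces on every click; the real answer never does.
print(rates["contentfarm.example/5-tips"])  # 1.0
print(rates["superuser.example/q/1234"])    # 0.0
```

A real system would obviously need to control for navigational queries, quick-answer pages, and so on, but the raw signal is sitting right there in the logs.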
But I guess this is exactly what happens if you have a business model where leads to ad-carrying sites shape your ethics: your company profits more from those scammers than from legit websites.
From an ML point of view, Google's search results are a perfect example of overfitting. Kind of ironic that they lead the data science research field and teach about this flaw everywhere, yet don't recognize it in their own product.
The fact that they do collect data does not mean that they use that data in any meaningful way or at all.
They ought to see humongous bounce rates on those fake SEO'd pages. Normally, that would suggest shit-tier quality and black-hat SEO, which is in theory punishable. Yet they throw that data away and still rank those sites higher up.
You mean to say that no one at Google has even heard of "external SEO", which is nothing more than a fancy way of saying link farming? They do know; this is punishable according to their own rules, yet it works, because either they cannot fix it or do not care to.
They'll never tell how they use the data for obvious reasons and I also can't go into any details. But any obvious thing you can think of almost certainly has been tried, they've been doing it for 20+ years and ranking alone is staffed with several hundreds of smart engineers. Mining clickthrough logs is a fairly old topic itself, has been around since at least early 2000s.
I worked in ranking for two major search engines. They all measure this; this is really low-hanging fruit. How long did it take you to come up with this idea? Why do you think so lowly of people who have put decades of their lives into these systems that they didn't think of it?
Technically just open google serp in developer tools, network tab, set preserve/persist logs option, and watch the requests flowing back - all your clicks and back navigations are reported back for analysis. Same on other search engines. Only DDG doesn't collect your clicks/dwell time - but that's a distinguishing feature of their brand, they stripped themselves of this valuable data on purpose.
Again, this is not about data being collected, we do know how much data Google collects, it is all about what is being done with the data and by extension how good the end result is.
This touches the broader subject of systems engineering, and especially validation. As far as I am aware, there are currently no tools/models for validating machine-learning models, and the task gets exponentially harder with the degrees of freedom given to the ML system. The more data Google collects and tries to use in ranking, the less bounded the ranking task is, and therefore the less validatable, and therefore the more prone to errors.
Google is such a big player in the search space that they can quantify/qualify the behavior of their ranking system, publish that as SEO guidelines, and have the majority of good-faith actors behave in accordance, reinforcing the quality of the model: the more good-faith actors actively compete for the top spot, the more the top results belong to good-faith actors. However, as evidenced by the OP and other black-hat SEO stories, the ranking system can be gamed, and data points which should produce a negative ranking score are either not weighted appropriately or in some cases contribute a positive score.
Google search results are notoriously plagued with Pinterest results, shop-looking sites which redirect to Chinese marketplaces, and similar. It looks like the only tool Google has to combat such actors is manual domain-based blacklisting, because otherwise they would have done something systematic about it by now. It seems to me that the ranking algorithm at Google is given so many different inputs that it essentially lives its own life, and changes are no longer proactive but reactive, because Google does not have sufficient tools to monitor black-hat SEO activity and punish sites accordingly.
So they do collect it, they only ignore it - just like the 10 - 30 (or more) clicks I've spent on the tiny tiny [x] in the top corner of scammy-looking-dating-site-slash-mail-order-bride ads that they served me for a decade?
My impression is that the ML algorithms at Google have the goal of increasing profitability from search. If that is the case, the quality of search will tend to be secondary to displaying pages that bring more revenue.
Since this is now the top spot here on HN, I suspect it just got the attention of some Googlers, who I'm sure will review it.
They may not give the site a manual action, though. They’d rather tweak the algorithm so it naturally doesn’t rank. Google’s algo should be able to see stuff like this.
I know that I’ve seen sites tank in the rankings because they got too many links too quickly. It could be that the link part of the algorithm hasn’t fully analyzed the links yet.
I’d be interested in seeing what the Majestic link graph says about this site, ahrefs doesn’t have tier 2 and tier 3 link data.
I really don't like how easy it is to fake a "new" article on Google. You can just re-publish an old article, stick a new date on it, and Google takes it at face value and uses the new date.
I ran into this for the first time yesterday when trying to find new info about a footy player. Some article from 15 years ago talking about how he had a good first game, tagged as 5th June 2021. Like, wtf?
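Part of the problem is that the publish date is largely self-reported via structured data, which Google generally trusts. A republished article can just ship something like this (hypothetical example):

```json
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Youngster impresses on debut",
  "datePublished": "2021-06-05",
  "dateModified": "2021-06-05"
}
```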
I have been seeing this a lot recently too. Especially with the first result or two. Or the section up top that gives you a partial answer without having to click through. All of them always seem to have been freshly written like some made to order meal at a restaurant. It’s just too suspicious really.
I still think that the "Yahoo!" style web directory is a good model. A catalogue of hand-curated links has increasing value as the quality of Google results goes down.
I was briefly going to write "I'm surprised that DMOZ[1] still exists" but it says "Copyright 2017 AOL" at the bottom so maybe it doesn't.
Edit: ...and using the search box results in a 404 so I guess it's really dead huh.
The creation and maintenance of such a directory might also be more feasible now because, sadly, there are far fewer personal or independent websites, as opposed to content hosted on large platforms.
I just tried to use both to look up pharmacies via navigation. With DMOZ, after my second try I was able to find CVS, but I wasn't able to find it with Curlie.
It's not a bad idea to have a curated dataset of information. But clearly there are much better ways to navigate said information, which would include search, but also dynamic filters, predictive text, sorting algorithms, context awareness, etc. All of which... is built into modern search engines.
So perhaps what we really want is a Wikipedia/OpenStreetMaps of curated, indexed, semantic content/links, that anyone can consume and write their own search interface for. Basically, an open data warehouse of website information.
> A catalogue of hand-curated links has increasing value as the quality of Google results goes down.
Who will pay for its creation, maintenance and hosting? Who will judge ranking, disputes, hacks?
Who will have an eye on discrimination issues? Whose jurisdiction will be relevant (think GDPR or the Australian press "gag order" law in the case of that cleric accused of fondling kids)?
Who will take care that the humans who get exposed to anything from generic violence through vore/gore to pedo content have access to counseling and are fairly paid? Facebook, the world's largest website, hasn't figured that one out, ffs.
These questions are ... relatively easy to bypass with an automated engine (all issues can be explained away as "it was the algorithm" and IT-illiterate judges and politicians will accept this), but as soon as you have meaningful human interaction in the loop, you suddenly have humans that can be targeted by lawsuits, police measures and other abuse.
> as soon as you have meaningful human interaction in the loop, you suddenly have humans that can be targeted by lawsuits, police measures and other abuse.
In theory, you could have a curated directory whose hosting works like ThePirateBay, and whose maintainership is entirely anonymous authors operating over Tor (even though the directory itself holds nothing the average person would find all that objectionable.)
TPB is not a good example, since they allow everything except pedo content, thus drastically shrinking their moderation workload.
A site that wants to be compliant to the law in the major jurisdictions (US, EU) can't operate that way, not with NetzDG, copyright and other laws in play.
It doesn't need to be a corporate enterprise that has to worry about all those things. People already share directories of links via Google Docs, Notion notebooks and the like.
The irony being that 20 (more like 25?) years ago, Yahoo search was ripe for disruption... by Google :)
Halt and Catch Fire [1] (as a nerd, I can say it's one of the few TV series that got the hacker spirit right) had a few episodes about the Google disruption.
Like some people often say here, things come and go in circles...
That just implies locking into an ad-supported model. Personally, I would prefer to pay. Stuart Russell wrote in his book that when surveying humans, the value they ascribed to not being able to Google for a year was something like $17,000. Just some absurd number.
Nobody said it would be easy. Industries ripe for disruption are often very hard to break into. Being ripe for disruption is more about giving up on innovating so you stagnate.
Free WordPress* themes are particularly bad in this regard. Since they're expected to contain HTML anyway, it's altogether too easy for the author of a theme to include a couple of links to a site they want to promote. Some themes take this to the next level by obfuscating the code that generates the promotional links, and/or including other code which makes the site not work properly if the links are removed.
*: and themes for other web applications, but mostly WordPress these days
Hmn. I would agree about all crap being mixed in there, but in terms of overall results (both wrt. SEO crap and other irrelevant stuff), my experience has been that the quality troughed something like 2-3 years ago and then came back (my guess is that they're incorporating all of the AI they've been doing throughout the company into search). To me it feels like it's about 80% of its best right now.
I bet it's that we do different types of searches.
I swear, Pinterest must have employees working undercover in the Image Search team for Google to have let them destroy image search results the way they have.
It's literally never the original source for anything, but you can bet it's most of the first 10 pages of results. Then it doesn't even let you right click to open the image file, and dumps you to a login prompt if you click on anything. THAT'S NOT EVEN YOUR IMAGE STOP TELLING ME WHAT I CAN DO WITH IT.
Really makes you wonder if the people at Google actually use their own product. Anyone who has used Google image search in the past couple of years will have noticed that it's filled to the brim with garbage results from Pinterest.
I have fallen in love with Yandex image similarity search (search by providing a query image, not text). You can find so much more with it, it's like Pinterest but without the crap. For example I could find images for my ML model but also furniture ideas for my house and check if my kid is objectively cuter than average (lol, yeah, objectively!).
And if it is not a Pinterest link it is an AMP link, which is equally bad in my experience. I just want to link a picture, not a link to a page that might have the picture but might also have the entire article/Reddit discussion and not the image I was searching for.
When I'm reverse image searching something it's often to find the original artist of an illustration, photo, or whatever. I want to know who made it, see their other work, and find it in its original quality without 15 generations of jpg recompression artifacts.
But no, Pinterest has better SEO than the artist does, so it's just endless reposts upon reposts and never the original work.
Occasionally you get lucky and it's not the sort of image that Pinterest users share. Then you might actually find where it came from.
THIS. So much this. Time was when you could actually discover the provenance of an image. Almost every time, when I’m doing a reverse image search, that is my intent. It used to work. It seldom does these days.
And the interesting thing about that is, you'd think it would be (relatively speaking) straightforward for Google to keep track of the first place a given image was indexed (or possibly the first few places, or everywhere it was seen over the first X period of time since you couldn't guarantee the very first would always be the original). Assuming that original was still online, it would seem to be the place to direct searchers to, regardless of pagerank or whatever.
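As a back-of-the-envelope sketch of what the parent describes (this is my own illustration, not Google's actual pipeline), a crawler could key images by a content hash and remember only the first URL at which each hash was seen:

```python
import hashlib

# Hypothetical first-seen index: content hash -> (url, crawl_time).
# Later crawls of reposts resolve back to the earliest recorded source.
first_seen = {}

def record_image(image_bytes, url, crawl_time):
    """Remember the earliest URL observed for this image's content."""
    key = hashlib.sha256(image_bytes).hexdigest()
    # setdefault keeps the existing (earliest) entry if one is present
    first_seen.setdefault(key, (url, crawl_time))
    return first_seen[key]

orig = record_image(b"\x89PNG...", "https://artist.example/work.png", 100)
repost = record_image(b"\x89PNG...", "https://pinterest.example/pin/123", 500)
assert repost == orig  # the repost resolves to the first place the image was seen
```

A real system would need perceptual hashing (recompressed reposts don't hash identically) and the "first few places" caveat from above, but the bookkeeping itself is trivial.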
I'd expect a company like Google, who tracks what kind of socks you have on every day, to also track their own search engine: the user mistakenly clicks on a Pinterest link, immediately clicks back, and looks for something else... is it so hard to infer that they don't want Pinterest results, because they're useless, and lower their ranking accordingly? Nooo, of course not, just put the Pinterest results near the top, until the user puts "-pinterest" in the search bar.
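For anyone fighting the same battle, the exclusion operators (this is standard, documented Google query syntax) look like:

```text
linux wallpaper -pinterest           drops pages that mention the word
linux wallpaper -site:pinterest.com  drops the whole domain (more reliable)
```

The `-site:` form is the one worth memorizing, since Pinterest also operates regional domains (`pinterest.co.uk`, etc.) that may need excluding separately.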
All these same sites appear near the top of Bing searches too. There's nothing particularly Google-specific to this story. It's about SEO hacking that will work against anyone with a PageRank-style system.
Indeed. I recently noticed this while relying on DDG to find documentation for Common Lisp, a language I'm still learning. The top-ranking site for any Common Lisp function was an SEO scam site, where clearly someone had hired freelancers to take preexisting Common Lisp documentation and rewrite it, in poor-quality English, until it would no longer be detectable as copyright violation, then loaded it with ads.
(I just checked and this copycat documentation site has, thankfully, now been pushed down a bit in DDG results.)
Note that, as I quite recently learned, DDG supports a bunch of bang commands, listed at [1]. There are bangs for documentation sites for all kinds of programming languages, including a couple for Lisp, it seems.
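A few examples of the syntax (I believe these particular bangs exist, but check the list at the link above):

```text
!clhs format     searches the Common Lisp HyperSpec directly
!mdn fetch       searches MDN web docs
!w lisp machine  searches Wikipedia
```

The bang skips DDG's results page entirely and lands you on the target site's own search, which sidesteps the SEO-copycat problem for documentation lookups.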
I think it's high time we had a webring resurgence. It's impossible to get anywhere with plain search anymore; what we need is curated websites, where other domain owners are happy to say "I endorse the people running this site, so if you like my stuff you'll like them too".
I'd like to see that, and I'd be happy to post various webrings and blogrolls.
One of the things that killed them, imho, is when Google started penalizing sites that linked to certain other sites.
This was compounded by the expired-domain market.
WordPress even took out link rolls around that time; people that had them in sidebar widgets saw them disappear unless they installed a new plugin to bring them back.
Webrings that auto-add the nofollow tag could, I guess, make them okay for people again.
Might be cool to have a GitHub-type page with a list of rings to recommend: a script auto-pulls it into your page, adding nofollow, and then other people could copy your list or clone/fork it.
It’s fundamentally about trust models. That is to say, about the audience.
Everybody gave up trusting webrings because Google provided better results. Now that Google results are shit, there’s room for other information vendors to come along, even if it’s in narrow areas.
Actually, HN is already this for me in some respects.
In my opinion this site is not really so different from, say, Reddit, beyond having more focused rules and being smaller. So I don't think my idea that social media have supplanted the webring is wide of the mark.
This is my view too. Yes, I’d love to go back to a time when Google’s algorithms were unknown enough for SEO to be futile but those days are gone and the problem isn’t limited to Google.
I also noticed that Apple users see way more fake online shop results than Linux users, from the same IP, with regularly cleared browser cache and identical search terms.
Those fake shops are part of discussions in politics right now. Usually they're registered in Ireland or Malta as companies due to their specific banking laws. They make millions with those scams, and people can't tell the difference between legit online shops and fake ones, because the legit ones actually look crappier than the fake ones when it comes to website design.
In Germany, we have at least for hardware the "geizhals" website which is kind of an index for all kinds of electronics shops and they try to verify as much as possible.
But for other online shop sectors (e.g. clothing or home stuff) I wouldn't trust anything. Even on Amazon I got scammed a lot, and heard absurd stories from others, like getting packages with nothing in them and Amazon refusing to see that the seller is a scammer.
Google is a cesspool because the spammers and SEO-hackers are in full force, and Google is only reactive to these threats these days. I mean, does it really matter if they are making hundreds of billions of dollars a year? They seem to be doing something right.
The only time something will change is when traffic starts decreasing to their site, but it's good enough such that people won't change. Look at Facebook, I don't know anyone who uses it as much as they used to 10 years ago, but it's making the most money it ever has. Why on earth would any behavior change? From their points of view, everyone is happy with it!
Google isn't the cesspool; people who want to appear at the top of a list of search results are doing whatever it takes to create one, because that's what it takes to earn more money.
Being willing to make other things worse in order to have more money always creates cesspools.
There is never only a single way to make money. Some ways are easier. Some ways let you take advantage of others; these are the variety that create cesspools.