Facts:

1. Gophers are mammals.
2. Gophers burrow ALL the time.
3. The purpose of the gopher is to stay underground and serve awesome content.

@solderpunk I want search engines to not index my gopher.... how do I do that? 😅

@snowdusk_ @solderpunk Do search engines even bother with that protocol? And if they did, is there the equivalent of a robots.txt file? Not that those are enforceable anyhow...

@gemlog @solderpunk yeah they still get indexed by www search engines via html proxy 🔥

@gemlog @snowdusk_ I really wish I knew how to stop them, snowdusk! I think it would need everybody who operates a gopher webproxy to be vigilant with their robots.txt. But most of them aren't, and it only takes one for Google to sweep everything up. At the end of the day, indexing is fundamentally unavoidable. If you're serving stuff, people can save it.
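To make the robots.txt point concrete, here is a minimal sketch of what a vigilant proxy operator's rules might look like, checked with Python's standard `urllib.robotparser`. The `/gopher/` path prefix and the `proxy.example` hostname are assumptions for illustration, not the layout of any real proxy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a gopher-to-web proxy.
# The /gopher/ prefix is an assumption about where proxied content lives.
PROXY_ROBOTS_TXT = """\
User-agent: *
Disallow: /gopher/
"""


def proxy_allows(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(PROXY_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Proxied gopher content is off limits; the proxy's own pages are not.
    print(proxy_allows("https://proxy.example/gopher/example.org/1/", "Googlebot"))
    print(proxy_allows("https://proxy.example/about.html", "Googlebot"))
```

This only tells well-behaved crawlers what the operator wants; nothing here stops a crawler that ignores the file.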

@solderpunk @snowdusk_ @gemlog I didn't send a toot earlier but @solderpunk nailed it. Even the workaround I'm trying both alienates some/most of the userbase and would lead to an arms race if it caught on.

@dokuja @gemlog @solderpunk ah I see... this is why I stopped posting stuff on my gopher... So far search engines have been respecting my robots.txt on all of my www site pages 👍 thanks for the feedback everyone!

@solderpunk @snowdusk_
Also, there is nothing enforceable about robots.txt. It's just a gentleman's agreement with crawlers.

@gemlog @snowdusk_ @solderpunk Heh, I have a placeholder at that ~other place~ on just that subject. Guess I should actually commit it to file.

But I will say look to archive.org if you need an example.

@gemlog @snowdusk_ @solderpunk If gopher permits links and you can generate server-side code, you can write a script that just randomly generates links pointing back to the same script with different useless parameters. The crawler will give up or back out of the loop after a while.
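A rough sketch of that idea in Python, assuming the gopher server can run a script and passes anything after `?` in the selector to it (a common convention, but not guaranteed on every server). The selector `/cgi-bin/maze`, the host `gopher.example.org`, and the function name `trap_menu` are all hypothetical.

```python
import secrets


def trap_menu(script_selector: str = "/cgi-bin/maze",
              host: str = "gopher.example.org",
              port: int = 70,
              n_links: int = 5) -> str:
    """Build a gopher menu whose items all point back at the same script,
    each with a different random, meaningless parameter."""
    lines = []
    for _ in range(n_links):
        token = secrets.token_hex(8)  # useless parameter, new on every request
        selector = f"{script_selector}?{token}"
        # Menu item: type '1' (submenu), display text, selector, host, port,
        # separated by tabs.
        lines.append(f"1More reading {token}\t{selector}\t{host}\t{port}")
    lines.append(".")  # a lone dot ends a gopher menu
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(trap_menu(), end="")
```

Every response is a fresh menu of links the crawler has not seen, so a naive crawler keeps following them. Whether a given crawler actually gives up depends on its own depth or loop limits.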
