Bitcoin Forum

Other => Meta => Topic started by: ripper234 on October 26, 2012, 08:19:05 AM



Title: Can we block "print pages" in robots.txt?
Post by: ripper234 on October 26, 2012, 08:19:05 AM
Often, the top Google result for a query is a print page, for example https://bitcointalk.org/index.php?action=printpage;topic=106596.0

I don't know why these pages appear so high on Google (relatively), but they're irrelevant, since the same content always appears in a better form if you remove the "action=printpage" from the URL.

Can we add a directive to the site's robots.txt file instructing robots not to index anything with the string "action=printpage" in its URL?


Title: Re: Can we block "print pages" in robots.txt?
Post by: Meni Rosenfeld on October 26, 2012, 08:23:28 AM
Wap2 versions also appear very often.


Title: Re: Can we block "print pages" in robots.txt?
Post by: ripper234 on October 26, 2012, 11:19:58 AM
Also, rel="canonical" might be useful.


Title: Re: Can we block "print pages" in robots.txt?
Post by: deepceleron on November 16, 2012, 11:42:40 AM
theymos can make a robots.txt with these lines added:

User-agent: *
...(other stuff)
Disallow: /index.php?action=printpage
Disallow: /index.php?wap2
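
A minimal Python sketch of how a crawler might test URLs against the two Disallow lines above. It assumes plain prefix matching plus the `*` and `$` wildcards that Google supports and that RFC 9309 describes. The function name rule_matches and the shape of the wap2 URL are assumptions made for illustration; the print-page URL is the one from the first post. Percent-encoding and Allow/Disallow precedence are ignored.

```python
import re


def rule_matches(rule: str, url_path: str) -> bool:
    """Return True if a robots.txt path rule matches url_path.

    A rule is a prefix match; '*' matches any run of characters and a
    trailing '$' anchors the rule to the end of the URL.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start, which gives the prefix behaviour.
    return re.match(pattern, url_path) is not None


print_url = "/index.php?action=printpage;topic=106596.0"
wap2_url = "/index.php?topic=106596.0;wap2"  # assumed wap2 URL shape

print(rule_matches("/index.php?action=printpage", print_url))  # True
print(rule_matches("/index.php?wap2", wap2_url))               # False
print(rule_matches("/*wap2", wap2_url))                        # True
```

Under these assumptions the prefix rule catches print pages, but a wap2 marker that appears later in the query string would need a wildcard rule such as `Disallow: /*wap2`, and only crawlers that honor wildcards would apply it.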


Title: Re: Can we block "print pages" in robots.txt?
Post by: J-Norm on December 06, 2012, 05:08:50 PM
It's not really hard to remove them from the URL and hit enter, just saying.

Humans working for computers...  ???

Quote from: deepceleron on November 16, 2012, 11:42:40 AM
theymos can make a robots.txt with these lines added:

User-agent: *
...(other stuff)
Disallow: /index.php?action=printpage
Disallow: /index.php?wap2


Computers working for humans...  ;D