Title: Can we block "print pages" in robots.txt?
Post by: ripper234 on October 26, 2012, 08:19:05 AM

Often, the top Google result for a query is a print page, for example https://bitcointalk.org/index.php?action=printpage;topic=106596.0
I don't know why these pages rank relatively high on Google, but they're irrelevant, since the same content always appears in a better form if you remove "action=printpage" from the URL. Can we add a directive to the site's robots.txt file instructing robots not to index anything with the string "action=printpage" in its URL?

Title: Re: Can we block "print pages" in robots.txt?
Post by: Meni Rosenfeld on October 26, 2012, 08:23:28 AM

Wap2 versions also appear very often.

Title: Re: Can we block "print pages" in robots.txt?
Post by: ripper234 on October 26, 2012, 11:19:58 AM

Also, rel canonical might be useful.

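To make the rel canonical suggestion concrete, here is a minimal Python sketch. It assumes, as the first post states, that dropping "action=printpage" from a print-page URL yields the regular page, and that query parameters are separated by ";" as in the example URL. The function names canonical_url and canonical_link_tag are made up for illustration and are not part of the forum software.

```python
import html
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL with the 'action=printpage' parameter removed.

    Assumption: parameters are separated by ';', as in the thread's example URL.
    """
    parts = urlsplit(url)
    params = [p for p in parts.query.split(";") if p and p != "action=printpage"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, ";".join(params), ""))


def canonical_link_tag(url):
    """Build the <link> element a print page could carry in its <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)


if __name__ == "__main__":
    print_page = "https://bitcointalk.org/index.php?action=printpage;topic=106596.0"
    print(canonical_link_tag(print_page))
```

A print page carrying such a tag tells search engines which URL is the preferred version, so the page can stay crawlable while the regular topic page is the one that gets ranked.
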
Title: Re: Can we block "print pages" in robots.txt?
Post by: deepceleron on November 16, 2012, 11:42:40 AM

theymos can make a robots.txt with these lines added:
User-agent: *
...(other stuff)
Disallow: /index.php?action=printpage
Disallow: /index.php?wap2

Title: Re: Can we block "print pages" in robots.txt?
Post by: J-Norm on December 06, 2012, 05:08:50 PM

not really hard to remove them from the url and hit enter just saying

Humans working for computers... ???

theymos can make a robots.txt with these lines added:
User-agent: *
...(other stuff)
Disallow: /index.php?action=printpage
Disallow: /index.php?wap2

Computers working for humans... ;D
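
The Disallow lines proposed above can be sanity-checked offline with Python's standard urllib.robotparser module. This is only a sketch: the "...(other stuff)" placeholder is left out, the user agent name "ExampleBot" is invented, and the last URL in the list is a hypothetical parameter order used to show a limitation, not a URL taken from the forum.

```python
from urllib.robotparser import RobotFileParser

# The rules from the thread, without the "...(other stuff)" placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php?action=printpage
Disallow: /index.php?wap2
"""


def build_parser(robots_txt):
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser, url, user_agent="ExampleBot"):
    """True if the rules allow the given (hypothetical) user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    urls = [
        "https://bitcointalk.org/index.php?action=printpage;topic=106596.0",
        "https://bitcointalk.org/index.php?wap2",
        "https://bitcointalk.org/index.php?topic=106596.0",
        # Hypothetical ordering: action=printpage is not the first parameter.
        "https://bitcointalk.org/index.php?topic=106596.0;action=printpage",
    ]
    for url in urls:
        verdict = "allowed" if is_crawlable(parser, url) else "blocked"
        print(verdict, url)
```

Plain Disallow rules match by prefix, so they only catch URLs that begin with "/index.php?action=printpage" or "/index.php?wap2". Blocking any URL that merely contains "action=printpage", as the first post asks, would need a wildcard pattern, which some crawlers support but this standard-library parser does not evaluate.
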