Aug 13

I briefly covered the importance of a robots.txt file in my recent article, 5 Tips for Proxy Webmasters. There has been a lot of discussion about this over at the proxy.org forums, and most people have concluded that it is an essential part of building a quality proxy website.

A correctly configured robots.txt tells search engine spiders where they can go on your website. If you don't stop spiders from accessing your proxy script, you could damage your website's search engine rankings and trust. Search engines can be a vital source of organic traffic to proxy websites, but if you don't block spiders from visiting proxied pages, your site could get penalized for the following:

  • Copyright/Trademark infringement
  • Being flagged as Spyware/Malware
  • Illegal/Duplicate content

Blocking spiders from proxied pages also stops them from draining server resources.

I have compiled some snippets of code for use with the most common proxy scripts. Simply create a robots.txt file, drop the code in, and upload it to the root of your website.

CGI Proxy:
User-agent: *
Disallow: /nph-proxy.pl/

PHProxy:
User-agent: *
Disallow: /proxy.php

Using this rule with the latest PHProxy version would require you to rename the proxy script to proxy.php and create a separate index portal page. There may be an easier alternative, but I run a custom proxy script, so I can't speak from first-hand experience.
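
If you want to check your rules before uploading, Python's standard urllib.robotparser module can read a robots.txt and report whether a given URL is blocked. The sketch below is only an illustration: it merges the CGI Proxy and PHProxy rules above into one group, and example.com, the sample URLs, and the crawl_permissions helper are placeholders I made up for the test, not part of any proxy script.

```python
from urllib.robotparser import RobotFileParser

# The CGI Proxy and PHProxy rules from this article, merged into one group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /nph-proxy.pl/
Disallow: /proxy.php
"""


def crawl_permissions(robots_txt, urls, agent="*"):
    """Return {url: bool}, where True means `agent` may fetch that URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}


# Placeholder URLs (assumptions for illustration only).
SAMPLE_URLS = [
    "http://example.com/",
    "http://example.com/index.php",
    "http://example.com/proxy.php?q=aHR0cDovL2V4YW1wbGUub3Jn",
    "http://example.com/nph-proxy.pl/http/example.org/",
]

if __name__ == "__main__":
    results = crawl_permissions(ROBOTS_TXT, SAMPLE_URLS, "Googlebot")
    for url, allowed in results.items():
        print("allowed" if allowed else "blocked", url)
```

With these rules the portal pages stay crawlable while anything served through the proxy script is blocked. Bear in mind that urllib.robotparser matches plain path prefixes, so rules that rely on wildcards may not be evaluated the way the search engines treat them.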

3 Responses to “Why Web Proxies Need A Robots.txt File”

  1. Roman Says:

    If you are using Zelune, I believe this is what your robots.txt should have:

    User-agent: *
    Disallow: /?*

  2. A-Z Proxies Says:

    The correct version to use for an unmodified PHProxy installation is:

    User-agent: *
    Disallow: /index.php?q*

    And the one to use for Glype is:

    User-agent: *
    Disallow: /browse.php

    Although from version 0.4 onward, Glype comes with this by default.

  3. Mike Says:

    Nice info, what about Glype proxy?
