Oracle FAQ Your Portal to the Oracle Knowledge Grid

Re: Not allowing 'robot' apps/Webserver

From: Billy Verreynne <vslabs_at_onwe.co.za>
Date: 1998/03/02
Message-ID: <6dg93l$7np$1@hermes.is.co.za>#1/1

Daniel P. Looby wrote in message <6dfnb9$jot_at_acmex.gatech.edu>...
>Individuals have written some 'bot' or 'robot' apps to attack our website.

Duh. Without bots there would be no Yahoo, Excite, Lycos, Hotbot, etc. How else do you think these sites gather information?

When a bot hits a website, it traverses the links in each HTML page in order to add keyword data about the site and its pages to its database. A "good" bot will not spawn multiple threads and hit different pages of the same site simultaneously, but unfortunately there are still some "bad" bots that do exactly that. Bots are also fast: a single bot can request a substantial number of pages per minute.
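
To make the "traverses the links" part concrete, here is a minimal sketch in Python of how a bot might pull the links out of one page. It is only an illustration, not how any particular search engine does it. The names LinkCollector and extract_links, the example.com URL and the sample page are all made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the links found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content; a real bot would download the page first.
page = '<a href="/query/list">Query</a> <a href="about.html">About</a>'
links = extract_links("http://www.example.com/docs/index.html", page)
print(links)
```

A bot repeats this for every link it finds, which is why a single visit can turn into many requests.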

The "secret" is to allow the bot access to the pages you want indexed by the search engine, and to disallow access to dynamic pages (e.g. CGIs, database-generated pages, etc.). To standardise a format for telling a bot what it may and may not do, bot developers drew up a consensus document. You can view it at http://info.webcrawler.com/mak/projects/robots/norobots.html

Basically you have to create a robots.txt file (lowercase, as the standard names it) in the root directory of your webserver and disallow access to certain URLs. Anything not disallowed remains accessible, e.g.

--
User-agent: *                # these rules apply to any bot
Disallow: /bin/              # may not access any CGI's
Disallow: /query/            # may not access the query pages
--

Or to prevent any bot access:
--
User-agent: *              # any bot
Disallow: /                # f*& off
--
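
If you want to check what a well-behaved bot would make of these rules, Python's standard urllib.robotparser module can evaluate them. The sketch below is only a way to test a rule set, assuming a bot that honours the file. The helper is_allowed, the bot name "ExampleBot" and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets shown above, without the comments.
SELECTIVE_RULES = """\
User-agent: *
Disallow: /bin/
Disallow: /query/
"""

BLOCK_ALL_RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if a compliant bot may fetch url under rules_text."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


for path in ("/index.html", "/bin/search.cgi", "/query/list"):
    url = "http://www.example.com" + path
    print(path, is_allowed(SELECTIVE_RULES, "ExampleBot", url))
```

With the first rule set only the static page should come back as allowed. With the second, nothing is.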

Disclaimer - it's up to the bot author to support the robots.txt file. If you still have a bot problem, look at the webserver logs to find the bot's source, then contact the bot's owners and lodge a complaint.
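
As a rough sketch of the log check, the Python below counts requests per client host. It assumes your webserver writes Common Log Format, where the client host is the first field on each line; adjust it if your server logs differently. The function name count_requests_by_host and the sample entries are made up for illustration.

```python
from collections import Counter


def count_requests_by_host(log_lines):
    """Tally requests per client host, assuming the host is the first field."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


# Made-up sample entries (the addresses are reserved for documentation).
sample_log = [
    '192.0.2.10 - - [02/Mar/1998:10:00:01 +0200] "GET /query/1 HTTP/1.0" 200 512',
    '192.0.2.10 - - [02/Mar/1998:10:00:01 +0200] "GET /query/2 HTTP/1.0" 200 498',
    '198.51.100.7 - - [02/Mar/1998:10:00:05 +0200] "GET /index.html HTTP/1.0" 200 2048',
    '192.0.2.10 - - [02/Mar/1998:10:00:02 +0200] "GET /query/3 HTTP/1.0" 200 530',
]

counts = count_requests_by_host(sample_log)
for host, hits in counts.most_common(5):
    print(host, hits)
```

A host that requests many pages within a few seconds is a likely bot, and its address is where to start when tracking down the owner.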

regards,
Billy
Received on Mon Mar 02 1998 - 00:00:00 CST
