Oracle FAQ Your Portal to the Oracle Knowledge Grid

Save Information to DB at Crawling

From: mich dobelman <sss_at_ddfd.com>
Date: Wed, 16 Aug 2006 09:33:53 GMT
Message-ID: <5EBEg.8354$Ch.2133@clgrps13>


I am trying to write a crawler that grabs information from a web site and stores it in a database.
The web site is structured as follows.

REGION

        CATEGORY
                    PROPERTY LISTING

The site has about 50 regions. Each region has 20 categories or fewer, and a single category
can have as many as 2000 property listings (displayed 20 per page). To get all the information, my crawler visits each property page and uses regular expressions
to extract specific fields (Price, BR, Contact Info, etc.).
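
As a minimal sketch of that extraction step: the post does not show the site's markup, so the sample page, the `PATTERNS` table, and the `extract_property` helper below are all hypothetical and would need to be adapted to the real pages.

```python
import re

# Hypothetical property-page fragment; the real site's markup is not shown in the post.
SAMPLE_PAGE = """
<div class="price">$250,000</div>
<div class="beds">3 BR</div>
<div class="contact">Jane Doe, 555-0100</div>
"""

# One pattern per field. Every pattern here is an assumption about the markup.
PATTERNS = {
    "price": re.compile(r'<div class="price">\$([\d,]+)</div>'),
    "bedrooms": re.compile(r'<div class="beds">(\d+)\s*BR</div>'),
    "contact": re.compile(r'<div class="contact">([^<]+)</div>'),
}


def extract_property(html):
    """Return a dict of extracted fields; a field that is not found maps to None."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(html)
        result[field] = match.group(1).strip() if match else None
    # Convert the numeric fields from text to integers.
    if result["price"] is not None:
        result["price"] = int(result["price"].replace(",", ""))
    if result["bedrooms"] is not None:
        result["bedrooms"] = int(result["bedrooms"])
    return result


info = extract_property(SAMPLE_PAGE)
```

Keeping one pattern per field means a layout change on the site breaks only the affected field, and a missing field shows up as `None` instead of raising an error.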

I am having trouble deciding when and where to save the data to the database. Note that the crawler is scheduled to visit the web site every day, and if a property's information has not changed since its last modification date, the crawler skips that property.
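
One possible approach, sketched here under stated assumptions, is to save each property as soon as its page has been parsed and to compare a stored modification date to decide whether to skip it. The sketch uses Python's built-in `sqlite3` module as a stand-in for the real database. The reduced `property` table, its `last_modified` column, and the `open_db` and `save_if_changed` helpers are hypothetical and only illustrate the skip logic.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create a reduced, hypothetical property table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS property (
               id INTEGER PRIMARY KEY,
               price INTEGER,
               last_modified TEXT
           )"""
    )
    return conn


def save_if_changed(conn, prop_id, price, last_modified):
    """Insert or update one property. Return False if it was skipped as unchanged."""
    row = conn.execute(
        "SELECT last_modified FROM property WHERE id = ?", (prop_id,)
    ).fetchone()
    if row is not None and row[0] == last_modified:
        return False  # same modification date as the stored row: skip
    if row is None:
        conn.execute(
            "INSERT INTO property (id, price, last_modified) VALUES (?, ?, ?)",
            (prop_id, price, last_modified),
        )
    else:
        conn.execute(
            "UPDATE property SET price = ?, last_modified = ? WHERE id = ?",
            (price, last_modified, prop_id),
        )
    conn.commit()  # one commit per property, so a crash loses at most one row
    return True


conn = open_db()
first = save_if_changed(conn, 1, 250000, "2006-08-15")   # new row
second = save_if_changed(conn, 1, 250000, "2006-08-15")  # unchanged
third = save_if_changed(conn, 1, 240000, "2006-08-16")   # modified
```

Committing per property keeps the crawl restartable. Committing once per listing page or per category would mean fewer round trips, at the cost of redoing more work after a failure. If the listing pages already show the modification date, the same comparison could run before the property page is fetched at all.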

I created the following tables to store this information:

Region Table
ID, Region Name

Category Table
ID(1~20), Category Name

Property Table
ID, Category, Name, Address, Price, Contact Info, Bed Rooms, Location(Lat), Location(Lon)

Received on Wed Aug 16 2006 - 04:33:53 CDT
