Oracle FAQ Your Portal to the Oracle Knowledge Grid


Re: Save Information to DB at Crawling

From: DA Morgan <damorgan_at_psoug.org>
Date: Wed, 16 Aug 2006 07:22:42 -0700
Message-ID: <1155738162.637701@bubbleator.drizzle.com>


mich dobelman wrote:
> I am trying to write a crawler that grabs information and stores it in a
> database.
> The web site is structured as follows.
>
> REGION
> CATEGORY
> PROPERTY LISTING
>
> The site has about 50 regions. Each region has 20 categories or fewer, and a
> single category can hold as many as 2000 properties (20 properties are
> displayed per page). To get all the information, my crawler visits each
> property page and uses regular expressions to extract specific fields
> (Price, BR, Contact Info, etc.).
>
> I am having trouble deciding when and where to save the data to the database.
> Note that this crawler is scheduled to visit the website every day, and if a
> property's information has not changed since its last modification date, the
> crawler skips that property.
>
> I created the following tables to store this information:
>
> Region Table
> ID, Region Name
>
> Category Table
> ID(1~20), Category Name
>
> Property Table
> ID, Category, Name, Address, Price, Contact Info, Bedrooms,
> Location(Lat), Location(Lon)
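
For illustration only, the skip-if-unchanged step described in the quoted post might be sketched as below. This is a minimal sketch under loud assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real database, a trimmed-down property table, and a hypothetical last_modified column that does not appear in the poster's table list. The names create_schema and save_if_changed are invented for the example.

```python
import sqlite3


def create_schema(conn):
    # Trimmed-down property table; last_modified is an assumed column
    # that is not in the original poster's table list.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS property (
            id            INTEGER PRIMARY KEY,
            category_id   INTEGER,
            name          TEXT,
            price         REAL,
            last_modified TEXT
        )
        """
    )


def save_if_changed(conn, prop):
    """Store one crawled property; return False when it is skipped as unchanged."""
    row = conn.execute(
        "SELECT last_modified FROM property WHERE id = ?", (prop["id"],)
    ).fetchone()
    if row is not None and row[0] == prop["last_modified"]:
        return False  # same modification date as the stored row: skip
    # New or changed listing: insert it, or overwrite the stored row.
    conn.execute(
        "INSERT OR REPLACE INTO property "
        "(id, category_id, name, price, last_modified) VALUES (?, ?, ?, ?, ?)",
        (
            prop["id"],
            prop["category_id"],
            prop["name"],
            prop["price"],
            prop["last_modified"],
        ),
    )
    conn.commit()
    return True
```

Calling save_if_changed once per property page, right after the fields are extracted, keeps each write small and lets an interrupted daily run resume without redoing saved listings. On Oracle the insert-or-update step would more likely be written as a MERGE statement; whether to commit per property or per batch depends on details the post does not give.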

There is insufficient information here to offer you much advice other than to suggest you take a class on Oracle PL/SQL programming and that, in the future, you always post information about tools, operating systems, hardware (when appropriate), and versions.

To design your schema would require a complete copy of the business rules and about $250/hr. Others will undoubtedly be less expensive. ;-)

-- 
Daniel A. Morgan
University of Washington
damorgan_at_x.washington.edu
(replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org
Received on Wed Aug 16 2006 - 09:22:42 CDT
