ASP: Request for Comment: Image retrieval database

From: abracad <abracad_at_hotmail.com>
Date: Tue, 10 Jul 2001 21:46:16 GMT
Message-ID: <3b4b6b62.639638_at_news.freeserve.net>


I have designed a search engine with ASP/VBScript for retrieving and displaying images and would be grateful for comments and suggestions for improvement.

The database structure is of the form:

photo---phkw---keyword
|
|
location---locPlace---places

photo consists of ID, description, locID
phkw consists of phID, kwID
keyword consists of keyword, ID
location consists of location, ID
locPlace consists of locID, placeID
places consists of placeName, ID

location refers to a fully qualified location, e.g. Burnley, Lancashire, England, Britain, UK
placeName is a single term e.g. New York, Asia, Brighton

The functioning of the search engine is as follows:

The user enters a search string in a text box. The string is parsed using VBScript to 1) remove apostrophes (to adhere to SQL syntax), 2) split it into individual (space-delimited) words, and 3) remove non-alphanumeric characters from the start and end of each word. The individual words are treated as search terms. Additional search terms are created by combining consecutive words into pairs (e.g. if 3 words were entered, 1+2 and 2+3 would be combined); this is to take account of two-word phrases.
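
The parsing steps above can be sketched as follows. This is a Python illustration rather than the VBScript actually used; the function name parse_search_terms is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits and that word pairs are joined with a single space.

```python
import re

def parse_search_terms(query):
    """Turn a raw search string into single-word and two-word search terms."""
    # 1) Remove apostrophes, as described in the post.
    cleaned = query.replace("'", "")
    words = []
    # 2) Split on whitespace.
    for raw in cleaned.split():
        # 3) Trim non-alphanumeric characters from both ends of the word
        #    (assumption: ASCII letters and digits only).
        word = re.sub(r"^[^0-9A-Za-z]+|[^0-9A-Za-z]+$", "", raw)
        if word:  # skip tokens that were pure punctuation
            words.append(word)
    # Combine consecutive words into pairs (1+2, 2+3, ...) for two-word phrases.
    # Assumption: the pair is the two words joined by one space.
    pairs = [words[i] + " " + words[i + 1] for i in range(len(words) - 1)]
    return words + pairs
```

For example, parse_search_terms('"new york" harbour,') yields the three single words followed by the pairs "new york" and "york harbour".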

An SQL query is then constructed for each search term, thus:

SELECT photo.ID, description, location
FROM photo, phkw, keyword, location
WHERE keyword=searchTerm AND photo.ID=phkw.phID AND phkw.kwID=keyword.ID AND photo.locID=location.ID
UNION ALL
SELECT photo.ID, description, location
FROM photo, location, places, locPlace
WHERE placeName=searchTerm AND locPlace.placeID=places.ID AND location.ID=locPlace.locID AND photo.locID=location.ID

The queries for all the search terms are joined with UNION ALL, and the final query is terminated with ORDER BY photo.ID.

The query returns one row for each image that matches a search term. UNION ALL ensures that images that match n search terms are returned n times. This is for the purpose of ranking the results in order of best match.
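
As a rough illustration of how the combined query behaves, here is a Python sketch run against an in-memory SQLite database. Several things here are assumptions and not part of the design described above: the use of SQLite, the sample data, the names build_query and run_demo, and the use of "?" placeholders in place of stripping apostrophes. ORDER BY 1 (the first result column, i.e. photo.ID) is used because it is the form SQLite is known to accept on a compound SELECT.

```python
import sqlite3

# The per-term query from the post; each "?" stands in for searchTerm.
TERM_QUERY = """\
SELECT photo.ID, description, location.location
FROM photo, phkw, keyword, location
WHERE keyword.keyword = ? AND photo.ID = phkw.phID
  AND phkw.kwID = keyword.ID AND photo.locID = location.ID
UNION ALL
SELECT photo.ID, description, location.location
FROM photo, location, places, locPlace
WHERE places.placeName = ? AND locPlace.placeID = places.ID
  AND location.ID = locPlace.locID AND photo.locID = location.ID"""

# Made-up sample data, for illustration only.
SAMPLE_DATA = """
CREATE TABLE photo (ID INTEGER PRIMARY KEY, description TEXT, locID INTEGER);
CREATE TABLE keyword (ID INTEGER PRIMARY KEY, keyword TEXT);
CREATE TABLE phkw (phID INTEGER, kwID INTEGER);
CREATE TABLE location (ID INTEGER PRIMARY KEY, location TEXT);
CREATE TABLE places (ID INTEGER PRIMARY KEY, placeName TEXT);
CREATE TABLE locPlace (locID INTEGER, placeID INTEGER);
INSERT INTO location VALUES (1, 'Brighton, East Sussex, England, Britain, UK');
INSERT INTO location VALUES (2, 'Burnley, Lancashire, England, Britain, UK');
INSERT INTO places VALUES (1, 'Brighton'), (2, 'Burnley'), (3, 'England');
INSERT INTO locPlace VALUES (1, 1), (1, 3), (2, 2), (2, 3);
INSERT INTO photo VALUES (1, 'Pier at dusk', 1), (2, 'Town hall', 2);
INSERT INTO keyword VALUES (1, 'pier'), (2, 'hall');
INSERT INTO phkw VALUES (1, 1), (2, 2);
"""

def build_query(terms):
    """Join one TERM_QUERY per search term with UNION ALL; return (sql, params)."""
    if not terms:
        raise ValueError("at least one search term is required")
    sql = "\nUNION ALL\n".join(TERM_QUERY for _ in terms) + "\nORDER BY 1"
    # Each term is bound twice: once for the keyword branch, once for the place branch.
    params = [value for term in terms for value in (term, term)]
    return sql, params

def run_demo(terms):
    """Run the combined query against the sample data and return all rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SAMPLE_DATA)
    sql, params = build_query(terms)
    rows = db.execute(sql, params).fetchall()
    db.close()
    return rows
```

With the sample data, run_demo(["pier", "England"]) returns photo 1 twice (it matches the keyword "pier" and the place "England") and photo 2 once, which is the duplication the ranking step relies on.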

A recordset is then created in ASP and stepped through. Three strings are constructed during this process: one holds the unique IDs of matching images, a second holds the caption (description + location) corresponding to each ID, and a third holds the score corresponding to each ID (i.e. the number of times it was retrieved).
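
The tallying step might look something like the sketch below. It is written in Python with lists, not the VBScript strings described above; the function name tally and the ", " separator inside the caption are assumptions. It relies on the rows arriving sorted by photo ID, as the ORDER BY guarantees.

```python
def tally(rows):
    """Collapse rows sorted by photo ID into parallel lists of IDs, captions and scores."""
    ids, captions, scores = [], [], []
    for photo_id, description, location in rows:
        if ids and ids[-1] == photo_id:
            # Same image retrieved again: it matched one more search term.
            scores[-1] += 1
        else:
            ids.append(photo_id)
            # Caption is description + location (separator is an assumption).
            captions.append(description + ", " + location)
            scores.append(1)
    return ids, captions, scores
```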

The arrays are then stepped through a number of times, with a variable of decreasing value being tested against the score of each image to determine the order of display. In pseudocode:

FOR I=MAXSCORE DOWNTO 1
  FOR J=1 TO NUMOFIMAGES
    IF SCORE(J)=I
      DISPLAY IMAGE + CAPTION
    ENDIF
  NEXT J
NEXT I

To keep each page manageable, a maximum of 30 images is displayed per page. Where more images exist, the user is offered a "more" link which passes the page number and search string back to the same ASP page via the query string.
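
One possible alternative to rescanning the list once per score value is a stable sort on descending score, which gives the same display order as the nested loops in the pseudocode. The Python sketch below shows that, together with the 30-per-page slicing; the names ranked_order and page_of, and the 1-based page numbering, are assumptions made for illustration.

```python
PAGE_SIZE = 30  # maximum images per page, as stated in the post

def ranked_order(scores):
    """Return image indexes ordered by descending score, ties in original order."""
    # sorted() is stable, so images with equal scores keep their original
    # relative order, matching the FOR I / FOR J loops in the pseudocode.
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def page_of(order, page):
    """Return the indexes shown on a 1-based page and whether a "more" link is needed."""
    start = (page - 1) * PAGE_SIZE
    chunk = order[start:start + PAGE_SIZE]
    has_more = start + PAGE_SIZE < len(order)
    return chunk, has_more
```

For scores [1, 3, 2, 3], ranked_order gives [1, 3, 2, 0]: the two images scoring 3 come first in their original order, then the image scoring 2, then the image scoring 1.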
