Re: c.d.theory glossary (repost)

From: Anne & Lynn Wheeler <lynn_at_garlic.com>
Date: Sat, 15 May 2004 09:14:38 -0600
Message-ID: <ulljt4t9d.fsf_at_mail.comcast.net>


mAsterdam <mAsterdam_at_vrijdag.org> writes:
> Narrowing this down:
>
> The glossary is a list of items that led to mutual misunderstandings
> in the c.d. theory newsgroup. It is built from contributions. The
> newsgroup uses terms from database design, implementation, operation
> and change management, cost sharing, productivity research, indexing
> and cataloging database literature, and/or basic database research.
>
> The glossary's purpose is to limit lengthy misunderstandings. It
> consists of signposts: watch out! You may think the OP means A but
> she might mean B. Alternative names and views of the same concept
> are only introduced when the danger of mutual misunderstandings is
> apparent. When context matters, it is provided. The glossary is a
> highly biased list of problematic concepts.

slight drift ... the nlm (national library of medicine) has books, articles, papers ... there is essentially an online (card?) catalog for the library. umls (unified medical language system) is sort of the structured set of words used for the catalog. it is structured into something of a hierarchy of concepts, terms, and word sequences. however, there is also a mesh of complex many-to-many relationships between concepts. there are tens of thousands of concepts, hundreds of thousands of terms, and millions of word sequences.

this is sort of independent of having any definitions for the concepts, terms, and/or words. if you have a set of words that you might want to use to look for an article ... umls gives other related words, terms, and/or concepts that might also be used to search for articles.
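
as a rough illustration of that kind of lookup ... the following is a minimal sketch, not the actual umls data or interface. the concept ids, the terms, and the names TERM_TO_CONCEPT, RELATED and expand() are all made up for the example; the only assumption taken from the description above is that terms map to concepts and that concepts have many-to-many links to other concepts.

```python
# toy stand-in for the structure described above; every entry is invented
# for illustration and none of it is taken from the real umls.
TERM_TO_CONCEPT = {
    "heart attack": "C_MI",
    "myocardial infarction": "C_MI",
    "mi": "C_MI",
    "chest pain": "C_ANGINA",
    "angina": "C_ANGINA",
}

# many-to-many "related" links between concepts (hypothetical)
RELATED = {
    "C_MI": {"C_ANGINA"},
    "C_ANGINA": {"C_MI"},
}


def expand(term):
    """Return other terms that share the concept of `term` or a related concept."""
    key = term.lower()
    concept = TERM_TO_CONCEPT.get(key)
    if concept is None:
        return set()  # unknown term: nothing to suggest
    concepts = {concept} | RELATED.get(concept, set())
    return {t for t, c in TERM_TO_CONCEPT.items() if c in concepts} - {key}
```

note that nothing here needs a definition of any concept ... the suggestions come purely from the term-to-concept mapping and the links between concepts.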

it is also used by the people cataloging the library ... lots of the listed terms and word sequences have preferred relationships, i.e. if an article abstract contains a certain set of terms and/or word sequences, there are guidelines about the preferred terms to be used for indexing/cataloging. this structure of preferred/nonpreferred relationships can also be used by people looking up entries in the catalog.
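
the preferred/nonpreferred relationship can be sketched as a simple mapping from variant wording to the preferred indexing term. this is only an illustration under assumptions ... the PREFERRED table, its entries, and the index_terms() helper are hypothetical, and the real cataloging guidelines are certainly more involved than a word-boundary match.

```python
import re

# hypothetical variant -> preferred indexing term table (invented entries)
PREFERRED = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
}


def index_terms(abstract):
    """Return the sorted preferred terms for every variant found in the abstract."""
    text = abstract.lower()
    found = set()
    for variant, preferred in PREFERRED.items():
        # match whole words only, so "mi" does not fire on "family"
        if re.search(r"\b" + re.escape(variant) + r"\b", text):
            found.add(preferred)
    return sorted(found)
```

the same table works in the other direction for somebody searching the catalog: map whatever wording they typed to the preferred term before doing the lookup.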

at this level, umls is effectively the structure used for understanding the cataloging of the articles (as opposed to understanding the articles themselves).

there was some statement that nlm reached the state of many current search engines possibly by the late '70s. a boolean term search would be quite bimodal: at six to seven terms there could still be a hundred thousand hits ... but adding one more term dropped the number of hits to zero. the holy grail was finding the magic combination of five to eight terms that resulted in 50-100 hits. in the early 80s, an online interface (grateful med) was developed that by default didn't ask for the hits but just the number of hits. then a 2-3 day task might be to discover the magic query combination that resulted in a reasonable hit result (say greater than zero but less than several hundred).
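
the count-only style of searching can be sketched roughly as follows. this is a toy model under assumptions, not how grateful med or the nlm catalog was implemented ... the in-memory inverted index and the names count_hits() and find_queries() are hypothetical, and the brute-force loop over combinations just stands in for the manual trial and error described above.

```python
from itertools import combinations


def count_hits(index, terms):
    """Number of documents containing every term (boolean AND), count only."""
    postings = [index.get(t, set()) for t in terms]
    if not postings:
        return 0
    return len(set.intersection(*postings))


def find_queries(index, candidates, size, lo=1, hi=100):
    """Try every `size`-term combination; keep those with lo <= hits <= hi."""
    usable = []
    for combo in combinations(candidates, size):
        hits = count_hits(index, combo)
        if lo <= hits <= hi:
            usable.append((combo, hits))
    return usable


# tiny invented index: term -> set of document ids
index = {"a": {1, 2, 3, 4}, "b": {2, 3, 4}, "c": {3}, "d": {9}}
```

with this index, the query ("a", "b") still counts 3 documents, while adding "d" drops the count straight to zero ... a small-scale version of the bimodal behavior. asking only for counts keeps each probe cheap, which is presumably why it was the default.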

-- 
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
Received on Sat May 15 2004 - 17:14:38 CEST
