Re: Searching Google n-gram corpus

From: Bob Stearns <rstearns1241_at_charter.net>
Date: Sun, 16 Sep 2007 01:45:10 -0400
Message-ID: <6q3Hi.116$4z1.9_at_newsfe02.lga>


Shield wrote:

> compressing the data would take a lot of time, time taken away from
> the actual experiment.
>
> what would be the fastest way to use the dataset, using the same
> conditions of searching for occurrences?
>
>
>
The payoff on the compression & reindexing is less than 100 straight searches on the original corpus. If your experiment is that small (or even 1000 searches), just do them the brute-force way.

Received on Sun Sep 16 2007 - 07:45:10 CEST
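
For anyone wanting to check the break-even point against their own timings, here is a minimal sketch of the arithmetic. The function name and its three cost parameters are hypothetical and are not measurements of the Google n-gram corpus; plug in your own numbers, in whatever time unit you measured.

```python
import math

def break_even_searches(build_cost, scan_cost, indexed_cost):
    """Smallest number of searches at which compressing/reindexing wins.

    build_cost   -- one-off cost of compressing and reindexing (assumed)
    scan_cost    -- cost of one brute-force search on the raw corpus (assumed)
    indexed_cost -- cost of one search on the reindexed data (assumed)
    """
    if indexed_cost >= scan_cost:
        return None  # the index never pays for itself
    # Indexing wins once build_cost + n * indexed_cost < n * scan_cost,
    # i.e. n > build_cost / (scan_cost - indexed_cost).
    return math.floor(build_cost / (scan_cost - indexed_cost)) + 1
```

If the number of searches you plan is below the returned value, brute force on the original corpus is the cheaper route.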
