Re: Logical and physical structure of time series (in ORACLE)

From: Peter Wieland <wieland_at_scn.de>
Date: 1997/02/25
Message-ID: <3313172F.3E27@scn.de>#1/1

Tim Spellman wrote:
>
> In article <32FF343A.4B36_at_scn.de>, Peter Wieland <wieland_at_scn.de> writes:
> > I try to find a good solution for a logical and physical database
> > design for the following problem in ORACLE:
> >
> > I get every minute a huge amount of pairs like (s;v) into my database,
> > with
> >
> > s: Signal-ID (integer)
> > v: value for this signal (float)
> >
> > that means a time series arrives at my database:
> >
> > t1: (s1;v1) (s2;v2) (s3;v3) ... (s4000;v4000)
> > t2: (s1;v1) (s2;v2) (s3;v3) ... (s4000;v4000)

> One way to handle compressing time series data that makes SELECT statements
> easier to write is to have a pair of time columns, one for the begin time,
> and one for the end time. That is, the first time is the time the value
> became valid, and the end time is the time that value expired. Then your
> table becomes:
>
> id pk
> begin_time pk
> end_time pk
> value
>
> The PK on end_time is to make SELECTs fast.
>
> > However, I usually have the following select:

> > Select *
> > from "gigant_table"
> > where ID = 123
> > and time > "12.3.97"
> > and time < "15.3.97";
>
> This becomes:
>
> Select *
> from gigant_table
> where ID = 123
> and begin_time > "12.3.97"
> and end_time < "15.3.97";
>

Tim,

thanks for your response, but I assume you will get the following problem:

assume time sequence t1 < t2 < t3 < t4 < t5

You store

id: 123
begin_time: t1
end_time: t5
value: 456

You form a select with:

         Select *
         from gigant_table
         where ID = 123
         and begin_time > t2
         and end_time < t4

oops, the id 123 is not selected even if the value 456 WAS valid at t3 !!!

The way out: introduce a "maximum time interval" where the data is not stored, i.e. a value is stored at least each x (say 60) minutes ...

Than you can do:

         Select *
         from gigant_table
         where ID = 123
         and begin_time > t3 - x
         and end_time < t3 + x

sort the result and jump into the interval of need. But than this "jump" is executed by the extract application and if you do this, you can also assume that each value is valid until a new value comes into the system, i.e. you don't have to store the "end_time".

Am I wrong?

Peter

        Peter Wieland -- http://www.iai.fzk.de/~wieland/
 o                                        o
 O\/    mailto:wieland_at_scn.de      o===L_//   Home:+47/22357757

_L/__, Remember: Telemark is it! (_) (_) Job:+47/22633343

Received on Tue Feb 25 1997 - 00:00:00 CST