Re: Looking for "gaps" (from a time perspective) in data

From: jeremy <jeremy0505_at_gmail.com>
Date: Fri, 10 Oct 2008 06:43:09 -0700 (PDT)
Message-ID: <62673eda-fc5d-479b-a962-cd57cf587054@u75g2000hsf.googlegroups.com>


On Oct 10, 12:19 pm, Robert Klemme <shortcut..._at_googlemail.com> wrote:
> On 10.10.2008 11:53, jeremy wrote:
>
>
>
> > This is on 10gR2 standard edition.
>
> > We have a table defined like this:
>
> > txn_id
> > date_start
> > date_end
> > url_called
>
> > which contains a record for every time our application processes a
> > request from a browser. We would like to be able to spot any "gaps" in
> > the data - which would indicate a period when the service was
> > unavailable for whatever reason.
>
> > To do this we need to do a comparison between the date_start of each
> > row and the one that preceded (or followed)  it and then see if the
> > period of time is greater than a to-be-determined threshold.
>
> > Can a single SQL statement be constructed which could achieve this? if
> > so could you point me in the right direction? For those who haven't
> > followed what I was asking about, if you are in the majority then
> > please accept my apologies for not being more concise :)
>
> Yes, this should be possible with analytic SQL.  You define a sliding
> window of one row preceding ordered by date_start and take the
> difference of date_start - min(date_start).
>
> An alternative might be to define the window based on a range of dates
> preceding (i.e. use you threshold) and emit all rows where the count is
> 1 (i.e. there is no other record within the threshold time).
>

Ended up with a query like this:

select *
from (
  select txn_id, web_site_id, username, date_start,

         lag(date_start) over (order by txn_id) prev_date_start,
         lead(date_start) over (order by txn_id) next_date_start,
         url

  from system_activity_log
  where date_start >= sysdate-5
  and ip_address is not null
  order by 1) tabview
where date_start - prev_date_start > 1/1440 and to_number(to_char(date_start,'hh24')) between 6 and 22;

Thanks for your comments.

--
jeremy
Received on Fri Oct 10 2008 - 08:43:09 CDT

Original text of this message