Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: SORTING PROBLEM WHEN NUMBERS ARE MIXED WITH LETTERS

Re: SORTING PROBLEM WHEN NUMBERS ARE MIXED WITH LETTERS

From: Martin Haltmayer <Martin.Haltmayer_at_d2mail.de>
Date: Thu, 27 Dec 2001 23:35:41 +0100
Message-ID: <3C2BA23D.214539CF@d2mail.de>


Hi Rob,

no, this description either does not fully qualify for a complete ordering, i. e., an order description where you always can compare either two possible strings.

How should the following list be ordered?

111aaa3
111aaa22
111aaa111

According to rule 2. we would receive

111aaa111
111aaa22
111aaa3

Next set to be ordered:

bbb111aaa3
bbb111aaa22
bbb111aaa111

According to rule 3 it should be ordered

bbb111aaa111
bbb111aaa22
bbb111aaa3

I guess in these two cases this is not your desired result but

111aaa111
111aaa22
111aaa3

and

bbb111aaa111
bbb111aaa22
bbb111aaa3

respectively.

I suppose you want to do an ordering that is done the following way:

(a) Each string is separated into several order fields. An order field is a consecutive sequence of digits or non-digits. (b) If an order field is a digit-order field then it is to be sorted numerically. Otherwise it is to be sorted ASCII-wise. (c) If you have to order two rows that do not match in the type of order fields (this can only happen if you have to compare a number and a text) you insert a dummy number before the text. This dummy number is either MINVALUE (then leading text will be prefixed with a dummy MINVALUE which results in text sorted before numbers, violation of your requirement 1.) or MAXVALUE which results in pure text ordered after the same text plus a number (violation of 3.). So you have the choice between pestilence and cholera. You could change this to "use dummy MAXVALUE if it is the beginning of the row and MINVALUE else". This would solve your dilemma but it is too much for this newsgroup. You may contact me by e-mail.

Regards,

Martin

Rob Earl wrote:
>
> Hi Martin,
>
> You answer is very very close to what I need. However, maybe my
> description was a little off when I wrote that e-mail. I'll try to
> re-outline the sorting rules so that maybe a slight change to your sql
> statement will take care of it:
>
> 1. All data beginning with a number should be first in the sorting
> order and should be in numeric order, not ASCII order. (this is
> already accomplished quite nicely by your sql statement)
>
> 2. If the data first contains numbers, and then letters, it should be
> sorted in numeric order and then in ASCII order (this is also already
> accomplished nicely by your sql statement).
>
> 3. All data beginning with a letter should be after the data that
> begins with a number. This data should be sorted in ASCII order.
> HOWEVER, if this data has letters followed by numbers, then it should
> first be sorted in ASCII and then numeric order. (this is where I
> think my explaination got screwed up before). For example:
> ab
> ab1
> ab27
> abb
> abc
> abc1
> abc2
> abcd1
>
> I guess the way I was showing the sort order before was not entirely
> correct....sorry about that. Basically what this boils down to is
> that numbers need to be sorted in numeric order and letters in ASCII
> order. When ASCII sorting, I'm guessing that capitals will come
> first. Here's a bigger example:
>
> 1
> 2
> 5
> 10
> 100
> 254
> 254a
> 254b
> 254reject
> ab
> ab1
> ab2
> abc
> abcc
> abcc2457
> abcd1
>
> Does that make more sense? Thanks for your help so far. Hopefully
> this is an easy change your to your sql statement. I will try to
> figure it out myself, but if you can help in any way I would
> appreciate it. Again, sorry for any confusion.
>
> Thanks,
> Rob
>
> Martin Haltmayer <Martin.Haltmayer_at_d2mail.de> wrote in message news:<3C239F67.8036792E_at_d2mail.de>...
> > Hi Rob,
> >
> > according to your sorting rules you should receive
> >
> > 60
> > 61
> > 62
> > 63
> > 136b
> > 137
> > 138
> > 220
> > 238
> > 239
> > TESTRUN2
> > abc2
> > abc10
> > TESTRUN
> > TestRun
> > ab
> >
> > Remember your last rule: "If the letter lines have numbers, they would proceed
> > any lines that have ONLY letters."
> >
> > The following sql statement achieves this sorting:
> >
> > define maxlen = 40
> > set linesize 10000
> > set pagesize 0
> > set trimspool on
> > -- column sortpos1 noprint
> > -- column sortpos2 noprint
> > -- column sortpos3 noprint
> > select name
> > -- If the name starts with a digit, they are ordered first.
> > , decode (
> > sign ((ascii (substr (name, 1, 1)) - (ascii ('0') - 1)) * (ascii (substr
> > (name, 1, 1)) - (ascii ('9') + 1)))
> > , - 1, 0 -- first letter is a digit
> > , 1 -- first letter is not a digit
> > ) as sortpos1
> > , decode (
> > sign ((ascii (substr (name, 1, 1)) - (ascii ('0') - 1)) * (ascii (substr
> > (name, 1, 1)) - (ascii ('9') + 1)))
> > , - 1, to_number (translate (
> > utl_raw.cast_to_varchar2 ( -- conversion back to normal string
> > utl_raw.bit_xor ( -- masking out
> > utl_raw.cast_to_raw (translate (name, '0123456789', lpad (chr (0), 10, chr
> > (0)))) -- our mask
> > , utl_raw.cast_to_raw (name) -- the string to be masked
> > )
> > )
> > , ' ' || chr (0) -- our result of masking
> > , ' ' -- our result of masking
> > ))
> > , to_number (null)
> > ) as sortpos2
> > , decode (
> > translate (
> > utl_raw.cast_to_varchar2 ( -- conversion back to normal string
> > utl_raw.bit_xor ( -- masking out
> > utl_raw.cast_to_raw (translate (name, '0123456789', lpad (chr (0), 10, chr
> > (0)))) -- our mask
> > , utl_raw.cast_to_raw (name) -- the string to be masked
> > )
> > )
> > , 'x' || chr (0), 'x'
> > )
> > , null, 1 -- name does not contain digits. These come last.
> > , 0 -- name contains digits. These come first
> > ) as sortpos3
> > , to_number (translate (
> > utl_raw.cast_to_varchar2 ( -- conversion back to normal string
> > utl_raw.bit_xor ( -- masking out
> > utl_raw.cast_to_raw (translate (name, '0123456789', lpad (chr (0), 10, chr
> > (0)))) -- our mask
> > , utl_raw.cast_to_raw (name) -- the string to be masked
> > )
> > )
> > , ' ' || chr (0) -- our result of masking
> > , ' ' -- our result of masking
> > )) as sortpos4
> > from table1
> > order by sortpos1, sortpos2, sortpos3, sortpos4
> > /
> >
> > maxlen is the maximum length name can be.
> >
> > Regards,
> >
> > Martin
> >
> >
> >
> > Rob Earl wrote:
> > >
> > > Recently I have been presented with a problem of figuring out how to
> > > display information from a table which has a VARCHAR2 field that can
> > > contain numbers as well as alphanumeric characters. When just doing a
> > > simply ORDER BY the fields is sorted in an ASCII way in which numbers
> > > are out of numeric order and are in order of there ASCII
> > > representation. I understand why the problem is occurring, but I'm
> > > stuck trying to find an easy way in SQL to sort these values properly.
> > > To make things a little tricker, there can be a mixture of
> > > alphanumeric characters as well as numbers. Here is some example data
> > > as seen with just a plain ORDER BY sort:
> > >
> > > 136b
> > > 137
> > > 138
> > > 220
> > > 238
> > > 239
> > > 60
> > > 61
> > > 62
> > > 63
> > > TESTRUN
> > > TESTRUN2
> > > TestRun
> > > ab
> > > abc10
> > > abc2
> > >
> > > Through searching the USENET I did manage to come up with an SQL
> > > statement that comes really really close to solving the problem:
> > >
> > > select NAME, nvl( trim (trailing from translate( lpad(NAME,40,' '),
> > > '1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
> > > '1234567890' ) ), NAME) new_a
> > > from table1
> > > order by new_a;
> > >
> > > That statement returns the following:
> > >
> > > 60
> > > 61
> > > 62
> > > 63
> > > 137
> > > 138
> > > 220
> > > 238
> > > 239
> > > 136b
> > > abc2
> > > abc10
> > > TESTRUN2
> > > TESTRUN
> > > TestRun
> > > ab
> > >
> > > This was really close, but what the statement works best with is if
> > > the field only contains JUST numbers or JUST letters. If there is a
> > > mixture of both it kind of gets screwed up. The line out of place is
> > > the 136b. It really should be right before the 137 line.
> > >
> > > This seems like it should be a real common problem, considering this
> > > type of data must exist in many systems. Does anyone out there have
> > > any idea how I could possibly get my data to sort properly? When I
> > > say properly, I mean that it would be sorted like the following:
> > >
> > > 60
> > > 61
> > > 62
> > > 63
> > > 136b
> > > 137
> > > 138
> > > 220
> > > 238
> > > 239
> > > abc2
> > > abc10
> > > TESTRUN2
> > > TESTRUN
> > > TestRun
> > > ab
> > >
> > > In a nutshell, this means that to be sorted properly all lines
> > > starting with a number would come first, followed by lines starting
> > > with letters. If the letter lines have numbers, they would proceed
> > > any lines that have ONLY letters. This seems like a fairly normal
> > > sort order to me.
> > >
> > > Thanks in advance for any help. I appreciate it greatly.
> > >
> > > Rob
Received on Thu Dec 27 2001 - 16:35:41 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US