Home » SQL & PL/SQL » SQL & PL/SQL » Help with parsing a full name field (Oracle 10g)
Help with parsing a full name field [message #408556] |
Tue, 16 June 2009 18:05  |
RiverX
Messages: 9 Registered: March 2008
|
Junior Member |
|
|
Hi all,
I was wondering if anyone could help me with the parsing a full name field.
I would like to seperate it into lastname, firstname, middle initial, suffix
Here are some inputs for fullname followed by how i would like for them to be parsed.
name | parsed lastname | parsed firstname | parsed middle initial | parsed suffix |
----------------------------------------------------------------------------------------
PUBLIC, JOHN | PUBLIC | JOHN | NULL | NULL
PUBLIC, CHUN CH KIM | PUBLIC | CHUN CH KIM | NULL | NULL
PUBLIC, MARY L | PUBLIC | MARY | L | NULL
PUBLIC, FRED J JR | PUBLIC | FRED | J | JR
PUBLIC, SUE ELLEN J SR | PUBLIC | SUE ELLEN | J | SR
I have a list of all the suffix values that one is able to enter i.e.
JR, SR, I,II,III,IV,V,VI
I've gotten to a point where i split up the lastname and the rest of the name
but i can't quite figure out how to do the rest.
Here's what i currently have
select id,
name,
substr(name,1, instr(name,',')-1) as lname,
substr(name,(instr(name,',')+1),length(name)) as rest_of_the_name
from my_table
where status='A';
Thanks for any help
[Updated on: Tue, 16 June 2009 23:07] by Moderator Report message to a moderator
|
|
|
|
Re: Help with parsing a full name field [message #408726 is a reply to message #408556] |
Wed, 17 June 2009 07:24   |
 |
Michel Cadot
Messages: 68761 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
SQL> with
2 data as (
3 select 'PUBLIC, JOHN' val from dual
4 union all
5 select 'PUBLIC, CHUN CH KIM' from dual
6 union all
7 select 'PUBLIC, CHUN CH C' from dual
8 union all
9 select 'PUBLIC, MARY L' from dual
10 union all
11 select 'PUBLIC, FRED J JR' from dual
12 union all
13 select 'PUBLIC, SUE ELLEN J SR' from dual
14 ),
15 step1 as (
16 select val,
17 substr(val, 1, instr(val,',')-1) part1,
18 substr(val, instr(val,',')+2)||' ' part2
19 from data
20 ),
21 step2 as (
22 select val, trim(part1) lastname, part2,
23 instr(part2,' ') i1,
24 instr(part2,' ',-1,3) i2,
25 instr(part2,' ',-1,2) i3,
26 trim(substr(part2,
27 instr(part2,' ',-1,3)+1,
28 instr(part2,' ',-1,2)-instr(part2,' ',-1,3)))
29 part2b,
30 trim(substr(part2, instr(part2,' ',-1,2)+1)) part2c
31 from step1
32 )
33 select val, lastname,
34 trim(substr(part2, 1,
35 case
36 when i3 = 0 then length(part2)-1
37 when i2 = 0
38 and ( part2c in ('JR','SR','I','II','III','IV','V','VI')
39 or length(part2c) = 1 )
40 then i3-1
41 when i2 != 0
42 and part2c in ('JR','SR','I','II','III','IV','V','VI')
43 and length(part2b) = 1
44 then i2-1
45 when i2 != 0
46 and part2c not in ('JR','SR','I','II','III','IV','V','VI')
47 and length(part2c) = 1
48 then i3-1
49 else length(part2)-1
50 end)) firstname,
51 case
52 when i3 = 0 then null -- no initial nor suffix
53 when ( i2 = 0 -- suffix not present
54 and part2c not in ('JR','SR','I','II','III','IV','V','VI')
55 and length(part2c) = 1
56 )
57 then part2c
58 when ( i2 != 0 -- suffix present
59 and part2c in ('JR','SR','I','II','III','IV','V','VI')
60 and length(part2b) = 1
61 )
62 then part2b
63 when ( i2 != 0 -- wrong suffix present
64 and part2c not in ('JR','SR','I','II','III','IV','V','VI')
65 and length(part2c) = 1
66 )
67 then part2c
68 end middle,
69 case
70 when part2c in ('JR','SR','I','II','III','IV','V','VI')
71 then part2c
72 end suffix
73 from step2
74 /
VAL LASTNAME FIRSTNAME MIDDLE SUFFIX
------------------------- ---------------------- ----------------------- ----------------------- -----------------------
PUBLIC, JOHN PUBLIC JOHN
PUBLIC, CHUN CH KIM PUBLIC CHUN CH KIM
PUBLIC, CHUN CH C PUBLIC CHUN CH C
PUBLIC, MARY L PUBLIC MARY L
PUBLIC, FRED J JR PUBLIC FRED J JR
PUBLIC, SUE ELLEN J SR PUBLIC SUE ELLEN J SR
6 rows selected.
Regards
Michel
|
|
|
|
Re: Help with parsing a full name field [message #408981 is a reply to message #408755] |
Thu, 18 June 2009 10:19   |
 |
ebrian
Messages: 2794 Registered: April 2006
|
Senior Member |
|
|
Different spin using regex:
SQL> select name
2 , regexp_substr(name, '[^,]+') lastname
3 , case when sf not in ('JR','SR','I','II','III','IV','V','VI') and lt is null
4 then trim(regexp_substr(name, '[^,]*$'))
5 when sf not in ('JR','SR','I','II','III','IV','V','VI') and lt is not null
6 then regexp_replace(name, '(.*, )(.*)( .$)', '\2')
7 when sf in ('JR','SR','I','II','III','IV','V','VI') and lt is not null
8 then regexp_replace(name, '(.*, )(.*)( [^ ] [^ ]*$)', '\2')
9 when sf in ('JR','SR','I','II','III','IV','V','VI') and lt is null
10 then regexp_replace(name, '(.*, )(.*)( [^ ]*$)', '\2')
11 end firstname
12 , case when sf not in ('JR','SR','I','II','III','IV','V','VI') and lt is not null
13 then regexp_substr(name, '.$')
14 when sf in ('JR','SR','I','II','III','IV','V','VI') and lt is not null
15 then lt
16 end middle
17 , case when sf in ('JR','SR','I','II','III','IV','V','VI')
18 then sf
19 end suffix
20 from (
21 select name
22 , regexp_substr(name, ' .( |$)') lt
23 , regexp_substr(name, '[^ ]+$') sf
24 from full_name
25 );
NAME LASTNAME FIRSTNAME MIDDLE SUFFIX
------------------------- ---------------------- ----------------------- ----------------------- ------------
PUBLIC, FRED J JR PUBLIC FRED J JR
PUBLIC, SUE ELLEN J SR PUBLIC SUE ELLEN J SR
PUBLIC, JIM I I PUBLIC JIM I I
PUBLIC, MARY L PUBLIC MARY L
PUBLIC, JOHN PUBLIC JOHN
PUBLIC, CHUN CH KIM PUBLIC CHUN CH KIM
PUBLIC, CHUN CH C PUBLIC CHUN CH C
PUBLIC, BOB SR PUBLIC BOB SR
8 rows selected.
|
|
|
|
|
|
Re: Help with parsing a full name field [message #409053 is a reply to message #408982] |
Fri, 19 June 2009 01:45   |
 |
Michel Cadot
Messages: 68761 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
Here's (about) the query I posted.
The solution is based on splitting the line into words, characterizing each word then regrouping the words into each type and finally line.
SQL> with
2 raw_data as (
3 select 'PUBLIC, JOHN' val from dual
4 union all
5 select 'PUBLIC, CHUN CH KIM' from dual
6 union all
7 select 'PUBLIC, CHUN CH C' from dual
8 union all
9 select 'PUBLIC, MARY L' from dual
10 union all
11 select 'PUBLIC, FRED J JR' from dual
12 union all
13 select 'PUBLIC, SUE ELLEN J SR' from dual
14 union all
15 select 'PUBLIC, BOB SR' from dual
16 ),
17 data as ( -- add space and number the rows
18 select ' '||val||' ' val, rownum rn
19 from raw_data
20 ),
21 lines as ( select level line from dual connect by level < 10 ),
22 splitted as ( -- split rows into words
23 select rn, val, line word_no,
24 count(*) over (partition by rn) nb_word,
25 substr(val,
26 instr(val,' ',1,line)+1,
27 instr(val,' ',1,line+1)-instr(val,' ',1,line)-1
28 - decode(line,1,1,0)
29 ) word
30 from data, lines
31 where line < length(val)-length(replace(val,' ',''))
32 ),
33 typed as ( -- type and number each word
34 select rn, val, nb_word, word,
35 case
36 when word_no = 1 then 'LastName' -- last name
37 when word_no = nb_word -- Suffix
38 and word in ('JR','SR','I','II','III','IV','V','VI')
39 then 'Suffix'
40 when word_no = nb_word -- Initial
41 and length(word) = 1
42 then 'Initial'
43 when word_no = nb_word-1 -- Other initial case
44 and lead(word) over(partition by rn order by word_no)
45 in ('JR','SR','I','II','III','IV','V','VI')
46 and length(word) = 1
47 then 'Initial'
48 else 'FirstName'
49 end type,
50 case
51 when word_no = 1 then 1 -- LastName
52 when word_no = nb_word -- Suffix
53 and word in ('JR','SR','I','II','III','IV','V','VI')
54 then 1
55 when word_no = nb_word -- Initial
56 and length(word) = 1
57 then 1
58 when word_no = nb_word-1 -- Other initial case
59 and lead(word) over(partition by rn order by word_no)
60 in ('JR','SR','I','II','III','IV','V','VI')
61 and length(word) = 1
62 then 1
63 else word_no-1
64 end new_word_no
65 from splitted
66 ),
67 regrouped as ( -- concatenate each word per type
68 select rn, val, type, sys_connect_by_path(word,' ') value
69 from typed
70 connect by prior rn = rn and prior type = type and prior new_word_no = new_word_no-1
71 start with new_word_no = 1
72 )
73 select val,
74 trim(max(decode(type,'LastName',value))) lastname,
75 trim(max(decode(type,'FirstName',value))) firstname,
76 trim(max(decode(type,'Initial',value))) middle,
77 trim(max(decode(type,'Suffix',value))) Suffix
78 from regrouped
79 group by rn, val
80 order by rn
81 /
VAL LASTNAME FIRSTNAME MIDDLE SUFFIX
------------------------ -------- ------------ ------ ------
PUBLIC, JOHN PUBLIC JOHN
PUBLIC, CHUN CH KIM PUBLIC CHUN CH KIM
PUBLIC, CHUN CH C PUBLIC CHUN CH C
PUBLIC, MARY L PUBLIC MARY L
PUBLIC, FRED J JR PUBLIC FRED J JR
PUBLIC, SUE ELLEN J SR PUBLIC SUE ELLEN J SR
PUBLIC, BOB SR PUBLIC BOB SR
7 rows selected.
Regards
Michel
|
|
|
|
|
|
Re: Help with parsing a full name field [message #409103 is a reply to message #409101] |
Fri, 19 June 2009 07:17   |
Frank
Messages: 7901 Registered: March 2000
|
Senior Member |
|
|
Michel Cadot wrote on Fri, 19 June 2009 14:14 | Frank wrote on Fri, 19 June 2009 09:03 | How can you ever tell whether "I" is a middle initial or a suffix?
Formatting free-format fields afterwards always result in ambiguous results.
|
This has realy been asked by c_stenersen in the very first answer.
For myself, I give priority to suffix in the queries I posted (in short, test in suffix list, before test length=1). Of course, this is totally arbitrary but my purpose was to show how to build a solution (several of the subqueries in my replies can be merged for a shorter code but intentionally left to show the steps).
Regards
Michel
|
Ah, but I was not referring to your code. Just more in general pointing to the trouble with these approaches.
|
|
|
Re: Help with parsing a full name field [message #409134 is a reply to message #409103] |
Fri, 19 June 2009 10:40  |
 |
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
I seem to recall about 10 years ago, someone wrote a name parsing routine for oracle. It was supposed to be "smart" somehow. But... my memory fails me as to how to find it. Anyone else remember this?
Kevin
|
|
|
Goto Forum:
Current Time: Sun Jul 13 00:47:29 CDT 2025
|