SQLLDR question
Hi,
I have a file that contains protein sequences; I will show the
format below. It has about 2 million records in it. Can someone
suggest a sqlldr control file that would load the data I need?
I don't want all the data from this file, only some of it.
I tried to figure it out on my own but couldn't.
First, look at an example of this file. It looks complex but is very simple.
...............................
There are two things in this file for each record.
One is the header: a line that starts with ">".
Two is the sequence: the lines after the header, up to the next header
(letters in caps).
The next record again starts with ">", and so on.
I am interested in picking only two fields out of each record:
1. GI number: the number between ">gi|" and "|emb|", e.g. 2695851
(for the first record).
2. Sequence: for example, lines 2, 3, and 4 of the first record:
MGILTA......NLEKL
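To make the extraction rule above concrete, here is a rough Python sketch of the parsing I have in mind. It is not a sqlldr solution, only an illustration of the logic. The function name parse_records, the regular expression, and the sample lines are made up for illustration; I am assuming every header of interest begins with ">gi|" followed by digits and "|", and that the sequence lines should be joined with no separator.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed header shape, based on the description above: ">gi|<digits>|..."
GI_PATTERN = re.compile(r"^>gi\|(\d+)\|")


def parse_records(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (gi_number, sequence) pairs from header/sequence lines."""
    gi = None    # GI number of the record being read, if any
    parts = []   # sequence lines collected for that record
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if gi is not None:
                yield gi, "".join(parts)
            match = GI_PATTERN.match(line)
            # Headers without a GI number are skipped (an assumption).
            gi = int(match.group(1)) if match else None
            parts = []
        elif gi is not None:
            parts.append(line)
    # Emit the last record in the input.
    if gi is not None:
        yield gi, "".join(parts)


if __name__ == "__main__":
    # Made-up sample lines, not taken from the real file.
    sample = [
        ">gi|2695851|emb|HYPOTHETICAL|made-up description",
        "MGILTA",
        "NLEKL",
    ]
    for gi_number, sequence in parse_records(sample):
        print(gi_number, sequence)
```

Each pair produced this way corresponds to one row: the first value is the GI number and the second is the full sequence with the line breaks removed.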
Can anyone write a sqlldr control file that extracts this information and
loads it into the following table:
create table sequences
(
gi_number NUMBER NOT NULL,
sequence CLOB NOT NULL
);
Please note that the second column is a CLOB, because a sequence can sometimes exceed 4000 characters, so it won't fit into a VARCHAR2 column.
I would appreciate any help, and I'm sorry if this is not the right place to post such a message. If it isn't, please let me know where I can post it.
Thank you very much,
Cheers,
Dina
Received on Tue Nov 23 2004 - 14:35:22 CST