Oracle FAQ Your Portal to the Oracle Knowledge Grid

Re: SQL Loader question

From: Darren Brock <brock_at_governet.net>
Date: Fri, 01 Oct 1999 13:23:01 -0600
Message-ID: <37F50A15.6DE9DDB3@governet.net>


If you are dealing with several documents in this format, I would consider parsing them into a more manageable format first. We just did something similar and used Perl to parse the documents into a delimited format that was easy to load.
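A minimal sketch of that kind of preprocessing, written in Python rather than the Perl we actually used, might look like the following. It assumes each record carries one ACCESSION line, that the sequence sits between an ORIGIN line and the closing "//", and that a pipe is a safe delimiter. The function names and the "|" delimiter are illustrative choices, not taken from our script.

```python
import re
import sys


def parse_records(text):
    """Yield (accession, sequence) pairs from GenBank-style flat-file text.

    Assumptions: keywords appear at the start of a line (leading
    whitespace is ignored), and "//" terminates each record.
    """
    accession = None
    seq_parts = []
    in_origin = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "//":
            # End of record: emit it only if an accession was seen.
            if accession is not None:
                yield accession, "".join(seq_parts)
            accession, seq_parts, in_origin = None, [], False
        elif in_origin:
            # Sequence lines look like "61 aggccagagc caacagcccc ...";
            # keep only the letters, dropping position numbers and spaces.
            seq_parts.append(re.sub(r"[^A-Za-z]", "", stripped))
        elif stripped.startswith("ACCESSION"):
            fields = stripped.split()
            accession = fields[1] if len(fields) > 1 else None
        elif stripped.startswith("ORIGIN"):
            in_origin = True


def to_delimited(records, delimiter="|"):
    """Return one 'accession|sequence' line per record."""
    return [f"{acc}{delimiter}{seq}" for acc, seq in records]


if __name__ == "__main__":
    # Hypothetical usage: python genbank_to_delimited.py input1.txt input2.txt
    for path in sys.argv[1:]:
        with open(path) as handle:
            for out_line in to_delimited(parse_records(handle.read())):
                print(out_line)
```

For the record quoted below, this would produce a single line beginning with M73473, followed by the delimiter and the 177-base sequence joined into one string. A file of such lines is a much easier target for a SQL*Loader control file than the original multi-line layout. The sequence column will probably need an explicit length in the control file, since long sequences exceed the default character field size.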

Darren

Fusheng LI wrote:
>
> Hi there, I need your help urgently. Below is a gene sequence (one record).
> My question is: can I use SQL*Loader to load the text after ACCESSION
> (in this record, 'M73473') and the text after ORIGIN (multi-line) as two
> fields in my Oracle table? I tried writing a control file to load the
> record into the database, but it failed. Any suggestions?
> Thanks,
>
> LOCUS EKEGAG 177 bp DNA VRL 11-APR-1996
> DEFINITION HIV-1 individual EKE from Congo, gag gene, p6 region.
> ACCESSION M73473
> NID g327340
> KEYWORDS gag protein.
> SOURCE Human immunodeficiency virus type 1 DNA.
> ORGANISM Human immunodeficiency virus type 1
> Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
> Retroviridae; Lentivirinae.
> REFERENCE 1 (bases 1 to 177)
> AUTHORS Candotti, D., Chappey, C., Rosenheim, M., M'Pele, P., Huraux,
> J.M., Agut, H.
> TITLE High variability of the gag/pol transframe region among HIV-1
> isolates
> JOURNAL C. R. Acad. Sci. III, Sci. Vie 317 (2), 183-189 (1994)
> MEDLINE 95086896.
> REFERENCE 2 (bases 1 to 177)
> AUTHORS Candotti, D., Jung, M., Kerouedan, D., Rosenheim, M.,
> Gentilini, M., M'Pele, P., Huraux, J.M., Agut, H.
> TITLE Genetic variability affects the detection of HIV by polymerase
> chain reaction.
> JOURNAL AIDS (8): 1003-1007 (1991)
> MEDLINE 92134646.
> COMMENT While this gag p6 region sequence clustered with subtype G in
> phylogenetic analysis, the env V3 region from the same patient
> clustered with subtype E (accession AF082313). This could be
> the first EG recombinant identified to date (July 1999) or
> it could be from a dual infection. A third possibility is
> that it represents a non-recombinant genome, similar to that
> from which the AE(CM240) circulating recombinant form is
> postulated to have arrisen. More data and complete genome
> sequencing would be required to help identify which case
> applies.
>
> In phylogenetic analysis at the HIV-DB, EKE-gag-p6 clusters
> with the AGI-recombinant Z321 (accession U76035), and the
> EKE-env-V3 branched off the CRF AE(CM240) clade slightly
> closer to the M-group root than isolates from the Central
> African Repulic do.
> FEATURES Location/Qualifiers
> source 1..177
> /organism="Human immunodeficiency virus type 1"
> /proviral
> CDS 1..177
> /partial
> /gene="gag"
> /codon_start=1
> /db_xref="PID:g327341"
>
> /translation="NFLGKIWPSNKGRPGNFLQNRPEPTAPPAESFETKEEITSSQKQ
> DPRDKELYPLTSLRS"
> BASE COUNT 55 a 45 c 43 g 34 t
> ORIGIN
> 1 aattttttag ggaaaatttg gccttccaac aaggggaggc cagggaattt tcttcagaac
> 61 aggccagagc caacagcccc acccgcagag agcttcgaga cgaaggagga gataacctcc
> 121 tctcagaagc aggatccgag ggacaaggaa ttatatccct taacttccct cagatca
> //
Received on Fri Oct 01 1999 - 14:23:01 CDT
