Import of large CSV datasets in Java

From: Srinivas <ps_chowdary_at_hotmail.com>
Date: 11 Aug 2003 09:09:35 -0700
Message-ID: <68dbad24.0308110809.fab32ed_at_posting.google.com>



Like many other enterprise applications, ours needs to export and import large volumes of data in CSV format. As many people in the newsgroups have suggested, we could use native tools such as SQL*Loader for performance reasons.

But our application uses an internally generated sequence number as the PK, so we do not have the luxury of using these tools directly.

For example

Student table
student_id PK Number(38,0)
student_tag UK Varchar2(40)
..

Course table
course_id PK Number(38,0)
course_tag UK Varchar2(4)
..

Student_Course table
student_id
course_id
..

If we need to import the student table, the CSV file may contain only student_tag and the other info, so we need to generate student_id ourselves (either using seq_student.nextval, or by maintaining the sequence number internally and refreshing the count at the end of the import).
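
A minimal sketch of the second option (an internally maintained counter), assuming the first free id is read from the database once before the import starts. IdAllocator and its method names are made up for illustration and are not part of any library:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: hands out student_id values in memory during an import.
class IdAllocator {
    private final AtomicLong next;

    // firstFreeId is assumed to come from the database, read once before the import.
    IdAllocator(long firstFreeId) {
        this.next = new AtomicLong(firstFreeId);
    }

    // Returns the next unused id; safe to call from several import threads.
    long nextId() {
        return next.getAndIncrement();
    }

    // Highest id handed out so far; the database sequence has to be moved past
    // this value when the import finishes.
    long lastIssued() {
        return next.get() - 1;
    }
}
```

This only works if nothing else draws from seq_student while the import is running; otherwise the ids handed out in memory could collide with ids issued by the database.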

If we need to import the student_course table, we need to look up the student_id and course_id corresponding to the student_tag and course_tag given in the student_course CSV file. This is where the performance problem lies.

Looking up the ids for the tags takes a long time when we process the data in batches of 100 or fewer. One idea would be to first fill in all the IDs in a tmp_student_course.csv file (a copy of student_course.csv, but with the IDs in place of the TAGs) and then import it using SQL*Loader.
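
A rough sketch of that idea in Java, assuming the tag-to-id pairs for students and courses fit in memory and have been loaded into two maps up front (for example with one query per table), and assuming the CSV has student_tag and course_tag as its first two columns. TagResolver and resolve are hypothetical names, and the comma split is deliberately naive:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;

// Hypothetical helper: replaces tags with ids using maps loaded once up front,
// so that no per-row database lookup is needed.
class TagResolver {

    // Reads "student_tag,course_tag,<rest>" lines and writes
    // "student_id,course_id,<rest>" lines. Rows with an unknown tag are skipped.
    // Returns the number of rows written.
    static int resolve(Reader in, Writer out,
                       Map<String, Long> studentIds,
                       Map<String, Long> courseIds) {
        try {
            BufferedReader reader = new BufferedReader(in);
            BufferedWriter writer = new BufferedWriter(out);
            int written = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                // Naive split: assumes no quoted fields and no embedded commas.
                String[] fields = line.split(",", 3);
                if (fields.length < 2) {
                    continue;
                }
                Long studentId = studentIds.get(fields[0].trim());
                Long courseId = courseIds.get(fields[1].trim());
                if (studentId == null || courseId == null) {
                    continue; // unknown tag; a real import would log or reject it
                }
                writer.write(studentId + "," + courseId);
                if (fields.length == 3) {
                    writer.write("," + fields[2]);
                }
                writer.write('\n');
                written++;
            }
            writer.flush();
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The output file would then contain only ids and could be handed to SQL*Loader as it is. Whether this is fast enough depends on how many students and courses there are, since both maps have to be held in memory.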

Can anyone suggest alternative design ideas or third-party APIs to accomplish this? Any pointers will be greatly appreciated.

thanks,
Srinivas
