Home » SQL & PL/SQL » SQL & PL/SQL » Merge data to a table based on criteria (Oracle 11)
Merge data to a table based on criteria [message #624167] |
Thu, 18 September 2014 12:04 |
|
Member2014
Messages: 33 Registered: July 2014 Location: USA
|
Member |
|
|
I have a requirement to update data in a table based on the below rules.
Can anyone help me as to how this can be done in a single SQL statement?
Thanks in advance.
Table T1: Has PK on ID column
ID Name Salary Data
-------------------------
1 Test1 1000 201401
2 Test2 2000 201501
3 Test3 3000 201301
4 Test4 4000
10 Test10 500 200101
Table T2: Has no PK's and will have a subset of ID's from table T1 + Few new records
Seq_no Name ID Salary Data
---------------------------------------
100 Test11 1 2000 201612
101 Test11 1 2000 201401
102 Test22 2 1050 201501
103 Test22 2 1050 200108
104 Test44 4 3500 200002
105 Test55 5 5500 201205
106 Test66 6 1200 201201
107 Test10 10 700
The requirement is the merger the data from table T2 into T1
- For a given ID pick the record with max(seq_no)
- Update all data from the max record into the table T1 using ID to match
- But the data column should be picked as the highest value for a given ID in the T2 table, irrespective of the seq_no value
- If a null value comes in the data column in T2, value in T1 should be retained
So the outout should look like:
ID Name Salary Data
-------------------------
1 Test11 2000 201612 - Data picked from lower deq_num record
2 Test22 1050 201501 - Data picked from lower deq_num record
3 Test33 3000 201301
4 Test44 3500 200002 - Data updated with new record
5 Test55 5500 201205
6 Test66 1200 201201
10 Test10 700 200101 - T2 has null but data is retained as in T1
|
|
|
|
Re: Merge data to a table based on criteria [message #624171 is a reply to message #624168] |
Thu, 18 September 2014 12:42 |
|
Member2014
Messages: 33 Registered: July 2014 Location: USA
|
Member |
|
|
Apologies for not following up on the earlier posts. I will practice in the future for the time others spend to help me.
Could you elaborate on what I can add as a test case, in addition to the examples I have provided above?
I have tried a two step process:
1. Merge data from higher seq_num into T1
merge into t1 o
using (select rowid as rid,
id as i_id,
name as i_name,
salary as i_salary,
data as i_date
from t2
where rowid in (select max(rowid) from t2 group by indiv_id))
on (o.id = i_id)
WHEN MATCHED THEN
UPDATE
SET o.name = i_name,
o.salary = i_salary,
o.data = i_date
When not matched then
insert (id, name, salary, data)
values (i_id, i_name, i_salary, i_data);
2. Get the highest value for each ID and update the T1 table:
MERGE INTO T1 i
USING (
select distinct o1.id as o_id,
o1.data as o_data
from T2 o1 join T2 o2
on (o1.id = o2.id
and nvl(o1.data,'0') > nvl(o2.data,'0'))
) o
ON (i.indiv_id = o_id)
WHEN MATCHED THEN
UPDATE SET
i.data = nvl(o_data,i.data);
Both T1 and T2 tables are very huge. This is taking a long time.
Also, I am a beginner and wanted to learn if there is an easier way to do this.
Thanks in advance.
|
|
|
|
|
|
Re: Merge data to a table based on criteria [message #624178 is a reply to message #624176] |
Thu, 18 September 2014 13:31 |
|
Member2014
Messages: 33 Registered: July 2014 Location: USA
|
Member |
|
|
I have mentioned the below details in the earlier post:
- Table structures
- Test data in tables
- Problem definition/ explanation of what the requirement is
- Examples of expected result
- Queries I have tried so far.
Can you please provide me an example of what a test case looks like? I looked at a few other posts and am not sure what I am missing.
Cos our solution does not have a frontend application. Its more like DW.
[Updated on: Thu, 18 September 2014 13:33] Report message to a moderator
|
|
|
|
Re: Merge data to a table based on criteria [message #624186 is a reply to message #624182] |
Thu, 18 September 2014 13:57 |
|
Member2014
Messages: 33 Registered: July 2014 Location: USA
|
Member |
|
|
Thanks for pointing me to what I was missing:
T1 Create script:
create table T1
(id number,
name varchar2(100),
salary number,
data varchar2(6) );
ALTER TABLE T1 ADD PRIMARY KEY (id);
insert into T1 values(1 , 'Test1' , 1000, '201401');
insert into T1 values(2 , 'Test2' , 2000, '201501');
insert into T1 values(3 , 'Test3' , 3000, '201301');
insert into T1 values(4 , 'Test4' , 4000, NULL);
insert into T1 values(10, 'Test10', 500 , '200101');
T2 create script:
create table T2
(seq_no number,
id number,
name varchar2(100),
salary number,
data varchar2(6));
insert into T2 values (100, 1 ,'Test11', 2000, '201612');
insert into T2 values (101, 1 ,'Test11', 2000, '201401');
insert into T2 values (102, 2 ,'Test22', 1050, '201501');
insert into T2 values (103, 2 ,'Test22', 1050, '200108');
insert into T2 values (104, 4 ,'Test44', 3500, '200002');
insert into T2 values (105, 5 ,'Test55', 5500, '201205');
insert into T2 values (106, 6 ,'Test66', 1200, '201201');
insert into T2 values (107, 10,'Test10', 700 , NULL);
The requirement is the merger the data from table T2 into T1
- For a given ID pick the record with max(seq_no)
- Update all data from the max record into the table T1 using ID to match
- But the data column should be picked as the highest value for a given ID in the T2 table, irrespective of the seq_no value
- If a null value comes in the data column in T2, value in T1 should be retained
Expected result:
ID Name Salary Data
-------------------------
1 Test11 2000 201612 - Data picked from lower seq_no record but with highest "data" value for the ID 1
2 Test22 1050 201501 - Data picked from lower seq_no record but with highest "data" value for the ID 1
3 Test33 3000 201301
4 Test44 3500 200002 - Data updated with new record
5 Test55 5500 201205
6 Test66 1200 201201
10 Test10 700 200101 - T2 has null but data is retained as in T1
|
|
|
Re: Merge data to a table based on criteria [message #624340 is a reply to message #624186] |
Mon, 22 September 2014 03:51 |
|
Maaher
Messages: 7065 Registered: December 2001
|
Senior Member |
|
|
Well, you're looking for a way to get the data of table T2 records related to the highest "DATA" (unfortunate column name, by the way) column value within each group with the same ID. One approach is to RANK the records. Something like this:
Select id
, seq_no
, salary
, name
, data
From ( select id
, seq_no
, salary
, name
, data
, dense_rank() over ( partition by id order by data desc) dr
from t2
) Now, if you add a where condition limiting the output to the first rank (DR=1) you can use that query as a data source for your merge operation. I do have some questions though.
I don't understand the "max(seq_no)" part. Do you mean you want the column values of T2 records with the highest seq_no EXCEPT for the DATA column, where you want the highest DATA value instead of the value of the record with the highest seq_no? If so, my example won't do help you out immediately.
MHE
|
|
|
Re: Merge data to a table based on criteria [message #624345 is a reply to message #624186] |
Mon, 22 September 2014 05:16 |
Lalit Kumar B
Messages: 3174 Registered: May 2013 Location: World Wide on the Web
|
Senior Member |
|
|
By any chance, is the "DATA" column actually a DATE with 'YYYYMM' format? And you want to fetch the rows with MAX DATE? If yes, then the data type needs to be first changed from VARCHAR2 TO DATE. then use MAX to find the most recent date for an ID.
I have a test case as an example, definitely it can be improvised, it is just an example based on my understanding :
SQL> DROP TABLE T1 PURGE;
Table dropped.
SQL>
SQL> CREATE TABLE T1
2 (id NUMBER, name VARCHAR2(100), salary NUMBER, dt DATE
3 );
Table created.
SQL> ALTER TABLE T1 ADD PRIMARY KEY
2 (
3 id
4 )
5 ;
Table altered.
SQL>
SQL> INSERT INTO T1 VALUES
2 (1 , 'Test1' , 1000, TO_DATE('201401','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T1 VALUES
2 (2 , 'Test2' , 2000, TO_DATE('201501','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T1 VALUES
2 (3 , 'Test3' , 3000, TO_DATE('201301','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T1 VALUES
2 (4 , 'Test4' , 4000, NULL
3 );
1 row created.
SQL> INSERT INTO T1 VALUES
2 (10, 'Test10', 500 , TO_DATE('200101','YYYYMM')
3 );
1 row created.
SQL>
SQL> DROP TABLE T2 PURGE;
Table dropped.
SQL>
SQL> CREATE TABLE T2
2 (
3 seq_no NUMBER,
4 id NUMBER,
5 name VARCHAR2(100),
6 salary NUMBER,
7 dt DATE
8 );
Table created.
SQL>
SQL> INSERT INTO T2 VALUES
2 (100, 1 ,'Test11', 2000, TO_DATE('201612','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (101, 1 ,'Test11', 2000, TO_DATE('201401','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (102, 2 ,'Test22', 1050, TO_DATE('201501','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (103, 2 ,'Test22', 1050, TO_DATE('200108','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (104, 4 ,'Test44', 3500, TO_DATE('200002','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (105, 5 ,'Test55', 5500, TO_DATE('201205','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (106, 6 ,'Test66', 1200, TO_DATE('201201','YYYYMM')
3 );
1 row created.
SQL> INSERT INTO T2 VALUES
2 (107, 10,'Test10', 700 , NULL
3 );
1 row created.
SQL>
I would fetch the rows from table T2, which are of most recent date for an ID.
SQL> COLUMN NAME FORMAT A10;
SQL>
SQL> SELECT val id,
2 name,
3 salary,
4 TO_CHAR(dt, 'YYYYMM') max_dt
5 FROM
6 (SELECT ID val ,
7 NAME,
8 SALARY,
9 DT
10 FROM T2
11 JOIN
12 (SELECT ID val, MAX(DT) MAX_DT FROM T2 GROUP BY ID
13 ) A
14 ON (t2.dt = a.max_dt)
15 )
16 ORDER BY id
17 /
ID NAME SALARY MAX_DT
---------- ---------- ---------- ------
1 Test11 2000 201612
2 Test22 1050 201501
4 Test44 3500 200002
5 Test55 5500 201205
6 Test66 1200 201201
SQL>
I seriously don't know how do you have ID 3 and 10 in your desired output. For ID 3, there are no values in table T2, why should it be considered in the output? and ID 10 has a NULL value in the dt column, so even it should be ideally considered in the output. You need to further explain the rules to include these 2 records.
Regards,
Lalit
|
|
|
Re: Merge data to a table based on criteria [message #624432 is a reply to message #624340] |
Mon, 22 September 2014 13:23 |
|
Member2014
Messages: 33 Registered: July 2014 Location: USA
|
Member |
|
|
Thanks MHE. Sry about the DATA column name. I just named it that way thinking it would be easy to identify the exception column name. Will do better next time
And Yes, I need to pick the column values of T2 records with the highest seq_no EXCEPT for the DATA column. The value for the DATA column will be the highest value of all the records for a given ID.
As you mentioned the RANK function can be used to get data fir with the highest seq_no, but not for the the DATA column.
Thanks Lalith - The value in the DATA column is actually YYYYMM component of the date. But it can also just be YYYY. Hence its defined as a character field in the DB.
The output table I have above is the final data in T1 after data is merged:
- ID 3 is retained as is in the table
- ID 10 has NULL in T2 but value in DATA in T1. So when we merge, the value in T1 should be retained.
We want to overwrite values in T1 only with not null values from T2. Does this make sense?
I will try out your query now. Thanks!
[Updated on: Mon, 22 September 2014 13:29] Report message to a moderator
|
|
|
|
Re: Merge data to a table based on criteria [message #625726 is a reply to message #624167] |
Mon, 13 October 2014 11:51 |
|
Member2014
Messages: 33 Registered: July 2014 Location: USA
|
Member |
|
|
Thanks Lalit. This was a good read, but we decided to keep it as a character field.
I wrote this query from the query you provided. I basically applied the same concept:
SELECT name,
id ,
salary ,
data
FROM (SELECT o.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq_num DESC NULLS LAST) RN
FROM T2 o)
WHERE RN = 1;
It worked and we are in UAT now. Thanks for all your help. It has been great joining the forum, and am learning a lot.
|
|
|
Goto Forum:
Current Time: Thu Apr 18 01:03:21 CDT 2024
|