Re: Updating data in table depending on data in same table

From: Ed Prochak <edprochak_at_gmail.com>
Date: 8 Feb 2007 12:53:20 -0800
Message-ID: <1170968000.472522.119900@a34g2000cwb.googlegroups.com>

On Feb 8, 2:14 pm, "The J Man" <jonathan.kee..._at_gmail.com> wrote:
> Hey everyone,
>
> First of all apologies for posting this in the oracle.server section I
> meant to post it in here first.
>
> I'm still a relative newb so go easy please!
>
> I've inherited a system from someone that is in a bit of a mess and I
> need to tidy it up. It's an Oracle 10g database.
>
> The part i have to clean up basically has two tables.
> Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT,
> LANGUAGES.
> Table B attributes are: SOURCEID, LANGID, ROLE, STATUS.
>
> Now the way this system is supposed to work is that a row is entered
> into table A and X amount of languages are added to the LANGUAGES
> field. This field is then split up into a row for each language into
> table B with the corresponding SOURCEID.
> e.g.
> Table A:
> 123 -- Homer -- lazy -- HR -- French;Spanish;German
>
> Table B:
> 123 -- French -- manager -- approved
> 123 -- Spanish -- admin -- pending
> 123 -- German -- teaboy -- proposed

it is not clear how B gets populated from A. Table A doesn't seems like a staging table as I first thought. There is obviously more going on in the process. So is it manual or automated? What are the rules?
>
> This system had been working fine until a new person started and
> instead of just adding a new language to the original row in table A
> she added 1000+ rows that are identical to those already in the table
> except for the fact the language is Japanese. So now the table looks
> like
> Table A:
> 123 -- Homer -- lazy -- HR -- French;Spanish;German
> 456 -- Homer -- lazy -- HR -- Japanese;
>
> Table B:
> 123 -- French -- manager -- approved
> 123 -- Spanish -- admin -- pending
> 123 -- German -- teaboy -- proposed
> 456 -- Japanese -- janitor -- approved
>
> I have been given the unenviable task of fixing this

The first error this shows is that using an ID column as the PK buys you ZERO data integrity.
The second error this shows is tha application fails to find the related data in table A.
yes you have a real mess.

So go to your backup from the night before and start over. reload those two tables and tell the entry person to do it right this time (it's actually not her fault, but she still has to reenter any data from that day.) You do do backups don't you? (By now this solution is too late, you let too much other stuff get inserted. time for plan B)

> ... by, identifying
> all terms that have identical attributes (except for the LANGUAGES
> attribute),

But the only common attribute is the source ID between table B and table A.

> ... changing the SOURCEID of the duplicate in table B to the
> master term SOURCEID (so in the example above, in table B change the
> row 456 -- Japanese -- janitor -- approved to 123 -- Japanese --
> janitor -- approved), tagging on Japanese onto the master term (in the
> example above, in table A change the row 123 -- Homer -- lazy -- HR --
> French;Spanish;German to 123 -- Homer -- lazy -- HR --
> French;Spanish;German;Japanese;) and then deleting the duplicate entry
> in table A (in the example above, delete the row 456 -- Homer -- lazy
> -- HR -- Japanese;)
>
> Thanks for sticking with me this far (if you have!). Luckily enough I
> know for a fact that all the master terms are in the SOURCEID range 1
> to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so

so this was either a bulk load, or manual data entry over a period of time.
in either case the process is flawed.

> identifying the terms is not a problem. I just ran an intersect query
> and found my ~1000 culprit terms. As you probably know though, from
> this I am not getting the SOURCEIDS of either row so basically i have
> been getting the SOURCEIDS of the identified terms and running the
> following command manually.
>
> update target set target.SOURCEID=MASTER_NUMBER where
> target.SOURCEID=DUPLICATE_NUMBER;
> update source set
> source.LANGSREQUIRED='French;Spanish;German;Italian;Brazilian
> Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French
> Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER;
> DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER;

Where's table A and table B??? which is which??

>
> As you can tell my wrists are about to fall off with the RSI of
> continually trying to find out the IDs and then run these commands.
> SQL is not really my game so i am not sure whether i would be able to
> use some procedure or cursor or something along those lines... I have
> already wasted an awful lot of time trying to read of some way to help
> me by running different commands but i am very quickly running out of
> time...
>
> Any help at all would be appreciated or even just a big fat no to say
> you're gonna have to put up with it and just do it manually.
>
> Thank you reading all this and I hope that I have explained it clearly
> enough,
> Jonathan.

Solutions:
1. A PL/SQL procedure would be the fastest IF you knew PL/SQL.

Given you feel less than familiar with SQL, might I ask how you got this assignment? No one else there knows Oracle?

2. If you can do it manually, then why not spool out the list of new/ old ids, and languages (ie just a SELECT query) and edit the file to make the UPDATE statements you need.

You do know how to use an editor like VI with macro commands like global substitute, right?
(If you say no to that, then I pity you since you clearly work for an incompetent manager who assigns people to tasks they cannot pereform.)

long term, either change this application or get another job.

ed Received on Thu Feb 08 2007 - 14:53:20 CST