RE: Change database character set

From: Scott Canaan <srcdco_at_rit.edu>
Date: Fri, 24 Oct 2014 20:08:23 +0000
Message-id: <3653B77DCF6DDA4ABAD1E9416F1876D7443F446B_at_ex03mail01.ad.rit.edu>



Actually, the report lists the character sets at the end. Unfortunately, in my case, it lists 2 different character sets for the “bad” 9 rows:
CHARACTER SET                            NUMBER       PERCENTAGE

------------------------------ ---------------- ----------------
UNKNOWN 350400 99.997% WE8ISO8859P1 7 0.002% WE8MSWIN1252 2 0.001%
------------------------------ ---------------- ----------------
Total 350409 100.000%

[Language detection result]

LANGUAGE                                 NUMBER       PERCENTAGE

------------------------------ ---------------- ----------------
UNKNOWN 350400 99.997% ENGLISH 9 0.003%
------------------------------ ---------------- ----------------
Total 350409 100.000%

Which tells me that the data was entered on 2 different PCs with different character sets. I’m going to have to insist that they re-enter this data once the conversion (and application upgrade) is complete.

Scott Canaan ’88 (srcdco_at_rit.edu<mailto:srcdco_at_rit.edu>)

(585) 475-7886 – work                (585) 339-8659 – cell
“Life is like a sewer, what you get out of it depends on what you put into it.” – Tom Lehrer

From: Kenny Payton [mailto:k3nnyp_at_gmail.com] Sent: Friday, October 24, 2014 11:13 AM To: Scott Canaan
Cc: Niall Litchfield; oracle.unknowns_at_gmail.com; Chris Taylor; oracle-l_at_freelists.org Subject: Re: Change database character set

There is an option to csscan that will populate some columns in one of the tables, I believe it's added to csm$errors. If I recall correctly the column names are something like CSIDLED1, ...2, ....3. It will provide up to 3 educated guesses as to which character set the data might have came from.

LCSDDATA=LOSSY LCSD=Y You can also look at SYS_OP_CSCONV to convert the data.

    $update=sprintf("update %s.%s set %s=SYS_OP_CSCONV(%s,'%s','%s' ) where rowid = ?", $err->own, $err->tab, $err->col, $err->col, $tocs, $cs); We had millions of rows with embedded characters, such as the fancy quotes from word documents, in varchar2 fields and had to convert them manually.

On Fri, Oct 24, 2014 at 10:26 AM, Scott Canaan <srcdco_at_rit.edu<mailto:srcdco_at_rit.edu>> wrote: What I ended up doing was to create a blackout in OEM, then kill the dbsnmp connections and truncate the reg$ table. That took care of that error. The remaining 9 rows in the application data remain. I’ve been working with Oracle support and they suggested checking against WE8MSWIN1252. I did that and got the same 9 rows, so I still don’t know what characterset the data was entered with and I don’t know how to find out. I’ve asked the person that entered the data if she still has the files and can re-enter it after the conversion and upgrade. I hope so, that will simplify things a lot.

Scott Canaan ’88 (srcdco_at_rit.edu<mailto:srcdco_at_rit.edu>)

(585) 475-7886<tel:%28585%29%20475-7886> – work                (585) 339-8659<tel:%28585%29%20339-8659> – cell
“Life is like a sewer, what you get out of it depends on what you put into it.” – Tom Lehrer

From: Niall Litchfield [mailto:niall.litchfield_at_gmail.com<mailto:niall.litchfield_at_gmail.com>] Sent: Friday, October 24, 2014 8:49 AM
To: Kenny Payton
Cc: oracle.unknowns_at_gmail.com<mailto:oracle.unknowns_at_gmail.com>; Scott Canaan; Chris Taylor; oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org>

Subject: Re: Change database character set

Now I'm at my desk
Lossy or Convertible Conversion Data In SYS.REG$ Table (Doc ID 1490394.1) is the note I meant.

sys.reg$ table data is related to AQ or the EM agent. This table should not contain any data if there is no connections to the database.

Ideally there should not be any sessions in the database while running csscan as it can cause problems when you do the convert if rows are deleted.

On Fri, Oct 24, 2014 at 1:32 PM, Kenny Payton <k3nnyp_at_gmail.com<mailto:k3nnyp_at_gmail.com>> wrote:

I don't believe that worked for us.
On Oct 24, 2014 8:30 AM, "Niall Litchfield" <niall.litchfield_at_gmail.com<mailto:niall.litchfield_at_gmail.com>> wrote:

You can verify the Agent issue by stopping it (or dbconsole) and re running On 24 Oct 2014 13:10, "Kenny Payton" <k3nnyp_at_gmail.com<mailto:k3nnyp_at_gmail.com>> wrote: It sounds like you're getting off easy. As for the sys.reg$ row we had a similar row and it was related to the emagent. We probably could have got away with just deleting it but what we did was create a table in a user schema with the row in it, export the user table, delete the row from both tables and then reload the row after the csalter. In hindsight this probably wasn't necessary but worked.

You could probably do the same with the clob data as well.

On Fri, Oct 24, 2014 at 8:04 AM, Scott Canaan <srcdco_at_rit.edu<mailto:srcdco_at_rit.edu>> wrote: Actually, I’ve been considering doing something similar to this. The 9 rows in that table (out of about 20) that are affected were added in August and the person that added them still has the original documents, so they could be reloaded after the conversion. I believe that what is in the HUGECLOB in that table is just a word document that was cut and pasted into the field.

The only row that really concerns me is the one in SYS.REG$.

Scott Canaan ’88 (srcdco_at_rit.edu<mailto:srcdco_at_rit.edu>)

(585) 475-7886<tel:%28585%29%20475-7886> – work                (585) 339-8659<tel:%28585%29%20339-8659> – cell
“Life is like a sewer, what you get out of it depends on what you put into it.” – Tom Lehrer

From: Chris Taylor [mailto:christopherdtaylor1994_at_gmail.com<mailto:christopherdtaylor1994_at_gmail.com>] Sent: Friday, October 24, 2014 8:01 AM
To: Niall Litchfield
Cc: Scott Canaan; oracle.unknowns_at_gmail.com<mailto:oracle.unknowns_at_gmail.com>; oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: Re: Change database character set

Based on Niall's analysis, another option could be to export the SCRIPT_GROUP table, truncate it, do the conversion and reload it (depending on the size).

Chris

On Fri, Oct 24, 2014 at 5:02 AM, Niall Litchfield <niall.litchfield_at_gmail.com<mailto:niall.litchfield_at_gmail.com>> wrote: Scott

I'm pretty sure you'll find the session_key row is transient and due to an emagent (either grid/cloud control or dbconsole) - there's a note on csscan dictionary errors that will tell you. That leaves you with 9 documents (they look like word docs/emails). Can you grab those from the app and reenter the data post conversion?

--

Niall Litchfield
Oracle DBA
http://www.orawin.info

--

http://www.freelists.org/webpage/oracle-l Received on Fri Oct 24 2014 - 22:08:23 CEST

Original text of this message