RE: How to select only columns having values..

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 18 Feb 2014 12:27:22 -0500
Message-ID: <025e01cf2cce$afc03c50$0f40b4f0$_at_rsiz.com>



"But why do any of that?"  

I infer that the time to be saved is some person manipulating the data.  

I'll bet another dozen donuts that the OP meant "about" when writing "above."  

My solution probably violates the "simple" requirement. I picture the user hand copying this data into a spreadsheet.  

The larger point is that we increasingly find delays in both fetching and transmitting extraneous columns to client machines. Pre-empting that on the server will often be worth the cost.  

mwf  

From: Ric Van Dyke [mailto:ric.van.dyke_at_hotsos.com] Sent: Tuesday, February 18, 2014 8:57 AM To: Mark W. Farnham; mark.powell2_at_hp.com; oracle-l_at_freelists.org Subject: RE: How to select only columns having values..  

Roger that Mark.  

But why do any of that? Why not just read the entire row, really the whole table, into some local structure in the app and parse thru looking for what you need? Now you hit the DB once and done. No dynamic SQL no hacking thru the DD, and one parse/execute/fetch(s) cycle. Even using a PL/SQL block that processes each row one at a time would likely take a couple of seconds at worse.  

I agree that this part of the code shouldn't take minutes of time for what appears to be a small amount of data. Of course "above 1000 records" is vague enough to mean just about any number of rows, 1001? 10,000,000?  

We don't know the grander picture of all this so it's really hard to say what is exactly the right solution. I just don't see the need for such complex SQL at this point.  

From: Mark W. Farnham [mailto:mwf_at_rsiz.com] Sent: Tuesday, February 18, 2014 8:05 AM To: Ric Van Dyke; mark.powell2_at_hp.com; oracle-l_at_freelists.org Subject: RE: How to select only columns having values..    

The OP wrote:

"If I get the data like this, It will save few minutes for the data arrangement."

and

"I have a table with 200 columns and having above 1000 Records."

I sincerely hope any of these methods take far less than a few minutes for the computer to execute (an eye blink, right?), so I infer that the time to be saved is some person manipulating the data.

If I grok'd the attachment, it relies on some non-obvious conversion of null to 'N' and it delivers column name value pairs for each column of a returned row that is not equal to 'N'.

For me, at least, the rows under the minimum number of column headings would be easier to digest.

Even so, the attachment operated by scanning each returned row and parsing it to discover columns not equal to N and then, for each such column encountered, tossing the column name and value into the output.

That is only going to read the table once, but it parses 200 columns for each returned row interpretively.

My suggestion reads the entire table twice, but only parses the 200 columns once to generate the new sql statement delivering the column sparse results as a set.

For 3 rows and 10 columns of 1000 rows by 200 columns, either way should not take very long. Probably the person who crafted the attachment knew it was intended to operate on a small data set, so zero of this is intended as criticism in that person's direction.

A functional difference is only non-N valued row-column pairs are delivered by the attachment, while my suggestion would deliver all the columns for each selected row if any of the selected rows have some non-NULL value for that column. I'm not sure which the OP would prefer.

Given larger data sets, where we cared about the data processing performance rather than the "data arrangement," I'd bet a dozen donuts that reading the predicated result set for all the columns in the table once and the selected columns a second time would significantly outperform row by row column by column parsing.

The double pass against the database does require some two statement read consistency to be bulletproof. In recent versions I suppose an as of clause referencing the time of the first pass would be better than opening a transaction.

mwf      

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Ric Van Dyke
Sent: Monday, February 17, 2014 2:11 PM
To: mwf_at_rsiz.com; mark.powell2_at_hp.com; oracle-l_at_freelists.org Subject: RE: How to select only columns having values..  

Cool as that sounds Mark, I doubt it would work any "better" (as in faster) which is the posters original concern. This is basically just a different way to do the same thing his block of code already does. The problem in both cases is the creation of dynamic SQL which is never going to be faster than just running a statement.  

The base of this issue is some horrible design decisions. No matter how much "bling" you put on this, it's still a terrible way to store the data. Which in turn makes it very difficult to retrieve. Maybe it was done to make the insert of the date "easier", never a good idea. We generally insert once and query a qua-zillion times.  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mark W. Farnham
Sent: Monday, February 17, 2014 1:33 PM
To: mark.powell2_at_hp.com; oracle-l_at_freelists.org Subject: RE: How to select only columns having values..  

While I'm not taking the time to do it, and I'm not claiming this is efficient (in other words it is possibly saving analyst time at the expense of computer time), I *think*

you could generate a dynamic select clause using the <table> with the particular where clause joined and pivoted on column_name <c> from user_tab_columns for the <table> having

sum(decode(nvl(<c>,0),<c>,1,0,0)) "<c>" for each column <c> having > 0,

plucking the rows from the result set as the column names to create the desired select statement with the original <table> and predicate.  

This, of course, would only work on types for which the nvl function is valid. For any <c> in your table for which nvl is not valid you would have to use some other means to determine whether any of the rows for your current predicate have non-null values.  

The text of your generated queries for a 200 column table is going to be pretty long.  

There is probably a reasonable way to code this up in PL/SQL other than using the pivot, but if you're citing an arbitrary where clause and no data monitoring, I agree there is no way to specify which columns might have only nulls without running the query. On top of this, understand that since this is a two stage process you would need to construct this internal to a transaction boundary lest some column changes from status from having some non-nulls to only containing nulls between the first and second queries.  

I suppose you could reduce the number of candidate columns and invariantly include any columns in the query for which non-null is a column attribute.  

Getting the syntax all correct and doing this double dynamic select statement generation seems pretty tricky to me, but I *think* you could do it.  

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Powell, Mark
Sent: Monday, February 17, 2014 10:23 AM To: oracle-l_at_freelists.org
Subject: RE: How to select only columns having values..  

I also agree with Ric. Select the entire row and let the application front-end deal with displaying the data. Stored PL/SQL could be used between the front-end and the database, but it is going to be real interesting when the situation arises that one of the rows that is returned does not have data in all the same columns as the other rows returned in the target set. I expect that this situation is bound to happen sooner or later.    

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Maaz Anjum
Sent: Monday, February 17, 2014 8:52 AM
To: ric.van.dyke_at_hotsos.com
Cc: rajakannan_at_aetins.com; Chitale, Hemant K; oracle-l_at_freelists.org Subject: Re: How to select only columns having values..  

I agree with Ric.  

I assume you need these populated rows at execution time. This is just a thought, but if statistics are up-to-date on the table then perhaps checking the user_tab_columns for the num_nulls column compared against the total number of rows (user_tables.num_rows) might help you. The point being, you query meta data that is less costly than the actual table. I might be wrong, but it's just an idea.  

Either way, it doesn't seem you can avoid a complicated SQL statement.  

Cheers,

Maaz  

On Mon, Feb 17, 2014 at 8:40 AM, Ric Van Dyke <ric.van.dyke_at_hotsos.com> wrote:

Offhand this certainly has the sound of some very bad data modeling but that likely is beyond your control. Is this some sort of "generic table"? If so that will just lead to problems and you're only just starting to have them. Simply put, this table breaks just about every rule of relational theory and hence is not going to perform well no matter what you do.  

It seems to me the "best" thing would be just select the whole row(s) and then have the application dissect the row(s) and pull out where there is data and where there isn't. Creating the dynamic SQL as you do works, but will always be slow. Getting the entire row and then parsing thru it to find data is likely going to be faster.      

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Raja Kannan Sengoden
Sent: Monday, February 17, 2014 2:36 AM
To: Chitale, Hemant K

Cc: oracle-l_at_freelists.org
Subject: RE: How to select only columns having values..  

Thank a lot Hemant K Chitale.  

As per the real scenario, all the columns will have the value.  

But for the particular condition, only limited columns will have value.  

At this time, I don't want to do analyze the data and maintain.  

Hence I requested, is there any simple way..  

Attached query helped me to get the expected result.  

Thanks to Tony who helped me to get the expected result.  

Thanks & Regards,

Raja  

From: Chitale, Hemant K [mailto:Hemant-K.Chitale_at_sc.com] Sent: Monday, February 17, 2014 3:12 PM
To: Raja Kannan Sengoden
Cc: oracle-l_at_freelists.org
Subject: RE: How to select only columns having values..  

> If I don't know exactly, Is there any way to select only those 10 columns?

Without actually querying the table, you cannot identify the 10 columns.  

You could periodically query the table for all the rows and identify columns with NOT NULL values and then maintain a list of such columns as "meta data". However, there is no guarantee that a column that had NULL values for all 1000 rows yesterday or even a minute ago still has NULL values because there might have been an INSERT or UPDATE that set one row's value to a non-NULL.  

How do you know that the 190 columns have NULL values ?  

Hemant K Chitale    

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Raja Kannan Sengoden
Sent: Monday, February 17, 2014 12:38 PM To: oracle-l_at_freelists.org
Subject: How to select only columns having values..  

Dear Experts,  

I have a table with 200 columns and having above 1000 Records.  

But for a particular where clause, there is only 3 records fetched.  

As the table having 200 columns, but the particular 3 record having only 10 columns with some value, and 190 columns having null value.  

If I am sure, which columns I want or which column having value, then I can select only those columns.  

If I don't know exactly, Is there any way to select only those 10 columns?  

If I get the data like this, It will save few minutes for the data arrangement.    

Thanks in advance.  

Raja.

This email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please delete all copies and notify the sender immediately. You may wish to refer to the incorporation details of Standard Chartered PLC, Standard Chartered Bank and their subsidiaries at https://www.sc.com/en/incorporation-details.html.  

-- 
A life yet to be lived... 



--
http://www.freelists.org/webpage/oracle-l
Received on Tue Feb 18 2014 - 18:27:22 CET

Original text of this message