Feed aggregator

Two Adaptive Plans Join Methods Examples

Bobby Durrett's DBA Blog - Wed, 2014-04-02 14:49

Here is a zip of two examples I built as I’m learning about the new adaptive plans features of Oracle 12c: zip

The first example has the optimizer underestimate the number of rows and the adaptive plans feature switches the plan on the fly from nested loops to hash join.

In the second example the optimizer overestimates the number of rows and the adaptive plans feature switches the plan from merge join to nested loops.

I ran the same scripts on 12c and 11.2.0.3 for comparison.

Example 1 11g:

Plan hash value: 2697562628

------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |      1 |        |      1 |00:00:00.01 |      18 |
|   1 |  SORT AGGREGATE               |      |      1 |      1 |      1 |00:00:00.01 |      18 |
|   2 |   NESTED LOOPS                |      |      1 |        |      8 |00:00:00.01 |      18 |
|   3 |    NESTED LOOPS               |      |      1 |      1 |      8 |00:00:00.01 |      17 |
|*  4 |     TABLE ACCESS FULL         | T1   |      1 |      1 |      8 |00:00:00.01 |      14 |
|*  5 |     INDEX RANGE SCAN          | T2I  |      8 |      1 |      8 |00:00:00.01 |       3 |
|   6 |    TABLE ACCESS BY INDEX ROWID| T2   |      8 |      1 |      8 |00:00:00.01 |       1 |
------------------------------------------------------------------------------------------------

Example 1 12c:

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:00.01 |       6 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.01 |       6 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |      1 |      8 |00:00:00.01 |       6 |  2168K|  2168K|     1/0/0|
|*  3 |    TABLE ACCESS FULL| T1   |      1 |      1 |      8 |00:00:00.01 |       3 |       |       |          |
|   4 |    TABLE ACCESS FULL| T2   |      1 |      1 |     16 |00:00:00.01 |       3 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Example 2 11g:

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |      1 |        |      1 |00:00:00.01 |      16 |       |       |          |
|   1 |  SORT AGGREGATE               |      |      1 |      1 |      1 |00:00:00.01 |      16 |       |       |          |
|   2 |   MERGE JOIN                  |      |      1 |      4 |      1 |00:00:00.01 |      16 |       |       |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| T2   |      1 |     16 |      2 |00:00:00.01 |       2 |       |       |          |
|   4 |     INDEX FULL SCAN           | T2I  |      1 |     16 |      2 |00:00:00.01 |       1 |       |       |          |
|*  5 |    SORT JOIN                  |      |      2 |      4 |      1 |00:00:00.01 |      14 | 73728 | 73728 |          |
|*  6 |     TABLE ACCESS FULL         | T1   |      1 |      4 |      1 |00:00:00.01 |      14 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

Example 2 12c:

------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |      1 |        |      1 |00:00:00.01 |       5 |
|   1 |  SORT AGGREGATE               |      |      1 |      1 |      1 |00:00:00.01 |       5 |
|   2 |   NESTED LOOPS                |      |      1 |        |      1 |00:00:00.01 |       5 |
|   3 |    NESTED LOOPS               |      |      1 |      4 |      1 |00:00:00.01 |       4 |
|*  4 |     TABLE ACCESS FULL         | T1   |      1 |      4 |      1 |00:00:00.01 |       3 |
|*  5 |     INDEX RANGE SCAN          | T2I  |      1 |        |      1 |00:00:00.01 |       1 |
|   6 |    TABLE ACCESS BY INDEX ROWID| T2   |      1 |      1 |      1 |00:00:00.01 |       1 |
------------------------------------------------------------------------------------------------

The plan output for the 12c examples ends with this note:

Note
-----
   - this is an adaptive plan

So, that tells me it is the adaptive plan feature that is changing the plan despite the wrong estimate of the number of rows.
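If you want to see which operations the optimizer considered and then switched off, 12c adds an ADAPTIVE option to the DBMS_XPLAN format string. Something along these lines (a generic sketch, not taken from the scripts in the zip) displays the full adaptive plan for the last statement run in the session, with the discarded steps marked by a leading minus sign in the Id column:

-- generic sketch: show the full adaptive plan, including the discarded steps
select *
from   table(dbms_xplan.display_cursor(
              sql_id          => null,   -- last statement run in this session
              cursor_child_no => null,
              format          => 'ALLSTATS LAST +ADAPTIVE'));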

- Bobby

 

Categories: DBA Blogs

Easy – Oops.

Jonathan Lewis - Wed, 2014-04-02 12:47

A question came up on OTN today asking for suggestions on how to enforce uniqueness on a pair of columns only when the second column was not null. There’s an easy and obvious solution – but I decided to clone the OP’s example and check that I’d typed my definition up before posting it; and the result came as a bit of a surprise. Here’s a demo script (not using the OP’s table):


create table t1  
(  
	col1	int not null,
	col2	varchar2(1)
);  

create unique index t1_i1 on t1( 
--	case col2 when null then cast(null as int) else col1 end,
--	case when col2 is null then cast(null as int) else col1 end,
	case when col2 is not null then col1 end,
	col2
)
;

insert into t1 values(1,null);
insert into t1 values(1,null);
insert into t1 values(1,'x');
insert into t1 values(1,'y');
insert into t1 values(1,'y');

commit;

column ind1_is   format a5
column ind1_when format 9999

set null N/A

select
	case when col2 is null then cast (null as int) else col1 end	ind1_is,
	case col2 when null then cast (null as int)  else col1 end	ind1_when
from 
	t1
;

The strategy is simple: you create a unique function-based index with two columns; the first column of the index is defined to show the first column of the table if the second column of the table is not null, and the second column of the index is simply the second column of the table. So if the second column of the table is null, both columns in the index are null and there is no entry in the index; but if the second column of the table is not null then the index copies both columns from the table and a uniqueness test applies.

Based on the requirement and definition you would expect the first 4 of my insert statements to succeed and the last one to fail. The index will then have two entries, corresponding to my 3rd and 4th insertions.

I’ve actually shown three ways to use the case statement to produce the first column of the index. The last version is the cleanest, but the first option is the one I first thought of – it’s virtually a literal translation of the original requirement. The trouble is, with my first definition the index acquired an entry it should not have got, and the second insert raised a “duplicate key” error; the error didn’t appear when I switched the syntax of the case statement to the second version.

That’s why the closing query of the demo is there – when you run it the two values reported should be the same as each other for all four rows in the table – but they’re not. This is what I got on 11.2.0.4:


IND1_IS IND1_WHEN
------- ---------
N/A             1
N/A             1
      1         1
      1         1


I’m glad I did a quick test before suggesting my original answer.

Anyone who has other versions of Oracle available is welcome to repeat the test and report back which versions they find working correctly (or not).

Update

It’s not a bug (see note 2 below from Jason Bucata), it’s expected behaviour.
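The short version, for anyone who doesn’t want to chase the comments: a simple case expression (case col2 when null then …) compares col2 to NULL with equality, and an equality comparison with NULL is never true, so that branch can never fire and the expression always returns col1 – even for rows where col2 is null, which is why those rows still produced index entries. The searched form (case when col2 is null then …) uses an IS NULL test and behaves as intended. A minimal illustration, separate from the demo above:

select
	case c when null then 'when null matched' else 'else branch' end	simple_case,
	case when c is null then 'is null matched' else 'else branch' end	searched_case
from	(select cast(null as varchar2(1)) c from dual);

-- SIMPLE_CASE        SEARCHED_CASE
-- -----------------  ---------------
-- else branch        is null matched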

 


Efficacy, Adaptive Learning, and the Flipped Classroom

Michael Feldstein - Wed, 2014-04-02 12:10

Speaking of efficacy and the complexity of measuring it, I had an interesting conversation the other day with Danae Hudson, a professor of Psychology at Missouri State University, about a course redesign effort that she participated in. The initial contact came from a P.R. firm hired by Pearson. Phil and I get a lot of these and turn down most of them. This one interested me for several reasons. First, it was an opportunity to talk directly to a faculty member who was involved in the project. (A tip to all you P.R. folks out there: You will have a better chance of getting our attention when the focus of the call is to put us in direct contact with your customers about what they are achieving.) Second, the project was facilitated by The National Center for Academic Transformation (NCAT). I am a big fan of NCAT’s work, despite the fact that they seem to have an almost pathological urge to sabotage efforts to bring their work the attention that it deserves. Pearson’s interest in the call was that MyPsychLab was an integral part of the course redesign. My interest was to see what I could learn about the interaction between educational technology products and educational practices in delivering educational efficacy.

What I heard tended to confirm my suspicions (and common sense): Educational technology products can produce significant learning gains, but they often do so by supporting changes in classroom practices.

The Goals

Like all NCAT redesign projects, this one has a complete write-up on the NCAT site. The document summarizes the redesign context and goals as follows:

Introductory Psychology is a semester-long, general education course at Missouri State University (MSU). The course falls within the self understanding/social behavioral perspective area of general education and is by far, the most popular choice for students within that area. Each academic year, at least 18 traditional face-to-face sections are offered with a total enrollment of 2,500-2,700 students. The course is lecture-based and typically taught by 65% full-time faculty and 35% adjunct instructors. While there are common general education goals across all sections, each instructor makes individual choices of content and delivery.

Despite being a popular choice among students, Introductory Psychology has traditionally experienced a high DFW rate (approximately 25%). The department wants to find ways to develop a more engaging course that will result in improved student learning outcomes and student satisfaction. Due to the large enrollment and numerous sections offered throughout the year, a significant number of adjunct instructors teach the course, which has contributed to some course drift and grade inflation. Currently, each section of 153 students is taught by one instructor, which significantly limits the type of activities that can be assigned and graded. The vast majority of the final course grade is derived from a series of multiple-choice exams. The goal is to redesign the course to be much more engaging and interactive, with an emphasis on true mastery of the course material.

To sum up: We have a popular Gen Ed course with a high failure and withdrawal rate. Danae also told me that the psychology department had long delivered a formative exam at the beginning of that class, and that they were unhappy with the level of improvement students were showing between the formative and summative exams. The faculty wanted to improve those numbers by making the course “more engaging and interactive, with an emphasis on the true mastery of the course material.”

This is typically where we start hearing that teaching effectively is expensive. But NCAT has a strong track record of proving that to be false. It turns out that ineffective teaching methods are often also inefficient. Let’s pause and think about the formulation of that last sentence for a moment. It’s not always the case that effective teaching measures are cost-efficient. Of course we know that good seminars with low teacher/student ratios can be very effective but, to adopt the current parlance, “don’t scale.” In that situation, there is a tension between effectiveness and efficiency. But despite appearances, some traditional styles of teaching—most notably the classic ginormous lecture class—are both ineffective and inefficient. Why is that so? For several reasons. First, both the physical plant and the labor structure of the large lecture class limit its ability to scale. If you run out of lecture hall seats, or you run out of TAs, you have exhausted your ability to increase the number of students taught with the faculty that you have. The central innovation of video-based xMOOCs is that they remove this limitation without changing the underlying pedagogical model of the large lecture. But the larger problem is that the relationship between cost and effectiveness is a two-way street in education. In my last post, I discussed David Wiley’s argument that the cost of curricular materials impacts effectiveness insofar as cost limits student access to those materials. But it goes the other way too. There is a cost for every student who fails or withdraws from a class and therefore has to retake it. The direct cost is in the tuition paid for two classes rather than one—a cost paid by financial aid providers in addition to the student—but indirect costs include increased chances that the student might have to stay an extra semester or drop out altogether, as well as the knock-on effect of the student blocking the seat for another student in an enrollment-capped but graduation-required course. NCAT typically doesn’t even look at these indirect costs and is still often able to find significant direct cost savings by restructuring courses away from ineffective pedagogical approaches toward more effective pedagogical approaches that also happen to be more scalable. In MSU’s case, they projected that they would be able to lower the direct cost of the course by 17.8% while still achieving the primary goal of increasing effectiveness. The NCAT report notes,

The cost savings will remain in the psychology department and will be used to provide support for the redesigned course in the future, faculty wishing to take on additional course redesign projects and faculty travel to present at conferences related to the scholarship of teaching and learning.

But How?

MSU decided to redesign its course around what NCAT calls “the Replacement Model,” which can be thought of as a combination of flipped and hybrid. At this point most people have at least a basic idea of what “hybrid” means, but “flipped” is a lot less clear. The Chronicle of Higher Education recently published a column by Robert Talbert highlighting a group that is trying to establish definition and best practices around what they call “flipped learning,” which they describe as follows:

Flipped Learning is a pedagogical approach in which direct instruction moves from the group learning space to the individual learning space, and the resulting group space is transformed into a dynamic, interactive learning environment where the educator guides students as they apply concepts and engage creatively in the subject matter.

That’s it in a nutshell: Move direct instruction (i.e., lectures) out of class time so that there can be more direct student interaction time. Which sounds great, but it leads to a frequently asked question. If students have to do all the homework they were doing before plus watching all the lecture videos at home, isn’t that going to dramatically increase the amount of time they have to spend on the class? How can they do all of that work? NCAT’s answer is that you give them back some of that time by making the class “hybrid” in the sense that you reduce their in-class seat time by 50%. That’s why it’s called the “Replacement Model.”

While Danae never used the term “flipped learning”, she did talk about the flipped classroom and made it very clear that she meant using it to increase the amount of class time spent interacting with students and focusing on their particular needs. But the MSU plan called for decreasing class time by 50% while doubling the number of students per class from an average of 153 to 300. How was that supposed to work?

Part of the answer lies in using traditional techniques like group projects, but a lot of it is in using data to provide students with more feedback and fine tune the classroom experience. This is where Pearson comes in. I wrote a while back that the promise of adaptive learning programs is to transform the economics of tutoring:

The simplest way to think about adaptive learning products in their current state is as tutors. Tutors, in the American usage of the word, provide supplemental instruction and coaching to students on a one-on-one basis. They are not expected to know everything that the instructor knows, but they are good at helping to ensure that the students get the basics right. They might quiz students and give them tips to help them remember key concepts. They might help a student get unstuck on a particular step that he hasn’t quite understood.  And above all, they help each student to figure out exactly where she is doing well and where she still needs help.

Adaptive learning technologies are potentially transformative in that they may be able to change the economics of tutoring. Imagine if every student in your class could have a private tutor, available to them at any time for as long as they need. Imagine further that these tutors work together to give you a daily report of your whole class—who is doing well, who is struggling on which concepts, and what areas are most difficult for the class as a whole. How could such a capability change the way that you teach? What would it enable you to spend less of your class time doing, and what else would it enable you to spend more of your class time doing? How might it impact your students’ preparedness and change the kinds of conversations you could have with them? The answers to these questions are certainly different for every discipline and possibly even for every class. The point is that these technologies can open up a world of new possibilities.

This is exactly how MSU is using MyPsychLab. One of the biggest benefits that Danae cited was being able to walk into a class knowing what students were doing well with and what they were struggling with. This enables her and her colleagues to focus on the topics that those particular students need the most help with in class while simultaneously communicating to the students that their teacher is aware of how they are doing and what they need. Likewise, she said that the students are coming to class more engaged with better questions. MSU also uses clickers in class to augment the feedback loop that they are getting from the homework platform. This certainly was a critical enabler at a class size of 300 and would be useful in a significantly smaller lecture class as well.

Did it work? The results are overall very positive but mixed:

  • On the 30-item comprehensive exam, students in the redesigned sections performed significantly better (84% improvement) compared to the traditional comparison group (54% improvement).
  • Students in the redesigned course demonstrated significantly more improvement from pre to post on the 50-item comprehensive exam (62% improvement) compared to the traditional sections (37% improvement).
  • Attendance improved substantially in the redesigned section. (Fall 2011 traditional mean percent attendance = 75% versus fall 2012 redesign mean percent attendance = 83%)
  • They did not get a statistically significant improvement in the number of failures and withdrawals, which was one of the main goals of the redesign, although they note that “it does appear that the distribution of A’s, B’s, and C’s shifted such that in the redesign, there were more A’s and B’s and fewer C’s compared to the traditional course.”
  • In terms of cost reduction, while they fell short of their 17.8% goal, they did achieve a 10% drop in the cost of the course.
Intuitions and Open Questions

The study of the course redesign was intended to measure the overall impact of the effort rather than to research the components of efficacy, which means that we don’t have good data from which we can draw strong conclusions on the most interesting questions in this regard. But I’m not afraid to make some guesses and I asked Danae to do the same with me. To be clear, her first answer to any of the questions I’m going to bring up in this section of the post was consistently along the lines of, “I don’t have data that speaks to that question.” Which is the right answer. I want to be clear that wherever I reference her opinions here that it was in this context and that she was appropriately tentative.

First of all, what role did MyPsychLab have in the improvements? Here we have at least one hard number:

A significant portion of the redesigned course utilized publisher-customized digital learning technology. A correlation was calculated between the students’ online total score of assigned material and the total of five exam scores. This correlation was .68, p < .001 suggesting a strong relationship between the completion of online learning activities and exam performance.

But why? Obviously, practice is part of the equation. Students who do the homework tend to do better in classes in general. I asked Danae what she thought the big drivers were beyond that. She cited the feedback to faculty and student engagement. The product seemed to succeed in getting students engaged, from her perspective. When pressed about the degree to which the adaptive component of the product made a difference, she guessed that it wasn’t as big a factor. “My gut tells me that it is less about the personalization,” she said. But then she added that the personalization may have helped to drive student engagement by making the students feel like the content was tailored to their needs. “I think personalization is the part that appeals to the students.” This raises the question of the degree to which any gains we see from adding an adaptive product may be due to a digital…er…analog to teaching presence, as opposed to the software’s real ability to adapt to individual student needs and capabilities.

Second, I asked Danae to characterize how much she thinks adopting MyPsychLab would have driven improvements had it been added to the original class before the redesign. Her instinct was not nearly as much, which is my instinct too. We don’t have numbers to separate the impact of the practice from the impact of the tailored instruction that resulted from having the student data in the product. Nor do we know how much student engagement with the product was impacted by the fact that it was integrated into the whole course redesign. These would be important questions to answer before we can have a clear and nuanced answer to the question of the product’s “efficacy.” Efficacious under what circumstances?

Finally, I’d like to return to David Wiley’s question about cost as a proxy for access and its impact on efficacy. Danae was traveling and didn’t have access to the course materials cost information when I reached her by email today, but she was confident that the cost had not gone up significantly and thought it might have actually gone down post-redesign. (And no, Pearson did not subsidize the cost of MyPsychLab to the students.) So we have no before/after data from which we can make inferences regarding the impact of cost on student outcomes. But it’s possible that MSU could have had a more significant impact on its DFW rate had the total cost to the students been lower. It’s also worth noting that MSU expected to increase enrollment by 72 students annually but actually saw a decline in enrollment of 126 students, which impacted their ability to deliver decreased costs to the institution. Would they have seen different enrollments had the curricular materials been less expensive? Or free? We don’t know. But this raises the point that efficacy cannot be reduced to one aggregate number. Improving student aggregate test scores and reducing the number of students who fail or withdraw are two different goals which certainly need to be measured differently and probably need different sorts of interventions to achieve.

Postscript

After this post went live, Danae shared some data with me from the semesters after the NCAT report was published. As it turns out, the course did see significant reduction in its DFW rates and enrollments bounced back over the course of several semesters. You can read about the details, as well as possible explanations, here.

The post Efficacy, Adaptive Learning, and the Flipped Classroom appeared first on e-Literate.

Tweaking

Jonathan Lewis - Wed, 2014-04-02 11:24

The following question came up on OTN recently:

Which one gives better performance? Could please explain.

1) nvl( my_column, ‘N’) <> ‘Y’

2) nvl( my_column, ‘N’) = ‘N’

It’s a question that can lead to a good 20 minute discussion – if you were in some sort of development environment and had a fairly free hand to do whatever you wanted.

The most direct answer is that you could expect the performance to be the same whichever option you chose – but the results might be different, of course, unless you had a constraint on my_column that ensured that it would hold only null, ‘Y’, or ‘N’. (Reminder: the constraint check (my_column in (‘Y’,’N’)) will allow nulls in the column.)

On the other hand, if you create a function-based index on nvl(my_column,’N’) then the second option would give the optimizer the opportunity to use the index – which could make it the more efficient option if a sufficiently small quantity of the data from a large table matched the predicate. Of course in this case you would need a histogram on the hidden column definition supporting the index so that the optimizer could detect the data skew.
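As a rough sketch of that option (table and column names here are illustrative rather than taken from the OTN thread), the index and the histogram on its hidden column would look something like this:

create index t1_nvl_i on t1(nvl(my_column,'N'));

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all hidden columns size 254');

select * from t1 where nvl(my_column,'N') = 'N';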

But if a function-based index is a step in the right direction it’s worth thinking about how to pick the best function-based index.


create index t1_i1 on t1(case my_column when 'Y' then null else 'N' end);

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1');

select * from t1 where case my_column when 'Y' then null else 'N' end = 'N';

Provided you can change the queries that use the original predicate, you can create a function-based index that is the smallest possible index to assist your query, and you won’t need a histogram to allow the optimizer to get the correct cardinality since there will be just one distinct value in the index (and the underlying column).

It’s possible to go a little further – what if the typical query that started this discussion was really something like:


select  *
from    t1
where   nvl(my_column, 'N') = 'N'
and     other_column = {some value}

If the nvl() predicate always returned a small fraction of the data we might engineer an index to handle the requirement of the entire where clause:


create index t1_i2 on t1(case my_column when 'Y' then null else other_column end);
execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all hidden columns size 254');

In this case you might need a histogram on the hidden column definition supporting the index, which is why I’ve changed my method_opt. We’ve constructed an index that is the smallest possible index that will satisfy the query as precisely as possible while giving the optimizer the best chance of producing accurate cardinality estimates. [Update: until the appearance of Kirill's comment (#1) I forgot to point out that once again you have to change the original SQL from using the nvl() expression to using a case expression that matches the index.]

Footnote

It would be easy to fall into the trap of playing around with this type of tweaking every time you hit a performance problem. Bear in mind, though, that even indexes which are as small and accurately targeted as possible can introduce various types of overhead (contention, cost of gathering stats, instability of histogram collection); so always try to think through the potential costs as well as the benefits of such an approach – is it really worth the effort?

It’s also worth noting that things don’t always work the way you expect.

Finally, of course, you might choose to create “proper” virtual columns in 11g so that you can refer to them by name, rather than handling them indirectly as you have to when they are “undocumented” hidden columns generated by function-based indexes.
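As a sketch of that last option (11g syntax, illustrative names again), the expression becomes a named virtual column that can be indexed, and can have stats gathered on it, directly by name:

alter table t1 add (
	my_column_flag	generated always as (case my_column when 'Y' then null else 'N' end) virtual
);

create index t1_i3 on t1(my_column_flag);

select * from t1 where my_column_flag = 'N';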

 


Organizations using data to track environmental change

Chris Foot - Wed, 2014-04-02 10:58

In light of a study recently released by the Intergovernmental Panel on Climate Change, the database administration needs of public agencies and organizations are expected to expand significantly. As it was industrialization and innovation that incited this worldwide issue, the Internet of Things will continue to be used to identify the detrimental effects climate change has on particular ecosystems and economies of the world. 

Patrick Thibodeau, a contributor to Computerworld, claimed that the IPCC's study acknowledged the importance of sensor networks to monitor the shifting global environment. Potentially, these devices could help government officials anticipate droughts, floods and natural disasters caused by rising temperatures. In addition, it is hoped that the mechanisms will identify ways to preserve water and food supplies as well as determine methods for reducing energy consumption. 

If public authorities choose to acknowledge the recommendations of the IPCC, the influx of new data derived from the IoT is sure to increase network traffic, requiring the expertise of remote database support to ensure that all analytics programs are running efficiently. As it's somewhat ambiguous how these sensors will be deployed, the kinds of avenues through which information flows into networks may pose a challenge to in-house IT departments.

An example of a new innovation 
The types of devices the government and non-profit environmental agencies use are quite variable. Some may track the shifting tides across the Atlantic and the Pacific while others will determine the acidity of farming soil. If all the data collected by these devices is assembled onto a single server, outsourced database experts may be consulted to manage it all. It looks as if scientists have already taken the first step. 

According to Space Daily, engineers from Europe developed the Sentinel-5 instrument, a mechanism which allows the continent's Copernicus program to monitor air quality around the globe. The article noted that atmospheric pollution is linked to millions of deaths around the world. 

"The readings will help to both monitor and differentiate between natural and human-produced emissions, providing new insight on the human impact on climate," noted the news source. 

Amassing and translating such an incredible amount of data will most likely necessitate the expertise of remote DBA to ensure that networks don't crash or overload. It's hoped that Copernicus, the world's first operational environmental surveillance system, will provide scientists with specific insight on how the earth's population can reduce emissions. 

The Art of Easy: Make It Easy To Deliver Great Customer Service Experiences (Part 1 of 6)

Linda Fishman Hoyle - Wed, 2014-04-02 10:54

A Guest Post by JP Saunders, Product Strategist for Oracle CX Service offerings

Modern Customer Service is helping businesses to differentiate and grow in today’s competitive market. Why? Because your success gets accelerated when you take care of the people who make you successful. Modern Customer Service is about just that, taking care of your customers, and your people that support them. To do that, you need to make it EASY to deliver great experiences, time and time again.

Businesses that make it “look” easy stand out, because they consistently deliver experiences so memorable that their customers feel compelled to share them. But ask any of them how they do it, and you will quickly learn that the ART of EASY really isn’t EASY at all―or we would all be doing it!

The most common business challenge stems from underestimating the complexity of simplicity, which can be devastating to the customer’s experience. EASY is an ART. And it requires a deep understanding to get the right results―understanding of your business and of your customer, which can be applied through your culture and skills. Your canvas is the combination of every engagement channel you have where the picture of your brand gets painted. Your palette of colors is the collection of your people, knowledge and data. Your brushes are the technologies that blend it all together. In order to make “EASY” beautiful and repeatable, you need solutions that provide a paint-by-numbers approach.

In this six-part blog series, we will help guide you in managing the complexity of your customer service business challenges, avoiding the common pitfalls, and delivering the type of great experiences that your customers will want to share with others. Experiences that are easily repeatable by you, and hard for your competitors to replicate. We will show you how Oracle Service Cloud empowers your business to care, take action and succeed in your Web, Contact Center, Knowledge, Policy, and Cloud customer service initiatives, to become a Modern Customer Service organization.

Dealing with technical questions about #Oracle

The Oracle Instructor - Wed, 2014-04-02 10:51

During the OCM Preparation Workshop today in Bucharest, we got into a discussion about parallel inserts vs. direct load with the append hint, where I said something like: “I think a parallel insert is always* also a direct load.” Most attendees were satisfied with that, but some still looked a bit sceptical. And I was also not 100% sure about it. While the others practiced, I did some brief research and came up with this answer from Tom Kyte, which I quoted: “Parallel is always a direct path, if you go parallel, you will be appending.” Everybody in the room was immediately convinced – only I felt a bit uncomfortable with the outcome myself. After all, I don’t want to teach people to blindly follow my or any other guy’s claims without checking them if possible. And as with so many other claims about Oracle, this one can be proved (or falsified) with little effort:

 

SQL> grant dba to adam identified by adam;

Grant succeeded.

SQL> connect adam/adam
Connected.

SQL> create table t as select 'TEST' as col1 from dual connect by level<=1e6; 
Table created. 

SQL> select bytes/1024/1024 from user_segments;

BYTES/1024/1024
---------------
             13

SQL> delete from t;

1000000 rows deleted.

SQL> commit;

Commit complete.

SQL> insert into t select 'TEST' as col1 from dual connect by level<=1e6; 
1000000 rows created. 

SQL> commit;

Commit complete.

SQL> select bytes/1024/1024 from user_segments;

BYTES/1024/1024
---------------
             13

The above shows that a conventional insert reuses the empty space before the high water mark of the table and does not allocate additional extents. Now I’m going to do a parallel insert:

SQL> delete from t;

1000000 rows deleted.

SQL> commit;

Commit complete.
SQL> alter table t parallel;

Table altered.

SQL> alter session enable parallel dml;

Session altered.

SQL> insert into t select 'TEST' as col1 from dual connect by level<=1e6; 
1000000 rows created.

SQL> commit;

Commit complete.

SQL> select bytes/1024/1024 from user_segments;

BYTES/1024/1024
---------------
         24.375

QED! Obviously, the parallel insert appended rows after the high water mark – it was a direct load, in other words. That is in itself not so spectacular, but that little anecdote illustrates what is, in my opinion, the proper way to deal with technical questions about Oracle: always try to back up your assumptions and claims with proof instead of relying only on (your or others’) authority.

In other words: Don’t believe it, test it! :-)

*Addendum: Never say always, I always say: if you include the NOAPPEND hint in the parallel insert, it will not be done as a direct load. Thank you guys for pointing that out!
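One more quick check you can add to the test case if you like (my own addition, not something from the workshop): a direct-path insert leaves the modified table unreadable inside the same transaction, so querying it before the commit tells you immediately which kind of insert you just did.

alter session enable parallel dml;

insert into t select 'TEST' as col1 from dual connect by level<=1e6;

select count(*) from t;  -- raises ORA-12838 if the insert was a direct load

commit;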


Tagged: 3e
Categories: DBA Blogs

How to Chat Up an Accountant Safely: Social Networking in the Finance Department

Linda Fishman Hoyle - Wed, 2014-04-02 10:38

A Guest Post by Ultan O'Broin, Director of User Experience, Oracle Corporation

Seems that baby boomers are now Instagram-ing, WhatsApp-ing and SnapChat-ing just like younger Digital Natives do. How widespread those apps are in the enterprise is another matter, but it’s a reminder never to make assumptions about apps users. Yet, certain job titles do sometimes conjure up a mental picture of how we think some people actually work.

Mention “accountant”, and you might visualize a gray picture of quiet, introspective types, heads down in books and spreadsheets, papers flying, calculators working overtime, phones to their ears begging cash from customers and wiring funds to suppliers, while accounting for all the money. Not terribly social, then? The polar opposite of those freewheeling “Mad Men” sales rep CRM types, out meeting and greeting, getting their message across to make that sale, perhaps? In fact, the finance department is a hive of social activity.

I spoke with David Haimes, Senior Director in Oracle Financials Applications, about the social side of the finance department. David understands the reality of his applications users. “Their most critical time is the 5-10 days after period close when everything has to be closed out and reported”, David told me. “There’s a huge amount of effort and social interaction going on”.

During the close process, David said teams need to exchange information and make decisions as quickly as possible and still satisfy business and legal requirements. Accounting teams were early adopters and heavy users of instant messaging, email distribution lists (with Microsoft Excel spreadsheet attachments), wikis, file sharing workspaces, and of course, the old fashioned telephone. But these tools were external to the financial application and data. The user experience was disjointed. Who works well in a silo? And, there was no audit trail. David has seen accounting teams copying and pasting emails into documents and attaching them to meet that audit requirement.

“The finance department has to make sure everything is correct and legal,” David said. “They’re reporting not just to internal management, but to Wall Street, to tax authorities, and to other legislative bodies. And, since the Sarbanes-Oxley act, CEOs are legally responsible for the correctness of the accounts,” David reminded me. That’s pressure.

Things are even more hectic when you consider the nature of the enterprise financial department today, with its distributed team members with shared service centers offshore and everyone working in different countries and time zones. Everyone needs to communicate and collaborate efficiently, yet securely and transparently.

That’s where Oracle Social Network is a financial department win. 

  • Oracle Social Network conversations are tied to business objects and transactions, enabling finance teams to easily share and collaborate in a role-based way. 
  • Oracle Social Network conversations are auditable (which is “usually the first question I’m asked,” says David). 
  • Oracle Social Network conversations are searchable.
  • Oracle Social Network is secure, with users with the right permissions working together on information stored in an Oracle database. 
  • Oracle Social Network is integrated with Oracle Financials applications, so the user experience is  streamlined.

“[Oracle Social Network] is a game changer in the finance department,” says David, not just for the closing period but also for daily financial activity. And, Oracle Social Network is available as a cloud service, with iOS and Android mobile apps versions too.

With the Oracle Social Network user experience in the finance department, Oracle also satisfies today’s workforce that expects social networking tools to be as much a part of their work lives as their personal lives. Said David: “Younger users are already familiar with how social networking sites work and how they’re easy to use, and that’s the sort of user experience we need to reflect. It’s demanded.”

Having a social networking application as part of the job makes hiring and onboarding easier too, offering benefits right across the enterprise. And it’s not only Digital Natives or Millennials who easily take to integrated social networking in work. Even senior users now see the benefits.

Socializing the finance department with Oracle technology is an example of how a great user experience can engage workers, accelerate performance and efficiency, deliver productivity for business while meeting the consumer technology demands of end users, and satisfy the requirements of stakeholder user groups such as other departments, auditing and security teams, tax authorities, reporting agencies, shareholders, and so on.

Read more about socializing the finance department on the Oracle Applications blog and David’s blog (a bookmark must) too. And, check out what the Oracle Social Network Cloud Service now offers and how it benefits your users and business.

The Merge Join Transformation and a SSIS Sorting Tip

Chris Foot - Wed, 2014-04-02 06:43

My previous blog post was about the SSIS Lookup task and how it really works. Now that I have shown that the Lookup task shouldn’t be used for one-to-many or many-to-many joins, let’s take a look at the Merge Join transformation task. If you follow along with this blog, you will learn a little tip that will eliminate the requirement for you to add a SORT transformation task within your data flow task.

Previously, we isolated our results sets down to one employee in the AdventureWorks database and joined two tables together. I’m going to do the same thing here. This time, I am going to introduce a third table into the join logic. The three tables are listed below:

  • Employee Department History
  • Person
  • Department

Here is what it would look like via SSMS with T-SQL:

[Screenshot: T-SQL join of the three tables in SSMS]
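A query along these lines joins the three tables (the exact column list and the employee filter here are my assumptions, not taken from the screenshot):

SELECT  p.FirstName,
        p.LastName,
        d.Name AS DepartmentName,
        edh.StartDate,
        edh.EndDate
FROM    HumanResources.EmployeeDepartmentHistory AS edh
        INNER JOIN Person.Person AS p
                ON p.BusinessEntityID = edh.BusinessEntityID
        INNER JOIN HumanResources.Department AS d
                ON d.DepartmentID = edh.DepartmentID
WHERE   edh.BusinessEntityID = 4;  -- placeholder for the single employee used earlier in the series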

Let’s see how we can mimic this result set in SSIS without using T-SQL to join the three tables. First, I want to say that avoiding T-SQL in favor of individual SSIS tasks is not always going to be the best option. I have learned over time that it is sometimes easier to write the join logic directly in your data flow source task. However, this is for demonstration purposes.

Let’s say you received a request to extract a result set, order the results set, and load it to another location. Here is what your package would look like in SSIS using the Merge Join transformation task:

[Screenshot: SSIS data flow using Sort and Merge Join transformation tasks]

Here are our results:

[Screenshot: result set returned by the package]

Notice, I used the SORT transformation task in the example above. I used this to depict what has to occur in a step by step approach:

  • Extracted data from the Person and Employee Department History tables
  • Sorted each result set
  • Merged the two results into one using inner join logic
  • Extracted data from the Department table
  • Sorted the first joined result set and the Department result set
  • Merged the joined result set from Person and Employee Department History with the Department table

Let’s talk about best practice for this example. This is where the Sort tip is introduced. Since we need an ordered result set per the request, we are using the merge transformation instead of the Union All task. Additionally, we used the Sort task. The Sort task can heavily impact the performance of an SSIS package, particularly when you have larger result sets than what we are going to extract from the AdventureWorks database.

Best practice is to bring in an ordered result set at the source and then merge your record sets. Well, how do you do that? Let’s walk through ordering your result set at the source and configuring your source to define the sorted column for merging your record sets.

First, we open the task and add our ORDER BY clause to our source.

[Screenshot: OLE DB Source editor with the ORDER BY clause added to the source query]
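The source query itself is nothing special – the important part is the ORDER BY on the column the Merge Join will use. A sketch (the exact column list is an assumption on my part):

SELECT  BusinessEntityID,
        FirstName,
        LastName
FROM    Person.Person
ORDER BY BusinessEntityID;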

Next, close the source task, right click on the same source task, and choose the Show Advanced Editor option.

[Screenshot: right-click menu showing the Show Advanced Editor option]

There are two specifications in the Advanced Editor that need to be defined in order to make this work:

  • Click on the Input and Output Properties tab
  • Click on the OLE DB Source Outputs
  • Change the IsSorted parameter to “True”

[Screenshot: Advanced Editor with the IsSorted property set to True]

  • Drill down into the OLE DB Source Output to Output columns.
  • Click on your columns that you used in your ORDER BY clause.
  • Change your SortKeyPosition parameter from “0” to “1”.

The desired results should look similar to those below:

[Screenshot: Output column with SortKeyPosition set to 1]

Next, you can remove each sort task that directly follows your OLE DB Source task by repeating the steps above to reconfigure each source editor. Now, my data flow task looks like this:

[Screenshot: data flow task with the Sort tasks removed]

We get back the same results:

[Screenshot: same result set as before]

In case you are wondering why I got rid of all of the Sort tasks except for the one that follows the first merge join, I’ll explain. There are two reasons for this: my second join is on DepartmentID, and, most importantly, the Merge Join transformation task is not considered a data flow source task and does not come with the functionality to define the sorted order.

To conclude my second blog post of this series, the Merge Join transformation task can be used to merge columns from two different tables using join logic similar to the joins that can be used in T-SQL. We have looked at a step-by-step breakdown of what has to occur to implement a Merge Join transformation task, as well as some tips and best practices regarding the use of the Sort task in SSIS.

I hope this blog post has been informative and that you look forward to reading my third post soon.

Indexing Foreign Keys (Helden)

Richard Foote - Wed, 2014-04-02 00:41
A recent question on an internal forum asked whether an index on a Foreign Key (FK) constraint designed to avoid locking issues associated with deletes on the parent tables needs to precisely match the columns in the FK. Could the columns in the index potentially be in a different order or be appended with additional columns? The answer is […]
Categories: DBA Blogs

Cupcake Wars at NoCOUG Spring Conference

Iggy Fernandez - Tue, 2014-04-01 14:49
FOR IMMEDIATE RELEASE Cupcake Wars at NoCOUG Spring Conference on May 15 at UCSC Extension Silicon Valley SILICON VALLEY (APRIL 1, 2014) – In a bold experiment aimed at increasing attendance at its awesome educational conferences, the Northern California Oracle Users Group (NoCOUG) is considering changing the format of its spring conference to that of […]
Categories: DBA Blogs

Contributions by Angela Golla

Oracle Infogram - Tue, 2014-04-01 12:57
Contributions by Angela Golla, Infogram Deputy Editor

Big Data Architecture
Everybody's talking about Big Data, but what are you doing about it? The latest OTN ArchBeat podcast just might help to get you started down the right path to Big Data. The panel for this discussion of Big Data Architecture includes Oracle ACE Director and Hadoop expert Gwen Shapira  and Oracle Big Data Handbook authors Tom Plunkett, Bruce Nelson, and Brian MacDonald.

Data Integration Tips: ODI 12c – Varchar2 (CHAR or BYTE)

Rittman Mead Consulting - Tue, 2014-04-01 11:28

Continuing with our Data Integration Tips series, here’s one that applies to both Oracle Data Integrator 11g and 12c. This “tip” was actually discovered as a part of a larger issue involving GoldenGate, Oracle Datapump, and ODI. Maybe a future post will dive deeper into those challenges, but here I’m going to focus just on the ODI bit.

The Scenario

During our setup of GoldenGate and ODI, it was discovered that the source and target databases were set to use different character sets.

Source:  WE8ISO8859P1

Target (DW):  AL32UTF8

During my research, I found that the source is a single-byte character set and the target is multi-byte. What this means is that a special character, such as “Ǣ”, for example, may take up more than one byte when stored in a column with a VARCHAR2 datatype (as described in the Oracle documentation – “Choosing a Character Set”). When attempting to load a column of datatype VARCHAR2(1) containing the text “Ǣ”, the load would fail with an error similar to the one below.

ORA-12899: value too large for column "COL_NAME" (actual: 2, maximum: 1)

The difference in character sets is clearly the issue, but how do we handle this when performing a load between the two databases? Reading through the Oracle doc referenced above, we can see that it all depends on the target database column length semantics. Specifically, for the attributes of VARCHAR2 datatype, we need to use character semantics in the target, “VARCHAR2(1 CHAR)”, rather than byte semantics, “VARCHAR2(1 BYTE)”. The former can handle multi-byte character sets simply by storing the characters as they are inserted. The latter will store each byte necessary for the character value individually. Looking back at the example, the character “Ǣ” inserted into a column using byte semantics (which is the default, in this case, when BYTE or CHAR is not specified) would require 2 bytes, thus causing the error.
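A quick way to see the difference for yourself (a sketch, not part of the original scenario) on an AL32UTF8 database:

-- the same single character fits in a CHAR-semantics column
-- but overflows a 1-byte column
create table semantics_demo (
  c_byte  varchar2(1 byte),
  c_char  varchar2(1 char)
);

insert into semantics_demo (c_char) values ('Ǣ');  -- succeeds: 1 character
insert into semantics_demo (c_byte) values ('Ǣ');  -- fails with ORA-12899: needs 2 bytes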

Here’s the Tip…

The overall solution is to modify any VARCHAR2 columns that may have special characters inserted into them to use character semantics in the target database. Quite often we cannot determine in advance which columns may or may not contain multi-byte data, which means modifying all VARCHAR2 columns to use character semantics. Using the database system tables, the ALTER TABLE script to make the necessary changes to the existing columns can be generated and executed. But what about new columns generated by ODI? Here we'll need to use the power of the Oracle Data Integrator metadata to create a new datatype.
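
As an illustration of that first step (this query is my own, not from the original post, and the schema name is a placeholder), the data dictionary can be used to generate the ALTER statements:

-- Generate ALTER TABLE ... MODIFY statements for every VARCHAR2 column
-- that currently uses byte semantics (CHAR_USED = 'B').
SELECT 'ALTER TABLE ' || owner || '.' || table_name ||
       ' MODIFY (' || column_name || ' VARCHAR2(' || char_length || ' CHAR));' AS ddl
FROM   dba_tab_columns
WHERE  owner     = 'TARGET_SCHEMA'   -- placeholder schema
AND    data_type = 'VARCHAR2'
AND    char_used = 'B'
ORDER  BY table_name, column_id;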

In the ODI Topology, under the Physical Architecture accordion, the technologies that can be used as a data source or target are listed. Each technology, in turn, has a set of datatypes defined that may be used as Datastore Attributes when the technology is chosen in a Model.

Oracle Technology

Further down in the list, you will find the VARCHAR2 datatype. Double-click the name to open the object. In the SQL Generation Code section we will find the syntax used when DDL is generated for a column of type VARCHAR2.

Oracle technology - VARCHAR2 datatype

As you can see, the default is to omit the length semantics from the datatype syntax, which means the database default applies (the NLS_LENGTH_SEMANTICS parameter, typically BYTE in an Oracle database). This syntax can be modified to always produce character semantics by adding the CHAR keyword after the length substitution value.

VARCHAR2(%L CHAR)

Before making the change to the "out of the box" VARCHAR2 datatype, you may want to think about how this datatype will be used on Oracle targets and sources. Any DDL generated by ODI will use this syntax whenever VARCHAR2 is selected for an attribute datatype. In some cases this is just fine, as the ODI tool is only used for a single target data warehouse. But quite often ODI is used in many different capacities, such as data migrations, data warehousing, etc. To handle both forms of semantics, the best approach is to duplicate the VARCHAR2 datatype and create a new version that uses character semantics.

VARCHAR2 datatype edited

Now we can assign the datatype VARCHAR2 (CHAR) to any of our Datastore columns. I recommend using a Groovy script when changing Attributes in multiple Datastores.

Change Datatype - VARCHAR2 CHAR

Now when Generate DDL is executed on the Model, the Create Table step will have the appropriate semantics for the VARCHAR2 datatype.

Generate DDL - VARCHAR2 CHAR
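
Illustratively (the table and column names below are invented, not taken from the screenshot above), the generated Create Table step now produces DDL along these lines:

CREATE TABLE TARGET_SCHEMA.CUSTOMER
(
  CUST_CODE  VARCHAR2(10 CHAR),
  CUST_NAME  VARCHAR2(100 CHAR)
);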

As you can see, the power of Oracle Data Integrator and the ability to modify and customize its metadata provided me with the solution in this particular situation. Look for more Data Integration Tips from Rittman Mead – coming soon!

Categories: BI & Warehousing

Efficacy Math is Hard

Michael Feldstein - Tue, 2014-04-01 10:12

David Wiley has a great post up on efficacy and OER in response to my original post about Pearson’s efficacy plan. He opens the piece by writing about Benjamin Bloom’s famous “2 sigma” problem:

The problem isn't that we don't know how to drastically increase learning. The two-part problem is that we don't know how to drastically increase learning while holding cost constant. Many people have sought to create and publish "grand challenges" in education, but to my mind none will ever be more elegant than Bloom's from 30 years ago:

“If the research on the 2 sigma problem yields practical methods – which the average teacher or school faculty can learn in a brief period of time and use with little more cost or time than conventional instruction – it would be an educational contribution of the greatest magnitude.” (p. 6; emphasis in original)

So the conversation can’t focus on efficacy only – if there were no other constraints, we actually know how to do “effective.” But there are other constraints to consider, and to limit our discussions to efficacy is to remain in the ethereal imaginary realm where cost doesn’t matter. And cost matters greatly.

David then launches into a discussion of what he calls his “golden ratio,” or standard deviations per dollar. I have long been a fan of this formulation and quote it frequently. I’m not going to try to summarize his explication of it in his post; you really should go read it. But I would like to tease out a few implications here.

Cost/Effectiveness Analysis

By expressing cost and educational impact as a ratio, David is engaging in something called cost/effectiveness analysis. You may be more familiar with the closely related term "cost/benefit analysis." The main difference between the two is that in the latter the benefit is expressed in financial terms, while in the former it is expressed in non-financial terms (such as learning gains, in this case). This is a powerful tool which is unfortunately misapplied more often than not. When people invoke cost/benefit, what they often mean to invoke is cost, as in, "Do you really think this is worth it?" It is used to selectively question an expenditure that somebody doesn't like. (Note that I am not accusing David of making this error; I'm just talking about common usage.) In Congress, cost/benefit is often a requirement tacked onto a bill to decrease the likelihood that the thing the amendment author doesn't like will actually get funding. Likewise in education, cost/benefit or cost/effectiveness is loosely invoked for things that the invokers don't think are worth the money up front, whether it's textbooks, LMSs, or teacher salaries.

But the better way to apply the tool is comparatively across the range of possible investment decisions. “Given X amount of money, do we get more standard deviations for our dollars by investing in A or B?” This moves us away from a focus on preventing spending on things we don’t like and toward a focus on maximizing utility, which is what David is after.  And this is where it gets complicated. A good part of David’s post is about the complexities of measuring and impacting the numerator in standard deviations per dollar. Unfortunately, we have a lot of trouble tracking the denominator as well. Even the institutional costs can be complex, as Phil’s recent response to Chris Newfield regarding the true cost of the UF/Pearson deal illustrates. It gets a lot more complicated when we start asking, “Cost to whom?” The controversy around the UF deal centers around the cost to the institution and ultimately to the state. Textbooks are paid for by students. Mostly. Sort of. Except when they spend university scholarship money on them. Or state or Federal financial aid on them. None of this argues against the framework that David is presenting. It just makes the practical application of it more challenging.

But It’s Worse Than That

So far, we’ve been talking about the ratio as if “efficacy” is represented in the numerator. David reinforces this impression when he writes,

So the conversation can’t focus on efficacy only – if there were no other constraints, we actually know how to do “effective.” But there are other constraints to consider, and to limit our discussions to efficacy is to remain in the ethereal imaginary realm where cost doesn’t matter.

But that’s not really his argument. His argument is cost impacts access which impacts efficacy. If students fail to use the prescribed product because they cannot afford to buy it, and they therefore do poorly in class, then the cost of the product is inextricable from the measure of its efficacy. This is an excellent example of what Mike Caulfield meant when he referred to the “last mile” problem. An educational product, technique, or intervention can only be said to be “effective” when it has an effect. It can only have an effect if it is actually used—and often only if it is actually used in the way it was designed to be used. Of course, if students can’t afford to buy the product, then they won’t use it and it therefore is not effective for them.

So maybe the entire ratio, numerator and denominator together, collectively expresses a measure of effectiveness, right? Not so fast. There are two colleges fairly close to where I live. One, Berkshire Community College, has a total non-residential cost of $5,850 per year for Massachusetts residents taking 15 credits per semester. The other, Simon's Rock College, has a total residential cost of $60,000 per year. A cost of $100 for curricular materials could have a dramatic impact on access (and therefore efficacy) in the former environment but a negligible one in the latter. Standard deviations per dollar does not capture this difference. We could instead express the denominator in terms of percentage of total cost, which would help somewhat for this particular purpose. But what we really need is empirical data quantifying the impact of cost on student access under different conditions. Doing so would enable us to separate the numerator and the denominator once again. If the impact of cost for a particular educational population is already factored into the numerator, then we can get back to a discussion of bang for the buck. We can also make more nuanced evaluations. It may be that, because of the access issue, a commercial product is more effective for Simon's Rock students than it is for BCC students. Further, we could (theoretically) perform a calculation to determine its effectiveness for University of Massachusetts students, which would presumably be different from either of the other two.
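
To put rough numbers on that point (my arithmetic, using the figures above), the same $100 in curricular materials represents very different shares of total annual cost:

$100 / $5,850  ≈ 1.7%  of total annual cost at Berkshire Community College
$100 / $60,000 ≈ 0.17% of total annual cost at Simon's Rock College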

I guess what I’m trying to say is that efficacy is complicated. It’s a great goal, but teasing out what it means and how to measure it in authentic and useful ways is going to be very difficult.

The post Efficacy Math is Hard appeared first on e-Literate.

Omnichannel retail changes the face of the merchandiser’s database

Chris Foot - Tue, 2014-04-01 09:48

Retailers that have failed to adapt to the e-commerce landscape are seemingly destined for failure. Those that have executed an omnichannel product delivery approach have implemented complex data analytics programs to provide them with valuable market insight on both individual customers and groups of people. Helping them effectively manage this software are database experts well acquainted with the technology. 

Meeting expectations 
Although online shopping has driven profits for merchandisers, it has also presented them with a list of new problems. One challenge that has evolved with the prevalence of e-commerce is achieving customer satisfaction. Back in the days when the only place to purchase items was a brick-and-mortar store, it was enough to deliver a product that functioned the way it was supposed to at a reasonable price. Now, retail websites are expected to deliver customized marketing campaigns for particular visitors and offer more rewards to loyal customers.

Meyar Sheik, CEO of personalization software vendor Certona, claimed that without the appropriate data and actions to target shoppers with relevant, applicable information, it becomes very difficult for merchandisers to execute an effective omnichannel strategy. In this respect, possessing the programs capable of managing and translating such a large amount of data is just as much a part of the customer relations strategy as product development.

Leaving it to the professionals 
As retail executives grow more concerned about the intelligence derived from their data, many have hired database administration services to implement and run data analytics programs effectively. In a way, these professionals do more than maintain a corporate network; they provide the expertise and tools necessary to keep the foundation of a business profitable.

C-suite merchandisers aren't ignorant of this fact, either. According to a report released by research firm IDC, retailers are expected to heavily invest in big data and analytics projects in 2014, requiring the invaluable knowledge of IT services providers. In addition, the study showed that mobile applications connected to e-commerce operations will grow at a compound annual growth rate of 10 percent over the next three years. 

From what can be gathered based on the latter statistic, smartphones and tablets are anticipated to be major participants in omnichannel strategies. It is very likely that database administration companies will be hired to oversee the connectivity of these devices and ensure that the avenues through which they communicate are not exploited by cybercriminals. 

Overall, the functionality of data analytics tools and e-commerce software is dependent on IT professionals capable of assessing the needs of particular merchandisers. 

User Groups and Speaking About Support and diag tools.

Fuad Arshad - Tue, 2014-04-01 07:58
The Chicago Oracle Users Group (COUG) is finally in its reboot mode. Thanks to Alfredo Abate for taking on the responsibility and bringing the enthusiasm needed to pull the community back together. Jeremy Schneider has blogged about this here. There is now a LinkedIn group open for business, and I would recommend everyone contribute; let's make this reboot a success.

I am also going to be presenting at the Ohio Users Group on April 17th along with Jeremy Schneider. The details of the event can be found at http://www.ooug.org. If you are in the area, please stop by and say hi. I'll be talking about the various support tools that Oracle offers and how to use them effectively.



Installation of Oracle Identity Management (OID/OVD/ODSM/OIF) 11gR1(11.1.1.7) – Part 2

Online Apps DBA - Tue, 2014-04-01 07:40
This post covers part 2 of the OID/OVD 11gR1 (11.1.1.7) installation that will be used as the user repository (Identity Store) for our Oracle Access Manager (OAM) 11gR2 Admin Training (training starts on 3rd May and the fee is 699 USD). For the first part of the OID/OVD installation click here 1. Install IDM (OID/OVD/ODSM) 11.1.1.7 1.1 Start installer as cd /stage/oracle/idm/11.1.1.7.0/Disk1 (IdM 11.1.1.7 [...]

This is a content summary only. Visit my website http://onlineAppsDBA.com for full links, other content, and more!
Categories: APPS Blogs

A few words about deployment of ADF production domain on Oracle Linux

Recently I configured an ADF 11.1.2.4 production domain on Oracle Linux 6. One of the stages of this process is the installation and configuration of Application Development Runtime (ADR), which is...

We share our skills to maximize your revenue!
Categories: DBA Blogs