Home » SQL & PL/SQL » SQL & PL/SQL » How to Populate a fact table from dimension tables.Kindly reply. (Oracle 10.2g)
How to Populate a fact table from dimension tables.Kindly reply. [message #380301] Sat, 10 January 2009 07:51 Go to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
Basically my aim is there are 5 dimensional tables that are created

Student->s_id primary key,upn(unique pupil no),name
Grade->g_id primary key,grade,exam_level,values
Subject->sb_id primary key,subjectid,subname
School->sc_id primary key,schoolno,school_name
year->y_id primary key,year(like 2008)

s_id,g_id,sb_id,sc_id,y_id are sequences

select * from student;
S_ID UPN FNAME COMMONNAME GENDER DOB
==============================
9062 1027 MELISSA ANNE       f  13-OCT-81

...

9000 rows selected


select * from grade;
      G_ID GRADE      E_LEVEL         VALUE
---------- ---------- ---------- ----------
        73 A          a                 120
        74 B          a                 100
        75 C          a                  80
        76 D          a                  60
        77 E          a                  40
        78 F          a                  20
        79 U          a                   0
        80 X          a                   0
...
18 rows selected

These are basically the dimensional views

Now according to the specification given, need to create a fact table as facts_table which contains all the dim tables primary keys as foreign keys in it.

The problem is when i say,I am going to consider a smaller example than the actual no of dimension tables 5 lets say there are 2 dim tables student,grade with s_id,g_id as p key.
create materialized view facts_table(s_id,g_id)
as
select  s.s_id,g.g_id
from   (select distinct s_id from student)s
,         (select distinct g_id from grade)g


This results in massive duplication as there is no join between the two tables.But basically there are no common things between the two tables to join,how to solve it?

Consider it when i do it for 5 tables the amount of duplication being involved, thats why there is not enough tablespace.

I was hoping if there is no other way then create a fact table with just one column initially

create materialized view facts_table(s_id)
as
select s_id
from student;


then
alter materialized view facts_table add column g_id number;


Then populate this g_id column by fetching all the g_id values from the grade table using some sort of loop even though we should not use pl/sql i dont know if this works?

Any suggestions.
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380303 is a reply to message #380301] Sat, 10 January 2009 08:28 Go to previous messageGo to next message
flyboy
Messages: 1903
Registered: November 2006
Senior Member
Hi,

before using this kind of "fact table", you shall ask this question: what shall this table store and how shall be its content used? Populating all combinations means, that each student studied all subjects with every grade in every school in all years. Does it make any sense?

Maybe you should read something about Business Intelligence and OLAP concepts. As fact tables are typically used for aggregating information about dimension combinations in existing system (the school one in your case), so it shall have at least one measure (event count). Otherwise, you would not aggregate - you would have common OLTP evidence. As in normal systems, not all dimension combinations are present, the fact table is not very huge.

Also in your case, the "student" dimension seems to be too detailed to use. Are you really planning to make reports per individual students? Maybe their gender, age or some other personal info (nationality, ...) should suffice.

Also, instead of using "normal" tables (ROLAP) you could have a look at Oracle OLAP support (MOLAP).
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380305 is a reply to message #380301] Sat, 10 January 2009 08:43 Go to previous messageGo to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
All iam trying to put in the fact table is the foreign keys i.e the primary keys of the dimension tables i created.

I am not getting how to do that without involving lot of duplication.

Thank you.
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380307 is a reply to message #380303] Sat, 10 January 2009 08:57 Go to previous messageGo to next message
flyboy
Messages: 1903
Registered: November 2006
Senior Member
flyboy wrote on Sat, 10 January 2009 15:28
What shall this table store and how shall be its content used? Populating all combinations means, that each student studied all subjects with every grade in every school in all years. Does it make any sense?

Does really every student pass all subjects with every grade in every school in all years? Even if this is a homework, you shall think what do the stored data represent.

sqlstar_student wrote on Sat, 10 January 2009 15:43
I am not getting how to do that without involving lot of duplication.

When the dimensions are properly chosen (according to the report requirements), that "duplication" is not so huge. On the contrary, it shall be just a fraction of corresponding OLTP evidence storage.
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380311 is a reply to message #380301] Sat, 10 January 2009 09:38 Go to previous messageGo to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
Basically your quite right to say that without any logical common columns between the dimension tables it will produce results that every student studied every sibject and got every grade and its very rubbish,

I am confused at to whether the dimension tables can contain duplicated columns i.e column like upn(unique pupil no) i will also copy in another table so that when writing queries a join can be placed. i dont know whether thats right

These are the required queries from the star schema

Design a conformed star schema which will support the following queries:
a. For each year give the actual number of students entered for at A-level in the whole country / in each LEA / in each school.

b. For each A-level subject, and for each year, give the percentage of students who gained each grade.
c. For the most recent 3 years, show the 5 most popular A-level subjects in that year over the whole country (measure popularity as the number of entries for that subject as a percentage of the total number of exam entries).

I written the queries earlier based on dimesnion tables which were highly duplicated they were like

student
=======
upn
school

school
======
school(this column substr gives lea,school and the whole is country)
id(id of school)

student_group
=============
upn(unique pupil no)
gid(group id)
grade

year_col
========
year
sid(subject id)
gid(group id)
exam_level
id(school id)

grades_list
===========
exam_level
grade
value

subject
========
sid
subject
compulsory

These were the dimension table si created earlier and as you can see many columns are duplicated in other tables like upn and this structure effectively gets the data out of the schema as there are common column upon which we can link

But a collegue suggested that these dimension tables are wrong and they should not be this way and should not contain dupliated columns.

select 	distinct count(s.upn) as st_count
,	y.year
,	c.sn 
from 	student_info s
,	student_group sg
,	year_col y
,	subject sb
,	grades_list g
,	country c
where 	s.upn=sg.upn
and	sb.sid=y.sid
and	sg.gid=y.gid
and	c.id=y.id
and	c.id=s.school
and 	y.exam_lev=g.exam_level
and 	g.exam_level='a'
group by y.year,c.sn
order by y.year;


This is the code for the 1st query

I am confused now which structure is right.Are my earlier dimension tables which i am describing here or the new dimension tables which i explained above are right.

If what i am describing now is right i mean the dimension tables and the columns are allright then i just need to create a fact table with foreign keys of all the dimension tables.


Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380314 is a reply to message #380301] Sat, 10 January 2009 10:00 Go to previous messageGo to next message
flyboy
Messages: 1903
Registered: November 2006
Senior Member
Quote:
I written the queries earlier based on dimesnion tables which were highly duplicated they were like

Maybe you shall realize the difference between dimension and fact tables.
SCHOOL, GRADES_LIST and SUBJECT are dimension tables.
The rest is either fact tables (STUDENT, YEAR_COL), but containing duplicates instead of measured count or some mix between (STUDENT_GROUP).

Quote:
I am confused now which structure is right.Are my earlier dimension tables which i am describing here or the new dimension tables which i explained above are right.

Both do not contain measures, so I would not call them "right".
Quote:
If what i am describing now is right i mean the dimension tables and the columns are allright then i just need to create a fact table with foreign keys of all the dimension tables.

You may fill it from YEAR_COL, maybe with joining some other tables. And remove a STUDENT column from the fact table and replace it with its count (measure).

[Updated on: Sat, 10 January 2009 10:01]

Report message to a moderator

Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380315 is a reply to message #380314] Sat, 10 January 2009 10:23 Go to previous messageGo to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
I understand that if there is a query like

For each A-level subject, and for each year, give the percentage of students who gained each grade.

Then based on the requested queries we create the dimension tables as the dimension tables are tables which are being used as part of a condition in a query everything else is fact.

so basically the dimension tables would be by looking at the query

Grade Dimension table
===========
g_id primary key to link to fact table a sequence of auto gen no
grade
exam_level
value

as its mentioned each A-level subject and show data by each grade

students dim table
====================
s_id primary key to link to fact table a sequence of auto gen no
upn

for count(upn) the no of students.

year D table
============
y_id primary key to link to fact table a sequence of auto gen no
year

as its asked by each year


similarly the other queries show that
we need

Subject Dimension table
=======================
sb_id pk a sequence of auto generated no as st.seq.nextval
subjectid
subject name

and

School dim table
=================
sc_id a sequence of auto generated no as school_seq.nextval
schoolid
schoolname
...

Fact data
=========
All foreign keys

s_id
g_id
sb_id
sc_id
y_id

But how will the queries return meaningful data if there are no common columns between the tables.
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380316 is a reply to message #380314] Sat, 10 January 2009 11:01 Go to previous messageGo to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
Ahhh i think i understand a little bit , i get to see your point now

Say there are 3 dimension tables

Student
====
s_id pk auto gen sequence
Studentid
Studentname

School
=====
sc_id pk auto gen sequence
school
schoolname

grade
=====
g_id pk auto gen sequence
e_level

fact table
========
s_id fk reference s_id
sc_id fk reference sc_id
g_id fk reference g_id

so when i want to get the data for the query which could be like

get me the no of students who are studying a-level (advanced level) all over the country

then the query will be something like
select 	count(s.upn)
from 	student s
,	grade g
,	fact_table ft
where 	ft.s_id=s.s_id
and	ft.g_id=g.g_id
and	g.e_level='a';


If this is allright,

I kindly need to know how to populate the fact table with the primary keys of the dim tables which become the fkey in fact table without duplication.Because as i explained above in the first post whenever i try to populate the fact table then it produces duplicated keys.

[Updated on: Sat, 10 January 2009 11:07]

Report message to a moderator

Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380317 is a reply to message #380301] Sat, 10 January 2009 11:29 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
http://www.orafaq.com/forum/t/88153/0/
You have not read or followed Posting Guideline as stated in URL above.

Which comes first, then chicken or the egg?
Which comes first, the DIMENSION table or the FACT table?
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380318 is a reply to message #380317] Sat, 10 January 2009 11:38 Go to previous messageGo to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
I dont know about the chicken and egg buddy

but in the case of dimension tables and fact tables

the dimension tables come first then the fact tables.
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380320 is a reply to message #380301] Sat, 10 January 2009 11:45 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
You need to learn about 3rd Normal Form.
You need to be able to identify which student took which course(s) at what school during which year & got what grade(s).
After you can do than then you can build the summarized FACT tables.

Dimensions are tables that contain attributes that describe an object.
A fact is a table that collects quantitative information, such as sales.

The goal behind the fact tables is to precompute values & store them in the fact table so the summarized values do not have to be always computed "on the fly" when generating reports.

Some classic FACT tables are having sales information summarized over time; days, weeks, months, quarters, years, etc.

I do not see any "quantitative information" in your proposed fact table.


[Updated on: Sat, 10 January 2009 12:06]

Report message to a moderator

Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380358 is a reply to message #380320] Sun, 11 January 2009 05:13 Go to previous messageGo to next message
sqlstar_student
Messages: 42
Registered: November 2007
Member
It was my fault that there was nothing in common in the fact table and dimension table, now i got that rectified and the duplication is reduced totally.

I used the primary keys in dimension tables as foreign keys in fact tables,dropped the auto generated sequences i used before as primary keys and used columns in the dimension tables which maily rep that table as its primary key and made them as foreign keys to fact table and mapped references the primary key in the dimension table.

Thank you for all the replies.
Re: How to Populate a fact table from dimension tables.Kindly reply. [message #380387 is a reply to message #380301] Sun, 11 January 2009 10:17 Go to previous message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
accepted & expected practice if for actual solutions to be posted to this forum to assist others who visit after you.

please post the DIMENSION & FACT table DDL here.
Previous Topic: retrieve the least values from three tables and insert them into one table
Next Topic: Help in Pivot Query (merged)
Goto Forum:
  


Current Time: Sat Feb 08 08:01:55 CST 2025