Re: Advice on SQL and records

From: FRAN <fran_beta_at_hotmail.com>
Date: 22 Aug 2005 17:22:25 -0700
Message-ID: <1124756544.927652.224000_at_f14g2000cwb.googlegroups.com>

David Cressey wrote:
> <fran_beta_at_hotmail.com> wrote in message
> news:1124611376.749896.208510_at_o13g2000cwo.googlegroups.com...
> > I'm teaching in computing. A question appeared in a trial paper
> > recently and I'm trying to establish whether the suggested answer is
> > correct.
> >
> > The question was:
> >
> > SELECT Students.Surname, Students.Name
> > FROM Courses, Classes, StudentClasses, Students
> > WHERE Courses.CourseID="10MA"
> >
> > AND
> > Courses.CourseID=Classes.CourseID
> > AND
> > Classes.ClassID=StudentClasses.ClassID AND
> >
> > StudentClasses.StudentID=STudents.StudentID
> >
> > ORDER BY Students.Surname, Students.Name
> >
> >
> > Which of the tables is likely to contain the most records?
> >
> > The suggested answer from Courses; Classes; StudentClassses and
> > Students was:
> >
> > StudentClasses
> >
> > How would this be determined? Why wouldn't Students contain the most
> > records?
> >
> > Thanks
> >
> > Fran
> >
>
> Assuming that the database contains only data about students that are
> currently taking courses, there will be at least one row in StudentClasses
> for each row in Student. Every student that's taking more than one course
> will have additional rows in StudentClasses. So Student classes has more
> rows.
>
> A similar argument applies to Classes versus StudentClasses. Since every
> class has at least one student, and most have more, there are more rows in
> StudentClasses than in Classes.
>
> There are likely to be more rows in Classes than in Courses, unless there
> are a lot of courses that are currently not being taught.

I'm learning so much here. Thanks everyone. I hope I don't sound like an ignorant moron, but frankly, I'd much prefer to run that risk here, than in front of a class, especially when I might spread my own ignorance about prior to assessment.

Lets say for the sake of argument, that we have a school with 500 students (pretty close to the size of the place where I teach actually).

In this hypothetical school, students take a minimum of five courses and a maximum of 8 courses. In total, there are 40 different courses at the school.

Each course is listed in the StudentCourses table. Each course name is the value of the CourseID field. A foreign key from the Students table called "StudentID" uniquely identifies each student in each course.

All 500 students are listed in the "Students" table. This table has "CourseID" as a foreign key.

The aggregate number of instances of a StudentID (regardless of duplicates) appears in all the courses for which there are records is 3860.

Is it likely that the number of records in StudentCourses is

40
3860
40*500 (ie 20000)
Something else worked out by another method
Insufficient data to determine an ballpark answer

Assuming it's not (a) I'd like the reasoning so I can get it clear in my head.

> A better question might be whether the optimizer can use the (approximate)
> cardinality of the tables, at the time of the query, in addition to other
> cues, to pick the best strategy for performing the query.

What does "cardinality" mean? What's an "optimizer"?

Thanks once again.

Fran Received on Tue Aug 23 2005 - 02:22:25 CEST

This message: [ Message body ]
Next message: x: "Re: dbdebunk 'Quote of Week' comment"
Previous message: David Cressey: "Re: Advice on SQL and records"
In reply to: David Cressey: "Re: Advice on SQL and records"
In reply to David Cressey: "Re: Advice on SQL and records"
Next in thread: Stefan Rybacki: "Re: Advice on SQL and records"
Reply: Stefan Rybacki: "Re: Advice on SQL and records"
Reply: FRAN: "Re: Advice on SQL and records"
Reply: Stefan Rybacki: "Re: Advice on SQL and records"
Reply: David Cressey: "Re: Advice on SQL and records"
Reply: David Cressey: "Re: Advice on SQL and records"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message