CSc 340/CSc 515
Database Design Project
Due Dates: See TIMELINE below
This trimester you'll be bringing together all of the concepts you've
been learning about in class by designing and implementing a complete
relational database from the ground up. You will map out all of the
structure and semantics of the enterprise that you are modeling, create
and instantiate tables in a relational DBMS, perform queries on your
database to ensure its functionality, and then present your results to
your fellow classmates in order to convince others of the efficiency of
your design.
The choice of the enterprise
you'll be modeling is up to you, whether it be a small retail business,
library, insurance company, hospital, or whatever. Your choice
must be sufficiently robust (i.e. there must be enough information to
put in a DB) to satisfy the design criteria below. You must make
a 10-minute appointment with me to
explain what enterprise you intend to model with your database so that I
can be sure that you don't intend to model something too small or too
large. No more than two people may model the same enterprise, so if you
have a specific database in mind, it behooves you to "reserve it" by
having me approve it sooner rather than later.
Design Phase
You'll be creating a design portfolio to contain all of the information
pertaining to the design of your database. This portfolio will contain
the following:
- Report on how the enterprise works in the "real world." This
report should be relatively free of database "jargon" and is mainly used
to inform anyone unfamiliar with the specifics of the enterprise.
Examples of such reports are the kind you worked with in Homework 1. The
report should be detailed and specific about all aspects you intend to
model.
- EER diagram based on the design report. The diagram should be
an unambiguous extension of the design report. That is, if you model
something in the EER diagram that requires an assumption not mentioned
in the report, you should go back to the report and add more information
to it so that no assumptions are needed to understand the diagram. The
EER diagram should contain all cardinality constraints, participation
constraints, primary keys, and partial keys. You must model enough of
the enterprise to warrant at least eight entity sets, not
including subclasses.
- Preliminary "first draft" of a relational schema based on the
EER diagram above using the algorithm discussed in class.
- For each relation in your schema, a list of all relevant
functional dependencies which hold upon it. You need not list every
dependency in the closure - just the relevant nontrivial ones.
You should arrive at these FDs by thinking about
the semantic constraints that apply to your enterprise, similar to
what you were doing in Homework 2, Problem 4. You shouldn't just
give me the FDs that hold due to your primary key choices.
- A "final draft" of your relational schema based on the
functional dependencies you listed above. This schema should be in 3NF,
contain only lossless-joins, and be dependency preserving. If you have
good reason to keep a relation in a lower (or higher) normal form, you
may do so provided you supply an accompanying explanation as to your
reasons. As part of this final draft, you must include explanations as
to why each relation is in 3NF.
Vague "handwaving" explanations like "because I
used the algorithm in the book" will not receive full credit.
Implementation Phase
Once your database is designed, you must implement it in a relational DBMS like
mySQL or Microsoft Access. mySQL is freely available for download,
and Access is available throughout Union College facilities.
If you have access to another DBMS like PostgreSQL or Oracle,
you are welcome to use it
providing you can access it remotely
(since you'll be demonstrating it to the rest
of us). Please check with me first if you intend to do this since I
want you to test it before heading too deeply into this project. As
time permits, I'll show some of the features of Microsoft Access in
class, but, in general, answers to questions like "how do I do X in
such-and-such DBMS?" will not be answered in class. Resources abound on the
Web and here at Union for those answers, and, of course, you can always
ask me individually.
You should instantiate your tables with (made-up) data. Your entire
database should have no less than a combined total of 100 tuples across
all relations, and every table must have at least 5 tuples in it.
Make the data as realistic as possible. For example,
for a 1:M relationship between attributes A and B, actually put the same
value in for A for multiple tuples and place different values in
for B. Point deductions will result if your data does not accurately
reflect your design.
Create no less than ten "canned" (saved) queries which demonstrate your
DB in action. These queries should be realistic with respect to what a
user may actually wish to find out from your data. All queries should
be written in SQL. Your set of queries should demonstrate the use of
all of the following:
- Selection
- Projection
- Joins (multiple types, including self-joins, if possible)
- Sorting
- Grouping
- Aggregate functions
- Nested queries
- Calculations
- Wildcards
- Update queries
You will turn in a copy of your implementation either on disk/CD or
over the network (via the
Blackboard dropbox). In
addition, please turn in a copy of your canned queries in both SQL
and English on paper.
Extra Credit Alert
Up to 5 points of extra credit will be given for providing an interface
through which queries can be issued to your database. Examples
of such database "frontends" include forms in Microsoft Access and
Web-based interfaces using the PHP scripting language. Since we
are mainly concerned with the design of the database "backend" in this
course, I won't be explaining the details of database interfaces during
lecture. However, they are becoming increasingly popular, and I
encourage you to explore them independently. Come talk to me
individually if you'd like me to point you to resources to get you
started. No extra credit will be given if you use a "ready-made" front-end
like phpMyAdmin or something
similar. You only get the points if
you build it yourself (and haven't previously built it elsewhere.)
Presentation Phase
Each person will give a 20 minute oral presentation to the class
(the 20 minutes includes time for questions). You
will show the structure of your database, demonstrate some of your
queries (in both SQL and QBE if provided), and defend why you chose the
decomposition that you did.
I (or other audience members) may ask you to issue
queries on the fly during the presentation, so be prepared.
Timeline
Due dates are as follows:
- Thurs, October 5th: Deadline for meeting with me to
approve your project. One point deduction for each weekday missed if
this deadline is not met.
- Thurs, October 19th: Design portfolio due
- Tue, November 7th: Implementations due. Turn in your
queries on paper along with any design changes you made since you turned
in your design portfolio. Submit a copy of the DB on disk/CD or on
Blackboard. First day of
oral presentations. Your implementation is due even if you do
not present today.
- Thurs, November 9th: Second day of oral presentations, if
necessary.
- Warning: I'll be taking attendance during both presentation days.
A 2-point penalty will result from an unexcused absence on either day.
Grading
This project is worth 15% of your total grade (3/4 of a midterm). Your
design portfolio, implementation, and oral report are each worth 5%.
This is a big project so don't put this off to the last minute! Start
early and come see me when you run into stumbling blocks. Some due
dates will overlap with due dates for homework assignments and quizzes
so manage your time accordingly.
CSc 340/515 Homepage