CSC 523, Section U, Spring 2002
FILE & DB SYSTEM


Homework

 

1. Submit your paper-based solutions directly to Mr. Huaichen Yang, Teaching assistant(TA) for CSC 523. Online submissions must be kept in your home directory for CSC523 account which will be checked by the TA at the end of submission deadlines. Make a subdirectory (hw1, hw2 etc) for each homework.

2. Deadline for submission is mentioned against each problem. Late submission may not be accepted without penalty except under prior permission from the instructor/TA.

3. In case of any doubt regarding notation and/or terminologies the text book may be consulted.

4. You may make any assumptions you feel necessary to solve any problem. Any such assumptions should be stated clearly.

5. There are 11 homework problems for the entire semester as listed below.

6.Each problem carries 10 points. The total points obtained out of 110 in all 11 problems will be scaled down to 40 at the time of final grading.

1. [10 points. Deadline : 5 pm of January 29, 2002.]

Consider the following set of requirements for a university database that is used to keep track of students¡¯ transcripts.

(a)    The university keeps track of each student¡¯s name, student number, SSN, current address and phone, birthdate, sex, class, major, minor(if any), and degree program. Both SSN and student number have unique values for each student.

(b)    Each department is described by a name, department code, office number, office phone, and college. Both name and code have unique values for each department.

(c)    Each course has a course name, description, course number, number of semester hours, level, and offering department. The value of the course number is unique for each course.

(d)    Each section has an instructor, semester, year, course, and section number.

(e)    A grade report has a student, section, letter grade, and GPA.

Design an ER schema for this application and draw an ER diagram for that schema. Specify key attributes of each entity type and structural constraints on each relationship type. Note any unspecified requirements, and make appropriate assumptions to make the specifications complete.

2. [10 points. Deadline : 5 pm of February 5, 2002.] Sample made by TA

A PARTS file with Part# as the hash key includes records with the following Part# values : 2369, 3760, 4692, 4871, 2901, 1821, 1074, 7115, 1620, 2428, 3943, 4750, 6975, 3590, 1234, 5555, 7789, 2333, 4441, 9974, 1290, 5551, 9996, 1888, 2777, 3355, 4487, 7854, 5836, 2849. Each bucket is one disk block and holds four records.

(a) The file uses 9 buckets, numbered 0 to 8. Load these records into the file in the given order using the hash function h(K) = K mod 9. Collision resolution scheme is Open Addressing. Calculate the average number of block accesses for a successful retrieval on Part# (assume any of the part numbers is equally likely to be retrieved).

(b) Load the records into expandable hash file using linear hashing. Start with a  hash table of 2 blocks using the hash function h0 = K mod 2. Show the hash table after the insertion of 7789 as well as the final hash table.

3. [10 points. Deadline : 5 pm of February 12, 2002.] Sample made by TA

Consider a disk with block size B = 512 bytes. A block pointer is 6 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. each record has the following fields :

NAME (30 bytes), SSN (9 bytes), DEPARTMENTCODE (9 bytes), ADDRESS (40 bytes),

PHONE (9 bytes), BIRTHDATE (8 bytes), SEX (1 byte), JOBCODE (4 bytes),

SALARY (4 bytes, a real number). An additional byte is used as a deletion marker.

(a) Calculate the record size R in bytes.

(b) Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization.


(c) Suppose the file is ordered by the key field SSN and we want to construct
a primary index on SSN. Calculate (i) index of blocking factor bfri (which is also the index fan-out fo),(ii) the number of first-level index entries and the first-level index blocks, (iii) the number of levels needed if we make it a multilevel index, (iv) the total number of blocks required by the multilevel index, and (v) the number of block accesses needed to search for and retrieve a record from the file given its SSN value using the primary index.

(d) Suppose the file is not ordered by the key field SSN and we want to construct a B-tree access structure (index) on SSN. Calculate (i) the order p of the B-tree (ii) the number of leaf-level blocks needed, (iii) the number of levels needed, (iv) the total number of blocks required by the B-tree, and (v) the number of block accesses needed to search for and retrieve a record from the file - given the SSN value - using the B-tree.

4. [10 points. Deadline : 5 pm of February 21, 2002.]

Write SQL code in oracle to do the following ¨C

(a)    create two tables S(ABCD) and T(PQRS) choosing suitable types for the attributes

(b)    show primary key constraints on both the tables assuming A is the PK of S and PQ the PK of T

(c)    show foreign key constraint assuming R is a foreign key of T that references A of S

(d)    insert ten tuples in each table

5. [10 points. Deadline : 5 pm of February 28, 2002.]Sample made by TA

Consider the AIRLINE relational database schema shown below, which describes a database for airline flight information.

AIRPORT(AIRPORT_CODE,NAME,CITY,STATE)

FLIGHT(NUMBER,AIRLINE,WEEKDAYS)

FLIGHT_LEG(FLIGHT_NUMBER,LEG_NUMBER,DAPARTURE_AIRPORT_CODE,

SCHEDULED_DEPARTURE_TIME,ARRIVAL_AIRPORT_CODE,SCHEDULED_ARRIVAL_TIME)

LEG_INSTANCE(FLIGHT_NUMBER,LEG_NUMBER,DATE,NUMBER_OF_AVAILABLE_SEATS,

AIRPLANE_ID,DEPARTURE_AIRPORT_CODE,DEPARTURE_TIME,ARRIVAL_AIRPORT_CODE,

ARRIVAL_TIME)

FARES(FLIGHT_NUMBER,FARE_CODE,AMOUNT,RESTRICTIONS)

AIRPLANE_TYPE(TYPE_NAME,MAX_SEATS,COMPANY)

CAN_LAND(AIRPLANE_TYPE_NAME,AIRPORT_CODE)

AIRPLANE(AIRPLANE_ID,TOTAL_NUMBER_OF_SEATS,AIRPLANE_TYPE)

SEAT_RESERVATION(FLIGHT_NUMBER,LEG_NUMBER,DATE,SEAT_NUMBER,CUSTOMER_NAME,

CUSTOMER_PHONE)

Each FLIGHT is identified by a flight NUMBER, and consists of one or more FLIGHT_LEGs with LEG_NUMBERS 1,2,3, etc. Each leg has scheduled arrival and departure times and airports and has many LEG_INSTANCEs - one for each DATE on which the flight travels. FAREs are kept for each flight. For each leg instance, SEAT_RESERVATIONs are kept, as are the AIRPLANE used in the leg and the actual arrival and departure times and airports. An AIRPLANE is identified by an AIRPLANE_ID and is of a particular AIRPLANE_TYPE. CAN_LAND relates AIRPLANE_TYPEs to the AIRPORTs in which they can land. An AIRPORT is identified by an AIRPORT_CODE. Specify the following queries in relational algebra:

(a) For each flight, list the flight number, the departure airport for the first leg of the flight, and the arrival airport for the last leg of the flight.

(b) List the flight numbers and weekdays of all flights or flight legs that depart from Houston Intercontinental Airport (airport code 'IAH') and arrive in Los Angeles International Airport (airport code 'LAX').

(c) List the flight number, departure airport code, scheduled departure time, arrival airport code, scheduled arrival time, and weekdays of all flights or flight legs that depart from some airport in the city of Houston and arrive at some airport in the city of Los Angeles.

(d) Retrieve the number of available seats for flight number 'CO197' on '23-MAR-2001'

6. [10 points. Deadline : 5 pm of March 7, 2002.]

Suppose we have the following requirements for a university database that is used to keep track of students' transcripts as discussed in problem 1:

Design a relational database schema for this database application. First show all the functional dependencies that should hold among the attributes. Specify the key attribute of each relation. Note any unspecified requirements, and make appropriate assumptions to make the specification complete.

7. [10 points. Deadline : 5 pm of March 21, 2002.]example made by TA

Consider the AIRLINE relational database schema in problem 5. Write appropriate programs in the Oracle environment to achieve the following :

(a)    Create suitable tables to represent this database schema after incorporating appropriate integrity constraints.

(b)    Load the tables with data records (insert about 5 data records in each table)

(c)    Write queries in SQL to do the following tasks:

1.    For each flight, list the flight number, the departure airport for the first leg of the flight, and the arrival airport for the last leg of the flight.

2.    List the flight numbers and weekdays of all flights or flight legs that depart from Houston Intercontinental Airport (airport code 'IAH') and arrive in Los Angeles International Airport (airport code 'LAX').

3.    List the flight number, departure airport code, scheduled departure time, arrival airport code, scheduled arrival time, and weekdays of all flights or flight legs that depart from some airport in the city of Houston and arrive at some airport in the city of Los Angeles.

4.    List all fare information for flight number 'CO197'.

5.    Retrieve the number of available seats for flight number 'CO197' on '09-OCT-2000'

8. [10 points. Deadline : 5 pm of March 28, 2002.]

Suppose we have the requirements for a university database that is used to keep track of students' transcripts as explained in problem 1 above.

Implement the relational database schema you had designed (as a solution to problem 1)in oracle and insert sample data records into the tables. Write PL/SQL codes and to

1.    List all students who registered for a particular course in a given semester of a given year

2.    List all courses taken by a given students in a given semester and year

3.    Show the frequency counts of grades obtained by the students of a given course in a given semester of a given year

4.    Show the courses taught be a given professor in a given semester of a given year

9. [10 points. Deadline : 5 pm of April 9, 2002.]

Consider the following two sets of functional dependencies: F={A®C, AC®D,E®AD,E®H} and G={A®CD,E®AH}. Check whether F and G are equivalent.

10. [10 points. Deadline : 5 pm of April 16, 2002.]

Consider the relation R={A,B,C,D,E,F,G,H,I,J} and the set of functional dependencies F={AB®C, A®DE,B®F,F®GH,D®IJ}. Find the key(s) of R. Decompose R into 3NF. Is this decomposition in BCNF ?

11. [10 points. Deadline : 5 pm of April 23, 2002.]

Consider the relation R(M,Y,P,L,C) with the set of functional dependencies F={M®L,MY®P,L®C}. Suppose, R is decomposed into R1(MYP) and R2(MLC). Is this decomposition lossless ? Explain why.


-- Last Modified by TA