Advanced Data Modeling Overview
Week 5 Overview
This week we continue studying and applying the processes in designing relational databases. Last week we continued to study modeling databases using professional modeling languages. The two languages we used were the Chen modeling language and the Crows Feet modeling language. There were in class and take home assignments to provide practice in both. This week, we will continue to use the Chen Diagram that we made in class to learn about translating the ER Diagram into the Relational Schema. The relational schema is a translated mapping of the tables that will be created in the database. We will translate the ER Diagram into the Relational Schema using a couple of rules that we will review and illustrate in class this week.
Once the relational schema is created from the ER Diagram, the next step is to create metadata documentation using the relational schema as a guide. The metadata documentation will describe the tables that are to be created. The description will include a list of all the table names to be created, and a list of the names of each tables attributes. For each attribute a list of descriptive information should be provided. For us the descriptive information will include the data type of the attribute, the range of allowable values that the attribute can contain, a true or false indicator for primary key, a true or false indicator for foreign key, a true false indicator for null, and a brief description of the role the attribute plays in the table.
So that is two phases in the design process that we will begin to cover this week, the translation of the ER Diagram into a Relational Schema, and the creation of the metadata from the Relation Schema. You provide applied practice, we will do a practice exercise in class covering both activities
Week 5 Chapter 3 & 4
Relational Models and ER Modeling
To be successful you should follow these tasks weekly
- read the weekly overview
- read the lecture notes posted in weekly modules
- participate in the weekly discussion forums
- review the supplement documents and example code posted in the weekly modules
- attempt the weekly practice exercise
- complete the weekly assignment
- complete the weekly quiz
Important Reminder
The correct process for responding to discussions is to first post your own reply to the discussion topic. Other participant’s replies will not be visible until you post your first individual reply. Second, you must post a brief reply to at least two other students’ replies. This along with a relevant individual reply will give you full credit in discussion assignments
Assignment Extensions
Highlights This Week
One of the major focus of this week is adding more semantic constructs and clarity to the original entity relationship (ER) model that we reviewed in weeks three and four. This week I will lecture and illustrate extending this clarity and using additional constructs called an Enhanced or Extended EER diagram (EERD).
We will compare the Entity super type against the Entity subtype
The Entity super type is a generic entity type related to one or more entity subtypes that contains common characteristics of the super type
The Entity subtype contains unique characteristics of the entity subtype that is not contained in the Entity super type.
Both super types and subtypes depicts arrangement of higher-level entity super types and lower-level entity subtypes
An Entity Specialization Relationships are described in terms of “IS-A” relationships. Entity Subtype exists only within context of super type. In the specialization hierarchy, every subtype has only one super type to which it is directly related to. A database can have many levels of super type/subtype relationships
The hierarchy structure enables entity subtype to inherit attributes and relationships of its associated super type. In this structure all entity subtypes inherit their primary key attribute from their super type entity. At the implementation level, super type entities and its subtype(s) maintain a 1:1 relationship. Entity subtypes inherit all relationships in which super type entity participates in. Lower-level subtypes inherit all attributes and relationships from all upper-level super types
The chapter describes disjoint subtypes and Overlapping subtypes. A disjoint subtype is a entity subtype that has its own unique attributes. An overlapping subtype is an entity subtype that contains attributes of the super type.
Disjoint subtypes is also called non-overlapping subtypes it is a subtypes that contain unique subset of super type entity set
Overlapping subtypes are subtypes that contain non unique subsets of super type entity set
Comparing Specialization and Generalization, specialization, identifies more specific entity subtypes from higher-level entity super type. Specialization is a top-down process. It is based on grouping unique characteristics and relationships of the Super types into common sub-types.
In contrast, generalization identifies more generic entity super type from lower-level entity subtypes. It is a bottom-up process. It is based on grouping common characteristics and relationships of the subtypes
This week continues the pivotal process of understanding the relational model and diagramming databases using ER diagrams. We continue the effort of reinforcing very important concepts in Relational Database Management. We will continue to use the Entity Relationship Modeling Language to illustrate designs for relational databases. In order to create these designs we must understand. Last week we introduced the Chen diagraming model that primarily uses rectangles, diamonds, ovals and lines to convey the design of a relational database. Within the context of that we reviewed some key concepts and terms
Required attribute: must have a value
Optional attribute: may be left empty
Domain of an Attribute: is a set of possible values for that attribute
Unique Identifiers: one or more attributes that uniquely identify each entity instance, this translates to a primary key of a table
Composite Identifier: primary key composed of more than one attribute
Derived Attribute: value may be calculated from other attributes this type of attribute does not have to be physically stored within database
Composite attribute can be subdivided
Simple or atomic attributes cannot be subdivided
Single-value attribute can have only a single value
Multivalued attributes can have many values
We also introduced the concept of the Relationship in a Database. We defined a relationship as being an association between entities. The participants in the relationship are the entities that participate in the relationship. Relationships between entities always operate in both directions • Relationship can be classified as 1:1, 1:M and M:N.
We also introduced the concept of Cardinality. Cardinality expresses minimum and maximum number of entity occurrences associated with one occurrence of related entity. Cardinality is established by very concise statements known as business rules
Additional Concepts that we will continue to reinforce
- Primary Keys
Primary Keys
We are beginning to start working on the process of creating ER diagrams and translating those diagrams into a relational schema and metadata that will ultimately be used to create a database. In an ER diagram of a data model, one or more unique identifiers may be declared for each data entity to uniquely identify an instance in the entity. Typically, entities in a relational model will become tables in the database. The set of unique attributes declared for a data entity is often referred to as the candidate keys for that data entity. Thinks like social security number, your driver’s license number. Your school identification number is an example of attributes that can uniquely identify an instance of you in a table. These can be considered as candidate keys that may become a primary key. From the set of candidate keys, a single unique key is selected and declared the primary key for that data entity. Each unique key may be composed from one or more data attributes of that data entity. In an entity relationship diagram, each entity relationship uses a unique key, most often the primary key, of one data entity and copies the unique key data attributes to another data entity to which it relates. This inheritance of the unique key data attributes is referred to as a foreign key and is used to provide data access paths between data entities. Once the data model is instantiated into a database, each data entity usually becomes a database table, unique keys become unique indexes associated with their assigned database tables, and entity relationships become foreign key constraints. The primary key of a relational table uniquely identifies each record in the table. Primary keys must contain unique values. A primary key column cannot contain NULL values. Each table should have a primary key, and each table can only have one primary key
- Foreign Key
Foreign Keys
We are studying relational database management systems (RDBMS). In the context of relational databases, a foreign key is a referential constraint between two tables. A foreign key is a field in a relational table that matches a unique key of another table. Usually this is the primary key in another table. The foreign key can be used as a bridge to cross-reference tables.
For example, say we have two tables, a CUSTOMER table that includes all customer data for cleaner service, and an ORDER table that includes all customer clothes that they handed over to e processed and cleaned.. The intention here is that all orders must be associated with a customer that is already in the CUSTOMER table. To do this, we will place a foreign key in the ORDER table and have it relate to the primary key of the CUSTOMER table.
The foreign key identifies a column or set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The columns in the referencing table must reference the columns of the primary key or other key in the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table. Thus, a row in the foreign table cannot contain values that don't exist in the primary table. This way references can be made to link information together when writing and executing queries. This is an essential part of database organization and normalization. Multiple rows in the foreign key table may refer to the same row in the primary key table. Most of the time, it reflects the one (parent table or primary key table) to the many (child table, or foreign table)
- Functional Dependency
Functional Dependency
By definition, a functional dependency is a constraint between two sets of attributes in a relation from a database. The "function" being discussed in "functional dependency" is the function of identification of related information. The determination of functional dependencies is an important part of designing databases in the relational model. A functional dependency occurs when one attribute uniquely determines another attribute in a relation or one attribute in a relation uniquely determines another attribute in another relation. This can be written A -> B which would be the same as stating "B is functionally dependent upon A."
Examples of functional dependency:
In a table listing employee attributes including Social Security Number (SSN) and name, it can be said that name is functionally dependent upon SSN (or SSN -> name) because an employee's name can be uniquely determined from their SSN. However, the reverse statement (name -> SSN) is not true because more than one employee can have the same name but different SSNs. an employee name cannot distinguish the SSN because more than one employee could have the same name.
- Referential Integrity
The very important concept of Referential Integrity is covered in this chapter.
Referential integrity is a property of data in a relational database which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table). This means that when there is a primary and foreign key cross reference system, the primary key record must exist before the foreign key record can be inserted into the foreign key table. We will discuss this during lecture this week.
For referential integrity to hold in a relational database, any field in a table that is declared a foreign key can contain only values from a parent table's primary key or a candidate key. For instance, deleting a record that contains a value referred to by a foreign key in another table would break referential integrity. This means that you cannot delete a primary key record while there are still records in the foreign key table point to the primary key row.
Some relational database management systems (RDBMS) can enforce referential integrity, normally either by deleting the foreign key rows as well to maintain integrity, or by returning an error and not performing the delete. When the database deletes the foreign key records automatically when the primary key is removed, the process is called a cascade delete. This is not encouraged for the beginner working with databases. Typically a database will not perform a cascade deleted unless explicitly given the command. Usually the delete will not be executed if the system detects a referential integrity constraint violation.
- Cardinality
In data modeling, the cardinality of one data table with respect to another data table is a critical aspect of database design. Relationships between data tables define cardinality when explaining how each table links to another. In the relational model, tables can be related as any of: many-to-many, many-to-one (rev. one-to-many), or one-to-one. These particular aspects are said to be the cardinality of a given table in relation to another.
Here are a few examples. Example 1, consider a database designed to keep track of hospital records. Such a database could have many tables like:
- a Doctor table full of doctor information
- a Patient table with patient information
- a Department table with an entry for each department of the hospital.
In that model:
- There is a many-to-many relationship between the records in the doctor table and records in the patient table (Doctors have many patients, and a patient could have several doctors); In M:N relationships a mapping/translation table is created that maps the primary keys of the two entities involved in the M:M relationship. This creates two 1 to many relationships towards the mapping table.
- There is a one-to-many relation between the department table and the doctor table (each doctor works for one department, but one department could have many doctors). . When there is a one to one relationship, one typically leaves it intact. It is the desirable cardinality in a relational database
- A one-to-one relationship is mostly used to split a table in two in order to optimize access or limit the visibility of some information. In the hospital example, such a relationship could be used to keep apart doctors' personal or administrative information. In data modeling, collections of data elements are grouped into data tables. Each entity has a collection of data elements. In the stable translation method, the entities and relationships are translated into tables. The data tables contain groups of data known as attributes. Tables are linked by a cross referencing system using key fields. A primary key attribute that uniquely identifies an instance in the table assigns that field's special order to a table: for example, the Doctor ID field might be assigned as the primary key of the Doctor table. Primary Keys are supposed to be unique. People can have same last name. A table can also have a foreign key which indicates that that field is linked to the primary key of another table. When there is a one to one relationship, one of two actions typically occur. The first is to combine the two tables into a single table. The second is to keep the two tables separated. When this action is taken one table becomes the primary key table and the other becomes the foreign key table. The most frequently accessed table is typically assigned as the primary key table in the relationship
Finally, this chapter revisits the two main concepts that we have been covering since chapter 1, the concepts of the primary key and the foreign key.
Within cardinality there is a concept known as existence dependence. This means that an entity instance exists in database only when it is associated with another related entity occurrence. An example of this is that a spouse entry in a spouse entity of employee database cannot exist without the primary employee instance first being entered
Existence independence is when an entity can exist apart from one or more related entities, sometimes such an entity is referred to as a strong or regular entity. This type of entity is
Chapter 3 was a very comprehensive chapter that contained a significant number of concepts and skills that has to be learned. We will continue going over these skills and concepts this week. We will continue exploring relational database modeling in more depth. Once again this week, we will continue fortifying our understanding of writing sing table queries in SQL. We will review correct queries for last week’s information requests on the Agent Table. Over the next several weeks we will begin to add additional modifiers to the single table SELECT query statements to add additional criteria to the selection process when the query is executing. This week we will not have a single query assignment. We will continue focusing on creating ER diagrams and translating them to relational schemas a Meta Data.
Your registration into Microsoft Imagine has completed. If you have not received your email welcoming you to Microsoft Imagine, check your junk folder. If it is not in your junk folder, notify me with your CCP email address included in the message. You can also navigate to the home page of Microsoft Imagine and request that Microsoft resend you an email with your login information.
In class this week, I will continue illustrating the use of Microsoft Visio to create ER Diagrams. Additionally, I will also demonstrate Microsoft Word as a tool to make less sophisticated Chen models and Relational Schema if are working from home and for some reason we are not able to get Visio working at home. Finally, I will touch on the concept of Enhanced or Extended ER Diagrams (EERD). This is ER Diagramming incorporating concepts in Object Oriented Development into the ER Model. Primarily, we will discuss the concept of inheritance in object oriented development as it relates to ER Diagramming. When ER Diagrams incorporate Inheritance, they use it through the concepts of generalization and specialization. We will briefly discuss this in class
Here are a few additional concepts to know when diagramming relational models
Relationships can have degrees. The degree of a relationship indicates number of entities or participants associated with a relationship.
There can be a unary relationship. This is when the association or relationship is maintained within single entity. In lecture it was discussed that if there is an employee entity that a supervisor can supervise other employees in the employee entity.
There can be Binary Relationships. This is when two entities are associated. In lectured it was discussed that there can be a relationship between an employee and the department that the employee is in. The example that a customer can have many accounts was also discussed. This is the relationship between the entity customer and the entity account.
There can be Ternary Relationship. This is when three or more entities are associated. In lecture it was discussed that there can be a relationship between a Project Entity, a Supplier Entity, and a Supplies Entity. In practice however, it is preferred to have many binary relationships
The textbook discusses Associative (Composite) Entities. These are more commonly known as bridge entities or mapping tables. These are used to implement M:N relationships between two entities. Typically the primary key of his entity is composed of the primary keys of each of the
entities to be connected. This is a composite key. Typically, it also contains additional descriptive attributes that play no role in connective process between the two entities.
In conclusion if referential integrity is being enforced in a database system, a primary key row cannot be removed from a table in referential key rows from another tables are associated with the primary key record. Additionally, a foreign key row cannot be inserted into a foreign key table is the primary key row is not inserted into the primary key table first. In the context of our discussions, a customer cannot be deleted from the cleaner’s database while the customer has material not picked up. A person accepting articles of clothing from a customer cannot input the each article of clothing for the customer, if the customer’s information has not been entered first. Design is of little value unless it delivers all specified query and reporting requirements. Some design and implementation problems do
not yield “clean” solutions.
Homework Assignments This Week
Translation of ER diagram to Relational Schemas and Meta Data
Readings
Finish reviewing and studying Chapter 3
Begin to study Chapter 4 in the Textbook
Example Exercise
I Translation to Relational Schema
Discussions for the Week
Week 5 Discussion 1
In your own words, describe your interpretation of the generalization and specialization hierarchy?
Week 5 Discussion 2
A conversation about Surrogate Keys
Database Project
Practice: Pizza Delivery Company
Homework Assignment
Weekly Quiz
Complete Week 5 Quiz Online