Relational Algebra Overview

 

Relational Algebra 

  • read the weekly overview
  • read the lecture notes posted in weekly modules
  • participate in the weekly discussion forums
  • review the supplement documents and example code posted in the weekly modules
  • attempt the weekly practice exercise
  • complete the weekly assignment
  • complete the weekly quiz

     



    CIS 205 Week 7 Announcements and Assignment

    Hi all, this is week 7 and the halfway point in our 14 week semester. This is the week before Spring Break at CCP.  Last week we closed out our lecture on normalization and began to study and discuss concepts in Relational Algebra. We also began to introduce concepts in Relational Algebra and Set Theory. This week, we will continue our lecture on Relational Algebra and we will also begin to practice implementing a physical relational database from its conceptual model by writing SQL statements for creating the database and tables.  We will begin this process on Wednesday.  This will be a brief introduction.  We will begin to study it in full detail the week we return from Spring Break. 

     

    I recently graded and released feedback on most outstanding assignments that were not processed.  I will attempt to review the most serious concerns during our lectures on during week 7.  If you have any remaining questions concerning the design processes or writing single table queries, week seven will be the best time to ask.  During week seven the class will begin transitioning from the design process to the implementation and use of Relational Databases from their conceptual design.

     

    Part of this week’s in class activities will focus on understanding the structure of the CREATE TABLE statement in SQL.  Additionally, the class continues getting applied practice in these concepts and skills.   The class will get practice using the CREATE TABLE Statement with our continuing using one of the in class practice database projects. 

     

     

    To prepare for using the CREATE TABLE Statement, students should read the PDF document on creating tables.  Additionally, the online tutorial links on creating tables should be studied.

     

    When we get back from Spring Break, in addition to continuing to study the use of applying Relational Algebra to real world problems, and continue studying and use the CREATE TABLE Statement, the class will learn how to insert test data into a database.  In the modern relational database management system that we are using, MySQL, you will begin to write the INSERT INTO statements that will populate our in class practice database with example test data.  In addition to using the INSERT INTO Statement to populate rows one at a time, we will learn to use the SQL LOAD Statement to load multiple rows of data from comma delimited text files.

     

    Later we will discuss how to plan queries to extract information from the test data.  During the process of creating tables and populating them with test data, you must follow the rules of referential integrity to maintain the correctness of the data that you are inserting.  This will be inserting the data for the primary table first and then inserting corresponding data into the foreign key tables. Of course this also means creating the primary key tables before creating the foreign key tables.

     

     Relational Algebra and Create Table Overview

    • read the weekly overview
    • read the lecture notes posted in weekly modules
    • participate in the weekly discussion forums
    • review the supplement documents and example code posted in the weekly modules
    • attempt the weekly practice exercise
    • complete the weekly assignment
    • complete the weekly quiz

     

    Highlights and Outcomes This Week

    Relational Algebra Concepts: In order to implement a DBMS, there must exist a set of rules which state how the database system will behave.  The concepts in Relational Algebra are the rules that DBMS systems follow when processing queries on tables.  Tables behave has sets while the DBMS performs set operations on the tables.  It is very useful to understand these set operations when writing queries.  It provides insight on what to expect in the result sets returned by the DBMS in response to the query.

     

    Projections:  As discussed the PROJECT operation is used to select a subset of the attributes of a relation by specifying the names of the required attributes.  You project with the SELECT statement.  The attributes that you list after the SELECT statement is the subset of attributes that you want from the entire set.  The entire set is all of the attributes in the table.

     

    Selections:  As discussed the SELECT operation is used to obtain a subset of the tuples/rows of a relation/table that satisfy a select condition.

    For example: find all employees born after August 29, 1961

     

     

    Unions: As discussed the union of two relations/tables is a relation/result set that includes all the tuples/rows that are both in table 1 or in table 2 or in both table 1 and table 2. In the theoretical concept of the Union Operation Duplicate tuples are eliminated. In the implemented DBMS you can choose to eliminate the duplicates or keep the duplicates in the result set.

     

    Joins:  As discussed, the JOIN Operation is used to combine related tuples/rows from two relations/tables in a query.  As the join matches to attributes that have equal values, tuples are removed from the result set making the data and information in the result set more meaningful.  The JOIN involves an equality test, and thus is often described as an Equi-Join. Equi-Join result in two attributes in the resulting relation/(result set) having exactly the same value. A `Natural Join' operation will remove the duplicate attribute(s).

     

    Intersections:  As discussed, the intersection of table 1 and table 2 is a relation/result set that includes all tuples/rows that are both in table 1 and table 2.  The rows that are identified as matching are determined by the attributes listed in the projection / (attribute list) specified in the SELECT Statements.  

     

    Differences:  As discussed, the difference of table 1 and table 2 is the relation/ (result set) that contains all the tuples/rows that are in table 1 but that are not in table 2. Once again, the rows that are identified as matching or not matching are determined by the attributes listed in the projection / (attribute list) specified in the SELECT Statements.  

     

     

    OUTER JOINs

    You may have noticed that data is lost when applying a join to two relations / tables. Only the rows that match by the keys being compared are in the result set.  In some cases this lost data might hold useful information.  An Outer Join Operation retains the information that would have been lost from the result set table, replacing missing data with nulls.

    There are three forms of the outer join, depending on which data is to be kept.

    LEFT OUTER JOIN - keep data from the left-hand table

    RIGHT OUTER JOIN - keep data from the right-hand table

    FULL OUTER JOIN - keep data from both tables

    The left table is always the first table listed in the query.  The right table is always the second table listed in the query.

     

    NEW SQL Commands this Week

                CEATE DATABASE:  Creates the Physical and Logical Database Container for the DBMS to access and process the tables.  You must create and use a database before creating the tables for it

                USE Database Name: Open a specific Database container for access and processing

    CREATE TABLE Name:  Create a table using the name specified

    DROP DATABASE Name:  Deletes a database using the specified name. Use with caution.  Dropping a database also drops all the tables and items in the database container. 

    DROP TABLE Name:  Deletes a specific table in a database by name

    SHOW DATABASES:  Displays all of the Database Containers the DBMS has access to

    SHOW TABLES:  Displays all of the tables within the current database container in use.

    DESCRIBE TABLE:  Displays metadata about a specific named table

     

    Important Reminder 

    The correct process for responding to discussions is to first post your own reply to the discussion topic.  Other participant’s replies will not be visible until you post your first individual reply.  Second, you must post a brief reply to at least two other students’ replies. This along with a relevant individual reply will give you full credit in discussion assignments

      

     Highlights This Week

    This week ends the pivotal process of understanding relational database modeling and diagramming databases using ER diagrams.  We will continue the effort of reinforcing very important concepts in Relational Database Management. This week, starts a transition away from creating ER Diagrams and begin to focus more heavily on the concepts that make a RDBMS work.  This week also begins to focus on creating the physical database, its tables and specifying the relationships between the tables to the RDBMS.

     

    Another major focus of this week is understanding and reinforcing some of the more important concepts in Relational Algebra. 

     

    This Week

     This week we continue exploring Relational Algebra and Set Theory and some of the more important concepts within it.  Relational Algebra is sometimes referred to as or called set operations.

     

    As mentioned, this week we are transitioning into understanding the foundation concepts on what makes a relational database management system work.  These concepts will also provide insight on writing multiple table queries.  This starts with what appears to be an intimidating topic Relational Algebra.  You will see that the concepts in relational algebra are really easy common sense concepts in set theory that ae easy to understand.  

     

    Relational algebra defines a theoretical way of manipulating table contents using relational operators.  When relational algebra operators are used on existing relations, the result set produces new relations.  As we have been discussing, a relation is a fancy word for a table in a database. 

     

    Below are the relational operators that we will be using during the second half of the class.  A couple of them we have discussed and used during the first half of the class.  We have discussed and used SQL operators representing the relational operators SELECT and PROJECT.  

     

    The section above describes there terms in more detail 

    ·         SELECT

    ·          UNION

    ·          PROJECT

    ·          DIFFERENCE

    ·          JOIN

    ·          PRODUCT

    ·          INTERSECT

     

     

    As we discussed, a SELECT Operation determines the number of rows that are retrieved from an information request using a SELECT query.  This can contain all of the rows in the tables of the query or only a subset of rows based on filtering criteria.  A PROJECTION determines the columns that will be displayed in the result set of the information request.  It can contain all of the columns of the tables in the query or it can contain a subset of the columns of the tables in the query. 

     

    Chapter five reviews the concepts of the union, difference and intersection. 

     

    In order to perform a UNION Operation, for the set operation to function correctly the relations both tables being combined must be union compatible. The two relations are union compatible if they have the same number of attributes and the data types of each attribute in column order is the same in both tables.

     

    The table that a query returns is called a result set. The result set table that is returned during the union of two relations is a relation that includes all the tuples that are in both tables.  In conceptual relational algebra, duplicate tuples are eliminated from the result set.  In the actual implementation of SQL the result set may contain duplicates unless the query specifies no duplicates.  When using the reserve word UNION in the SELECT statement, you are combining two tables into one table in the result set.

      

    When an INTERSECTION is performed on two tables, the result set produced is the intersection of the two tables. The result set is a relation that includes all tuples that match in both tables. In order to perform an INTERSECTION operation, for the set operation to function correctly both tables being intersected must be compatible. The two relations are compatible if they have the same number of attributes and the data types of each attribute in column order is the same in both tables. When implementing an intersection operation, you are finding the rows that are in both tables based on the list of attributes in the SELECT statement   

     

    When a DIFFERENCE is performed on two tables, the result set produced is the DIFFERENCE of the two tables. The result set is a relation that contains all the tuples that are in first table but that are not in the second table. Once again,  in order to perform a DIFFERENCE operation, for the set operation to function correctly both tables in the DIFFERENCE operation must be compatible. The two relations are compatible if they have the same number of attributes and the data types of each attribute in column order is the same in both tables.  When implementing a difference operation, you are finding what is in one table but not in the other, based on the attributes in the select statement

      

     PRODUCT Operations:  In the additional notes that I provided, this operation is referred as Cartesian Product.  The Cartesian Product is also an operator which works on two sets. It is sometimes called the CROSS PRODUCT operation or the CROSS JOIN operation. 
 A PRODUCT is when you join two tables with no condition specified. It combines the tuples of one relation with all the tuples of the other relation.  It combines the attributes of a row in one table, with the attributes of all the rows in another table. In this context the word combines can be interchanged with the word joins. This is typically not the result set that database query person wants as a result set.

     

    Cross Product operation is typically not the end result that one wants when executing a query that uses more than one table.  However, it is important to understand that this is how a RDBMS will process the operation on two tables if there are no constraints placed on the query.  Constrains are placed on a query containing more than one table by using the JOIN operator or the WHERE operator to connect the primary key in one table with the foreign key in another table.

     

    The major reason you want to understand the PRODUCT operation is to understand the JOIN operation.  The JOIN operation joins the attributes of two tables.  The JOIN operation is a subset of a PRODUCT operation. In SQL, the JOIN operator is used to combine related rows from two tables.  Rows are related when they have common attributes that match.  In its simplest form a JOIN is just the cross product of two tables.
 A PRODUCT is when you join two tables with no condition specified.  As the join becomes more defined, row are removed within the cross product result set to make the result of the join more meaningful.
The JOIN  operation allows you to evaluate a join condition between the attributes of the two tables on which the join is undertaken.

     

    The JOIN operation in SQL involves an equality test, and thus is often described as an Equijoin, see the definition of Equijoin in the section below. The JOIN operation joins the attributes of two tables. Joins result in two attributes in the tables involved in the queries having exactly the same value. The attributes typically are in a foreign key primary key relationship.  An Equijoin will have the duplicate attributes of both tables displayed in the result set.  A `natural join' will remove the duplicate attribute(s) from the result set.

     

    Inner Joins only return matched records from the tables that are being joined.  The rows that do not match based on the JOIN statement are not included in the result set.

     

    Natural Joins: In many RDBM systems a natural join will require that the attributes have the same name to identify the attribute(s) to be used in the join. This may require a renaming mechanism.
If you do use natural joins make sure that the tables do not have two attributes with the same name by accident, this may cause an ambiguity error.  This however, is based on the particular RDBMS that you are using.

     

    Outer Joins – as you begin to use joins, you may notice that much of the data is lost when applying a join to two tables. In some cases this lost data might hold useful information. Outer Joins display additionally, the rows that do not match the condition specified in the JOIN ON statement. An outer join retains the information that would have been lost in the filtering process from the tables in a normal join operation.  Outer joins replace missing data of the attributes of the rows that did not match with null values.

     

    There are three forms of the outer join, depending on which data is to be kept.

    LEFT OUTER JOIN - keep data from the left-hand table RIGHT OUTER JOIN - keep data from the right-hand table FULL OUTER JOIN - keeps data from both tables

    Left outer join yields all of the rows in the first table, including those that do not have a matching value in the second table

     

    Right outer join yields all of the rows in the second table, including those that do not have matching values in the second table

     

    As mentioned earlier, Natural Joins links tables by selecting rows with common values in common attributes (join columns). They show duplicate attributes in the result set.

     

    Equijoin - Links tables on the basis of an equality condition that compares specified columns.  It does not show duplicate attributes in the result set.

    Theta Join – is when additional criteria is used to filter the result set.

     

    Inner Join - Only returns matched records from the tables that are being joined

    Outer join - Matched pairs are retained, and any unmatched values in other table are left null

     

     

     

    Homework Assignments This Week

    Homework Assignment

                Write the CREATE TABLE Statements for the Relational Schema and Metadata for the Movie Rental Database

                Relational Algebra Assignment

      

     

    Readings

    Start reading chapter 5.  It has additional information and illustrations on the SQL Reserve Words that we are exploring and using this week. Continue reviewing the material on Relational Algebra in your textbook.  This can be found in Chapter 5.  It is the section on SET Theory and operations

    Read the PDF document on Relational Algebra.  We will review this document in class this week.

     

     

    Example Exercise

     The example assignment this week is to create the physical database from the Little League Football Commission’s ER Diagram, Relational Schema and Metadata that you created in previous assignments.  We will review the process in an in class practice assignment that we will do collectively.  This should provide you with a framework of the process. You will get an independent project to convert an ER diagram Schema and Metadata into a physical database on your own.  You should review the document that explains the structure and formats of the CREATE TABLE Statement.  You will then use this knowledge to write the CREATE TABLE Statements that is outlined in your Schema and Metadata.   

    Discussions for the Week

     

    Week 7 Discussion 1

                In your own words, describe the difference between an Outer Join and an Inner Join.  Paraphrase your replies.  Do not directly copy and paste your replies. If you want to copy and paste some Web Information. Site it in your response 

    Week 7 Discussion 2

                Chapter 3 discusses the concept of the Cross PRODUCT operation.  It is also illustrated in this week's PDF file reading as a Cartesian Product.  In your own words, describe what you think the relation ship between a PRODUCT Operation and an SQL Join Operation

    Paraphrase your replies.  Do not directly copy and paste your replies. If you want to copy and paste some Web Information. Site it in your response 

     

     

     

    Database Project

    The Database Project is the CREATE TABLE Statement assignment.             

      

     

    Weekly Quiz 

    Relational Algebra Quiz

     

     

    Have a Happy and Restful Spring Break  !!!!