Relational Algebra Overview

 

Relational Algebra 

  • read the weekly overview
  • read the lecture notes posted in weekly modules
  • participate in the weekly discussion forums
  • review the supplement documents and example code posted in the weekly modules
  • attempt the weekly practice exercise
  • complete the weekly assignment
  • complete the weekly quiz

 

Highlights and Outcomes This Week

Relational Algebra Concepts

 Unions

Joins

Intersections

Differences

 

Homework

 

  

Readings

Chapter 3 - Review the material from Chapter 3 on Relational Algebra

Read the PDF document on Relational Algebra

Example Exercise

 The example assignment this week is to create the physical database from the ER Diagram and Metadata that you created in previous assignments.  We will review the process in an in class practice assignment that we will do collectively.  This should provide you with a framework of the process. Later, you will get an independent project to convert an ER diagram Schema and Metadata into a physical database on your own.  You will review a document that explains the structure and formats of the CREATE TABLE Statement.  You will then use this knowledge to write the CREATE TABLE Statements that is outlined in your Schema and Metadata.   

Discussions for the Week

 

Week 7 Chapter 6 Discussion 1

Describe the difference between an Outer Join and an Inner Join

 

Week 7 Chapter 6 Discussion 2

Explain the relationship between the (PRODUCT Operation also referred to as the Cartesian Product) and the JOIN ON Operation used to work with two tables in a SELECT statement

 

Database Project

Additionally, this week, you get back to writing database queries.  This will be one of the last single table query assignments that you will have.  After this week you will begin to explore writing queries that reference more than one table. The database assignment this week is to write queries for the sailor boat reservation database.  It is a database with three tables in it.  However, each of your information requests will only require that you use one of the tables in each of your queries.   

Weekly Quiz 

Weekly Quiz for Week 7 on Relational Algebra

 

Important Reminder 

The correct process for responding to discussions is to first post your own reply to the discussion topic.  Other participant’s replies will not be visible until you post your first individual reply.  Second, you must post a brief reply to at least two other students’ replies. This along with a relevant individual reply will give you full credit in discussion assignments

 

To provide additional flexibility for students that do not have their own personal computer resources, I am extending the deadline for assignment submissions from Sundays at midnight to Mondays at midnight. 

 

 Highlights This Week

This week ends the pivotal process of understanding relational modeling and diagramming databases using ER diagrams.  We will continue the effort of reinforcing very important concepts in Relational Database Management. This week,  we will transition away from creating  ER Diagrams and begin to focus on the concepts that make a RDBMS work.

The major focus of this week is understanding some of the more important concepts in Relational Algebra. 

 

 

This Week

 This week we explore Relational Algebra and some of the more important concepts within it. 

We go back to Chapter 3 where it identifies a variety of concepts in Relational Algebra.

 

This week we are transitioning into understanding the foundation concepts on what makes a relational database management system work.  These concepts will also provide insight on writing multiple table queries.  This starts with what appears to be an intimidating topic Relational Algebra.  You will see that the concepts in relational algebra are really easy common sense concepts to understand.  

 

Relational algebra defines a theoretical way of manipulating table contents using relational operators.  When relational algebra operators are used on existing relations, the result set produces new relations.  As we have been discussing, a relation is a fancy word for a table in a database. 

 

Below are the relational operators that we will be using during the second half of the class.  A couple of them we have discussed and used during the first half of the class.  We have discussed and used SQL operators representing the relational operators SELECT and PROJECT.  

 

  • SELECT
  •  UNION
  •  PROJECT
  •  DIFFERENCE
  •  JOIN
  •  PRODUCT
  •  INTERSECT
 
   
   
   

As we discussed, a SELECT Operation determines the number of rows that are retrieved from an information request using a SELECT query.  This can contain all of the rows in the tables of the query or only a subset of rows based on filtering criteria.  A PROJECTION determines the columns that will be displayed in the result set of the information request.  It can contain all of the columns of the tables in the query or it can contain a subset of the columns of the tables in the query. 

 

Chapter three reviewed the concepts of the union, difference and intersection. 

 

In order to perform a UNION Operation, for the set operation to function correctly the relations both tables being combined must be union compatible. The two relations are union compatible if they have the same number of attributes and the data types of each attribute in column order is the same in both tables. 

The table that a query returns is called a result set. The result set table that is returned during the union of two relations is a relation that includes all the tuples that are in both tables.  In conceptual relational algebra, duplicate tuples are eliminated from the result set.  In the actual implementation of SQL the result set may contain duplicates unless the query specifies no duplicates.  When using the reserve word UNION in the SELECT statement, you are combining two tables into one table in the result set.

 

 

When an INTERSECTION is performed on two tables, the result set produced is the intersection of the two tables. The result set is a relation that includes all tuples that match in both tables. In order to perform an INTERSECTION operation, for the set operation to function correctly both tables being intersected must be compatible. The two relations are compatible if they have the same number of attributes and the data types of each attribute in column order is the same in both tables. When implementing an intersection operation, you are finding the rows that are in both tables based on the list of attributes in the SELECT statement   

 

 

 

When a DIFFERENCE is performed on two tables, the result set produced is the DIFFERENCE of the two tables. The result set is a relation that contains all the tuples that are in first table but that are not in the second table. Once again,  in order to perform a DIFFERENCE operation, for the set operation to function correctly both tables in the DIFFERENCE operation must be compatible. The two relations are compatible if they have the same number of attributes and the data types of each attribute in column order is the same in both tables.  When implementing a difference operation, you are finding what is in one table but not in the other, based on the attributes in the select statement

 

 

 

Chapter 3 reviews the PRODUCT Operation.  In the additional notes that I provided this operation is referred as Cartesian Product.  The Cartesian Product is also an operator which works on two sets. It is sometimes called the CROSS PRODUCT operation or the CROSS JOIN operation. 
 A PRODUCT is when you join two tables with no condition specified. It combines the tuples of one relation with all the tuples of the other relation.  It combines the attributes of a row in one table, with the attributes of all the rows in another table. In this context the word combines can be interchanged with the word joins.

 

A Cross Product operation is typically not the end result that one wants when executing a query that uses more than one table.  However, it is important to understand that this is how a RDBMS will process the operation on two tables if there are no constraints placed on the query.  Constrains are placed on a query containing more than one table by using the JOIN operator or the WHERE operator to connect the primary key in one table with the foreign key in another table.

 

The major reason you want to understand the PRODUCT operation is to understand the JOIN operation.  The JOIN operation joins the attributes of two tables.  The JOIN operation is a subset of a PRODUCT operation. In SQL, the JOIN operator is used to combine related rows from two tables.  Rows are related when they have common attributes that match.  In its simplest form a JOIN is just the cross product of two tables.
 A PRODUCT is when you join two tables with no condition specified.  As the join becomes more defined, row are removed within the cross product result set to make the result of the join more meaningful.
The JOIN  operation allows you to evaluate a join condition between the attributes of the two tables on which the join is undertaken.

 

The JOIN operation in SQL involves an equality test, and thus is often described as an Equijoin, see the definition of Equijoin in the section below. The JOIN operation joins the attributes of two tables. Joins result in two attributes in the tables involved in the queries having exactly the same value. The attributes typically are in a foreign key primary key relationship.  An Equijoin will have the duplicate attributes of both tables displayed in the result set.  A `natural join' will remove the duplicate attribute(s) from the result set.

 

Inner Joins only return matched records from the tables that are being joined.  The rows that do not match based on the join statement are not included in the result set.

 

In many RDBM systems a natural join will require that the attributes have the same name to identify the attribute(s) to be used in the join. This may require a renaming mechanism.
If you do use natural joins make sure that the tables do not have two attributes with the same name by accident, this may cause an ambiguity error.  This however, is based on the particular RDBMS that you are using.

 

Outer Joins – as you begin to use joins, you may notice that much of the data is lost when applying a join to two tables. In some cases this lost data might hold useful information. Outer Joins display additionally, the rows that do not match the condition specified in the JOIN ON statement. An outer join retains the information that would have been lost in the filtering process from the tables in a normal join operation.  Outer joins replace missing data of the attributes of the rows that did not match with null values.

 

There are three forms of the outer join, depending on which data is to be kept.

LEFT OUTER JOIN - keep data from the left-hand table RIGHT OUTER JOIN - keep data from the right-hand table FULL OUTER JOIN - keeps data from both tables

Left outer join yields all of the rows in the first table, including those that do not have a matching value in the second table

 

Right outer join yields all of the rows in the second table, including those that do not have matching values in the second table

 

As mentioned earlier, Natural Joins links tables by selecting rows with common values in common attributes (join columns). They show duplicate attributes in the result set.

 

Equijoin - Links tables on the basis of an equality condition that compares specified columns.  It does not show duplicate attributes in the result set.

Theta Join – is when additional criteria is used to filter the result set.

 

Inner Join - Only returns matched records from the tables that are being joined

Outer join - Matched pairs are retained, and any unmatched values in other table are left null

 

 

Homework Assignments This Week

 

 

Readings

Relational Algebra Chapter3 of the Textbook

The Relational Algebra PDF File

 

In Class Example Exercise

 

 

Discussions for the Week

 

Week 7  Discussion 1

Describe the difference between an Outer Join and an Inner Join

 

Week 7  Discussion 2

Explain the relationship between the (PRODUCT Operation also referred to as the Cartesian Product) and the JOIN ON Operation used to work with two tables in a SELECT statement.


 

Database Project

 

 

Weekly Quiz 

Complete Week 7 Quiz Online