next up previous contents index
Next: One-to-Many Joins Up: Joining Tables Previous: Additional Join Possibilities

  
Choosing a Join Key

The join key is the value used to link rows between tables. For example, in Figure [*], 648 is the customer key, appearing in the customer table to uniquely identify the row, and in the salesorder table to refer to that specific customer row.

Some people might question whether an identification number is needed. Should the customer name be used as a join key? Using it as the join key is not a good idea for several reasons:

In the statename table, the two-letter state code is probably a good join key for the following reasons:

Essentially, two choices for join keys exist: identification numbers and short character codes. If an item is referenced repeatedly, it is best to use a short character code as a join key. You can display this key to users and allow them to refer to customers and employees using codes. Users prefer to identify items by short, fixed-length character codes containing numbers and letters. For example, customers might be identified by six-character codes (FLE001), employees by their initials (BAW), and parts by five-character codes (E7245). Codes are easy to use and remember. In many cases, users can choose the codes, as long as they are unique.

It is possible to allow users to enter short character codes and still use identification numbers as join keys. Adding a code column to the table accomplishes this goal. For the customer table, a new column called code can be added to hold the customer code. When the user enters a customer code, the query can find the customer_id assigned to the customer code, then use that customer_id in joins with other tables. Figure [*] shows a query using a customer code to find all order numbers for that customer.  

        SELECT order_id
        FROM   customer, salesorder
        WHERE  customer.code = 'FLE001' AND
               customer.customer_id = salesorder.customer_id;
 

In some cases, identification numbers work well and codes are unnecessary, as in the following cases:

Defining codes for such values would be useless. It is better to allow the database to assign a unique number to each item. Chapter [*] discusses database support for assigning unique identifiers.

No universal rule dictates when you should choose codes or identification numbers. U.S. states are clearly better keyed on codes, because only 50 exist. The resulting codes are short, unique, and well known by most users. At the other extreme, order numbers are best used without codes because too many of them are possible and codes would be of little use. 


next up previous contents index
Next: One-to-Many Joins Up: Joining Tables Previous: Additional Join Possibilities
Bruce Momjian
2001-05-09