A Relational Storage Mechanism for Implementing Flexible Input Controls

Users sometimes need to enter invalid values or values they are not sure of. Flexible input controls allow invalid data values to be entered, they enable values that the user is not sure of to be marked as such, and enable the user to make explanatory annotations. This article describes a storage technique for relational databases that bridges the gap between the invalid and uncertain values used by flexible input controls and the strict data integrity and validation rules required by relational databases.

1. Introduction

Users sometimes need to enter invalid values or values they are not sure of. Flexible input controls allow invalid data values to be entered and enable values that the user is not sure of to be marked as such, on the understanding that invalid and uncertain data will be corrected in the future, and enable users to make annotations that explain why such values were entered. This article describes a storage technique for relational databases that bridges the gap between the invalid and uncertain values used by flexible input controls and the strict data integrity and validation rules required by relational databases.

2. Flexible Values

To demonstrate the implementation of the storage technique, the examples in this article use a relational database that stores customer order information. Table (a) below shows the CUSTOMER relation that stores the name and telephone number of each customer, each uniquely identified by a customer number. Table (b) shows the ORDERS relation that stores the product number and quantity of each order, each uniquely identified by an order number.

The CUSTOMER Relation
(a)
The ORDERS relation
(b)

The following two scenarios provide an invalid value and a value that the user is not sure of, neither of which can be stored in the relations shown in tables (a) or (b). In the first scenario, the user wants to store the text “Contact John Smith at International Widgets” in the phone number field. The user will enter John Smith’s phone number at a later date when it becomes available. The user also wants to annotate the text with the reason that “the company secretary will call back with Mr. Smith’s phone number.” This string value and its annotation cannot be stored in the relation shown in table (a): the phone number field cannot accept string values and there is no field for storing annotations.

In the second scenario, the user wants to store the account number 319. This is a valid account number but the user is unsure whether it is the correct account to charge the bill to. The user wants to annotate this account number with the comment: “I need to confirm this account number with Marketing.” There is no way to store the user’s uncertainty about the account number in the relation shown in table (b).

One way to mark the user’s uncertainty about a field value is to add an uncertainty attribute for each attribute in the relation. The disadvantage of this method is that the number of attributes in each relation would be doubled, the vast majority of which would never be used. A more efficient method of storing flexible values, one that does not require changing existing relations, is to add new relations to store flexible values and to link them to existing relations.

3. Storing Flexible Values

A new relation called FLEXIBLE_VALUES is created to store the invalid and uncertain values and their annotations. Table (c) shows a FLEXIBLE_VALUES relation that stores the example invalid and uncertain values described in section 2. The ATTRIBUTE_NO attribute stores the attribute number in a relation for which an invalid or uncertain value is to be stored. The first tuple in the FLEXIBLE_VALUES relation stores the invalid telephone number. The value of the ATTRIBUTE_NO attribute is 3 because PHONE_NO is the third attribute of the CUSTOMER relation.

The FLEXIBLE_VALUES Relation
(c)

The VALUE attribute stores the invalid or uncertain value as a string. The ANNOTATION attribute stores an annotation to the string stored in the VALUE attribute. The STATUS field is a marker that records whether the flexible input value is an invalid (1) or an uncertain value (2).

A new relation is created for each existing relation that needs to store flexible values. The new relations map a tuple in an existing relation to a tuple in the FLEXIBLE_VALUES relation. A flexible attribute value is marked in a relation by storing the NULL value and an SQL query retrieves the value of the attribute from the FLEXIBLE_VALUES relation. The following diagram shows the FLEXIBLE_CUSTOMERS relation that maps tuples in the CUSTOMER relation to tuples in the FLEXIBLE_VALUES relation.

The FLEXIBLE_CUSTOMERS relation maps tuples in the CUSTOMER relation to tuples in the FLEXIBLE_VALUES relation

The following SQL query retrieves the invalid value in the CUSTOMER relation. The nested SELECT statement retrieves all the tuples in the FLEXIBLE_CUSTOMERS relation that contain the specified customer number. The outer SELECT statement uses the FLEXIBLE_NO values to retrieve the corresponding tuples from the FLEXIBLE_VALUES relation.

SQL query that retrieves the invalid value in the CUSTOMER relation

The diagram below shows the FLEXIBLE_ORDERS relation that maps tuples in the ORDERS relation to tuples in the FLEXIBLE_VALUES relation.

The FLEXIBLE_ORDERS relation maps tuples in the ORDERS relation to tuples in the FLEXIBLE_VALUES relation

The following SQL query retrieves the uncertain value in the ORDERS relation.

SQL query that retrieves the uncertain value in the ORDERS relation