A Relational Storage Mechanism for Implementing Flexible Input Controls
Users sometimes need to enter invalid values or values they are not sure of. Flexible input controls allow invalid data values to be entered, they enable values that the user is not sure of to be marked as such, and enable the user to make explanatory annotations. This article describes a storage technique for relational databases that bridges the gap between the invalid and uncertain values used by flexible input controls and the strict data integrity and validation rules required by relational databases.
1. Introduction
Users sometimes need to enter invalid values or values they are not sure of. Flexible input controls allow invalid data values to be entered and enable values that the user is not sure of to be marked as such, on the understanding that invalid and uncertain data will be corrected in the future, and enable users to make annotations that explain why such values were entered. This article describes a storage technique for relational databases that bridges the gap between the invalid and uncertain values used by flexible input controls and the strict data integrity and validation rules required by relational databases.
2. Flexible Values
To demonstrate the implementation of the storage technique, the examples in this article use a relational database that stores customer order information. Table (a) below shows the CUSTOMER
relation that stores the name and telephone number of each customer, each uniquely identified by a customer number. Table (b) shows the ORDERS
relation that stores the product number and quantity of each order, each uniquely identified by an order number.
The following two scenarios provide an invalid value and a value that the user is not sure of, neither of which can be stored in the relations shown in tables (a) or (b). In the first scenario, the user wants to store the text “Contact John Smith at International Widgets” in the phone number field. The user will enter John Smith’s phone number at a later date when it becomes available. The user also wants to annotate the text with the reason that “the company secretary will call back with Mr. Smith’s phone number.” This string value and its annotation cannot be stored in the relation shown in table (a): the phone number field cannot accept string values and there is no field for storing annotations.
In the second scenario, the user wants to store the account number 319. This is a valid account number but the user is unsure whether it is the correct account to charge the bill to. The user wants to annotate this account number with the comment: “I need to confirm this account number with Marketing.” There is no way to store the user’s uncertainty about the account number in the relation shown in table (b).
One way to mark the user’s uncertainty about a field value is to add an uncertainty attribute for each attribute in the relation. The disadvantage of this method is that the number of attributes in each relation would be doubled, the vast majority of which would never be used. A more efficient method of storing flexible values, one that does not require changing existing relations, is to add new relations to store flexible values and to link them to existing relations.
3. Storing Flexible Values
A new relation called FLEXIBLE_VALUES
is created to store the invalid and uncertain values and their annotations. Table (c) shows a FLEXIBLE_VALUES
relation that stores the example invalid and uncertain values described in section 2. The ATTRIBUTE_NO
attribute stores the attribute number in a relation for which an invalid or uncertain value is to be stored. The first tuple in the FLEXIBLE_VALUES
relation stores the invalid telephone number. The value of the ATTRIBUTE_NO
attribute is 3 because PHONE_NO
is the third attribute of the CUSTOMER
relation.
The VALUE
attribute stores the invalid or uncertain value as a string. The ANNOTATION
attribute stores an annotation to the string stored in the VALUE
attribute. The STATUS
field is a marker that records whether the flexible input value is an invalid (1) or an uncertain value (2).
A new relation is created for each existing relation that needs to store flexible values. The new relations map a tuple in an existing relation to a tuple in the FLEXIBLE_VALUES
relation. A flexible attribute value is marked in a relation by storing the NULL
value and an SQL query retrieves the value of the attribute from the FLEXIBLE_VALUES
relation. The following diagram shows the FLEXIBLE_CUSTOMERS
relation that maps tuples in the CUSTOMER
relation to tuples in the FLEXIBLE_VALUES
relation.
The following SQL query retrieves the invalid value in the CUSTOMER
relation. The nested SELECT
statement retrieves all the tuples in the FLEXIBLE_CUSTOMERS
relation that contain the specified customer number. The outer SELECT
statement uses the FLEXIBLE_NO
values to retrieve the corresponding tuples from the FLEXIBLE_VALUES
relation.
The diagram below shows the FLEXIBLE_ORDERS
relation that maps tuples in the ORDERS
relation to tuples in the FLEXIBLE_VALUES
relation.
The following SQL query retrieves the uncertain value in the ORDERS
relation.