Migrating Empty Date Values From TurboIMAGE

By Bob Green

When I gave my paper "Transforming TurboIMAGE Databases" at HPWorld in Atlanta, there was a very good question from the audience

"I have fields like SHIPDATE, which begin life with a blank value and are updated when the actual ship date is known. How do I move these datasets into SQL tables, where the SHIPDATE field will presumably be of type 'Date' and where blank will be detected as an invalid value?"
Excellent question!

You Cannot Load Invalid Dates

Unlike TurboIMAGE, where date values are stored in generic data types such as X (character) or I (integer), SQL databases have a DATE type specifically for date values.

Blanks and zero are not valid dates! So you cannot typically load them into Oracle or SQL Server, or most other SQL databases. Since TurboIMAGE does not have a DATE type (and doesn't enforce data types anyway), many IMAGE-based applications have some invalid date values, and use blanks/zero to mean that the date is not yet known.

Any truly invalid date values should be fixed before exporting to another database. You can find the invalid dates easily in Suprtool using the $Invalid function. And while you are looking for invalid dates, you might as well look for unreasonable dates as well (i.e., before 1900 or after 2010, depending upon the field meaning in the application). For example:

>base store.demo
Database password [;]?
>get d-sales
>item deliv-date,date,ccyymmdd
>if $invalid(deliv-date) or &
 deliv-date < 19000101 or &
 deliv-date > 20100101 
>out baddates,link
>list standard
>xeq
Note: you may find the DBEDIT module of Suprtool handy for correcting the invalid dates after you find them.

What About Deliberately Missing Dates?

In TurboIMAGE, applications usually initialize SHIPDATE-type fields to Blanks or Zeroes to indicate that the value is not yet know.

In SQL databases, you need to define the column as NULLABLE in order to allow a row to exist without a value in that column. SQL has the concept of a specific NULL value that can exist and be detected (the internal implementation may actually use zeroes or blanks or a special flag, but that is normally not exposed to the user).

The only remaining question is "how do we specify a NULL value in our load data from TurboIMAGE?"

Loading Null Dates

We did some testing with Microsoft SQL Server 2000 with loading dates. First we created a default table, then we imported the following data from a .txt file:
"Id","Name","PostDate","Description"
1,"a",,"asdfasdf"
2,"b",2003-01-01 00:00:00,"asdfaas8df"
3,"c",,"asdfasd"
In the input file ,, represents the blank date.

The rows were inserted into the table successfully, and the blank dates became NULL values. Just what we wanted!

In Allbase, the description file has a column to indicate the logical null value. For example,

     myfield  1   10    ?
If there's a question mark in columns 1-10, the LOAD program assumes myfield is a NULL.

In mySQL, "Illegal DATETIME, DATE, or TIMESTAMP values are converted to the 'zero' value of the appropriate type ('0000-00-00 00:00:00', '0000-00-00', or 00000000000000). The column is given a value and is imported without errors. However invalid date ranges also revert to this making it hard to identify if the field was intentionally left blank or an invalid date was imported in."

From PostgreSQL, "... we could set the date shipped to NULL before an order is shipped." This will work by default and can change this by creating the database date column with "NOT NULL".

Loading Nulls in Oracle

The same null date load is possible in Oracle as well.
NULLIF Keyword: Use the NULLIF keyword after the datatype and optional delimiter specification, followed by a condition. The condition has the same format as that specified for a WHEN clause. The column's value is set to null if the condition is true. Otherwise, the value remains unchanged.

NULLIF field_condition

The NULLIF clause may refer to the column that contains it, as in the following example:

COLUMN1 POSITION(11:17) CHAR NULLIF (COLUMN1 = "unknown")

This specification may be useful if you want certain data values to be replaced by nulls. The value for a column is first determined from the data file. It is then set to null just before the insert takes place.

. . .

The BLANKS keyword makes it possible to determine easily if a field of unknown length is blank. For example, use the following clause to load a blank field as null:

 
column_name ... NULLIF column_name=BLANKS 

Exporting Invalid Dates as NULL

You can use Suprtool and STExport to export TurboIMAGE date values in a form that your SQL database will accept. By default, STExport formats invalid dates as asterisks (****). But, there is a special option called Date Invalid Null that converts an invalid date into a zero-length field (you must also specify variable Columns).

For example, first use Suprtool to create a self-describing data file for the dataset:

>base store.demo
Database password [;]?
>get d-sales
>item deliv-date,date,ccyymmdd
>item purch-date,date,ccyymmdd
>output dsales,link
>xeq
Then use STExport to read the date file and convert it into a CSV file with invalid dates as "null" columns:
$input dsales
$date invalid null
$date ccyymmdd
$columns none  {i.e., Variable}
$quote double
$delimiter comma
$output dsexprt
$xeq
Note: always set the date export format to CCYYMMDD, since this is the ISO standard format and is understood by most SQL load programs without further configuration.

That's it - you should now have the information you need to migrate your date fields from TurboIMAGE to most SQL databases, even those that are null.