Large File DataSets: Background, Status, and Future
February 13, 2006
[Editor’s note: HP has been working on repairing IMAGE/SQL’s Large File DataSets (LFDS) since 2004. Using LFDS, the default in MPE/iX 7.5, can result in corrupt HP 3000 databases, as we have reported in the NewsWire. As of this month, HP was working with database tool vendors to test a new fix for the problem; a 2005 attempt failed in alpha testing. Database tool vendors have documented corruption at some customer sites, and they recommend avoiding the use of Large File DataSets in IMAGE/SQL. Stan Sieler, executive vice president of Allegro Consultants and a 3000 veteran who led the design of major enhancements to IMAGE/3000 while working at HP, offered his insights on the problem.]
By Stan Sieler
Background: Before Large Files
In the beginning, or at least in 1994, IMAGE datasets were limited to 4 gigabytes (GB). This was because a single dataset was a single file, and MPE limited file sizes to 4 GB. Allegro Consultants, Inc., was contracted by HP to design and implement a solution to this limitation. At the time, along with other technical experts, we suggested that the solution lay in enhancing MPE/iX to support larger disk files, but we were told that the enhancement was uncertain, and certainly not likely to appear in the near future. So, we chose to implement “Jumbo datasets”, using a collection of 4 GB files to logically comprise a single dataset. Given the limitations of IMAGE file naming, we chose to name the “chunks” with HFS names (e.g., set 23 of SALES would be SALES23, SALES23.001, SALES23.002, etc.).
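To make the chunk naming concrete, here is a minimal Python sketch built only on the SALES23 example above. The function names, the two-digit set number, the fixed 4 GB chunk size, and the idea of mapping a byte offset to a chunk are all assumptions made for illustration; the sketch does not describe how IMAGE actually lays out or addresses data inside a Jumbo dataset.

```python
# Illustrative only: models the chunk names from the SALES23 example.
# CHUNK_BYTES, chunk_name() and locate() are hypothetical, not IMAGE internals.

CHUNK_BYTES = 4 * 2**30  # assumed chunk size: the 4 GB per-file limit


def chunk_name(root: str, set_number: int, chunk_index: int) -> str:
    """Return the file name for one chunk of a Jumbo dataset.

    Chunk 0 carries the plain dataset name (e.g. SALES23); later chunks
    add a three-digit suffix (SALES23.001, SALES23.002, ...).
    The two-digit set number is an assumption based on the example.
    """
    if chunk_index < 0:
        raise ValueError("chunk_index must be non-negative")
    base = f"{root}{set_number:02d}"
    if chunk_index == 0:
        return base
    return f"{base}.{chunk_index:03d}"


def locate(root: str, set_number: int, offset: int) -> tuple[str, int]:
    """Map a logical byte offset to (chunk file name, offset within chunk).

    Assumes every chunk holds exactly CHUNK_BYTES bytes, a simplification.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    chunk_index, within = divmod(offset, CHUNK_BYTES)
    return chunk_name(root, set_number, chunk_index), within


if __name__ == "__main__":
    # A logical offset of 5 GB falls 1 GB into the second chunk.
    print(locate("SALES", 23, 5 * 2**30))
```

Under these assumptions, a logical offset of 5 GB in set 23 of SALES lands 1 GB into SALES23.001, which is how a collection of 4 GB files can stand in for one larger dataset.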
Jumbo datasets arrived, and worked well — then and now.
As it turns out, the ability to use a Jumbo dataset for DDX (Dynamic [detail] Dataset eXpansion) or MDX (Master Dataset eXpansion) was originally planned to be added in a follow-on project. Unfortunately, that project was never funded.
Over the years, it was made clear to us that DDX/MDX for Jumbo datasets wasn’t a critical enhancement. Utilities like Adager made it easy for users to expand their datasets quickly, and falling disk drive prices and growing capacities made it easier to add tremendous amounts of disk storage (allowing the user to always have sufficiently large datasets).
Background: Large Files
Eventually, Large Files (files larger than 4 GB) were added to MPE as of MPE/iX 6.5, but without any changes to IMAGE to support their use.
HP later decided to enhance IMAGE to support Large File DataSets (LFDS, a dataset consisting of a single Large File), and released the first version in 2002. In March of 2004, I discovered a very serious data corruption problem in LFDS, which I reported to HP and which the 3000 NewsWire has also covered. The result was the same recommendation from both vendors and HP: avoid using LFDS, and use Jumbo datasets instead.
At several points in the last two years, I and some other vendors have tried to get HP to drop LFDS entirely, but without success. We were concerned that users could invoke LFDS by accident, and encounter the data corruption problem.
Tomorrow: The basic LFDS problem, as well as a solution for HP to employ.