IMAGE's Large File DataSets: The Problem and How to Fix It

Vendors repeat warnings about LFDS corruption

Three vendors with some of the deepest knowledge of HP 3000 databases — HP, Adager and Robelle — have posted fresh messages updating and warning customers about Large File DataSets (LFDS). This change in the 3000's IMAGE/SQL database can cause corruption in datasets greater than 4GB. If you've got a large database, you might be using LFDS without being aware of the risk, or even aware that your largest datasets are now in this risky format.

That's because LFDS is the new default for large datasets created in IMAGE C.10.00, the database version shipping with MPE/iX 7.5. HP used Jumbo datasets as the default in earlier versions of IMAGE. HP gave us an update yesterday on the project to repair this corrupting bug. This is an issue for migrating customers as well as homesteaders: anyone who continues to run a 3000 with a large database on the most recent MPE/iX release should be aware of the risk.

Messages about LFDS have also been posted in the last 24 hours by Adager's Alfredo Rego and Robelle's Bob Green. Combined with the detailed technical report by Allegro's Stan Sieler in our February NewsWire and on this blog over the past two days, these experts present a massive set of reasons to avoid using LFDS, even accidentally. It seems to me that the IMAGE community has waited long enough and is raising its collective voice. Even now, HP says it has pre-released another patch to continue testing its work on the problem, a project begun in 2004.

Liz Glogowski, HP’s vCSY division escalation manager, replied yesterday to our request for an update.

A pre-release copy of the patch and a very preliminary copy of a supporting technical vendor document were delivered to a trusted third party for testing on February 2. Additional testing is also continuing at HP in parallel. Following these alpha tests, we will assess the situation and determine our release strategy. These are currently in progress but we do not have a firm completion date for them. We will give you another update when we have completed our plans.

The testing can be a time-consuming process, according to reports we've received. Since these are tests on very large databases, only the very largest HP 3000s can complete a test suite in less than several days. That level of system — a 4-CPU HP 3000 in the 750MHz N-Class tier — is usually working in a customer site, not in a third-party vendor's test bay.

Adager's report lived up to the vendor's reputation for resolving problems: full of experience, technical detail and even some wry humor. Alfredo Rego wrote:

Several MPE-Image colleagues have called me and emailed me, asking for “my personal opinion” regarding LargeFile Datasets (LFDS).

As with everything, there are pluses and minuses with LFDS.  I highly recommend that, before even considering installing LFDS, you should have a serious consultation with your HP software support person(s).

WARNING: The version of DBSCHEMA released with MPE/iX 7.5 (released by HP in August 2002) produces incorrect LFDS.  Without realizing it, you may have already created bad LFDS.

IMAGE version C.11.0 (not yet released by HP) corrects this LFDS problem.

With IMAGE version C.11.0, the relationship between IMAGE blocks and the file RECORDS that encapsulate them has been subtly altered for LFDS. 

Regular (non-privileged) access to IMAGE data entries (via DBGET, DBPUT, DBUPDATE, and so on) continues to function without any change for LFDS.

MultiRecord NoBuf access to IMAGE blocks (which contain IMAGE media entries — which, in turn, contain IMAGE data entries — plus block bitmaps) is subtly different with LFDS. 

If we were to extrapolate the schedule and budget that Adager Corporation has invested (and will have to continue to invest, to tie up a couple of loose ends that we need to update to be able to support the latest HP version), and if we were to multiply Adager’s LFDS schedule and budget “times the number of commercial products and home-grown solutions that use MultiRecord NoBuf access to IMAGE blocks”, we would come up with several person-years and several million U.S. Dollars.

Bottom line: The fact that Adager supports LFDS turns out to be irrelevant, because people do not use MPE-Image just for the fun of running Adager.  People use MPE-Image to be able to run APPLICATIONS, such as Suprtool, that process data to produce information.  If these applications (many of which use MultiRecord NoBuf access to IMAGE blocks for performance reasons) break down with LFDS (because of the subtle differences in the relationship between IMAGE blocks and the file RECORDS that encapsulate them), then you can NOT use LFDS.

So, besides having a serious consultation with your HP software support person(s), you must have a serious consultation with your non-Adager suppliers (and with your in-house programmers, should you have home-grown solutions that use MultiRecord NoBuf access to IMAGE blocks for performance reasons).  What are the total combined schedules and budgets required to make these non-Adager systems LFDS-savvy?

It’s all a matter of time and money. 

My personal opinion: I wish I had more time and more money.

Robelle's Bob Green wrote on the company's blog this morning to add more details as well as another advisory to steer clear of LFDS:

With the advent of MPE/iX 7.5, a new option for datasets greater than 4GB became available in mid-2002 with IMAGE version C.10.00. The option, called Large File Datasets (LFDS), puts all data into a single Large File if the dataset size is greater than 4GB. This was done with very little fanfare and basically flew in under the radar for most customers, and even for tool vendors such as Robelle.

LFDS became the new default for datasets over 4GB. Previously, datasets that needed to be larger than 4GB were implemented using "Jumbo datasets," a series of files built in the POSIX space in 4GB chunks. Suprtool has supported Jumbo datasets for the last ten years, since they were created.

We started to look at how Suprtool would react to LFDS, but then reports came in that there was corruption when using these datasets and that HP was revisiting and fixing the code. Some months later we heard that a new set of patches was being released in the July/August 2005 timeframe, and again started to look at the impact on Suprtool. However, we have now heard that there are new potential corruptions with the July/August 2005 set of patches, and HP has again reviewed the design and implementation of LFDS.

Due to the previous corruption issues, and because HP has the enhancement under review, we cannot state that Suprtool supports this enhancement yet. We will keep you posted on what we learn, but at this time we certainly would not recommend using LFDS.

You will get an LFDS if you specify a size through DBSCHEMA that is greater than 4GB but less than 128GB. See the details below from HP's communicator article on this feature:

By default, any data set size less than 128GB is created as a single file data set, while a data set size greater than 128GB is created as Jumbo data set. The user can force creation of Jumbo data sets, if data set size is greater than 4GB, with a $CONTROL JUMBO option in the database schema. Each jumbo chunk file would be a maximum of 4GB and can have up to 99 chunks. If the user specifies $CONTROL NOJUMBO, which is default, any data set greater than 4GB but less than or equal to 128GB will be LFDS, while data set size greater than 128GB cannot be created.

We recommend that you always use $CONTROL JUMBO for all databases. Please note that LFDS cannot co-exist with Jumbo data sets within one database.

Adager is also capable of generating Large File datasets. However, you must set a special JCW in order for Adager to select this option over Jumbo datasets.

Adager also has a utility that will convert from Jumbo datasets to LFDS. However again, I personally would only use this for testing purposes — and we are not endorsing that anyone use the LFDS feature in production.
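For readers trying to work out which format their own datasets may have landed in, the thresholds quoted above from HP's communicator article can be sketched as a simple decision function. This is purely an illustration: the function name and interface are hypothetical, since DBSCHEMA itself makes this choice. Where the excerpt's opening sentence and its NOJUMBO sentence disagree about sets over 128GB, the sketch follows the more specific NOJUMBO rule.

```python
def dataset_format(size_gb, control_jumbo=False):
    """Sketch of the format IMAGE C.10.00 would choose for a dataset
    of the given capacity, per the rules in HP's communicator article.

    size_gb       -- requested dataset capacity in gigabytes
    control_jumbo -- True if the schema specifies $CONTROL JUMBO
    """
    if size_gb <= 4:
        return "regular"            # fits in one ordinary file
    if control_jumbo:
        if size_gb <= 4 * 99:       # up to 99 chunk files of 4GB each
            return "jumbo"
        raise ValueError("exceeds Jumbo capacity (99 chunks x 4GB)")
    # $CONTROL NOJUMBO is the C.10.00 default
    if size_gb <= 128:
        return "LFDS"               # the new, risk-carrying default
    raise ValueError("over 128GB cannot be created under NOJUMBO")
```

A 100GB set, for example, defaults to LFDS unless the schema carries $CONTROL JUMBO, which is why Robelle recommends always specifying that option.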