By Roy Brown
Whether you are migrating data, or just wanting to present it in a more portable format, be aware of how you can manipulate it using those all-pervasive Microsoft tools. When your consulting role takes you across a wide range of HP 3000 sites, you rapidly learn that not everybody has all the add-on tools you might like to see – Qedit, MPEX, Adager, Suprtool, and so on. You can rely on what’s in FOS, but there are a bunch of things you are brought up short by, that are not so easy without the armory above.
So, when I needed to extract and massage data from a bare or nearly bare HP 3000, I pretty soon learned to rely on what I could bring to bear from my laptop, equipped with Reflection and the MS Office suite.
Actually, the product I really missed isn’t one I listed above – it’s MBF-UDALink, from MB Foster. Perhaps because I’ve never quite mastered its rather quirky interface, I find it’s often easier to rewrite a query than to modify one. But they are so quick to write that it really doesn’t matter – especially for multi-set, multi-key extracts.
And as it can make your data extract, put it in the format of your choice, and transfer it to your PC via your termulator, all in one go, it lets you skip a whole bunch of what I describe below; stuff you need to do only if all you have is FOS in this area.
Mostly, when grabbing stuff on an ad-hoc basis, I like to list it out in Query, and watch it scroll by in Reflection, with logging to a PC file turned on. I know that I could file equate the output to QSLIST with DEV=DISC, make a file and copy it that way, if I wanted. But this way, I get to see problems as it runs. And if it runs okay, it’s already on the PC for me.
I use Query because it’s always there. I figure I don’t need to do a Query tutorial here – though you can email me at [email protected] if you’d like a copy of one – but suffice to say that you can usually walk the paths you need, and pick up the data you want. I generally set LINES=0 or NOPAGE, and I pay attention to numeric field formatting with Edit masks where needed, but I only output Detail lines. Dates I leave in CCYYMMDD format, just as they come. And I don’t need to do any math – I can save that until I’m in Office.
But I do hit the 80-character line limit, which is where the first neat WinWord trick kicks in. I use multiple lines, and I mark the end of each line except the last with a string like ### - something that I know won’t ever occur naturally in the data – ending at position 80.
Then I use WinWord to open the .txt file that Reflection has built for me on the PC, and Edit/Replace to change ###^p to one space throughout. Bingo! One long line per returned entry.
How does this work? Well, ^p (caret p) is WinWord’s code for a paragraph mark, which is how each line in the data is terminated. So I’m saying “Find each line ending in ###, and chop off not just these characters, but also the line ending itself. And then put a space in, to make sure that doesn’t cause two fields to run together.”
If you open Edit/Replace, choose the More tab and then the Special tab, you will be able to see the list of formatting characters you can search for and replace. Paragraph Mark is at the top, and right below it is Tab Character; click this, and you’ll see caret-t appear in the ‘Find what:’ window. There are 20 options there in all, but ^p and ^t cover pretty much 99% of what I need to do.
I top and tail my output file to remove the original query lines, and the >end at the end. Sometimes, I might then sort it, with Table/Sort and the default options there, to get the detail records in order. But generally, WinWord’s sort runs out of steam with a file that is more than a few megabytes, so I wait until I’m in Excel.
So I Save As on the WinWord file, taking care to keep it as a .txt file, and ignoring WinWord’s warnings about ‘losing formatting’ if I do. WinWord’s formatting is exactly what I don’t need; .txt is the most versatile format for use here.
This excursion into WinWord has really just been to de-block the detail lines, a task which is straightforward here, but nigh-on impossible in Excel. But we’ll find, in turn, that there are things which Excel can easily do for us, while in WinWord they would be nigh-on impossible.
Formatting in Excel
Next, I open up Excel, and File/Open my .txt file, setting ‘Files of type’ to All Files so I can see it. Excel comes up with its Text Import Wizard, a most powerful tool that lets me break my file up into individual fields.
The option to use here is Fixed Width, and Excel guesses where the field breaks are. It usually does this very well, providing the fields have spaces between them. But if not, I can add, remove or move the suggested breaks.
Moving on with Next, I can set the format for each field. Again it usually guesses these right, except for those CCYYMMDD dates. But for those, I just choose Date format, and the YMD option, and it will convert them to Excel dates. On completing the Wizard, I have my data, field by field, neatly arranged in Excel columns.
By the way, I could have just Copied from my WinWord file, and Pasted into Column A of the Excel sheet, and then split that. The wizard is available under Data/Text to Columns, for just that purpose.