Techniques of CSV Importing, Revealed
August 9, 2013
I'm importing a Comma Separated Value (CSV) text file into a COBOL II program. I want to compare a numeric field from the file to a number in my COBOL program. In that program, the number is defined as S9(8)V99. The CSV file’s numeric field can vary in length, such as "-1,234.99" or "-123,456.99". If the CSV text file field were always the same length, I know I could move the text field to a COBOL numeric field that is redefined as alphanumeric. My problem is that the input text field can be a different length for each record. How do I code in COBOL to accommodate the different number sizes in the text file?
Several HP 3000 programmers and developers recommended Suprtool from Robelle to accomplish this kind of import. Robelle's own Neil Armstrong has offered this advice.
One of the goals I had for Suprtool was what I called "close the loop." The goal of the project was essentially to provide functions and other enhancements within Suprtool to aid in the import of data into self-describing files from the CSV-type files that the Suprtool suite of tools has been able to generate for years.
I added some new functionality such as $split, $number and $clean, among others, to facilitate the importing of data from really any source. I wrote an article about it on our website. The article essentially shows some of the steps in Suprtool that you can use to import CSV data into a self-describing file, or really any data target.
Walter Murray, who worked inside HP's Language Labs before moving out into the user community, noted that Suprtool was likely the best solution to the problem. But after a suggestion that the UNSTRING statement could be useful, he had his doubts.
The UNSTRING statement will be problematic, because one of your fields may have one (or more?) commas in it, and you may have an empty field not surrounded by quotation marks. You might have to roll your own code to break the record into fields. If you are comfortable with reference modification in COBOL, your code will be a lot cleaner.
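The hand-rolled field splitting Murray describes can be sketched as follows. This is a minimal illustration in Python rather than COBOL, meant only to show the scanning logic. The function name split_csv_record is hypothetical, and the sketch assumes comma-delimited fields that may be wrapped in double quotes, with a doubled quote standing for a literal quote. In COBOL the same walk would be a PERFORM loop that indexes the record one character at a time with reference modification.

```python
def split_csv_record(record: str) -> list[str]:
    """Split one CSV record into fields, honoring double-quoted fields.

    Hypothetical helper: commas inside quotes stay in the field, and an
    empty field between two commas comes back as an empty string.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(record):
        ch = record[i]
        if in_quotes:
            if ch == '"':
                # A doubled quote inside a quoted field is a literal quote.
                if i + 1 < len(record) and record[i + 1] == '"':
                    current.append('"')
                    i += 1
                else:
                    in_quotes = False
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch == ',':
            # An unquoted comma ends the current field.
            fields.append(''.join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    # The last field has no trailing comma, so flush it here.
    fields.append(''.join(current))
    return fields


print(split_csv_record('1001,"-1,234.99",,"Smith, J"'))
```

The embedded comma in the amount and the empty third field are exactly the two cases that trip up a plain UNSTRING delimited by a comma.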
Once you do isolate the check amount in a data item by itself, you should be able to use FUNCTION NUMVAL-C to convert it. Yes, NUMVAL and NUMVAL-C are supported by COBOL II/iX, as long as you turn on the POST85 option.
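To show what the conversion step has to accomplish once the field is isolated, here is a small Python sketch. It is not a reimplementation of NUMVAL-C, which accepts more formats than this. The function name parse_amount is hypothetical, and the sketch assumes only an optional leading or trailing minus sign, comma thousands separators, and an optional dollar sign.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert edited text such as "-1,234.99" to an exact decimal value.

    Hypothetical helper with deliberately narrow assumptions: commas and
    a dollar sign are dropped, and a minus sign may lead or trail.
    """
    cleaned = text.strip().replace(",", "").replace("$", "")
    negative = cleaned.startswith("-") or cleaned.endswith("-")
    cleaned = cleaned.strip("-")
    # Decimal raises decimal.InvalidOperation if anything else is left over.
    value = Decimal(cleaned)
    return -value if negative else value


# The program's S9(8)V99 value, written here as an exact decimal.
program_value = Decimal("-1234.99")
print(parse_amount("-1,234.99") == program_value)
```

Because the result is an exact decimal, fields of different widths such as "-1,234.99" and "-123,456.99" compare correctly against the program's value without any fixed-length assumption.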
Olav Kappert offered a long but consistent process.
First thing to do is to not use CSV; use tab delimited. No problem with UNSTRING. Just use the length field and determine if the length = 0.
Do an UNSTRING of the fields delimited by the tab. Then strip out the quotes. Determine the length of each field and right-justify each field and zero-fill them with a leading zero. Then move the field to a numeric field.
You now have your values. Do this for each field from the unstring. You can create a loop and keep finding the ",". By the way, determine the record length and set the last byte+1 to "~" so that the unstring can determine the end of record. Long process, but consistent in method.
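The steps Kappert lists might look like this in outline. This is a Python sketch of the logic, not COBOL, and the names split_tab_record and zero_fill are hypothetical. It assumes each numeric field holds unsigned digits only, since the description does not say how signs or embedded commas are handled, and a 10-digit target to match S9(8)V99. Python's split finds the end of the record on its own, so the "~" sentinel is not needed here.

```python
def split_tab_record(record: str) -> list[str]:
    """Break a tab-delimited record into fields and strip the quotes."""
    return [field.strip('"') for field in record.split("\t")]


def zero_fill(field: str, width: int = 10) -> str:
    """Right-justify a digits-only field and pad it with leading zeros.

    Assumption: a zero-length field is treated as zero. Anything that is
    not plain digits, or is too long for the target, is rejected.
    """
    if len(field) == 0:
        return "0" * width
    if not field.isdigit() or len(field) > width:
        raise ValueError(f"cannot zero-fill {field!r}")
    return field.rjust(width, "0")


fields = split_tab_record('A100\t"123499"\t\t"42"')
print(fields)
print([zero_fill(f) for f in fields[1:]])
```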
In addition to suggesting that the CSV file be generated with leading zeroes, Alan Yeo offered this approach:
Move the CSV value to a full size X field, then strip trailing spaces, and then move the result to an X redefines of your numeric. Please note, as your numeric is V99, you might also want to strip all "." and "," before the compare.
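A short Python sketch of the idea behind Yeo's suggestion follows. The function name to_hundredths is hypothetical. It assumes every value carries exactly two decimal places, because stripping the "." only lines up with the implied decimal point of V99 when that holds, and it assumes a leading minus sign. The result is an integer count of hundredths, which is the digit string an S9(8)V99 item holds.

```python
def to_hundredths(text: str, width: int = 10) -> int:
    """Strip spaces, commas and the decimal point, leaving scaled digits.

    Assumption: the input always has exactly two decimal places, so
    "-1,234.99" becomes -123499 hundredths.
    """
    stripped = text.strip()
    negative = stripped.startswith("-")
    digits = stripped.lstrip("-").replace(",", "").replace(".", "")
    if not digits.isdigit() or len(digits) > width:
        raise ValueError(f"not a valid amount: {text!r}")
    value = int(digits)
    return -value if negative else value


# The program's value -1234.99 in S9(8)V99 is -123499 hundredths.
program_hundredths = -123499
print(to_hundredths("-1,234.99   ") == program_hundredths)
```

Comparing scaled integers this way sidesteps the variable field width entirely, at the cost of trusting that the file never omits or extends the decimal places.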
Dave Powell offered up a general purpose, bullet-proof COBOL program to accomplish the task, fully referenced at the 3000-L newsgroup archive. The entire discussion of the mission is also online at the archives.