Problem re-creating VB datasets

Post by **bradv** » Fri Oct 01, 2010 11:42 am

Hello,

We are seeking guidance in re-creating VB datasets on z/OS. Years ago old billing data was archived from MVS VB datasets to z/OS UNIX, compressed using bzip2, then FTP'd to a UNIX server. The group that originally created these data files is gone and they didn't leave any notes about how they were created (imagine that!).

We are now trying to restore this data to the original VB datasets using Co:Z and Dataset Pipes, but the datasets are not being re-created in their original form. Some records are missing, and some records are combined into single lines instead of being broken off into new records.

The data has stayed in EBCDIC format the entire time. I've FTP'd the files from the Sun UNIX server to z/OS and bunzip'd them into their larger decompressed size. Then I used todsn as such:

cat ~/input.file | todsn -o 'recfm=vb,lrecl=32756,blksize=32760,space=(cyl,(20,40))' //SC90.PF.USERID.WTEST3

Any suggestions or lessons learned when re-creating VB datasets? My fear is that when the original data group created the *.bz2 files they somehow dropped or clobbered the end-of-record bytes, and now todsn can't determine where each record ends. There doesn't appear to be an RDW present in the data stream... does that get created by the access method on the z/OS side when writing the RECFM=VB dataset?

Thanks,
Brad

Post by **dovetail** » Fri Oct 01, 2010 12:03 pm

Brad,

The essential thing that you will need to discover is: how are records delineated in your data? I would suggest that you decompress the data and then look at it in a hex editor to determine what you have.
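As a rough aid for that inspection, here is a minimal Python sketch that checks whether a decompressed file parses cleanly as a chain of RDW-prefixed records. It assumes the usual IBM RDW layout (a 2-byte big-endian length that includes the 4-byte RDW itself, followed by two zero bytes); the function name and the file argument are illustrative only, and the whole file is read into memory.

```python
import struct
import sys


def walk_rdw(data):
    """Return the data length of each record if `data` parses cleanly as
    RDW-prefixed records, or None at the first inconsistency."""
    lengths = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            return None  # trailing bytes too short to hold an RDW
        # Assumed layout: 2-byte big-endian length (RDW included), then 2 zero bytes.
        length, reserved = struct.unpack_from(">HH", data, pos)
        if length < 4 or reserved != 0 or pos + length > len(data):
            return None
        lengths.append(length - 4)
        pos += length
    return lengths


def main(path):
    with open(path, "rb") as f:
        data = f.read()
    # Show the start of the file in hex for a quick visual check.
    print("first 64 bytes:", data[:64].hex())
    lengths = walk_rdw(data)
    if lengths is None:
        print("data does not parse as RDW-prefixed records")
    else:
        print("parsed", len(lengths), "RDW-prefixed records")
        if lengths:
            print("shortest:", min(lengths), "longest:", max(lengths))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

If the whole file walks cleanly, RDWs are very likely present. If it fails on the first four bytes, the records are delimited some other way, or not at all.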

If the records are separated by IBM RDWs, then you can use the "-l rdw" switch on "todsn". The default option relies on newline record separators and is only effective if the data is text, with no binary fields. But it sounds like this is not the case.

If the original archive has no record delimiters and the records are variable length, then how would record boundaries be recognized? Perhaps the record data itself carries application-specific information about each record's length. If this were the case, you could write a simple filter program that reads record data from stdin and writes it to stdout with an RDW inserted before each record. The output from this could be piped into "todsn -l rdw" and you would be fine.
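A minimal Python sketch of such a filter follows. The length rule in `record_length` is purely hypothetical (it pretends each record starts with a 2-byte big-endian field giving the total record length) and would have to be replaced with whatever the billing records actually contain. The RDW written is assumed to be the usual IBM form: a 2-byte big-endian length that includes the 4-byte RDW, followed by two zero bytes. For simplicity this version takes the input file as an argument rather than reading stdin, and holds the whole file in memory.

```python
import struct
import sys

LRECL = 32756  # from the todsn options in the original post; includes the 4-byte RDW


def record_length(data, pos):
    """HYPOTHETICAL rule: the record starts with a 2-byte big-endian field
    holding the total record length. Replace with the real application rule."""
    if pos + 2 > len(data):
        raise ValueError("truncated length field at offset %d" % pos)
    (n,) = struct.unpack_from(">H", data, pos)
    return n


def add_rdws(data, lrecl=LRECL):
    """Return `data` with an RDW inserted before each record."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        n = record_length(data, pos)
        if n <= 0 or pos + n > len(data):
            raise ValueError("bad record length %d at offset %d" % (n, pos))
        if n + 4 > lrecl:
            raise ValueError("record at offset %d exceeds LRECL" % pos)
        out += struct.pack(">HH", n + 4, 0)  # RDW: length including itself, then zeros
        out += data[pos:pos + n]             # record bytes copied unchanged
        pos += n
    return bytes(out)


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as f:
        sys.stdout.buffer.write(add_rdws(f.read()))
```

Saved under a hypothetical name such as add_rdw.py, it could be run as `python3 add_rdw.py input.file | todsn -l rdw -o 'options...' //MY.VB.RESTORED.DATA`. Failing loudly on an implausible length is deliberate: a wrong guess about the record layout should stop the restore rather than quietly write damaged records.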

BTW: The following could be used to create and restore binary record archives properly (regardless of whether the records are fixed or variable length):

fromdsn -l rdw //MY.VB.DATA | bzip2 > compressed.data

and then to restore:

bunzip2 < compressed.data | todsn -l rdw -o 'options...' //MY.VB.RESTORED.DATA