Character Conversion

Discussion of Co:Z sftp, a port of OpenSSH sftp for z/OS
gngrossi
Posts: 38
Joined: Sat Mar 06, 2010 6:10 pm

Post by gngrossi »

We are having a conversion issue with the solid bar and broken bar (pipe) characters.

I have tested with these results using the "iconv" and "od" commands on z/OS:

IBM-1047 converting to ISO8859-1

solid pipe X'4F' converts to broken pipe X'7C'

broken pipe X'6A' converts to solid pipe X'A6'

(is X'A6' character a solid pipe?)


Our batch jobs using SFTP produce these results sending files from z/OS to Solaris 10:

solid pipe X'4F' converts to solid pipe X'7C'

broken pipe X'6A' converts to broken pipe X'A6'

Our files on z/OS contain broken pipe characters, which need to be converted to solid pipe characters when the files are transferred to Unix.


Thanks.
dovetail
Site Admin
Posts: 2025
Joined: Thu Jul 29, 2004 12:12 pm

Post by dovetail »

Co:Z uses IBM z/OS Unicode Services for codepage conversion (with "technique" string LMREC).

The z/OS Unicode Services tables for this conversion, as well as the specification for IBM-1047, map the IBM-1047 code point X'4F' to Unicode U+007C. X'7C' in ISO8859-1 also maps to Unicode U+007C, so I think that this conversion is being done correctly.

See:
http://en.wikipedia.org/wiki/EBCDIC_1047
http://en.wikipedia.org/wiki/ISO_8859-1#Codepage_layout

Apparently, the problem is that Windows/PCs have it confused:
http://en.wikipedia.org/wiki/Vertical_bar

BTW:
There is a Unix command included in the Co:Z Toolkit which can be used to display / debug translation:

<coz>/bin/showtrtab -s IBM-1047 -t ISO8859-1
00:  00 01 02 03   9C 09 86 7F   97 8D 8E 0B   0C 0D 0E 0F 
10:  10 11 12 13   9D 0A 08 87   18 19 92 8F   1C 1D 1E 1F 
20:  80 81 82 83   84 85 17 1B   88 89 8A 8B   8C 05 06 07 
30:  90 91 16 93   94 95 96 04   98 99 9A 9B   14 15 9E 1A 
40:  20 A0 E2 E4   E0 E1 E3 E5   E7 F1 A2 2E   3C 28 2B 7C 
50:  26 E9 EA EB   E8 ED EE EF   EC DF 21 24   2A 29 3B 5E 
60:  2D 2F C2 C4   C0 C1 C3 C5   C7 D1 A6 2C   25 5F 3E 3F 
70:  F8 C9 CA CB   C8 CD CE CF   CC 60 3A 23   40 27 3D 22 
80:  D8 61 62 63   64 65 66 67   68 69 AB BB   F0 FD FE B1 
90:  B0 6A 6B 6C   6D 6E 6F 70   71 72 AA BA   E6 B8 C6 A4 
A0:  B5 7E 73 74   75 76 77 78   79 7A A1 BF   D0 5B DE AE 
B0:  AC A3 A5 B7   A9 A7 B6 BC   BD BE DD A8   AF 5D B4 D7 
C0:  7B 41 42 43   44 45 46 47   48 49 AD F4   F6 F2 F3 F5 
D0:  7D 4A 4B 4C   4D 4E 4F 50   51 52 B9 FB   FC F9 FA FF 
E0:  5C F7 53 54   55 56 57 58   59 5A B2 D4   D6 D2 D3 D5 
F0:  30 31 32 33   34 35 36 37   38 39 B3 DB   DC D9 DA 9F
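
To read the table above for the two characters in question, here is a minimal Python sketch. It only re-parses the 40: and 60: rows copied from the showtrtab output; the parse_rows helper is hypothetical and is not part of the Co:Z Toolkit. The resulting bytes are interpreted as ISO8859-1 to print their Unicode names.

```python
import unicodedata

# Rows 40: and 60: copied from the showtrtab output above.
ROWS = """\
40:  20 A0 E2 E4   E0 E1 E3 E5   E7 F1 A2 2E   3C 28 2B 7C
60:  2D 2F C2 C4   C0 C1 C3 C5   C7 D1 A6 2C   25 5F 3E 3F
"""

def parse_rows(text):
    """Hypothetical helper: map source byte -> target byte from table rows."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        label, _, cells = line.partition(":")
        base = int(label, 16)  # row label is the high nibble, e.g. 0x40
        for offset, cell in enumerate(cells.split()):
            table[base + offset] = int(cell, 16)
    return table

table = parse_rows(ROWS)
for src in (0x4F, 0x6A):
    dst = table[src]
    # ISO8859-1 bytes map one-to-one onto the first 256 Unicode code points.
    name = unicodedata.name(bytes([dst]).decode("latin-1"))
    print(f"X'{src:02X}' -> X'{dst:02X}' ({name})")
```

This prints VERTICAL LINE for X'7C' and BROKEN BAR for X'A6', so in ISO8859-1 X'A6' is the broken bar, not the solid one.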
gngrossi
Posts: 38
Joined: Sat Mar 06, 2010 6:10 pm

Post by gngrossi »

The character translations are correct from IBM-1047 to ISO8859-1:

X'4F' becomes X'7C'
X'6A' becomes X'A6'

Since our files on z/OS contain broken pipe characters that need to become solid pipe characters on Unix, we now convert X'6A' to X'4F' on z/OS first and then run cozsftp, which gives us the results we need.
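
For anyone who wants to do the same pre-transfer step, the byte substitution can be sketched in Python as below. This is only an illustration of the idea, not the job step we actually run; the function name broken_to_solid is made up, and it assumes the data is plain IBM-1047 text (binary or packed fields containing X'6A' would be damaged).

```python
# Assumption: input is IBM-1047 text, so X'6A' is always a broken bar.
BROKEN_BAR = 0x6A  # IBM-1047 broken bar
SOLID_BAR = 0x4F   # IBM-1047 solid bar (vertical line)

# 256-entry translation table that changes only X'6A' to X'4F'.
_TABLE = bytes.maketrans(bytes([BROKEN_BAR]), bytes([SOLID_BAR]))

def broken_to_solid(data: bytes) -> bytes:
    """Hypothetical helper: replace every X'6A' with X'4F', leave the rest."""
    return data.translate(_TABLE)

# 'A', broken bar, 'B', solid bar, 'C' in IBM-1047.
sample = bytes([0xC1, 0x6A, 0xC2, 0x4F, 0xC3])
print(broken_to_solid(sample).hex().upper())
```

After this substitution, the normal IBM-1047 to ISO8859-1 conversion turns every bar into X'7C'.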