dsn_profile anomalies

Discussion of the Co:Z Toolkit Dataset Pipes utilities
Jeno
Posts: 23
Joined: Wed Oct 03, 2007 5:28 am

Post by Jeno »

Hello, I've an easy-to-reproduce problem again:

I have created the following /etc/dsn_profile on the host for fromdsn:

Code: Select all

    fromdsn *
        sourceCodePage IBM-1165
        targetcodepage ISO8859-2
The reason is as follows. Our host codepage is IBM-1165 (Latin-2), and the associated ASCII codepage is ISO8859-2. Without dsn_profile, the default ISO8859-1 does not map some characters from IBM-1165, so I wanted to set a correct default. The PC version of fromdsn seems to ignore dsn_profile, so the only place for it seemed to be on the host.

I experience the following inconsistencies (fromdsn V1.1.1 and 1.0.3 as well) using this dsn_profile:

Invoking fromdsn without the -b option:
  - locally from an OMVS shell --> no translation (fine)
  - remotely from another z/OS OMVS shell using 'fromdsn -ssh ...' --> no translation (fine)
  - locally from an ssh shell --> translating to ASCII (WRONG, NO XLAT EXPECTED)
  - remotely from the PC (Linux) using 'fromdsn -ssh ...' --> translating to ASCII (fine)
Invoking fromdsn with the -b option:
  - locally from an ssh shell (trace below) --> translating to ASCII (WRONG, IGNORES -b)
  - remotely from the PC (Linux) using 'fromdsn -ssh ...' --> translating to ASCII (WRONG, IGNORES -b)

Some lines from trace (local ssh shell):

Code: Select all

# fromdsn -b -L T //TCPIP.STANDARD.TCPXLBIN
[...]
fromdsn(TCPIP.STANDARD.TCPXLBIN)[D]: initialize(): sourceCodePage="IBM-1165"(1165), targetCodePage="ISO8859-2"(912), srcSp=0x40, tgtCr=0xD, tgtLf=0xA, lineTerm=0
The first line of the output is ASCII garbage instead of the readable header '*TCP/IP translate tables'.


Specifying -t IBM-1165 on the command line instead of -b does override dsn_profile.
However, I think that the -b option should disable any translation.

Thank you very much in advance
Best regards Jenoe
dovetail
Site Admin
Posts: 2022
Joined: Thu Jul 29, 2004 12:12 pm

Post by dovetail »

Sorry for not responding earlier. For some reason PHPBB wasn't showing this as "new"...sorry.

There seem to be several things going on (and going wrong).

1) The dsn_profile is only used on the z/OS server side if using -ssh.

2) The fromdsn:targetCodePage and todsn:sourceCodePage are determined by the first match in the following list:

- what is specified on the command line
- what is inferred from dsn_profile
- if -l (line termination rule) is not none, rdw, mfrdw, or ibmrdw:
    - if ssh, the client's default codepage
    - if not ssh, the host's default codepage
- the same as the z/OS host's codepage, which disables translation

So, -b only implicitly disables translation because it sets -l to 'none'.
This is confusing, IMO, and we should probably change this so that -b
explicitly disables translation. This is the cause of some of your problems.
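
The lookup order in (2) can be sketched as a small C function. This is only an illustration of the list as described in this post, not the actual Co:Z code; the function names, parameter names, and the convention that NULL means "not specified" are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when the -l rule is a text line rule,
   i.e. not one of none, rdw, mfrdw or ibmrdw. */
static bool is_text_line_rule(const char *rule)
{
    static const char *binary_rules[] = { "none", "rdw", "mfrdw", "ibmrdw" };
    for (size_t i = 0; i < sizeof binary_rules / sizeof binary_rules[0]; i++) {
        if (strcmp(rule, binary_rules[i]) == 0)
            return false;
    }
    return true;
}

/* Hypothetical sketch of the "first found" rule for fromdsn's target
   codepage. A NULL cmdline_cp or profile_cp means "not specified". */
static const char *pick_target_codepage(const char *cmdline_cp,
                                        const char *profile_cp,
                                        const char *line_rule,
                                        bool via_ssh,
                                        const char *client_default_cp,
                                        const char *host_default_cp,
                                        const char *host_cp)
{
    if (cmdline_cp != NULL)
        return cmdline_cp;              /* 1. command line */
    if (profile_cp != NULL)
        return profile_cp;              /* 2. dsn_profile */
    if (is_text_line_rule(line_rule))   /* 3. text mode: a default codepage */
        return via_ssh ? client_default_cp : host_default_cp;
    return host_cp;                     /* 4. same as host: no translation */
}
```

With -b (which sets -l to 'none') and a targetCodePage in dsn_profile, the second check returns before the line rule is ever consulted, which is why -b appears to be ignored in that setup.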

For your situation, the best thing is *not* to put codepages in dsn_profile,
since it is better to get the default from the server (z/os) and the client. Otherwise, you will be setting the ssh client codepage to iso8859-2 even if your client is a local or remote z/OS system.

The server codepage (fromdsn:sourceCodePage and todsn:targetCodePage)
defaults to the "default" USS codepage. This is configured by the LANG/LC_ALL environment variables. Depending on how you configured these, they may not be set when invoked under SSHD. To see, do something like this:

ssh user@zos /bin/env

If not set, check "man sshd" to see where SSHD gets its environment. For "non-shell" ssh logins like todsn/fromdsn with -ssh, /etc/profile is *not* read.

The client codepage for *nix is also set by LANG or LC_ALL.
For Windows, you change the codepage using the control panel.
I believe that the Windows equivalent for Latin-2 is Cp1250.

Also, I'm curious as to why "locally from OMVS shell" is different from "locally from ssh shell". It is probably due to different settings of LANG/LC_ALL environment variables. I think that /etc/profile is run in both cases, but maybe yours has conditional logic that is setting the locale differently based on which shell.

If this doesn't solve your problems, give us the debug "initialize()" messages (using -LT) and relevant LANG/LC_ALL settings for specific cases and I will try to help.

Regards,
Kirk Wolf
Jeno
Posts: 23
Joined: Wed Oct 03, 2007 5:28 am

Post by Jeno »

Thank you for the explanation, no problem for the delay. This week I was also very busy...

As for the -b option, I agree with you absolutely. IMHO, -b, bin, binary or similar should not perform any conversion.

As for using codepages in the host dsn_profile, I agree that it is not the optimal place.
Note that e.g. for DFS, both the source and target code pages are defined on the host
(well, it is somewhat simpler there since on the wire DFS always has ascii).
For dspipes, however, nothing else allowed me to set the default ASCII codepage, which is why I gave dsn_profile a try.
Here are some reasons:

1) even from my linux pc, using LC_ALL=hu_HU.iso8859-2, fromdsn(linux) still assumes iso8859-1 (also with capital ISO8859-2)

Code: Select all

$ locale && fromdsn -ssh ???@??? -L T //???.b.txt
LANG=hu_HU
LANGUAGE=hu
LC_CTYPE="hu_HU.ISO8859-2"
LC_NUMERIC="hu_HU.ISO8859-2"
LC_TIME="hu_HU.ISO8859-2"
LC_COLLATE="hu_HU.ISO8859-2"
LC_MONETARY="hu_HU.ISO8859-2"
LC_MESSAGES="hu_HU.ISO8859-2"
LC_PAPER="hu_HU.ISO8859-2"
LC_NAME="hu_HU.ISO8859-2"
LC_ADDRESS="hu_HU.ISO8859-2"
LC_TELEPHONE="hu_HU.ISO8859-2"
LC_MEASUREMENT="hu_HU.ISO8859-2"
LC_IDENTIFICATION="hu_HU.ISO8859-2"
LC_ALL=hu_HU.ISO8859-2
fromdsn[T]: -> setTargetCodePage(ISO8859-1)
fromdsn[T]: <- setTargetCodePage()
fromdsn[D]: targetCodePage defaulted to COZ_CLIENT_CODEPAGE=ISO8859-1
fromdsn[T]: <- fromdsn.parseArgs()
fromdsn(???)[D]: initialize(): sourceCodePage="IBM-1165"(1165), targetCodePage="ISO8859-1"(819), srcSp=0x40, tgtCr=0xD, tgtLf=0xA, lineTerm=6
2) My standard setting is LC_*=hu_HU LC_TIME=C LC_ALL=(empty), where hu_HU means iso8859-2 since that is the default locale on my PC. I take it from your explanation that this would not let dspipes use my codepage instead of the hard-wired default 8859-1.

3) I was trying to control centrally how the different clients would perform default translation.
Many clients use our correct locale, but some clients just missed localization, others may have compatibility reasons to use a different locale. However, unless using latin-2 or utf locale, the default conversion would lose national characters. What I was trying is similar to the pure ascii scenario where the host is iso8859-2 and transfers are considered binary (outside of z/OS, neither scp nor sftp would convert).

As for the differences between the local OMVS shell and the local ssh shell, I must apologize. After checking again, both cases translate identically to ASCII. What I had hoped, however, was that dspipes would ignore the default ***Codepage parameters for the EBCDIC case.

I agree that allowing both ebcdic and ascii clients, the targetcodepage/sourcecodepage parameters are unusable on the host side. Perhaps an asciiCodepage or similar parameter would allow the setting of the default ascii code page in dsn_profile.

Best regards, Jenoe
dovetail
Site Admin
Posts: 2022
Joined: Thu Jul 29, 2004 12:12 pm

Post by dovetail »

In your example (1), I don't understand why the client (running on Linux) is not finding the locale codepage to be "ISO8859-2".

The code we are using is:
setlocale(LC_ALL, "");
clientCodePage = nl_langinfo(CODESET);

Which should work, and I don't understand why it does not.

Regardless, I think that there are two enhancements that we could make that might help you:

1) change the host code so that -b forces no-conversion.
2) change the client code so that if an environment variable "COZ_CLIENT_CODEPAGE" is set, then we use that rather than the default client locale codepage.
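
A minimal sketch of what enhancement (2) might look like on the client, assuming the override is read with getenv() and that an empty value counts as "not set". The helper names and the ISO8859-1 fallback for a failed setlocale() are assumptions for illustration only, not the actual client code.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: prefer a non-empty override, else the locale codeset. */
static const char *choose_client_codepage(const char *override,
                                          const char *locale_codeset)
{
    if (override != NULL && override[0] != '\0')
        return override;
    return locale_codeset;
}

/* Hypothetical wrapper: combine the environment variable with the
   setlocale()/nl_langinfo() lookup quoted above. */
static const char *client_codepage(void)
{
    const char *codeset = "ISO8859-1";      /* assumed fallback */
    if (setlocale(LC_ALL, "") != NULL)
        codeset = nl_langinfo(CODESET);     /* codeset of the user's locale */
    return choose_client_codepage(getenv("COZ_CLIENT_CODEPAGE"), codeset);
}
```

Keeping the choice in a separate function from the environment and locale lookups makes the precedence easy to check without touching the process environment.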

Your thoughts?
Jeno
Posts: 23
Joined: Wed Oct 03, 2007 5:28 am

Post by Jeno »

The enhancements seem fine to me. I think they also maintain backward compatibility, so no negative side effects are expected.

About ignoring LC_ALL: it is the same on another Linux system with a different distribution. With hu_HU.utf8, the trace reports that UTF8 is accepted (on both tested Linux systems). All other installed locales are simply ignored (hu_HU.ISO-8859-2 and its aliases, or de_DE.iso8859-15@euro).

Trace shows the following message for non-existent locales:
  • fromdsn-client[W]: setlocale() returned NULL. The NLS environment may not be properly configured
No such message appears for the installed locales. However, fromdsn uses the default 8859-1 in all cases but for UTF-8. Even for en_US.ISO8859-1, the trace reports targetcodepage defaulted to 8859-1.

Thank you for the help,
Best regards and have a nice week-end - Jenoe
Jeno
Posts: 23
Joined: Wed Oct 03, 2007 5:28 am

Post by Jeno »

Hello, I am not a C programmer myself, but here is the code you referred to, isolated into a small program. It works as expected:

Code: Select all

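#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* Adopt the locale from the environment (LANG, LC_*). */
    const char *locale = setlocale(LC_ALL, "");
    /* Print via %s; never pass data as the format string. */
    printf("%s\n\n", locale ? locale : "(setlocale failed)");
    /* Codeset name of the current locale. */
    printf("%s\n", nl_langinfo(CODESET));
    return 0;
}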
LC_CTYPE=hu_HU.ISO-8859-2;LC_NUMERIC=hu_HU.ISO-8859-2;LC_TIME=C;LC_COLLATE=hu_HU.ISO-8859-2;
LC_MONETARY=hu_HU.ISO-8859-2;LC_MESSAGES=C;LC_PAPER=hu_HU.ISO-8859-2;LC_NAME=hu_HU.ISO-8859-2;
LC_ADDRESS=hu_HU.ISO-8859-2;LC_TELEPHONE=hu_HU.ISO-8859-2;LC_MEASUREMENT=hu_HU.ISO-8859-2;
LC_IDENTIFICATION=hu_HU.ISO-8859-2

ISO-8859-2
Best regards Jenoe
dovetail
Site Admin
Posts: 2022
Joined: Thu Jul 29, 2004 12:12 pm

Post by dovetail »

I notice that earlier your locale had "ISO8859-2" and now it has "ISO-8859-2". Did something change?

If you run "fromdsn -LD T ... " again, what do you see?
Jeno
Posts: 23
Joined: Wed Oct 03, 2007 5:28 am

Post by Jeno »

Yes, after finding a hint that some programs only accept the canonical name, I tried adding the dash (it did not help) and left it that way. This is the form returned by "locale charmap".
fromdsn[D]: targetCodePage defaulted to COZ_CLIENT_CODEPAGE=ISO8859-1
fromdsn(JVAGO.B.SEQ)[D]: initialize(): sourceCodePage="IBM-1165"(1165), targetCodePage="ISO8859-1"(819), srcSp=0x40, tgtCr=0xD, tgtLf=0xA, lineTerm=6
fromdsn(xxx.B.SEQ): opts=rb,type=record,noseek maxreclen=1024 trim=false
fromdsn(xxx.B.SEQ)[N]: 2 records/85 bytes read; 87 bytes written in 0 milliseconds.
dovetail
Site Admin
Posts: 2022
Joined: Thu Jul 29, 2004 12:12 pm

Post by dovetail »

I think that something is going wrong in the "dspipes" ssh subsystem - which is where the client codepage is handled.

Getting debug messages out of the dspipes ssh subsystem is a little bit tricky, since you can't really use stderr.

If there are errors from it, they are by default logged to the z/OS Unix "SYSLOGD" service.

If you have Unix SYSLOGD running, check your log files to see if you are getting any messages from the dspipes subsystem. I suspect that you will find messages like:

dspipes: Error creating client codepage (ISO8859-2) translater ...(reason)
dspipes: Client codepage falling back to ISO8859-1

If you don't have SYSLOGD running, there is information on setting it up in the z/OS Ported Tools User's Guide, under the section "Setting up syslogd to debug sshd", or in the z/OS Communications Server: IP configuration guide.

Also, there is a little diagnostic program in the dspipes/bin directory that you can use to test a given codepage:

./lookupccsid ISO8859-2

And it should print:
912 ISO8859-2
Jeno
Posts: 23
Joined: Wed Oct 03, 2007 5:28 am

Post by Jeno »

The problem is that z/OS (1.7) iconv rejects the canonical codeset form.

Code: Select all

iconv -t ISO-8859-2 ... is rejected, while
iconv -t ISO8859-2 ... is accepted.
The Linux client, however, gets the canonical form from queries (the "locale charmap" shell command or the "nl_langinfo(CODESET)" function), regardless of which alias form has been defined in the Linux locale setting. Consequently, this is what gets sent to the host.

Do you think there is any solution other than opening a PMR with IBM? Even if an APAR is available, it still takes time to get it into all production systems ;(
As a test, I added (with success) the following after "clientCodePage = nl_langinfo(CODESET);"

Code: Select all

/* Compare with strcmp() (needs <string.h>); "=" would assign, not compare. */
if (strcmp(clientCodePage, "ISO-8859-2") == 0) { clientCodePage = "ISO8859-2"; }
Perhaps stripping the extra dash from "ISO-8859" would do the trick.
coz
Posts: 391
Joined: Fri Jul 30, 2004 5:29 pm

Post by coz »

A couple of thoughts: First, can you get the messages from the ssh dspipes subsystem as described above and send to us at info@dovetail.com?

Here's an easy way to get a log if you don't have syslog running on your machine:

1.) Ensure that PermitUserEnvironment is set to "yes" in your sshd_config file and restart sshd.

2.) create a file named "~/.ssh/environment" on z/OS with the following line:

Code: Select all

COZ_LOG=f=/tmp/cozlog.userid.txt,D,t
(where userid is your login id)

3.) Verify that this environment is working by running ssh to your z/os system:

Code: Select all

ssh user@zos /bin/env
You should see the COZ_LOG env var in the output.

4.) Rerun your problem fromdsn command and send us the logfile.

To address your specific problem, there are a couple of options:
1.) (best) get z/OS Unicode Services running so that it will recognize iso8859-2. You are falling back onto iconv because you don't have Unicode Services tables set up for CCSID 912. This will not only fix your problem, but it will be faster.

2.) If you can't fix Unicode Services, you can add the following line to your environment file (from above):

Code: Select all

COZ_TRSUB_ISO-8859-2=ISO8859-2
Note: /etc/ssh/environment can also be used for this and will apply to all users that don't have ~/.ssh/environment. Note again that PermitUserEnvironment MUST be "yes" for this to work.

After testing, make sure you remove the COZ_LOG environment variable you set above.

3.) In the next server release of dataset pipes, we'll add specific code to strip out the dash immediately following the ISO substring.
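
As a rough idea of what stripping that dash could look like, here is a small C sketch. It assumes only a leading "ISO-" prefix needs handling and that the caller supplies the output buffer; the function name is hypothetical and this is not the code that will ship in the server.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy name into out, dropping the '-' that
   immediately follows a leading "ISO". Other names are copied unchanged.
   Returns out, or NULL if the buffer is too small. */
static char *strip_iso_dash(const char *name, char *out, size_t out_size)
{
    size_t len = strlen(name);
    int has_dash = strncmp(name, "ISO-", 4) == 0;
    size_t needed = len + 1 - (has_dash ? 1 : 0);   /* includes the '\0' */

    if (out_size < needed)
        return NULL;

    if (has_dash) {
        memcpy(out, "ISO", 3);
        memcpy(out + 3, name + 4, len - 4 + 1);     /* rest plus '\0' */
    } else {
        memcpy(out, name, len + 1);
    }
    return out;
}
```

Called with "ISO-8859-2" it yields "ISO8859-2", the form that z/OS iconv accepted in the test above; a name such as "UTF-8" passes through untouched.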