Discussions

General J2EE: Junk character using ANSI char of 174 on linux 8.0

  1. Hello All,

    I have a J2EE application running on Oracle9iAS (OC4J), ported to Linux 8.0. I am facing a problem when the application communicates over a socket with a Windows system and puts the data on an MSMQ queue.

    A packet is constructed from an XML file on OC4J, and at the point where I use the ANSI character 174 ( ® ) it automatically adds character 194 ( Â ) by default. E.g. Linux: ( 41979Â®46240Â®0Â® ), but the same on Windows works fine. E.g. WIN2K: ( 41979®46240®0® )

    Is there anything I am doing wrong?

    Please help me on this.

    Thank you in advance.

    Sajeev

    Threaded Messages (6)

  2. Your problem is that ASCII only defines the character codes 0 to 127. The character code 174 is the encoding for ® in Cp1252, the Windows-specific code page that is often loosely called ANSI.

    You should switch to the UTF-8 character encoding, which is much more portable. In UTF-8, the XML encoding for ® is: ¨
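
    To see the difference concretely, here is a minimal sketch (not from the original posts) that prints the byte values Java produces for ® under each encoding. The class name RegisteredSignBytes and the helper unsignedBytes are made up for illustration, and the sketch assumes a JDK that provides the windows-1252 charset (Cp1252) and java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how one character maps to different bytes.
final class RegisteredSignBytes {

    // Formats a byte array as space-separated unsigned decimal values.
    static String unsignedBytes(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(b & 0xFF); // mask so that 0xAE prints as 174, not -82
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The registered sign, written as a Unicode escape so that the
        // encoding of this source file does not matter.
        String reg = "\u00AE";

        System.out.println("Cp1252: "
                + unsignedBytes(reg.getBytes(Charset.forName("windows-1252"))));
        System.out.println("UTF-8:  "
                + unsignedBytes(reg.getBytes(StandardCharsets.UTF_8)));
    }
}
```

    Under Cp1252 the character is the single byte 174; under UTF-8 it is the two bytes 194 174, which would account for the extra 194 seen on Linux if the text is being written as UTF-8 there.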
  3. Thank you Paul Strack

    Thank you so much. I have been struggling with this for the last couple of days.

    I would also like to ask for one more bit of help. You suggested that the XML encoding for ® is: ¨. Could you post a small Java code sample that uses it?

    Thank you .
  4. Thank you Paul Strack

    First off, I made a mistake earlier ... the correct code is &#174;

    Sorry about that :)

    1) Make sure your XML header is:

    <?xml version="1.0" encoding="utf-8"?>

    2) Wherever you want ®, have your Java code print "&#174;" in the output XML.

    An example output file:

    <?xml version="1.0" encoding="utf-8"?>

    <example>
      This is a registered trademark &#174; of so-and-so
    </example>
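
    One possible way to produce that output from Java is to replace every character above 127 with a numeric character reference before writing the XML. This is only a sketch under stated assumptions: XmlNumericEscaper and escapeNonAscii are hypothetical names, and the method deliberately does not escape XML markup characters such as & or <, which a real writer would also need to handle.

```java
// Hypothetical helper: turns non-ASCII characters into XML numeric
// character references such as &#174; so the output is pure ASCII.
final class XmlNumericEscaper {

    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int codePoint = text.codePointAt(i);
            if (codePoint > 127) {
                // Decimal reference, e.g. U+00AE becomes &#174;
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            // Advance by 1 or 2 chars so surrogate pairs stay together.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String body = "This is a registered trademark \u00AE of so-and-so";
        System.out.println("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
        System.out.println("<example>" + escapeNonAscii(body) + "</example>");
    }
}
```

    Because the escaped text contains only ASCII characters, it comes out as the same bytes under UTF-8 and Cp1252 alike.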
  5. Thank you Paul Strack

    Thank you very much. You have been of great help to me.

    I understood it. But my issue is that I am constructing the packet from the XML in Java and sending it to MSMQ from Java through a socket. If I were writing to the XML, this would help, but how can I do the same when I have to write from Java to the socket? Please share your comments.

    Once again thank you in advance.
  6. Thank you Paul Strack

    It is definitely some kind of character encoding problem. Since I am not sure exactly where you are getting the XML from, and how you are sending it through the socket, I am not sure what the best solution is.

    Here is a page with the appropriate UTF-8 and Unicode character codes:

    http://home.worldonline.nl/~t876506/utf8tbl.html

    Here is a page with the Cp1252 character codes:

    http://www.microsoft.com/globaldev/reference/sbcs/1252.htm

    If you can get your raw data in the form of a byte array, you can convert the data to Unicode characters (Java's native character format) via the String constructor:

    byte[] bytes = // data as bytes ...
    String s = new String(bytes, "UTF-8"); // Convert UTF-8 to Unicode

    Once you have it in Unicode, you can convert to another text encoding using the String's getBytes() method:

    byte[] cp1252data = s.getBytes("Cp1252");
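
    Put together as a runnable sketch, the two steps might look like the following. The class and method names (Transcoder, utf8ToCp1252) are invented for illustration, the packet contents are taken from the example in the first post, and the code assumes a JDK where the windows-1252 charset is available. Note that getBytes silently substitutes a replacement byte for any character Cp1252 cannot represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: re-encodes UTF-8 bytes as Cp1252 bytes.
final class Transcoder {

    static byte[] utf8ToCp1252(byte[] utf8Bytes) {
        // Step 1: decode the raw UTF-8 bytes into Java's internal Unicode string.
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        // Step 2: encode that string again using the target encoding.
        return text.getBytes(Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // Sample packet from the thread, with the registered sign as an escape.
        byte[] utf8 = "41979\u00AE46240\u00AE0\u00AE".getBytes(StandardCharsets.UTF_8);
        byte[] cp1252 = utf8ToCp1252(utf8);

        // Each registered sign takes two bytes in UTF-8 but one in Cp1252.
        System.out.println("UTF-8 length:  " + utf8.length);
        System.out.println("Cp1252 length: " + cp1252.length);
    }
}
```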

    I suspect your raw data is in UTF-8 and is being interpreted by your target machine as Cp1252, but I am not sure what combination of encoding/decoding would solve your problem.
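
    If that suspicion is right, one option is to name the charset explicitly when writing to the socket instead of relying on the platform default, which can differ between Linux and Windows. The sketch below is an illustration only: PacketSender, send and asSeenBy are hypothetical names, and a ByteArrayOutputStream stands in for the stream that socket.getOutputStream() would return, so that the example runs without a network.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sender: always writes with an explicitly chosen charset.
final class PacketSender {

    static final Charset CP1252 = Charset.forName("windows-1252");

    // Writes the packet to the stream using the given charset.
    static void send(OutputStream out, String packet, Charset charset) throws IOException {
        Writer writer = new OutputStreamWriter(out, charset);
        writer.write(packet);
        writer.flush(); // flush only; the caller owns and closes the stream
    }

    // Simulates a round trip: encode with the sender's charset,
    // then decode with the charset the receiver assumes.
    static String asSeenBy(String packet, Charset senderCharset, Charset receiverCharset) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try {
            send(wire, packet, senderCharset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(wire.toByteArray(), receiverCharset);
    }

    public static void main(String[] args) {
        String packet = "41979\u00AE46240\u00AE0\u00AE";

        // Mismatch: sent as UTF-8, read as Cp1252 (an extra character appears).
        System.out.println(asSeenBy(packet, StandardCharsets.UTF_8, CP1252));

        // Match: sent as Cp1252, read as Cp1252 (packet arrives unchanged).
        System.out.println(asSeenBy(packet, CP1252, CP1252));
    }
}
```

    In the mismatched case each ® arrives preceded by the character with code 194, which is the symptom described in the first post. With a real socket, the same send method could be called with the socket's output stream.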
  7. Thank you. Based on your guidelines, I have identified the issue and resolved it by simply changing the character to one with a code below 127.

    You have been a great help.