Encoding non-English characters with UTF-8 on JSP (Critical!!)



  1. I am inserting Hebrew characters from a JSP into an Oracle database, and everything is fine up to that point. But when I try to retrieve the information from the database, the characters are not displayed properly (I get garbage characters). I am sure that the data stored in the database is correct, but I am not sure why there is a problem displaying it in the JSP.

     I came across a thread on TSS, https://www.theserverside.com/discussions/thread.tss?thread_id=28944, and followed the suggestions given there, such as adding

         <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>

     and also this:

         <%
             // Some JDBC and SQL statements query UTF-8 data and then ...
             String str = rs.getString("utf8_data");
             str = new String(str.getBytes("ISO-8859-1"), "UTF-8");
         %>
         <%= str %>

     Now the data is displayed partly correctly: some characters still come out as squares. Any ideas would be a great help.
  2. Hi, if you are sure that the data inserted into the Oracle DB is correct, you do not need to do any conversions on the way out. See this link: http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/

     By the way, did you make sure that the column type was NVARCHAR2 and that you used something like the code below to insert the data?

         PreparedStatement pstmt = con.prepareStatement("INSERT INTO i18n VALUES (?,?,?)");
         pstmt.setLong(1, id);                      // NUMBER column
         pstmt.setString(2, "name");                // VARCHAR2 column
         // Mark parameter 3 as national character set data before binding it
         ((OraclePreparedStatement) pstmt).setFormOfUse(3, OraclePreparedStatement.FORM_NCHAR);
         pstmt.setString(3, "some unicode string"); // NVARCHAR2 column
  3. What is the encoding of your browser? UserDefined? or UTF-8? Please check your browser settings.
  4. Hi guys, this is regarding an issue I am facing while sending UTF-8 characters to a servlet with the GET method, directly from the browser. I have made the following settings:

       1. Created a CharsetFilter, which sets the encoding of each request to UTF-8.
       2. Applied this filter in web.xml before all requests.
       3. In my servlet, while writing the response, set the content type via response.setContentType to text/html;charset=utf-8.

     With these settings, accented characters like À?ÅÆÇÈÉÊËÌ?Î??ÑÒÓÔÕÖØÙÚÛÜ?Þßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ work correctly, but Chinese characters, Arabic characters, etc. do not. However, if along with the above settings I change server.xml to have useBodyEncodingForURI="true" and/or URIEncoding="UTF-8" in the connector tag, the Chinese and Arabic characters work fine, but now the accented characters do not. I have tried every combination of the settings mentioned, but somehow only one of the two situations works at a time. Has anybody come across this problem? Any pointers would be great. I cannot use a POST request, as my servlet is the entry point to my application.

     Thanks,
     Param
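
On the conversion quoted in post 1, and the advice in post 2 that no conversion is needed on the way out: the following is a minimal sketch of what new String(str.getBytes("ISO-8859-1"), "UTF-8") does to three kinds of input. The class RedecodeDemo and its helper redecode are hypothetical names, not from the thread, and the sketch assumes the runtime provides the windows-1252 charset. Whether any of the three cases matches the original poster's setup is a guess; the third case is only one possible reason for a "partly correct" result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RedecodeDemo {
    // The conversion from post 1: take the string's ISO-8859-1 bytes
    // and reinterpret them as UTF-8.
    static String redecode(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Four Hebrew letters, written as escapes so the source file
        // encoding does not matter.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";

        // Case 1: the driver already returned correct text. Hebrew has no
        // ISO-8859-1 representation, so every letter is replaced by '?'.
        System.out.println(redecode(hebrew).equals("????")); // true

        // Case 2: UTF-8 bytes were wrongly decoded as ISO-8859-1 somewhere
        // upstream. ISO-8859-1 maps every byte to a character, so the
        // round trip is lossless and the conversion repairs the text.
        String latin1Mojibake =
                new String(hebrew.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(redecode(latin1Mojibake).equals(hebrew)); // true

        // Case 3 (assumption): the wrong decoding used windows-1252 instead.
        // Bytes 0x80-0x9F then become characters outside ISO-8859-1, so only
        // some letters survive the conversion.
        Charset cp1252 = Charset.forName("windows-1252");
        String cp1252Mojibake = new String(hebrew.getBytes(StandardCharsets.UTF_8), cp1252);
        String partly = redecode(cp1252Mojibake);
        System.out.println(partly.equals(hebrew));                  // false
        System.out.println(partly.charAt(0) == hebrew.charAt(0));   // true
    }
}
```

If the string returned by rs.getString already prints correctly in a debugger or log, case 1 applies and the conversion should simply be removed.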
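
On the GET problem in post 4: request.setCharacterEncoding, which a filter like the CharsetFilter described there would typically call, is specified to apply to the request body, so the query string is decoded according to the connector settings instead. One reading of the symptoms, offered only as an assumption, is that the browser percent-encodes typed accented characters as ISO-8859-1 and everything else as UTF-8, so no single server-side charset decodes both. The sketch below illustrates that mismatch with the standard URLEncoder and URLDecoder classes. QueryDecodeDemo and decode are hypothetical names, and the Charset overloads used here need Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class QueryDecodeDemo {
    // Stands in for the container decoding one raw query-string value
    // with whatever charset the connector is configured to use.
    static String decode(String rawValue, Charset uriCharset) {
        return URLDecoder.decode(rawValue, uriCharset);
    }

    public static void main(String[] args) {
        String accented = "\u00E9"; // e with acute accent
        String chinese = "\u4E2D";  // a CJK character

        // What a client might put on the wire for each value.
        String accentedLatin1 = URLEncoder.encode(accented, StandardCharsets.ISO_8859_1);
        String accentedUtf8 = URLEncoder.encode(accented, StandardCharsets.UTF_8);
        String chineseUtf8 = URLEncoder.encode(chinese, StandardCharsets.UTF_8);
        System.out.println(accentedLatin1); // %E9
        System.out.println(accentedUtf8);   // %C3%A9
        System.out.println(chineseUtf8);    // %E4%B8%AD

        // Server decodes as ISO-8859-1: the Latin-1 form works, UTF-8 forms break.
        System.out.println(decode(accentedLatin1, StandardCharsets.ISO_8859_1).equals(accented)); // true
        System.out.println(decode(chineseUtf8, StandardCharsets.ISO_8859_1).equals(chinese));     // false

        // Server decodes as UTF-8: UTF-8 forms work, the Latin-1 form breaks.
        System.out.println(decode(chineseUtf8, StandardCharsets.UTF_8).equals(chinese));      // true
        System.out.println(decode(accentedUtf8, StandardCharsets.UTF_8).equals(accented));    // true
        System.out.println(decode(accentedLatin1, StandardCharsets.UTF_8).equals(accented));  // false
    }
}
```

If this reading is right, the usual way out is to make sure the client always sends UTF-8 percent-encoded values (for example by building the link with explicit encoding rather than typing raw characters into the address bar) and to keep URIEncoding="UTF-8" on the connector.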