detect the encoding method of the file uploaded through HTTP



  1. Hi All

    Can my servlet / JSP program detect the encoding of a file uploaded through HTTP? Is there any HTTP header the JSP/servlet can examine to get information about the uploaded file, such as file size, filename, encoding, etc.?

    Thanks a lot in advance for your help.
  2. There's more than one encoding and MIME type involved in a file upload request. The outer request is of type multipart/form-data. There's no standard API for accessing the rest of the file data. If you look at the spec, you'll see the filename may well be sent as part of the Content-Disposition header (of one of the file upload blocks), but this is of very little use.
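
    To make the Content-Disposition point concrete, here is a minimal sketch of pulling the filename out of one part's header value. The class and method names are made up for illustration and are not part of any servlet API, and the regex ignores edge cases such as escaped quotes. Note that this header gives you a filename at best; as far as I know, browsers generally do not state the file's character encoding there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any servlet API.
final class ContentDispositionSketch {

    // Matches filename="..." (quoted) or filename=token (unquoted), case-insensitively.
    private static final Pattern FILENAME = Pattern.compile(
            "(?i)(?:^|;)\\s*filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))");

    // Returns the bare filename from a part's Content-Disposition value, or null if absent.
    static String extractFilename(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        Matcher m = FILENAME.matcher(headerValue);
        if (!m.find()) {
            return null;
        }
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Some browsers send a full client-side path; keep only the last segment.
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        return name.substring(cut + 1);
    }

    public static void main(String[] args) {
        String header = "form-data; name=\"upload\"; filename=\"C:\\docs\\report.txt\"";
        System.out.println(extractFilename(header));
    }
}
```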

    Rather than interpret these things yourself, use one of the available libs for handling file upload; the Struts framework does it, so does Turbine, and Jason Hunter's O'Reilly package works nicely if you don't want to use a full-on framework. See:
  3. Dear Brian / All

    First, thank you very much for your answer.
    I know there are a few free libs available for handling file upload through HTTP using servlets / JSP. But my program does more than just receive a file and save it. Because my system is a Unicode system, I want my servlet, if possible, to examine the encoding of the file, automatically transform it into Unicode, and save it in the db. Let's say the uploaded file is in Big5/GB2312 (Chinese) format: my servlet would detect that, transform the content into Unicode, and then save it. However, if the file is already in Unicode, no transform is done and the content is read and saved in the db directly.

    I would greatly appreciate it if someone could tell me definitively whether this is achievable or not.

    Thanks & regards
  4. Can't you examine the file AFTER it has been uploaded? You can definitely get the file size, and if your upload code saves the file under the same filename it had originally, you can get the filename from it too. As for the encoding, maybe you can get that from the file after it's saved too? Just a thought.

    Brian Pipa
    author of FileNabber, an upload/download servlet (war)
  5. Dear Brian

    Thanks for your advice. I examined the API of the "File" class. There is no method for checking the encoding of a file.
    Maybe the only alternative is to force the user to upload the file in a specific encoding. Thank you very much for your replies so far.
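
    For what it's worth, a partial heuristic is possible on the raw bytes, even though File offers nothing. The sketch below is only that, under stated assumptions: it checks for a byte order mark, then tests whether the bytes are strictly valid UTF-8, and otherwise falls back to an encoding the caller has to supply (for example Big5). The class and method names are hypothetical. It cannot tell Big5 from GB2312, it does not detect UTF-16 without a BOM, and pure ASCII is reported as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; a heuristic, not a reliable detector.
final class EncodingSniffer {

    // Guesses the charset of the uploaded bytes, or returns the caller's fallback.
    static Charset guess(byte[] data, Charset fallback) {
        // 1. A byte order mark, if present, identifies the Unicode encoding.
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(data, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(data, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;
        // 2. Text in a legacy double-byte encoding is rarely valid UTF-8 by accident.
        if (decodesStrictly(data, StandardCharsets.UTF_8)) return StandardCharsets.UTF_8;
        // 3. Otherwise we cannot know; use the encoding the caller assumes.
        return fallback;
    }

    // Decodes the bytes into a Java String (which is Unicode internally).
    static String toUnicode(byte[] data, Charset fallback) {
        String text = new String(data, guess(data, fallback));
        // Drop a leading BOM character so it does not end up in the db.
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // True only if every byte sequence is well-formed in the given charset.
    private static boolean decodesStrictly(byte[] data, Charset cs) {
        CharsetDecoder decoder = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Big5 lives in the JDK's extended charsets, so guard against its absence.
        Charset fallback = Charset.isSupported("Big5")
                ? Charset.forName("Big5") : StandardCharsets.ISO_8859_1;
        byte[] upload = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println(guess(upload, fallback));
    }
}
```

    Forcing (or asking) the user to state the encoding is still the safer route; the fallback parameter above is where that user-supplied value would go.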

  6. You say "transform it into unicode and then save it. However, if the file is in unicode, no transform is done"

    Why not just transform all of it to Unicode regardless? What happens when you transform text into Unicode if it already is Unicode? I would think it would remain unchanged (though I really don't know). If so, you could just convert everything to Unicode and not worry about the encoding type.

    I'll admit though, I don't thoroughly understand what it is you are trying to do, so my "answer" may be way off :)
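
    A small experiment can settle the "what happens" question, at least in Java. In this sketch (the class name is hypothetical, and ISO-8859-1 merely stands in for "some wrong encoding"), converting to Unicode means decoding bytes with a named charset. Decoding UTF-8 bytes as UTF-8 leaves the text unchanged, but decoding the same bytes under a different assumption garbles it, so the source encoding still has to be known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; shows that decoding needs the right source charset.
final class DecodeDemo {

    // "Transforming to Unicode" in Java: interpret raw bytes using an assumed charset.
    static String decode(byte[] raw, Charset assumed) {
        return new String(raw, assumed);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Correct assumption: the text survives unchanged.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).equals(original));

        // Wrong assumption: each byte becomes its own character, so the text is garbled.
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```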