Consider the following:
out.write( myString.getBytes() );
This seems to be suboptimal since the String should be able to convert the bytes as it writes them into the stream. Is there any way to avoid creating this extra byte buffer (the one created by #getBytes)?
Thanks in advance!
R.
String#getBytes and OutputStream (7 messages)
- Posted by: Robert DiFalco
- Posted on: August 22 2004 12:06 EDT
Re: String#getBytes and OutputStream
- Posted by: Jacob Rolph
- Posted on: August 23 2004 14:08 EDT
- in response to Robert DiFalco
Since strings are immutable objects, the JVM is going to duplicate the instance data (the string) whenever it gives you a reference to something that could change the contents of the string (in this case the byte array). Have you tried wrapping your output stream with a Writer? This would allow you to write the string directly to the stream without converting it to bytes yourself, leaving the conversion to the lower-level APIs. I haven't tested this myself; my only concern would be that the lower-level APIs would also convert it to a byte array using the same method.
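For reference, the Writer suggestion could look roughly like the sketch below. The class and method names (WriterExample, writeString) are made up for illustration, US-ASCII is an assumed encoding, and StandardCharsets requires a JDK newer than the ones available when this thread was written. Whether the Writer avoids temporary copies internally depends on the JDK implementation, so this shows the API shape rather than a guaranteed saving.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class WriterExample {
    // Hypothetical helper: sends text to out through a Writer
    // instead of calling text.getBytes() in user code.
    static void writeString(OutputStream out, String text) throws IOException {
        // Assumed encoding; pick whatever the receiving side expects.
        Writer writer = new OutputStreamWriter(out, StandardCharsets.US_ASCII);
        writer.write(text);
        // Flush so the Writer's internal buffer reaches the underlying stream.
        // The stream itself is left open for the caller.
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, "hello");
        System.out.println(out.size() + " bytes written");
    }
}
```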
Re: String#getBytes and OutputStream
- Posted by: Robert DiFalco
- Posted on: August 23 2004 16:40 EDT
- in response to Jacob Rolph
It's clear why #getBytes returns a new byte array. Beyond your explanation, String doesn't even use a byte array internally. And a Writer won't help much: it won't stream the bytes directly to an OutputStream.
What you would need is something more like this:
String:
public void writeBytes( ByteBuffer buffer );
public void writeBytes( ByteBuffer buffer, String charsetName );
Or even:
public void writeBytes( OutputStream out );
public void writeBytes( OutputStream out, String charsetName );
Either of these would have worked.
R.
String#getBytes and OutputStream
- Posted by: Cameron Purdy
- Posted on: August 25 2004 13:55 EDT
- in response to Robert DiFalco
First, you should not use getBytes().
Second, unless you expect the size to exceed 64K characters, you should use DataOutput.writeUTF().
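A small sketch of where that size limit bites, assuming a plain DataOutputStream over an in-memory buffer. The class and helper names (WriteUtfLimit, tryWriteUtf) are illustrative only. Note that the limit applies to the encoded length (65535 bytes), not the character count, so strings containing non-ASCII characters reach it sooner.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.util.Arrays;

public class WriteUtfLimit {
    // Hypothetical helper: returns the number of bytes writeUTF produced,
    // or -1 if the encoded string was too long for the 2-byte length prefix.
    static int tryWriteUtf(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeUTF(text);
        } catch (UTFDataFormatException tooLong) {
            return -1;
        }
        out.flush();
        return bytes.size();
    }

    // Builds a string of the given length consisting only of 'a'.
    static String repeatA(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryWriteUtf("hello"));        // 7: 2 length bytes + 5 data bytes
        System.out.println(tryWriteUtf(repeatA(65536))); // -1: over the limit
    }
}
```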
Peace,
Cameron Purdy
Tangosol, Inc.
Coherence: Shared Memories for J2EE Clusters
String#getBytes and OutputStream
- Posted by: Robert DiFalco
- Posted on: August 26 2004 13:45 EDT
- in response to Cameron Purdy
You should use #writeUTF unless you are writing ASCII Strings (as bytes) to a Socket expecting ASCII bytes (such as a telnet host). ;)
That's kinda why I put #getBytes in the subject line, because that is the behavior I needed: bytes.
But if you think about it, from a performance perspective #writeUTF (as it is implemented in the Sun JDK) is just as bad as #getBytes, if not worse. It's fine for small strings, but for large ones in a highly scaled application there will be a lot of large temporary allocations. If you want unencoded bytes, #getBytes will allocate an extra buffer unnecessarily. If you want machine-independent UTF-encoded bytes, then #writeUTF will create TWO extra buffers: one for the char array and one for the byte array.
You can imagine that if you operated on the string rather than the Stream, you could implement this much more optimally. For example, instead of:
out.writeUTF( string );
it were:
string.writeUTF( out );
Or....
string.writeBytes( out, encoding );
This approach (the "tell, don't ask" approach) seems better because it lets these methods be implemented so that they are completely streamed, without temporary buffers.
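String has no such method, but the same effect can be approximated from the outside. The following is only a sketch: StringStreamer and its static writeBytes are hypothetical stand-ins for the proposed string.writeBytes( out, encoding ), built on java.nio's CharsetEncoder. It encodes through one small, reused scratch buffer (1024 bytes is an arbitrary choice), so the temporary allocation stays constant regardless of string length. The encoder's default behavior of reporting unmappable characters as errors is kept.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class StringStreamer {
    // Hypothetical stand-in for the proposed string.writeBytes( out, encoding ).
    static void writeBytes(String text, OutputStream out, Charset charset) throws IOException {
        CharsetEncoder encoder = charset.newEncoder();
        CharBuffer chars = CharBuffer.wrap(text);     // read-only view of the String
        ByteBuffer chunk = ByteBuffer.allocate(1024); // fixed-size scratch buffer, reused

        CoderResult result;
        do {
            // Encode as much as fits into the chunk, then hand it to the stream.
            result = encoder.encode(chars, chunk, true);
            if (result.isError()) {
                result.throwException(); // unmappable or malformed input
            }
            drain(chunk, out);
        } while (result.isOverflow());

        do {
            // Let the encoder emit any trailing bytes it still holds.
            result = encoder.flush(chunk);
            drain(chunk, out);
        } while (result.isOverflow());
    }

    // Writes the filled part of the chunk to the stream and resets it for reuse.
    private static void drain(ByteBuffer chunk, OutputStream out) throws IOException {
        chunk.flip();
        out.write(chunk.array(), chunk.arrayOffset() + chunk.position(), chunk.remaining());
        chunk.clear();
    }

    public static void main(String[] args) throws IOException {
        writeBytes("hello, stream\n", System.out, StandardCharsets.US_ASCII);
        System.out.flush();
    }
}
```

The trade-off is more calls to out.write() for long strings, so wrapping the target in a BufferedOutputStream may be worthwhile when it is unbuffered.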
R.
String#getBytes and OutputStream
- Posted by: Cameron Purdy
- Posted on: August 26 2004 14:36 EDT
- in response to Robert DiFalco
Hi Robert,
Going to ASCII from a String? You should be using a Writer then on top of the OutputStream with an encoding of ASCII.
The point is that String.getBytes() and DataOutput.writeBytes() are just _wrong_ because they lose data.
(OTOH, if you know your data is in the range 0x00-0x7F, then maybe you know what you are doing and it doesn't matter.)
It's obvious that writeUTF isn't what you need for a telnet host, though ;-)
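To make the data-loss point concrete, here is a short sketch. The class and method names are illustrative, and it uses an explicit US-ASCII charset rather than the platform default of the no-argument getBytes(), so that the result does not depend on the machine it runs on. Characters outside the target range are replaced by '?' in one case and truncated to their low byte in the other.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class LossyEncoding {
    // getBytes with US-ASCII: unmappable characters become '?' (0x3F).
    static byte[] asciiBytes(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // DataOutput.writeBytes: the high byte of every char is silently dropped.
    static byte[] lowBytes(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeBytes(text);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String euro = "\u20ac"; // the euro sign, outside 0x00-0x7F
        System.out.printf("getBytes:   0x%02X%n", asciiBytes(euro)[0] & 0xFF);
        System.out.printf("writeBytes: 0x%02X%n", lowBytes(euro)[0] & 0xFF);
    }
}
```

Neither result can be decoded back to the original character, which is the loss being described; pure 0x00-0x7F input passes through both paths unchanged.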
Peace,
Cameron Purdy
Tangosol, Inc.
Coherence: Shared Memories for J2EE Clusters
String#getBytes and OutputStream
- Posted by: Jose Ramon Huerga Ayuso
- Posted on: August 29 2004 16:38 EDT
- in response to Robert DiFalco
Consider the following: out.write( myString.getBytes() ); This seems to be suboptimal since the String should be able to convert the bytes as it writes them into the stream. Is there any way to avoid creating this extra byte buffer (the one created by #getBytes)? Thanks in advance! R.
Maybe when the JIT compiler compiles this class to native code, it will inline both calls (out.write() and myString.getBytes()) into a single function, so I don't think you need to worry about performance problems.
Just my 2 cents,
Jose Ramon Huerga
http://www.terra.es/personal/jrhuerga
String#getBytes and OutputStream
- Posted by: Robert DiFalco
- Posted on: August 30 2004 20:05 EDT
- in response to Jose Ramon Huerga Ayuso
Uh.....no.