Friday, June 27, 2008

Converting Between Strings (Unicode) and Other Character Set Encodings

Posted by Nguyen, Lam D

Many network protocols and file formats store their characters in a byte-oriented character set such as ISO-8859-1 (ISO-Latin-1). Java, however, represents strings internally as Unicode (UTF-16). This example demonstrates how to convert ISO-8859-1 encoded bytes in a ByteBuffer to a string in a CharBuffer and vice versa.

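import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

// Create the encoder and decoder for ISO-8859-1
Charset charset = Charset.forName("ISO-8859-1");
CharsetDecoder decoder = charset.newDecoder();
CharsetEncoder encoder = charset.newEncoder();
try {
    // Convert a string to ISO-8859-1 bytes in a ByteBuffer.
    // The new ByteBuffer is ready to be read.
    ByteBuffer bbuf = encoder.encode(CharBuffer.wrap("a string"));
    // Convert ISO-8859-1 bytes in a ByteBuffer to a CharBuffer and then to a string.
    // The new CharBuffer is ready to be read.
    CharBuffer cbuf = decoder.decode(bbuf);
    String s = cbuf.toString();
} catch (CharacterCodingException e) {
    // Thrown for malformed input or a character that ISO-8859-1 cannot represent
    e.printStackTrace();
}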
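
To make the catch block above more concrete, here is a minimal, self-contained sketch of the same round trip. The class name Latin1RoundTrip and the helpers toLatin1 and fromLatin1 are made up for illustration and are not part of the original example. It relies on the fact that a freshly created encoder reports unmappable characters by default, so a character outside ISO-8859-1, such as the euro sign, should produce a CharacterCodingException.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

public class Latin1RoundTrip {
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Hypothetical helper: encode a string to ISO-8859-1 bytes.
    // A new encoder is created per call because encoders are not thread-safe.
    public static byte[] toLatin1(String text) throws CharacterCodingException {
        ByteBuffer bbuf = LATIN1.newEncoder().encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[bbuf.remaining()];
        bbuf.get(bytes);
        return bytes;
    }

    // Hypothetical helper: decode ISO-8859-1 bytes back to a string.
    public static String fromLatin1(byte[] bytes) throws CharacterCodingException {
        return LATIN1.newDecoder().decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        try {
            byte[] bytes = toLatin1("caf\u00e9");   // every character fits in one byte
            System.out.println(bytes.length + " bytes -> " + fromLatin1(bytes));
            toLatin1("\u20ac");                      // euro sign is not in ISO-8859-1
        } catch (CharacterCodingException e) {
            System.out.println("Could not convert: " + e);
        }
    }
}
```

If silent substitution is preferred over an exception, the encoder's onUnmappableCharacter method accepts CodingErrorAction.REPLACE instead of the default REPORT.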

In the example above, the encoding and decoding methods allocated new buffers (a ByteBuffer when encoding, a CharBuffer when decoding) to hold the converted data. Moreover, the newly allocated buffers are non-direct. The encoder and decoder also provide methods that write into a supplied buffer rather than create one. Here's an example that uses these methods:

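// Create a direct ByteBuffer.
// This buffer will be used to send and receive data from channels.
ByteBuffer bbuf = ByteBuffer.allocateDirect(1024);
// Create a non-direct CharBuffer
CharBuffer cbuf = CharBuffer.allocate(1024);
// Put the characters to convert into cbuf, then flip it so they can be read
cbuf.put("a string");
cbuf.flip();
// Reset the coders in case they were already used, as in the first example
encoder.reset();
decoder.reset();
// Convert characters in cbuf to bytes in bbuf.
// false means more input may follow; pass true and then call flush() for the last chunk.
encoder.encode(cbuf, bbuf, false);
// flip bbuf before reading from it
bbuf.flip();
// clear cbuf so the decoder has room to write into it
cbuf.clear();
// Convert bytes in bbuf to characters in cbuf
decoder.decode(bbuf, cbuf, false);
// flip cbuf before reading from it
cbuf.flip();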
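
The three-argument methods return a CoderResult rather than throwing, so the caller has to check it. Below is a minimal sketch of one way to do that when the supplied ByteBuffer is smaller than the data: encode, drain the buffer, and repeat until the encoder reports underflow. The class ReusedBufferEncoding, the method encodeInChunks, and the ByteArrayOutputStream standing in for a channel are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class ReusedBufferEncoding {
    // Hypothetical helper: encode text through one small, reused direct ByteBuffer.
    public static byte[] encodeInChunks(String text, int bufferSize)
            throws CharacterCodingException {
        if (bufferSize < 1) {
            throw new IllegalArgumentException("bufferSize must be at least 1");
        }
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocateDirect(bufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for a channel

        CoderResult result;
        do {
            // true: all remaining input is already in the CharBuffer
            result = encoder.encode(in, out, true);
            if (result.isError()) {
                result.throwException(); // malformed or unmappable input
            }
            drain(out, sink);            // make room whether we overflowed or finished
        } while (result.isOverflow());

        do {
            // flush any bytes the encoder still holds internally
            result = encoder.flush(out);
            drain(out, sink);
        } while (result.isOverflow());

        return sink.toByteArray();
    }

    // Flip the buffer, copy its contents to the sink, and clear it for reuse.
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        while (out.hasRemaining()) {
            sink.write(out.get());
        }
        out.clear();
    }
}
```

For example, encodeInChunks("hello world", 4) pushes the 11 characters through a 4-byte buffer in several passes. With a real channel, the drain step would call the channel's write method on the flipped buffer instead of copying byte by byte.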

Hope you find it helpful :D - I found this post at www.exampledepot.com


