Re: DOM wrappers and UTF-8 encoding
http://www.cl.cam.ac.uk/~mgk25/unicode.html gives a very thorough
explanation of UTF-8 encoding and decoding. From that text, it appears
that you have to make sure your application actually decodes UTF-8 byte
sequences properly. This may not happen automagically. You did not
mention which language or program you are using to read the XML file.
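
Since the language was not given, here is a minimal sketch in Python
(an assumption, as is the use of the standard xml.dom.minidom module)
of what should happen with the numeric character references from the
quoted message. The sample XML and the helper name element_text are
made up for illustration. The parser hands back a Unicode string; the
"funny character" usually appears only later, when that string is
written out in an encoding that cannot represent it.

```python
import xml.dom.minidom

# Hypothetical sample modeled on the quoted message: &#12473; is the
# decimal character reference for U+30B9 (a katakana letter).
XML_BYTES = b'<?xml version="1.0" encoding="UTF-8"?><Data15>&#12473;</Data15>'


def element_text(xml_bytes, tag):
    """Return the text of the first <tag> element as a Unicode string."""
    doc = xml.dom.minidom.parseString(xml_bytes)
    # nodeValue is the Python DOM counterpart of getNodeValue.
    return doc.getElementsByTagName(tag)[0].firstChild.nodeValue


text = element_text(XML_BYTES, "Data15")

# The parser has already resolved the reference to one code point.
code_point = ord(text)  # 12473

# Encoding as UTF-8 yields the three-byte sequence for U+30B9.
utf8_bytes = text.encode("utf-8")  # b'\xe3\x82\xb9'

# An 8-bit encoding such as Latin-1 cannot represent the character;
# with errors="replace" it degrades to a placeholder byte.
latin1_bytes = text.encode("latin-1", errors="replace")  # b'?'
```

If the value looks right inside the program but wrong on screen, the
output encoding (console, file, or web page) is the next thing to check.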
Ed.
Poorav Chaudhari wrote:
>
> I have an xml file that contains utf-8 encoded text in &#12345;
> format. to read the data between the tags in xml, i am using the
> standard DOM method getNodeValue. if i put a simple string between the
> tags my program spits out the exact string. if i put &#97; (97 is
> the ascii value for 'a') then the entire &#97; string is converted to
> 'a'. so basically if there is any other value over 128, it spits out
> some funny character.
>
> supposing i had the following xml
>
> <Data10>Data 10 &#97;</Data10>
>
> the result it spits out is Data 10 a
>
> so if the xml contains
>
> <Data15>&#12473;</Data15>
>
> i get a funny character.
>
> if someone has any idea what is going wrong, please reply soon.
> thanks
>
> poorav