3
votes

I need to convert a UTF-8 string to an ISO-8859-1 string using VB.NET.

Any example?


I have tried the Latin function, but it does not work: I get an incorrect string.

My use case is sending an SMS through an API.

Now I have this code:

        baseurl = "http://www.myweb.com/api/sendsms.php"
        client = New WebClient
        client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")
        client.Encoding = System.Text.Encoding.GetEncoding("ISO-8859-1")
        client.QueryString.Add("user", user)
        client.QueryString.Add("password", pass)
        client.QueryString.Add("alias", myAlias)
        client.QueryString.Add("dest",  mobile)
        textoSms = Me.mmTexto.Text
        textoSms = System.Web.HttpUtility.UrlEncode(textoSms)
        client.QueryString.Add("message", textoSms)
        data = client.OpenRead(baseurl)
        reader = New StreamReader(data)
        s = reader.ReadToEnd()
        data.Close()
        reader.Close()

But it does not work: the messages arrive garbled. For example:

If I write mañana, it arrives as maa ana.

If I write aigüa, it arrives as aiga.

5
None. I did not get any solution that worked 100%; otherwise I would have marked it as solved. Sorry. - aco
No, I asked, what have you tried? This "question" is just asking for someone to tell you what to do, with no evidence of prior research. - Lightness Races in Orbit
I do not understand what you said. All I know is that the question was opened 3 years ago! - aco
I don't see how what I said was complicated or difficult to understand. - Lightness Races in Orbit
Are you saying that the answer by Jon Skeet did not in fact answer your question? Because if it didn't, the question doesn't make any sense. UTF-8 and ISO-8859-1 are encodings of Unicode text. In .NET, a string is always in Unicode format in memory; it's only when you want to convert it to a byte array (usually because you need to store it in a binary file or send it over the network) that encoding comes into play. Jon Skeet's answer was to the point and correct for the question at hand. If not, I'm closing this as too-localized, since clearly the question is wrong then. - Lasse V. Karlsen

5 Answers

8
votes

How about:

' utf8 is a Byte() holding the UTF-8 data; Convert takes source, destination, bytes.
Dim converted As Byte() = Encoding.Convert(Encoding.UTF8, _
                                           Encoding.GetEncoding(28591), _
                                           utf8)

That assumes that when you say "UTF8 string" you mean "binary data which is the UTF-8 representation of some text". If you mean something else, please specify :)

Note that ISO-8859-1 only represents a tiny proportion of full Unicode. IIRC, you'll end up with "?" for any character from the source data which isn't available in ISO-8859-1.
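
To make the conversion and the substitution behaviour concrete, here is a minimal sketch. The module name and the sample text are made up for illustration: the sample contains "ñ", which ISO-8859-1 can represent, and "€", which it cannot.

```vbnet
Imports System
Imports System.Text

Module Latin1ConversionSketch
    Sub Main()
        ' 28591 is the code page identifier for ISO-8859-1 (Latin-1).
        Dim latin1 As Encoding = Encoding.GetEncoding(28591)

        ' Hypothetical sample: ChrW(&HF1) is "ñ", ChrW(&H20AC) is "€".
        Dim sample As String = "ma" & ChrW(&HF1) & "ana " & ChrW(&H20AC)

        ' "UTF-8 string" in the sense used above: the UTF-8 bytes of the text.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(sample)

        ' Source encoding first, then destination encoding, then the bytes.
        Dim latin1Bytes As Byte() = Encoding.Convert(Encoding.UTF8, latin1, utf8Bytes)

        ' "ñ" takes two bytes in UTF-8 (C3 B1) but one byte in ISO-8859-1 (F1).
        Console.WriteLine(BitConverter.ToString(utf8Bytes))
        Console.WriteLine(BitConverter.ToString(latin1Bytes))

        ' Decoding shows what survived; "€" has no ISO-8859-1 code point,
        ' so it comes back as a substitute character rather than "€".
        Console.WriteLine(latin1.GetString(latin1Bytes))
    End Sub
End Module
```

Comparing the two hex dumps is usually the quickest way to see which characters were converted cleanly and which were substituted.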

3
votes

The encoding ISO-8859-1 is more commonly called Latin-1. You can get this encoding by doing the following

Dim latin1 = Text.Encoding.GetEncoding(&H6FAF)

The full conversion can be done by the following

Public Function ConvertUtf8ToLatin1(ByVal bytes As Byte()) As Byte()
  ' &H6FAF (28591 decimal) is the code page for ISO-8859-1.
  Dim latin1 = Text.Encoding.GetEncoding(&H6FAF)
  Return Text.Encoding.Convert(Text.Encoding.UTF8, latin1, bytes)
End Function

EDIT

As Jon pointed out, it may be easier for people to remember the decimal number 28591 rather than the hex number &H6FAF.
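
For the SMS use case in the question, the text ends up in a URL query string, so the percent-escaping step may matter as much as the byte conversion. The sketch below is only an illustration and rests on an assumption the question does not confirm: that the SMS API expects escapes built from ISO-8859-1 bytes. The module name and message text are hypothetical, and on the .NET Framework the project needs a reference to System.Web.

```vbnet
Imports System
Imports System.Text
Imports System.Web

Module QueryStringEncodingSketch
    Sub Main()
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")

        ' Hypothetical message text: "mañana" (ChrW(&HF1) is "ñ").
        Dim message As String = "ma" & ChrW(&HF1) & "ana"

        ' UrlEncode(String) escapes the UTF-8 bytes of the text,
        ' so "ñ" turns into two escaped bytes.
        Dim utf8Escaped As String = HttpUtility.UrlEncode(message)

        ' UrlEncode(String, Encoding) escapes the bytes of the given
        ' encoding instead, so "ñ" turns into one escaped byte.
        Dim latin1Escaped As String = HttpUtility.UrlEncode(message, latin1)

        Console.WriteLine(utf8Escaped)
        Console.WriteLine(latin1Escaped)
    End Sub
End Module
```

If the receiving side decodes the escapes as ISO-8859-1, the second form is the one that would round-trip; if it decodes them as UTF-8, the first form would.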

1
votes

My guess is that System.Text.Encoding.GetEncoding("ISO-8859-1") does not support ñ; in that case you would need to use another encoding for your SMS.

Please read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

0
votes

http://msdn.microsoft.com/en-us/library/system.text.encoding.convert.aspx

Try this, with the variable "input" holding the UTF-8 data as a byte array:

VB.NET:

Dim result As Byte() = Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding("iso-8859-1"), input)

C#:

byte[] result = Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding("iso-8859-1"), input);
0
votes

I don't know if this should be posted here, but I made a small function in C# to check whether a string can be represented in the target encoding.

Hope it helps.

/// <summary>
/// Function for checking if a string can support the target encoding type
/// </summary>
/// <param name="text">The text to check</param>
/// <param name="targetEncoding">The target encoding</param>
/// <returns>True if the encoding supports the string and false if it does not</returns>
public bool SupportsEncoding(string text, Encoding targetEncoding)
{
    var btext = Encoding.Unicode.GetBytes(text);
    var bencodedtext = Encoding.Convert(Encoding.Unicode, targetEncoding, btext);

    var checktext = targetEncoding.GetString(bencodedtext);
    return checktext == text;
}

//Call the function demo with ISO-8859-1/Latin-1
if (SupportsEncoding("some text...", Encoding.GetEncoding("ISO-8859-1")))
{
    //The encoding is supported
}
else
{
    //The encoding is not supported 
}
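
A possible VB.NET variant of the same check is sketched below. Instead of comparing a round-tripped string, it asks for an encoder that throws when a character cannot be represented. The helper name CanEncode and the sample strings are hypothetical.

```vbnet
Imports System
Imports System.Text

Module StrictEncodingCheckSketch
    ' Hypothetical helper: True when every character of text can be encoded
    ' in the named encoding without substitution.
    Function CanEncode(ByVal text As String, ByVal encodingName As String) As Boolean
        ' Request an encoding that throws instead of substituting characters.
        Dim strict As Encoding = Encoding.GetEncoding(encodingName, _
            EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback)
        Try
            strict.GetBytes(text)
            Return True
        Catch ex As EncoderFallbackException
            Return False
        End Try
    End Function

    Sub Main()
        ' "ñ" (U+00F1) is part of ISO-8859-1; "€" (U+20AC) is not.
        Console.WriteLine(CanEncode("ma" & ChrW(&HF1) & "ana", "ISO-8859-1"))
        Console.WriteLine(CanEncode("5 " & ChrW(&H20AC), "ISO-8859-1"))
    End Sub
End Module
```

The exception-based form avoids a second decode, at the cost of using exceptions for control flow; for short SMS texts either approach should be adequate.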