2
votes

In VB.Net, is there a way of auto-detecting the culture of a string representation of a number? I'll explain the situation:

Our asp.net web site receives xml data feeds for boat data. Most of the time, the number format for the prices use either a simple non-formatted integer e.g. "999000". That's easy for us to process.

Occaisionally, there are commas for thousands separators and periods for the decimal point. Also, that's fine as our data import understands this. Example "999,000.00".

We're starting to get some data from France where some of the prices have been entered with the periods and thousands separators the other way around as that's the way it's done in many European countries. E.g. "999.000,00". This is where our system would interpret that as nine hundred and ninety nine pounds instead of the nine hundred and ninety nine thousand pounds that was intended.

Unfortunately, the data feed contains prices in a mixture of the formats without any culture indicator on each one. Does anyone know of any in-built .net functions that will auto-detect the culture of a string number based on where the period and comma are?

4
Think about what you're asking, and you'll realize the answer is, "no, of course not". Instead, maybe you should define your feed to accept integers only. In fact, what is the currency unit for the prices? EUR? Maybe some of those numbers are Francs?John Saunders
I agree tha the best method would be to more clearly define the feed format. We're working on a new version - but until that's complete, we're stuck with making the best of this one. The currency unit is defined in a separate field of the feed.Simon White
Then maybe the currency can inform your opinion of the price format?John Saunders
Is there any accepted answer to this? It would be nice if you could accept one of those and/or comment it, so we can all follow in your footsteps.tfrascaroli
TIMINeutron, yes I apologise for not returning to this until now. Thank you all for replies. I think ultimately John Saunders has the answer - we just need to get on with fixing the feed format and documenting it well so it's clear which format should be used.Simon White

4 Answers

3
votes

I found a thread on vbforums suggesting how to get the decimal separator by culture.

I made a converter that converts between different files using oledb, excel object, mysql and more. Using "decimalSeparator" was the solution to get the number format working correctly when I converted to Excel and Access files.

Dim decimalSeparator As String = Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator

cellValue = cellValue.Replace(".", decimalSeparator ).Replace(",", decimalSeparator )
2
votes

There is no built-in way to determine the CultureInfo from a numeric string, as far as I know. And I seriously doubt it'll ever be, because there is no 100% safe way to do it.

Until you find a better solution (eg: some change on the sender-side), I guess the best you can do is to decrease the chances of error in two steps:

1) Input data cleanup and standardization:

Dim input as String = " 99 9.000,00 "
' This way you can remove unwanted characters (anything that is not a digit, and the following symbols: ".", "-", ",")
Dim fixedInput as String = Regex.Replace(input, "[^\d-,\.]", "")
' fixedInput now is "999.000,00"

2) Guess yourself the format:

Dim indexOfDot as Integer = fixedInput.IndexOf(".")
Dim indexOfComma as Integer = fixedInput.IndexOf(",")
Dim cultureTestOrder as List(Of CultureInfo) = new List(Of CultureInfo)
Dim parsingResult as Double?
Try
    If indexOfDot > 0 And indexOfComma > 0 Then
        ' There are both the dot and the comma..let's check their order
        If indexOfDot > indexOfComma Then
            ' The dot comes after the comma. It should be en-US like Culture
            parsingResult = Double.Parse(fixedInput, NumberStyles.Number, CultureInfo.GetCultureInfo("en-US"))
        Else
            ' The dot comes after the comma. It should be it-IT like Culture
            parsingResult = Double.Parse(fixedInput, NumberStyles.Number, CultureInfo.GetCultureInfo("it-IT"))
        End If
    Else If indexOfDot = fixedInput.Length-3 Then
        ' There is only the dot! And it is followed by exactly two digits..it should be en-US like Culture
        parsingResult = Double.Parse(fixedInput, NumberStyles.Number, CultureInfo.GetCultureInfo("en-US"))
    Else If indexOfComma = fixedInput.Length-3 Then
        ' There is only the comma! And it is followed by exactly two digits..it should be en-US like Culture
        parsingResult = Double.Parse(fixedInput, NumberStyles.Number, CultureInfo.GetCultureInfo("it-IT"))
    End If
Catch
End Try
If Not parsingResult.HasValue Then
    Try
        ' There is no dot or comma, or the parsing failed for some reason. Let's try a less specific parsing.
        parsingResult = Double.Parse(fixedInput, NumberStyles.Any, NumberFormatInfo.InvariantInfo)
    Catch
    End Try 
End If
If Not parsingResult.HasValue Then
    ' Conversion not possible, throw exception or do something else
Else
    ' Use parsingResult.Value
End If

You are not 100% safe this way, but it should be still better than your current code (and at least works as expected on the example data your provided).

1
votes

Without knowing the specifics of how many differne ways the string can be formatted... This works for the three examples you gave

Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    Dim v As String = "999.123,45"
    Debug.WriteLine(foo(v))

    v = "999,123.45"
    Debug.WriteLine(foo(v))

    v = "999123"
    Debug.WriteLine(foo(v))
End Sub

Private Function foo(value As String) As Double
    Dim style As NumberStyles = NumberStyles.AllowThousands Or NumberStyles.AllowDecimalPoint
    Dim culture As CultureInfo = CultureInfo.InvariantCulture
    Dim rv As Double
    If Double.TryParse(value, style, culture, rv) Then
        Debug.WriteLine(",.Converted '{0}' to {1}.", value, rv)
    Else
        Dim styleES As NumberStyles = NumberStyles.AllowThousands Or NumberStyles.AllowDecimalPoint
        Dim cultureES As CultureInfo = CultureInfo.CreateSpecificCulture("es-ES")

        If Double.TryParse(value, styleES, cultureES, rv) Then
            Debug.WriteLine(".,Converted '{0}' to {1}.", value, rv)
        Else
            Throw New ArgumentException
        End If
    End If
    Return rv
End Function
1
votes

'This variable is True in case the O.S. Regional settings use a dot as decimal separator, false in the other cases.

Dim bAmerican as boolean=Cdec("0,2") > 1

Original post: https://www.linkedin.com/groups/8141257/8141257-6347113651027079171