1
votes

EDIT: Fix from Sharondio: Is the "-- Enable High ASCII characters and Unicode for data sources configured for non-Latin characters" setting checked in the cfadmin datasource advanced settings? The complete path is CFAdmin --> Data & Services --> Datasources --> Click on your datasource --> Click Show Advanced Settings --> check the box as indicated above.

I'm trying to create a batch import page (in Coldfusion) that will take in a CSV file and insert the data into a SQL Server database. The CSV file can include data in a number of languages, including ones that don't use Roman characters, such as Chinese, Korean, Hindi and Thai.

When I run the import, anything with non-Roman characters is displayed in SQL Server Management Studio as a string of question marks, and attempting to output the data in my Coldfusion application also shows question marks. Previously entered Arabic data is displayed properly on the website and in the management studio, so it doesn't look like an encoding issue on my site or database.

I've tried UTF-8 and UTF-16 encoded CSV files (created in Excel and Notepad), and neither worked. I'm working with SQL Server 2000; collation is set to SQL_Latin1_General_CP1_CI_AS. The column I'm inserting into is ntext.

After some reading, I tried to modify my insert statement to have the N character before the inserted string (ie. insert into table_blah (column) values (N'stuff')), but that didn't make a difference either.

At this point I'm looking for any tips or guidance anyone can provide. Thanks.

Edit @Jake Feasel

Table I'm trying to insert into:

ID             int
Language_ID    int
Value          ntext

As far as what I'm trying to insert, it's a CSV in the following format:

English    French      Chinese    Hindi    Thai      German
Course     cours       课程        कोर्स      หลักสูตร    Kurs
Student    étudiant    学生        छात्र      นักเรียน     Schüler

Which is being inserted into the database via a Coldfusion query (which loops over the imported CSV and inserts the appropriate language ID value and translated value):

<cfquery name="insert_values" datasource="#APPLICATION.DSN#">
    INSERT INTO tbl_language_value (Language_ID, Value)
    VALUES (
    <cfqueryparam cfsqltype="cf_sql_integer" value="3" />,
    <cfqueryparam cfsqltype="cf_sql_varchar" value="นักเรียน" 
    )
SELECT SCOPE_IDENTITY() AS 'New_ID';
</cfquery>

When I dump the results of the query, I can see that the query itself is taking the proper characters for non-Roman languages, but for some reason they aren't being inserted into the database that way.

1
Seybsen - it's my understanding that collation is for sorting, not storage. The fact that I have existing Arabic text in the database that is stored/displayed properly seems to back this up.shimmoril
Could you provide your current table definition and a sample of the data you're having trouble inserting?Jake Feasel
Is the "-- Enable High ASCII characters and Unicode for data sources configured for non-Latin characters" setting checked in the cfadmin datasource advanced settings?Sharondio
@Sharondio - It was not, and now that it is, the import is working correctly! Thanks so much, I never would have found that.shimmoril
Sharondio.... nice catch. Most folks forget that one.Mark A Kruger

1 Answers

0
votes

If you're using a version of ColdFusion prior to CF10, you'll need to enable String format in the ColdFusion Adminintrator Datasource settings for the database you're working with.

If you're using ColdFusion 10, then you'll want to pay attention to the result of this question about the changes in ColdFusion 10.