EDIT: Fix from Sharondio: Is the "-- Enable High ASCII characters and Unicode for data sources configured for non-Latin characters" setting checked in the cfadmin datasource advanced settings? The complete path is CFAdmin --> Data & Services --> Datasources --> Click on your datasource --> Click Show Advanced Settings --> check the box as indicated above.
I'm trying to create a batch import page (in Coldfusion) that will take in a CSV file and insert the data into a SQL Server database. The CSV file can include data in a number of languages, including ones that don't use Roman characters, such as Chinese, Korean, Hindi and Thai.
When I run the import, anything with non-Roman characters is displayed in SQL Server Management Studio as a string of question marks, and attempting to output the data in my Coldfusion application also shows question marks. Previously entered Arabic data is displayed properly on the website and in the management studio, so it doesn't look like an encoding issue on my site or database.
I've tried UTF-8 and UTF-16 encoded CSV files (created in Excel and Notepad), and neither worked. I'm working with SQL Server 2000; collation is set to SQL_Latin1_General_CP1_CI_AS. The column I'm inserting into is ntext.
After some reading, I tried to modify my insert statement to have the N character before the inserted string (ie. insert into table_blah (column) values (N'stuff')), but that didn't make a difference either.
At this point I'm looking for any tips or guidance anyone can provide. Thanks.
Edit @Jake Feasel
Table I'm trying to insert into:
ID int
Language_ID int
Value ntext
As far as what I'm trying to insert, it's a CSV in the following format:
English French Chinese Hindi Thai German
Course cours 课程 कोर्स หลักสูตร Kurs
Student étudiant 学生 छात्र นักเรียน Schüler
Which is being inserted into the database via a Coldfusion query (which loops over the imported CSV and inserts the appropriate language ID value and translated value):
<cfquery name="insert_values" datasource="#APPLICATION.DSN#">
INSERT INTO tbl_language_value (Language_ID, Value)
VALUES (
<cfqueryparam cfsqltype="cf_sql_integer" value="3" />,
<cfqueryparam cfsqltype="cf_sql_varchar" value="นักเรียน"
)
SELECT SCOPE_IDENTITY() AS 'New_ID';
</cfquery>
When I dump the results of the query, I can see that the query itself is taking the proper characters for non-Roman languages, but for some reason they aren't being inserted into the database that way.