I have a table 'Table1' which has Name column (along with other columns) that stores below values (for example), 10 rows for Name 1 and 10 rows for Name 2.
Name 1: Internationale Union öffentlicher Vereinigungen 'Grosse Don-Armee'
Name 2: Internationale Union öffentlicher Vereinigungen 'Große Don-Armee'
To keep it simple for this post, I am writing a SELECT query (wrapped in a stored procedure) that groups record by Name and gets the record count as below.
SELECT Name,COUNT(1) AS CountedRecords FROM Table1 GROUP BY Name
On database server 1, these gives me output as (call it Result set 1) -
Name | CountedRecords |
---|---|
Internationale Union öffentlicher Vereinigungen 'Grosse Don-Armee' | 10 |
Internationale Union öffentlicher Vereinigungen 'Große Don-Armee' | 10 |
On database server 2, these gives me output as (call it Result set 2) -
Name | CountedRecords |
---|---|
Internationale Union öffentlicher Vereinigungen 'Grosse Don-Armee' | 20 |
The collation is set to SQL_Latin1_General_CP1_CI_AS and table schema is exactly the same on both servers. I can change my query to include COLLATE Latin1_General_CI_AI for the Name column to make it Accent Insensitive and it would return 1 row with count = 20 in Result set 1, however I am looking for the root cause or possible reason behind this difference in the result sets even when the collation is same on SQL server and database levels, this is resulting into duplicate rows when joining (inner) Result set 1 with Table1 on Name column for further operations.
Is it related to collation or some other underlying SQL/server settings?