5
votes

I will be taking in around 1000 records at a given time and I have to determine if they are existing records or new records.

If they are existing I have to update the records, if new then just insert them. I will not know if any of them will be existing or if they all will be existing.

I thought that it might be best to do one query to the database and try to find if any of them exist in the db and store them in memory and check that collection in memory and check that.

Originally I was told I 1 field would be enough to determine uniqueness. So I thought I could just do 1 big in clause against 1 field in the database but now I found out that is not the case and I need to use 3 fields to determine if the record is existing or now.

This is basically an and clause

select * from where columnA = "1" and ColumnB = "2" and ColumnC = "3"

How can I properly right this in C# ado.net?

I am guessing I going to need to have like some super where clause?

select * from where (columnA = "1" and ColumnB = "2" and ColumnC = "3") or  (columnA = "4" and ColumnB = "5" and ColumnC = "6") or [....998 more conditional clauses)

I am open to better ideas if possible. I still think doing it in 1 shot is better than doing 1000 separate queries.

2
What rdbms are you working with? SQL Server, for instance, supports table valued parameters, so you should be able to do the entire thing with a single stored procedure. (Read here on how to do it in a thread safe way)Zohar Peled
going to be using sql serer 2017 but locally have to use 2014 as don't have windows 10.chobo2
Then I would suggest using a table valued parameter instead of dynamically creating the where clause. Plenty of examples on stackoverflow and the web in general.Zohar Peled
so this table value parameter SP would return back to my C# the results of duplicates?chobo2
Yes, easily. you can also do the upsert inside the stored procedure without having to return to the c# first.Zohar Peled

2 Answers

6
votes

I can only help you to write query for your request

        var recordCount = 1000;
        var query = "SELECT * FROM TableName WHERE";
        for (var i = 1; i < recordCount - 2; i += 3)
        {
            query += " (columnA = " + i + " and ColumnB = " + (i + 1) + " and ColumnC = " + (i + 2) + ") or ";
        }
1
votes

I feel kinda silly writing this answer, because I think that you should be able to put the pieces together a complete answer from other posts - but this is not an exact duplicate of either of the questions I have in mind.

There already are questions and answers in Stackoverflow dealing with this issue - however in my search I only found answers that are not thread safe, and most of them are using merge.

There are different questions and answers I can refer you to such as my answer to Adding multiple parameterized variables to a database in c# where you can see how to work with table valued parameters on c#, and to Aaron Bertrand's answer to Using a if condition in an insert SQL Server where you can see how to create a safe upsert - however I didn't find any answer that covers this completely - so here you go:

First you need to create a user defined table type in your database:

CERATE TYPE MyTableType AS TABLE
(
     Column1 int NOT NULL,
     Column2 int NOT NULL,
     Column3 int NOT NULL,
     -- rest of the columns in your table goes here
     PRIMARY KEY (Column1, Column2, Column3)
)

Then, you create the stored procedure:

CREATE stp_UpsertMyTable
(
    @MyTableType dbo.MyTableType readonly -- table valued parameters must be readonly
)
AS

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

UPDATE t 
SET t.column4 = tvp.column4,
    t.column5 = tvp.column5 -- and so on for all columns that are not part of the key
FROM dbo.MyTable AS t
INNER JOIN @MyTableType AS tvp
    ON t.Column1 = tvp.Column1
    AND t.Column2 = tvp.Column2
    AND t.Column3 = tvp.Column3;

 -- Note: <ColumnsList> should be replaced with the actual columns in the table
INSERT dbo.MyTable(<ColumnsList>)
  SELECT <ColumnsList>
  FROM @MyTableType AS tvp
  WHERE NOT EXISTS
  (
    SELECT 1 FROM dbo.MyTable t
    WHERE t.Column1 = tvp.Column1
    AND t.Column2 = tvp.Column2
    AND t.Column3 = tvp.Column3
  );

COMMIT TRANSACTION;

GO

Then, the c# part is simple:

DataTable dt = new DataTable();
dt.Columns.Add("Column1", typeof(int));
dt.Columns.Add("Column2", typeof(int));
dt.Columns.Add("Column3", typeof(int));
dt.Columns.Add("Column4", typeof(string));
dt.Columns.Add("Column5", typeof(string));

// Fill your data table here

using (var con = new SqlConnection("ConnectionString"))
{
    using(var cmd = new SqlCommand("stp_UpsertMyTable", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@MyTable", SqlDbType.Structured).Value = dt;
        con.Open();
        cmd.ExecuteNonQuery();
    }
}

Now you have a complete and safe upsert using a table valued parameter with only one round trip between c# and sql server.