2
votes

Assume a simplistic data model consisting of two tables, Book and Author, related by a foreign key constraint "Book.AuthorId = Author.Id".

Now say you've got some flat list of books:

Title                          Author
---------------------------------------------------------
The Selfish Gene               Richard Dawkins
Why Is Sex Fun                 Jared Diamond
The Ancestors Tale             Richard Dawkins

How do you import these without making a gazillion round trips to the DB?

I'm new to EF but figured it would allow me to add entities to the object context and then they'd be available in that context even if they aren't saved yet. This seems not to be the case. In principle, I'm doing this:

void Import()
{
    ctx = new Database();
    foreach (...)
    {
        Import(bookTitle, authorName);
    }

    ctx.SaveChanges();
}

void Import(string bookTitle, string authorName)
{
    var author = ctx.Authors.Single(a => a.Name == authorName);
    if (author == null)
    {
        author = new Author();
        author.Name = authorName;
        ctx.AddToAuthors(author);
    }

    var book = new Book();
    book.Author = author;

    ctx.AddToBooks(book);
}

But EF never finds an author I've already added, so it makes a new one every time. Hence the two books by Dawkins are related to two different author objects in the graph, which is of course wrong.

Next, upon SaveChanges, EF manages to insert the first Book and Author record, but then crashes on number two with a PRIMARY KEY VIOLATION because it attempts to use Id = 0 for every object.

System.Data.UpdateException: An error occurred while updating the entries.
See the InnerException for details. ---> 
  System.Data.SqlClient.SqlException: Violation of PRIMARY KEY 
  constraint 'PK_Book'. Cannot insert duplicate key in object 'dbo.Book'.

I attempted to change the definition for the id fields to INT IDENTITY(1,1) in MSSQL and then update the model. Letting the DB define the key values is anyway better than having Entity Framework (or any client) do it. But this appears not to be supported at all. EF seems not to have noticed that my keys are IDENTITY columns when I updated the model. It just leads to

System.Data.UpdateException: An error occurred while updating the entries. 
See the InnerException for details. ---> 
  System.Data.SqlClient.SqlException: Cannot insert explicit value for 
  identity column in table 'Book' when IDENTITY_INSERT is set to OFF.

The exact symptom is provided whether I update the model or not following the switch from "Id INT PRIMARY KEY" to "Id INT IDENTITY(1,1) PRIMARY KEY" for the column definition, which is what leads me to suspect that Microsoft didn't even remember to THINK ABOUT identity columns, though I hope I'm wrong about this!

I realize there's really multiple questions lurking in here, but since nothing in Entity Framework works as expected I choose to put the overarching question as "given this data model and these data to import, what is a good way to do it using Entity Framework 3.5?"

Please refrain from advising me to use EF 4.0 instead, as that is outside of my control. :)

1

1 Answers

0
votes

My question was based on a misunderstanding - in fact adding the objects to the object context without saving does make them available on the object context.

What was really going on is this: I'm trigging logic in a .net assembly from powershell script using Reflection.Load. When I made changes in the .net project, rebuilt it and GAC-ed it, then reran my PowerShell script, I was still running the old code. Basically the code from MyAssembly.dll is linked into the powershell session, so when I rerun the script it does not matter that a new version of the code has been deployed to the GAC...

The PK violation then makes sense, since my code does not assign an ID to the entity and there is more than one author - I tried to save both with ID = 0.

The "cannot insert explicit value when IDENTITY INSERT is off" also now makes sense, because I had modified my database but was still running the code that assumed the ID from should be inserted.

In short, what fixed the problem was to close my PowerShell session and start a new one.