8
votes

my iPhone application has an entity Words with the attributes word, length and language. Both are indexed: Entity and attributes

I copied the cdatamodel and database to a separate importer application where it got prefilled with about 400k words in different languages. I verified the import by looking into the SQLite file and then copied the prefilled database back to the iPhone project.

First I thought the (simple) predicate is the problem. But even after deleting the predicate from the fetch request, it takes a very long time for execution:

2011-09-01 09:26:38.945 MyApp[3474:3c07] Start
2011-09-01 09:26:58.120 MyApp[3474:3c07] End

Here is what my code looks like:

// Get word
NSLog(@"Start");
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:@"Words" inManagedObjectContext:appDelegate.managedObjectContext];
[fetchRequest setEntity:entity];
            
NSError *error = nil;
NSArray *fetchedObjects = [appDelegate.managedObjectContext executeFetchRequest:fetchRequest error:&error];
if (fetchedObjects == nil) {
    //... error handling code
}
    
[fetchRequest release];
NSLog(@"End");
return fetchedObjects;

Is the number of entries in the database a problem for Core Data?


EDIT: As gcbrueckmann and jrturton pointed out, it's a good point to set fetchBatchSize. But fetch time is still unsatisfying:

  • 2 seconds with a predicate set:

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"length == %d AND language BEGINSWITH %@", wordLength, lng]; [fetchRequest setPredicate:predicate];

  • 7 seconds with the batch size set:

    [fetchRequest setFetchBatchSize:1];

  • 1 second with a both the predicate and the batch size set

Is there still another bottleneck?

3
in your predicate is the language possibly more restrictive than the length, sometimes the order of the predicate checks can speed things up as well. for instance in this case if 60% of the words met your length criteria but only 40% met the language criteria it would be better to have the language check first. The other thing might be if you need this faster would be to have it pre-loaded and then filter an array in memory not sure if your iphone app can handle that though.crackity_jones
In this case the first query is comparing integers (indexing will make this very fast) and the second is a string comparison (even an indexed string isn't going to be speedy) - I'd be surprised if reordering the query helped. However, do try it - I'd be interested to see if it helped!deanWombourne
Oh, I forgot to mention it: I've already tried to swap the predicate's order, it doesn't speed up the fetch.Norbert

3 Answers

14
votes

Since you’re not limiting the result set in any way fetching 400,000 objects at once will definitely be a burden on Core Data. There are several ways you can improve performance:

Changing the fetch request's fetchBatchSize limits the number of objects the fetch will keep in memory at a time. This feature is completely transparent to your application, so it's definitely worth a try.

If you don't need fully-fledged objects, you may consider changing the fetch request's resultType to a more appropriate value. Especially if you’re only interested in some of the values of an object, using NSDictionaryResultType is a good idea.

Lastly the fetchLimit and fetchOffset properties allow you to limit the result range, if you want to manage batch processing yourself. This is a good idea if your handling of each of the result objects uses lots of memory because you can wrap each batch in an NSAutoreleasePool (just don’t be tempted to create an autorelease pool for every single result object).

I guess 1 sec. might just be as fast as it gets in your case – even if you resort to a plain Sqlite database. The only further optimisation I can think of is to use one table per language (instead of putting words from all languages into a single table). This, of course, will only work with Sqlite unless you define separate entities for all languages, i. e. take your Words entity as is and make it abstract. Then add subentities like EnglishWord etc. Objects from different entities are stored in separate tables. So, combined with the fetchBatchSize and predicate parameters this should perform similarly to the Sqlite approach with separate tables for all languages.

2
votes

That would fetch your full 400k database into memory which does seem like a lot. You could investigate NSFetchRequest's

setFetchBatchSize

method which stops the framework returning full objects for everything in your fetch request, on the assumption that you don't need every returned object to be fetched from the store in the first instance.

2
votes

You're doing BEGINSWITH - that's not a very fast operation! However, there are a finite number of languages so an emum would probably help.

Have a language_id field that's an indexed integer and use that in your predicate. You can still store the language name as well and return it as part of the fetched object, just don't search on it :)


PS You can turn on SQL debugging by adding '-com.apple.CoreData.SQLDebug 1' as an argument passed on launch (configure this in your Scheme) - this might help you see what SQL is doing behind the scenes.

(see this question for more details)