0
votes

I have a database full of users and a Java code that queries for all of them (there are about 5,000), creates a dictionary with the relevant details for each one, and sends it to SalesForce to make sure the data i have there is up-to-date with the data in the DB.

This is a cron-job that runs daily.

My question is - which option is better?

  1. Continue with this method, calling SalesForce's API 5,000 times a day.
  2. Create one big dictionary comprised of all of the 5,000 user's dictionaries, and use SalesForce's Bulk API to insert/update them all at once with just a single API call.

What do you think? Advantages/disadvantages of each one?

1

1 Answers

1
votes

I think you're looking wrong at the APIs. Forget bulk API for now.

SOAP API and REST API have identical or nearly identical capabilities. Pick what you feel more comfortable with / which Java libraries you know better. To name few key factors:

Request size: They both support sending more than 1 record at a time, up to 200 in fact. So first consider restructuring your code to send more in each update. You'll save on API calls (rolling limit through 24 hours), it'll be faster (less overhead introduced by network traffic)...

Error handling: If your update fails - they'll all give you the errors on same position in the returned message (5th input record -> 5th success/error record) so you can match stuff even if it'd be insert and not update (because with updates the errors also include Ids).

"All or none": do you want to save what you can in that batch of N records or it should be all or nothing, proper database rollback if something goes wrong? In SOAP API you specify it in the message header, in REST - as a HTTP header.

One advantage I can think REST API has would be authentication. With SOAP you need the username + password + sometimes token. REST would let you use OAuth flows - it never hurts to not have to save the password in your program... Might be less important if it's a cron job though.

Right, so potentially we're looking at 5K/200 = 25 requests / day. Much better.


Bulk API would let you do it in 1 chunk of up to 10K records. But it's asynchronous. You submit a job, it's queued for processing, you get back job id, you need to periodically check the status, download results, process them (unzip etc). It's doable but seems to be bit of an overkill for your situation. Consider bulk API when you're talking about +100K records.

And even then probably you wouldn't hand-craft it anyway but maybe reuse something. Did you know you can script the DataLoader to run from console (including cron jobs / windows task scheduler?). That DataLoader is pretty much a thin UI wrapper over a JAR file you can just directly use? And it supports all operations you need.


Maybe even you'll decide to use with some integration solution like dataloader.io, jitterbit, informatica... (then again these might be an overkill too)... Hell, there's even a SQL Server plugin that pretends Salesforce is just a regular database with ODBC driver so you fire normal SQL queries, updates etc.