1
votes

I'd like to write a process in a Worker role to download (sync) a batch of files under a blob storage folder (directory) to a local mirrored folder (directory).

Is there a timestamp (or a way to get one) for the last time a folder (directory) was updated?

The folder (directory) structure isn't known in advance; simply put, I want to download whatever is there to local storage as soon as it changes. Apart from recursing through the blobs and setting up a timer to check repeatedly, what other smart ideas do you have?

(edit) P.S. I found many solutions for syncing files from local disk to Azure storage, but the same principles that work on local files cannot be applied to Azure blobs. I am still looking for the easiest way to download (sync) files to local storage as soon as they change.

How are the files being written to blob storage? Direct access or from an intermediary? – halfbit
Can you include some links on what you found? I'm working on something similar and would be interested in their approaches. – halfbit
Please email me at eric#ericyin.com for a class I rewrote; I put the code in a worker role to continuously sync files. The original code I found here: toolheaven.net/post/… – Eric Yin
@makerofthings7, files are uploaded to blob storage by a third party, in whatever way. So there's no trigger to fire; I have to continuously check and recheck. I check every 2 seconds initially; if no changes are found, I check every 4 seconds, then 8 seconds... to save some cost on transactions. – Eric Yin
Btw, I use this code to update Views (so as you can see, they're all small files and not many of them) for each instance without redeployment. The code itself has a limitation of 5000 files, imposed by the ListBlobs method. – Eric Yin
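
A minimal sketch of the backoff polling described in the comments above, assuming the v1.x Microsoft.WindowsAzure.StorageClient library of the time; the connection string, container name, and interval cap are placeholders:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    public class BlobChangePoller
    {
        // Remembers the last ETag seen for each blob URI.
        private readonly Dictionary<string, string> _etags =
            new Dictionary<string, string>();

        public void Run(string connectionString, string containerName)
        {
            var account = CloudStorageAccount.Parse(connectionString);
            var container = account.CreateCloudBlobClient()
                                   .GetContainerReference(containerName);

            var interval = TimeSpan.FromSeconds(2);     // initial poll: every 2 s
            var maxInterval = TimeSpan.FromSeconds(64); // cap for the backoff

            while (true)
            {
                bool changed = false;
                var options = new BlobRequestOptions { UseFlatBlobListing = true };

                // One flat listing covers the whole "folder" tree; no recursion.
                foreach (var blob in container.ListBlobs(options).OfType<CloudBlob>())
                {
                    string key = blob.Uri.ToString();
                    string previous;
                    if (!_etags.TryGetValue(key, out previous)
                        || previous != blob.Properties.ETag)
                    {
                        _etags[key] = blob.Properties.ETag;
                        changed = true;
                        // Download (sync) this blob to the local mirror here.
                    }
                }

                // 2 s -> 4 s -> 8 s ... while nothing changes; reset on a change.
                interval = changed
                    ? TimeSpan.FromSeconds(2)
                    : TimeSpan.FromTicks(Math.Min(interval.Ticks * 2, maxInterval.Ticks));
                Thread.Sleep(interval);
            }
        }
    }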

2 Answers

1
votes

Eric, I believe the concept you're trying to implement isn't really that effective for your core requirement, if I understand it correctly.

Consider the following scenario:

  1. Keep your views in blob storage.
  2. Implement the Azure (AppFabric) Cache.
  3. On a web request, store any view file in the cache if it's not yet there, with an unlimited (or very long) expiration time.
  4. Enable the local cache on your web role instances with a short expiration time (e.g. 5 minutes).
  5. Create a single, separate worker role, outside your web roles, which scans your blobs' ETags for changes at an interval, and reset the view's cache key for any blob that changed (see the sketch after this list).
  6. Get rid of those ugly "workers" inside your web roles :-)
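
A minimal sketch of step 5, assuming the AppFabric Caching client (Microsoft.ApplicationServer.Caching, configured through the usual dataCacheClient section) and the v1.x storage client; using the blob URI as the cache key is an assumption:

    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.ApplicationServer.Caching;
    using Microsoft.WindowsAzure.StorageClient;

    public class ViewCacheInvalidator
    {
        private readonly Dictionary<string, string> _etags =
            new Dictionary<string, string>();
        private readonly DataCache _cache = new DataCacheFactory().GetDefaultCache();

        // Called once per scan interval by the dedicated worker role (step 5).
        public void ScanOnce(CloudBlobContainer viewsContainer)
        {
            var options = new BlobRequestOptions { UseFlatBlobListing = true };
            foreach (var blob in viewsContainer.ListBlobs(options).OfType<CloudBlob>())
            {
                string key = blob.Uri.ToString();
                string previous;
                if (_etags.TryGetValue(key, out previous)
                    && previous != blob.Properties.ETag)
                {
                    // The blob changed: evict its entry so web roles reload
                    // the view from storage on the next request (step 3).
                    _cache.Remove(key);
                }
                _etags[key] = blob.Properties.ETag;
            }
        }
    }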

There are a few things to think about in this scenario:

  • Your updated views will reach the web role instances within "local cache expiration time + worker scan interval". The lower these values, the more distributed-cache requests and blob storage transactions you pay for.
  • The Azure AppFabric Cache is the only Azure service preventing the whole platform from being truly scalable. You have to choose the best cache plan based on the overall size (in MB) of your views, the number of your instances, and the number of simultaneous cache requests required per instance.
  • Consider caching the compiled views inside your instances (not in the AppFabric cache), and reset this local cache based on the dedicated AppFabric cache key/keys. This will greatly improve performance, as rendering the output HTML becomes as simple as injecting the model into the pre-compiled views.
  • Of course, the cache-retrieval code in your web roles must be able to retrieve the view from the primary source (storage) if it is unable to retrieve it from the cache for whatever reason; a sketch follows below.
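
For that last point, a hedged sketch of the retrieval path: try the cache first, and fall back to blob storage on a miss or on any cache failure. The ViewStore name and the use of DownloadText are illustrative:

    using Microsoft.ApplicationServer.Caching;
    using Microsoft.WindowsAzure.StorageClient;

    public static class ViewStore
    {
        public static string GetView(DataCache cache,
                                     CloudBlobContainer container,
                                     string viewName)
        {
            try
            {
                var cached = cache.Get(viewName) as string;
                if (cached != null)
                    return cached;
            }
            catch (DataCacheException)
            {
                // Cache unavailable: fall through to the primary source.
            }

            // Primary source: blob storage.
            string text = container.GetBlobReference(viewName).DownloadText();

            // Best effort: repopulate the cache, but never fail the request.
            try { cache.Put(viewName, text); }
            catch (DataCacheException) { }

            return text;
        }
    }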
1
votes

My suggestion is to create an abstraction on top of blob storage, so that no one writes to the blobs directly. Then submit a message to Azure's Queue service whenever a new file is written, and have the file receiver poll that queue for changes. There is no need to scan the entire blob store recursively.
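
A minimal sketch of the receiver side, assuming the writer enqueues the changed blob's URI and the v1.x storage client; the queue name "blob-changes" is a placeholder:

    using System;
    using System.Threading;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    public class ChangeReceiver
    {
        public void Run(CloudStorageAccount account)
        {
            var queue = account.CreateCloudQueueClient()
                               .GetQueueReference("blob-changes");

            while (true)
            {
                CloudQueueMessage message = queue.GetMessage();
                if (message == null)
                {
                    // Queue is empty; idle briefly instead of hammering it.
                    Thread.Sleep(TimeSpan.FromSeconds(2));
                    continue;
                }

                // The message body carries the URI of the blob that changed,
                // so only that one blob needs to be downloaded to the mirror.
                string changedBlobUri = message.AsString;

                // ... download that blob to the local mirrored folder here ...

                queue.DeleteMessage(message);
            }
        }
    }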

As far as the abstraction goes, use an Azure web role or worker role to authenticate and authorize your clients, and have it write to the blob store(s) on their behalf. You can implement the abstraction using HttpHandlers or WCF to directly handle the IO requests.
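
For illustration, a hedged IHttpHandler sketch of such an abstraction: it accepts an upload, writes the blob on the client's behalf, and enqueues a change notification. Authentication is elided, and all names (the "files" container, the "name" parameter, the connection string) are placeholders:

    using System.Web;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    public class BlobWriteHandler : IHttpHandler
    {
        public bool IsReusable { get { return true; } }

        public void ProcessRequest(HttpContext context)
        {
            // Authenticate and authorize the caller here before touching storage.

            var account = CloudStorageAccount.Parse("<connection string>");
            string name = context.Request.QueryString["name"];

            // Write the uploaded body to the blob on the client's behalf.
            var container = account.CreateCloudBlobClient()
                                   .GetContainerReference("files");
            var blob = container.GetBlobReference(name);
            blob.UploadFromStream(context.Request.InputStream);

            // Notify the receiver that this blob changed.
            var queue = account.CreateCloudQueueClient()
                               .GetQueueReference("blob-changes");
            queue.AddMessage(new CloudQueueMessage(blob.Uri.ToString()));
        }
    }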

This abstraction will let you overcome the 5000-file ListBlobs limitation mentioned in the comments above, and will allow you to scale out and offer additional features to your customers.

I'd be interested in seeing your code when you have a chance. Perhaps I can give you some more tips or code fixes.