I have a trial account in Azure Blob Storage. I am trying to upload 100,000 generated files from my local machine. The operation has already been running for over 17 hours and has uploaded only ~77,000 files. All the files were created by a simple bash script:
for i in {1..100000}
do
    echo $i
    echo $i > "$1/$i.txt"
done
Code for the upload:
using(var stream = File.OpenWrite(textBoxManyUploadFileName.Text))
using(var writer = new StreamWriter(stream)) {
    foreach(var file in Directory.GetFiles(textBoxManyUploadFrom.Text)) {
        Guid id = Guid.NewGuid();
        storage.StoreFile(file, id, ((FileType)comboBoxManyUploadTypes.SelectedItem).Number);
        writer.WriteLine("{0}={1}", id, file);
    }
}
public void StoreFile(Stream stream, Guid id, string container) {
    try {
        var blob = GetBlob(id, container);
        blob.UploadFromStream(stream);
    } catch(StorageException exception) {
        throw TranslateException(exception, id, container);
    }
}
public void StoreFile(string filename, Guid id, int type = 0) {
    using(var stream = File.OpenRead(filename)) {
        StoreFile(stream, id, type);
    }
}
CloudBlob GetBlob(Guid id, string containerName) {
    var container = azureBlobClient.GetContainerReference(containerName);
    if(container.CreateIfNotExist()) {
        container.SetPermissions(new BlobContainerPermissions {
            PublicAccess = BlobContainerPublicAccessType.Container
        });
    }
    return container.GetBlobReference(id.ToString());
}
The first 10,000 files were uploaded in 20-30 minutes, then the speed decreased. I think this may be because the file names are GUIDs and Azure tries to build a clustered index on them. How can I speed this up? What is the problem?
GetFiles returns strings, right? But StoreFile takes a stream... what am I missing? (I'm wondering where you dispose of the stream. Perhaps something is leaking.) You might want to just do

for (int i = 0; i < 100000; i++) { container.GetBlobReference(Guid.NewGuid().ToString()).UploadText(i.ToString()); }

to simplify what you're measuring. – user94559
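For reference, a fleshed-out version of that one-liner might look like the sketch below. It assumes the same older storage client API the question appears to use (GetBlobReference, CreateIfNotExist, UploadText) and the azureBlobClient field from the question; the container name "timing-test" and the progress interval are made up for illustration. The idea is to time raw blob uploads with no local files and no mapping file, so if this loop also slows down after the first ~10,000 blobs, the bottleneck is the service or the network rather than the file handling.

// Rough timing harness based on the comment above (not the asker's code).
// Uploads 100,000 tiny text blobs straight from memory, one synchronous
// round trip per blob, just like the original loop.
var container = azureBlobClient.GetContainerReference("timing-test"); // hypothetical container name
container.CreateIfNotExist();

var watch = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 100000; i++) {
    container.GetBlobReference(Guid.NewGuid().ToString()).UploadText(i.ToString());

    if (i % 10000 == 0) {
        // Print intermediate timings so a slowdown over time is visible.
        Console.WriteLine("{0} blobs in {1}", i, watch.Elapsed);
    }
}
watch.Stop();
Console.WriteLine("100000 blobs in {0}", watch.Elapsed);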