4
votes

Is there a TSQL equivalent for Azure Table Storage? I want to do ad hoc queries in .NET when the property names are not known at design time.

From my understanding of LINQ, you need to reference existing public properties.

    var selectedOrders = from o in context.Orders
                         where o.Freight > 30
                         orderby o.ShippedDate descending
                         select o;

Freight and ShippedDate must be public properties defined at design time. I don't have structured properties (or even structured classes).

What if I don't know the property names at design time? You can add new property names to a table in a very ad hoc manner, but how can you consume them?

Via the REST API you can define a dynamic query:

    Request Syntax:
    GET /myaccount/Customers()?$filter=(Rating%20ge%203)%20and%20(Rating%20le%206)&$select=PartitionKey,RowKey,Address,CustomerSince  HTTP/1.1

Are there tools to use REST in .NET (in a dynamic manner)?

From REST API documentation: Use the logical operators defined by the .NET Client Library for ADO.NET Data Services Framework to compare a property to a value. Note that it is not possible to compare a property to a dynamic value; one side of the expression must be a constant. http://msdn.microsoft.com/en-us/library/windowsazure/dd894031.aspx
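Since the REST filter is ultimately just a string, one workaround is to build the `$filter` expression yourself from run-time property names. A minimal sketch (the `ODataFilter` helper is hypothetical, not part of any library; it only reproduces the comparison syntax shown in the request above):

```csharp
using System;
using System.Globalization;

// Hypothetical helper: builds an OData $filter string for property
// names known only at run time, using the operators ("ge", "le",
// "and", ...) from the REST query syntax shown above.
public static class ODataFilter
{
    public static string Compare(string property, string op, object value)
    {
        // Strings are single-quoted in OData; numbers are written as literals.
        string literal = value is string s
            ? "'" + s.Replace("'", "''") + "'"
            : Convert.ToString(value, CultureInfo.InvariantCulture);
        return string.Format("({0} {1} {2})", property, op, literal);
    }

    public static string And(params string[] clauses)
    {
        return string.Join(" and ", clauses);
    }
}
```

For example, `ODataFilter.And(ODataFilter.Compare("Rating", "ge", 3), ODataFilter.Compare("Rating", "le", 6))` reproduces the filter in the sample request; the result would be URL-encoded and sent over HTTP.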

If the answer is that you need SQL (rather than Table Storage) for TSQL-style queries, then OK.

What I think I am learning is that Table Storage is designed to serialize classes (especially if you have very many instances to serialize). From this link: http://msdn.microsoft.com/en-us/library/windowsazure/dd179423.aspx "The schema for a table is defined as a C# class. This is the model used by ADO.NET Data Services. The schema is known only to the client application and simplifies data access. The server does not enforce this schema."

    [DataServiceKey("PartitionKey", "RowKey")]
    public class Blog
    {
        // ChannelName 
        public string PartitionKey { get; set; } 
        // PostedDate 
        public string RowKey { get; set; } 

        // User-defined properties
        public string Text { get; set; }
        public int Rating { get; set; }
        // Not persisted: no setter
        public string RatingAsString { get { return Rating.ToString(); } }
        // Not persisted: not public
        protected string Id { get; set; }
    } 

A user will upload a file that will go to blob storage, along with string fields describing the file. It needs to scale to millions of records. I would only search on two required fields (properties): custID and batch. I don't need to search on the other fields, but I do need to preserve them and allow the user to simply add new fields in a batch. Since it must scale to millions of records, blob storage is appropriate for the files. What I think I get out of Table Storage is the ability to use REST at the client to download files and fields. It needs to be optimized for up to 100,000 downloads at a time and support restart. Uploads will be relatively small batches, and probably not over REST, as I need to do some upload validation on the server side.

What I am thinking about doing is two tables. Where the second is designed for dynamic data.

    Master
      PartitionKey CustID 
      RowKey       GUID
      string       batch
      string       filename
    Fields
      PartitionKey CustID+Guid
      RowKey       fieldName
      string       value

Each fieldName is required to be unique within a record. Queries on Master would be by CustID, or by CustID and batch. Queries on Fields would be by PartitionKey. Comments, please.
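The two-table layout above can be sketched as plain entity classes. This is only an illustration of the proposed design, not working table-storage code; the class and helper names (`MasterRecord`, `FieldRecord`, `Keys`) and the `_` separator in the combined key are my assumptions:

```csharp
using System;

// Sketch of the proposed two-table design. One Master row per file;
// one Fields row per dynamic field, keyed by the master row's identity.
public class MasterRecord
{
    public string PartitionKey { get; set; } // CustID
    public string RowKey { get; set; }       // GUID
    public string Batch { get; set; }
    public string Filename { get; set; }
}

public class FieldRecord
{
    public string PartitionKey { get; set; } // CustID + master row GUID
    public string RowKey { get; set; }       // field name (unique per partition)
    public string Value { get; set; }
}

public static class Keys
{
    // Combine CustID and the master row's GUID into the Fields partition key,
    // so all fields of one file land in one partition.
    public static string FieldsPartition(string custId, Guid id)
    {
        return custId + "_" + id.ToString("N");
    }
}
```

With this shape, fetching all dynamic fields for a file is a single-partition query on `FieldsPartition(custId, rowGuid)`, which is the cheapest kind of table query.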

6
I don't think it would be hard to write one. I have thought myself that I'd prefer table storage to work with dynamic data rather than concrete classes. – Richard Astbury
Would you use REST and parse the Response? I am surprised there is not a .NET library for dynamic queries to Table Storage. Am I looking at Table Storage incorrectly? Is the sole intent of Table Storage to serialize large (or small) numbers of concrete classes? When I look at REST and LINQ I get two different views. – paparazzo
The table storage API (i.e. the LINQ you're writing) is just an abstraction over the top of the REST interface. You could write your own, as you correctly point out. It is surprising that a more dynamic interface doesn't exist in C#, but it's not really a dynamic language; most people like working with static types. You could always switch to node.js or ruby (for example), where dynamic libraries are available. Table storage is designed to store a large number of small objects of an unfixed schema. – Richard Astbury
@RichardAstbury Thanks, I like working with static types, I just don't have them here. What I am thinking about doing is creating a static class to store the dynamic data. Please look at the new last paragraph and comment. – paparazzo
If you're looking for the best performance, your variable fields fit within the restrictions of table properties, and you don't mind writing a bunch of code, then for what you want to do, skip LINQ and just use the REST API. Based on the comments here, put it in a library and share it; it could be popular. – knightpfhor

6 Answers

3
votes

I have also created a library for using dynamic types with table storage:

To use it, first, create a context object:

var context = new DynamicTableContext("TableName", credentials); 

Then inserting a record is easy:

context.Insert(new { PartitionKey="1", RowKey="1", Value1="Hello", Value2="World" }); 

You can do the same with a dictionary:

var dictionary = new Dictionary<string, object>();
dictionary["PartitionKey"] = "2";
dictionary["RowKey"] = "2";
dictionary["Value3"] = "FooBar";
context.Insert(dictionary); 

Retrieving an entity is straightforward, just pass in values for partition key and row key:

dynamic entity = context.Get("1", "1"); 

You can also pass in a query:

foreach (dynamic item in context.Query("Value1 eq 'Hello'"))
{
  Console.WriteLine(item.RowKey);
}

It's available on github here: https://github.com/richorama/AzureSugar

1
votes

It is not a question of Table Storage; it is a question of LINQ. You can write dynamic LINQ (unfortunately losing much of what makes LINQ so awesome) using expression trees. LINQ is really just expression trees in the background anyway.

So three responses to your question:

  1. Here is how to write dynamic LINQ

  2. Here is information on a dynamic LINQ library to make things less ugly

  3. And finally you can't use order by on Table Storage queries :)
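To make point 1 concrete, here is a minimal sketch of building a `where` predicate with expression trees when the property name is only a run-time string. The `DynamicPredicate` helper and `Row` demo type are illustrative names, and this runs against any `IQueryable<T>` (shown here with LINQ to Objects, not an actual table query):

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Demo type standing in for a table entity.
public class Row
{
    public double Freight { get; set; }
}

public static class DynamicPredicate
{
    // Builds "x => x.<propertyName> > value" at run time - the same
    // mechanism dynamic LINQ libraries use under the covers.
    public static IQueryable<T> WhereGreaterThan<T>(
        IQueryable<T> source, string propertyName, object value)
    {
        ParameterExpression x = Expression.Parameter(typeof(T), "x");
        Expression body = Expression.GreaterThan(
            Expression.Property(x, propertyName),   // resolved by reflection
            Expression.Constant(value));            // one side must be constant
        var predicate = Expression.Lambda<Func<T, bool>>(body, x);
        return source.Where(predicate);
    }
}
```

Note the `Expression.Constant` call mirrors the REST restriction quoted in the question: one side of the comparison must be a constant.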

1
votes

I am working on a dynamic client for the REST API. It's called Cyan; you can find it on CodePlex or on NuGet by searching for "Cyan".

Your query would be:

var cyan = new CyanTableClient("account", "password");
var items = cyan.Query("Customers",
    filter: "(Rating ge 3) and (Rating le 6)",
    fields: new[] { "PartitionKey", "RowKey", "Address", "CustomerSince" });

You can then access fields in this way:

var item = items.First();

// the address
string address = item.Address;

// or you can cast to CyanEntity and access fields from a Dictionary
var entity = (CyanEntity)item;
string address2 = (string)entity.Fields["Address"];

It's still a work in progress, please send me your feedback and feel free to contribute!

1
votes

I have written an Azure Table Storage client which supports both static and late binding (via a dictionary). Any table property not contained in the entity type will be captured in a dictionary. It also supports arrays, enums, large data, serialization, and more. More features are in the works, too!

You can get it at http://www.lucifure.com .

0
votes

I had a similar issue and approached it initially by having a table like your "Fields" table - lots of very small records with key/value pairs. I came across two issues: one is that Nagle's algorithm messes with your inserts if you're not doing them asynchronously (up to 500ms per insert), and the other is that Azure's scalability targets limit you to 20,000 entities/sec across a storage account.

In the end I solved it by storing a Dictionary<string, string> using custom read/write methods; I've written that up in "How can I store arbitrary key value pairs in Azure table storage?"

0
votes

The Azure table service does not support order by in queries; it only supports where, select, First, FirstOrDefault, from, and Take. You'll have to write two separate queries.
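A common workaround for the missing order by is to let the table service do the filtering and then sort client-side once the rows arrive. A minimal sketch (the `OrderRow` type and helper are illustrative; `fetched` stands in for rows already returned by the table query):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative entity mirroring the Orders example in the question.
public class OrderRow
{
    public double Freight { get; set; }
    public DateTime ShippedDate { get; set; }
}

public static class ClientSort
{
    // The 'where' part could run server-side; the 'orderby' from the
    // original LINQ query is applied in memory after the rows arrive.
    public static List<OrderRow> HeavyOrdersNewestFirst(IEnumerable<OrderRow> fetched)
    {
        return fetched.Where(o => o.Freight > 30)
                      .OrderByDescending(o => o.ShippedDate)
                      .ToList();
    }
}
```

For large result sets this means pulling every matching row to the client before sorting, which is why choosing PartitionKey/RowKey to match your natural sort order (as the question's design does with PostedDate) matters.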