2
votes

I have a case where I receive a very large body of text, and each line contains some metadata followed by a JSON data string. I need to process the JSON data on each line.

This is what I have:

public Data GetData(string textLine)
{
    var spanOfLine = textLine.AsSpan();
    var indexOfColon = spanOfLine.IndexOf(':'); // the first ':' separates the metadata from the JSON
    var dataJsonStringAsSpan = spanOfLine.Slice(indexOfColon + 1);

    // now use dataJsonStringAsSpan which is ReadOnlySpan<char> to deserialize the Data
}

Here Data is a DTO class with seven properties:

public class Data
{
    public int Attribute1 { get; set; }

    public double Attribute2 { get; set; }
    // ... more properties, omitted for the sake of brevity
}

I'm trying to achieve this with the System.Text.Json API. Surprisingly, it doesn't have any overload that deserializes from a ReadOnlySpan<char>, so I came up with this:

public Data GetData(string textLine)
{
    var spanOfLine = textLine.AsSpan();
    var indexOfColon = spanOfLine.IndexOf(':'); // the first ':' separates the metadata from the JSON
    var dataJsonStringAsSpan = spanOfLine.Slice(indexOfColon + 1);

    var byteCount = Encoding.UTF8.GetByteCount(dataJsonStringAsSpan);
    Span<byte> buffer = stackalloc byte[byteCount];
    Encoding.UTF8.GetBytes(dataJsonStringAsSpan, buffer);
    var data = JsonSerializer.Deserialize<Data>(buffer);
    return data;
}

While this works, it looks very convoluted.

Is this the way to go, or am I missing something simpler?

Is there a particular reason you need to convert each line of text to a span? Strings are much easier to work with and work natively with the JSON API, and if you're already given a string, why not just take the substring containing the JSON data? - Klaycon
If your text is “very big” and you don’t want to create a String object for it, you -definitely- don’t want to stackalloc. You could deserialize dataJsonStringAsSpan.ToString() (ToString on ReadOnlySpan<char> is “copy these contents to a string”, not “be a debugging aid/display value”). - bartonjs
Can you please edit your question to share a (simplified) example of the JSON you are trying to deserialize? Also, if your JSON is very big then you don't want to load it into a single string to begin with; it will go on the large object heap and possibly obviate any advantages you get from using System.Text.Json. (Maybe I'm misunderstanding, though, and each textLine is not very big?) Are you reading the JSON from a local file, from an HTTP response, or something else? - dbc
If you absolutely need to minimize memory pressure, you can see how Microsoft implements Deserialize(string json, Type returnType, JsonSerializerOptions options = null) here. - dbc
It looks as though you could create your own version of that code with additional arguments int start, int length, and pass those to AsSpan(string, int, int). You'd also need your own version of a couple of methods from JsonReaderHelper. Not sure it's worth it. - dbc

1 Answer

-1
votes

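Would this work the same? From reading your code, it looks like it does the same thing, although it allocates a string array and two substrings per line:

public Data GetData(string textLine)
{
    // Split on the first ':' only; the JSON payload itself contains colons.
    var split = textLine.Split(new char[] {':'}, 2);
    // System.Text.Json exposes Deserialize<T>; DeserializeObject belongs to Newtonsoft's JsonConvert.
    var data = JsonSerializer.Deserialize<Data>(split[1]);
    return data;
}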
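
If the goal is to avoid both the substring allocation and an unbounded stackalloc (the concern raised in the comments), one option is to rent the UTF-8 buffer from ArrayPool<byte>. The following is only a minimal sketch, assuming .NET Core 3.0 or later; the LineParser class name, the two-property Data class, and the FormatException on a missing separator are assumptions made for illustration, not part of the original code.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

// Trimmed-down stand-in for the question's DTO (only two of its properties).
public class Data
{
    public int Attribute1 { get; set; }
    public double Attribute2 { get; set; }
}

// Hypothetical helper class; the name is an assumption.
public static class LineParser
{
    public static Data GetData(string textLine)
    {
        ReadOnlySpan<char> line = textLine.AsSpan();

        // Assumption: the first ':' separates the metadata from the JSON payload.
        int separator = line.IndexOf(':');
        if (separator < 0)
            throw new FormatException("Line has no ':' separator.");

        ReadOnlySpan<char> json = line.Slice(separator + 1);

        // Rent a pooled buffer instead of using stackalloc, so a long line
        // cannot overflow the stack. The rented array may be larger than requested.
        int byteCount = Encoding.UTF8.GetByteCount(json);
        byte[] rented = ArrayPool<byte>.Shared.Rent(byteCount);
        try
        {
            // Transcode UTF-16 to UTF-8 and deserialize only the bytes written.
            int written = Encoding.UTF8.GetBytes(json, rented);
            return JsonSerializer.Deserialize<Data>(new ReadOnlySpan<byte>(rented, 0, written));
        }
        finally
        {
            // Always hand the buffer back to the pool, even if deserialization throws.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Calling LineParser.GetData("meta:{\"Attribute1\":7,\"Attribute2\":1.5}") should produce a Data instance with Attribute1 equal to 7 and Attribute2 equal to 1.5. Whether this is worth the extra code compared with deserializing a substring depends on how large the lines are and how hot the path is, so it is worth measuring before adopting it.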