1
votes

My data set is an array of the following form:

[
  { "DATE" : "2020-01-02", "COUNTRY" : "Spain", "COUNT" : 110 },
  { ... },
  { ... }
]

There are multiple countries and multiple days. There are no gaps in dates.

I want to inject field DAYS_PASSED (and subsequently use it for the X axis) using the following algorithm:

  1. Check the value of DAYS_PASSED for the previous day for the same country and assign it to variable TEMP. (If the previous day does not exist, assume 0);
  2. Calculate DAYS_PASSED using the following formula:
   if TEMP > 0, then DAYS_PASSED = TEMP + 1
   else-if COUNT > 100 then DAYS_PASSED = 1
   else DAYS_PASSED = 0

So far I have done this in a preprocessing step (outside of Vega-Lite) but I was wondering if it was possible to migrate the calculation to Vega-Lite, maybe by plugging-in in a JavaScript function somehow?

I would also like to be able to expose 100 (from the COUNT > 100 condition) in the graph so that the user can tweak it to, say, 200.

1

1 Answers

2
votes

You can do this with a series of transforms; for example:


  "transform": [
    {"calculate": "toDate(datum.DATE)", "as": "date"},
    {"calculate": "datum.COUNT < 100", "as": "pre100"},
    {
      "joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
      "groupby": ["COUNTRY"]
    },
    {
      "window": [{"op": "count", "as": "daysPassed"}],
      "groupby": ["COUNTRY"],
      "sort": [{"field": "date"}]
    },
    {"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
  ],

Here is a more full example showing this for a small dataset (vega editor):

{
  "data": {
    "values": [
      {"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50},
      {"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70},
      {"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110},
      {"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150},
      {"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200},
      {"DATE": "2020-02-02", "COUNTRY": "Italy", "COUNT": 90},
      {"DATE": "2020-02-03", "COUNTRY": "Italy", "COUNT": 100},
      {"DATE": "2020-02-04", "COUNTRY": "Italy", "COUNT": 140},
      {"DATE": "2020-02-05", "COUNTRY": "Italy", "COUNT": 190},
      {"DATE": "2020-02-06", "COUNTRY": "Italy", "COUNT": 250}
    ]
  },
  "transform": [
    {"calculate": "toDate(datum.DATE)", "as": "date"},
    {"calculate": "datum.COUNT < 100", "as": "pre100"},
    {
      "joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
      "groupby": ["COUNTRY"]
    },
    {
      "window": [{"op": "count", "as": "daysPassed"}],
      "groupby": ["COUNTRY"],
      "sort": [{"field": "date"}]
    },
    {"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
  ],
  "concat": [
    {
      "mark": "line",
      "encoding": {
        "x": {"field": "DATE", "type": "temporal"},
        "y": {"field": "COUNT", "type": "quantitative"},
        "color": {"field": "COUNTRY", "type": "nominal"}
      }
    },
    {
      "mark": "line",
      "transform": [{"filter": "datum.daysPassed > 0"}],
      "encoding": {
        "x": {"field": "daysPassed", "type": "quantitative"},
        "y": {"field": "COUNT", "type": "quantitative"},
        "color": {"field": "COUNTRY", "type": "nominal"}
      }
    }
  ]
}

enter image description here