I'm using crossfilter and dc to render charts of subject-related observations.

Each observation is treated as a dimension. However, not all rows have values for all dimensions, because some dimensions have data that is repeated over time. For example, Column A has four values over four rows, but Column B has only one value, so in the other three rows it will be 0 / "" / blank.

Now if I filter on Column B for a certain range or value, I automatically lose all the other rows for Column A. If I then filter on Column A after filtering on Column B, I'm only filtering within the one common row that has values for both.

This may sound like logical behaviour, but it's not true to the data. If I want to filter subjects (i.e. rows) that have a certain range for Column A AND a certain range for Column B, I get a wrong result because of the blank values. Those values are not missing; they are only there because it's a table and every column is expected to have a value in every row.

Is there a way to filter on Column B without excluding values from Column A only because the corresponding Column B values are blank?

I'm sorry that took that much text to explain!

UPDATE

An example: observation data is collected for patients, let's say 'weight' and 'blood pressure'. For one subject there might be two weight readings, but four blood pressure readings. When I create the data structure for crossfilter, I create two columns, one for weight and another for blood pressure. I want to show the user two bar charts with the distribution of values for each observation across all subjects. The user should be able to filter subjects with a weight range AND a blood pressure range. Because two of the rows for a subject will not have values for weight, filtering on weight will filter out subjects (i.e. rows) that might be in the range for the blood pressure filter but did not have a value for weight, so they are wrongly excluded.

2 Answers

1
votes

I managed to do it with arrays, so now instead of my data being structured in a flat, table-like structure:

{subjectId: "subject-101", study: "CRC305A", A: "24", B: "79"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "74"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "83"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "74"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "72"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "82"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "74"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "79"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "76"}
{subjectId: "subject-101", study: "CRC305A", A: "", B: "72"}

it's structured as below, so that the number of values can vary from one column to another:

{subjectId: "subject-101",
 A: ["24"],
 B: ["79", "74", "83", "74", "72", "82", "74", "79", "76", "72", "79", "76", "77", "72", "83", "69", "72"]
}

And the filtering magically works!
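
For anyone wanting to try the same reshaping, here is a minimal sketch in plain JavaScript of how the flat rows could be collapsed into one record per subject. It is only an illustration under a few assumptions: `groupBySubject` and `anyInRange` are hypothetical helper names (not part of crossfilter or dc), a blank string is taken to mean "no reading on this row", "subject-102" is made-up sample data, and a subject is considered a match when at least one of its readings falls inside the range.

```javascript
// Hypothetical helper: collapse flat rows into one record per subject,
// collecting the non-blank readings of each observation column into an array.
function groupBySubject(rows, columns) {
  var bySubject = {};
  rows.forEach(function (row) {
    var rec = bySubject[row.subjectId];
    if (!rec) {
      rec = bySubject[row.subjectId] = { subjectId: row.subjectId };
      columns.forEach(function (c) { rec[c] = []; });
    }
    columns.forEach(function (c) {
      // Assumption: "" (or a missing key) means "no reading on this row".
      if (row[c] !== "" && row[c] != null) rec[c].push(row[c]);
    });
  });
  return Object.keys(bySubject).map(function (id) { return bySubject[id]; });
}

// Hypothetical predicate: true if any reading falls inside [lo, hi].
function anyInRange(values, lo, hi) {
  return values.some(function (v) { return +v >= lo && +v <= hi; });
}

// Sample rows; "subject-102" is invented for illustration.
var flat = [
  { subjectId: "subject-101", study: "CRC305A", A: "24", B: "79" },
  { subjectId: "subject-101", study: "CRC305A", A: "", B: "74" },
  { subjectId: "subject-102", study: "CRC305A", A: "31", B: "" },
  { subjectId: "subject-102", study: "CRC305A", A: "", B: "90" }
];

var subjects = groupBySubject(flat, ["A", "B"]);

// Subjects with some A reading in [20, 30] AND some B reading in [70, 80].
var matches = subjects.filter(function (s) {
  return anyInRange(s.A, 20, 30) && anyInRange(s.B, 70, 80);
});
```

With one record per subject, a blank cell in one column can no longer drop a reading from another column, because the two filters are evaluated against the whole subject rather than against a single row.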

I still have one problem, related to the behaviour of dimension.top and dimension.bottom with respect to arrays. I'll post that in another question.

0
votes

If the values are logically there, you should probably propagate them to the next rows before sticking them in crossfilter. Crossfilter doesn't have any concept of row order or defaulted values.

If I understand your question correctly, I'd do something like

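var lastA, lastB, lastC;
// Carry the last seen value forward into rows where the column is blank.
data.forEach(function(d) {
  if(d.A)
    lastA = d.A;
  else
    d.A = lastA;
  if(d.B)
    lastB = d.B;
  else
    d.B = lastB;
  // ... same pattern for the remaining columns (lastC, etc.)
});
var cf = crossfilter(data);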

Trying to create some sort of "wildcard values" like you suggest in your question might be possible, but you'd definitely have to change at least the filter handler for every chart, because they expect to be dealing with discrete values.
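
To give a rough idea of what a "wildcard" filter might look like, here is an untested sketch of just the predicate, in plain JavaScript. `rangeOrBlank` is a hypothetical name, and treating "", null and undefined as "matches anything" is an assumption about how blanks are encoded in your data. A predicate like this could presumably be handed to crossfilter's `dimension.filterFunction` from a custom dc filter handler, but that wiring is not shown here.

```javascript
// Hypothetical predicate factory: accepts values inside [lo, hi] and also
// lets blank values through, so a blank cell never excludes its row.
function rangeOrBlank(lo, hi) {
  return function (value) {
    // Assumption: "", null and undefined all mean "no reading".
    if (value === "" || value == null) return true;
    var n = +value;
    return n >= lo && n <= hi;
  };
}

var inRangeB = rangeOrBlank(70, 80);

// Made-up sample rows for illustration.
var rows = [
  { A: "24", B: "79" },
  { A: "30", B: "" },
  { A: "", B: "83" }
];

// Keeps the first two rows: B = "79" is in range and B = "" acts as a wildcard.
var kept = rows.filter(function (d) { return inRangeB(d.B); });
```

Note that this changes what a filter means: rows with a blank value always pass, so the charts would need to agree on that convention.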