This is an oddly specific question, but it's something I've been having a lot of trouble with over the past day or so. Broadly, I'm trying to calculate the maximum of an array using crossfilter and then use this value to look up an associated value.
For example, I have a series of Timestamps with an associated X Value and a Y Value. I want to aggregate the Timestamps by day and find the maximum X Value and then report the Y Value associated with this Timestamp. In essence this is a double dimension as I understand it.
I'm able to do the first stage and find the maximum values, but I'm having a lot of difficulty getting through to the second value.
Working code for the first stage (using Crossfilter and Reductio), assuming that each row has the following four values:
[(Timestamp, Date, XValue, YValue),
(2015-05-15 16:00:00, 2015-05-15, 30, 15),
(2015-05-15 16:45:00, 2015-05-15, 25, 33)
... (many thousands of rows)]
First Dimension
ndx = crossfilter(data);
dailyDimension = ndx.dimension(function(d) { return d.date; });
Get the max of the X Value using reductio
maxXValue = reductio().max(function(d) { return d.XValue;});
XValues = maxXValue(dailyDimension.group())
XValues now contains all of the maximum X Values on a Daily Basis.
I would now like to use these X Values to identify the corresponding Y Values on a date basis.
Using the same data above the appropriate value returned would be:
[(date, YValue),
('2015-05-15', 15)]
// Note that it is 15 because that is the Y Value on the row with the max X Value (30), not the max Y Value (33).
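To pin down the expected result independently of crossfilter, here is a minimal plain-JavaScript sketch of the same lookup. The object field names (`timestamp`, `date`, `XValue`, `YValue`) and the helper name `yAtMaxXByDate` are assumptions made for illustration, and ties on X are resolved by keeping the first row seen.

```javascript
// Hypothetical sample rows; the field names are assumed from the row layout above.
const sampleRows = [
  { timestamp: '2015-05-15 16:00:00', date: '2015-05-15', XValue: 30, YValue: 15 },
  { timestamp: '2015-05-15 16:45:00', date: '2015-05-15', XValue: 25, YValue: 33 }
];

// For each date, keep the row with the largest XValue and report its YValue.
function yAtMaxXByDate(rows) {
  const best = new Map(); // date -> row with the largest XValue seen so far
  for (const row of rows) {
    const current = best.get(row.date);
    if (current === undefined || row.XValue > current.XValue) {
      best.set(row.date, row);
    }
  }
  return Array.from(best, function (entry) {
    return { date: entry[0], YValue: entry[1].YValue };
  });
}

console.log(yAtMaxXByDate(sampleRows)); // [ { date: '2015-05-15', YValue: 15 } ]
```

This is only a reference computation over a static array; it does not react to crossfilter filters.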
In Python/Pandas I would set the index of a DataFrame to X and then do an index match to find the Y Values.
(Note: it can safely be assumed that the X Values are unique in this case. In reality we should identify the Timestamp linked to each maximum and match on that instead, since Timestamps are strictly guaranteed to be unique and X Values only loosely so.)
I believe this can be accomplished by modifying the reductio maximum code, which I don't fully understand. The source code below is from reductio:
var reductio_max = {
    add: function (prior, path) {
        return function (p, v) {
            if(prior) prior(p, v);
            path(p).max = path(p).valueList[path(p).valueList.length - 1];
            return p;
        };
    },
    remove: function (prior, path) {
        return function (p, v) {
            if(prior) prior(p, v);
            // Check for undefined.
            if(path(p).valueList.length === 0) {
                path(p).max = undefined;
                return p;
            }
            path(p).max = path(p).valueList[path(p).valueList.length - 1];
            return p;
        };
    },
    initial: function (prior, path) {
        return function (p) {
            p = prior(p);
            path(p).max = undefined;
            return p;
        };
    }
};
Perhaps this can be modified so that there is a second valueList of Y Values which maps 1:1 with the X Values associated in the max function. In that case it would be the same index look up of both in the functions and could be assigned simply.
My apologies that I don't have any more working code.
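A minimal sketch of the paired-list idea, written as plain add/remove/initial reduce functions rather than as a patch to reductio itself. The names `reduceAdd`, `reduceRemove`, `reduceInitial`, `pairs`, and `yAtMax` are hypothetical, and the accessors assume rows are objects with `XValue` and `YValue` fields. Each group keeps its (x, y) pairs sorted by x, so the Y Value for the maximum X is always the last pair.

```javascript
// Accessors are assumptions based on the row layout described in the question.
function xOf(d) { return d.XValue; }
function yOf(d) { return d.YValue; }

// Refresh the summary fields from the last (largest-x) pair, if there is one.
function refresh(p) {
  const top = p.pairs[p.pairs.length - 1];
  p.max = top ? top.x : undefined;
  p.yAtMax = top ? top.y : undefined;
  return p;
}

// Starting value for every group: no pairs, no maximum.
function reduceInitial() {
  return { pairs: [], max: undefined, yAtMax: undefined };
}

// Insert the new (x, y) pair so that pairs stay in ascending x order.
function reduceAdd(p, v) {
  const pair = { x: xOf(v), y: yOf(v) };
  let i = 0;
  while (i < p.pairs.length && p.pairs[i].x <= pair.x) i++;
  p.pairs.splice(i, 0, pair);
  return refresh(p);
}

// Drop one pair matching the removed row, then recompute the summary.
function reduceRemove(p, v) {
  const x = xOf(v);
  const y = yOf(v);
  const i = p.pairs.findIndex(function (q) { return q.x === x && q.y === y; });
  if (i !== -1) p.pairs.splice(i, 1);
  return refresh(p);
}

// Intended use with crossfilter (not executed here):
//   var yAtMaxX = dailyDimension.group().reduce(reduceAdd, reduceRemove, reduceInitial);
//   yAtMaxX.all(); // one { key: <date>, value: { pairs, max, yAtMax } } entry per day
```

In dc.js the charted value could then be picked out with something like `.valueAccessor(function (d) { return d.value.yAtMax; })`. The linear scan in `reduceAdd` is kept for readability; with many thousands of rows per day a binary search would be the better choice.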
An alternative approach would be to use some form of filtering function to remove entries which don't satisfy the X criteria and then group by day (there should only be one value per day in this setting, so a simple reduceSum, for example, will still return the correct value).
// Pseudo non working code
dailyDimension.filter(function(p) {return p.XValue === XValues;})
dailyDimension.group().reduceSum(function(d) {return d.YValue;})
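As far as I understand crossfilter, a filter function on a dimension receives that dimension's value (here the date) rather than the whole row, and a group ignores filters on its own dimension, so the pseudocode above would not work as written. One way to sketch the same idea is to do the filtering on the raw array first. The helper names below (`maxXByDate`, `rowsAtDailyMax`, `sumYByDate`) are hypothetical, and rows are assumed to be objects with `date`, `XValue`, and `YValue` fields.

```javascript
// Step 1: maximum XValue per date.
function maxXByDate(rows) {
  const maxima = new Map(); // date -> largest XValue seen for that date
  for (const row of rows) {
    const m = maxima.get(row.date);
    if (m === undefined || row.XValue > m) maxima.set(row.date, row.XValue);
  }
  return maxima;
}

// Step 2: keep only the rows whose XValue equals their own day's maximum.
function rowsAtDailyMax(rows) {
  const maxima = maxXByDate(rows);
  return rows.filter(function (row) {
    return row.XValue === maxima.get(row.date);
  });
}

// Step 3: sum YValue per date over the surviving rows (the reduceSum step).
function sumYByDate(rows) {
  const sums = new Map(); // date -> summed YValue
  for (const row of rows) {
    sums.set(row.date, (sums.get(row.date) || 0) + row.YValue);
  }
  return sums;
}

// Intended use with crossfilter (not executed here):
//   var ndxMax = crossfilter(rowsAtDailyMax(data));
//   var yByDay = ndxMax.dimension(function (d) { return d.date; })
//                      .group().reduceSum(function (d) { return d.YValue; });
```

The trade-off is that the maxima are computed once up front, so they will not update when other crossfilter filters change, and the sum only equals the wanted Y Value if X is unique within each day.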
The eventual results will be plotted in dc.js.