1 vote

I know there is hourly data available, but the requirement is such that I need to understand the logic behind it and implement my logic on the front end. Currently the front end is receiving raw data.

For example - Request/rate (hourly at 11:00) - 28

How can I get the same value if I have request/rate data on a per-minute basis? There is also more than one value for every minute. How do I build the hourly logic from the raw data?

I would expect that Monitoring History uses cts:aggregate a lot, but I'm not familiar with its code, so I'm not entirely sure. I wonder, though: are you rebuilding part of the Monitoring History or Ops Director functionality? Or do you have a special use case? - grtjn
@grtjn I performed fn:sum((request rates)) div 60 and the value matches the hourly data. - bosari
Be very, very careful about aggregating 'rate' metrics, or taking an 'average of averages'. When you say "there is more than one value" -- that is not true. You must be looking at something else. PER server, PER minute (raw), there should be one and only one value per metric per dimension. In one uninterrupted hour, on one host, you should only ever see 60 values for a given metric, and you should get the same results as @grtjn. - DALDEI

1 Answer

0 votes

The "Monitoring History" (GUI) and the "Meters" (API, and feature) are related but different.

The "Meters" feature is internal to the server: it collects the raw (typically per-minute) data, aggregates it into hourly, daily, and monthly "rollups", and expires old data. ALL the data is stored in the Meters database as 'plain' XML files, but it is also indexed in a unique way using the semantic indexing.

The Meters data is exposed either via direct query to the Meters database, or via a set of public REST endpoints -- whose implementation is plain XQuery source in the install tree that you can review.

The "Monitoring History" GUI is a client-side app that calls the public REST endpoints and then does further processing on the client side to present various views. The exact processing algorithms are not documented, but the JavaScript code is also in 'plain text' for examination. It's not always obvious where the client-side JavaScript is doing additional data processing versus simply calling the REST endpoints -- nor the exact mapping of what you see to what the backend requests are.

If your goal is to reproduce the 'raw' data, I recommend going directly to the Meters database; that is the 'source of truth' -- to the extent possible. The specific problem you ask about is more subtle. It is not always possible to totally reproduce the 'rollup' behaviour from the raw data in the Meters database to the hourly, monthly, etc. data. It's largely straightforward, but internally there are cases where data is held with more precision, or with more variables, than are published to the database, and this internal dataset is used for the aggregations/rollups -- meaning that the math does not always work out exactly the same.

Furthermore, there are internal 'rules' applied for rounding to the nearest minute/hour/5 minutes etc. in order to produce more consistent data results, but with the side effect that data can be dropped. For example, if the server is under heavy load, there can be cases where the exact times of the data samples are 'rounded' to the next period and misrepresent averages for that period. The first and last partial hours of continuous server running may not have reproducible hourly rollups -- i.e. if you calculate the partial periods on your own, you may not get the same answers, because the timestamps have been adjusted to fall on even periods. The data internal to the server does not do this 'rounding'; its purpose is to make client application code easier to write and to produce reasonable results.

There are further subtleties when attempting to aggregate across servers in a cluster (as the GUI does). It's not always obvious what metrics like "IO Rate" mean when applied to a cluster -- is it the cluster-wide sum or average? Much like interpreting "load average" on a multi-core system.

From my reading of your question, I suggest you use the data directly from the Metering database as the 'source' data. If you start with the raw data, then drop all 'partial data' (data that falls outside the start and end of the next higher rollup) -- i.e. if you start your server at 5:53, drop all the raw data until 6:00, then include all the raw data from 6:00 to 7:00 -- you should find a near-exact match with the hourly data written out at 7:00, provided you use ALL the raw attributes in the equation (min, max, sum, avg, sumsq). Within rounding precision, these should match up.
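A minimal sketch of that rollup logic, in Python for illustration (the `Sample` fields stand in for the raw Meters attributes named above; the field names and the 60-samples-per-hour completeness check are assumptions drawn from this discussion, not the server's actual algorithm):

```python
from dataclasses import dataclass
from datetime import datetime
from collections import defaultdict
from typing import Dict, List

@dataclass
class Sample:
    """One raw per-minute sample, mirroring the raw Meters attributes."""
    ts: datetime
    min: float
    max: float
    sum: float
    count: int
    sumsq: float

def hourly_rollup(samples: List[Sample]) -> Dict[datetime, Sample]:
    """Combine minute samples into hourly aggregates, keeping only
    complete hours (60 samples) -- the 'drop partial data' advice above."""
    buckets: Dict[datetime, List[Sample]] = defaultdict(list)
    for s in samples:
        hour = s.ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(s)

    rollups: Dict[datetime, Sample] = {}
    for hour, group in buckets.items():
        if len(group) < 60:          # partial hour at startup/shutdown: skip
            continue
        rollups[hour] = Sample(
            ts=hour,
            min=min(s.min for s in group),
            max=max(s.max for s in group),
            sum=sum(s.sum for s in group),
            count=sum(s.count for s in group),
            sumsq=sum(s.sumsq for s in group),  # keep sumsq so variance stays derivable
        )
    return rollups
```

The hourly average then falls out as `rollup.sum / rollup.count`, which for a per-minute rate metric is the same fn:sum(...) div 60 that matched the hourly value in the comments.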

Using the 'higher level' APIs may produce answers that you don't expect. They aren't wrong, but there are many combinations of parameters with different possible meanings -- and the APIs won't error out just because you supplied parameters that are ambiguous or inconsistent. You can compare this to other metrics service providers, such as AWS CloudWatch: not all possible combinations of parameters produce understandable results. But the raw data -- unmolested -- does not suffer from this.

Also, the REST APIs make heavy use of the indexes for efficiency. The indexes are not at the same precision as the XML data, so you can get precision-related inaccuracies depending on the exact values -- the indexes use 32-bit values. Depending on the server version, the XML data may use either 32- or 64-bit values, but the indexes still truncate to 32 bits.
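To get a feel for the scale of that 32-bit truncation, here is a quick Python illustration, using IEEE single precision as a stand-in for the index representation (the server's actual 32-bit encoding is an assumption here):

```python
import struct

def to_float32(x: float) -> float:
    """Round-trip a Python float (64-bit double) through a 32-bit IEEE single."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

value = 123456789.123456          # e.g. a large cumulative counter value
truncated = to_float32(value)
# At this magnitude a 32-bit float can only represent multiples of 8,
# so the fractional part (and a few integer digits) are lost.
print(value, truncated, abs(value - truncated))
```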

If you want accuracy, IMHO, avoid the JSON output -- due to JSON's inherent problems with numeric precision. This is compensated for in the Monitoring History, but it's quite tedious to do so.
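A quick illustration of the JSON precision issue: most JSON consumers (notably JavaScript) parse every number into an IEEE 754 double, so distinct large integer values -- such as 64-bit counters -- can collapse to the same number. Python is used here only to demonstrate the double arithmetic; Python's own json module happens to keep integers exact:

```python
import json

big = 2 ** 53 + 1                 # 9007199254740993, just past a double's exact-integer range
# A JavaScript-style consumer would hold this as a double:
print(big, int(float(big)))       # the +1 is silently lost in the double

# Adjacent integers collapse once past 2**53:
assert float(2 ** 53) == float(2 ** 53 + 1)

# Python's json round-trips the int exactly -- but browsers and most
# JSON libraries in other languages would not:
assert json.loads(json.dumps(big)) == big
```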

If you want maximum query performance, then DO use the REST endpoint (either XML or JSON) -- it is optimized for query performance across a variety of request types. While it doesn't use any 'magic', it's not easy to achieve the same performance and accuracy yourself directly from the Meters data. Again, look at the code for the endpoints; it's all plain XQuery for inspection. But it is not the 'source of truth' for the raw data -- its intent is efficient *time series aggregate queries*, not maximum precision. For nearly all usages, that's what you want.