Loss of meaning
Mar. 28th, 2008 06:10 pm
I am forced to work with a new (new to me) reporting application. It crunches OS performance metrics into cubes and allows the creation of graphs. The problems with this reporting are vast, and close to insurmountable.
The documentation does not specify the calculation methods; you just have to trust it. The only trending it does is linear regression. The summaries do not include percentiles or quartiles, only min, max, average, and count (yes, basic SQL math).
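To make the point concrete, here is a quick sketch of what those missing quartiles buy you. Everything below is invented for illustration (it is Python, not the vendor's tool, and the numbers are made up): two workloads with identical min, max, average, and count that the quartiles instantly tell apart.

```python
from statistics import mean, quantiles

# Two invented workloads with identical min, max, average, and count --
# the only summaries the tool reports -- that the quartiles tell apart.
steady  = [2] + [50] * 94 + [98]     # one dip, one spike, otherwise flat
bimodal = [2] * 48 + [98] * 48       # half the fleet idle, half pegged

for name, data in (("steady", steady), ("bimodal", bimodal)):
    q1, q2, q3 = quantiles(data, n=4)   # quartiles (Python 3.8+)
    print(f"{name:7s} min={min(data)} max={max(data)} "
          f"avg={mean(data):.1f} n={len(data)} "
          f"quartiles=({q1:.0f}, {q2:.0f}, {q3:.0f})")

# steady  min=2 max=98 avg=50.0 n=96 quartiles=(50, 50, 50)
# bimodal min=2 max=98 avg=50.0 n=96 quartiles=(2, 50, 98)
```

Min, max, average, and count cannot distinguish "one transient spike" from "half the fleet is on fire." Quartiles can.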
So tell me, dear statisticians: does CPU Busy % sampled at 15-minute intervals, averaged across 20+ servers for 28 days, mean anything? Okay, let me rephrase: does it provide any level of useful information? No. No it does not. Why? First, no order of operations is given. Is it taking all 53,760 datapoints and averaging them in one pile? Is it averaging each server first, then across servers? Is it averaging each day, then across days? Second, the data is so diluted it is meaningless. The same flat line could hide several servers that are pegged and several that are completely idle, everything running at some even percentage, or anything in between. Yet these are the default graphs it displays and that people go ga-ga over.

I spent five hours trying to tease useful information out of the system. The closest I got, without making graphs for each server, was bar graphs of average and max (for a given time period) per function. Still, if I needed to act on that, I could not. Even the Max is a mystery unless I want to start extracting the raw data myself. Is it a maximum of the maximums (the highest number found)? Is it the average of the maximums (max per day per server carried forward)? You might think interesting trivia like that would be in the documentation. You would be wrong.
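If the order-of-operations complaint sounds pedantic, it isn't: the three plausible averaging orders only agree when every server reports every interval, and any gap (a server down for a day, a server added mid-month) makes them diverge. Here is a toy demonstration, again in Python with an invented fleet; the server names and numbers are made up, and none of this is the vendor's code.

```python
from statistics import mean

# Made-up fleet: {server: {day: [96 samples of CPU Busy %]}}.
# One server was added mid-month, so the sample counts are uneven.
samples = {
    "web01": {d: [90.0] * 96 for d in range(28)},      # pegged all month
    "web02": {d: [5.0] * 96 for d in range(28)},       # nearly idle
    "web03": {d: [50.0] * 96 for d in range(14, 28)},  # added mid-month
}

# 1. Flat: average every datapoint in one pile.
flat = mean(x for days in samples.values() for day in days.values() for x in day)

# 2. Server-first: average each server over the window, then across servers.
by_server = mean(
    mean(x for day in days.values() for x in day) for days in samples.values()
)

# 3. Day-first: average each day across whatever servers reported, then across days.
by_day = mean(
    mean(x for days in samples.values() if d in days for x in days[d])
    for d in range(28)
)

print(f"flat={flat:.2f} server-first={by_server:.2f} day-first={by_day:.2f}")
# -> flat=48.00 server-first=48.33 day-first=47.92

# And the two readings of "Max" disagree far more:
daily_max = [max(day) for days in samples.values() for day in days.values()]
print(f"max of maxes={max(daily_max):.0f} avg of daily maxes={mean(daily_max):.2f}")
# -> max of maxes=90 avg of daily maxes=48.00
```

Three different "averages" and two wildly different "maxes" from the same data, and the documentation never says which one you are looking at.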
So, on Monday I get to dive into it again, make some pretty pictures, call the vendor and ask for answers, and then tell the client the tool they purchased should be replaced by a freaking abacus and graph paper.