Excel 2007 PERCENTRANK is trash.
Tuesday March 6, 2012
People and software do not always mean the same thing when they talk about percentiles, percent rank, and so on. Do not expect different software to give the same values. In particular, Excel uses a method that is probably not what you expect and does not correspond to methods implemented in scientific software.
For Excel 2007, when you ask for the PERCENTRANK of a value that appears in the range, you actually get (the number of items strictly less than the value) / (the total number of items minus one). This has the nice feature of (at least for distinct values) giving percent ranks that range from zero to one inclusive. It has the nasty feature of almost certainly not being what you thought it was going to be, and not being what you'll get from SAS, R, SPSS, SciPy, etc. (It is, however, mimicked fairly well in other spreadsheet software.)
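To make that formula concrete, here is a minimal Python sketch of it, next to one common alternative convention (the fraction of items at or below the value). The function names are my own, not Excel's or any library's. The first function only covers values that appear in the data, and it follows the description above rather than anything taken from Excel's internals.

```python
def excel_style_percentrank(values, x):
    """Percent rank as described above: (count strictly below x) / (n - 1).

    Hypothetical helper for illustration; only handles x that appears in values.
    """
    if x not in values:
        raise ValueError("x must appear in values; interpolation is not covered here")
    if len(values) < 2:
        raise ValueError("need at least two values")
    below = sum(1 for v in values if v < x)
    return below / (len(values) - 1)


def fraction_at_or_below(values, x):
    """One common alternative convention: (count less than or equal to x) / n."""
    return sum(1 for v in values if v <= x) / len(values)


data = [10, 20, 30, 40, 50]
print(excel_style_percentrank(data, 10))  # 0.0
print(excel_style_percentrank(data, 30))  # 0.5
print(excel_style_percentrank(data, 50))  # 1.0
print(fraction_at_or_below(data, 30))     # 0.6
```

Same data, same value, two different answers (0.5 versus 0.6), which is the whole problem in miniature.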
It isn't immediately obvious how Excel works out the PERCENTRANK for values that don't appear in the range. Some sort of interpolation, certainly - but not one that was easy for me to guess quickly. I'd love to know what the heck it is.
And it isn't just that Excel is non-standard - it also appears to be buggy. Here's one bizarre example I came across of Excel 2007 at work, in which PERCENTRANK is not stable when values are multiplied (or divided) by 100, sometimes giving the same percent rank for different values, sometimes giving different percent ranks for the same values. Check out the rows in bold. You should be able to replicate this in Excel 2007 if you like (with nine digits of precision requested from PERCENTRANK).
The moral of the story? DON'T USE EXCEL FOR ANYTHING, BUT ESPECIALLY NOT MATH.
This post was originally hosted elsewhere.