Correct Ordering in the Zipf--Poisson Ensemble
Rankings based on counts are often presented to identify popular items, such as baby names, English words, or Web sites. This article shows that, in some examples, the number of correctly identified items can be very small. We introduce a standard error versus rank plot to diagnose possible misrankings. Then, to explain the slowly growing number of correct ranks, we model the entire set of count data via a Zipf--Poisson ensemble with independent $X_i \sim \mathrm{Poi}(N i^{-\alpha})$ for $\alpha > 1$ and $N > 0$ and integers $i \geq 1$. We show that as $N \to \infty$, the first $n'(N)$ random variables have their proper order $X_1 > X_2 > \cdots > X_{n'}$ relative to each other, with probability tending to 1 for $n'$ up to $(AN/\log(N))^{1/(\alpha+2)}$ for $A = \alpha^2(\alpha+2)/4$. We also show that the rate $N^{1/(\alpha+2)}$ cannot be achieved. The ordering of the first $n'(N)$ entities does not preclude $X_m > X_{n'}$ for some interloping $m > n'$. However, we show that the first $n''$ random variables are correctly ordered exclusive of any interlopers, with probability tending to 1 if $n'' \leq (BN/\log(N))^{1/(\alpha+2)}$ for any $B < A$. We also show how to compute the cutoff for alternative models such as a Zipf--Mandelbrot--Poisson ensemble.
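To give a feel for how slowly the number of reliably ordered ranks grows, the cutoff formula quoted in the abstract can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names `ordering_constant` and `rank_cutoff` are hypothetical, and the result is an asymptotic rate evaluated at a finite N, so it is indicative only.

```python
import math


def ordering_constant(alpha: float) -> float:
    """Constant A = alpha^2 * (alpha + 2) / 4 as stated in the abstract."""
    if alpha <= 1:
        raise ValueError("the abstract assumes alpha > 1")
    return alpha ** 2 * (alpha + 2) / 4


def rank_cutoff(N: float, alpha: float) -> float:
    """Evaluate (A * N / log N)^(1 / (alpha + 2)) at a finite N.

    N is the Poisson mean of the top-ranked item (X_1 ~ Poi(N)).
    The abstract's statement is asymptotic (N -> infinity); this
    function only plugs a finite N into the same expression.
    """
    if N <= 1:
        raise ValueError("N must exceed 1 so that log(N) is positive")
    A = ordering_constant(alpha)
    return (A * N / math.log(N)) ** (1 / (alpha + 2))


if __name__ == "__main__":
    # Illustrative values only; alpha = 2 is an arbitrary choice.
    for N in (1e4, 1e6, 1e8):
        print(f"N = {N:.0e}: cutoff ~ {rank_cutoff(N, alpha=2.0):.1f}")
```

Because the exponent is 1/(α + 2), multiplying N by 100 increases the cutoff by less than a factor of 100^{1/(α + 2)}, which is consistent with the abstract's remark that the number of correct ranks grows slowly.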
| Year of publication: | 2012 |
|---|---|
| Authors: | Dyer, Justin S. ; Owen, Art B. |
| Published in: | Journal of the American Statistical Association. - Taylor & Francis Journals, ISSN 0162-1459. - Vol. 107.2012, 500, p. 1510-1517 |
| Publisher: | Taylor & Francis Journals |
Similar items by person
- Correct Ordering in the Zipf--Poisson Ensemble
  Dyer, Justin S., (2012)
- Owen, Art B., (2001)
- Designing experiments informed by observational studies
  Rosenman, Evan T. R., (2021)