Size-dependent word frequencies and translational invariance of books
It is shown that a real novel shares many characteristic features with a null model in which the words are randomly distributed throughout the text. One such common feature is a certain translational invariance of the text. Another is that the functional form of the word-frequency distribution of a novel depends on the length of the text in the same way as that of the null model. This means that an approximate power-law tail ascribed to the data will have an exponent that changes with the size of the text section analyzed. A further consequence is that a novel cannot be described by text-evolution models such as the Simon model. The size transformation of a novel is found to be well described by a specific Random Book Transformation. In addition, this size transformation enables a more precise determination of the functional form of the word-frequency distribution. The implications of the results are discussed.
Year of publication: | 2010 |
---|---|
Authors: | Bernhardsson, Sebastian ; Rocha, Luis Enrique Correa da ; Minnhagen, Petter |
Published in: | Physica A: Statistical Mechanics and its Applications. - Elsevier, ISSN 0378-4371. - Vol. 389.2010, 2, p. 330-341 |
Publisher: | Elsevier |
Subject: | Word frequency distributions ; Random book transformation ; Text evolution models |
Similar items by person
- Bounds of percolation thresholds in the enhanced binary tree. Baek, Seung Ki, (2011)
- Equilibrium strategy and population-size effects in lowest unique bid auctions. Pigolotti, Simone, (2011)
- Thörnqvist, Christer, (2015)