The phonology of letter shapes: Feature economy and informativeness in 43 writing systems

Journal of Memory and Language, Volume 142, April 2025, 104620

Highlights

The visual shapes of letters, contrary to the phonemes of spoken languages, lack a unified description.

With a gamified approach, we crowdsourced thousands of letter descriptions.

43 scripts’ letters can be completely described as sets of binary features.

Compared to phonemes, letters require more such features for a complete description.

These features are also less informative in letters than in phonemes.

Abstract

Differentiating letter shapes accurately is a core competence for any reader. Are letter shapes as distinctive as they could be? Unlike the phonemes of spoken languages, the visual shapes of letters lack a unified description: there is no equivalent of the phonological features that describe most phonemes in the world’s languages. Using a gamified crowdsourcing approach, we elicited thousands of letter descriptions from lay people for the sets of letter shapes (the scripts) used in 43 diverse writing systems. Participants were asked to sort the letters of each script repeatedly into two groups. From the 19,591 letter classifications contributed by 1,683 participants, we extracted a sufficient number of binary classifications (features) to provide a unique description for all letters in the 43 scripts. We show that scripts, compared to phoneme inventories, use more features to produce similar sets of distinct elements. Compared to the phoneme inventories of a large sample of the world’s languages (the P-base dataset, collected by another team), our 43 scripts have lower feature economy (fewer symbols for a given number of features) and lower feature informativeness (a less balanced distribution of feature values). Compared to phonemes, letter shapes require more binary features for a complete description. These features are also less informative in letters than in phonemes: the chances that two random letters in a script differ on any given feature are low. Letter shapes, which have more degrees of freedom than speech sounds, use those degrees of freedom less efficiently.
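The two measures named in the abstract can be illustrated on a toy inventory. The sketch below is not the authors' code, and the exact formulas used in the paper are not given here. It assumes that feature economy is the ratio of symbols to features, and that informativeness can be approximated either by the probability that two distinct random letters differ on a feature or by the binary entropy of the feature's values. The four-letter inventory and all function names are hypothetical.

```python
from itertools import combinations
from math import log2

# Hypothetical toy script: each letter is described by three binary features.
toy = {
    "a": (1, 0, 0),
    "b": (0, 1, 0),
    "c": (1, 1, 0),
    "d": (0, 0, 1),
}


def is_complete_description(inventory):
    """True if every letter has a unique feature vector."""
    return len(set(inventory.values())) == len(inventory)


def feature_economy(inventory):
    """Assumed definition: number of symbols divided by number of features."""
    n_features = len(next(iter(inventory.values())))
    return len(inventory) / n_features


def differ_probability(inventory, i):
    """Share of distinct letter pairs that differ on feature i."""
    pairs = list(combinations(inventory.values(), 2))
    return sum(a[i] != b[i] for a, b in pairs) / len(pairs)


def feature_entropy(inventory, i):
    """Binary entropy (in bits) of feature i; 1.0 means a perfectly balanced split."""
    p = sum(v[i] for v in inventory.values()) / len(inventory)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


if __name__ == "__main__":
    print("complete:", is_complete_description(toy))
    print("economy:", feature_economy(toy))
    for i in range(3):
        print(i, differ_probability(toy, i), feature_entropy(toy, i))
```

In this toy inventory the third feature singles out only one letter, so it separates fewer letter pairs and carries less entropy than the two balanced features. That is the sense in which an unbalanced distribution of feature values is less informative.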

Keywords

Writing

Reading

Letter recognition

Phonology

Combinatoriality

Feature economy

© 2025 The Authors. Published by Elsevier Inc.
