Monday, May 28, 2012

Generating Semantic Maps

One of the central features of the Conlanger's Thesaurus is its set of cross-linguistic semantic maps. For the first version of the Thesaurus I used those I could find in public linguistics journal articles. But it occurred to me that I could come up with some of these on my own.

First I wrote some straightforward software that manipulates lists of definitions to produce the semantic maps automatically. I wasn't expecting this approach to work so well right away, but my initial assumptions and model have held up.
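To give a rough idea of what "manipulating lists of definitions" can look like, here is a minimal sketch. It is not the actual code behind the maps. It assumes each dictionary has been reduced to words with sets of English glosses, and it counts how many languages put two senses under a single word. The language labels, word labels, function names, and the two-language threshold are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: language -> word -> set of glosses.
# Labels are placeholders, not entries from any real dictionary.
LEXICONS = {
    "lang_a": {"w1": {"face", "front", "surface"}, "w2": {"sweet", "tasty"}},
    "lang_b": {"w3": {"face", "surface"}, "w4": {"sweet", "tasty", "pleasant"}},
    "lang_c": {"w5": {"face", "front"}},
}


def polysemy_edges(lexicons, core):
    """Count, per pair of senses, how many languages have a word
    that carries both senses and also carries the core sense."""
    counts = Counter()
    for words in lexicons.values():
        seen = set()  # count each pair at most once per language
        for senses in words.values():
            if core in senses:
                seen.update(combinations(sorted(senses), 2))
        counts.update(seen)
    return counts


def common_edges(counts, min_languages=2):
    """Keep only the pairs attested in at least min_languages languages."""
    return {pair: n for pair, n in counts.items() if n >= min_languages}


edges = common_edges(polysemy_edges(LEXICONS, "face"))
for (a, b), n in sorted(edges.items()):
    print(f"{a} -- {b}: {n} languages")
```

Each surviving pair would be one edge in a map, weighted by the number of languages that attest it. Drawing the image is a separate step that this sketch leaves out.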

The biggest problem has been finding good dictionaries to work with. All too many online dictionaries — and not a few printed ones — are simply lists of words with single-word definitions. This is not a great way to get at polysemy. However, over the last few days I have managed to find enough good dictionaries online to make me confident in the cross-linguistic (and cross-cultural) polysemy maps I've been creating.

The code is explained at Generating Cross-Linguistic Semantic Maps. At the bottom of that page is a list of core words around which I have generated maps. Even if you cannot understand the Python programming language, you can see the list of languages and meanings I have used in the links that end in .py. The maps are images of the common polysemies.

Two things in these maps have surprised me. First, "face" can refer to the blade of a knife in two utterly unrelated languages (Turkish and Inupiaq). Second, "sweet" refers surprisingly often to what English speakers consider other flavors, especially "salty."
