Monday, May 28, 2012

Generating Semantic Maps

One of the central features of the Conlanger's Thesaurus is its set of cross-linguistic semantic maps. For the first version of the Thesaurus I used those I could find in publicly available linguistics journal articles. But it occurred to me that I could come up with some of these on my own.

First I wrote some straightforward software that manipulates lists of definitions to produce the semantic maps automatically. I wasn't expecting this approach to succeed right away, but my initial assumptions and model turned out to work pretty well.

The biggest problem has been finding good dictionaries to work with. All too many online dictionaries (and not a few printed ones) are simply lists of words with single-word definitions. This is not a great way to get at polysemy, since a one-word gloss shows only one sense of a word. However, over the last few days I have managed to find enough good dictionaries online to make me confident in the cross-linguistic (and cross-cultural) polysemy maps I've been creating.

The code is explained at Generating Cross-Linguistic Semantic Maps. At the bottom of that page is a list of core words around which I have generated maps. Even if you cannot understand the Python programming language, you can see the list of languages and meanings I have used in the links that end in .py. The maps are images of the common polysemies.
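For readers who want a feel for the general idea without opening the linked page, here is a minimal sketch. It is not the code from that page. It assumes one simple model: two meanings get linked on a map when a single word covers both of them in at least some minimum number of languages. The data, the language and word names, the function names and the `min_languages` threshold are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: language -> word -> set of meanings (English glosses).
# Every entry here is invented for illustration; none comes from a real dictionary.
TOY_LEXICONS = {
    "lang_a": {"wordA1": {"face", "blade"}, "wordA2": {"sweet", "salty"}},
    "lang_b": {"wordB1": {"face", "blade", "front"}, "wordB2": {"sweet", "tasty"}},
    "lang_c": {"wordC1": {"face", "front"}, "wordC2": {"sweet", "salty", "tasty"}},
}


def shared_word_counts(lexicons):
    """For each pair of meanings, count the languages where one word covers both."""
    counts = Counter()
    for words in lexicons.values():
        pairs_in_language = set()
        for meanings in words.values():
            # Sort so each pair is stored in one canonical order.
            pairs_in_language.update(combinations(sorted(meanings), 2))
        # Each language contributes at most once per pair.
        counts.update(pairs_in_language)
    return counts


def semantic_map_edges(lexicons, min_languages=2):
    """Return (meaning, meaning, language_count) edges that meet the threshold."""
    counts = shared_word_counts(lexicons)
    return sorted((a, b, n) for (a, b), n in counts.items() if n >= min_languages)


if __name__ == "__main__":
    for a, b, n in semantic_map_edges(TOY_LEXICONS):
        print(f"{a} -- {b} (shared by one word in {n} languages)")
```

Counting each pair once per language, rather than once per word, keeps a single dictionary with many near-synonyms from dominating the map. The resulting edge list could then be handed to any graph-drawing tool to produce an image.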

Two things in these maps surprised me. First, "face" can refer to the blade of a knife in two utterly unrelated languages (Turkish and Inupiaq). Second, "sweet" can refer surprisingly often to what English speakers consider other flavors, especially "salty."

Wednesday, May 23, 2012

A Conlanger's Thesaurus

For the last year or so I have been thinking about writing a piece of software that would spit out a skeleton dictionary which I could fill in for a new language I was creating. The point was to help me get out of certain lexical ruts, while still creating a language that would be more or less internally consistent. I gave up on that project, but one side effect was a lot of reading about recent work in lexical typology. I found the semantic maps especially interesting as tools for conlanging. I've collected a bunch of that work in A Conlanger's Thesaurus.

The core of that document is a word list, lightly edited, but mixed in whenever possible are cross-linguistic semantic maps, to prod thinking about new possibilities for words that don't simply reproduce the semantic boundaries of languages I already know. There are still a lot of gaps, but this seems a good start.

The last two pages have some dense semantic maps relating to matters most people would consider grammatical rather than lexical, but there's a lot of interesting material there, too. Whenever possible I have included links to free, online papers and articles which discuss the maps in much more detail, so people can hunt down details if they are moved to do so.

If you know of cross-linguistic maps I have missed, please let me know. I'm sure I'll be updating this document from time to time as I run across new work.