## Sunday, May 12, 2013

### Conlanging with LaTeX, Part Two

In the previous post I suggested a LaTeX tutorial you might use to get a basic command of the system. I'm going to assume everyone reading this has played around a little with LaTeX.

Before you can produce any document in LaTeX, you need to tell it a little about what you intend. The very simplest trussing for this will look a lot like this:

\documentclass{article}

\begin{document}
Saluton!
\end{document}

The space between the \documentclass and \begin{document} lines is called the preamble, and this is where you can put all sorts of other declarations to change how LaTeX works, either by changing its default behavior or by adding new functionality. For this post, I'm going to mention a few things that are useful for conlangers to have in their preambles. Specifically, I'm going to focus on what LaTeX calls packages. Fortunately, if you do a web search on most LaTeX packages you can get good documentation on how to use them effectively.

The first thing you should know is that the font size can be changed in the \documentclass line. I usually like a 12pt font, but you can also ask for 10 points (the default) or 11 points. As always, you need to use other packages to get more font size options.

\documentclass[12pt]{article}

By default, LaTeX has rather large margins. I have no need for so much whitespace, so I use the fullpage package to pull out the margins to something less wasteful of paper:

\documentclass{article}
\usepackage{fullpage}

\begin{document}
Saluton!
\end{document}

And that's all you need to say. Simply by using the package, the changes you want take effect.

The next big thing is a package to manage fonts. In the old days, dealing with fonts in LaTeX was truly a nightmare — strange font names, freaky encodings, fonts themselves in a special LaTeX format, fights between different packages and font expectations, etc. These days, the XeTeX version of LaTeX has much simpler font management capabilities, though you still have to do a little work.

For XeTeX to find a TrueType or OpenType font, it needs to be installed in the usual places your OS would put the font, since it relies on local mechanisms to find them.

There is a utility package that helps manage all this, fontspec:

\documentclass{article}
\usepackage{fullpage}

\usepackage{fontspec}
\defaultfontfeatures{Mapping=tex-text}
\setromanfont{Gentium Basic}
\newfontfamily\gplus{Gentium Plus}

\begin{document}
Saluton!
\end{document}

So, what I'm doing here is loading up the package, then immediately running some commands provided by that package to set some font defaults. The \defaultfontfeatures line tells XeTeX I want to use the normal, old-fashioned LaTeX digraphs and trigraphs for certain kinds of characters. For example, it will convert three consecutive minus signs into an em-dash (—), in the usual LaTeX way. If you omit this line, many examples of LaTeX you might find on the web may break in subtle ways for you.
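
To see what the mapping does, here is a minimal sketch you can compile with XeLaTeX. It assumes Gentium Basic is installed, as in the preamble above; the sample sentences are made up for illustration. Newer fontspec releases document Ligatures=TeX as the engine-independent spelling of the same feature, so you may run into that form in other people's preambles.

```latex
\documentclass{article}
\usepackage{fontspec}
% Turn on the traditional TeX input conventions (XeTeX spelling of the feature).
\defaultfontfeatures{Mapping=tex-text}
\setromanfont{Gentium Basic}

\begin{document}
% Two backticks and two apostrophes become curly double quotes.
``Saluton!''

% Two hyphens become an en-dash.
pages 10--12

% Three hyphens become an em-dash.
wait---what?
\end{document}
```

If you comment out the \defaultfontfeatures line and compile again, the same input should come out with literal hyphens and straight marks, which is the subtle breakage mentioned above.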

The next line, \setromanfont, picks the default font for the document. I like the Gentium family, since it has lots of accenting support, as well as Ancient Greek, which I often find myself using.

The next line lets me create a font command. It turns out the Gentium Plus font has much better support for IPA characters, so when I want to type IPA, I can use the \gplus command to switch to it. Note that you have to enclose the commands created that way in curly braces to limit their effect. An example from my Kahtsaai grammar:

\item Double \LL{ł}, \LL{łł}, is pronounced [{\gplus ɮ}:].

So, the \newfontfamily command needs a command name, which you choose, and then a font name. Here, I picked the name \gplus (the leading backslash is required for all LaTeX command names).
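
If typing the extra braces every time gets tedious, one option is to wrap the font switch in a command that takes the text as an argument. The sketch below is only an illustration: the name \ipa is hypothetical (it is not something fontspec provides), and it assumes both Gentium fonts are installed.

```latex
\documentclass{article}
\usepackage{fontspec}
\setromanfont{Gentium Basic}
\newfontfamily\gplus{Gentium Plus}

% Hypothetical helper, not part of fontspec: the inner pair of braces
% keeps the \gplus font switch from leaking past the argument.
\newcommand{\ipa}[1]{{\gplus #1}}

\begin{document}
% Equivalent to typing [{\gplus ɮ}:] by hand.
The lateral is pronounced [\ipa{ɮ}:].

% After the closing brace of \ipa, text is back in Gentium Basic.
This sentence is set in the default font again.
\end{document}
```

The design choice here is just convenience: the font switch itself still does all the work, and the wrapper only guarantees that the braces are never forgotten.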

The fontspec package is vast and powerful, allowing many interesting effects. You can look at the documentation to learn about more of its capabilities. I will just add that it is common for LaTeX documentation to have a large section at the end with the actual package code, with explanations. Most of the time, that is safely skipped.

I like to use different sorts of underlining in examples, for which the package ulem is very useful. Just use \usepackage{ulem}, and then you get some new LaTeX commands:

\uline{Just a normal underline.}
\uuline{A double underline}.
\uwave{A wavy underline}.
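
One thing to watch for: by default, ulem also changes \emph so that emphasized text is underlined instead of italicized. If you want to keep italics for emphasis, load the package with the normalem option. A minimal sketch, with made-up sample text:

```latex
\documentclass{article}
% normalem leaves \emph as italics; without it, ulem makes \emph underline.
\usepackage[normalem]{ulem}

\begin{document}
\emph{Still italic, thanks to normalem.}

\uline{Just a normal underline.}

\uuline{A double underline.}

\uwave{A wavy underline.}

% ulem also provides a strike-out command.
\sout{Struck out.}
\end{document}
```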

Some people will want to use the tipa package, which provides a funky encoding for IPA. I don't use it these days, since I don't always like the look of the output.

These are the most basic packages I use. There are a few more, but they are complex enough, or add such large new functionality, that I will save them for future posts.

Do experienced LaTeX-er conlangers have other basic packages to recommend, other than things like multicol, makeidx, or hyperref, which I hope to talk about more in the future?

In the next post, I will talk a bit about defining your own simple macros to ease some formatting tasks, and tables tables tables...

## Friday, April 26, 2013

### Conlanging with LaTeX, Part One

One common set of questions in conlanging forums is about how to organize the material, the grammar, the dictionary, lessons, etc.  While there are some dedicated language tools out there, most of them are fairly complex or expensive.  So most people just use word processors for their grammars and sometimes spreadsheets for their dictionaries, assuming they use computers at all.

At this point, I'm prepared to say there are no good tools for writing a dictionary.  There are tools out there, but they tend to be very tricky to use well, assuming the hobbyist conlanger can even afford the cash or the time to invest in such tools.  And for tools to let people collaborate on a lexicon?  Forget it.

So, I just write my dictionaries as text. Here's an example lemma for Kahtsaai:

[Image: a typeset Kahtsaai dictionary entry]

No spreadsheet is going to produce anything that looks like this without a great deal of programming. It might be nice to have a nifty tool to manage a dictionary entry like this, but a general tool to do that would be so complex that I'm not sure it would be worth the effort.

Because I want my grammars and dictionaries to look good, I had to pick something nicer than a plain text file or even HTML.  I went with LaTeX, a very sophisticated typesetting system that started out in the world of mathematics and the sciences, but which humanities folks are starting to learn to appreciate.  Unlike a word processor, which is WYSIWYG, "what you see is what you get," LaTeX takes a different approach.  You type up your document in a special typesetting language, and then you feed that to a LaTeX program which spits out your document after making all the typesetting decisions and formatting for you.  Paraphrasing, you tell LaTeX what you intend, and it produces the nicest possible output matching your intent.

In LaTeX simple things are simple. You could typeset a printed letter in it, and except for some messing about at the start of the file, what you had to type wouldn't look much different from an email (though the output would be far nicer). But LaTeX is programmable, and is thus capable of very sophisticated things. Here, for example, is a semantic map which was described entirely in TikZ, a graphics language that exists for LaTeX:

[Image: a semantic map drawn with TikZ]

It is this ability to do sophisticated things when you need to that makes LaTeX such a powerful tool.

Due to an early encounter with old Latin grammars, I prefer to typeset my grammars with bold face for text in the language, italics for translations, and just the normal font for English explanations.  But, rather than tell LaTeX to bold everything in my conlang, I write a macro which I enclose all my conlang in.  That way, if one day I decide to format everything differently, I just have to change the macro, run the LaTeX program again, and voilà! out comes a new version of my grammar with everything changed to the new way.  I wrote a set of macros to typeset my dictionary entries in the way I prefer.
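
As a rough sketch of that idea (not my actual macros), the preamble below defines two commands, one for conlang text and one for translations. The names \conlang and \translation are hypothetical, chosen only for this example.

```latex
\documentclass{article}
\usepackage{fontspec}

% Hypothetical macro names for illustration.
% All conlang text goes through \conlang, all translations through \translation.
\newcommand{\conlang}[1]{\textbf{#1}}
\newcommand{\translation}[1]{\textit{#1}}

\begin{document}
The verb \conlang{móka} means \translation{trick} or \translation{deceive}.
\end{document}
```

To restyle every conlang word in the document, you would change only the one definition, for instance replacing \textbf with \textsf, and run LaTeX again.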

Reasons a conlanger might want to use LaTeX:

• It's programmable, and thus easy to make sweeping formatting changes with minimal effort.
• Modern versions speak Unicode natively, so it's good for fun character sets and accents galore.
• Modern versions can also use almost any font you want.
• The output is gorgeous.
• Conlangers love tables, and LaTeX has very powerful table capabilities.
• Cross-references are useful in grammars, and LaTeX has a powerful reference system, which can produce clickable citations in a PDF.
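
To give a taste of the cross-reference system, here is a minimal sketch. The section titles and the label name sec:verbs are invented for the example, and the hyperref package is what makes the references clickable in the PDF. You need to run LaTeX twice so the section and page numbers resolve.

```latex
\documentclass{article}
% hyperref turns \ref and \pageref into clickable links in the PDF.
\usepackage{hyperref}

\begin{document}

\section{Verbs}\label{sec:verbs}
Verb morphology is described here.

\section{Postpositions}
% \ref prints the section number; \pageref prints the page it falls on.
Several postpositions interact with the verb frames discussed in
section~\ref{sec:verbs} on page~\pageref{sec:verbs}.

\end{document}
```

Because the reference is by label rather than by number, you can reorder or add sections and the numbers update themselves on the next run.
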

In the next few blog posts, I am going to explain some features of LaTeX that would be most useful for conlangers. I cannot do a full tutorial on LaTeX. One good tutorial is Learn to use LaTeX, but there are many on the internet easily found by search. I recommend you practice with a few quick and simple documents before reading the other posts.

LaTeX is free software, and there are several different distributions out there. I strongly recommend TeX Live. It's sort of large, but it will have all the extra linguistics packages you want to use, and it includes XeTeX, the most powerful modern LaTeX engine, which speaks Unicode natively and has far, far nicer font management tools. It's the best choice for conlangers. I will assume XeTeX for all my posts on LaTeX.

In my next post, I will go a bit more into detail about the things you'll want in your LaTeX preamble to make XeTeX pick the best fonts for multilingual work.  And maybe start in on tables.

## Monday, December 17, 2012

### At least...

Among the easiest things to smuggle into a conlang from one's native tongue are discourse particles and phrases. I recently had reason to think about the phrase "at least," which means at least three distinct things.

First, it is used to set a lower limit on some statement about degree or scale. It is easiest to see with numbers, but has a wide range of uses beyond that. Adding the tag "if not more" is often a good diagnostic test for this use,

• I saw at least five [if not more].
• She has invited at least Sarah and James [if not others].
• He's at least slightly depressed [if not seriously so].

The second sense is evaluative. It selects a particular part of a larger state of affairs and marks it as something the speaker expects everyone to see as positive,

• At least I got an A-.
• At least she didn't ask me out.

The last sense I've seen called "rhetorical retreat." You identify the source of your information but step back from committing to its reliability,

• Mary is at home – at least I think so.
• Mary is at home – at least that's what Sue said.

If you have a conlang with a single word or phrase for "at least" and it means all three of these things, you've accidentally slurped up whole a little corner of English.

## Monday, October 22, 2012

### New Thesaurus

Yet another version of the Conlanger's Thesaurus. The most interesting change is on the last page, which has a semantic map of the diminutive. Thanks to Alex Fink for bringing this excellent semantic map to my attention.

## Thursday, September 6, 2012

### Conlanger's Thesaurus: new version

Version 1.5 of the Conlanger's Thesaurus is out. It has mostly minor changes: a few more grammaticalizations, some minor tweaks in the parts of speech, and some clarifying suggestions others have offered.

## Thursday, August 2, 2012

### Kahtsaai: Distributive Portmanteau

So, while I sit here with a plumber working on my shower, I finally pulled the trigger on a change I've been thinking about for a while for Kahtsaai. One of the slot-one prefixes for verbs is -na'a- which marks distributed or widespread activity. Thinking about common uses for a while, I decided it needed to merge with two of the person prefixes for subject, he- (3 inanimate) and hááí- (3pl. animate).

The resulting portmanteaus are he'a- and háá'ya-, giving such fun as he'a'ánméín, "It's going to be hot (everywhere) (I hear)," and háá'yawósénats, "they were running around everywhere."

That change only took a week to commit to.

## Thursday, June 28, 2012

### Recent Developments in Kahtsaai

In the last few months I have been focusing almost entirely on Kahtsaai vocabulary, and allowing that to drive any tweaks to the grammar. At this point, I consider the skeleton of the grammar complete, wanting only a lot more detail for certain sections.

### The Imperfective

For most of its life Kahtsaai has had a single primary verb of motion, ló, which was usually marked with either the trans- or cis-locative prefix to distinguish "go" and "come." This turns out to be typologically very rare, which was fine, but I finally started to find it annoying, so I added aas, "come." The form kóh-ló is still available for "come," but it cannot be used when the speaker means "right here where we're talking now," which is aas's core meaning.

At the same time aas was coming into being, I was getting a bit annoyed about the regularity of the imperfective marker, -na. I did not want to add massive irregularity, but it just wasn't sitting right all by itself. So, I added a small number of verbs which take the imperfective in -rá/-réí. The choice between the two forms depends on things like stem syllable weight and compensatory lengthening after certain assimilations, but for practical purposes should be considered irregular. In a last act of randomness, I seriously modified aas, giving it an imperfective of saréí. Finally, an imperfective in -rá becomes -réí when the adverbial suffix -ne/-hte is added, always resulting in -réín. This parallels the -na > -naan change.

I have confined the -rá/-réí forms to intransitive verbs of motion ("come," "flow"), location and posture ("stand," "hang") and weather ("lightning"). I don't expect that to change. Right now only thirteen verbs have this new imperfective. Probably a few more will enter this class over time, but I doubt it will be too many.

### Postpositions and Verbs do the Frame Dance

I recently added the postposition -próh. It is imagined that at one point in its history it covered certain meanings one expects of the dative, but by about, say, a half a millennium ago it was confined to marking the experiencer of certain verbs of emotion or judgement. For example, léíkou means insipid, flavorless, boring. With -próh one can say someone is bored,

    Ra'é       tápróh   heléíkou.
    ra'é       tá-próh  he-léíkou
    that.INAN  1SG-to   3INAN-be.insipid
    That bores me.

The postposition now also marks the judicantis role, that is, the person in whose judgement a statement holds true.

    Táttá       aapróh     máámo  łakíntsááłtsi                   wé.
    tá-ttá      aa-próh    máámo  ła-kí-n-tsááł-ts                wé
    1SG-father  3AN.SG-to  money  TRNS-3INAN.S-3INAN-misuse-EVID  this
    To my father, this is a waste of money.

In thinking about the core uses for -próh an interesting commonality has developed, where a stative verb takes the "detransitive of causative" marking -ríi-se and is then used with -próh to mark the induction of some state in a person. For example, láhme means "be angry, be unpleasant," but rather than taking the causative for "to anger," this -ríi-se form is used instead: tápróh yoláhmeríise, "he made me angry." I'm expecting to see more of the construction X-próh Vstative-ríi-se in the future.

Finally, I have started thinking more about the frames of new and existing vocabulary, and making sure I have examples covering expected uses. One result of this is that the postposition -por, "seeking after, wanting," is now used to mark the ultimate goal for purposive action. For example, the verb móka means "trick" or "deceive." The postposition -por marks the goal of the deception if that is expressed,

    Yokatmókats            máámo  onpor          pá.
    yo-kat-móka-ts         máámo  on-por         pá
    3AN.SG-1PL-trick-EVID  money  3INAN-wanting  PTCL
    He tricked us for the money.

This week makes me want to give in to the "40 words for snow" syndrome, and create a rich vocabulary to describe my own emotional state when experiencing 95-100°F days and very high humidity. I'm also trying to think up a good way to express "at stake, on the line," as in the phrase, "when your life is at stake." This is a subtle one.