How Real are Proto Languages?

Last week I attended a very interesting session from guest speaker J.P. Mallory, an archaeologist who has branched into Indo-European studies. I have long appreciated Mallory’s ability to bring a skeptical eye to the question of how archaeology and language connect.

A question that arises in historical linguistics is: when we reconstruct a proto-language, how “real” is it? Some believe that what we reconstruct was literally spoken in the past, and others believe that any reconstruction is completely artificial. As usual, I’m somewhere in the middle.

An interesting point Mallory made is that after two centuries of work, we can really reconstruct only a fraction of Indo-European roots. Pokorny’s comprehensive dictionary lists between 2,000 and 3,000 roots, but most real languages have many more roots (even the “low tech” languages). Counts are hard to lock down because it can be hard to distinguish roots from words, but a conservative estimate would be in the tens of thousands. Even if we take a very conservative estimate of 10,000 roots, I suspect that we have reconstructed only about 20 to 30% of them.

Beyond that, there are other issues which make you wonder how well you can reconstruct a proto-language. One is that almost all “languages” have dialects of one sort or another. In fact, when English “arrived” in Britain, it probably came as a series of dialects, which are well attested in Old English. Similarly, English came to the U.S. in four major dialect groups. You could never really reconstruct a single, uniform “Proto-English” as spoken in Britain.

In case you’re wondering if non-power languages are immune to breaking up into dialects, the answer is that it may be worse, especially if a language has been in the region for a long time. For instance, Mayan has survived, but it survives as a fairly large language family rather than as a single language.

But is a proto-language completely fake? I think that one aspect that’s fairly realistic is the sound. Although we don’t have a perfect representation of the Indo-European sound system, I think we can be fairly confident about a lot of forms. Similarly, I think one can be reasonably confident about some basic grammatical forms. The one part that continues to elude scholars is sentence structure, and I think it will continue to do so.

So is it foolish to reconstruct proto-languages? Definitely not. I think we can learn a lot about how language changes. For instance, the fact that reconstructed Proto-Romance is so different from attested Classical Latin shows how different street Latin was from written Latin. Reconstruction is also a good tool for determining whether a root was borrowed recently, was borrowed a few centuries ago, or has been there all along.

Reconstruction is a good tool, but as with other tools, there are still limits which we need to determine.

