Working Through Historical Problems

Basic thoughts on Rule-writing for Historical Datasets

Historical problems are somewhat similar to phonology problems: you have a good amount of data, and you should pay attention to environment. However, there are some differences as well:

a. You are comparing two surface forms, so every segment should be written in square brackets. Likewise, since we are describing a change in forms over time, rather than a change from mental forms to surface forms, we generally use ‘>’ to show the change rather than the ‘→’ we used for phonological problems.

b. While environment may condition a change, there are cases where a segment simply changed everywhere it occurred. Such a change is not environment-specific, so you can have a historical rule with no environment specified.

c. In phonology, you had to decide which segment you thought was the underlying form. In a similar way, for historical datasets, you should decide which segments you think occurred first. With some datasets, this is already given to you (when an earlier stage of the language is compared with a more modern stage). However, when you are given a dataset comparing two current languages, you need to decide what proto-form the two would have shared. In these cases, you must also specify which language the sound change occurred in.

That being said, there are four possible versions of rules that you can write:

                            Not language-specific       Language-specific
Not environment-specific    *a > b                      *a > b in X language
Environment-specific        *a > b / (environment)      *a > b / (environment) in X language

So when should you use each?

– You should only use the language-specific versions when you are comparing data from two modern languages.

– You should only use the non-environment-specific versions when you never see that proto segment staying the same between the historical and modern form.
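If it helps to see the four rule shapes from the table side by side, here is a minimal Python sketch. The class name `SoundChange`, its field names, and the `notation` method are my own illustrative choices, not anything from the textbook; the placeholder values (`a`, `b`, `(environment)`, `X language`) are the ones used in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SoundChange:
    """One historical rule; environment and language are optional (hypothetical helper)."""
    source: str                        # the proto segment, written after '*'
    target: str                        # what it became
    environment: Optional[str] = None  # leave as None if the change happened everywhere
    language: Optional[str] = None     # only needed when comparing two modern languages

    def notation(self) -> str:
        text = f"*{self.source} > {self.target}"
        if self.environment:
            text += f" / {self.environment}"
        if self.language:
            text += f" in {self.language}"
        return text


# The four cells of the table, in reading order.
print(SoundChange("a", "b").notation())
print(SoundChange("a", "b", language="X language").notation())
print(SoundChange("a", "b", environment="(environment)").notation())
print(SoundChange("a", "b", "(environment)", "X language").notation())
```

Leaving a field empty corresponds to choosing the "not environment-specific" row or the "not language-specific" column.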

Below is a brief run-through of a historical problem, just to show how I would approach one. Hope it helps!

This walk-through is based on question 4 in your book, Contemporary Linguistics (page 286).

This problem is somewhat easier because we are comparing a modern language (Bulgarian) to a historical form of the language (Proto-Slavic), so we know which sounds must have occurred earlier. Because of this, we automatically know that we will not need to specify which language the sound change occurred in.

First, compare the forms between the two. We’ll start with item (a), where we see that the word *[gladŭka] changed to [glatkə]. Note: the * means the historical, earlier form.

I notice three things that differ between the two forms (marked with numbers in parentheses after the relevant segments):

*[glad(1)ŭ(2)ka(3)] > [glat(1, 2)kə(3)]

Let’s take the change shown by (2) first. We see that *[ŭ] has been deleted from Proto-Slavic to Bulgarian. After we have noticed this, we automatically check the other data. Are there other instances of *[ŭ] being deleted from Proto-Slavic to Bulgarian? Yes! In items (b), (c), and (e), *[ŭ] similarly disappeared in the development to Bulgarian. So I turn my first observation into a rule:

Rule 1: *[ŭ] > Ø (Ø indicates the null set, showing deletion)

Since *[ŭ] does not survive anywhere in the words, and all instances of *[ŭ] disappear in Bulgarian, you don’t have to add any environment to the rule. This is an outright deletion of the segment.
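The "does this proto segment ever survive?" check can be sketched in a few lines of Python. This is only a rough illustration: the function name `survives` is made up, the data are just the two pairs quoted in this walk-through (items (a) and (e)), and the check merely asks whether the same symbol shows up in both forms rather than doing a proper segment-by-segment alignment.

```python
import unicodedata


def nfc(text):
    """Normalize so that ŭ is always a single character, however it was typed."""
    return unicodedata.normalize("NFC", text)


# (Proto-Slavic, Bulgarian) pairs quoted in this walk-through: items (a) and (e).
PAIRS = [("gladŭka", "glatkə"), ("lovŭka", "lofkə")]


def survives(segment, pairs):
    """Crude check: does the segment appear in both the proto form and the modern form of any pair?"""
    seg = nfc(segment)
    return any(seg in nfc(proto) and seg in nfc(modern) for proto, modern in pairs)


print(survives("ŭ", PAIRS))  # False: no environment needed for the deletion rule
print(survives("a", PAIRS))  # True: a rule for *[a] will need an environment
```

A result of False is only as good as the data you feed in, so you would still want to look through every item in the dataset by eye.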

The next most obvious change is the reduction of *[a] to [ə], as shown by (3). Note, however, that you cannot write a rule that is not environment-specific, because there are instances of *[a] that remain in Bulgarian (the first vowel of *[gladŭka] is still [a] in [glatkə]). Probably the most obvious environment where *[a] reduces is word-final position (you could also say that it happens after [k], but I think that is a mere coincidence, and there is good historical evidence for reduction at word boundaries). So we write our rule:

Rule 2: *[a] > [ə] / __# (The underline shows where the segment occurs, and # stands for a word boundary. This reads as ‘proto [a] reduces to [ə] in Bulgarian when it appears before a word boundary’.)
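If you like to think of the environment notation mechanically, here is one way to translate it into a regular expression in Python. The helper `environment_pattern` is hypothetical (my own naming), and it only handles a single literal segment or `#` on each side of the blank, which is all that Rule 2 needs.

```python
import re


def environment_pattern(target, left="", right=""):
    """Build a regex for `target` in the environment  left __ right  ('#' = word boundary)."""
    def side(context, lookaround, edge):
        if context == "":
            return ""              # nothing specified on this side
        if context == "#":
            return edge            # word boundary: start or end of the form
        return f"(?{lookaround}{re.escape(context)})"

    return re.compile(side(left, "<=", "^") + re.escape(target) + side(right, "=", "$"))


# Rule 2: *[a] > [ə] / __#
rule2 = environment_pattern("a", right="#")
print(rule2.sub("ə", "gladŭka"))  # gladŭkə (only the word-final a changes)
```

The lookahead and lookbehind keep the neighbouring segment out of the match, so only the target itself is replaced.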

Finally, let’s look at the change indicated by (1). In this particular word, we see the devoicing of *[d] to [t]. Of course, after we have noticed this, we must first look at the rest of the data to see if there are other instances of this.

There is no other instance of *[d] in the data, so we may be tempted to make just one rule for this instance (*[d] > [t]). However, we would be missing out on a generalization here. Note that all of the segments in the proto-language that appeared before the deleted *[ŭ] and *[ĭ] (except for *[r]) changed to their voiceless counterparts in Bulgarian. This is a pattern that requires a bit of storytelling. First of all, we know that we can’t say that these segments all changed to new segments, disregarding environment, because some of those segments still occur in Bulgarian (consider (d) for example).

Let’s deal with the first half of the rule first. What describes all of those segments, with the exception of [r]? They are all [-sonorant], that is, obstruents. So the first half of our rule should be:

*voiced obstruents > voiceless

Now we need to decide how to describe the environment. It would be incorrect to say that this occurred before *[ŭ] or *[ĭ], because then we would expect to see evidence of the devoicing in the proto-language. Instead, it is better to imagine that the historical change deleting *[ŭ] and *[ĭ] happened first, and that the previously voiced obstruent then became voiceless before [k], since we have good evidence of that environment in modern Bulgarian. This makes our third rule:

Rule 3: *voiced obstruents > voiceless / __[k]

Now we have written rules to explain all of the first item (a). If we check (b) and (c), we see that all of the same rules account for the changes there. However, then we come to (d). Note that in (d), there is no *[ŭ], but there is a *[ĭ] that gets deleted. We could capture this by making another rule:

*[ĭ] >  Ø

However, note how similar it is to our Rule 1. Is there something that the segments *[ŭ] and *[ĭ] share? Well, both carry the breve (˘), the symbol marking a short vowel, and these deleted segments are the only short vowels in the Proto-Slavic data here. This allows us to write the following rule:

Rule 1 (revised): *short vowels >  Ø

(If you wanted to add the environment notation __k, you could, but it’s not necessary)

In summary, then, we have three rules:

Rule 1: *short vowels > Ø

Rule 2: *[a] > [ə] / __#

Rule 3: *voiced obstruents > voiceless / __[k]

It doesn’t really matter when Rule 2 applied, but Rule 1 must have applied before Rule 3 historically, as discussed.

Then we test our work with item (e):


Original form: *[lovŭka]

Apply Rule 1 (*short vowels > Ø): *[lovka]

Apply Rule 2 (*[a] > [ə] / __#): *[lovkə]

Apply Rule 3 (*voiced obstruents > voiceless / __[k]): [lofkə]

As we can see, after applying all the rules, the resulting form is identical to the Bulgarian form. So it checks out!

Note that if we had applied Rule 3 before Rule 1, the voiced obstruents would not yet have been next to a [k], and hence they wouldn’t have changed to their voiceless counterparts.
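To see the ordering argument play out, here is a small Python sketch that applies the three rules as regular-expression substitutions, first in the order argued for above and then with Rule 3 moved to the front. The function names and the `DEVOICE` table are my own; the table lists only the two pairs attested in items (a) and (e), so it is a partial stand-in for "voiced obstruents", not a full inventory.

```python
import re
import unicodedata


def nfc(text):
    """Normalize so that each short vowel is a single character."""
    return unicodedata.normalize("NFC", text)


SHORT_VOWELS = nfc("ŭĭ")
DEVOICE = {"d": "t", "v": "f"}  # assumption: only the pairs seen in items (a) and (e)


def rule1(form):
    """*short vowels > Ø"""
    return re.sub(f"[{SHORT_VOWELS}]", "", form)


def rule2(form):
    """*[a] > [ə] / __#"""
    return re.sub("a$", "ə", form)


def rule3(form):
    """*voiced obstruents > voiceless / __[k]"""
    return re.sub("[dv](?=k)", lambda m: DEVOICE[m.group(0)], form)


def derive(proto, rules):
    """Apply the rules one after another, in the order given."""
    form = nfc(proto)
    for rule in rules:
        form = rule(form)
    return form


print(derive("gladŭka", [rule1, rule2, rule3]))  # glatkə
print(derive("lovŭka", [rule1, rule2, rule3]))   # lofkə
print(derive("lovŭka", [rule3, rule1, rule2]))   # lovkə (Rule 3 found no [k] to its right)
```

Swapping the position of `rule2` in the list makes no difference to the output, which matches the point that it doesn’t really matter when Rule 2 applied.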