Volume 8 Issue 1 (2010)
DOI:10.1349/PS1.1537-0852.A.346
Note: Linguistic Discovery uses Unicode characters
to represent phonetic symbols. Please see Optimizing Display
for requirements to accurately reproduce this page.
Semantic Maps as
Metrics on Meaning
Michael Cysouw
Max Planck Institute for Evolutionary Anthropology,
Leipzig
By using the world’s
linguistic diversity, the study of meaning can be transformed from an
introspective inquiry into a subject of empirical investigation. For this to be
possible, the notion of meaning has to be operationalized by defining the
meaning of an expression as the collection of all contexts in which the
expression can be used. Under this definition, meaning can be empirically
investigated by sampling contexts. A semantic map is a technique to show the
relations between such sampled contextual occurrences. Or, formulated more
technically, a semantic map is a visualization of a metric on contexts sampled
to represent a domain of meaning. Or, put more succinctly, a semantic map is a
metric on meaning.
To establish such a metric, a notion of (dis)similarity is needed.
The similarity between two meanings can be empirically investigated by looking
at their encoding in many different languages. The more similar these encodings,
in language after language, the more similar the contexts. So, to investigate
the similarity between two contextualized meanings, only judgments about the
similarity between expressions within the structure of individual languages are
needed. As an example of this approach, data on cross-linguistic variation in
inchoative/causative alternations from Haspelmath (1993) is
reanalyzed.
1.
Measuring Meaning
Meaning is a particularly elusive property to measure. The
central problem is that the meanings of linguistic expressions are variable
across languages, and it is still mostly unknown how large this variability is.
It does not really help to analyze the meaning of a language-specific expression
(for example the English verb
to walk) by saying that it expresses a
general concept (like WALK). Such a change in typography still leaves open the
question as to what the relation is between WALK and, for example, the meaning
of the German word
spazieren or the Spanish word
andar. Actually,
without a more explicit definition of the concept WALK, asking whether
andar
expresses the concept WALK is not much different from asking whether
andar means the same as
to walk. Yet, individual linguistic
expressions across languages never convey exactly the same range of senses,
making such a simplistic approach to comparing meaning across languages devoid
of content.
In this paper, I will defend the view that a much more profitable
operationalization of cross-linguistic variability of meaning is achieved by
defining
the meaning of a language-specific expression as the collection of
all contexts in which the expression can be used
. This definition
represents, to some extent, a reversal of the intuitive notion of meaning.
Meaning is typically thought of as some kind of property of a linguistic
expression that governs its potential appearance in a particular context. In
this conventional view, the main difficulty is how to express this property
called “meaning”. The approach to meaning proposed in this paper
simply defines this property as the sum of all actual appearances. It is of
course practically impossible to ever collect all appearances of a particular
linguistic expression (be it a lexical or a grammatical item) in a living
language—though this is possible for a dead language by including all
documentation available—but samples of contexts can be used for any
empirical question at hand (cf. Croft 2007; Wälchli & Cysouw 2008 for a
similar approach to meaning).
Samples of the actual occurrences of expressions in concrete contexts
can be used to compare the variation in meaning between different
language-specific expressions. So, instead of assuming that we know what the
English expression
walk means, I propose to sample its meaning by
considering various contextualized occurrences of walk-like situations. To
compare expressions across languages, ideally the same sample of contexts should
be used for all languages investigated. The parallel collection of such
occurrences across languages can take various forms. It is possible to use
extra-linguistic stimuli, like pictures (e.g. Levinson & Meira 2003) or
video sequences (e.g. Majid et al. 2007), and investigate the linguistic
expressions used to describe them. The contexts can also be defined purely
linguistically, using descriptions of situations (e.g. Dahl 1985) or examples
from parallel texts (e.g. Wälchli 2005).
In the practice of grammatical typology it is often impossible to
collect sufficient parallel expressions because of the limited amount of
material available and because of the difficulty of finding native speakers for
all the languages to be investigated. So, instead of concrete occurrences of
language-specific expressions in context, normally somewhat larger domains of
contexts are used in which an expression can occur (e.g. Haspelmath 1997). These
domains are (more or less) explicitly defined as “chunks” of
meaning, large enough to be identifiable from reference grammars, and small
enough to capture the main distinctions of the cross-linguistic
variation.[1]
Both parallel expressions in context as well as the somewhat more abstract
domains of meaning as used conventionally in linguistic typology are called
ANALYTICAL PRIMATIVES in Cysouw
(2007).[2]
One of the consequences of comparing languages on the basis of an
(empirical) selection of analytical primitives is that such a selection strongly
reduces the range of possible meanings that can be identified across languages.
Instead of the real-world continuous variation of possible meanings, a (finite)
sample of analytical primitives only allows for a restricted, point-wise,
granular view on this variation. In this approach, the meaning of a
language-specific expression reduces to a subset of the sampled primitives. This
subset consists of those sampled contexts in which the language-specific
expression occurs. From the perspective of individual languages, the semantic
analysis offered on the basis of such a selection of primitives might be
somewhat coarse-grained and perhaps to some extent even misleading. The most
important gain of this approach, however, is that it offers a concrete
operationalization of the cross-linguistic study of meaning. From this
perspective,
the comparison of the meanings of two expressions from two
different languages consists in the comparison of the selected subsets of
analytical primitives.
Any deficits in the comparison arising from a biased
selection of analytical primitives can easily be repaired by changing or
extending the sample of primitives.
To be able to make cross-linguistic comparisons between
language-specific expressions from different languages, first the internal
structure among the primitives must be considered. This paper deals with the
empirical establishment of such structure among analytical primitives in the
form of semantic maps (Section 2). The actual comparison of language-specific
expressions (i.e. questions like “how similar is English
walk to
Spanish
andar, and in which aspects to they differ?”) will not be
further pursued here.[3]
In a very
general sense, the structure among analytical primitives amounts to establishing
a
metric on analytic primitives, i.e. a specification of the distances (or
“dissimilarities”) between them
, as will be discussed in Section
3. One way to empirically arrive at these dissimilarities between primitives is
to use cross-linguistic diversity in the encoding of the primitives, as
discussed in Section 4.
Only language-specific analysis is necessary to
establish dissimilarities between primitives—no cross-linguistic
judgements are necessary.
This important insight led to the establishment of
semantic maps in the first place but will be generalized here in Section 5. In
Section 6, I will argue that
both form and behavior can be analyzed as
language-specific encoding
. An example of this conceptualization of the
cross-linguistic study of meaning is presented in Section 7, in which data from
Haspelmath (1993) on the inchoative/causative alternation is
reanalyzed.
2.
Semantic Maps
Anaytical primitives are not just points in an unstructured
cloud of semantic space. Some primitives are more similar to each other than to
others. Such structure among analytical primitives is suitably analyzed by using
semantic maps (cf. Haspelmath 2003). Semantic maps are a special kind of
analysis and display of the internal structure of a sample of analytical
primitives. My use of the terms SEMANTIC SPACE and SEMANTIC MAP is most closely
related to Haspelmath’s terminology, in which “a semantic map is a
geometrical representation of functions in ‘conceptual/semantic
space’ ” (Haspelmath 2003:213). This is different from the
terminology used by Croft (although there is no difference in content), who uses
the term “conceptual space” for the geometrical representation, and
“semantic map” for the language-specific instantiation (cf. Croft
2001:92ff; Croft 2003:133-139; Croft & Poole 2008:3). The different
terminologies are summarized in Table 1.
Differently from the received view of such semantic maps, I propose here
to strictly separate the notion of a semantic map into two different aspects,
namely the STRUCTURE among the primitives and the DISPLAY of this structure. The
structure itself will be formulated as a metric on the primitives; the display
of the structure is the semantic map proper. Given a particular set of data,
there will both be different ways to establish the structure among the
primitives, and there will be different ways to display any structure attested.
Because of the multitude of possibilities, it is particularly important to
separate effects stemming from the decision on how to measure the structure from
effects resulting from the specific method of visualizing the structure. In this
paper, I will only discuss approaches to the establishment of the structure
among primitives. The discussion of the various possible visualizations will be
left for another occasion.
Concept
|
|
Terminology
|
|
|
This paper
|
Haspelmath
|
Croft
|
Collection of all possible
analytical primitives
|
conceptual/
semantic space
|
conceptual/
semantic space
|
–
|
Structure within the set
of analytical primitives
|
cross-linguistic
metric on meaning
|
semantic map
|
conceptual space
|
Graphical representation
of attested structure
|
semantic map
|
semantic map |
conceptual space |
Language-specific encoding
of analytical primitives
|
language-specific metric on meaning
|
boundaries in
semantic map
|
semantic map
|
Graphical representation of language-specific encoding
|
language map
|
boundaries in
semantic map |
semantic map |
Table 1: Terminological clarification
3.
Metrics and Distance Matrices
A METRIC is the mathematical explication of a notion of
distance (or dissimilarity, i.e. the opposite of similarity). In our daily
world, the most natural notion of distance is the Euclidean distance, i.e. the
distance “as the crow flies”. However, when moving from point A to B
it is often not possible to take the direct route (if you are not a crow), so
another natural metric is the ground travel distance. This notion of distance
can widely deviate from the straight-line Euclidean distance, namely when there
is no (approximately) direct route to get from A to B while staying on the
ground. Still another way to measure distance in daily life is to take the time
it takes to get from A to B. Again, this notion of distance might give a rather
different perspective on our surroundings depending on transportation
possibilities. These different ways of measuring distance illustrate that
any
notion of distance is a question of perspective and is not in any sense
pre-established by the nature of the objects investigated.
This holds also
for metrics on meaning: what counts as similar in meaning depends on which
perspective one wants to
take.[4]
The result of applying a metric on some data is a table of pairwise
distances for all pairs of objects investigated: a DISTANCE MATRIX. So, given
some data and a decision on how to interpret the data (the metric), distances
between pairs of objects can be computed. Normally, such pairwise distances are
expressed as a (fractional) number between zero and one. At the one extreme
“0” indicates “no distance”, i.e. the two objects are
the same, and at the other extreme “1” indicates “maximal
distance”, i.e. the objects are completely different. It is not necessary
to normalize distances to this zero-one interval, but it makes it easier to
combine distance matrices. Also, decimally written values between zero and one
can intuitively be taken to represent percentages. For example, a distance of
0.54733 can be interpreted as “almost 55% of the maximal distance”.
And, finally, the distances between zero and one are easily switched to
similarities, because when two objects have a distance of
d, then they
have a similarity of 1−
d.
Distance matrices can become bewilderingly large and difficult to
interpret for a human being. For example, with only 10 analytical primitives
there are already 10×9÷2=45 distances between pairs of primitives.
Just looking at such a long list of numbers will normally not result in very
revealing insights because it is difficult to identify meaningful distinctions
amid the wealth of available information. There are many ways to help a human
being make sense of what would otherwise be categorized as information overload,
but this is an extensive topic which I will not discuss in detail here. Suffice
it to say that visualization is a highly powerful technique, though it can also
be deceptive because human eyes (and brains) tend to see patterns even when
there are none. For this reason it is advisable never to rely on just one
visualization and to always determine afterwards whether any patterns perceived
are really statistically significant. Finally, it is important to recognize that
every visualization is always an abstraction of the underlying data, or, put
more bluntly, many details are necessarily ignored, or intentionally
misrepresented, in the process of making a visually pleasing graphic display.
The network-like graph used for traditional semantic maps (cf. Haspelmath 2003)
is an example of such a pleasing graphic display for which various fundamental
abstractions of the available data are made (cf. Cysouw 2007 for a detailed
criticism).
4.
Using Linguistic Diversity
The basic intuition behind the semantic map approach to
meaning is that
cross-linguistic variation in the expression of meaning can
be used as a proxy to the investigation of meaning itself
. Concretely,
recurrent similarity in form reflects similarity in meaning, or, as Haiman
(1985:19) puts it: “recurrent identity of form between different
grammatical categories will always reflect some perceived similarity in
communicative function.” Thus, the assumption is that when the expression
of two meanings is similar in language after language, then the two meanings
themselves are similar. Individual languages might (and will) deviate from any
general pattern, but when combining many languages, overall the cross-linguistic
regularities will overshadow such aberrant
cases.[5]
Formulated within the framework set up in the previous sections, this
basic intuition can be formalized as follows. To start off, a sample of
analytical primitives has to be established, and expressions of these primitives
must be collected for a sample of the world’s languages. Then, for each
language individually, the similarity between these expressions can be
established within the structure of the language (i.e. only language-specific
constructions and language-internal form-similarities are investigated).
Technically formulated, this means that a language-specific metric on the
expressions will be set up—a different one for each language (see Section
7.2 for a concrete example of how this might work). Then,
the
cross-linguistic metric on the analytical primitives (“semantic
map”) is the average of the language-specific metrics on the expressions
collected.
This simple statement represents a big step forward for any
empirical investigation of meaning (cf. Haspelmath 2003:230-233). Instead of
requiring elusive judgments about the similarities between meanings, all that is
needed now are very concrete judgments about the similarity between
language-specific expressions within one and the same language. So, to establish
a cross-linguistically viable metric on meaning, it is not necessary to perform
cross-linguistic comparisons of expressions from different languages. Purely on
the basis of many language-specific analyses, it is possible to arrive at
general results.
5. Constructions and
Strategies
To establish a metric on expressions, a notion of
(dis)similarity between expressions is needed. There are basically two different
kinds of (dis)similarity. The first possibility is to compare the amount of
shared morphophonological material between expressions. Such similarity is
purely language-specific and cannot be used to directly compare expressions
across languages (except of course in historical-comparative reconstruction). In
contrast, more abstract characteristics are necessary to establish the
cross-linguistic similarity between expressions. Examples of more abstract
characteristics are the order of elements, the length of expressions, or the
degree of fusion between elements (e.g. isolation, concatenation, or non-linear
morphology). This is an important differentiation, as made implicitly in the
semantic map literature. The first similarity leads to a LANGUAGE-SPECIFIC
EXPRESSION METRIC (“constructions”) and the second to a
CROSS-LINGUISTIC EXPRESSION METRIC (“strategies”). Most of the
comparisons in the field of linguistic typology are based on comparing
cross-linguistic strategies (cf. Croft 2003:31ff.). However, semantic maps are
purely based on language-specific constructions.
Given a language-specific metric, a LANGUAGE-SPECIFIC CONSTRUCTION (in
the sense of Croft 2001; Goldberg 2006) is a set of language-specific
expressions that are highly similar from the perspective of the metric. What
exactly “highly similar” means is of course less obvious, but any
disputable similarity-boundary will likely be reflected by an equally vague
notion of what defines the construction involved. Though different
operationalizations of similarity can be used (and see Section 7 for a few
possibilities), I am strongly in favor of a gradient notion of language-specific
constructions (i.e. individual expressions in a language are more or less
similar on a continuous scale). I think it is misguided to look for any strict
definition of constructions that discretely classifies all expressions of a
language into separate constructions.
Being the counterpart to constructions, a TYPOLOGICAL STRATEGY is a set
of expressions that are highly similar from the perspective of a
cross-linguistic metric (the term “strategy”, now commonly found in
the typological literature, was probably first used in this sense by Keenan
& Comrie 1977:64). Just as constructions are abstractions of
language-specific metrics, strategies are abstractions of cross-linguistic
metrics. For example, consider the causative/inchoative alternation, to be
discussed extensively in Section 7. The English inchoative expression
the
vessel is destroyed
has a causative counterpart
the torpedo destroyed the
vessel
. Now, the language-specific construction to derive the anticausative
from the causative in English for the verb
destroy is to use an
expression with the verb
to be. From a cross-linguistic perspective, this
alternation is an example of an “anticausative” typological
strategy, using the terminology of Haspelmath (1993:91), because the inchoative
is transparently derived from the causative.
The main claim of the semantic map approach is that a metric on meaning
(“semantic map”) can be established purely on the basis of many
language-specific expression metrics (“constructions”), averaged
over a diverse sample of languages. Cross-linguistic metrics
(“strategies”) are not necessary for this
goal.[6]
6.
Coding and Behavior
There are many different ways to establish a
language-specific expression metric. In the next section, concrete examples of
three different metrics on the same data will be discussed in detail. One
somewhat atypical aspect of the following examples is that the metrics are based
on pairs of expressions, not on single expressions as in traditional semantic
maps (Haspelmath 2003). This approach—considering the relation between two
expressions—is reminiscent of Keenan’s (1976:306-307)
“transformational behavior”. Following Keenan, the terms
“coding” and “behavior” have become widespread for the
analysis of grammatical relations. Generalizing this distinction, I will use the
term “coding properties” for properties of individual expressions,
while “behavioral properties” are properties of the relation between
expressions.
The properties may be pragmatic, semantic, or syntactic. And of
the syntactic ones, some concern properties internal to a single sentence [i.e.
“coding”, MC] and others concern the relation between a b-sentence
and some modification of it [i.e. “behavior”, MC]. (Keenan
1976:312)
Under this definition, the opposition coding vs. behavior is
independent from the opposition construction vs. strategy, as discussed in the
previous section. There are thus four logically possible combinations that
represent different approaches to characterizing and comparing
expressions.
First, a
coding strategy is a cross-linguistic classification of
the structure of a particular expression. This is the most prototypical kind of
approach in linguistic typology. The classic example is the typology of relative
clause structures distinguishing types like “relative pronoun
strategy” or the “internally headed relative clauses” (Lehmann
1984; Comrie & Kuteva 2005). Second, a
behavioral strategy is a
cross-linguistic classification of the relation between various expressions
(typically two, but possibly more). A classic example is the relation between a
regular matrix sentence like
John swept the floor and the corresponding
action nominal construction
John’s sweeping of the floor (cf.
Keenan 1976:321). For this behavior, a cross-linguistic classification of
possible strategies used by human languages has been developed by
Koptjevskaya-Tamm (1993, 2005).
Third,
constructional coding is a characterization of the
language-specific form of a expression. This is the typical information that is
used in traditional semantic maps. The more similar two expressions are in terms
of their constructional coding, the closer their meaning (when averaged over a
large number of languages). Finally,
constructional behavior is the
fourth possibility. This method of characterizing expressions is not very widely
acknowledged in the typological literature, but it will be the approach that I
will use in the case study in the next section. The basic idea is to compare the
combined language-specific forms of all alternative expressions that are
relevant for the behavior.
7.
Case Study
7.1 Causative/inchoative
alternations
As an example of the approach presented here, I will
reanalyze the data from Haspelmath (1993) on the causative/inchoative
alternation. In his paper Haspelmath addresses the question as to how languages
mark the predicate in the alternation between an inchoative expression like
the water boiled and a causative expression like
the man boiled the
water
. In the case of the English predicate
boil there is no
difference in the marking, but for other alternations, like
die/kill or
be destroyed/destroy, the difference between the inchoative and the
causative version is reflected in the lexical or morphological form of the
predicate. The approach of Haspelmath’s study is to investigate
cross-linguistic
strategies of expressing the relation between inchoative
and causative meanings, but that aspect of his study will not be the main focus
of this paper (some preliminary hints on the relation between strategies and
meaning will be given at the end of Section 7.2). Instead, I will investigate
the
relations between the meanings of the predicates by investigating the
language-specific marking that is used to express the inchoative/causative
alternation.
Haspelmath investigated the inchoative/causative alternation for 31
analytical primitives (“lexical meanings”) in 21 languages. The 31
meanings investigated are repeated here in Table 2 (adapted from Table 2 in
Haspelmath 1993:97).[7]
The
translations of these meanings in all 21 languages are added as an appendix to
Haspelmath’s paper, allowing for the current reanalysis of the
data.[8]
No.
|
Inchoative
|
Causative
|
No.
|
Inchoative
|
Causative
|
1
|
wake up
|
wake up
|
17
|
connect
|
connect
|
2
|
break
|
break
|
18
|
boil
|
boil
|
3
|
burn
|
burn
|
19
|
rock
|
rock
|
4
|
die
|
kill
|
20
|
go out
|
put out
|
5
|
open
|
open
|
21
|
rise
|
raise
|
6
|
close
|
close
|
22
|
finish
|
finish
|
7
|
begin
|
begin
|
23
|
turn
|
turn
|
8
|
learn
|
teach
|
24
|
roll
|
roll
|
9
|
gather
|
gather
|
25
|
freeze
|
freeze
|
10
|
spread
|
spread
|
26
|
dissolve
|
dissolve
|
11
|
sink
|
sink
|
27
|
fill
|
fill
|
12
|
change
|
change
|
28
|
improve
|
improve
|
13
|
melt
|
melt
|
29
|
dry
|
dry
|
14
|
be destroyed
|
destroy
|
30
|
split
|
split
|
15
|
get lost
|
lose
|
31
|
stop
|
stop
|
16
|
develop
|
develop
|
|
|
|
Table 2: Inchoative/causative pairs investigated in
Haspelmath (1993)
I will use the language-specific marking of the
inchoative/causative alternation of the meanings listed in Table 2 as a proxy to
the measurement of the similarity between the meanings. For example, the English
expression of meaning 1,
wake up/wake up, does not use any marking to
differentiate inchoative from causative. This means that meaning 1 is somewhat
alike to meaning 2, in English expressed as
break/break, which likewise
does not differentiate inchoative from causative. A similar situation is found
in French. The French expressions of meanings 1 and 2 also use the same
construction (viz. a reflexive pronoun with the inchoative:
se
réveiller/réveiller
and
se briser/briser,
respectively). This is again an indication that these two meanings are somewhat
alike. In German, though, meanings 1 and 2 do not use the same process (viz. an
ablaut-like alternation in
aufwachen/aufwecken vs. no differentiation in
zerbrechen/zerbrechen, respectively), which is an indication that the
meanings 1 and 2 are also somewhat different.
The marking of the inchoative/causative alternation on the predicate is
just one of very many possible approaches to investigating similarity between
meanings, or, to paraphrase a claim made in Section 3, any notion of similarity
is a question of perspective and is not in any sense pre-established by the
nature of the expressions investigated. The rather abstract nature of the notion
of similarity as used here (i.e. the formation of the inchoative/causative
alternation) is appealing because it allows for the comparison of otherwise
difficult-to-compare meanings, like “wake up” and
“break”.[9]
In the
following section, I will discuss three different ways to operationalize this
language-specific notion of similarity between expressions.
7.2
Metric A: Language-specific constructions
The first example of a language-specific similarity between
expressions will be based on establishing language-specific constructions. I
will here define a construction as a regular morphosyntactic relation between an
inchoative and a causative verb form. Such relations are purely
language-specific (see the appendix for a complete survey of all constructions
distinguished for this paper). For example, in English, the 31 meanings shown in
Table 2 can be classified as belonging to seven language-specific constructions.
There is one large class consisting of verbs that do not show any difference in
morphology between inchoative and causative usage (viz.
wake up, break, burn,
open,
etc.). The remaining six classes each consist only of one meaning,
using different inchoative/causative alternations in each case (viz.
die/kill, learn/teach, be destroyed/destroy, get lost/lose, go out/put
out
, and
rise/raise). As an example, just the first three meanings
are shown in Table 3, all three being marked as belonging to the same class
(called “E-1”, where the “E” indicates that this is a
language-specific class for English only).
For other languages, these classifications will look
different. For example, in French there are five different classes. First, there
is one large class in which the inchoative form is marked with a reflexive
pronoun (e.g. 1:
se réveiller/réveiller and 2:
se
briser/briser
). Second, there is another large class in which there is no
difference between inchoative and causative verb forms (e.g. 3:
brûler/brûler). Then, there is a small class where the
causative is formed by adding the verb
faire (among the current 31
meanings this is found only for 13:
fondre/faire fondre and 18:
bouillir/faire bouillir). Finally, there are two French expressions that
do not have any parallel among the current 31 meanings, so they make up their
own class (viz. 4:
mourir/tuer and 14:
être
détruit/détruir
).
|
English
|
French
|
German
|
No.
|
Form
|
Class
|
Form
|
Class
|
Form
|
Class
|
1
|
wake up/wake up
|
E-1
|
se réveiller/réveiller
|
F-1
|
aufwachen/aufwecken
|
G-1
|
2
|
break/break
|
E-1
|
se briser/briser
|
F-1
|
zerbrechen/zerbrechen
|
G-2
|
3
|
burn/burn
|
E-1
|
brûler/brûler
|
F-2
|
verbrennen/verbrennen
|
G-2
|
Table 3: Excerpt of language-specific classes for
inchoative/causative alternations
Once established for all languages in the sample, these
language-specific classes (“constructions”) can now be used to
calculate the (dis)similarity between the primitives (“lexical
meanings”). Basically, every pair of meanings is considered separately for
all 21 languages, and the number of languages is counted for which the two
meanings belong to different constructions. The higher this number, the more
languages put the meanings in different constructions, indicating that the
meanings are different. For example, considering meanings 1 and 2 in the excerpt
of the data shown in Table 3, these two meanings belong to the same class in
English and in French, but to different constructions in just one language,
namely German. So, the distance between meaning 1 and 2 is “1”.
Likewise, the distance between 1 and 3 is “2” because two of these
languages treat them differently, and between 2 and 3 the distance is
“1” because only French treats them differently. The establishment
of the language-specific constructions and the counting of differences together
are a metric on meanings, and the result is a list of distances between all
pairs of meanings.
A different way of performing exactly the same calculation is obtained
by a reformulation of the language-specific constructions into language-specific
distance matrices. This reformulation might seem somewhat cumbersome at first,
but it will allow for a much wider array of possible analyses—a few of
which will be discussed in the next sections. The basic idea is to consider a
language-specific construction to be a very simple notion of dissimilarity. As
defined earlier, a construction can be considered to be a language-specific
metric on expressions (cf. Table 1 and the discussion in Section 5). Such a
metric only allows for the options “identical” (i.e. a
dissimilarity/distance of “0”) or “different” (i.e. a
dissimilarity/distance of “1”). From the perspective of English, the
meanings 1, 2, and 3 are all identical (i.e. they belong to the same
construction), which translates to a distance of zero between all pairs of these
meanings. Of course, also the distance between each meaning and itself is zero
(they necessarily belong to the same construction), so the result of
reformulating the first three English meanings into a language-specific distance
matrix is a matrix with all zeros (cf. the leftmost matrix in Figure 1—for
convenience of presentation all matrices are shown completely, although distance
matrices redundantly duplicate each entry in the upper and lower triangle). The
same procedure can also be used for French and German, which will result in some
distances of “1” because not all three meanings belong to the same
class in these languages. Given these language-specific distance matrices, the
cross-linguistic distance matrix on the meanings can now easily be computed by
summing up these three matrices (cf. the rightmost matrix in Figure
1).[10]
English
|
|
French
|
|
German
|
|
Sum
|
|
1
|
2
|
3
|
|
|
1
|
2
|
3
|
|
|
1
|
2
|
3
|
|
|
1
|
2
|
3
|
1
|
0
|
0
|
0
|
|
1
|
0
|
0
|
1
|
|
1
|
0
|
1
|
1
|
|
1
|
0
|
1
|
2
|
2
|
0
|
0
|
0
|
|
2
|
0
|
0
|
1
|
|
2
|
1
|
0
|
0
|
|
2
|
1
|
0
|
1
|
3
|
0
|
0
|
0
|
|
3
|
1
|
1
|
0
|
|
3
|
1
|
0
|
0
|
|
3
|
2
|
1
|
0
|
Figure 1: Language-specific constructions as distance
matrices. Adding them together results in a cross-linguistic distance matrix on
the meanings
Doing these calculations for all 31 meanings in all 21
languages results in a 31×31 cross-linguistic distance matrix giving the
dissimilarity for all pairs of meanings—an excerpt of which is shown in
Table 4. The minimal value in this table is zero (i.e. the meanings belong to
the same construction in all 21 languages), and the maximum is 21 (i.e. the
meanings belong to different constructions in all 21 languages). These values
can be normalized to the [0,1] interval by dividing them by 21 (shown in
parentheses in the table). Just to give some perspective on these numbers, it
appears that the pairs “close”–“open”,
“open”–“break”, and
“close”–“break” are relatively similar (they
belong to the same construction in about half of the languages investigated). In
contrast, “die/kill” is highly dissimilar from all others, as might
have been expected, because the inchoative/causative alternation for this
meaning is suppletive in most languages and thus different from all other
alternations in the same language.
|
wake up
|
break
|
burn
|
die/kill
|
open
|
close
|
wake up
|
0
|
17 (.81)
|
16 (.76)
|
20 (.95)
|
17 (.81)
|
16 (.76)
|
break
|
17 (.81)
|
0
|
13 (.62)
|
19 (.90)
|
10 (.48)
|
12 (.57)
|
burn
|
16 (.76)
|
13 (.62)
|
0
|
20 (.95)
|
16 (.76)
|
17 (.81)
|
die/kill
|
20 (.95)
|
19 (.90)
|
20 (.95)
|
0
|
21 (1.0)
|
21 (1.0)
|
open
|
17 (.81)
|
10 (.48)
|
16 (.76)
|
21 (1.0)
|
0
|
10 (.48)
|
close
|
16 (.76)
|
12 (.57)
|
17 (.81)
|
21 (1.0)
|
10 (.48)
|
0
|
Table 4: Excerpt of the cross-linguistic dissimilarity matrix
on meaning
as established by summing up over all 21
language-specific classifications
A complete analysis of the full 31×31 distance matrix
will not be pursued here, but one quick example will be given to indicate
possible routes of analysis (see Cysouw 2008 for a more elaborate discussion).
When multidimensional scaling is applied to the cross-linguistic distance
matrix, then the first dimension (i.e. the dimension that explains most of the
variation) appears to be related to the “scale of likelihood of
spontaneous occurrence” (Haspelmath
1993:105).[11]
On one side of this
scale predicates are found that prototypically do not need an agentive
instigator, like “boil”, “freeze”, and
“burn” (and in the multidimensional scaling “die/kill”
is also found to belong to this side). The other side of the scale holds such
events that normally have a human agent, like “gather”,
“connect”, or “change”. This scale was originally
proposed by Haspelmath to explain the preference of certain meanings for
particular behavioral strategies. Specifically, he argued that those meanings
that are typically in need of a human instigator cross-linguistically have a
preference for an anticausative coding strategy (i.e. the inchoative is derived
from the causative), while the meanings on the other side of the scale have a
preference for a causative strategy (i.e. the causative is derived from the
inchoative).
Now, instead of deriving the scale of likelihood of spontaneous
occurrence from behavioral strategies, as Haspelmath did, in this paper the
scale is purely based on the analysis of language-specific constructions. The
semantic scale of likelihood of spontaneous occurrence (here defined as the
first dimension of the MDS of the metric on meaning) can then be correlated
empirically with the proportion of languages that use an anticausative strategy
(see Figure 2).[13]
The correlation
is almost perfect (
r=.83,
p<10
-8). This example
indicates that a linguistic scale can be conceived of as a (significant)
correlation between meaning-similarity and form-similarity.
Figure 2: Correlation
between preference for anticausative coding strategy and the first dimension of
the MDS of the metric of meaning.
7.3 Metric B: Algorithmically
approximating constructions
The reformulation of constructions as language-specific
metrics on expressions, as discussed in relation to Figure 1 above, allows for a
wide variety of other approaches to establishing a semantic map. The basic idea
of this reformulation is that for each language a language-specific distance
matrix is calculated describing how similar the expressions of the meanings are
from the perspective of each language individually. The cross-linguistic
distances then are the result of simply summing up over all these
language-specific distances. Using constructions, as done in the previous
section, the language-specific matrices will only consist of “0”
(indicating “same construction”) and “1” (indicating
“different constructions”). However, all values in between
“0” and “1” can also be used to indicate that two
constructions are neither completely different nor completely similar. For
example, one might argue that the German alternations
aufwachen/aufwecken
and
versinken/versenken are different constructions, but also
somewhat alike. They both involve a kind of ablaut, though the details are
different. Neither considering them to be completely different nor completely
identical will do justice to the empirical situation. To deal with such a
situation, a gradient language-specific distance can be used. For example, one
could set the language-specific distance between the two alternations above as
0.75 (see Table 5). The specification of gradient dissimilarities can be
performed on the basis of a detailed analysis of each language individually.
However, it is also possible to use a general method for measuring
language-internal similarity. One such approach will be discussed in this
section, and a simpler but also less satisfying method will be discussed in the
next section.
|
|
Yes/No distance
|
|
Gradient distance
|
No.
|
German expressions
|
1
|
2
|
3
|
11
|
|
1
|
2
|
3
|
11
|
1
|
aufwachen/aufwecken
|
0
|
1
|
1
|
1
|
|
0
|
1
|
1
|
.75
|
2
|
zerbrechen/zerbrechen
|
1
|
0
|
0
|
1
|
|
1
|
0
|
0
|
1
|
3
|
verbrennen/verbrennen
|
1
|
0
|
0
|
1
|
|
1
|
0
|
0
|
1
|
11
|
versinken/versenken
|
1
|
1
|
1
|
0
|
|
.75
|
1
|
1
|
0
|
Table 5: Different language-specific distances of some German
inchoative/causative alternations
One method of comparing inchoative/causative alternations
within the structure of a single language is to analyze each alternation as a
collection of changes of letters needed to get from the inchoative to the
causative string of letters. Changes are either a deletion of an existing letter
or an insertion of a new letter. To match linguistic intuitions about what makes
a similar change, the method distinguishes between making a change at the start
of a word, at the end of a word, or in the middle of a word. For every
inchoative/causative pair, this leads to a list of changes on how to get from
the inchoative to the causative form. So, for example, to get from
rise
to
raise only one change is needed, namely an <a> has to be
inserted in the middle of the word. To compare two alternations, the number of
shared letter changes is counted and then normalized by the maximum number of
changes attested. The distance between two alternations will then be the
complement of this value (i.e. 1−shared/maximum).
For example, to get from the German inchoative
aufwachen to
causative
aufwecken the following four changes are needed:
1) deletion of <a> inside the word
(“
aufwchen”)
2) deletion of <h> inside the word
(“
aufwcen”)
3) insertion of <e> inside the word
(“
aufwecen”)
4) insertion of <k> inside the word
(“
aufwecken”)
To get from German inchoative
versinken to causative
versenken the following two changes are needed:
1) deletion of <i> inside the word
(“
versnken”)
2) insertion of <e> inside the word
(“
versenken”)
These two sets of changes have one change in common
(“insertion of <e> inside the word”), and the maximum number
of changes needed is “4” (for the
aufwachen/aufwecken
alternation), so the distance between the two alternations is
1−1/4=.75 (cf. Table 5). This algorithm could be improved in various
ways.[14]
However, the main point is
that it is relatively easy to get a rough estimate of the language-internal
dissimilarity between two inchoative/causative
alternations.[15]
To get from language-specific dissimilarities to a cross-linguistic
distance matrix, all individual matrices are added together. An excerpt of the
resulting matrix is shown in Table 6, which can be compared with the same
selection shown in Table 4. Although the two tables are not completely
identical, the values are astonishingly close. The complete correlation between
the results of this algorithmic notion of dissimilarity and the dissimilarity
based on the manually established language-specific constructions is shown in
Figure 3 (
r=.91). Shown on the x-axis in this figure are the
dissimilarities (“distances”) from the metric discussion in the
previous Section 7.2. On the y-axis, the distances from the algorithmic approach
as discussed in this section are shown. The close match between these two
methods suggests that automatic approaches can be very useful in the
establishment of cross-linguistic metrics on meaning. In general, it appears
that the errors introduced by the linguistically naive algorithm are easily
corrected by summing up over many languages.
|
wake
up
|
break
|
burn
|
die/kill
|
open
|
close
|
wake up
|
0
|
14.1 (.67)
|
14.5 (.69)
|
18.5 (.88)
|
13.8 (.66)
|
13.5 (.64)
|
break
|
14.1 (.67)
|
0
|
12.7 (.61)
|
17.5 (.83)
|
10.2 (.49)
|
10.8 (.51)
|
burn
|
14.5 (.69)
|
12.7 (.61)
|
0
|
17 (.81)
|
14.5 (.69)
|
15.4 (.73)
|
die/kill
|
18.5 (.88)
|
17.5 (.83)
|
17 (.81)
|
0
|
18.7 (.89)
|
18.6 (.89)
|
open
|
13.8 (.66)
|
10.2 (.49)
|
14.5 (.69)
|
18.7 (.89)
|
0
|
10.3 (.49)
|
close
|
13.5 (.64)
|
10.8 (.51)
|
15.4 (.73)
|
18.6 (.89)
|
10.3 (.49)
|
0
|
Table 6: Excerpt of the cross-linguistic distance matrix
as established by the algorithmic approach
Figure 3: Correlation between cross-linguistic distances as
established by language-specific classes and by the algorithmic
approach
7.4 Metric C: Simplistic
string-based similarity
The good results of the algorithmic approach to establishing
language-specific similarities prompted me to try out an even simpler, even more
linguistically naive algorithmic approach. It is based on the longest COMMON
SUBSTRING measure of similarity between two strings of letters. This similarity
consists of the length of the longest consecutive stretch of letters shared
between two expressions. So, for example,
house and
mouse share 4
letters in a row. To use this measure of similarity for inchoative/causative
alternations, I pasted the inchoative and the causative forms together into one
string without spaces (e.g. French
seréveillerréveiller or
sebriserbriser) and established the longest common substring (in the
French example this would be “2” for the string
“se”). This approach of course finds all kinds of small
random similarities (e.g.
wakeupwakeup and
breakbreak also have a
longest common substring of “2” for the string
“
ak”), and in general it only works well with concatenative
morphology or morphologically independent markers (like the reflexive
se
in the French example above).
Figure 4 shows the relation between the distances from this very
simplistic approach (shown on the y-axis) to the distances from the
linguistically sophisticated approach using language-specific classes, as
discussed in Section 7.2. The match between this extremely simple measurement of
language-specific similarity to the linguistically sophisticated similarity
using language-specific classes is not as good as for the more elaborate
algorithmic approach from the previous section (
r=.61, cf. Figure 4 with
the previous Figure 3), though the correlation is still highly significant
(Mantel test
p<.00001), indicating that even with similarity measures
which are linguistically very naive relatively good overall results are
possible.
Figure 4: Correlation between cross-linguistic distances as
established by language-specific classes and by the longest common
substring.
8. Conclusion
By using the world’s linguistic diversity, the study
of meaning can be transformed from an introspective inquiry into a subject of
empirical investigation. For this to be possible, the notion of meaning has to
be operationalized by defining the meaning of an expression as the collection of
all contexts in which the expression can be used. Under this definition, meaning
can be empirically investigated by sampling contexts. A semantic map is a
technique to show the relations between such sampled contexts. Or, formulated
more technically, a semantic map is a visualization of a metric on contexts
sampled to represent a domain of meaning. Or, put more succinctly, a semantic
map is a metric on meaning.
The relation between different contexts/meanings can be investigated by
looking at their expressions in many languages. The more similar these
expressions when averaged over all languages studied, the more similar the
contexts. So, to investigate the similarity between contexts, only judgments
about the local similarity between expressions within the structure of
individual languages are needed. In general, this similarity between
language-specific expressions is a special—language-specific—metric
between contexts. A metric on meaning, then, is the cross-linguistic average of
many language-specific expression metrics.
A language-specific expression metric can be very fine-grained, and to a
large extent automatically retrieved, opening up the possibility of speeding up
the empirical study of meaning. It is important to realize, however, that for
any resulting semi-automatically retrieved metric on meaning, the interpretation
(“the meaning of the metric”) is of course still in the eye of the
beholder, namely, the human investigator.
Acknowledgments
I thank Martin Haspelmath and Caterina Mauri for helpful
comments on how to improve the presentation of the somewhat tedious subject of
this paper. Further, many of the concepts used in this paper arose in discussion
with Bernhard Wälchli and should often just as well be considered his ideas
(cf. Wälchli & Cysouw 2010). I of course take complete responsibility
for any remaining inconsistency or lack of clarity.
References
Comrie, Bernard and Tania Kuteva. 2005. Relativization strategies.
World Atlas of Language Structures, ed. by Martin Haspelmath, Matthew S. Dryer,
David Gil and Bernard Comrie, 398-405. Oxford: Oxford University
Press.
Croft, William. 2001. Radical Construction Grammar: Syntactic
theory in typological perspective. Oxford: Oxford University
Press.
-----. 2003. Typology and universals, 2nd edition.
Cambridge: Cambridge University Press. (Cambridge Textbooks in
Linguistics).
-----. 2007. Exemplar semantics. Unpublished manuscript,
available online at
http://www.unm.edu/~wcroft/WACpubs.html.
Croft, William and Keith T. Poole. 2008. Inferring universals from
grammatical variation: Multidimensional scaling for typological analysis.
Theoretical Linguistics 34/1.1-37. doi:10.1515/thli.2008.001
Cysouw, Michael. 2007. Building semantic maps: The case of person
marking. New Challenges in Typology, ed. by Bernhard Wälchli and Matti
Miestamo, 225-248. Berlin: Mouton de Gruyter. (Trends in Linguistics: Studies
and Monographs 189).
-----. 2008. Generalizing scales. Scales, ed. by Marc
Richards & Andrej Malchukov, 379-396. Leipzig: Institut für Linguistik,
Universität Leipzig. (Linguistische Arbeits-Berichte 86).
Dahl, Östen. 1985. Tense and Aspect systems. Oxford:
Blackwell.
Goldberg, Adele E. 2006. Constructions at work: The nature of
generalization in language. Oxford: Oxford University Press.
Haiman, John. 1985. Natural syntax. Cambridge: Cambridge University
Press.
Haspelmath, Martin. 1993. More on the typology of
inchoative/causative verb alternations. Causatives and transitivity, ed. by
Bernard Comrie and Maria Polinsky, 87-120. Amsterdam: Benjamins. (Studies in
Language Companion Series).
-----. 1997. Indefinite pronouns. Oxford: Clarendon.
(Oxford Studies in Typology and Linguistic Theory).
-----. 2003. The geometry of grammatical meaning: Semantic
maps and cross-linguistic comparison. The new psychology of language: Cognitive
and functional approaches to language structure, ed. by Michael Tomasello, vol.
2, 211-242. Mahwah, NJ: Erlbaum.
-----. 2010. Comparative concepts and descriptive
categories in cross-linguistic studies. Language 86. doi:10.1353/lan.2010.0021
Keenan, Edward L. 1976. Towards a universal definition of
‘subject’. Subject and topic, ed. by Charles N. Li, 303-333. New
York, NY: Academic Press.
Keenan, Edward L. and Bernard Comrie. 1977. Noun phrase
accessibility and universal grammar, Linguistic Inquiry 8/1.63-99.
Koptjevskaja-Tamm, Maria. 1993. Nominalizations. London:
Routledge.
-----. 2005. Action nominal constructions. World Atlas of
Language Structures, ed. by Martin Haspelmath, Matthew S. Dryer, David Gil and
Bernard Comrie, 254-257. Oxford: Oxford University Press.
Lehmann, Christian. 1984. Der Relativsatz: Typologie seiner
Strukturen, Theorie seiner Funktionen, Kompendium seiner Grammatik.
Tübingen: Narr.
Levinson, Stephen C. 2003. Space in language and cognition:
Explorations in cognitive diversity. Cambridge: Cambridge University Press.
(Language, Culture & Cognition 5).
Levinson, Stephen C. and Sérgio Meira. 2003. 'Natural
concepts' in the spatial topological domain – Adpositional meanings in
crosslinguistic perspective: An exercise in semantic typology. Language
79/3.485-516. doi:10.1353/lan.2003.0174
Majid, Asifa, Melissa Bowerman, Miriam van Staden and James S.
Boster. 2007. The semantic categories of cutting and breaking events: A
crosslinguistic perspective. Cognitive Linguistics 18/2.133-152. doi:10.1515/cog.2007.005
R Development Core Team. 2007. R: A Language and Environment for
Statistical Computing. Vienna, Austria: R Foundation for Statistical
Computing.
Wälchli, Bernhard. 2005. Co-compounds and natural
coordination. Oxford: Oxford University Press.
Wälchli, Bernhard and Michael Cysouw. 2010. Lexical typology through similarity semantics: Toward a semantic
map of motion verbs. Linguistics. doi:10.1515/ling-2012-0021
Wierzbicka, Anna. 1996. Semantics: Primes and universals. Oxford:
Oxford University Press.
Author's contact information:
Michael Cysouw
Department of Linguistics
Max Planck Institute for Evolutionary
Anthropology
Deutscher Platz 6
04103 Leipzig
Germany
cysouw@eva.mpg.de
Appendix: Language-Specific
Classes of Causative/Inchoative Alternations
Arabic
|
Armenian
|
English
|
Class A: C/CC
|
Class A: ø/c
|
Class A: Identical
|
1.
|
sah́aa/sah́h́aa
|
1.
|
artnanal/artnacnel
|
1.
|
wake up
|
8.
|
darasa/darrasa
|
16.
|
zarzanal/zarzacnel
|
2.
|
break
|
14.
|
damara/dammara
|
21.
|
barʒranal/barʒracnel
|
3.
|
burn
|
31.
|
waqafa/waqqafa
|
22.
|
k’eršanal/k’eršacnel
|
5.
|
open
|
|
|
28.
|
lavanal/lavacnel
|
6.
|
close
|
Class B: in/ø
|
29.
|
čoranal/čoracnel
|
7.
|
begin
|
2.
|
inkasara/kasara
|
|
|
9.
|
gather
|
5.
|
infatah́a/fatah́a
|
Class B: v/ø
|
10.
|
spread
|
6.
|
inqafala/qafala
|
2.
|
ǯardvel/ǯardel
|
11.
|
sink
|
13.
|
inṣahara/ṣahara
|
3.
|
ayrvel/ayrel
|
12.
|
change
|
30.
|
inšaqqa/šaqqa
|
6.
|
pak’vel/pak’el
|
13.
|
melt
|
|
|
7.
|
sksvel/sksel
|
16.
|
develop
|
Class C: in/ʔ
|
9.
|
havakvel/havakel
|
17.
|
connect
|
3.
|
ih́taraqa/ʔah́raqa
|
10.
|
əndarc’ak’vel/əndarc’ak’el
|
18.
|
boil
|
20.
|
inṭafaʔa/ʔaṭfaʔa
|
11.
|
xegolvel/xegolel
|
19.
|
rock
|
22.
|
intahaa/ʔanhaa
|
12.
|
poxvel/poxel
|
22.
|
finish
|
|
|
13.
|
halvel/halel
|
23.
|
turn
|
Class D: t/ø
|
14.
|
kandvel/kandel
|
24.
|
roll
|
9.
|
iltamma/lamma
|
17.
|
k’ap’vel/k’ap’el
|
25.
|
freeze
|
10.
|
intašara/našara
|
19.
|
č’oč’vel/č’oč’el
|
26.
|
dissolve
|
17.
|
irtabaṭa/rabaṭa
|
23.
|
pttvel/pttel
|
27.
|
fill
|
21.
|
irtafaʕa/rafaʕa
|
24.
|
glorvel/glorel
|
28.
|
improve
|
27.
|
imtalaʔa/malaʔa
|
26.
|
luc’vel/luc’el
|
29.
|
dry
|
|
|
30.
|
č’eɣkvel/č’eɣkel
|
30.
|
split
|
Class E: ø/ʔ
|
|
|
31.
|
stop
|
11.
|
ġariqa/ʔaġraqa
|
Class C: v/n
|
|
|
18.
|
ġalaa/ʔaġlaa
|
5.
|
bacvel/bacanal
|
Singular classes:
|
23.
|
daara/ʔadaara
|
27.
|
lcvel/lcnel
|
4.
|
die/kill
|
26.
|
ðaaba/ʔaðaaba
|
|
|
8.
|
learn/teach
|
|
|
Class D: ø/Vcn
|
14.
|
be destroyed/destroy
|
Class F: ta/ø
|
8.
|
sovorel/sovorecnel
|
15.
|
get lost/lose
|
12.
|
tabaddala/baddala
|
18.
|
eṙal/eṙacnel
|
20.
|
go out/put out
|
16.
|
taṭawwara/ṭawwara
|
31.
|
k’angnil/k’angnecnel
|
21.
|
rise/raise
|
19.
|
taʔarjah́a/ʔarjah́a
|
|
|
|
|
24.
|
tadah́raja/dah́raja
|
Class E: č/cn
|
|
|
25.
|
tajammada/jammada
|
15.
|
k’orčel/k’orcnel
|
|
|
28.
|
tah́assana/h́assana
|
20.
|
hangčel/hangcnel
|
|
|
|
|
25.
|
saṙčel/saṙecnel
|
|
|
Singular classes:
|
|
|
|
|
4.
|
maata/qatala
|
Class F:
|
|
|
7.
|
badaʔa
|
4.
|
spanel/mernel
|
|
|
15.
|
daaʕa/xasira
|
|
|
|
|
29.
|
jaffa/jaffafa
|
|
|
|
|
Finnish
|
French
|
Georgian
|
Class A: ø/tt
|
Class A: se/ø
|
Class A: i/a
|
1.
|
herätä/herättää
|
1.
|
se
réveiller/réveiller
|
1.
|
gaiɣviʒebs/gaaɣviʒebs
|
3.
|
palaa/polttaa
|
2.
|
se briser/briser
|
8.
|
isc’avlis/asc’avlis
|
8.
|
oppia/opettaa
|
5.
|
s’ouvrir/ouvrir
|
|
|
10.
|
levitä/levittää
|
6.
|
se fermer/fermer
|
Class B: i+a/a+s
|
13.
|
sulaa/sulattaa
|
9.
|
s’assembler/assembler
|
2.
|
imt’vreva/amt’vrevs
|
18.
|
kiehua/kiehuttaa
|
10.
|
s’étendre/étendre
|
5.
|
gaiɣeba/gaaɣebs
|
19.
|
kiikkua/kiikuttaa
|
11.
|
s’enfoncer/enfoncer
|
11.
|
daixrčoba/axrčobs
|
20.
|
sammua/sammuttaa
|
15.
|
se perdre/perdre
|
14.
|
daingreva/daangrevs
|
21.
|
kohota/kohottaa
|
16.
|
se
développer/développer
|
19.
|
irxeva/arxevs
|
22.
|
loppua/lopettaa
|
17.
|
se lier/lier
|
27.
|
aivseba/aavsebs
|
24.
|
vieriä/vierittää
|
19.
|
se balancer/balancer
|
30.
|
gaip’oba/gaap’obs
|
26.
|
liueta/liuottaa
|
20.
|
s’éteindre/éteindre
|
|
|
29.
|
kuivaa/kuivata
|
21.
|
se lever/lever
|
Class C: i+eba/ø+avs
|
|
|
23.
|
se tourner/tourner
|
6.
|
daixureba/daxuravs
|
Class B: U/ø
|
26.
|
se dissoudre/dissoudre
|
15.
|
ik’argeba/k’argavs
|
2.
|
murtua/murtaa
|
27.
|
se remplir/remplir
|
25.
|
gaiqineba/gaqinavs
|
12.
|
muuttua/muuttaa
|
28.
|
s’améliorer/améliorer
|
|
|
16.
|
kehittyä/kehittää
|
30.
|
se fendre/fendre
|
Class D: i+eba/ø+is
|
23.
|
vääntyä/vääntää
|
31.
|
s’arrêter/arrêter
|
9.
|
šeik’ribeba/šek’rebs
|
27.
|
täyttyä/täyttää
|
|
|
12.
|
šeicvleba/šecvlis
|
28.
|
parantua/parantaa
|
Class B: Identical
|
16.
|
daišleba/dašlis
|
|
|
3.
|
brûler
|
26.
|
gaixsneba/gaxsnis
|
Class C: UtU/ø
|
7.
|
commencer
|
|
|
5.
|
avautua/avata
|
8.
|
apprendre
|
Class E: ø+eba/a+obs
|
6.
|
sulkeutua/sulkea
|
12.
|
changer
|
13.
|
gadneba/gaadnobs
|
14.
|
tuhoutua/tuhota
|
22.
|
finir
|
20.
|
kreba/akrobs
|
|
|
24.
|
rouler
|
29.
|
šreba/ašrobs
|
Class D: ntu/t
|
25.
|
geler
|
|
|
9.
|
kokoontua/koota
|
29.
|
sécher
|
Class F: ø+deba/a+ebs
|
15.
|
hukkaantua/hukata
|
|
|
10.
|
gavrceldeba/gaavrcelebs
|
|
|
Class C: ø/faire
|
22.
|
gatavdeba/gaatavebs
|
Class E:
tyä/dytää
|
13.
|
fondre/faire fondre
|
28.
|
gaumǯobesdeba/
gaaumǯobesebs
|
17.
|
yhtyä/yhdistää
|
18.
|
bouillir/faire bouillir
|
31.
|
gačerdeba/gaačerebs
|
25.
|
jäätyä/jäädyttää
|
|
|
|
|
31.
|
pysähtyä/pysähdyttää
|
Singular classes:
|
Class G: ø+avs/a+ebs
|
|
|
4.
|
mourir/tuer
|
23.
|
brunavs/abrunebs
|
Singular classes:
|
14.
|
être
détruit/détruir
|
24.
|
migoravs/miagorebs
|
4.
|
kuolla/tappaa
|
|
|
|
|
7.
|
alkaa/aloitaa
|
|
|
Singular classes:
|
11.
|
laskea
|
|
|
3.
|
ic’vis/c’vavs
|
30.
|
haljeta/halkaista
|
|
|
4.
|
mok’vdeba/mok’lavs
|
|
|
|
|
7.
|
daic’qeba/daic’qebs
|
|
|
|
|
17.
|
šeexameba/šeuxamebs
|
|
|
|
|
18.
|
duɣs/aduɣebs
|
|
|
|
|
21.
|
adgeba/aiɣebs
|
German
|
Greek
|
Hebrew
|
Class A: Identical
|
Class A: Identical
|
Class A: hit/ø
|
2.
|
zerbrechen
|
1.
|
ksipnó
|
1.
|
hitʕorer/ʕorer
|
3.
|
verbrennen
|
2.
|
spázo
|
9.
|
hitʔasef/ʔasaf
|
7.
|
anfangen
|
5.
|
anígho
|
10.
|
hitpares/paras
|
13.
|
schmelzen
|
6.
|
klíno
|
12.
|
hištana/šina
|
18.
|
kochen
|
7.
|
arçízo
|
16.
|
hitpatah́/patah́
|
19.
|
schaukeln
|
8.
|
mathéno
|
17.
|
hitkašer/kišer
|
24.
|
rollen
|
12.
|
alázo
|
19.
|
hitnadned/nidned
|
25.
|
einfrieren
|
14.
|
xalnó
|
21.
|
hitromem/romem
|
29.
|
trocknen
|
18.
|
vrázo
|
23.
|
histovev/sovev
|
31.
|
anhalten
|
20.
|
svíno
|
26.
|
hitporer/porer
|
|
|
22.
|
telióno
|
27.
|
hitmale/mile
|
Class B: sich/ø
|
23.
|
yirízo
|
28.
|
hištaper/šiper
|
5.
|
sich öffnen/öffnen
|
25.
|
paghóno
|
29.
|
hityabeš/yibeš
|
6.
|
sich schliessen/schliessen
|
27.
|
yemízo
|
30.
|
hitpacel/picel
|
9.
|
sich sammeln/sammeln
|
30.
|
xorízo
|
|
|
10.
|
sich ausbreiten/ausbreiten
|
31.
|
stamatáo
|
Class B: ni/ø
|
12.
|
sich
verändern/verändern
|
|
|
2.
|
nišbar/šavar
|
16.
|
sich entwickeln/entwickeln
|
Class B: me/ø
|
3.
|
nisraf/saraf
|
17.
|
sich verbinden/verbinden
|
3.
|
kéome/kéo
|
5.
|
niftah́/patah́
|
21.
|
sich heben/heben
|
9.
|
singendrónome/
singendróno
|
6.
|
nisgar/sagar
|
23.
|
sich umdrehen/umdrehen
|
10.
|
dhiadhídhome/dhiadhídho
|
22.
|
nigmar/gamar
|
26.
|
sich
auflösen/auflösen
|
11.
|
vithízome/vithízo
|
31.
|
neʕecar/ʕacar
|
27.
|
sich füllen/füllen
|
13.
|
tíkome/tíko
|
|
|
28.
|
sich verbessern/verbessern
|
15.
|
xánome/xáno
|
Class C: ø/hV
|
30.
|
sich spalten/spalten
|
16.
|
anaptísome/anaptíso
|
4.
|
mat/hemit
|
|
|
17.
|
sindhéome/sindhéo
|
14.
|
h́arav/heh́eriv
|
Singular classes:
|
19.
|
liknízome/liknízo
|
18.
|
ratah́/hirtiah́
|
1.
|
aufwachen/aufwecken
|
21.
|
sikónome/sikóno
|
25.
|
kafa/hikfi
|
4.
|
sterben/töten
|
24.
|
kiliéme/kilió
|
|
|
8.
|
lernen/lehren
|
26.
|
dhialíome/dhialío
|
Class D: av/ib
|
11.
|
versinken/versenken
|
28.
|
veltiónome/veltióno
|
11.
|
tavaʕ/tibaʕ
|
14.
|
kaputt gehen/kaputt machen
|
29.
|
apoksirénome/apoksiréno
|
15.
|
ʔavad/ʔibed
|
15.
|
verloren gehen/verlieren
|
|
|
20.
|
kava/kiba
|
20.
|
erlöschen/löschen
|
Singular classes:
|
|
|
22.
|
enden/beenden
|
4.
|
pethéno/skotóno
|
Singular classes:
|
|
|
|
|
7.
|
hith́il
|
|
|
|
|
8.
|
lamad/limed
|
|
|
|
|
13.
|
namas/hemes
|
|
|
|
|
24.
|
nagol/galal
|
Hindi-Urdu
|
Hungarian
|
Indonesian
|
Class A: ø/aa
|
Class A: d/szt
|
Class A: ter/me+kan
|
1.
|
jaagnaa/jagaanaa
|
1.
|
felébred/felébreszt
|
1.
|
terbangun/membangunkan
|
3.
|
jalnaa/jalaanaa
|
10.
|
terjed/terjeszt
|
10.
|
tersebar/menyebarkan
|
8.
|
parhnaa/parhaanaa
|
11.
|
elsüllyed/elsüllyeszt
|
|
|
10.
|
phailnaa/phailaanaa
|
13.
|
olvad/olvaszt
|
Class B: ø/me+kan
|
13.
|
pighalnaa/pighlaanaa
|
|
|
2.
|
patah/mematahkan
|
19.
|
hilnaa/hilaanaa
|
Class B: ø/Vt
|
4.
|
mati/mematikan
|
21.
|
uṭhnaa/uṭhaanaa
|
3.
|
elég/eléget
|
11.
|
tenggelam/ menenggelamkan
|
23.
|
phirnaa/phiraanaa
|
15.
|
elvész/elveszít
|
14.
|
binasa/membinasakan
|
24.
|
luṛhaknaa/luṛhkaanaa
|
23.
|
forog/forgat
|
20.
|
padam/memadamkan
|
25.
|
jamnaa/jamaanaa
|
31.
|
megáll/megállít
|
22.
|
selesai/menyelesaikan
|
26.
|
ghulnaa/ghulaanaa
|
|
|
26.
|
larut/melarutkan
|
29.
|
suukhnaa/sukhaanaa
|
Class C: Vlik/it
|
29.
|
kering/mengeringkan
|
|
|
5.
|
kinyílik/kinyit
|
|
|
Class B: ṭ/ṛ
|
9.
|
összegyülik/összegyüjt
|
Class C: ter/me
|
2.
|
ṭuuṭnaa/ṭoṛnaa
|
|
|
3.
|
terbakar/membakar
|
30.
|
phaṭnaa/phaaṛnaa
|
Class D: Odik/ø
|
5.
|
terbuka/membuka
|
|
|
6.
|
záródik/zár
|
27.
|
terisi/mengisi
|
Class C: a/aa
|
7.
|
elkezdödik/elkezd
|
30.
|
terbelah/membelah
|
4.
|
marnaa/maarnaa
|
22.
|
befejezödik/befejez
|
|
|
14.
|
ujarnaa/ujaarnaa
|
26.
|
oldódik/old
|
Class D: ø/me
|
17.
|
bandhnaa/baandhnaa
|
|
|
6.
|
tutup/menutup
|
18.
|
ubalnaa/ubaalnaa
|
Class E: Ul/it
|
7.
|
mulai/memulai
|
|
|
8.
|
tanul/tanít
|
|
|
Class D: u/o
|
14.
|
elpusztul/elpusztít
|
Class E: ber/meng
|
5.
|
khulnaa/kholnaa
|
24.
|
gurul/gurít
|
8.
|
belajar/mengajar
|
31.
|
ruknaa/roknaa
|
28.
|
javul/javít
|
12.
|
berubah/mengubah
|
|
|
|
|
19.
|
berayun/mengayun
|
Class E: honaa/karnaa
|
Class F: ik/tat
|
|
|
6.
|
band honaa/band karnaa
|
12.
|
megváltozik/megváltoztat
|
Class F: ø/kan
|
7.
|
šuruu honaa/šuruu
karnaa
|
19.
|
hintázik/hintáztat;
|
9.
|
mengumpul/mengumpulkan
|
9.
|
ikaṭṭhaa
honaa/
ikaṭṭhaa
karnaa
|
|
|
13.
|
mencair/mencairkan
|
16.
|
vikaas honaa/vikaas karnaa
|
Class G: ad/it
|
24.
|
menggelinding/menggelindingkan
|
20.
|
gul honaa/gul karnaa
|
29.
|
szárad/szárít
|
25.
|
membeku/membekukan
|
22.
|
xatm honaa/xatm karnaa
|
30.
|
széthasad/széthasít
|
|
|
28.
|
behtar honaa/behtar banaanaa
|
|
|
Class G: ber/me+kan
|
|
|
Singular classes:
|
16.
|
berkembang/ mengembangkan
|
Class F: Identical
|
2.
|
összetörik/összetör
|
17.
|
bergabung/menggabungkan
|
12.
|
badalnaa
|
4.
|
meghal/megöl
|
23.
|
berbalik/membalikkan
|
27.
|
bharnaa
|
16.
|
fejlödik/fejleszt
|
31.
|
berhenti/menghentikan
|
|
|
17.
|
szövetkezik/összeköt
|
|
|
Singular classes:
|
18.
|
fö/föz
|
Singular classes:
|
11.
|
ḍuubnaa/ḍubonaa
|
20.
|
kialszik/kiolt
|
15.
|
menghilang/kehilangan
|
15.
|
khojaanaa/khonaa
|
21.
|
emelkedik/emel
|
18.
|
direbus/merebus
|
|
|
25.
|
megfagy/megfagyaszt
|
21.
|
kenaikan/menaikkan
|
|
|
27.
|
megtelik/tölt
|
28.
|
bertambahbaik/ memperbaiki
|
Japanese
|
Lezgian
|
Lithuanian
|
Class A: Vr/Vs
|
Class A: Identical
|
Class A: ø/in
|
1.
|
okiru/okosu
|
2.
|
xun
|
1.
|
pabusti/pabudinti
|
6.
|
toziru/tozasu
|
3.
|
kun
|
3.
|
degti/deginti
|
13.
|
tokeru/tokasu
|
4.
|
q’in
|
11.
|
skendeti/skandinti
|
19.
|
yureru/yurasu
|
18.
|
rugun
|
18.
|
virti/virinti
|
20.
|
kieru/kesu
|
30.
|
xun
|
20.
|
gesti/gesinti
|
23.
|
mawaru/mawasu
|
|
|
26.
|
ištirpti/ištirpinti
|
24.
|
korogaru/korogasu
|
Class B: x̂/ø
|
28.
|
gerėti/gerinti
|
26.
|
tokeru/tokasu
|
5.
|
aqʰa
x̂un/aqʰajun
|
29.
|
sausti/sausinti
|
27.
|
mitiru/mitasu
|
6.
|
k’ew
x̂un/k’ewun
|
|
|
28.
|
naoru/naosu
|
7.
|
bašlamiš
x̂un/bašlamišun
|
Class B: ūp/au
|
|
|
8.
|
čir x̂un/čirun
|
2.
|
lūžti/laužti
|
Class B: er/ø
|
9.
|
k’wat’
x̂un/k’wat’un
|
14.
|
sugriūti/sugriauti
|
2.
|
oreru/oru
|
19.
|
e’čä
x̂un/e’čäǧun
|
31.
|
nutrūkti/nutraukti
|
3.
|
yakeru/yaku
|
21.
|
xkaž
x̂un/xkažun
|
|
|
30.
|
sakeru/saku
|
22.
|
kütäh
x̂un/kütähun
|
Class C: si/ø
|
|
|
|
|
5.
|
atsidaryti/atidaryti
|
Class C: ø/er
|
Class C: ø/r
|
7.
|
prasidėti/pradėti
|
5.
|
aku/akeru
|
10.
|
čuk’un/čuk’urun
|
10.
|
išsiplėsti/išplėsti
|
11.
|
sizumu/sizumeru
|
13.
|
c’urun/c’ururun
|
12.
|
pasikeisti/pakeisti
|
|
|
14.
|
čuk’un/čuk’urun
|
13.
|
išsilydyti/išlydyti
|
Class D: a/e
|
17.
|
sadsadaw q’un/sadsadaw
q’urun
|
15.
|
pasimesti/pamesti
|
7.
|
hazimaru/hazimeru
|
20.
|
tüxün/tüxürun
|
22.
|
pasibaigti/pabaigti
|
8.
|
osowaru/osieru
|
23.
|
elqün/elqürun
|
27.
|
prisipildyti/pripildyti
|
9.
|
atumaru/atumeru
|
25.
|
č’agun/č’agurun
|
|
|
10.
|
hirogaru/hirogeru
|
26.
|
c’urun/c’ururun
|
Class D: s/ø
|
12.
|
kawaru/kaeru
|
27.
|
ac’un/ac’urun
|
6.
|
klostytis/klostyti
|
17.
|
tunagaru/tunageru
|
29.
|
q’urun/q’ururun
|
8.
|
mokytis/mokyti
|
21.
|
agaru/ageru
|
|
|
9.
|
rinktis/rinkti
|
22.
|
owaru/oeru
|
Class D: x̂/ar
|
16.
|
plėtotis/plėtoti
|
31.
|
tomaru/tomeru
|
11.
|
batmiš
x̂un/batmišarun
|
17.
|
jungtis/jungti
|
|
|
12.
|
degiš
x̂un/degišarun
|
19.
|
suptis/supti
|
Class E: ø/ase
|
28.
|
qʰsan
x̂un/qʰsanarun
|
23.
|
suktis/sukti
|
16.
|
hattatu
suru/
hattatu saseru
|
|
|
24.
|
ristis/risti
|
25.
|
kooru/kooraseru
|
Class E: ø/ar
|
|
|
|
|
15.
|
kwax̂un/kwadarun
|
Class E: i/e
|
Class F: ø/as
|
31.
|
aqwazun/aqwazarun
|
21.
|
pakilti/pakelti
|
18.
|
waku/wakasu
|
|
|
30.
|
perskilti/perskelti
|
29.
|
kawaku/kawakasu
|
Class F: fin/raqurun
|
|
|
|
|
16.
|
wilik fin/wilik raqurun
|
Singular classes:
|
Singular classes:
|
24.
|
awax̂izawax̂iz
fin/awax̂izawax̂iz raqurun
|
4.
|
užmušti/mirti
|
4.
|
sinu/korosu
|
|
|
25.
|
užšalti/užšaldyti
|
14.
|
kowareru/kowasu
|
Class D: t/d
|
|
|
15.
|
nakunaru/nakusu
|
1.
|
axwaraj
awatun/
axwaraj awudun
|
|
|
Mongolian
|
Romanian
|
Russian
|
Class A: ø/V
|
Class A: se/ø
|
Class A: sja/ø
|
1.
|
serex/sereex
|
1.
|
se trezi/trezi
|
2.
|
lomat’sja/lomat’
|
3.
|
šatax/šataax
|
2.
|
se rupe/rupe
|
5.
|
otkryt’sja/otkryt’
|
20.
|
untrax/untraax
|
5.
|
se deschide/deschide
|
6.
|
zakryt’sja/zakryt’
|
25.
|
xöldöx/xöldööx
|
6.
|
se închide/închide
|
7.
|
načat’sja/načat’
|
29.
|
xatax/xataax
|
9.
|
se aduna/aduna
|
8.
|
učit’sja/učit’
|
31.
|
zogsox/zogsoox
|
10.
|
se
rǎspîndi/rǎspîndi
|
9.
|
sobrat’sja/sobrat’
|
|
|
11.
|
se scufunda/scufunda
|
10.
|
rasprostranit’sja/
rasprostranit’
|
Class B: r/l
|
12.
|
se schimba/schimba
|
12.
|
izmenit’sja/izmenit’
|
2.
|
xugarax/xugalax
|
13.
|
se topi/topi
|
13.
|
rasplavit’sja/rasplavit’
|
30.
|
xagarax/xagalax
|
15.
|
se pierde/pierde
|
14.
|
razručit’sja/razručit’
|
|
|
16.
|
se dezvolta/dezvolta
|
15.
|
terjat’sja/terjat’
|
Class C: Vgd/ø
|
17.
|
se uni/uni
|
16.
|
razvit’sja/razvit’
|
6.
|
xaagdax/xaax
|
19.
|
se legǎna/legǎna
|
17.
|
sočetat’sja/sočetat’
|
12.
|
öörčlögdöx/öörčlöx
|
20.
|
se stinge/stinge
|
19.
|
kačat’sja/kačat’
|
15.
|
xajagdax/xajax
|
21.
|
se ridica/ridica
|
21.
|
podnjat’sja/podnjat’
|
17.
|
xolbogdox/xolbox
|
22.
|
se
sfîrşi/sfîrşi
|
22.
|
končit’sja/končit’
|
21.
|
örgögdöx/örgöx
|
23.
|
se
învîrti/învîrti
|
23.
|
povernut’sja/povernut’
|
|
|
24.
|
se rostogoli/rostogoli
|
24.
|
katit’sja/katit’
|
Class D: ø/g
|
26.
|
se dizolva/dizolva
|
26.
|
rastvorit’sja/rastvorit’
|
7.
|
üüsex/üüsgex
|
27.
|
se umple/umple
|
27.
|
napolnit’sja/napolnit’
|
8.
|
surax/surgax
|
28.
|
se
îndrepta/îndrepta
|
28.
|
ulučšit’sja/ulučšit’
|
18.
|
buclax/bucalgax
|
29.
|
se usca/usca
|
30.
|
raskolot’sja/raskolot’
|
22.
|
duusax/duusgax
|
30.
|
se crǎpa/crǎpa
|
31.
|
ostanovit’sja/ostanovit’
|
26.
|
uusax/uusgax
|
31.
|
se opri/opri
|
|
|
27.
|
düürex/düürgex
|
|
|
Class B: nut/it
|
|
|
Class B: Identical
|
11.
|
utonut’/utopit’
|
Class E: ø/UUl
|
3.
|
arde
|
20.
|
gasnut’/gasit’
|
9.
|
cuglax/cugluulax
|
7.
|
începe
|
25.
|
zamerznut’/zamorozit’
|
11.
|
živex/živuulex
|
18.
|
fierne
|
29.
|
soxnut’/sušit’
|
13.
|
xajlax/xajluulax
|
|
|
|
|
16.
|
xögžix/xögžüülex
|
Singular classes:
|
Singular classes:
|
19.
|
dajvalzax/dajvalzuulax
|
4.
|
muri/ucide
|
1.
|
prosnut’sja/budit’
|
23.
|
ergex/ergüülex
|
8.
|
învǎţa/preda
|
3.
|
goret’/žeč’
|
24.
|
önxröx/önxrüülex
|
14.
|
?/distruge
|
4.
|
umeret’/ubit’
|
28.
|
sajžrax/sajžruulax
|
25.
|
îngheţa/face sa
îngheţe
|
18.
|
kipet’/kipjatit’
|
|
|
|
|
|
|
Class F: r/ø
|
|
|
|
|
10.
|
delgerex/delgex
|
|
|
|
|
14.
|
evdrex/evdex
|
|
|
|
|
|
|
|
|
|
|
Singular classes:
|
|
|
|
|
4.
|
üxex/alax
|
|
|
|
|
5.
|
ongojx/ongojlgox
|
|
|
|
|
Swahili
|
Turkish
|
Udmurt
|
Class A: k/sh
|
Class A: ø/dVr
|
Class A: ø/ty
|
1.
|
amka/amsha
|
1.
|
uyanmak/uyandırmak
|
1.
|
sajkany/sajkatyny
|
13.
|
yeyuka/yeyusha
|
4.
|
ölmek/öldürmek
|
8.
|
dyšyny/dyšetyny
|
18.
|
chemka/chemsha
|
20.
|
sönmek/söndürmek
|
10.
|
võlmyny/võlmytyny
|
24.
|
fingirika/fingirisha
|
21.
|
kalkmak/kaldırmak
|
11.
|
vyjyny/vyjytyny
|
26.
|
yeyuka/yeyusha
|
23.
|
dönmek/döndürmek
|
13.
|
ćyžany/ćyžatyny
|
29.
|
kauka/kausha
|
25.
|
donmak/dondurmak
|
14.
|
kuaškany/kuaškatyny
|
|
|
27.
|
dolmak/doldurmak
|
15.
|
ysyny/ystyny
|
Class B: k/ø
|
31.
|
durmak/durdurmak
|
23.
|
bergany/bergatyny
|
2.
|
vunjika/vunja
|
|
|
26.
|
sylmyny/sylmytyny
|
3.
|
unguka/ungua
|
Class B: Vl/ø
|
27.
|
tyrmyny/tyrmytyny
|
5.
|
funguka/fungua
|
2.
|
kırılmak/kırmak
|
31.
|
dugdyny/dugdytyny
|
9.
|
kusanyika/kusanya
|
5.
|
açılmak/açmak
|
|
|
14.
|
haribika/haribu
|
10.
|
yayılmak/yaymak
|
Class B: śky/ø
|
20.
|
zimika/zima
|
14.
|
bozulmak/bozmak
|
2.
|
tijaśkyny/tijany
|
21.
|
inuka/inua
|
26.
|
çözülmek/çözmek
|
3.
|
sutskyny/sutyny
|
22.
|
malizika/maliza
|
|
|
5.
|
ustiśkyny/ustyny
|
12.
|
geuka/geua
|
Class C: n/ø
|
6.
|
pytsaśkyny/pytsany
|
30.
|
pasuka/pasua
|
9.
|
toplanmak/toplamak
|
9.
|
l’ukaśkyny/l’ukany
|
|
|
19.
|
sallanmak/sallamak
|
12.
|
voštiśkyny/voštyny
|
Class C: w/ø
|
24.
|
yuvarlanmak/yuvarlamak
|
17.
|
gerʒ́askyny/gerʒ́any
|
6.
|
fungwa/funga
|
|
|
19.
|
vettaśkyny/vettany
|
17.
|
ungwa/unga
|
Class D: n/t
|
21.
|
ǯutśkyny/ǯutyny
|
|
|
6.
|
kapanmak/kapatmak
|
30.
|
pil’iśkyny/pil’yny
|
Class D: ø/sh
|
8.
|
öǧrenmek/öǧretmek
|
|
|
7.
|
anza/anzisha
|
|
|
Class C: Identical
|
8.
|
funda/fundisha
|
Class E: ø/tir
|
7.
|
kutskyny
|
11.
|
zama/zamisha
|
16.
|
inkişaf
etmek/
inkişaf ettirmek
|
18.
|
byrektyny
|
16.
|
sitawia/sitawisha
|
12.
|
degişmek/degiştirmek
|
20.
|
kysyny
|
19.
|
yonga/yongesha
|
17.
|
birleşmek/birleştirmek
|
|
|
23.
|
zungua/zungusha
|
|
|
Class D: sky/ty
|
25.
|
ganda/gandisha
|
Class F: ø/ir
|
16.
|
azinskyny/azintyny
|
31.
|
simama/simamisha
|
11.
|
batmak/batırmak
|
24.
|
pityrskyny/pityrtyny
|
|
|
18.
|
pişmek/pişirmek
|
28.
|
umojatskyny/umojatyny
|
Class E: ø/z
|
22.
|
bitmek/bitirmek
|
|
|
10.
|
enea/eneza
|
|
|
Class E: my/ty
|
15.
|
potea/poteza
|
Class G: ø/t
|
22.
|
bydesmyny/bydestyny
|
27.
|
jaa/jaza
|
13.
|
erimek/eritmek
|
25.
|
kynmyny/kyntyny
|
|
|
28.
|
düzelmek/düzeltmek
|
29.
|
kuasmyny/kuastyny
|
Singular classes:
|
29.
|
kurumak/kurutmak
|
|
|
4.
|
fa/ua
|
30.
|
çatlamak/çatlatmak
|
Class F:
|
28.
|
fanya
ujambo/
pata ujambo
|
|
|
4.
|
kulyny/viyn
|
|
|
Singular classes:
|
|
|
|
|
3.
|
yanmak/yakmak
|
|
|
|
|
7.
|
?/başlamak
|
|
|
|
|
15.
|
kaybolmak/kaybetmek
|
|
|
[1]
It might be worthwhile
to consider more precise definitions of such chunks of meaning as used in
typology, for example using Natural Semantic Metalanguage (Wierzbicka
1996).
[2]
The terms
“comparative concept” as used by Haspelmath (2010) and “etic
grid” as used by Levinson & Meira (2003: 487) are highly similar, if
not identical, concepts to what I call “analytical
primitive”.
[3]
For some first attempts
at comparing the meaning of language-specific expressions, see Cysouw (2007) and
Wälchli & Cysouw (2010).
[4]
It is an open question
whether different approaches to measuring meaning converge. If something like
“the” meaning exists, then this should be the case. Given the
framework for investigating meaning as sketched in this paper, this question
becomes an empirical problem.
[5]
This approach assumes
that every meaning is expressible in all human languages. The expression of a
meaning might be easier in some languages and take more effort in others, but it
is possible everywhere. However, there are various obvious complications with
this assumption; see for example Levinson (2003) for a challenge to this
assumption regarding the expression of spatial concepts. Further, I will ignore
the complications arising from the fact that most languages will have many
different ways to express a particular meaning. This is not problematic for the
goal of computing meaning similarities, but the mathematical details will become
a bit more involved.
[6]
One auspicious prospect
is that an association between a cross-linguistic metric
(“strategy”) and a language-specific metric
(“construction”) represents a generalization of what is known in
linguistics as a “hierarchy” or a “scale”. Establishing
such a correlation is not trivial because language-specific metrics cannot be
compared directly across languages (see the example at the end of Section 7.2
for a first glimpse of this prospect and see Cysouw 2008 for a more elaborate
discussion).
[7]
The primitives used in
this paper represent a somewhat special kind of lexical meaning because they are
neutral with respect to the causative/inchoative alternation. For example, the
English pair
kill/die is considered to be a single primitive here,
notwithstanding the lexical suppletion. It is important to realize that not all
languages have suppletion for the same primitives, so cross-linguistically the
pair
kill/die has to be treated as equivalent to a non-suppletive pair
like
destroy/be destroyed.
[8]
To simplify the
calculations, I have maximally included one expression for each meaning in each
language. In some cases, Haspelmath lists more than one possible expression, and
in those cases I have semi-randomly chosen one of the options. If possible, I
have discarded idiosyncratic alternations showing inchoative/causative
morphology that was not found in any other sampled expressions of the same
language. Only when all alternatives used constructions which are also found
elsewhere did I randomly select one of them. This was only necessary in a
handful of cases.
[9]
Most theories of
meaning will not have much to say about the relation between “wake
up” and “break” other than coincidental points such as the
observation that in English the metaphor
break of day is used for the
morning, which is also the prototypical time to wake up.
[10]
This reformulation
opens up the possibility of comparing the structure of lexicalization between
languages. This can be done by correlating the language-specific distance
matrices from Figure 1. In effect, each distance matrix represents the
language-specific perspective on the relation between the meanings. The
similarity between two such matrices can be interpreted as a measure of how
similarly languages deal with the coding of meanings. The details and
implications of this approach to language comparison have to be left for another
paper though.
[11]12
For this
calculation, classic multidimensional scaling was used through the
implementation “cmdscale” in the statistical environment R (R
Development Core Team 2007). All other calculations and graphs in this paper
were also produced by using R.
[13]
Haspelmath,
following up on earlier work by Nedjalkov, uses the fraction of anticausative by
causative (A/C) strategies as an index for the cross linguistic preference for
either of these strategies. The usage of this particular fraction is unfortunate
because the resulting values are very unevenly distributed (they range between
zero and infinite). I have used A/(A+C) here instead. Another possibility would
be to use log(A/C).
[14]
There are various
questionable decisions being made in this algorithm. First, it operates on
letters, where ideally it would work on sounds. Second, there is no reason to
restrict the algorithm to only insertions and deletions—also exchanges
could be used, or other operations. Further, every insertion and deletion is
equally weighted, though some might be more significant than others. And instead
of dividing by the maximum number of changes one could also use another
normalization like dividing by the average number of changes.
[15]
I thank Hagen Jung
for assistance with the implementation of this algorithm.
|