Volume 8 Issue 1 (2010) DOI:10.1349/PS1.1537-0852.A.347

Building a Semantic Map: Top-Down versus Bottom-Up Approaches

Ferdinand de Haan

University of Arizona

This paper contrasts two methods for constructing semantic maps: the top-down model and the bottom-up model. It is argued that the bottom-up approach can be illuminating in solving long-standing issues. First, a sharp distinction is made between functions and domains: functions are indivisible semantic units, and domains are sets of functions. A bottom-up model starts with the functions and works its way up to the domain level. The difference between a bottom-up and a top-down model is illustrated by looking at the problem of evidentiality and epistemic modality, specifically the question of whether the verb /must/ is epistemic or evidential. It is argued that by looking at the functions of /must/ and related verbs (such as /be bound to, will/ and the Dutch cognate verb /moeten/) we can construct a semantic map that is both more accurate and more open to linguistic inquiry than a top-down map.

1. Introduction[1]

There is a growing body of literature on semantic maps[2]; what is immediately evident from the case studies in Footnote 2 is that there are significant differences in the geometry of semantic maps and the theoretical goal of these maps. However, what most of these studies share is a tendency to start with the domain and from there to fill in the details. This is known as the top-down approach. In this paper we will examine the reverse, a bottom-up model of semantic maps. We will start with individual linguistic elements, and from an in-depth analysis of these elements we will build a semantic map.

The example we will use to exemplify the bottom-up model is the area of epistemic modality and evidentiality. Sentence (1) illustrates the issue:

(1)	John must have played soccer.

In the literature a debate is raging about the status of must.[3] While it is commonly held that must is a modal that conveys a high degree of confidence in the truth of the statement (an epistemic modal), there is a growing belief that must is an evidential, a morpheme that conveys the source of information for the speaker’s statement. In this paper it will be argued that a top-down approach does not yield a satisfactory answer and that a bottom-up approach yields better results. However, the focus of this paper is to argue for a particular semantic map design, and the analysis of must which is offered in this paper is merely illustrative. The particular geometry of the semantic map is applicable to other areas.

This paper is structured as follows: in Section 2 the geometry of the semantic map model is explained. Section 3 establishes a distinction between domains and functions. Section 4 builds a bottom-up model of must and related elements, while Section 5 considers the question of how to determine whether a given linguistic element belongs to a particular domain. Section 6, finally, draws some conclusions.

2. The Geometry of Semantic Maps

In this section we will look at ways to construct a semantic map model that accounts for subtle distinctions in meaning which are relevant for linguistic research. In order to do that, we will adopt a bottom-up approach rather than the more conventional top-down approach.

In a top-down approach, we start with the categories we wish to map (such as tense, aspect, possession, case, or modality) and map sub-categories if and when necessary. For example, within the overall category of tense we may want to map present, past, and future as sub-categories, and within each we may make even finer distinctions if necessary. Thus, in a top-down approach we are presented with a predefined domain on which we can map the meanings of individual morphemes. Of course, the addition of a new morpheme may mean a rearrangement of the categories, but in general that does not mean that we change the overall category itself.

In a bottom-up model we start with individual morphemes and first determine the meaning range of these individual morphemes. We do so exhaustively, identifying every possible meaning of a given morpheme, regardless of frequency, saliency, or possibly even whether the meaning is currently attested in the language. In the bottom-up approach there are no primary and secondary meanings; all attested meanings have equal status. In this respect, it is important to look at other areas of linguistics, such as corpus linguistics, in which such bottom-up techniques are widely used.[4]

Within a top-down approach, a domain such as modality or tense often has a privileged status over other domains, while in a bottom-up approach it is the linguistic element that is privileged. A typical top-down approach leads to questions such as “to what linguistic category [domain] does linguistic element X belong?” whereas a bottom-up approach leads to questions like “what is the semantic range of linguistic element X?” If we take must as an example, the top-down approach asks the question “is must an epistemic modal or an evidential?”, while in the bottom-up approach the question becomes “what is the semantic range of must?” After the semantic range of must is known, we can compare it to the range of similar linguistic elements.[5]

In the bottom-up approach advocated here the following criteria for designing semantic maps are used:

1. There must be a way to distinguish between domains and functions.
2. Functions must be both primitive and unique.
3. There must be a way to make predictions regarding possible languages.

Criterion 1 concerns a fact that is not always made clear: there is a distinction between domains and functions. Broadly speaking, a function is part of a domain, and a domain is a contiguous area in the semantic space consisting of more than one function. Anything that consists of more than one function is a domain, whether we give it a name or not. In order to distinguish domains from functions, we will represent domains by rectangles and functions by ovals. There is no particular theoretical reason for doing so, but we model our concept of semantic maps on set theory, so it makes sense to use the graphical representation of set theory.

According to Criterion 2, a function should be primitive, i.e. not divisible into other functions, and unique, i.e. it should be attested in at least one language separate from other functions. For example, if we consider A and B as prospective functions, but we find that A and B are always expressed by one and the same morpheme in any given language, then A and B are one function and we are not justified in making a distinction between A and B.

A function can be considered unique if it is expressed with a different morpheme, even if that morpheme also expresses another function. If we have three functions, A, B, and C, and we find that in languages these functions are consistently found paired together A – B and B – C, we would still consider A, B, and C different functions, even if there are no languages in which A, B, or C are found expressed by three different morphemes, or even if there are no languages in which one of the three functions is expressed uniquely by a single morpheme.
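The uniqueness test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventories and the morpheme labels m1 and m2 are hypothetical, and the helper names are ours. Two candidate functions count as distinct as soon as some morpheme expresses one of them without the other.

```python
from itertools import combinations

def are_distinct(f, g, inventory):
    """True if some morpheme expresses exactly one of the candidates f and g."""
    return any((f in rng) != (g in rng) for rng in inventory.values())

def collapse_candidates(candidates, inventory):
    """Pairs of candidate functions that no morpheme ever separates."""
    return [(f, g) for f, g in combinations(sorted(candidates), 2)
            if not are_distinct(f, g, inventory)]

# Hypothetical inventories (morpheme -> set of functions it expresses).
# In PAIRED the functions only ever occur as A-B and B-C, never alone.
PAIRED = {"m1": {"A", "B"}, "m2": {"B", "C"}}
# In FUSED, A and B are expressed together by every morpheme.
FUSED = {"m1": {"A", "B"}, "m2": {"A", "B", "C"}}

print(collapse_candidates({"A", "B", "C"}, PAIRED))
print(collapse_candidates({"A", "B", "C"}, FUSED))
```

With the paired inventory no pair collapses, so A, B, and C remain three functions even though none is ever expressed on its own; with the fused inventory A and B collapse into one function.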

In our view, semantic maps are tools for making predictions about the distribution of meanings over morphemes across a given domain. Thus, they should be able to make predictions regarding the semantic range of morphemes and be able to state what a possible semantic range of a given morpheme in a given language is. The way to do this, following Haspelmath (1997), (2003), is to connect functions by means of arcs. In this version of semantic maps, arcs only serve to connect functions and to exclude possible languages. In some versions of semantic maps, arcs also serve as a marker of diachronic change (see, for instance, Bybee et al.’s (1994) grammaticalization paths or van der Auwera and Plungian’s (1998) map of modality). This can be done simply by adding a directional arrow to an arc. This, however, is not consonant with a pure bottom-up approach. Adding a diachronic dimension to a semantic map means essentially adding a linguistic analysis to the beginning of a semantic map, which should be done after the construction of the map, not concurrent with it. Essentially, a diachronic map is separate from the type of map we are envisioning here.[6]
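How arcs exclude possible languages can be made concrete with a small sketch. We assume here the usual reading of such maps, namely that the functions expressed by a single morpheme must form a connected region of the map; the three functions A, B, and C and their arcs are hypothetical.

```python
from collections import deque

def is_connected(rng, arcs):
    """True if the functions in rng form one connected region of the map."""
    rng = set(rng)
    if not rng:
        return False
    start = next(iter(rng))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for a, b in arcs:
            # Follow an arc only if it touches the current node
            # and leads to another function inside the range.
            if node in (a, b):
                other = b if node == a else a
                if other in rng and other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen == rng

# Hypothetical map: A is linked to B, and B to C; there is no arc A-C.
ARCS = [("A", "B"), ("B", "C")]

print(is_connected({"A", "B"}, ARCS))  # a predicted (possible) range
print(is_connected({"A", "C"}, ARCS))  # an excluded range: B is skipped
```

Under this assumption the map predicts that no morpheme expresses A and C without also expressing B. The arcs are undirected, in keeping with the decision to leave diachrony out of the map.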

3. Domains and Functions

We will now turn to the distinction between domains and functions. In the next section we will look at a concrete example, but in this section we will lay the theoretical foundation.

As discussed above, a domain is essentially a set of two or more functions. A domain can be synonymous with a predefined category such as epistemic modality, but that is by no means a necessary condition (let alone a sufficient one). Any part of epistemic modality consisting of two or more connected functions can also be considered a domain even though we may not have a name for it.

Consequently, domains can be part of larger domains. For instance, past tense is a domain because it is subdivided into smaller parts (there are remoteness distinctions in the past tense, which are expressed by separate morphemes in some languages and which by our definition are separate functions). Past tense is also part of the higher domain of tense and as such can interact with other domains (for instance, in languages with a future/non-future distinction, the domains of past tense and present tense are linked).

One of the reasons we start with the individual morphemes is the fact that morphemes can span domains. There are many morphemes that have a function belonging to one domain as their primary meaning but have secondary meanings from other domains. For instance, the verb will in English has future tense as its primary function but can express functions from other domains secondarily (such as the domains of evidentiality and volition, reflecting diachronic changes). Indeed, it is not always clear how a given sub-domain should be treated. Should, for instance, future tense be treated as a sub-domain of tense or modality (or both)?[7] In a top-down approach, such questions must be considered before drawing the map, but in a bottom-up approach such considerations are secondary to an analysis of the morphemes which are associated with the domain.

To put things another way, in a bottom-up approach, the precise drawing of domains is secondary to the analysis of individual morphemes and the subsequent determination of functions and their connections. This will be exemplified in the next two sections, with examples of morphemes from the modal and evidential domains.

4. Evidentiality and Epistemic Modality

To exemplify the framework outlined above, we will have a look at the area of epistemic modality and evidentiality. The question of where the boundary between these two areas lies has attracted much recent comment in the literature (see De Haan (1999), (forthcoming); Aikhenvald (2004), van der Auwera and Plungian (1998), Palmer (1986), (2001) for discussion). There are many opinions and analyses regarding the interrelation between evidentiality and epistemic modality. Some of the most common positions are listed in (2). Other positions are possible (and have been defended in the literature), but those shown in (2) represent the most commonly attested analyses.

(2)		positions on epistemic modality and evidentiality
	a.	evidentiality is part of epistemic modality	(Palmer 1986)
	b.	evidentiality and epistemic modality are part of a larger domain	(Palmer 2001, the larger domain is called propositional modality)
	c.	evidentiality and epistemic modality partially overlap	(van der Auwera and Plungian 1998, the area of overlap is inferentiality)
	d.	evidentiality and epistemic modality are separate domains	(De Haan 1999, Aikhenvald 2004[8])

These analyses usually involve a top-down approach: domains such as evidentiality and epistemic modality are defined, and then the definitions are applied to individual morphemes. In order to determine whether a modal such as, say, English must is an epistemic modal or an evidential, preexisting definitions are applied. Since it is typical for morphemes to have more than one meaning, different studies may place their emphasis on different parts of the meaning range. This does not make them wrong, but it makes them hard to compare.

We will now see how we can use the bottom-up model in the area of epistemic modality and evidentiality. Van der Auwera and Plungian (1998) propose a general map of modality with very broad semantic areas, such as epistemic possibility and epistemic necessity. In this paper we use finer distinctions to map the various meanings of the various modals involved, exemplified by such elements as English must and Dutch moeten.

This alternative analysis does not mean that we disagree with the map proposed by van der Auwera and Plungian. Rather, this situation is like the one described in Section 2: while epistemic modality is taken as part of a higher domain in van der Auwera and Plungian (1998), it is taken here to be a separate domain with its own functions. Of course, epistemic modality is also part of a greater domain (that of modality as a whole, which may be part of an even greater domain).

4.1 English must

In de Haan (forthcoming) it is argued, based on corpus research, that English “epistemic” must is not typically used to mark the (high) degree of confidence on the part of the speaker in his or her utterance. It is rather used to mark an evaluation of evidence. This analysis is based on data such as the following:[9]

(3)	It sounded like a fair enough invitation, Peter Marshall reflected, and Bang-Jensen must have thought so, too, because on the thirteenth, he met the group of three on the thirty-sixth floor of the U.N.
(4)	How many inches do you average a year?
	– Oh, I don’t know, but we’re way, we must be way above average this year because it’s been terrible.
(5)	… it was in the late, well I guess it would have been, I will take that back, it must have been in the forties, because they had been married, uh, probably fifteen years at the time.

In this set of data (and others representative of must as a whole) it can be seen that the degree of confidence varies from high to a level that is not much above that of an educated guess (note the use of the adverb probably in (5)). What is constant across these examples is the presence of evidence in the context on which the statement is based. In these examples evidence is placed in a separate because-clause. What these examples then have in common is the fact that the sentences are all based on an evaluation of evidence, and this is considered to be a basic function in the sense of Section 2 (as it is indivisible) and one we use in our semantic map.

Leaving aside the “obligation” use of must, we find no other function for English must, and we can draw a semantic map for must as follows (shortening “evaluation of evidence” to “evaluative”):

Figure 1: Semantic map of must

Given that we have introduced the notion “evaluation of evidence”, the question of whether we are still dealing with epistemic modality or whether must under this analysis is now an evidential becomes more acute. However, any answer at this stage would again depend on individual definitions and belief, and any answer along these lines would be arbitrary and hence meaningless.

Instead, in accord with the bottom-up approach, we must compare “evaluative” must (which we will use as a shorthand) to other, related morphemes, both within English and cross-linguistically. We will start by comparing must to an English modal, be bound to, and to its cognate Dutch verb, moeten.

4.2 English be bound to

Palmer (1990:55, see also Coates 1983:42-3) notes that be bound to and must are almost in complementary distribution: be bound to usually refers to future events, and must refers to past or present ones. But, although Palmer’s corpus does not include examples of be bound to with present or past tense reference, they do exist. The following example is again from the Brown corpus:

(6)	He handed the bayonet to Dean and kept the pistol. “Stay well back of me”, he said. “I’m going to walk up to the horses, bold as brass, pretending I’m one of the guerrillas. There’s bound to be someone on guard, but the hat might fool them long enough for me to get close.”

Even in those cases where be bound to and must have overlapping distributions the meanings differ. According to Palmer (1990:55), be bound to is “more certain” than must, and an appropriate paraphrase of the former is “it is certain that…”. Also, and importantly for the present discussion, be bound to lacks the “conclusion from evidence” sense that must has. Palmer (ibid.) contrasts the following two sentences:

(7)	a.	John is bound to be in his office.
	b.	John must be in his office.

In (7a) the emphasis is on the inevitability of the truth of the statement[10], while in (7b), as discussed above, the emphasis is on the drawing of a conclusion from evidence. We therefore consider be bound to a morpheme that is used to mark a high degree of confidence in the truth of the statement, and we will consider the function “strong epistemic modality”, abbreviated “strong”, to be a primitive function. The modal be bound to is then mapped as follows:

Figure 2: Semantic map of be bound to

Since both must and be bound to in their “epistemic” senses are non-overlapping, it is impossible to determine from just these two verbs whether they should be connected with an arc. We therefore need to examine other linguistic elements with a similar semantic range. Based on the semantic maps in Figures 1 and 2, we can say that must and be bound to are not synonyms since they have no functions in common and hence are non-overlapping.

4.3 Dutch moeten

The Dutch verb moeten “must” is cognate with English must, but it has a much wider range of “epistemic” and “evidential” readings. There are (at least) three different functions in the meaning range of moeten (leaving aside the “obligation” readings):

strong epistemic modality
evaluation of evidence
assertion of indirect evidence

The first two are identical to the two functions already discussed in connection with the verbs must and be bound to. The third one is new (i.e., not found in the other two verbs). It occurs in contexts such as the following:

(8)	IJje Wijkstra was timmerman en klompenmaker, hij stroopte, was op zijn vrijheid gesteld en had een hekel aan autoriteit. Maar voor IJje hield de wereld niet op bij de harde strijd om het dagelijkse bestaan. Hij las boeken over spiritisme en occultisme, waagde zich aan Hegel en Nietzsche en moet zelf een boek hebben geschreven, ‘Dualisme van het Heelal’, al is het manuscript daarvan nooit gevonden.
	‘IJje Wijkstra was a carpenter and maker of wooden shoes, he was a poacher, loved his freedom and hated authority. But for IJje the world did not end with the harsh struggle for daily survival. He read books about spiritualism and the occult, dared to tackle Hegel and Nietzsche and allegedly wrote a book himself called “Dualism of the Universe”, but the manuscript has never been found.’
	(Dagblad van het Noorden, February 11, 2003)

This fragment of the newspaper article is a descriptive list of the subject, IJje Wijkstra, and his characteristics. Given that the author of the article has never personally met Wijkstra, the entire passage is based on indirect evidence.[11] The use of the verb moet here is not an instance of a strong (epistemic) modal since this sentence is not any more doubtful than the previous sentences in the passage (as it is entirely based on indirect evidence). Its use signifies that there is evidence for the statement made. The clause in which moet occurs can then be analyzed as:

(9)	There is evidence that W. wrote a book.

The author of the article does not state on which evidence he bases his statement. It is not present in the context (unlike in the English examples, (3)-(5) above). The evidence is abstract; it can be hearsay or based on tangible evidence, but we as readers do not know what that evidence is. All we know is that there is (indirect) evidence. We analyze this use of the verb moeten as an assertion of (indirect) evidence. It is merely stated that there is evidence (of whatever kind) for the utterance, but that evidence is not evaluated. The difference between an evaluative and an assertive element is that in the first case the evidence is evaluated, while in the second case it is merely asserted.

Evidence that they are indeed distinct categories can be found in the fact that must is not an appropriate translation of moeten in (8). This is the only possible diagnostic test in a bottom-up model: the test of translational equivalency. If we find that a given element occurs in the exact same environment (or context) as another element (either in the same language or cross-linguistically) then they share the same primitive function.

This assertion of indirect evidence is the third primitive function, and in Dutch moeten we have a morpheme that is more complex than either must or be bound to. The semantic map for moeten under this analysis is shown in Figure 3:

Figure 3: Semantic map for Dutch moeten

Given that there are three functions for Dutch moeten under this analysis, strong epistemic modality, evaluation of evidence, and assertion of evidence, this means that these functions are related and must be connected with arcs. There are a number of ways in which we can arrange the three functions, but we will assume here that the semantic map for moeten is the one shown in Figure 3.

Comparing Figure 3 with Figures 1 and 2, we can see that Dutch moeten overlaps with both must and be bound to and that we have a case here in which one morpheme can be synonymous with two others without the entailment that these two morphemes themselves are synonyms.
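The comparison can be restated as a simple set computation. The sketch below is illustrative only: it encodes the three semantic ranges as described in Sections 4.1 to 4.3 (leaving aside the “obligation” uses), and the helper name overlap is ours.

```python
# Semantic ranges as described in Sections 4.1-4.3 ("obligation" uses left aside).
RANGES = {
    "must": {"evaluative"},
    "be bound to": {"strong"},
    "moeten": {"strong", "evaluative", "assertive"},
}

def overlap(a, b, ranges=RANGES):
    """Functions shared by the two elements; an empty set means no overlap."""
    return ranges[a] & ranges[b]

print(overlap("moeten", "must"))
print(overlap("moeten", "be bound to"))
print(overlap("must", "be bound to"))
```

The first two calls return one shared function each, while the third returns the empty set: overlap of semantic ranges is not transitive, which is why moeten can be matched with both English verbs although the two English verbs share nothing.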

4.4 Swedish lär

The Swedish verb lär is used to express indirect evidence, most often hearsay.[12] In grammars it is usually considered a modal verb, despite the fact that it has an unorthodox morphology. Lär is not conjugated for tense and must appear before all other modal verbs in the sentence. A typical example is shown in (10):

(10)	Hannah	*lär*	ha	studerat	norska.
	Hannah	LÄR	have	study.SUP	Norwegian
	‘Hannah is said to have studied Norwegian.’

This interpretation of lär is assertive, i.e. it marks that the statement is based on evidence without evaluating that evidence. In this function, lär is a synonym of Dutch moeten because they occur in the same environment. There is overlap, but no full identity, between lär and moeten because both verbs have the function of assertion of (indirect) evidence. However, moeten has functions which lär does not have, and lär has one function that moeten lacks. This function is exemplified in (11):

(11)	Några	mål	på	hörnor	och	frislag	lär	det	inte	bli
	any	goals		corner.PL	and	free.kicks	LÄR		NEG	become	world.cup
	‘There won’t be any goals on corners or penalties in the world cup.’

This is what we will call the predictive use of lär. This function is defined as an assertion of evidence for an event in the future. It occurs (almost) exclusively in sentences with future meanings. This is a different function from the assertive, as is evidenced by the fact that lär in a sentence like (11) cannot be translated by moeten, must, or be bound to.

In the semantic map we must link the two functions assertion and predictive as they are shared by one and the same morpheme. The verb lär does not have any other functions, either epistemic or deontic. This means that the full semantic map for lär can be expressed as follows:

Figure 4: Semantic map for Swedish lär

As can be seen from the linguistic elements discussed, constructing a bottom-up semantic map involves putting together a number of smaller maps into one larger one. We will take one more linguistic element and then construct a larger semantic map from the smaller pieces.

4.5 English will

It has often been noted that English will can have an evidential-like interpretation.[13] The prototypical example is (12):

(12)	[The doorbell rings.] That will be the postman.

This use of will is predictive, i.e. identical to the use of lär in sentence (11) above. There is evidence (the sentence [or event] the doorbell rings), and the sentence “that will be the postman” is the event for which the truth value will not be known until a time in the future (namely, when the door is opened). The only difference between predictive lär and predictive will is that predictive will can refer to events in the past, present, and future (as long as the truth value will not be known until a time in the future), while predictive lär can only be used for future events. An example of past predictive (will have been) from Coates (1983:179) is shown in (13):

(13)	And my mother is not drunk. Several people in the house will have said that to you.

This gives the following semantic map for predictive will.

Figure 5a: Preliminary semantic map for English will

There is a link between prediction and future. Of course, the primary meaning of will is to mark future, so we can add the future domain to Figure 5a.[14] The same is true for lär. Up until about the 19th century, this verb could denote pure futurity, and it can still do so in Swedish dialects. This points to a diachronic development from future to prediction. It also shows the importance of mapping all possible functions of a given morpheme or construction because that can give us valuable clues to possible diachronic developments and grammaticalization pathways.

We can add future to our semantic map of will, as is done in Figure 5b. Note that future is mapped as a domain, rather than a function, because it comprises more than one function.

Figure 5b: Semantic map for English will

In this way, we could continue adding functions to our semantic map, but we will stop here as the principle of the bottom-up model is clear. In Figure 6, the entire semantic map for the functions discussed is shown. Again, note that the order of elements is not important, but the way they are connected is. The future is shown with a rectangle to mark that we are dealing with a domain, rather than a function.

Figure 6: Amalgamated semantic map
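The amalgamation step can be sketched as follows. Since the figure is not reproduced here, the linear arrangement in CHAIN is an assumption made for illustration, not a transcription of Figure 6; likewise, the future domain is treated as a single node for brevity. The ranges themselves follow Sections 4.1 to 4.5.

```python
# Semantic ranges as described in Sections 4.1-4.5 ("obligation" uses left aside).
# "future" is a domain in the text; it is collapsed into one node here for brevity.
RANGES = {
    "must": {"evaluative"},
    "be bound to": {"strong"},
    "moeten": {"strong", "evaluative", "assertive"},
    "lär": {"assertive", "predictive"},
    "will": {"predictive", "future"},
}

# ASSUMPTION: one possible chain of arcs, adopted for illustration only.
CHAIN = ["strong", "evaluative", "assertive", "predictive", "future"]

def amalgamate(ranges):
    """Union of all functions found in the individual maps."""
    return set().union(*ranges.values())

def is_contiguous(rng, chain):
    """True if rng occupies an unbroken stretch of the chain."""
    positions = sorted(chain.index(f) for f in rng)
    return positions == list(range(positions[0], positions[-1] + 1))

print(sorted(amalgamate(RANGES)))
print({m: is_contiguous(r, CHAIN) for m, r in RANGES.items()})
```

Every range discussed above occupies an unbroken stretch of the assumed chain, so the arrangement is compatible with the five elements. A hypothetical element covering only strong and assertive would not be, and finding one would count as evidence against this arrangement.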

5. What is an Evidential?

We will now return to the matter of how to distinguish between epistemic modals and evidentials. We will approach this question by looking at domains and their interactions. In (2a-d), four different positions regarding the interaction of evidentiality and epistemic modality are shown. These different positions correspond to four different ways of drawing semantic maps. In Figures 7a-d, these different maps are shown:

Figure 7a: Evidentiality is a part of epistemic modality

Figure 7b: Evidentiality and epistemic modality are part of a larger domain

Figure 7c: Evidentiality and epistemic modality overlap in one area, but are otherwise distinct domains

Figure 7d: Evidentiality and epistemic modality are distinct domains

Figure 7a represents the position of Palmer (1986). Figure 7b is that of Palmer (2001), while 7c corresponds to van der Auwera and Plungian (1998). Figure 7d finally shows the position of De Haan (1999). The maps shown in Figure 7 are drawn from the perspective of domains; it is a top-down perspective. That is to say, first the domains and their boundaries are established, then the linguistic material is added. If we try to map the functions defined in the previous section into any of these domains, we run into problems. There are a number of ways in which we could conceivably map the domains of evidentiality and epistemic modality onto these functions, and there is no good a priori way to do so. Figures 8a-b show two possible ways of mapping. In Figure 8a, epistemic modality and evidentiality are separate domains, while the two domains overlap in Figure 8b.

Figure 8a: Evidentiality and epistemic modality are separate domains

Figure 8b: Evidentiality and epistemic modality are overlapping

The map in Figure 8a is compatible with the positions of Palmer (2001) and de Haan (1999), while Figure 8b is reminiscent of the position of van der Auwera and Plungian (1998).[15] There are other possibilities of drawing the boundaries between the two domains. For instance, it is possible to consider all three functions—strong epistemic, evaluative, and assertive—to be part of epistemic modality (per Palmer 1986), in which case the entire map (except for perhaps the domain of future) would be enclosed in a rectangle. However, even in Palmer (1986), it is conceded that there are differences between epistemic and evidential morphemes, which points to an analysis in which the two domains are separate.

Another possibility for drawing domains can occur when more than one function belongs to more than one domain. This is similar to Figure 8b but with overlap of more functions. It is possible to consider both strong epistemic and evaluative as both epistemic and evidential. This is similar for evaluative and assertive. In other words, there are a number of ways in which functions can be mapped on domains, and which one is chosen depends to a large degree on one’s theoretical assumptions (a top-down way of thinking). Note that whatever domain configuration is adopted, the underlying semantic map does not change. That is to say, the map of functions and arcs of Figure 6 stays the same no matter whether we choose the domain arrangement of Figure 8a or 8b. It can therefore be argued that the notion of domain is secondary to the representation of functions and their connections. Even though it is often considered convenient to start at the level of a domain, in reality a domain is a construct that should only be applied after the semantic map is in place. It is therefore impossible to argue for the wrongness of Figure 8a or 8b as that depends on theoretical considerations outside of linguistic data. It is, however, possible to argue against the analysis of the functions and connections in Figure 6 as that can be refuted by linguistic data.
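The independence of the map from the domain configuration can be illustrated with a short sketch. The two configurations below are hypothetical boundary placements in the spirit of Figures 8a and 8b (the figures are not reproduced here, so the exact membership of each domain is our assumption); the ranges of must and moeten are those of Section 4.

```python
# Semantic ranges from Section 4; these do not change with the domain labels.
RANGES = {
    "must": {"evaluative"},
    "moeten": {"strong", "evaluative", "assertive"},
}

# ASSUMPTION: two illustrative ways of drawing the domains over the same functions.
SEPARATE = {
    "epistemic modality": {"strong"},
    "evidentiality": {"evaluative", "assertive"},
}
OVERLAPPING = {
    "epistemic modality": {"strong", "evaluative"},
    "evidentiality": {"evaluative", "assertive"},
}

def domains_of(element, config, ranges=RANGES):
    """Names of the domains that contain at least one function of the element."""
    return {name for name, funcs in config.items() if ranges[element] & funcs}

print(domains_of("must", SEPARATE))
print(domains_of("must", OVERLAPPING))
```

Under the first configuration must falls within evidentiality only, while under the second it falls within both domains, yet RANGES is identical in the two cases. The label attached to must is thus a property of the chosen configuration, not of the map.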

This principle of determining domains after drawing semantic maps applies to all domains, and, consequently, there is only one meaningful domain, namely the entirety of the semantic space, which we will call Σ. The only reason this domain is meaningful is that there is nothing beyond Σ. Any semantic domain, be it epistemic modality or evidentiality, or any non-modal category for that matter, forms part of Σ. This is due to the fact that the semantic map model is non-hierarchical. There are no dominance or inheritance relationships at work, unlike in other frameworks such as X-bar theory. Even though it is true that domains can be contained within other domains (for instance, epistemic modality is contained within the larger domain of modality), the relation is not hierarchical but one of scale: a smaller domain can make finer distinctions.[16]

The question of whether must is an evidential or not can be rephrased as: to which semantic domain does must belong? If we look at the question from this point of view it becomes obvious that the answer depends on how one draws the respective domains, on how one wishes to divide Σ. What can be said is that English must and Dutch moeten differ in their semantic range and that the two modal verbs are not identical. This means that, relative to these two languages, either analysis can be defended and the selection of Figure 8a or 8b is a matter of choice. What is necessary to resolve the issue is to compare must to morphemes in other languages that are undisputedly evidential. For instance, we can compare must and inferential evidentials from the Eastern Tucanoan language Tuyuca (Barnes 1984).[17]

	Tuyuca
(14)	díiga	apé-yi
	soccer	play-3SG.MASC.PAST.INFER [18]
	‘He played soccer. (I have seen evidence that he played: his distinctive shoe print on the playing fields. But I did not see him play.)’

The question is: is the English sentence “he must have played soccer” an appropriate translation of the Tuyuca sentence “díiga apéyi”? The answer depends on whether the semantic ranges of must and the Tuyuca inferential overlap, because the only instance in which must can be an appropriate paraphrase of the Tuyuca inferential is when the inferential covers at least partially the same range of the semantic map as must. Based on the semantic map in Figure 6, must would be an appropriate translation if the inferential is used for the evaluation of evidence (the evaluative). If the inferential is used for other meanings, such as assertion of evidence, but not evaluation, must would not be an appropriate translation of the Tuyuca inferential because the semantic maps do not overlap at that point.
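
The overlap condition can be stated as a simple set test. The sketch below is a minimal illustration, not an analysis: the text commits only to the evaluative function being part of the range of must, so the rest of that range and both candidate ranges for the Tuyuca inferential are assumptions made for the sake of the example.

```python
# Meaning ranges modelled as sets of functions.
# Assumed for illustration; the text only states that "must" covers evaluative.
MUST = {"strong epistemic", "evaluative"}

# Scenario 1 (hypothetical): the inferential is also used to evaluate evidence.
INFERENTIAL_WITH_EVALUATION = {"evaluative", "assertive"}

# Scenario 2 (hypothetical): the inferential only asserts evidence.
INFERENTIAL_ASSERTIVE_ONLY = {"assertive"}


def can_translate(range_a, range_b):
    """One element can paraphrase another only where their ranges overlap."""
    return bool(set(range_a) & set(range_b))


def shared_range(range_a, range_b):
    """Return the functions for which the paraphrase is appropriate."""
    return set(range_a) & set(range_b)
```

Under the first scenario the two ranges share the evaluative function and must is an appropriate translation in that use; under the second they are disjoint and it is not.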

The only thing non-specialists in Tucanoan languages have to go on is the English translation and additional explanation provided in Barnes (1984). The Tuyuca sentence (14) is translated as a simple sentence without a modal verb, while the accompanying explanation provides additional contextual clues. Since no mention is made of doubt on the part of the speaker, it would appear that the Tuyuca inferential and must have non-overlapping ranges, which would mean that the sentence “he must have played soccer” is not an appropriate translation.

However, such a conclusion is dependent on negative information, which is precisely what we wish to avoid and precisely why we need a bottom-up model. The point here is not to provide an analysis of the Tuyuca evidential system or to give an analysis of must, but rather to provide a means with which to perform such an analysis that is not dependent on a priori assumptions of status. Once a bottom-up semantic map has been drawn for the Tuyuca inferential, the answer to the question of which domain it belongs to will follow.

The question “is must an evidential?” is then rephrased in a bottom-up model as “how close is the meaning range of must to the meaning range of other linguistic elements?” It is by comparing the meaning ranges of linguistic elements that we can construct domains: if we find that a given meaning range on a semantic map occurs over and over again in the world’s languages, then we can assign domain-status to that segment of the semantic map because it is obviously linguistically salient.
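
Both steps of this procedure, comparing meaning ranges and looking for ranges that recur, can be sketched in a few lines. The paper does not specify a measure of closeness; Jaccard overlap is used below purely as one simple possibility, and the element names and ranges are invented placeholders rather than linguistic claims.

```python
from collections import Counter


def closeness(range_a, range_b):
    """Jaccard overlap of two meaning ranges: 1.0 is identical, 0.0 is disjoint."""
    a, b = set(range_a), set(range_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def recurrent_ranges(ranges_by_element, min_count=2):
    """Return the meaning ranges attested for at least `min_count` elements."""
    counts = Counter(frozenset(r) for r in ranges_by_element.values())
    return {r for r, n in counts.items() if n >= min_count}


# Invented placeholder data: these are NOT claims about any real language.
SAMPLE = {
    "element_A": {"evaluative", "assertive"},
    "element_B": {"evaluative", "assertive"},
    "element_C": {"strong epistemic", "evaluative"},
}
```

In this toy data set the range shared by element_A and element_B is the only one that recurs, so it is the only candidate for domain status; with real data the threshold min_count would have to be motivated typologically.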

6. Conclusion

In this paper we have constructed a bottom-up model of a semantic map for the area of epistemic modality and evidentiality. This approach is justified by the fact that these two notions are inherently vague and open to interpretation. The focus is therefore shifted to the status of individual linguistic elements. We have seen that a semantic map is eminently suitable for considering questions of modality and status of individual modal elements. We argued that if notions such as evidentiality or epistemic modality are too vague, we need to replace them with more precise notions such as evaluative and assertive. These notions can be mapped as primitive functions in a semantic map model. While it is not claimed that the bottom-up model is superior to a top-down model in every instance, the advantage in the area of modality and evidentiality is definite. [19] With this approach, we can attempt other thorny issues in modality, such as the status of realis and irrealis.

References

Aikhenvald, Alexandra Y. 2004. Evidentiality. Oxford: Oxford University Press.

Anderson, Lloyd B. 1982. The “perfect” as a universal and as a language-particular category. Tense-aspect: Between semantics and pragmatics, ed. by Paul J. Hopper, 227-64. Amsterdam: Benjamins.

Anderson, Lloyd B. 1986. Evidentials, paths of change, and mental maps: Typologically regular asymmetries. Evidentiality: The Linguistic Coding of Epistemology, ed. by Wallace Chafe and Johanna Nichols, 273-312. Norwood, NJ: Ablex.

Barnes, Janet. 1984. Evidentials in the Tuyuca verb. International Journal of American Linguistics 50.255-271. doi:10.1086/465835

Bybee, Joan, Revere Perkins, and William Pagliuca. 1994. The evolution of grammar: Tense, aspect, and modality in the languages of the world. Chicago: University of Chicago Press.

Coates, Jennifer. 1983. The semantics of the modal auxiliaries. London: Croom Helm.

Croft, William. 1991. Syntactic categories and grammatical relations. Chicago: University of Chicago Press.

-----. 2001. Radical Construction Grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.

-----. 2003. Typology and universals, 2nd edition. Cambridge: Cambridge University Press.

de Haan, Ferdinand. 1999. Evidentiality and epistemic modality: Setting boundaries. Southwest Journal of Linguistics 18.83-101.

-----. 2005. Modality in Slavic and semantic maps. Modality in Slavonic Languages: New perspectives, ed. by Björn Hansen and Petr Karlik. München: Sagner.

-----. 2009. On the status of “epistemic” must. Modality in English, ed. by Roberta Facchinetti and Anastasios Tsangalides. Bern: Lang.

Dooley, Sheila and Ferdinand de Haan. 2006. Evidentiality and epistemic modality: Swedish lär. Unpublished manuscript, University of Arizona.

Haspelmath, Martin. 1997. Indefinite pronouns. Oxford: Oxford University Press.

-----. 2003. The geometry of grammatical meaning: Semantic maps and cross-linguistic comparison. The new psychology of language: Cognitive and functional approaches to language structure, ed. by Michael Tomasello, vol. 2, 211-42. Mahwah, NJ: Erlbaum.

Jurafsky, Dan and J. H. Martin. 2000. Speech and language processing: An introduction to natural language processing, computational linguistics and speech recognition. New York: Prentice Hall.

Kemmer, Suzanne. 1993. The middle voice. Amsterdam: Benjamins.

Lichtenberk, F. 1991. Semantic change and heterosemy in grammaticalization. Language 67.475-509. doi:10.1353/lan.1991.0009

Palmer, Frank R. 1986. Mood and modality. Cambridge: Cambridge University Press.

-----. 1990. Modality and the English modals, 2nd edition. London: Longman.

-----. 2001. Mood and modality, 2nd edition. Cambridge: Cambridge University Press.

Tognini-Bonelli, E. 2001. Corpus linguistics at work. Amsterdam: Benjamins.

van der Auwera, Johan and Vladimir Plungian. 1998. Modality’s semantic map. Linguistic Typology 2.79-124. doi:10.1515/lity.1998.2.1.79

Author’s contact information:

Ferdinand de Haan

University of Arizona

Tucson, AZ

85721 USA

fdehaan@u.arizona.edu

[1] I am grateful to the editors and to an anonymous reviewer for helpful comments. All remaining errors are my own.

[2] A comprehensive introduction to semantic maps is Haspelmath (2003). For a full discussion on the usefulness of semantic maps in typology see Croft (2003:133ff). Some areas of language for which semantic maps have been proposed are: the perfect (Anderson 1982), evidentiality (Anderson 1986), voice (Kemmer 1993), case (Croft 1991), coming and going (Lichtenberk 1991), modality (van der Auwera and Plungian 1998; de Haan 2005), and indefinite pronouns (Haspelmath 1997). In addition, semantic maps play a prominent role in Radical Construction Grammar (Croft 2001).

[3] The analysis is based on de Haan (forthcoming).

[4] See, for instance, Tognini-Bonelli (2001) who draws a distinction between the corpus-based approach and the corpus-driven approach. These two approaches essentially correspond to the difference between top-down and bottom-up, respectively. The corpus-based approach starts with an analysis and uses corpus data to confirm or deny that analysis. The corpus-driven approach starts with data (often an exhaustive analysis of individual words or phrases) and may or may not link these data with other bits of data. In practice, often a mix of the corpus-based and corpus-driven approach is seen.

[5] It is certainly not implied here that a top-down approach is less concerned with an exhaustive analysis of linguistic material than a bottom-up approach, far from it. The difference lies in the shift of focus from the domain to the linguistic material, a subtle but essential distinction.

[6] Of course, a synchronic semantic map is an excellent basis for the exploration of diachronic changes. See, for instance, the discussion on Swedish lär and English will in Section 4.5 below.

[7] Future tense is a domain in and of itself because, like past tense, it can have remoteness distinctions plus various shades of modality.

[8] The placement of Aikhenvald (2004) is difficult because she is more concerned with what evidentiality is not than with giving criteria for what it is. Hence, we have disregarded that study here.

[9] Data in (3)-(5) comes from the Brown and Switchboard corpora.

[10] Palmer feels that be bound to can almost be paraphrased with “it is certain that…”. Note that the “inevitability” sense of (7a) can be qualified by adding the adverb almost (1990:56), while must cannot be so modified. This is a contextual clue that the two verbs are non-synonymous.

[11] The article dates from 2003; the events described happened in 1929.

[12] This discussion is based on Dooley and de Haan (2006).

[13] See de Haan (forthcoming) for details and citations.

[14] There are other functions in the meaning range of will, such as habituality and volition, but they have been disregarded here.

[15] In these studies, no distinction is made between domains and functions, so there is no complete correspondence between the semantic map model proposed here and the studies described. For instance, van der Auwera and Plungian of course make no distinction between the strong epistemic and evaluative functions, so Figure 8b is not isomorphic with their Table 3 (1998:86).

[16] A good analogy is geographical maps. A map of the United States that takes up one page in an atlas is drawn to a smaller scale than a map of the state of Arizona that also takes up one page. Arizona is also contained in the map of the United States, but it is shown there with less detail. Arizona is part of the US, just as epistemic modality is part of modality as a whole.

[17] Tuyuca is chosen as an example here because it is a justly famous case. Any other example would do just as well as it is the methodology that matters here, not the particular analysis.

[18] INFER – inferential evidential; MASC – masculine; PAST – past tense.

[19] The top-down vs. bottom-up dispute is well known in other areas. In some cases, a merger of the two approaches has proved to be the most effective way. For instance, in NLP parsing applications a combination of top-down and bottom-up parsing rules is currently the most effective approach (see Jurafsky and Martin (2000) for details). See also Footnote 2 above.