1. Introduction1 This article constitutes the second part of a joint reflection on how elicitation techniques and methodologies can change our view of the dialectal variation of specific language phenomena (see Cornips & Poletto 2005). Ultimately, such a change of view can influence our theoretical analyses by excluding some a priori logically possible analyses and guiding our research towards more detailed and fine-grained hypotheses. We will present two case studies on sentential negation taken from the Syntactic Atlas of the Dutch Dialects (henceforth: SAND) and the Syntactic Atlas of Northern Italy (Atlante Sintattico dell'Italia Settentrionale, henceforth: ASIS) projects in order to show how the data can drive our research up to a certain point. The overarching aim of this paper is to shed light on the hierarchy of formal properties, to find out which are 'more superficial' or peripheral and can be changed by dialectal variation and which are more stable and vary only among different language groups. We will show that the distribution of negative markers displays an unexpected degree of similarity in the variation pattern, although the languages considered are Romance for the ASIS and Germanic for the SAND.
The subgoals of this paper are threefold. In our view, it is crucial to take into consideration in any study of dialectal variation in a large geographical area that:
a) gathering dialectal data is not a flat process, but involves various stages, each of which can exploit different types of tasks (see part I);
b) by investigating sufficiently close languages or varieties it is possible to gather a very precise picture of the range of variation of a phenomenon, which can be blurred by interfering factors when examining languages that have very different grammars;
c) one always has to take each piece of data very seriously, and see whether a single occurrence of a construction in a questionnaire that includes a large number of potential contexts in which the construction could have appeared is due to interference or to external phenomena, or whether it is a genuine case indicating a hidden 'iceberg' of phenomena whose surface is manifested in a very small set of data.
The SAND and ASIS case studies are cases in point. The former reveals how the geographical distribution of a phenomenon can provide interesting clues for its analysis and permits us to distinguish cases of interference with the stimulus (so-called task effects, see Part I) from genuine phenomena. It is shown that geographical microvariation also allows us to establish more clearly whether there is a correlation between two phenomena or whether they are independent, a point which is clearly central to any analysis. The latter examines how the problem of investigating structures which are apparently optional, but hide semantically driven choices, can be solved.
By means of this paper, we hope to show that the investigation of a single phenomenon in a small area of inquiry can serve as a magnifying lens that restricts the range of possible analyses, guiding our research in a way that is not possible when analysing a single language or a set of related but clearly more 'distant' languages. The general hypothesis leading our investigation conceives of dialects as languages so closely related that one can in theory observe the variation range of a single phenomenon 'in vitro', so to speak, without any other phenomena interfering in our experiment. In other words, dialectology comes closest to purging linguistic data of the interference of independent factors, a necessary condition for sound scientific investigation.
Before discussing the methodological problems which constitute the main topic of this article, we would like to briefly point out a couple of interesting theoretical problems that have emerged from our empirical work in the ASIS and SAND atlas projects and that are relevant for the way we conceive our dialectological investigations. First of all, investigating microvariation provides us with more refined tools for understanding how languages can minimally vary. Language variation has been extremely important for the development of the notion of universal grammar. In particular, the form in which it has been investigated by typologists, namely implicational universals, has proved extremely fruitful for linguistic research. We believe that, on the one hand, dialectology constitutes the other side of the same problem investigated by typological work, with the advantage that the field of investigation is magnified by the close similarity of the languages under investigation. On the other hand, microvariation might turn out to be more interesting from a very general perspective, namely when considering whether parameters are connected to each other in 'clusters' or are completely independent of one another: the type of dialectal variation found in the two projects we present here displays an unexpected degree of similarity in the variation pattern, although the languages considered are Romance for the ASIS and Germanic for the SAND.
Baker (2001) has recently proposed the notion of macro-parameter, that is, a fundamental property that distinguishes one language group from another. Properties of this type are never touched by dialectal variation, which concentrates on 'smaller' phenomena that would probably go unnoticed in a typological perspective. Going back to the comparison with the biological study of families of bacteria that we used in the first part of this work, it is clear that within the same family of bacteria causing flu there is variation inside their DNA, so that one person can be immune to one subtype but not to another. However, no bacteria belonging to this family can cause cancer, as other types do. The DNA of the two types must be different in a way that is not found inside the same family. Our work is framed within this perspective: we are trying to shed light on the hierarchy of parameters and see which are 'more superficial' or peripheral and can be changed by dialectal variation and which are more stable and vary only among different language groups.
The paper is organized as follows. In sections 2 and 3 we introduce the methodology used by both the ASIS and SAND projects, which has been designed for preparing syntactic atlases and includes the largest possible number of phenomena in various empirical domains (see Cornips and Poletto 2005 for a presentation of the two projects).1 In sections 4 and 5 we examine the phenomenon of discontinuous or embracing negation and the phenomenon of negative concord with negative quantifiers in the SAND and ASIS data, respectively. Concluding remarks are presented in sections 6 and 7.
2. The layered methodology One of the first problems a dialectologist is confronted with is the need to discover which phenomena are subject to dialectal variation. These phenomena are investigated in order to find descriptive generalizations, without which micro-comparative linguistic research is impossible.
In starting both the ASIS and SAND projects, the dialectologists found themselves in a similar situation: at stage zero of their research there were, for both language domains, some sparse indications of how the syntax of a certain dialectal area (in our case the Northern Italian Dialects (henceforth: NID) and the Dutch dialects in the Netherlands and Belgium/Flanders) could vary, but only a couple of phenomena (for instance subject clitics) had been systematically investigated. Therefore, a preliminary survey was necessary, first by means of a literature survey and then by using general questionnaires that were especially designed for testing a large set of phenomena. The form of a general questionnaire is determined by the need to find new interesting phenomena, not by the need to describe them in detail. Therefore it contains several different types of sentences that can provide new insights, rather than a consistent set of examples investigating the distribution of single phenomena. Once this had been done, the general properties of the area investigated were clear enough to permit a detailed analysis of single phenomena. This gave rise to a layered methodology for collecting the data, that is, a stepwise procedure starting with a broad survey and progressively narrowing the target, producing a 'cascade' investigation which has the best chance of finding something interesting for micro-comparative linguistic research. A first phase consisting of a review of the literature and first 'testing' questionnaires is necessary for any syntactic enterprise of this sort. A second phase of further, more targeted investigation of single phenomena can, however, serve as a launching pad for other discoveries, thus feeding a chain reaction of new, more detailed studies.
Consequently, the first positive effect of the layered methodology is a practical one: given that the area of investigation is large, it is not worthwhile to run a long and expensive test to look for a phenomenon that perhaps does not even exist in a given dialect. This is the reason why a preliminary search is in order.
The second positive effect of a layered methodology has to do with the fact that in order to analyze a phenomenon, it is necessary to already know many of the syntactic properties of a language, for instance whether it has verb movement or not, and to what extent, what the basic order of the arguments is and what the restrictions on its left periphery are. However, with respect to the possibilities of conducting a layered methodology, it is important to point out that it is crucially dependent on practical factors. For instance, the SAND project had a lot of 'manpower' but was very restricted in time: all the data had to be gathered, transcribed, tagged and analyzed in three years' time (Barbiers, Cornips & Kunst in press). The ASIS project, by contrast, has almost no 'manpower' but is in fact a longitudinal investigation of the NIDs. Of course, these practical considerations determine the nature and extent of a layered methodology.
Notice that all linguistic research could be conceived as a layered enterprise, with a progressive and deepening analysis of the facts under consideration; this is obviously the case if we consider the history of widely discussed phenomena (like anaphors, V2, pro drop or clitics) in the literature.
However, what is meant here by the term “layered methodology” is not the ongoing discussion on the analysis of a given topic, but the way in which new phenomena themselves and relations among phenomena can be discovered and brought to the attention of the linguistic community.
Although the theoretical analysis is always part of the layered procedure, and drives our investigation and the choice of the variable under scrutiny, it can be used in turn as a tool to discover new phenomena and to provide a detailed description of how the linguistic systems we have in front of us work and how they are related. The ultimate question is always whether microvariation is qualitatively different from typological variation and whether it can tell us anything about the general problem of clusters of properties that might go together.
In other words, while the analysis of a phenomenon helps us to refine our theory, we can also use our theoretical framework to discover and describe new phenomena, which in turn will have an impact and modify our theoretical view of the linguistic system.
Moreover, behind our work in microvariation there is always the general question of establishing whether microvariation itself is not random but somehow driven by other properties of a given dialect or whether it is somehow limited with respect to typological variation: more specifically whether there are universally forbidden sequences, or sequences that are forbidden only when a language has other formal properties, or whether apparently unrelated phenomena go together or can reveal themselves as effects of one and the same abstract property. In other words, the layered methodology helps us find out whether microvariation can discover clusters of formal properties.
Therefore, the layered methodology is intrinsically necessary not only for theoretical analysis but also when our aim is a precise description of new phenomena. In other words we are not dealing here with the analysis of a phenomenon, but with the discovery procedure of new phenomena themselves.
How far a layered methodology can be used to progressively refine our description of variation is also a question of time span and aims. A project might have a shorter time span and therefore concentrate on phenomena that are already known in the literature to occur in a given area, while it could also be the case that the time span is not important and new phenomena and new relations among phenomena can be sought in a progression of theoretical research and field work.
2.1 The ASIS-project Regarding the ASIS project, the existence of many phenomena has been discovered simply by consulting descriptive grammars, or the AIS atlas (Sprach- und Sachatlas Italiens und der Südschweiz). This atlas was primarily conceived as a lexical enterprise but contains syntactic data to a large extent, although it is not syntactically ordered. The limitation of this bibliographical investigation was precisely that, although a lot of phenomena were registered for many dialects, there was no systematic syntactic investigation of their properties. The literature could thus be exploited for gaining a first general view of new phenomena to investigate, or could in the best case show some tendencies (see for instance Benincà (1992), who first noted on the basis of AIS charts that only those languages that have preverbal negation use a suppletive form for negative imperatives), but due to the lack of systematic ungrammatical data, it was hard to formulate any empirical generalizations and draw solid conclusions. In constructing our first survey questionnaire, our research was led by a single phenomenon (subject clitics), which was one of the few syntactic properties that had previously been systematically investigated (see Benincà 1983, Renzi and Vanelli 1983). The first questionnaire was conceived primarily as a test for this and for other connected phenomena. So, it tried to determine whether subject clitics can occur in interrogative and relative clauses, whether they can co-occur with quantified or definite subjects, whether there are special clitics for auxiliaries, and whether they interact with negation. Consequently, a large amount of data concerning clause types, quantifiers, auxiliaries and negation was gathered in a rather systematic way. Once this was done, a number of different phenomena had been discovered whose exact range of variation was still unknown.
The following step in the research was the creation of a number of 'specialized' questionnaires for the single phenomena that had been found in the different dialects. The term 'specialized' has to be interpreted in two ways: specialized in the sense that this type of questionnaire is used only in those dialects that display the phenomenon (as revealed by the first inquiry), and 'specialized' in the sense that it is primarily concerned with a single phenomenon, but in a rather systematic way. Notice that this more restricted type of investigation also involves a certain amount of discovery. Moreover, a stepwise procedure in collecting the data has the advantage of bringing in new data which do not always concern the phenomenon studied. For instance, investigating subject clitics in interrogative clauses led to the discovery of wh-in-situ in an area (and with properties) where this had never been registered, namely Eastern Lombardy. The specialized questionnaire that was created to analyse the properties of wh-in-situ in Eastern Lombardy was used primarily in the areas where the phenomenon of wh-in-situ had been found. In turn, this led to a number of new discoveries; for instance, in two Eastern Lombard dialects wh-in-situ co-occurs with what looks like the Romance counterpart of English 'do-support' (see Benincà & Poletto 2004 for a detailed analysis of this phenomenon). Moreover, although the first questionnaires did not test ungrammatical data, this was done in the second phase, when single phenomena were described and analysed.
2.2 The SAND-project Regarding the SAND project, the layered methodology actually consisted of four phases (Cornips & Jongenburger 2001, Barbiers et al. in press). The first phase was a comprehensive literature study of the four empirical domains examined in the SAND project, namely negation and quantification, left periphery, right periphery and pronominal reference. All publications on Dutch dialect syntax, i.e. articles, monographs, books and some former atlases (both lexical and syntactic), were traced and all titles were fed into a database on the internet.
Such a preliminary survey of the existing literature in both the ASIS and SAND projects identifies the areas where postverbal negation is found in the NIDs or preverbal negation is found in Dutch dialects. Consequently, it eliminates unnecessary field inquiry for those dialects that do not display the phenomenon under investigation (see section 2 above). On the basis of the syntactic phenomena already described in the literature, together with recent generative syntactic insights, a written questionnaire was prepared with respect to the four empirical domains, containing 424 questions (including sub-questions and remarks to be made by the informants). This questionnaire was sent out and filled in by 368 subjects. The goal of the written questionnaire was threefold. First, the responses to the questionnaire provide insight into the geographic distribution of the syntactic variation investigated. Secondly, the responses show which part(s) of the Dutch-speaking area were of interest with respect to the four research topics. Finally, the results of the written questionnaire were needed as input for the next phase: the oral fieldwork. The oral fieldwork included 267 different dialects in the Netherlands and in the Dutch-speaking parts of Belgium and France. In this phase, too, test sentences were offered to the informants. The spoken data of these interviews amounted to 425 hours of speech in total. The methodology of the oral fieldwork partially differs from the ASIS one in having regionalized and multi-stage questions. More specifically, in all the oral interviews 'paths' were designed for every phenomenon to be tested: the interviewer checks 'on the spot' whether a given dialect has a certain phenomenon, in which case he/she takes the 'path' concerning that phenomenon and checks its properties and possible range of variation.
For instance, the sentence containing preverbal negation and a negative quantifier in (1) was only administered in a very restricted area, since it is known from the literature that preverbal negation only occurs in the Dutch-speaking part of Belgium and its immediate surroundings:
(1) Jan en heeft niet veel geld meer
Jan not has not a lot money more
'Jan does not have a lot of money anymore'
Thus, a positive effect of the layered methodology is a practical one, given that the area of investigation is large. This is one of the reasons why a preliminary search is in order. Further, sentences such as (2) were tested on the spot in order to gain insight into whether they allow a double negation or a negative concord interpretation:
(2) Er wil niemand niet dansen
it wants noone not dance
Only in the latter case was the sentence in (3) administered, in order to gain more insight into whether these dialects also allow sentential negation with a modified postverbal nie (Barbiers 2000):
(3) Els wil niet dansen, en ze wil niet zingen ook niet
Els wants not dance and she wants not sing also not
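The 'path' logic illustrated by (1)-(3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the SAND interview protocol itself: the names `DialectProfile` and `sentences_to_administer` are hypothetical, and the assumption that sentence (2) is presented at every location is ours, since the text does not state where it was tested.

```python
from dataclasses import dataclass

# Test sentences (1)-(3) as quoted in the text.
SENTENCE_1 = "Jan en heeft niet veel geld meer"
SENTENCE_2 = "Er wil niemand niet dansen"
SENTENCE_3 = "Els wil niet dansen, en ze wil niet zingen ook niet"


@dataclass
class DialectProfile:
    """Hypothetical record for one fieldwork location."""
    # Known beforehand from the literature survey and written questionnaire.
    in_preverbal_negation_area: bool
    # Informant's reading of sentence (2), established on the spot:
    # "negative_concord" or "double_negation".
    reading_of_sentence_2: str


def sentences_to_administer(profile: DialectProfile) -> list[str]:
    """Return the test sentences along the path for this location."""
    path = []
    # (1) is only offered where preverbal negation is expected at all.
    if profile.in_preverbal_negation_area:
        path.append(SENTENCE_1)
    # Assumption: (2) is offered everywhere to establish the reading.
    path.append(SENTENCE_2)
    # (3) is only offered when (2) received a negative concord reading.
    if profile.reading_of_sentence_2 == "negative_concord":
        path.append(SENTENCE_3)
    return path
```

The point of the sketch is that each later question is conditional on an earlier answer, so that no informant is presented with sentences that are irrelevant for his or her dialect.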
The final phase of data collection in the SAND project involved telephone interviews. The reasons for conducting these interviews, that is, for asking additional questions, were that (i) the subjects had not produced a complete answer to some of the original questions in the oral interviews, (ii) the additional questions were considered decisive for a certain analysis, (iii) they replaced earlier unsuccessful questions and (iv) they checked the results of questions in the oral interviews.
What we would like to discuss in this paper is the utility of a layered methodology on the basis of two distinct inquiries on sentential negation in the Dutch and Northern Italian dialects. We will show that a flat method would have led to wrong conclusions in the case of the interaction between sentence-internal negation and negative quantifiers in Dutch dialects. Moreover, without a layered methodology it would have been impossible to discover a number of subtler meaning distinctions in the usage of postverbal negation in some Northern Italian dialects.
3. Refining the methodology The second major problem a dialectologist is confronted with has to do with the reliability of the data. In general, in the generative framework not much attention is devoted to the question of how data are obtained (cf. Gervain & Zemplén 2005). The researcher is often a mother tongue speaker of the language he/she analyses and relies on his/her own judgments or generally checks with a (usually undefined) number of speakers who are other linguists or people he/she knows. Even in those articles where the author is not a native speaker of the language(s) investigated, usually no information on the elicitation techniques is provided.
However, when extensive micro-comparative work is performed, the problem of having data that are as comparable as possible becomes unavoidable. The linguist is confronted with two main questions, the first of which concerns the homogeneity of the data across speakers. Homogeneity of the data can be promoted by selecting speakers who share the same sociolinguistic variables such as age, gender, level of education and occupation. Consequently, when heterogeneous data emerge, this is more likely to be due to geographical factors than to social ones. Further, as stated in Part I, acceptability judgments can be influenced by a number of external factors. For instance, if we are testing the grammaticality of a given structure, we have to make sure that all the speakers judging the structure have in mind exactly the same interpretation of the sentence. Very often, syntactic phenomena are semantically driven, and the judgment can vary according to whether the native speaker is able to imagine an appropriate context for the sentence or not. As we will see, providing a context is one of the ways to circumvent this problem. Eliciting ungrammatical judgments is also problematic because we have to make sure that the speaker really has in mind our notion of (un)grammaticality and that the sentence is not excluded because of external factors (lexicon, intonation, phonology, pragmatic appropriateness etc.). In any case, although a questionnaire has a number of drawbacks (already examined in Part I), it is a necessary, forced choice for the formal linguist who needs comparable data (and often exactly the same sentence, in order to have a minimal pair) across languages.
Moreover, the findings can vary according to the expectations of the inquirer: the fact that the data are gathered through a questionnaire drives the results. This is in fact a justification for the layered methodology. A stepwise procedure is needed to focus on the variables, since questionnaires are always prepared without knowing exactly which variables are involved in a phenomenon. This means that we have to go back to the same phenomenon, which has been discovered in the first questionnaire, and narrow down the picture by trying to determine its exact range of variation. At this second stage, it is necessary to select the variables according to which the phenomenon under investigation varies, in order to create the specific questionnaire on the basis of a first hypothesis, which in the end might turn out to be incorrect. There is evidently a lot of guessing in this procedure; nevertheless, it is worth pointing out that we already know a lot from research in other fields. If we take the example of negation, the variables selected obviously have to form a list similar to the following (see also Van der Auwera & Neuckermans 2003):
a. position of the negative marker with respect to the verb (in V2 languages both main and embedded clauses have to be inserted, because the position of the verb varies);
b. position of the negative marker with respect to other elements located in the same area (low negation will have to be serialized with respect to lower aspectual adverbs, high negation with respect to higher adverbs, clitics, subject and the complementizer);
c. negative concord with negative quantifiers of different types (bare or phrasal ones);
d. changes of the negative marker with respect to sentence type (for instance imperatives or interrogative clauses), modality, presence of auxiliaries.
Hence, the special questionnaire can be designed by constructing sentences that each contain one of these variables. Sometimes, variables can also be combined in the same sentence to render the questionnaire less heavy, but only if there are other examples that contain the two variables dissociated. The reason why it is better to dissociate variables is obviously that in a sentence containing two variables we do not know on which variable the variation depends.
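The dissociation requirement lends itself to a simple mechanical check on a draft questionnaire. The following Python sketch is purely illustrative: the function `undissociated_variables`, the sentence identifiers and the variable labels (shorthand for variables a to d above) are hypothetical and do not correspond to any actual ASIS or SAND questionnaire.

```python
def undissociated_variables(questionnaire: dict[str, set[str]]) -> set[str]:
    """Return the variables that are only ever tested in combination.

    `questionnaire` maps a sentence identifier to the set of variables
    that the sentence manipulates.
    """
    tested_alone: set[str] = set()
    tested_combined: set[str] = set()
    for variables in questionnaire.values():
        if len(variables) == 1:
            tested_alone |= variables
        else:
            tested_combined |= variables
    # A combined variable is acceptable only if some other sentence
    # tests it on its own.
    return tested_combined - tested_alone


# Hypothetical draft: labels stand for variables a-d in the list above.
draft = {
    "s1": {"verb_position"},                      # variable a alone
    "s2": {"negative_concord"},                   # variable c alone
    "s3": {"verb_position", "negative_concord"},  # a + c, both dissociated elsewhere
    "s4": {"clause_type", "adjacent_elements"},   # d + b, never dissociated
}

print(sorted(undissociated_variables(draft)))
# ['adjacent_elements', 'clause_type']
```

In this draft, variation in the responses to s4 could not be attributed to either variable, so single-variable sentences for both would have to be added before the questionnaire is administered.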
With these premises, we intend to examine the phenomenon of discontinuous negation investigated by the two projects (ASIS and SAND) and show how dialectal data can be decisive to discard some possible lines of research from the very beginning. Moreover, we will show that the distribution of clausal negation is better understood by adopting a layered methodology.
We are interested in three types of negation, i.e. preverbal, embracing and postverbal negation. The discovery of 'optional' negative markers is particularly interesting because it could lead us to a better understanding of the 'Jespersen cycle', namely the diachronic process by which a language having only a preverbal negative marker develops an 'optional' postverbal negative marker 'reinforcing' the preverbal negative element, which in turn becomes weaker and in the end disappears.2 Note that terms like “reinforce” or “weak” are not precise enough, and we would like to gain a more detailed picture of how the cycle works. Dialectal variation is often said to be the synchronic counterpart of diachronic variation, and can thus be exploited in order to clarify syntactic processes we no longer have access to. Several authors (see among others Vai 1999) note that the diachronic cycle proceeds by progressively enlarging the contexts in which postverbal negation is tolerated (though not obligatory) and leads to a system that has both obligatory pre- and postverbal negation. The question arises of precisely which contexts these are, whether they can be arranged on a scale from the first to the last to occur with discontinuous negation, and whether the scale is always the same in all languages going through the cycle. In other words, one might be interested in determining which factors allow the occurrence of the 'optional' postverbal negative marker, and whether they are syntactic or semantic or both.
3.1 The ASIS-project It is well known from the literature that the position of negation varies in the NIDs: North-Western Italian dialects have only postverbal negation, while North-Eastern ones have only preverbal negation. The first ASIS questionnaire contained a number of negative sentences, which were originally introduced to test the order between the preverbal negative marker and subject clitics. Nevertheless, these data can be exploited to gather a general picture of the distribution of sentential negation across the NIDs. The picture gathered confirms what was already present in the literature: Western NIDs only have postverbal negation, Central NIDs (Emilian) have discontinuous negation of the 'French' type and preverbal negation is present in the East, as presented in (4a), (4b) and (4c), respectively:
(4) a On dì mia isé Bassa Val Camonica (Bg)
it says not so
b A n' as dis brisa acsì Bondeno (Fe)
it not it says not so
c No se dize cussì Venezia
not it says so
‘We do not say so’
Interestingly, those dialects that can have exclusively preverbal negation (see (4c)) optionally admit a postverbal negative marker miga in some contexts. Importantly, the occurrence of miga in (5) is unexpected or comes 'for free', since it was not offered in the questionnaire:3
(5) No se dize (miga) cussì Venezia
not it says not so
‘We do not say so’
In the first questionnaire, the postverbal element (miga) is only sporadically present: within the same area it occurs in only some negative sentences, and only some speakers use it. From this picture we can derive the following conclusions:
(i) the fact that it does not occur in all sentences in a given questionnaire leads us to conclude that a postverbal negative marker is possible, though not obligatory in the dialect where it is found (at least one instance in one questionnaire);
(ii) the reason why it is found only in some questionnaires and not in others might be due to the following factors: either there is a diachronic change going on, so that some speakers use the postverbal negative marker while others do not; or some speakers are simply better than others at imagining contexts.4 Thus, postverbal negation may be sensitive to a special context which, however, does not immediately come to the mind of some speakers;
(iii) the variation could just be random.
Thus, the picture provided by a first questionnaire is 'unfocussed', so to speak; that is, we know there is a postverbal negative marker, but apparently it is subject to as yet unknown restrictions. Notice, however, that we take the data seriously in the sense that we do not discard the postverbal negative marker as a 'performance error' just because it does not occur in a systematic way. If we did, we would overlook a large number of interesting phenomena. Indeed, it is quite hard to think that adding a second negative marker can be conceived as a performance error. However, there are other cases (especially when the order of some elements is concerned) that might be performance errors. Nevertheless, it is always useful to further investigate each single discrepancy in the data because it very often conceals interesting variation. Recall, however, that we should not expect a first questionnaire to provide all the features necessary to describe a phenomenon: it is not designed for that, but only to detect new phenomena. Once the phenomenon has been discovered, it is necessary to narrow down the field of inquiry and to create a more fine-grained questionnaire aiming to discover the range of variation of the phenomenon in question, in our case the distribution of non-obligatory postverbal negation in Veneto dialects. It is worth emphasizing that standard colloquial Italian spoken in the North also has a postverbal negative marker, which has been analysed by Cinque (1977). We will see that it constitutes only one of the possible systems of postverbal negation. Vai (1999) notes that in Old Milanese V2 contexts a postverbal negative marker is more frequent. Consequently, there seem to be both syntactic and semantic factors involved. A layered methodology is necessary in order to gain more insight into these syntactic and semantic factors.