Extraordinary claims require extraordinary evidence (and ordinary ones require ordinary evidence): on experimental linguistics for less well studied languages

The late physicist Carl Sagan, quoted in the first part of this article's title, aptly formulated the common-sense view of the nature of evidence in the mature sciences. In linguistics, however, evidence has become a controversial matter, especially when it comes to the investigation of less studied languages. In this article, I argue that Sagan's principle should be applied to linguistics. The growing accessibility of a wide variety of experimental techniques and computational tools for analyzing linguistic data makes it feasible to support extraordinary proposals with evidence from a wide variety of sources. At the same time, it is in many cases possible to reach agreement on what constitutes an ordinary, non-extraordinary scientific proposal, so that any extraordinary effort can be reserved for supporting equally extraordinary proposals. For uncontroversial proposals, no more than a minimum of effort is needed to establish and document the evidence.

Introduction

Evidence has become a topic generating substantial discussion in theoretical and descriptive linguistics. For instance, the University of Tübingen in Germany has organized a biennial conference entitled "Linguistic Evidence" since 2004. In the call for papers for the 2014 conference, the conference describes itself as "a meeting place for linguists who wish to improve the empirical adequacy of linguistic theory and linguistic analysis" and aims to "more closely integrate data-driven and theory-driven approaches".¹ Implicit in this description is the view that the empirical adequacy of linguistic theory is open to improvement because the theory has [...] Also, several recent journal contributions focus on the methodology of collecting evidence to address questions in linguistic theory. I discuss below a debate concerning evidence in theoretical syntax and semantics focusing on English data (Gibson and Fedorenko, 2010, 2013; Sprouse and Almeida, 2013; Sprouse et al., 2013), but also (Matthewson, 2004; Dixon, 2007). I hope to show that despite all the discussion within linguistics, the same view towards evidence used in the established sciences can also be applied in linguistics. Sagan (1980) aptly phrased this principle as I quote it in the title of this paper: "extraordinary claims require extraordinary evidence". [...] articulate some general principle relating to evidence. In particular, I show that Sagan's principle has been the common view of the relationship between theoretical claims and empirical evidence for centuries, dating back at least to Hume (1748). The second part of my title is in parentheses ("and ordinary ones require ordinary evidence") because it remains an implicature in Sagan's formulation. But I show that it has been understood as part of the principle since the beginning. I then provide a suggestion of how to understand the terms "extraordinary" and "ordinary" within Sagan's principle on the basis of prior likelihoods and the cost of mistakes: an
extraordinary claim is one likely to cause high costs in the case of a mistake. In the second section, I review a recent debate concerning the sources of evidence for the study of well-studied languages (Gibson & Fedorenko, 2010; Sprouse & Almeida, 2013), and show that the outcome of the debate has essentially been Sagan's principle. In the second part, I review recent contributions on the methodology of the study of less well studied languages, especially by Dixon (2007) and Matthewson (2004), and point out some weaknesses of Dixon's text-only method. But while I agree with Matthewson's inclusive approach for most cases, I show that there are cases where one should call upon experimental [...] Matthewson's own (2006) work on presuppositionality in Salish as a case in point. Instead of Dixon and [...] principle should be applied as a guideline for the study of less well studied languages as well. This entails that both evidence from traditional [...] have a role to play. For many questions, though, a combination of modern recording technologies is most appropriate as the main source of [...] for when to incorporate formal experiments to gather quantitative [...]

Sagan (1980): "extraordinary claims require extraordinary evidence." In the title, I furthermore articulate an implicature of Sagan's principle, namely that ordinary claims require only ordinary evidence. In this section, I argue that Sagan's principle as well as the implicature I added are an established tenet of the philosophy of science dating back to at least Hume (1748). I furthermore consider how it can be applied to evidence in linguistics. [...] history of science. I quote Sagan's principle here from a science TV program Sagan appeared in. While Sagan's phrasing is widely quoted and original to Sagan, it is also well known that the underlying principle is much older than Sagan's formulation of it. Two much older formulations of essentially the same principle are due to Hume (1748) and Laplace (1814). Two relevant quotes from section 10 of Hume's
(1748) book are "a wise man ... proportions his belief to the evidence" and "no testimony is sufficient to establish a miracle, unless the testimony be of such a kind, that its falsehood would be more miraculous than the fact which it endeavors to establish." Hume calls the latter quote a "general maxim worthy of our attention", so we might also use the term Hume's maxim instead of Sagan's principle. A second early relevant quote is the following from the French scientist Laplace (1814: p. 50): "qu'il ne serait pas philosophique de nier les phénomènes, uniquement parce qu'ils sont inexplicables dans l'état actuel de nos connaissances. Seulement, nous devons les examiner avec une attention d'autant plus scrupuleuse, [...]" ("[...] to deny phenomena solely because they are inexplicable according to the present state of knowledge. But we ought to examine them with an [...] them.") Wikipedia (2014b) reports that Flournoy (1899) reformulated Laplace's principle as: "The weight of the evidence should be proportioned to the strangeness of the facts", which might have inspired the more modern formulation of Sagan. The quotations show that Sagan's principle is an old principle of science.
Consider now the implicature of Sagan's principle that ordinary claims require only ordinary evidence. In Hume's and Sagan's formulations, the implicature I added in the title is not made explicit. Conditionals "If p then q" are well known in the linguistic literature to trigger an implicature "If not p then not q", also referred to as conditional perfection (Geis & Zwicky, 1971 and others). For example, the recommendation "If it's raining, you should take an umbrella" implicates that if it's not raining, you shouldn't take an umbrella. Sagan's principle is just an elegant formulation of the conditional "if you make an extraordinary claim, you must present extraordinary evidence for it." So, it implicates that ordinary claims require only ordinary evidence. The same holds for the second quote from Hume above: The statement that no evidence except [...] a non-miracle. Actually, both Hume (1748) and Sagan (1980) had a reason to leave the implicature implicit: the quotes of Hume's are from a chapter on miracles, so Hume was focusing on the case of extraordinary claims. A similar reason applies to Sagan's quote. The Sagan quote is inspired by the rather similar phrase "an extraordinary claim requires extraordinary proof" by the sociologist Truzzi (1978), according to Wikipedia (2014a). While I have no opinion on whether Sagan was actually quoting Truzzi, it is of interest to the present argument that both Sagan and Truzzi were fellow founders of the Committee for the [...] it makes sense for both of them to omit the implicature concerning [...] formulations are explicit about the implicature since they speak of a proportional relation between the evidence furnished and the claim that one seeks to establish.
In sum, I showed in this section that Sagan's principle, including its implicature, is an established principle in the philosophy of science. However, we still need to determine how to apply the principle to linguistic claims and evidence. To be able to do so, we need to understand the adjectives "ordinary" and "extraordinary" as applied to linguistic claims and evidence. In this next section, I suggest a general Bayesian [...] for linguistic evidence.
To derive any consequences from Sagan's principle, we need to understand what constitutes an ordinary vs. an extraordinary claim, and similarly what constitutes ordinary and extraordinary evidence. Hume's term "miracle" indicates that extraordinary claims are those that we believe to be highly unlikely. But I think we also intuitively understand that the cost of a potential error affects the quality of the evidence we desire: before you go on an overseas trip, you might check several times that you have your passport with you. But you are more likely not to check at all that you packed your toothbrush. That this behavior is rational, at least if you are anything like as forgetful as me, follows from the Bayesian computation linked to the cost of error. Assume that you're equally likely not to have packed your passport or your toothbrush. But the cost of not having your passport is substantial: you might not be [...] you checked your passport 20 minutes ago might be mistaken, it makes sense to expend the energy to check again just to be sure you avoid the substantial cost of error. By comparison, the cost of not having your toothbrush is a lot smaller, so checking whether you have it on you is uneconomical: the error is too inconsequential to be worth the cost. The example shows two factors that play a role: the cost of the test and the cost of an error. A further factor that plays a role is the reliability of a test: staying with the example, you might check for your passport either by quickly feeling through the outside of your bag that it contains a passport-sized printed document, or you might do a more elaborate but more reliable check: open your bag, take out the passport, open it, and check the name and validity of the passport.
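The passport and toothbrush reasoning can be made concrete with a small expected-cost comparison. The following Python sketch is only an illustration: the function names, the probability of forgetting, the costs, and the reliability figure are all assumptions introduced here, not values from the text. It compares the expected loss of travelling without a check against the expected loss when one check of a given cost and reliability is carried out.

```python
def expected_cost_without_check(p_missing, cost_of_error):
    """Expected loss if we leave without checking at all."""
    return p_missing * cost_of_error


def expected_cost_with_check(p_missing, cost_of_error, cost_of_check, reliability):
    """Expected loss if we check once before leaving.

    reliability is the assumed probability that the check notices a missing
    item; a noticed item is simply packed, so only unnoticed cases are costly.
    """
    p_undetected = p_missing * (1.0 - reliability)
    return cost_of_check + p_undetected * cost_of_error


def worth_checking(p_missing, cost_of_error, cost_of_check, reliability):
    """A check pays off when it lowers the expected loss."""
    with_check = expected_cost_with_check(
        p_missing, cost_of_error, cost_of_check, reliability
    )
    without_check = expected_cost_without_check(p_missing, cost_of_error)
    return with_check < without_check


# Hypothetical numbers: both items are equally likely to be forgotten,
# but forgetting the passport is far more costly than forgetting the toothbrush.
P_MISSING = 0.05
passport = worth_checking(P_MISSING, cost_of_error=1000.0, cost_of_check=1.0, reliability=0.9)
toothbrush = worth_checking(P_MISSING, cost_of_error=3.0, cost_of_check=1.0, reliability=0.9)
print("check passport:", passport)
print("check toothbrush:", toothbrush)
```

With these made-up numbers the passport check lowers the expected loss and the toothbrush check does not, although the prior probability of forgetting is the same in both cases. A more reliable but more expensive check can be compared in the same way by changing `cost_of_check` and `reliability`.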
The statisticians Neyman and Pearson (1928) introduced the discussion of two different types of error in testing a hypothesis. A Type I error occurs when the test comes out in favor of a hypothesis that's actually false, while a Type II error occurs when the test result speaks against a hypothesis that's actually true. The two types of errors might cause different amounts of damage. The calculations involved in this type of scenario are well understood in the context of medical tests and are discussed in conjunction with the concepts of sensitivity and specificity. Consider the case of a medical test for a condition that has no other [...] cost of carrying out the test and in terms of the suffering the test itself [...] rates of the two errors: for the people that actually have the medical [...] false negatives. And for the people that actually don't have the medical [...] for the true negatives only the cost of the test itself is incurred. The framework of utility analysis of von Neumann & Morgenstern (1944) implies that the test is only useful if the cost of the test to an [...] each of the four possible outcomes times the probability of belonging to each of the four groups. And if we have to decide between two possible tests, we would want the cost increase the more expensive test causes [...] the pricier test as compared to the cheaper test. Schematically, a test with many false positives is more acceptable when the likelihood of an actual positive increases or the cost incurred by treating a false positive decreases. And a test with many false negatives is more acceptable when the actual rate of positives decreases or when the cost of a false negative decreases. In medical testing all the precise costs may be known at least in approximation: the cost of the test and the treatment, the cost reduction of successful treatment compared to not giving treatment to affected individuals, and also the cost of the possible side effects of treating one of the false positives. In linguistic examples, the cost of a test is generally known in approximation: it's roughly the cost of [...] the cost of an incorrect linguistic theory are not known. So we can't really determine what rate of Type I and Type II errors we should tolerate.
In behavioral psychology, researchers have agreed to tolerate 5% of [...] scenario of testing a tested hypothesis that two groups of measurements are drawn from different populations vs. the null hypothesis that the two are drawn from two randomly chosen groups of the same population. The 5% and 20% rates correspond to the assumption that there is no prior bias as to whether the test hypothesis is correct or not, but that the cost of publishing a false positive is greater than the cost of not [...] not to the individual researcher). There are many scenarios in linguistics [...] For example, we may compare whether males and females use an overt [...] sex difference. In such a case the run-of-the-mill psychological method can be applied to possibly show that there is a sex difference. However, [...] common where the psychological method isn't applicable. Consider for [...] word used to describe rabbit in the language of an indigenous group. Quine (1957) argues that it is close to impossible to conduct even such a simple task while strictly applying the method of behavioral experiments where indigenous speakers observe rabbits vs.
some other stimuli to determine that the presence of rabbits triggers a different word from that of, for example, foxes and thereby shoot down the "null hypothesis" that the words for rabbits and foxes are the same in the indigenous language. Quine concludes from this argument that translation is indeterminate, but this result derives in large part from Quine's strict adherence to the behavioral method. Some residual [...] better discussed for medical research, nuclear physics, or the theory of [...] mental faculties of the indigenous groups do not [...] substantially from [...] null hypothesis that the indigenous group should lack a term for rabbit if rabbits occur frequently in the environment the indigenous group occupies? A different kind of starting point would be the hypothesis that the indigenous language is not substantially different from other previously studied languages, including the well-studied European and East Asian languages, except for the phonetic content of the lexical items. Under this perspective the Type I and Type II errors are the opposite of those of the Quinean perspective. So if we accept different rates of the two types of errors, we arrive at different outcomes. Assume we accept 20% of Type II errors. Then, on the Quinean approach up to 20% of the claims of the form "this indigenous group doesn't have a word for x" would be false, while on the latter approach up to 20% of the claims of the form "this indigenous group has a word for x just like European languages" would be false.
In the previous paragraph, I argued that the choice of a null hypothesis is by no means obvious for linguistic research. One approach might lead to over-exoticizing the language under study, the other to over-[...] to generally start from the null hypothesis that an indigenous language [...] accept a high rate of false negatives, the indigenous language would end up being described as lacking many distinctions better studied languages [...] starts with the assumption that the indigenous language is similar to well-studied languages. Then any false negative corresponds to a claim that the language in question has a property of some well-studied language. Regardless of approach, errors are of course unwanted, and the expectation is that attempts at replication of a result are going to eliminate errors over time. But some rate of error is unavoidable in any [...] is the less dangerous type of error. In the following, I'll call this the comparison-based approach: the term brings with it connotations of neo-colonialism and cultural imperialism, as well as memories of latinizing the description of some European languages by religious scholars and translators of the Bible. But in current linguistic work, a substantial variety of languages is quite well described, so the connotations mentioned don't apply. The body of current grammatical description is certainly still dominated by languages from the Indo-European family, but there are detailed descriptions of several East Asian languages and of languages from other families spoken in Europe and its periphery (Finno-Ugric, Turkic, Semitic, Basque), and a substantial amount of grammatical description of other languages from all over the planet. Therefore the null hypothesis would in most cases need to be [...] also need to identify the phonetic content of the lexical items. So, [...] ensure that erroneous claims of the type that language x is like some well-studied language in some respect are not damagingly frequent. Furthermore, claims of the type language x is like language y rarely excite
great interest, and therefore the greater burden of proof should be placed on the researcher claiming that language x and language y differ. The over-exoticizing approach, in my view, is more severely handicapped: it seems to start from the presumption that the humanity of the indigenous group is in doubt and ends up all too frequently making false pronouncements of language x lacking some property simply on [...] property in question. Given that claims of this type incur great interest, [...] false negatives of this type.
[...] should generally be taken to be ordinary if they closely correspond to generalizations established on the basis of better studied languages. And one property of an extraordinary claim is that it claims that an indigenous language diverges from the grammatical properties of better studied languages. The resulting picture is different from one where it's assumed as a null hypothesis that some grammatical factor always plays no role in the indigenous language. In addition, the degree of extraordinariness depends on the hypothetical cost caused by an erroneous claim (a false positive) and that of an erroneous rejection of the same claim (a false negative). But this cost cannot even be estimated at this point, and the actual decisions depend largely on the social consensus [...] willing to revise their theoretical models to incorporate the new claim. In the following sections, I attempt to derive more practical consequences from these general philosophical principles.
This section summarizes the current state of a debate about which is the most suitable method to gather acceptability judgments in one of the best studied languages there is: English (Gibson and Fedorenko 2010, 2013; Sprouse and Almeida 2013; Sprouse et al. 2013). The recent debate began with a broad accusation of sloppiness against research using traditional "armchair" methods in syntactic and semantic research. But at least at this point, the result of the debate is that quantitative evidence from formal experiments offers no better validity than evidence from the traditional "armchair" method. This is an important result to keep in mind also for linguistic work on less well studied languages, where quantitative evidence is usually not collected either.

Gibson and Fedorenko [...] of syntax and semantics of being open to researchers' own cognitive [...] proposal. To support this claim they cite a couple of selected, individual cases of judgment data from the published literature that Gibson and Fedorenko failed to reproduce in quantitative studies involving multiple test conditions and multiple speakers. Gibson and Fedorenko therefore call for the widespread adoption of quantitative research methods for syntax and semantics. Given the advances in software technology to conduct judgment elicitation over internet platforms such as Amazon Mechanical Turk, they claim that the cost both in terms of researchers' time and payment of participants would be worth the putative gain in accuracy. Sprouse and Almeida (2013) and Sprouse et al. (2013), however, show that Gibson and Fedorenko themselves are guilty of violating one basic rule of quantitative research method: they don't consider a random sample of data in their evaluation of the "armchair" research method, but instead focus on a few selected cases of data that were already known to be controversial. Sprouse et al.
(2013) present an evaluation of the "armchair" method based on a randomized selection of data from journal articles. They report that 95% of the contrasts [...] using Amazon Mechanical Turk. This means that the "armchair" method is at least comparable to the standard method of behavioral psychology in the number of false positives it yields. The failed [...] that there is indeed some small mismatch, but at this point it remains open as to whether these are false positives of the "armchair" method or false negatives of the quantitative method. Overall this result entails that [...] availability of experimental methods, but experimental methods are still important as they allow research on many questions that couldn't be addressed by the "armchair" method. Also, for surprising, extraordinary new claims there could be an advantage to providing stronger evidence, as expected by Sagan's principle. Sprouse [...] since it shows that currently most contrasts of acceptability relevant to linguistic theory are such strong effects that they can be reliably judged without resort to quantitative methods, at least by trained linguists. There are some differences, though, that remain to be investigated: In most [...] linguists, but from between one and a couple of language consultants. Also, while most interested researchers (e.g. the reviewers of a paper) can readily attempt to reproduce English judgments, this is in most cases impossible for indigenous languages. In the following section, I [...] and the implications of the wider availability of documentation and experimental techniques.

[...] Matthewson (2004) and Dixon (2007). The two approaches represent two opposite ends of a spectrum of opinion (and therefore are exemplary for others' views [...] the collection of texts in the target language, Matthewson advocates in addition the use of elicitation, of translation and use of a contact [...] again the liberal view of Matthewson, but in addition to indicate some [...] inventory.
First, consider the recommendations Dixon (2007) offers. While Dixon's paper contains more general advice on the practical aspects [...] here, the focus on collecting texts is very explicit in Dixon's section 9 on "what to do". Dixon focuses on three tasks: beginning to speak the language, compiling a dictionary, and recording and analyzing texts. The list doesn't mention grammatical elicitation, and Dixon states at the end of the section (p. 23) that grammatical elicitation "should play no role whatsoever [...] p. 22, Dixon writes that "the only way to understand the grammatical structure of a language is to analyse recorded texts in that language." Furthermore, in a later section on "what not to do" (p. 27), Dixon reiterates that controlled elicitation shouldn't be pursued.
Dixon's view, as far as I can gather, is extreme, and Matthewson mentions many researchers who have taken a different stand. [...] texts. One general problem of corpus-based linguistics is that it neither [...] scenario. Given the important role these types of data play in linguistic [...] two examples that illustrate the error-proneness of a text-only strategy. [...] likely doesn't have a written form, but at the same time English is well studied. Thompson (2002) looks at evidence for complement clauses in spoken English corpora, and concludes that rather than complement clauses, spoken English only allows unembedded declaratives accompanied by an evidential phrase. To support her claim, Thompson extracted a sample of 452 complement-taking predicates from a corpus of spoken English and analyzed the structure and discourse contribution of each item in detail. Newmeyer (2010) points out that Thompson committed a Type II error: concluding from the lack of evidence, that [...] corpus of English than Thompson did: 170 megabytes of text data, [...] complement clauses in English. What is instructive here is the amount of text required to avoid a Type II error: for which indigenous language has anybody gathered and transcribed 170 megabytes of data? If one page of text corresponds to 500 bytes (characters), then 170 megabytes correspond to 340 thousand pages of text: Dixon's text-only approach [...] transcribed stories required to determine whether a language has complement clauses or not. At the same time, the text-only approach is not immune to Type I errors. In fact, once a linguist is actually immersed in a collection of texts and their translations, it seems [...] language under investigation. An example indicating that type of failure to see an obvious linguistic difference to widely spoken languages in a text that was widely studied both in the original and in translation concerns Homeric Greek. As Deutscher (2011) recounts, the fact that the color categories of Homeric Greek don't correspond to color categories of
English, German, French and other modern European languages was only pointed out by Gladstone (1858: 457-499). Gladstone slightly overinterpreted the data (he assumed color vision was different in Homer's time), but given the modern knowledge of cross-linguistic variation in color terms, Gladstone's basic observation was essentially correct, though overlooked by almost all others studying Homer's [...] be ruled out if linguistics was to rely entirely on the text-only method.
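The corpus-size estimate in the discussion of Thompson (2002) and Newmeyer (2010) can be checked with a few lines of Python. This is only a sanity check of the arithmetic: the function name is introduced here for illustration, the figure of 500 bytes per page is the one assumed in the text, and a megabyte is taken to be one million bytes.

```python
BYTES_PER_MEGABYTE = 1_000_000  # decimal megabyte, an assumption of this sketch


def pages_needed(corpus_megabytes, bytes_per_page=500):
    """Number of transcribed pages that make up a corpus of the given size."""
    return corpus_megabytes * BYTES_PER_MEGABYTE // bytes_per_page


newmeyer_corpus_pages = pages_needed(170)
print(newmeyer_corpus_pages)
```

Under these assumptions the result is 340,000 pages, which matches the figure of 340 thousand pages given in the text. A denser page, passed in through `bytes_per_page`, lowers the page count proportionally.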
[...] properties in texts: For one, since the gathered texts are usually stories, [...] poetic language as was the case for Homeric Greek, but one Gladstone argues to be incorrect. Secondly, relevant data in texts may be spread out very thinly over different stories, and may only be convincing when arranged into one paradigm. This probably was less of a factor in the case of Homer since many researchers intensively studied his writing, [...] say the very successful typological research on color terms by Berlin & Kay (1969) didn't rely on text corpora at all.
In sum, the text-only approach is prone to a large number of Type II errors and even then may not generally lead to a full view of the language under investigation. A third concern about the approach is that I have always found it very natural to construct an example sentence in a language I'm interested in and then ask a native speaker whether it's grammatical and what it means. I can't imagine that other linguists differ in this respect, even those who subscribe to the text-only view. In fact, [...] and try to use it in conversation. Furthermore, he says it's important "to encourage people to correct all your mistakes" (p. 20), which is very close to judgment elicitation: tell me whether what I say is grammatically correct and whether it's true in this scenario. The major difference here [...] of this type, while I consider it at least as important to document such judgment-elicitation sessions as stories and other texts: only if other [...] based on (including the ungrammatical sentences), can they evaluate the arguments, and it is this independent scrutiny that underpins progress [...] important grammatical sentences should be recorded from at least one native speaker.
Matthewson (2004) is primarily concerned with semantic [...] like me takes a strong stance against the text-only approach. She argues in particular that the use of a contact language (she uses the term "meta-language") and of translations in elicitation needs to be handled with care, but need not be detrimental. One recent example of the latter, from work on Matses I was involved in, is presented by Munro et al. (2012): we investigated the claim that Matses doesn't have a form of speech report corresponding exactly to what is indirect speech in English and also in Spanish. As part of a controlled elicitation experiment, we asked Matses speakers who also spoke Spanish to translate sentences with indirect speech from Spanish into Matses. In this experiment, we might easily have ended up with data that show transfer effects from Spanish into Matses. However, eight of the nine speakers actually gave Matses responses that fully corresponded to the claim that Matses only has forms similar to direct speech in Spanish. The ninth speaker, who did show a transfer from Spanish to Matses, was working as a Spanish teacher. So, the example illustrates the potential for transfer effects involved in translation tasks, but at the same time the evidence obtained is actually revealing when speakers overcome the potential for grammatical transfer inherent in translation tasks. Overall, Matthewson's liberal approach is on the right track in attempting to strike a balance between blocking false positives and not imposing unreasonable methodological barriers that impede progress and cause a large number of false negatives simply because research methods commonly used for widely spoken languages are banned for indigenous languages.
One topic Matthewson doesn't address is the use of formal [...] is no general, precise answer to this question until we know the cost of having different types of errors in our linguistic theory. If one is too [...] effort for results that could have been obtained with equal accuracy in an easier way. The satisfactory reliability of the "armchair" method shows that not every formal experiment is warranted. On the other hand, if one doesn't undertake the effort of a formal experiment when it's due, one may easily miss an opportunity to convince the community of an observation that is extraordinary in Sagan's sense. There can't be [...] (2001, 2006) relies on the frequency of presupposition-challenging responses to test [...] on the presupposition of (1) that there is an elephant in your hair isn't "what elephant?" or similar.
(1) Did you get the elephant out of your hair?
In 2006, Matthewson claims that St'át'imcets adults don't challenge what might appear to be presuppositions in the same way. For example, she reports on presenting the sentence in (2) to one of her consultants. The English translation of (2) presupposes that some other person's being in jail was mentioned before, but Matthewson reports that her St'át'imcets informant didn't challenge this response, but only asked what Lisa did to land her in jail. Matthewson takes this to be evidence that "t'it" in (2) doesn't trigger a presupposition of the same type as English presuppositions, though it has the same semantic content.
(2) wá7 t'it l-ti gélgel-a tsitcw k Lisa
    be also in-DET strong-DET house DET Lisa
    'Lisa is also in jail.'

As far as I know, Matthewson's claim has been largely ignored in the [...] Abrusán 2011) is to attempt to derive that some aspects of content must be presuppositional from general pragmatic and semantic principles. But if Matthewson (2006) is correct, that enterprise would be futile, since the parametric difference between English and St'át'imcets shows that the presuppositionality of some content is an arbitrary feature of [...] the reason Matthewson's work has been ignored is that the evidence she presents has not been as extraordinary as the claim Matthewson is making, and she should have done a more formal experiment on the matter. That Matthewson's claim is extraordinarily surprising is clear in this case: Matthewson (2006) herself writes that her claim is "somewhat radical" (p. 63). Her 2006 generalization also differs from her previous 1998 work, where she writes (p. 116) that "the lexical item corresponding to English 'too' induces presuppositions".²
I conclude therefore that Matthewson should have done formal experimental work to corroborate the central claims of her 2006 work.
Matthewson compares the performance of English adults on (1) with that of St'át'imcets adults on (2). But neither the two groups nor the two sentences are very similar: as for the two groups, Matthewson (2006) writes of her consultants: "My relationship with the consultants from whom data were obtained is a friendly one, and I have known each of [...]

² Matthewson [...] St'át'imcets cannot be presuppositional, consistent with the 2006 claim. Furthermore, she does preface the discussion I quote from in the text with the proviso that the full investigation of the relevant claims is beyond the scope of her 1998 work.

[...] well. Though some regard Galileo's experiments as the starting point of modern science, many big discoveries in science weren't based on experiments, but only on close observation of nature. Consider just one [...] evolution. Almost the entire body of evidence for evolution that Darwin based his theory on comes from observations. Darwin wasn't opposed to experiments (late in his life, he proved experimentally that earthworms improve the fertility of soil), but Darwin did not waste any time on establishing obvious facts underpinning his account of evolution such as the anatomy of the Galapagos fauna, and at his time of writing also [...] of evolution. Linguists too need to develop a taste for when quantitative data are useful, and when they are an impediment to progress.
To conclude, I take stock and attempt to derive some practical recommendations. It is impossible, though, to derive a precise general recommendation from the above considerations as to when to do a formal experiment and when it would be a waste of time. Clear cases of the former are any cases where experiments are also called for in well-studied languages: cases where the data are subtle, cases where additional measurements such as timing data or neurological data are expected to be revelatory, and cases where groups other than adult informants are under investigation. Clear cases of the latter are data that trustworthy language consultants judge to be clear and that conform to patterns of a better studied language. That leaves a large area where it is up to the individual researcher's judgment whether experiments are expected least consider and acquire the ability to perform formal, quantitative experiments. Especially this should be the case in situations where without being taking away a large amount of time from other methods like judgment elicitation and story elicitation.
Compared to the investigation of well-studied languages the situation advance in the case of well-studied languages has been the availability of internet-based platforms that allow researchers to conduct trials, especially Amazon's Mechanical Turk (a point of Gibson & Fedorenko's 2010 paper that I discussed above in section 2.1). The internet-based methods make it possible for researchers who don't have access to lab space and research assistants to conduct quantitative trials that gather judgment data from a large group of English speakers. But speakers of any indigenous language are unlikely to be available on Mechanical Turk; even for German and Japanese I found it impossible to get more than about 40 native speakers in a study conducted in 2011.
ways from recent technological progress and furthermore offers some distinct advantages. The two ways technological progress make it easier smaller and cheaper equipment make it possible to create and manipulate freely available, public domain software tools facilitate both the creation of stimuli and analysis of data from formal experiments. For the analysis, especially the Praat software for phonetic analysis and the R software for statistical analysis and the creation of graphs can nowadays replace expensive commercial software such as SPSS for research purposes.
Still it costs time to prepare and conduct formal experiments to gather consider doing so in selected circumstances. The typical scenario I have has already provided evidence in favor of the conclusions that a formal experiment then attempts to corroborate. Of course there are also cases of traditional psycholinguistics and experiments are needed to establish any usable evidence. But even in cases where language consultants' judgments provide some evidence, there may be reasons to conduct a formal experiment in addition. The foremost reason is that the evidence and the conclusions drawn from it are surprising; for example, they may contradict apparently well-established generalizations in linguistics. If this is the case, the simplest possible experiment would be to independently question several additional members of the community. After all, if all good reason may be that in situations of language endangerment the evidence might not be available later. So it may be the last chance to document any property of such a language with greater reliability. A related third reason is that in situations where an indigenous language is no longer the main daily language of most members of a community, individual consultants' judgments may be more uncertain than otherwise, or it may be desirable to determine whether all members of the community share the relevant judgment. For example, in work of my own with the Teiwa in Indonesia (Kratochvil, Hollebrandse, and Sauerland, in progress), we primarily worked with younger speakers as consultants. However, younger speakers were all literate in Indonesian and we turned to experimental techniques to determine whether older, illiterate speakers shared the relevant judgments. A fourth advantage of all members of the community, while typically one works every day with the some preferred consultants in judgment elicitation who gain some to be recorded. In this situation, enrolling all comers as participants in an experiment and providing some appropriate compensation for the effort is a way to engage the whole community in the study and receive too hard, but rather fun for the participants, so I offer some simple methods in the following.
For work in syntax and semantics, simple methods are focused on the task of either judging that one sentence sounds better (i.e. more grammatical) than another, or that one sentence is more acceptable language acquisition research with children since materials for children must be designed to be engaging. Of course, one shouldn't overdo this: infantilizing is no better received by members of an indigenous community than by adults elsewhere. It is also helpful to ask the community or the consultants one works with closely for suggestions on how to do this. My own experience comes from work on Teiwa mentioned above, Matses (Munroe et al. 2012), and Pirahã (Sauerland, to appear). The methods that are most easily and broadly applicable: we essentially compiled a list of testable examples working with a few primary informants. The list consisted primarily of questions of the type, can this sentence be used in this scenario? After the list was done, we went around with it and asked other speakers the questions on the list, and just reported their responses. In addition we did another experiment involving translations from Spanish into Matses. In the case of Teiwa and Pirahã, the goal was different from the one in Matses. For both languages, it was unclear in the language at all. In this case, we followed language acquisition research by using targeted elicitation: we created relevant scenarios and asked speakers to report relevant aspects of that scenario. So this was spontaneous production in a controlled situation. If it works, the results from this method provide strong evidence for the existence of the them spontaneously. But the work required, especially for the evaluation, is much greater than in acceptability or comprehension studies, and additional comprehension experiments seemed necessary to me in determine the best method and setup.
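The list-based procedure described above (asking several speakers the same "can this sentence be used in this scenario?" questions and reporting their responses) can be tabulated in a few lines. The following is only a minimal sketch under stated assumptions: the function name `tally_judgments`, the speaker and item labels, and the yes/no coding are all hypothetical illustrations, not the procedure or data of the studies cited here.

```python
from collections import defaultdict


def tally_judgments(responses):
    """Summarize yes/no acceptability judgments per test item.

    `responses` is an iterable of (speaker_id, item_id, accepted) tuples,
    where `accepted` is True if the speaker said the sentence can be used
    in the scenario. This coding scheme is an assumption for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # item_id -> [accepted, total]
    for _speaker, item, accepted in responses:
        counts[item][1] += 1
        if accepted:
            counts[item][0] += 1
    # Report raw counts alongside the share, so small samples stay visible.
    return {item: (yes, total, yes / total)
            for item, (yes, total) in counts.items()}


# Hypothetical responses from three speakers on two list items.
responses = [
    ("s1", "item1", True),
    ("s2", "item1", True),
    ("s3", "item1", False),
    ("s1", "item2", False),
    ("s2", "item2", False),
]

summary = tally_judgments(responses)
for item, (yes, total, share) in sorted(summary.items()):
    print(f"{item}: {yes}/{total} accepted ({share:.0%})")
```

Keeping the raw counts next to the proportions matters with the small speaker groups typical of fieldwork, where a percentage alone can suggest more certainty than the data support.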
results, even assuming that the experiment if properly designed and has been ignored and this lack of reception is possibly due to nonexperimental nature is provided by Matthewson's own work on Lillooet Salish. Matthewson that St'át'imcets lacks presuppositions of the type that English has which place a requirement on the common ground (Stalnaker, 1973 among others). She proposes that instead St'át'imcets can mark content as objective propositional content in the sense of Gauker (1998). To argue for this parameter Matthewson cites, on the one hand, unpublished experimental work by Conti (1999), which I couldn't access, and published work by herself (Matthewson et al. 2001) on English, and, on the other hand, reports on an informal experiment Matthewson