Corpus Linguistics and Language Teaching (LFTM05)
BY SUBMITTING THIS PAPER, YOU CONFIRM THAT YOU COMPLETED IT ON YOUR OWN, WITHOUT CONSULTING FELLOW STUDENTS.
YOU ARE ALLOWED TO USE ANY MATERIALS YOU WISH TO HELP YOU ANSWER THE QUESTIONS.
Type your answers in the boxes beneath the questions by clicking on ‘Click here to enter text’. You can save this file as you go along.
PLEASE ENTER YOUR NAME IN THE BOX BELOW:
Click here to enter text.
SEND YOUR COMPLETED TEST PAPER TO ME BY EMAIL ([email protected]) ON OR BEFORE 6.00pm GMT ON MONDAY 14TH MARCH 2016. THIS TEST IS WORTH 25% OF YOUR TOTAL MARK FOR THE MODULE. THERE ARE 100 MARKS AVAILABLE.
SECTION A [36 marks]
This section contains questions on using the Corpus of Contemporary American English (COCA).
What search string is needed to carry out the following searches in COCA? Just write the search in the answer boxes, nothing else.
A search for all instances of the lemma GROW. 
A search for synonyms of sad. 
All instances of the word change functioning as a verb. 
All sequences of ‘in the X of a Y Z’ where X = any noun, Y = any adjective and Z = any noun. 
All forms of the verb get followed by an –ing participle. 
‘What a(n)’ followed by any noun. 
The base form of all verbs beginning with the consonant cluster pl-. 
What is the most frequent item belonging to the following word classes in the whole of COCA? Just write the word in the answer boxes, nothing else.
Reflexive pronoun 
Plural noun 
Modal verb 
Comparative adjective 
List the top four singular nouns in the following sections of COCA. (You should write four words in each box, nothing else.)
Science fiction/fantasy 
Way, man, head, hand
Academic (Humanities) 
Music, art, work, education
Academic (Medicine) 
Health, study, patient, care
News (Money) 
Company, business, market, money
In the whole of COCA, what are the most frequent collocates of the following words and phrases?
Most frequent singular noun collocate of the word blue occurring in a span of 4 to the right of the node? 
Most frequent adjective collocate of the word feels occurring in a span of 1 to the right of the node? 
Most frequent plural noun collocate of Obama occurring in a span of 4 to the right and 4 to the left of the node? 
SECTION B [20 marks]
This section contains questions on using BNCweb.
What is the most economical query syntax required for the following searches? Just write the search in the answer boxes, nothing else.
All adjectives ending in -ly. 
Any word ending in -ful. 
All nouns ending in -sation or -zation. 
Using the ‘frequency list’ function, answer the following questions:
What is the most frequent man’s first name beginning with B in the whole BNC?  Brian
What is the most frequent city beginning with C in the whole BNC? 
What is the most frequent interjection in the whole of the BNC? 
Use ‘standard query’ and ‘distribution’ to find out about the distribution of the lemma CHAT (use Simple Query Syntax help if you need to). Answer the following questions.
Is the verb chat more common in ‘fiction and verse’ or ‘spoken conversation’? 
Is the verb chat more common in written texts by males or written texts by females?  Females
Is the noun chat more common in ‘fiction and verse’ or ‘spoken conversation’? 
The table below shows the distribution by gender of the word darling in the demographically sampled part of the spoken component of the BNC. What are the values for A, B, C and D? 
The following distribution was found:
No. of words
No. of hits
Dispersion (over speakers)
Frequency per million words
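The ‘frequency per million words’ column is the number of hits scaled to a notional one million words, so that groups of speakers who produced different amounts of text can be compared directly. A minimal Python sketch of that arithmetic follows; the function name per_million and the example figures are illustrative assumptions, not values from the BNC and not part of BNCweb.

```python
def per_million(hits: int, words: int) -> float:
    """Return the number of hits per one million words of text."""
    if words <= 0:
        raise ValueError("words must be a positive count")
    # Multiply before dividing so that round figures stay exact.
    return hits * 1_000_000 / words


# Hypothetical figures for illustration only (not taken from the BNC).
example_hits = 50
example_words = 2_000_000
print(per_million(example_hits, example_words))
```

The same calculation applies to each row of the table: divide that row's hits by that row's word count, then scale to one million.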
SECTION C [6 marks]
This section contains questions on using AntConc.
Load An Outcast of the Islands (in SunSpace Unit 5) into AntConc and answer the following questions (remember to ‘treat all data as lower case’).
What is the most frequent word in the novel? 
What is the most frequent 3-word cluster/n-gram in the novel? 
Using the default wild-card settings, what query syntax is needed to search for all words ending with the sequence –ms? 
Is the concordance plot below for Willems or Lingard? 
SECTION D [32 marks]
This section tests your knowledge of concepts in corpus linguistics and language pedagogy. It consists of a series of definitions. You have to supply the words or phrases which match the definitions and write them in the box.
The typical grammatical patterning into which a word or grammatical construction enters. 
The number of words either side of a search item to set the parameters when searching for co-occurrence patterns. 
A large corpus which attempts to represent the general usage of a language through the careful selection of a cross-section of texts. 
The tendency for certain words to be related to a particular area of evaluative or attitudinal meaning, even when the meaning of the word itself appears to be evaluatively neutral. 
A measurement of the proportion of lexical words to function words in a text. 
An approach to learning in which students ‘discover’ rules and probabilities from the corpus examples they find. 
A process to enable direct comparisons of frequency to be made when comparing texts and corpora of different sizes. 
Key Word In Context (KWIC)
The onscreen display of a concordance in which the search word is centred, with co-text on either side. 
The term used to describe a corpus used as a background against which to compare findings from another corpus (particularly in relation to the analysis of keyness). 
The canonical form of a word. 
The process by which each word in a corpus is assigned a label according to which word class it belongs. 
The relationship between an individual word and a set of semantic categories. 
This class of words is sometimes referred to as grammatical words and consists of pronouns, prepositions, determiners, conjunctions, etc. 
A word which is found significantly more often in one corpus when compared with a reference corpus. 
A measure of the rate of occurrence of a word or phrase across a particular file or corpus. 
The default statistical measure of keyness in AntConc.