CC BY

CORPUS-BASED APPROACH TO LANGUAGE TEACHING

Ergasheva Mokhira Bakhridinovna

Master of Arts; English language teacher at Moscow State University (Lomonosov) in Tashkent

https://doi.org/10.5281/zenodo.10113169

Abstract. This report describes an action research project on the use of a corpus-based approach in an ESL classroom. The author compiled a corpus on the topic "Food and Nutrition" using the Sketch platform and used it to create several activities, which are provided in the Appendix. Twenty-four upper-intermediate students took part, and the majority found the approach engaging. The data were collected by interviewing the participants after the study. According to the participants, the corpus is best suited to learning collocations, both in and outside the classroom.

Keywords: corpus, corpora, concordance.

Introduction. As language learning has become a global trend, considerable attention has been paid to language teaching around the world. The demand for language learning, together with developments in technology, encourages teachers to find modern ways of dealing with language problems. As a computer-based approach, corpus linguistics offers "a ready resource of natural, or authentic, texts for language acquisition" (Reppen, 2010). Johansson (2009) makes a solid case for the usefulness of corpora in language courses by demonstrating how they might influence exam preparation, textbooks, activities, and syllabus design. The use of corpora in language instruction can be a useful technique for teaching vocabulary, grammar, and language use to EFL learners (Dazdarevic et al., 2015).

The report describes ways of introducing new vocabulary on the topic "Food and Nutrition" using a corpus-based approach. It first introduces the corpus, its type, and its usage; second, it provides a quantitative analysis of the corpus; third, it notes the limitations; next, it reports the activities designed on the basis of the corpus; and finally, it reflects on the experience of using those activities.

Methodology. The corpus was compiled in 2022 from authentic materials, including guidebooks on food and nutrition, several magazines, some professional articles, and an encyclopedia on food. The texts come from a single period (2010-2022), which makes the corpus synchronic. It is a written monolingual corpus of about one million words, which is quite small for its type (O'Keeffe, McCarthy, and Carter, 2007). The corpus was created for use in an ESL classroom; however, it can be called a learner corpus. To be more precise, learner corpora contain digital textual data in a language produced by foreign language learners (Pravec, 2002). It is a specialized corpus, as it is limited to one topic area: food and nutrition. The corpus can therefore also serve learners of English for Specific Purposes (ESP), for example, dieticians and nutritionists.

Corpora are most often associated with quantitative studies because frequency analyses are so simple to generate (Timmis, 2015). The table below (Figure 1) shows the most frequent words in the corpus.

Word (1,120 items | 122,800 total frequency)

Rank  Word          Frequency
1     and           37,001
2     a             17,151
3     are           7,952
4     as            6,319
5     an            2,776
6     al            2,387
7     also          2,337
8     at            2,222
9     about         1,400
10    all           1,363
11    am            1,279
12    adults        1,093
13    acid          943
14    associated    932
15    a.            916
16    acids         914
17    after         874
18    among         870
19    american      859
20    although      859
21    age           854
22    any           600
23    animal        647
24    amount        590
25    association   573
26    activity      556
27    americans     553
28    addition      535
29    available     507
30    added         495
31    animals       469
32    agriculture   424
33    amounts       417
34    adolescents   388
35    according     375
36    aged          362
37    average       359
38    adequate      349
39    another       345
40    against       328
41    ages          320
42    absorption    317
43    alcohol       314
44    african       313
45    around        313
46    assoc.        290
47    additional    272
48    analysis      263
49    amino         235
50    academy       225

Figure 1. Food and Nutrition word frequency list.
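A word frequency list of this kind can be approximated outside the corpus platform. The following is a minimal sketch, assuming a simple regular-expression tokenizer and a made-up three-sentence sample; the function name `word_frequency_list` and the sample text are illustrative and are not taken from the study.

```python
import re
from collections import Counter


def word_frequency_list(text, top_n=10):
    """Return the top_n (word, count) pairs, most frequent first."""
    # Lowercase the text and keep alphabetic runs (allowing inner
    # apostrophes or hyphens) as tokens; punctuation and digits are dropped.
    tokens = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    return Counter(tokens).most_common(top_n)


# Hypothetical sample standing in for the corpus files.
sample = (
    "Calcium intake is associated with bone health. "
    "Dairy foods are a major source of calcium. "
    "Adults and adolescents need adequate calcium."
)

for rank, (word, count) in enumerate(word_frequency_list(sample, top_n=5), start=1):
    print(rank, word, count)
```

A real corpus tool applies more careful tokenization than this sketch, so its counts may differ slightly for the same texts.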

The frequency lists below are based on lemmas. A lemma is the base form of a word (Timmis, 2015): for example, the lemma "eat" covers the word forms eats, ate, and eating.
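The idea of counting by lemma rather than by word form can be illustrated with a short sketch. It assumes a tiny hand-made lookup table (`LEMMA_OF`, hypothetical); corpus platforms rely on a full lemmatizer instead.

```python
from collections import Counter

# Hypothetical hand-made lookup table; a real tool would use a full
# morphological lexicon or a trained lemmatizer instead.
LEMMA_OF = {
    "eats": "eat", "ate": "eat", "eating": "eat", "eaten": "eat",
    "is": "be", "are": "be", "was": "be", "were": "be",
}


def lemma_frequencies(tokens):
    """Count tokens after mapping each word form to its lemma."""
    # Forms missing from the table are counted under their lowercase form.
    return Counter(LEMMA_OF.get(t.lower(), t.lower()) for t in tokens)


tokens = "She eats fruit and he ate fruit while they were eating".split()
counts = lemma_frequencies(tokens)
print(counts["eat"])    # eats, ate, and eating are counted together
print(counts["fruit"])
```

Grouping forms in this way is why a lemma list can rank a verb such as "eat" higher than any single one of its word forms would rank in a plain word list.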

The most frequent lemmas (Figure 2) are articles and prepositions. The list also shows that the materials used to build the corpus are mostly about health and diet; the frequency of the word "study" indicates that the corpus also includes academic articles reporting studies.

Lemma (19,357 items | 939,263 total frequency)

Rank  Lemma         Frequency
1     the           37,433
2     and           37,001
3     of            33,225
4     be            30,483
5     in            23,596
6     a             19,659
7     to            19,482
8     that          8,805
9     for           8,512
10    food          7,759
11    or            7,097
12    with          6,722
13    calcium       6,565
14    as            6,319
15    milk          6,299
16    have          5,865
17    on            4,925
18    from          4,327
19    intake        4,785
20    dairy         4,269
21    diet          4,204
22    it            4,182
23    by            3,944
24    fat           3,842
25    study         3,690
26    j.            3,491
27    not           3,482
28    vitamin       3,352
29    than          3,007
30    this          2,986
31    other         2,930
32    health        2,891
33    can           2,872
34    more          2,867
35    bone          2,817
36    dietary       2,780
37    blood         2,664
38    increase      2,643
39    use           2,643
40    effect        2,577
41    may           2,529
42    high          2,514
43    which         2,456
44    risk          2,454
45    et            2,410
46    al.           2,397
47    also          2,337
48    they          2,318
49    mg            2,262
50    al            2,222

Figure 2. Food and Nutrition frequency list of lemmas.

Figure 3 shows that the adjective "dietary" (2,714 occurrences) is used more often than "healthy" (848); it also reveals the difference between "important" (616) and "significant" (409).

Adjective (4,636 items | 102,681 total frequency)

Rank  Lemma         Frequency
1     dietary       2,714
2     other         2,581
3     high          2,344
4     low           1,836
5     more          1,731
6     fat           1,489
7     such          1,346
8     many          1,247
9     good          1,071
10    total         870
11    healthy       848
12    old           815
13    nutrient      795
14    small         784
15    less          753
16    whole         737
17    fatty         718
18    low-fat       565
19    large         663
20    most          646
21    same          636
22    great         628
23    important     616
24    human         562
25    raw           550
26    several       519
27    available     507
28    sweet         505
29    different     485
30    similar       477
31    fresh         475
32    young         474
33    due           472
34    white         467
35    common        459
36    few           456
37    clinical      441
38    american      427
39    beneficial    422
40    early         418
41    nutritional   418
42    significant   409
43    first         395
44    saturated     368
45    daily         368
46    major         363
47    adequate      349
48    colorectal    349
49    red           348
50    soft          338

Figure 3. Food and Nutrition frequency list of adjectives.

The list below (Figure 4) shows that the word "intake" (4,785 occurrences) is used about four times as often as "consumption" (1,169). Students can make assumptions about the frequency of words or lemmas on a particular topic and then check their hypotheses against the frequency lists.

Noun (13,322 items | 376,051 total frequency)

Rank  Lemma         Frequency
1     food          7,759
2     calcium       6,565
3     milk          6,193
4     intake        4,785
5     dairy         4,269
6     diet          4,110
7     study         3,617
8     vitamin       3,352
9     health        2,891
10    bone          2,769
11    blood         2,661
12    effect        2,570
13    risk          2,444
14    fat           2,328
15    mg            2,262
16    nutrition     2,096
17    cancer        2,078
18    day           2,031
19    body          1,964
20    year          1,912
21    protein       1,895
22    product       1,854
23    disease       1,796
24    child         1,781
25    woman         1,770
26    acid          1,728
27    fruit         1,587
28    weight        1,571
29    lactose       1,562
30    chapter       1,505
31    pressure      1,475
32    nutr          1,436
33    cheese        1,421
34    level         1,391
35    cholesterol   1,387
36    page          1,365
37    water         1,264
38    am            1,248
39    adult         1,242
40    source        1,183
41    consumption   1,169
42    nutrient      1,160
43    foods         1,134
44    calorie       1,101
45    group         1,098
46    vegetable     1,073
47    part          1,053
48    cup           1,006
49    amount        1,006
50    factor        976

Figure 4. Food and Nutrition frequency list of nouns.

Verb (2,455 items | 122,444 total frequency)

Rank  Lemma         Frequency
1     be            30,481
2     have          5,885
3     increase      2,155
4     use           2,002
5     do            1,865
6     reduce        1,738
7     consume       1,589
8     include       1,466
9     eat           1,464
10    find          1,445
11    make          1,389
12    contain       1,150
13    associate     936
14    serve         860
15    show          860
16    provide       847
17    compare       780
18    grow          780
19    cook          755
20    add           753
21    lead          660
22    need          651
23    indicate      616
24    see           611
25    call          584
26    age           582
27    help          580
28    follow        551
29    produce       543
30    recommend     541
31    prevent       533
32    contribute    502
33    improve       459
34    decrease      454
35    suggest       454
36    demonstrate   429
37    develop       427
38    come          426
39    randomize     426
40    know          410
41    take          404
42    feed          399
43    become        384
44    give          381
45    accord        379
46    influence     375
47    meet          370
48    report        363
49    vary          355
50    consider      351

Figure 5. Food and Nutrition frequency list of verbs.

Using the list above (Figure 5), learners can see the difference in frequency between "eat" (1,464) and "consume" (1,589), or between "provide" (847) and "give" (381). They can be taught that in an academic context the words "consume" and "provide" are used more often than their synonyms.
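Hypotheses of this kind can be checked with simple arithmetic on the frequency lists. The sketch below copies six frequencies from Figures 4 and 5; the dictionary `FREQ` and the helper `frequency_ratio` are illustrative names, not part of the study.

```python
# Frequencies copied from the noun and verb lemma lists (Figures 4 and 5).
FREQ = {
    "intake": 4785, "consumption": 1169,
    "consume": 1589, "eat": 1464,
    "provide": 847, "give": 381,
}


def frequency_ratio(word_a, word_b, freq=FREQ):
    """Return how many times more frequent word_a is than word_b."""
    return freq[word_a] / freq[word_b]


# Each pair is (suspected more frequent word, its near-synonym).
for a, b in [("intake", "consumption"), ("consume", "eat"), ("provide", "give")]:
    print(f"{a} vs {b}: {frequency_ratio(a, b):.2f}")
```

The ratios come out at roughly 4.1, 1.1, and 2.2, so the preference for "intake" and "provide" is marked, while "consume" is only slightly ahead of "eat" in this corpus.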

doc#0  Promote health and reduce chronic diseases associated Decrease sodium intake (2,400 milligrams or with diet and weight less daily) Weight status and growt

doc#0  rough health plans Nutrition counseling for medical conditions Increase fruit intake (2+ servings daily) Increase vegetable intake (3+ servings daily) Include nut

doc#0  edical conditions Increase fruit intake (2+ servings daily) Increase vegetable intake (3+ servings daily) Include nutrition counseling in physician office visits Foot

doc#0  on counseling in physician office visits Food security Increase grain product intake (6+ servings daily) Increase access to nutritionally adequate and safe foods

doc#0  ase access to nutritionally adequate and safe foods Decrease saturated fat intake (less than 10% of calories) for an active, healthy life Decrease total fat intak

doc#0  ntake (less than 10% of calories) for an active, healthy life Decrease total fat intake (no more than 30% of calories) 'Nutrition and Overweight is one focus area

doc#0  slight degree, so have deaths from some cancers.</s><s>On average, the intake of total fat and saturated fat has decreased.</s><s>Food labeling provides r

doc#0  at the recommended 5 servings of fruits and vegetables.</s><s>Overall, fat intake is decreasing (from 40 percent of calories in the late 1970s to 33 percent in

Figure 6. Food and Nutrition concordance lines for the word "intake".

The concordance lines above (Figure 6) show how the word "intake" combines with its collocates. O'Keeffe et al. (2007) describe concordancing as a way "to find every occurrence of a particular word or phrase" (as cited in Yilmaz and Soruç, 2015). Furthermore, the most frequent collocates, which occur within two or three words of the search word, can be seen in the concordance lines.
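The two operations described here, listing every occurrence of a word in context and counting the words that occur near it, can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the tokenizer is a plain regular expression, sentence boundaries are ignored, and the sample text is assembled from phrases visible in Figure 6. The function names are hypothetical and do not reproduce the corpus platform's own implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())


def kwic(tokens, node, width=4):
    """Return (left, node, right) context tuples for every hit of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, span=2):
    """Count the words found within `span` tokens of each hit of node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Window on both sides of the node; the node itself is excluded.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Sample assembled from phrases visible in Figure 6.
text = ("Decrease sodium intake. Increase fruit intake. Increase vegetable intake. "
        "Decrease saturated fat intake. Decrease total fat intake.")
tokens = tokenize(text)

for left, node, right in kwic(tokens, "intake", width=3):
    print(f"{left:>30}  [{node}]  {right}")

print(collocates(tokens, "intake", span=1).most_common(3))
```

Widening `span` to 2 or 3 corresponds to the "within two or three words" window mentioned above.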

According to Jones and Durrant (2010), three essential points should be taken into consideration when applying word frequency lists in a pedagogic context. For this corpus, they imply the following:

1. The corpus "Food and Nutrition" is intended only for writing reports and should not be applied to speaking activities. The word frequencies might be less relevant for learners of British English or other specific groups, as most of the journals come from American publishers.

2. The frequency lists generated above are about lemmas; however, words can also be treated similarly.

3. While creating activities for students, polysemous words were not dealt with since the essential meanings of the lemmas were used.

Although the corpus has many uses, its limitations should also be noted. First, the corpus was built to help students write about their diet; therefore, it consists mainly of scientific articles and magazines and does not benefit those who wish to see the frequency of a specific term in informal situations. Second, it is suitable only for high-intermediate and advanced learners and cannot be applied in elementary or low-intermediate classes.

Discussion. Based on this corpus, seven activities (see the Appendix) were designed for upper-intermediate ESL learners aged 17-18. The intended learning outcome of the lesson is that students should be able to use the new vocabulary in practice. The main aim of the activities is to familiarize students with the collocations so that they can use them in future report-writing tasks. The writing task should include the collocations discovered in the corpus and can also be assigned as homework. The activities offer learners an opportunity to explore the corpus directly, which enables Data-Driven Learning (DDL). Within the CLT paradigm, vocabulary learning is shifting away from teaching individual words and toward exposing students to lexical items in authentic and relevant situations (Balunda, 2009). Compared with typical vocabulary teaching approaches such as consulting a dictionary, viewing concordance lines has been shown to result in small but consistent gains in students' vocabulary knowledge, improved recall, and the learning of transferable word knowledge (Balunda, 2009). The beauty of DDL lies in its autonomy, as learners can discover new topics for themselves (Boulton, 2009).

The activities designed for this corpus can be used in the classroom or assigned as homework. The data can be accessed directly in the corpus or printed and distributed during the lesson. All activities involve forming hypotheses and testing them against further data, which turns the learners into researchers (Johns, 1991; Aston, 2001). This creates opportunities for learner autonomy, and the classroom becomes less teacher-centred (Gilquin et al., 2010). The activities cover the main collocations that need to be learned on the topic "Food and Nutrition": phrases with "protein", "carb", "fat", "consume", "nutrition", "food", "intake", and "eat".

The activity types are gap-filling, guessing, and sentence-making, and most of them require quantitative analysis of the corpus. The last activity, however, can serve as an example of qualitative analysis, in which larger corpora such as BNC or COCA are used for comparison.
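When figures from this corpus are set beside a much larger corpus such as BNC or COCA, raw counts are not directly comparable because the corpora differ in size. A common remedy is to normalise counts to occurrences per million tokens. The sketch below assumes that the "total frequency" of the lemma list (939,263) can be taken as the corpus size; the helper names are hypothetical, and no reference-corpus figures are supplied because the report does not give any.

```python
def per_million(count, corpus_size):
    """Normalise a raw count to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def relative_ratio(count_a, size_a, count_b, size_b):
    """Ratio of normalised frequencies: corpus A relative to corpus B."""
    return per_million(count_a, size_a) / per_million(count_b, size_b)


# Assumption: the lemma list's total frequency (Figure 2) is the corpus size.
CORPUS_SIZE = 939_263

# Raw count of the lemma "intake" from Figure 2.
print(round(per_million(4785, CORPUS_SIZE), 1))

# Reference-corpus counts are not given in the report, so they are left as
# parameters; look them up in BNC or COCA before calling relative_ratio().
```

A ratio well above 1 from `relative_ratio` would suggest that a word is characteristic of the food and nutrition domain rather than of general English.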

Conclusion. Reflecting on the experience of using the activities, several things should be noted. The corpus was introduced in two upper-intermediate classes; both classes received the activities in printed form. Of the 24 students, almost 80% were curious about the approach, which was new to them. The rest of the learners described it as "time-consuming". More than half of the students created a new corpus compiled from written essays on other topics to see the most frequent collocations. The majority of the learners liked the concordance lines, and they proposed compiling a corpus of the listening transcripts from their course book. One of them noted that it would be great to see the forms of collocations in context. Using the corpus turned the students into curiosity-driven learners (Aston, 2001). By the end of the lesson, the students had produced reports that included the collocations.

The need for new ways of learning and teaching languages is increasing; therefore, employing a corpus-based approach in a pedagogical context can contribute to students' success. This report has described the application of a corpus designed for B2-level learners on the topic "Food and Nutrition". Although this corpus offered several benefits, some limitations were observed too. The activities were designed to introduce vocabulary only; it would be a better idea to integrate some grammar-focused activities as well. I believe that in the future a corpus-based approach to teaching will become one of the most widely applicable.

REFERENCES

1. Almutairi, N.D. (2016). The Effectiveness of Corpus-Based Approach to Language Description in Creating Corpus-Based Exercises to Teach Writing Personal Statements. English Language Teaching, 9(7), p.103. doi:10.5539/elt.v9n7p103.

2. Aston, G. (2001). Learning with corpora. [online] Available at: https://www.semanticscholar.org/paper/Learning-with-corpora-Aston/08cfc19d84291306d240962f6d6a9115b810c264 [Accessed 4 Dec. 2022].

3. Balunda, S. (2009). Teaching academic vocabulary with corpora: student perceptions of data-driven learning. [online] Available at: https://scholarworks.iupui.edu/bitstream/handle/1805/2049/Balunda%20MA%20Thesis%20Teaching%20Academic%20Vocabulary%20with%20Corpora.pdf [Accessed 4 Dec. 2022].

4. Boulton, A. (2009). Testing the limits of data-driven learning: language proficiency and training. ReCALL, 21(1), pp.37-54. doi:10.1017/s0958344009000068.

5. Cheng, W. (2010). What can a corpus tell us about language teaching. [online] Available at: https://www.semanticscholar.org/paper/What-can-a-corpus-tell-us-about-language-teaching-Cheng/f0cb0b67a672486df52babc4b0ca2fa152f4f6eb [Accessed 4 Dec. 2022].

6. Dazdarevic, S., Fijuljanin, F. and Rastic, A. (2015). Using Corpus in Enhancing Reporting Verb Patterns in Teaching/Learning Process. Epiphany, 8(2), p.131. doi:10.21533/epiphany.v8i2.166.

7. Johansson, S. (2009). Some thoughts on corpora and second-language acquisition. Studies in Corpus Linguistics, pp.33-44. doi:10.1075/scl.33.05joh.

8. Johns, T. (1991). Should you be persuaded. Two samples of data-driven learning materials. [online] Available at: https://www.semanticscholar.org/paper/Should-you-be-persuaded.-Two-samples-of-data-driven-Johns/4b146bc51031fff7c159096da40524a2edbc098c.

9. O'Keeffe, A., McCarthy, M. and Carter, R. (2007). From corpus to classroom: language use and language teaching. Cambridge; New York: Cambridge University Press.

10. Pravec, N. (2002). Survey of learner corpora. [online] Available at: http://korpus.uib.no/icame/ij26/pravec.pdf [Accessed 4 Dec. 2022].

11. Reppen, R. (2010). Using Corpora in the Language Learning Classroom. Cambridge: Cambridge University Press.

12. Staples, S. (2013). Review: Gilquin, De Cock and Granger (2010) Louvain International Database of Spoken English Interlanguage. Louvain-la-Neuve, Belgium: Presses Universitaires de Louvain. Corpora, 8(2), pp.261-264. doi:10.3366/cor.2013.0043.

13. Yilmaz, E. and Soruç, A. (2015). The use of Concordance for Teaching Vocabulary: A Data-driven Learning Approach. Procedia - Social and Behavioral Sciences, 191, pp.2626-2630. doi:10.1016/j.sbspro.2015.04.400.
