ISSN 2658-5138
УДК 81
ББК 81.1
Язык науки и профессиональная коммуникация [Language of Science and Professional Communication]. 2022. № 2 (7)
DOI: 10.24412/2658-5138-2022-7-25-43
PRAGMATICAL, TERMINOLOGICAL AND FUNCTIONAL
ANALYSIS OF THE UN CORPUS
Elena Yu. Balashova, ORCID ID 0000-0002-9993-8116, Saratov State Law Academy,
1, Volskaya Str., Saratov, 410056, Russia, [email protected]
Ekaterina A. Bogacheva, ORCID ID: 0000-0003-0340-840X, Saratov State Law
Academy, 1, Volskaya Str., Saratov, 410056, Russia, [email protected]
Nikita I. Ilyukhin, ORCID ID: 0000-0002-8890-093X, Saratov State Law Academy,
1, Volskaya str., Saratov, 410056, Russia, [email protected]
Genre annotation of the UN Corpus has linguopragmatic and functional grounds in modern
linguistics. The UN Corpus contains legal terminology, which makes its annotation more
complicated. The analysis of legal papers and documents belonging to the UN Corpus shows that
some basic communicative functions (argumentative, informative, regulative, the function
of command, the analytical function), represented by a limited set of Functional Text Dimensions,
are also manifested in the legal corpus. In our study we offer a genre annotation of the UN Corpus
based on text-external and text-internal properties of genre prototypes. For this purpose, we describe
linguistic features, genre prototypes, key words and the most frequent collocations. Among
text-external features we include text-level features (for example, means of structuring a sentence
or a legal document, annexes, extra information: years and dates, references and citations, etc.) and
part-of-speech features (verb forms, pronoun forms). Text-internal features are considered
to be lexical (metaphors and comparisons, official clichés, legal terminology, verb groups such as
suasive verbs, necessity verbs, possibility verbs, modal verbs) and syntactic (amplifiers,
rhetorical means of argumentation, parentheses as a means of logical text organization, complex
sentences, adverbial phrases). The inter-textual and extra-textual corpus analysis shows that
communicative functions are connected with the linguistic features of corpus texts.
Key words: UN Corpus, Functional Text Dimensions, genre annotation, text-internal
features, text-external features, corpus analysis.
Introduction
Genre differences have complicated natural language processing and linguistic
annotation of corpus texts for years. Well-annotated gold standard data includes
sentence boundary detection, tokenization, part-of-speech tagging and syntactic
parsing. Corpora in special areas (for example, CRAFT – Colorado Richly Annotated
Full Text Corpus) sometimes may contain the annotation of concepts and specific
terms. Thus, one of the language processing tasks that the CRAFT creators performed
is terminology recognition, specifically of gene names and ontological concepts [1].
Like the CRAFT creators, V. Zakharov and M. Khohlova extract terminology
automatically and build a distributional thesaurus in their research [2; 3]. They
classify morpho-syntactic, lexical and semantic types of word combinability
and single out one-word terms and terminological collocations which are typical of
particular genres. As a rule, terminological collocations are normative and
statistically frequent. V. Zakharov and M. Khohlova describe theoretical aspects
of the statistical approach to the study of terminological collocations and practical
methods of identifying them, such as finding n-grams (normally bigrams or trigrams) within
the given context and determining syntagmatic and associative links between words.
One of the easiest methods is to make a list of the most frequent collocations for
a specific genre. Besides statistical criteria for the automatic extraction of terminological
collocations, the above-mentioned researchers point out methods based
on linguistic models, such as describing lexico-/morpho-syntactic patterns [2: 185]. This
makes it possible to analyze text-internal properties in a corpus-driven study.
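The frequency-list method mentioned above can be illustrated with a minimal sketch. It is not the procedure used in [2; 3]: the tokenizer is deliberately naive, and the sample sentences and the names tokenize and ngram_counts are assumptions introduced only for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (naive, no lemmatization)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Illustrative sample, not taken from the UN Corpus.
sample = (
    "The General Assembly adopted the draft resolution. "
    "The draft resolution was submitted to the General Assembly."
)
tokens = tokenize(sample)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)

# The most frequent bigrams form a candidate collocation list for the sample.
top_bigrams = bigrams.most_common(4)
```

On a real sub-corpus the same counts would be computed per genre, and the resulting lists would still need filtering (for example by part-of-speech pattern) before they can be treated as terminological collocations.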
Corpus data make it possible to address theoretical and practical issues of genre studies.
V. Dementiev and N. Stepanova propose classifying key phrases characteristic of
a number of speech genres. These key phrases may be described as the text
markers of speech genres and analyzed statistically with a corpus manager toolkit.
In addition, these scholars single out particular speech genres which are
easily identified by simple grammatical patterns: imperative – a request,
performative – an order, infinitive – a command, invective – an insult, pronoun
as a subject and compound nominal predicate with positive or negative
connotation – a compliment/a blame [4: 61]. V. Dementiev and N. Stepanova offer
the following algorithm of genre study on corpus data: genre – key lexeme – key
phrase(s) for genre identification – the number of occurrences in the corpus/sub-corpus
– key phrases for context analysis – the number of occurrences in the corpus [4: 57].
Still, genre annotation remains one of the most difficult tasks for corpus
creators and one of the most under-researched topics in papers on corpus linguistics.
FTD and genre annotation of UN Corpus: the correlation
of pragmatic functions and text properties
We believe that genre identification in corpora should be based on analysis
of pragmatic functions for communicative needs and linguistic features which are
typical for specific genres.
In the Functional Text Dimensions (FTD) framework [5] the former are
presented as text-external properties, as they are defined by the communicative
situation and their similarity to the genre prototypes, while the latter are presented
as text-internal properties in the form of linguistic features.
In our study we experimented with annotating the UN Corpus [6] with
respect to text-external and text-internal properties. In this paper we describe
genre prototypes, their linguistic features, key words and the most frequent
collocations for each function. For example, the common communicative functions
are the argumentative and regulatory ones, as well as that of providing
reference information. The range of text-internal features includes the distribution
of POS tags, lexical properties (such as metaphors and comparisons, official clichés,
legal terminology, verb groups such as suasive verbs, necessity verbs, possibility verbs,
modal verbs) and syntactic properties (amplifiers, rhetorical means
of argumentation, parentheses as a means of logical text organization, complex
sentences, adverbial phrases). We present our analysis of how communicative
functions are connected to linguistic features of texts in the UN Corpus.
Corpus annotation based on 20 functional text dimensions is suitable for any
traditional general reference corpus with a minimum of special terminology.
However, the analysis of legal papers and documents belonging to the UN Corpus
shows that some basic communicative functions, represented by a limited set
of functional text dimensions, are also manifested in the legal corpus. The study
of the UN Corpus allowed us to single out the basic FTDs for its annotation and to describe
the text-external and text-internal properties of corpus texts.
The FTD scale represents the following communicative functions of a text:
the argumentative function is realized in texts A1 (argumentative), A13 (ideopuff),
A17 (evaluative); the informative function (the reporting function is closely connected
with it) can be manifested in texts A8 (hardnews), A16 (info); the function
of command is represented in texts A7 (instruct), A20 (appell); the function
of analysis is realized in texts A14 (scitech); the regulative function is represented
in texts A9 (legal).
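The correspondence between communicative functions and FTD codes listed above can be written down as a small lookup table. The sketch below is only a convenience for working with annotated fragments: the dictionary contents follow the text, while the names FUNCTION_TO_FTD, FTD_TO_FUNCTION and functions_of are hypothetical.

```python
# Communicative function -> FTD codes, as listed in the text.
FUNCTION_TO_FTD = {
    "argumentative": ["A1", "A13", "A17"],
    "informative": ["A8", "A16"],
    "command": ["A7", "A20"],
    "analysis": ["A14"],
    "regulative": ["A9"],
}

# Reverse lookup: FTD code -> communicative function.
FTD_TO_FUNCTION = {
    code: function
    for function, codes in FUNCTION_TO_FTD.items()
    for code in codes
}

def functions_of(ftd_labels):
    """Return the functions covered by a sequence of FTD labels, in order of first appearance."""
    seen = []
    for label in ftd_labels:
        function = FTD_TO_FUNCTION[label]
        if function not in seen:
            seen.append(function)
    return seen

# A text annotated as a sequence of A9 and A16 fragments combines two functions.
combined = functions_of(["A9", "A16", "A9"])
```

Here combined evaluates to ["regulative", "informative"], which makes mixed-genre documents easy to spot once fragments carry FTD labels.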
Linguistic features are subdivided into typical lexical features, syntactic
features, part-of-speech features and text-level features [7]. The inter-textual and
extra-textual corpus analysis shows that communicative functions are connected
with linguistic features of corpus texts.
We present the description of linguistic features of UN Corpus texts
according to the principles of FTD annotation.
I. Argumentative function (FTD: A1, A13, A17)
Lexical features:
- Suasive verbs (require, encourage, recommend, propose, call upon, offer,
suggest, urge, appeal to etc.)
- Modal verbs (need, should, may, would)
- Verbs denoting mental processes (bearing in mind, taking into account,
consider, note, believe, feel)
- Necessity verbs (need, should)
- Possibility verbs (may, might)
- Evaluative adjectives (helpful, worrisome, peaceful, welcome, unfortunate,
sincere) and verbs with evaluative semantics (condemn)
Syntactic features:
- Amplifiers (do promote)
- Interrogations as a rhetorical means of argumentation
- Sentences with indefinite-personal subjects “one”, “who” (one could argue,
one might argue, no one knows)
- Impersonal sentences (It should be noted…; It must be kept in mind…;
It should be emphasized…)
- Composite conjunctions (on the one hand – on the other hand, in such a case
– in any case)
- Parentheses as a means of text logical organization (as mentioned above,
in addition, in this regard)
- Subordinate clause (if-sentence)
- Concessive clause (conjunctions “while” and “since”)
Part-of-speech features:
- Future verbs
- 1st person singular and plural subject
- Present verbs
Text-level features:
- Ordinal numerals structuring parts of a sentence (first, second, third…)
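As a rough illustration of how the lexical features above could be counted in a text, the following sketch matches tokens against word lists copied from this section (single-word items only). It is an assumption-laden simplification: there is no lemmatization, so inflected forms such as "recommends" are missed, and the names ARGUMENTATIVE_LEXICON and lexical_profile are hypothetical.

```python
import re
from collections import Counter

# Single-word items taken from the lexical feature lists above.
ARGUMENTATIVE_LEXICON = {
    "suasive": {"require", "encourage", "recommend", "propose", "offer", "suggest", "urge"},
    "modal": {"need", "should", "may", "would"},
    "mental": {"consider", "note", "believe", "feel"},
    "necessity": {"need", "should"},
    "possibility": {"may", "might"},
}

def lexical_profile(text):
    """Count how many tokens of the text fall into each lexical feature class."""
    tokens = re.findall(r"[a-z]+", text.lower())
    profile = Counter()
    for token in tokens:
        for feature, words in ARGUMENTATIVE_LEXICON.items():
            if token in words:
                profile[feature] += 1  # a token may belong to several classes
    return profile

# Illustrative sentence, not taken from the UN Corpus.
sentence = (
    "We believe that States should consider the proposal "
    "and may wish to recommend further steps."
)
profile = lexical_profile(sentence)
```

Because the classes overlap (should is both a modal and a necessity verb), the counts describe feature density and are not a partition of the tokens.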
II. Informative function (FTD: A8, A16)
Lexical features:
- Modal verbs (only for A8)
Syntactic features:
- Time clauses (only for A8)
- 3rd person singular and plural subject (he, she, they)
- Compound sentences with the conjunction “that”
- Interrogative sentences
- Indirect speech
Part-of-speech features:
- Past verbs
- Possessive pronouns (his, her) and personal pronouns (he, she)
Text-level features:
- Future years and dates (only for A8)
- Exact past dates and exact data – percentages, sums of money, statistics
(for A16)
- Annex in the text
III. The function of command (FTD: A7, A20 as a fragment in texts A16, A9)
Lexical features:
- Necessity modals
- Suasive verbs
Syntactic features:
- Conditional clauses (if-sentences)
Part-of-speech features:
- Imperatives (imperative verbs)
- Future verbs
Text-level features:
- Instructions have a strict logical structure and are divided into items and
paragraphs
IV. The function of analysis (FTD: A14)
This communicative function has no specific lexical features, but it possesses
a special recognizable syntax.
Syntactic features:
- Composite conjunctions and connective words structuring the text
(for instance, at the same time, primary, secondary or tertiary, in some cases)
- Homogeneous parts of the sentence
- Coordinating and subordinating conjunctions (on the contrary, therefore…)
for organizing the text
- Amplifiers
- Interrogations, metaphors and comparisons (e.g. The door is still left open)
- Complex sentences with comment clauses (e.g. While it is true that…),
adverbial phrases (typically, as a result, as a consequence)
Part-of-speech features:
- Present verbs
- 1st person pronouns (we describe, we recognize, we compiled data)
Text-level features:
- Well-organized and structured text with logical parts, paragraphs and
conclusions
V. Regulative function (FTD: A9)
Lexical features:
- Legal terminology (multilateral agreement, legislation, obligation)
- Modal verbs (shall, should, must)
- Official clichés
Syntactic features:
- Compound sentences with connective words
- Homogeneous parts of sentences
Part-of-speech features:
- Gerund forms (e.g. Stressing that…; Reaffirming…; Expressing…)
- 1st person plural pronoun as a subject (e.g. We acknowledge)
- Present verbs (in legislative draft documents)
Text-level features:
- Standard structure of legislative documents: introduction, general
provisions, specific provisions, recommendations, conclusions, annex
- Homogeneous gerunds and numbering of legislative decisions
in resolutions (e.g. Considering…, taking into consideration… 1. Expresses…, 2.
Invites…, 3. Requests…)
- References, citations and quotations of earlier documents
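The resolution structure described above (a run of gerund clauses followed by numbered decisions) lends itself to a simple surface heuristic. The sketch below is an assumption for illustration and not part of the annotation procedure: the two regular expressions and the name classify_clause are hypothetical, the sample lines are constructed from the example words given above, and the gerund pattern will also match unrelated words ending in -ing such as "Nothing".

```python
import re

# A clause opening with a capitalized -ing form, e.g. "Reaffirming ...".
PREAMBULAR = re.compile(r"^[A-Z][a-z]+ing\b")
# A numbered clause opening with a 3rd person singular verb, e.g. "1. Expresses ...".
OPERATIVE = re.compile(r"^(\d+)\.\s+[A-Z][a-z]+s\b")

def classify_clause(line):
    """Label a line as 'operative', 'preambular' or 'other' by its opening words."""
    line = line.strip()
    if OPERATIVE.match(line):
        return "operative"
    if PREAMBULAR.match(line):
        return "preambular"
    return "other"

# Constructed example, not quoted from the UN Corpus.
resolution = [
    "Reaffirming its previous resolutions,",
    "Stressing that further efforts are needed,",
    "1. Expresses its concern;",
    "2. Invites all States to cooperate;",
    "3. Requests the Secretary-General to report.",
]
labels = [classify_clause(line) for line in resolution]
```

A sequence of preambular labels followed by consecutively numbered operative labels is a strong surface cue for the A9 text-level feature; isolated matches are not.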
Legal genres via Functional Text Dimensions
All UN Corpus texts and their genres were categorized according to the FTD framework.
Genre annotation is a flexible technique that makes it possible to adapt the legal texts
of the UN Corpus to functional text dimensions. Structurally, UN Corpus texts can be
classified into 4 types:
1) Texts consisting of large fragments (sometimes more than
2 paragraphs) with distinct genre borders are preferable for genre
annotation. The most characteristic genre combination is A9 (legal texts) and A16
(informative texts).
./1994/a/49/pv_84.xml
A9 The draft resolution which the Sixth Committee recommends to the General Assembly
for adoption is reproduced in paragraph 8 of the report. Under the preambular part of the draft
resolution, the General Assembly would, inter alia, declare itself convinced of the continuing
value of established humanitarian rules relating to armed conflicts and the need to respect and
ensure respect for these rules in all circumstances within the scope of relevant international
instruments. It would also stress the need for consolidating and implementing the existing body
of international humanitarian law and for the universal acceptance of such law. Under the
operative part of the draft resolution, the General Assembly would, inter alia, note that,
in comparison with the Geneva Conventions of 1949, the number of States parties to the two
additional Protocols is still limited.
A16 The General Assembly would further request the Secretary-General to proceed with
the organization of the United Nations Congress on Public International Law, to be held from
13 to 17 March 1995, within existing resources and assisted by voluntary contributions, taking
into account the guidance provided by the Sixth Committee at the forty-eighth and forty-ninth
sessions of the General Assembly. The Assembly would also recognize the relevance
of international humanitarian law and, in this connection, would invite all States to disseminate
widely the revised guidelines for military manuals and instructions on the protection of the
environment in times of armed conflict received from the International Committee of the Red
Cross and to give due consideration to the possibility of incorporating them into their military
manuals and other instructions addressed to their military personnel.
2) Texts consisting of relatively small fragments (1-2 paragraphs) with
blurred borders and subgenres that replace one another. A9, A1, A14, A16
is a frequent combination. Genres change within one or two
paragraphs. Texts of this type are hardly suitable for genre annotation, but they
may be fruitful for the analysis of subgenres and borderline genres.
./2014/ctoc/cop/wg_3/2014/2.xml
A9 In addition to requiring States parties to deem offences under the Convention
extraditable under existing treaties, the Convention also encourages States parties to give effect
to its provisions when concluding new bilateral or multilateral agreements. Thus, pursuant
to article 18, paragraph 30, a State party concluding a new bilateral or multilateral agreement
on mutual legal assistance is to consider reflecting key provisions of article 18 of the
Convention in the new agreement, including the types of assistance that may be requested
(art. 18, para. 3); requirements for the content of the request (art. 18, para. 14); grounds
for refusal of the request (art. 18, para. 21); a prohibition on declining requests on the ground
of bank secrecy (art. 18, para. 8); and provisions on costs (art. 18, para. 28).
A1 The inclusion of such provisions in bilateral and multilateral agreements reflects the
modern judicial cooperation practice drawn from a wide range of treaties. In addition to the
provisions of the Organized Crime Convention itself, the Model Treaty on Extradition and the
Model Treaty on Mutual Assistance in Criminal Matters are valuable tools for the development
of bilateral and multilateral arrangements and agreements in the area of judicial cooperation.
A14 Of relevance are also the model bilateral agreement on the sharing of confiscated
proceeds of crime or property and model treaties on issues where the Organized Crime
Convention contains generic provisions, such as article 21, on the transfer of criminal
proceedings, and article 17, on the transfer of sentenced persons. From the perspective of the
Organized Crime Convention, many of the provisions on extradition and mutual legal assistance
closely reflect the approach of the Model Treaties. The Model Treaty on Extradition,
for example, contains 18 articles covering, inter alia, extraditable offences; mandatory and
optional grounds for refusal; channels of communication and required documents; simplified
extradition procedures; provisional arrest; surrender and postponed or conditional surrender;
the rule of speciality; concurrent requests; and costs. A review of bilateral extradition treaties
reveals both a number of similarities and differences with the Model Treaty on Extradition.
3) Texts with unclear genre borders, in which genres intermingle within
1-3 sentences. Texts of this type are not suitable for genre annotation.
./2007/s/2007/10/add_43.xml
A9+A8 Summary statement by the Secretary-General on matters of which the Security
Council is seized and on the stage reached in their consideration Addendum Pursuant to rule
11 of the provisional rules of procedure of the Security Council, the Secretary-General
is submitting the following summary statement. <…> The Security Council resumed its
consideration of the item at its 5771st (private) meeting, held on 29 October 2007 in accordance
with the understanding reached in its prior consultations. At the close of the meeting,
in accordance with rule 55 of the provisional rules of procedure of the Security Council, the
following communiqué was issued through the Secretary-General in place of a verbatim record:
On 29 October 2007, the Security Council, pursuant to resolution 1353 (2001), annex II,
sections A and B, held its 5771st meeting in private with the troop-contributing countries to the
United Nations Mission in the Sudan. The Security Council and the troop-contributing countries
heard a briefing under rule 39 of its provisional rules of procedure by Jean-Marie Guéhenno,
Under-Secretary-General for Peacekeeping Operations.
The analysis of the UN Corpus using the FTD technique allowed us to describe
its genre composition. The UN Corpus contains genre prototypes belonging to the
following functional text dimensions: A1 (argumentative), A7 (instructive),
A8 (news, coming events), A9 (legal), A14 (analytical), A16 (informative), A20
(appellative).
Genre composition of UN Corpus according to FTD annotation technique
FTD code | Genre prototype
A1 | policy recommendations; public speech; official letters;
conclusion/observations (the final part of a report can also be
argumentative); working paper (recommendations with guidance);
argumentative part of the report; recommendation, note.
Note: all types of recommendation documents belong to A1.
A7 | administrative instructions; technical instructions; working
paper (recommendations + guidance); white paper (its instructive
part); guidelines; information circular.
A8 | informative report; outline of schedule; programme
of meetings; provisional overview.
A9 | Act; agreement; amendment proposal; amendment; article
(article of declaration/convention/statute/charter); Bill; bulletin;
Case; Chapter of a doc; charter; circular; commitment;
Constitution; convention; court decision; court statement;
covenant; decision; declaration; decree; directive; draft; draft
manual; draft ministerial declaration; general or specific
provisions of articles; individual opinion; International Covenant;
judgement; legislative programme; memorandum; norms; optional
protocol; ordinance; outcome document; pact; programme
of work; proposal; protocol; provisional rules of procedure;
provisions; regulation; requirements; resolution; rules
of procedure; special provision; statute; technical regulation; note
by the President of the Security Council; views; White Paper
(commentary and analysis).
A14 | analytical part of White Paper; analytical report/review;
background; inquiry; legal analysis of legislation; legal analysis;
legal research/legal research paper; study.
A16 | a progress report; a verbatim record; agenda; analytical
report; annex; annotation; annual report; assessment report; audit
report; background paper; briefing; combined report;
communiqué; comprehensive report; conclusion; draft report;
evaluation report; executive summary; explanatory note; fact
sheet; final report; financial report; flagship report; follow-up
review; information circular; initial report; inventory report; joint
report; leaflet; letter; major report; memorandum; monitoring
report; news report; note; notification; official records; oral
briefing; oral report; outline report; performance report; periodic
briefings; periodic report; periodic review; progress report;
provisional agenda; quarterly report; recommendation report;
record; regional report; regular report; report on the
implementation of convention/legislation etc.; report; review
report; review; statement; statistical report; status report; summary
of the report; summary records; summary statement; survey;
synthesis report; the Trade and Development report; work
programme.
A20 | request, recommendations, programme of meetings.
D. Berūkštienė points out that different legal texts have different functional,
structural and linguistic features and are classified into genres on the basis
of different criteria [8: 89]. Still, it is sometimes difficult to assign a particular legal
text to a particular genre. FTD annotation makes it possible to take
all those features into consideration when analyzing the genre composition of the UN Corpus.
Texts belonging to the same genre should have similar lexico-grammatical
and functional properties. Swales holds that «a genre comprises a class
of communicative events, the members of which share some set of communicative
purposes <…> exemplars of a genre exhibit various patterns of similarity in terms
of structure, style, content and intended audience» [9: 58].
D. Berūkštienė treats legal texts as «special-purpose texts, which
are different from other kinds of texts in respect of their text-internal and text-external
properties» [8: 95-96]. Legal texts are drafted in legal language, which
may be defined as a language for specific purposes. J. Lehrberger described legal
language according to the following 6 text-external and text-internal parameters:
Text-external parameters:
1. Limited subject matter (law)
2. The use of special symbols
Text-internal parameters:
1. Lexical, semantic and syntactic restrictions (e.g. the use of legal
terminology)
2. “Deviant” rules of grammar
3. High frequency of certain constructions (e.g. formalized sentence patterns
in statutory texts)
4. Text structure (e.g. legislation or contracts) [10: 22].
Following text-internal parameters 1 and 3, we conducted a lexical and
quantitative analysis of the UN Corpus and determined that each functional text
dimension, together with the genres belonging to it, has a statistically significant list
of particular terms and collocations.
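The text does not say which statistic was used to establish significance. One common choice in corpus linguistics is the log-likelihood (G2) keyness score, which compares the frequency of a term in one sub-corpus with its frequency in the rest of the corpus. The sketch below shows that measure under this assumption; the function name log_likelihood and all counts are hypothetical.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood (G2) keyness of a term in a target sub-corpus versus a reference corpus."""
    total = freq_target + freq_ref
    # Expected frequencies if the term were spread evenly over both corpora.
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    g2 = 0.0
    if freq_target > 0:
        g2 += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * g2

# Hypothetical counts: a term occurs 50 times in 1000 tokens of one FTD sub-corpus
# and 10 times in 1000 tokens of the remaining texts.
score = log_likelihood(50, 1000, 10, 1000)

# 3.84 is the chi-square critical value for p < 0.05 with one degree of freedom.
is_key = score > 3.84
```

A term whose relative frequency is the same in both corpora scores zero, so only terms that are clearly over-represented in one FTD would enter its list.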
FTD code | Frequent terms and collocations
A1 | achieve; aim; a final consideration is that…; accordingly; another
advantage of; another consideration is; approach; as a matter of course;
as a result; as mentioned above; as noted above…; balance; based on this
position; beyond the example; conflict resolution; consider; development;
emphasis; factor; finally; for example; for example; for instance;
fundamental; further/furthermore; goal; however; however; therefore;
in fact; in order to; in our view; in this connection; in this context;
intention; international policies; issue; it could be argued; it is advisable;
it is important; it is important to emphasize; it is necessary; it is particularly
important; it may be advisable; it would seem that…; key; lack of; legally;
means; measures; moreover; need; on that basis; on the other hand; policy;
presumably; programme; promotion; purpose; recommend;
recommendation(-s); region; require; role; similarly; specifically; strategy;
submit; support; sustainable; taking into consideration; the above-mentioned;
the evidence implies; there is a strong need; this means/it means;
to achieve; tools; to stress the importance; to stress the need; we are of
the view; we are open; we recommend/propose; while
A7 | are required; are responsible; encourage; ensure; have to (modal verb);
in order to…; it is essential; it is expected that; it should be noted; please;
pursuant to rule…; this requires…; urge
A8 | a follow-up; expected outcome; in the near future; it is expected; now;
ongoing; onwards; outline; then; tomorrow
A9 | acknowledge; act; administration; administrative measures;
administrative; adopted unanimously; adoption; agreement; amend to read;
amendment; appeal; applicant; are subject to; article reads; article states
that…; article provides…; article; as follows; authorities; authority;
available; be entitled; be exercised; bilateral; bound; budget; state budget;
by law; case law; case; chamber; chapter; chief; citizen; civil action; claim;
complaint; concept; conformity with…; Constitution; constitutional checks
and balances; control; cooperation; council; count court; crime; criminal
proceedings; criminal responsibility; de facto; de jure; decide; decision;
defence; delisting; directive; discrimination; document; domestic law;
domestic legislation; draft law; drafting; effective remedy; embody;
enactment; enshrine; entitled; entry into force; enter into force; equality
before law; European Court of First Instance; European Union; executive;
expiry; fair trial; for the purposes; framework; free; freedom; fundamental /
human rights; good faith; guarantee; guidelines; guilt; have agreed
as follows; herein; human rights (economic, social, political, cultural, civil);
human rights law; implementation; in accordance with; in compliance with;
in order to; in respect of; indictment; individual; innocence; instruments
and programmes; inter alia; International Court of Justice; international
law; investigate; judgement; judicial review; judicial; Jurisdiction; jus
cogens; norm; justice; juvenile justice; law and practice; law enforcement;
law; legal authorities; legal basis; legal issues; legal obligation; legal
system/practice/entity/instrument; legality; legally binding; legislation;
legislative; legislative framework; list; listing; management; means
of redress; measures; mechanism; media law; model legislation; national
law; negotiation; notwithstanding; obligation; office; optional protocol;
organ; package of legislation; paragraph; part; parties; penal code; pending;
people; peremptory; person; petition; plea; policies and institutions;
policies; prevent; principle; private; procedure /-es; proceedings; progress
achieved; proposal; prosecuting; prosecution; prosecutor; protect;
protection; provision / article…reads / states; provision; pursuant to /
resolution…; article…; rate; ratification; ratify; read as follows; reference;
reform; refugee law; regions; regulations; regulatory; relevant to;
representation; resolution; respect; revised version; right/s; rules; rules
of procedure; rule of law; sanctions; section; sentence; separation
of powers; signature; staff; standing case law; state; stipulate; subject to;
take effect; terms; the proposed model; thereto; thereto; third party;
to amend; to ensure; to file a suit; to grant; to oblige; to provide; to read;
to rule; the rule of law; to state; to submit; to take measures; transmit;
treaty; trial; tribunal; violation; whereas; working group
A13 | challenge; citizens; civil society; democracy; democratic institutions;
democratization; development; diversity; equality; equitable; equity;
freedom; freedom of association; freedom of assembly; freedom
of expression; human rights; mutual respect; non-discrimination; people;
political rights; progress; promotion; social; social integration; social
justice; tolerance; value
A14 | academic institutions; accordingly; additionally; affect smth.;
alternatively; although; analogous; analysis showed; analysis; analyzing;
approach; as a result; as follows; as noted above; as previously noted;
as shown; as well as; aspect; at least partly; at present; at the end of the
paper; at the same time; background; basis; by definition; categories;
comprehensive overview; concept; conclude; consequently; criteria;
criticism; democracy; described above;
deserve attention; development; directly; due to; economics;
especially; examination; expert; factor; focus on the problems; focus; for
example; for instance/for example; for this reason; fundamentally; further;
furthermore; given…; guidelines; however; however; imply; in addition;
in comparison; in contrast; in fact; in other words; in part; in particular;
in practice; in recognition of this fact; in sum; in that sense; in the context;
in the light of; in this connection; in this regard; in this respect;
in this/that/the case/some cases; institute; issues; it is certainly known;
it is clear from; it is on the other hand; true that; it is questionable; it is to be
emphasized; it should be borne in mind; it should be noted; it should be
pointed out; it should be stressed; knowledge; lastly; legal bases; legal
scholars; likewise; mainstream; mechanism; mentioned above; more
specifically; moreover; moreover; nevertheless; on the ground of; on the
other hand; opponents; overview; paper; partly resulting from…;
phenomenon; policies; prerequisites for work; qualitative analysis;
questionnaire; rather; recent works on…; regarding; related; relevant;
representative; research results; research work; research; result (verb);
review; reveals; science; scope; scrutiny; similarly; since; so far; source;
statistics; strategy; studies; study carried out; survey; technologies; tend;
the following; the key question is…; the present paper; the question is; the
study aims at; therefore; thinking; this is due to the fact; this results…;
throughout its history; thus; to a certain extent; to argue; to present; to
some/the extent; to stem from; the fact; trend; type; typically; ultimately;
upon analysis; with regard to
A16 | access; accordingly; account; accused; ad hoc; adopt; adoption of the
report; agenda; analyst; announce; appeal; approach; as a consequence /
consequently; as a result; as mentioned in report; as of the date of this
report; assure smbd.; at its meeting; at the time of drafting the present
report; available from; be in a position; bilateral; briefing; case; chairman;
commission; committee; companies; conference; consideration; contained
in the report; continued report; contract; core issue; country; court; current
session; currently; data; decide; delegation; detailed information;
developed/less developed/developing countries; development; discuss;
documents; during the reporting period; education; efforts; emphasize;
employment; evidence; factor; fair; finally; furthermore; gap; government;
guidelines; having considered the report…; herein; highlight; human rights;
impact; implementation; improvement; in addition; in conclusion;
in contrast; in general; in particular; in past decades; in recent years;
in short; in that regard; in the context; in this respect; income; indictee;
indictment; inform smbd.; information; information provided/contained;
information, said that…; inter alia; international; internet access;
investigation; issue; it should also be pointed out; it was agreed; it was
mentioned; it was noted; it was reported that; item; labour market; lack;
latest report; lawyer; legal; letter; list; mandate; measures; meeting;
member states; millennium; monitoring; more importantly; multilateral;
nevertheless; nonetheless; officials; on a more positive note; on a regular
basis; on the one hand – on the other hand; organization;
panel discussion; participants; people; pilot; policies; present a report;
presentation; president; pre-trial; principle; problem; process; programme;
progress; project; promotion; prosecution; prosecutor; provide information;
rapporteur; rates; ratio; record; refugee; region; report concluded that…;
report indicates; report on smth.; reported cases; reporting; reporting
period; reporting status; research and analysis; resources; respect;
responsibility; review; safety; said that…; secretary; security; Security
Council; similarly; since; smbd. stated that…; staff members; state party;
state; statement; statistics; strategy; submit; summarize; summit; support;
tendency; testimony; The council decided; the report stated…; therefore; to
launch; to provide; to report back; to show that; to show; to welcome;
trend; trial chamber; trial; tribunal; United Nations; urge; website; witness;
working group; workplan
A17
commend; condemn; conditions; degradation; intolerance;
is concerned/concern; lack; slum dwellings; unhealthy; unsafe; violence;
welcome
A20
contact; encourage; for further details; indicate; note; please; provide;
recommend; request; urge
The analysis shows that, pragmatically, functionally and semantically,
FTDs A1, A9, A14 and A16 are the most promising for annotating the
UN Corpus. The markers of these FTDs are statistically significant, and they
include more frequent terms and collocations than those of FTDs A7, A8, A13,
A17 and A20.
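As a rough illustration of how marker lists of this kind could be applied, the following minimal sketch counts marker hits per FTD in a text. The phrases are a small selection from the marker table above; the selection itself, the dictionary layout and the function name are assumptions made for the example, not the procedure used in the study.

```python
import re
from collections import Counter

# Hypothetical, heavily abbreviated marker lists. The phrases come from the
# marker table above; which ones are included here is an arbitrary choice.
FTD_MARKERS = {
    "A16": ["during the reporting period", "it was noted", "member states", "working group"],
    "A17": ["commend", "condemn", "intolerance", "violence"],
    "A20": ["please", "recommend", "request", "urge"],
}


def count_markers(text, markers=FTD_MARKERS):
    """Count case-insensitive, whole-word marker hits per FTD."""
    lowered = text.lower()
    counts = Counter()
    for ftd, phrases in markers.items():
        for phrase in phrases:
            # \b keeps "urge" from matching inside "urged" or "urgent".
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[ftd] += len(re.findall(pattern, lowered))
    return counts


sample = ("During the reporting period the working group met twice. "
          "Please request further details.")
counts = count_markers(sample)
```

Raw counts of this kind would still need normalization by text length and a significance test before they could support any claim about statistical significance.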
Borderline genres and subgenres in the FTD framework
The analysis of text varieties in the UN Corpus shows that each functional
dimension has a specific set of genres and subgenres. However, there are many
borderline genres, such as assessment report, summary report, memorandum,
request, recommendations and conference programme, that belong simultaneously
to several FTDs (A1, A8, A9, A16, A20). These can be considered complicated
cases of genre annotation.
Some legal texts of the UN Corpus contain large fragments of legal subgenres
with distinct boundaries that perform specific pragmatic functions. For example,
large fragments belonging to FTD A17 perform an evaluative or critical function
in argumentative (A1) and informative (A16) texts. Fragments of FTD A13
perform the communicative function of propaganda in argumentative (A1),
analytical (A14) and informative (A16) texts.
Fragments of FTD A8 (descriptions of future events and activities) often
occur in informative texts (A16), in such subgenres as summary reports, work
programmes and conference programmes. An example from the subgenre of
performance report:
./2004/a/59/547.xml
A8 The Investigations Division is headed by the Chief of Investigations (D1), who is responsible for the overall management and efficient performance of the
Division, including the Information and Evidence Section and the Request for
Assistance Unit. The Chief is currently assisted in those functions by three
deputies (P-5). With the planned reduction in staff and the corresponding
reorganization of the Division in 2005, there will no longer be a need for three
deputy chiefs.
A16 Experience has shown that the resources approved for 2004-2005 were
not sufficient to support the increased workload of the Appeals Unit both in terms
of numbers and complexity of cases. To date, almost every person convicted by
the Tribunal has exercised his or her right to appeal and to bring a second level
case to the Appeals Chamber. The prosecution has also exercised its right to
appeal. In addition, numerous complex appeals are raised during trial to address
issues that require resolution before the trial can be completed.
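A fragment-level annotation of such a document could be recorded roughly as follows. This is only a sketch: the record layout, the class and the function names are hypothetical, the fragment texts are abbreviated, and only the path and the two FTD labels come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    ftd: str   # FTD label assigned to the fragment, e.g. "A8"
    text: str  # fragment text (abbreviated here)


# Hypothetical record for the performance report quoted above.
document = {
    "path": "./2004/a/59/547.xml",
    "fragments": [
        Fragment("A8", "With the planned reduction in staff ... there will "
                       "no longer be a need for three deputy chiefs."),
        Fragment("A16", "Experience has shown that the resources approved "
                        "for 2004-2005 were not sufficient ..."),
    ],
}


def ftd_profile(doc):
    """Return the distinct FTD labels of a document, ordered by number."""
    labels = {fragment.ftd for fragment in doc["fragments"]}
    return sorted(labels, key=lambda label: int(label[1:]))


def is_borderline(doc):
    """Treat a document as borderline if its fragments carry more than one FTD."""
    return len(ftd_profile(doc)) > 1
```

Keeping the label on the fragment rather than on the whole document is what allows an A8 passage inside an otherwise informative (A16) report to be represented at all.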
The main legal subgenres belonging to different FTDs are the following:
1) assessment report is a legal subgenre containing an analytical part (often
in the form of research – FTD A14) and an informative part (typical of any kind
of report – FTD A16);
2) the conclusive part of legal documents is often argumentative (FTD A1),
analytical (FTD A14) and informative (FTD A16) at once;
3) memorandum is an informative reference document containing views
and opinions (an argumentative part – FTD A1, and a legislative part – FTD A9);
4) work programme and programme of meetings are often borderline
genres, as they contain information about future events (FTD A8), instructions
(FTD A20), general and specific information (FTD A16) and sometimes legal
components (FTD A9);
5) recommendations and requests are complicated subgenres for
annotation due to their composite structure, with argumentative (FTD A1),
instructive (FTD A7), legislative (FTD A9) or appellative (FTD A20)
components;
6) summary/review reports are legal subgenres containing an informative
part (FTD A16) with information about future events (FTD A8);
7) analytical report is a borderline genre combining a research paper
or background paper (FTD A14) with a comprehensive informative report
(FTD A16);
8) working paper usually contains both legal recommendations and
instructions (FTD A1, A7, A9);
9) white paper, representing both commentary and legal analysis,
is a legal subgenre including argumentative (FTD A1), instructive (FTD A7) and
analytical (FTD A14) parts;
10) legal analysis of legislation is often incorporated into different legal
genres and usually combines legislative (FTD A9) and analytical (FTD A14)
parts.
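For annotation tooling, the list above can be transcribed into a simple lookup table. The sketch below is illustrative only: the FTD sets are taken from the ten items (with all labels written in the A-prefixed form), while the dictionary layout and the function name are assumptions.

```python
# Subgenre -> set of FTD components, transcribed from the list above.
BORDERLINE_SUBGENRES = {
    "assessment report": {"A14", "A16"},
    "conclusive part of legal documents": {"A1", "A14", "A16"},
    "memorandum": {"A1", "A9"},
    "work programme / programme of meetings": {"A8", "A9", "A16", "A20"},
    "recommendations and requests": {"A1", "A7", "A9", "A20"},
    "summary/review report": {"A8", "A16"},
    "analytical report": {"A14", "A16"},
    "working paper": {"A1", "A7", "A9"},
    "white paper": {"A1", "A7", "A14"},
    "legal analysis of legislation": {"A9", "A14"},
}


def subgenres_with(ftd):
    """List, in alphabetical order, the subgenres that include the given FTD."""
    return sorted(
        name for name, ftds in BORDERLINE_SUBGENRES.items() if ftd in ftds
    )
```

A query such as `subgenres_with("A8")` then shows at a glance which subgenres an annotator should expect to contain descriptions of future events.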
Conclusions
Genre annotation is one of the most complicated kinds of corpus annotation.
It presents several main difficulties in a corpus-driven study. Firstly, if
a corpus belongs to a sublanguage and contains a specific set of genres, its
genre annotation is an elaborate task for corpus creators, as a special
corpus contains many subgenres with blurred boundaries. Determining the
boundaries between genres of a special corpus belonging to a certain thematic
area (for example, a legal corpus) is always difficult. Secondly, any special
corpus contains numerous terms and terminological collocations, and this
terminology can impede qualitative and quantitative corpus analysis. Thirdly,
genre annotation implies a pragmatic study of extra-textual and intra-textual
properties according to their functions in corpus texts. The set of pragmatic
functions correlates with functional text dimensions (FTD).
Extra-textual properties comprise text-level features such as syntactical means
of organizing sentences and structuring texts, and extra-linguistic data (e.g.
future and past dates; exact information such as percentages, sums of money and
statistics; annexes in the text; references, citations and quotations of earlier
documents). Intra-textual properties include lexical features (e.g. verbal
categories: suasive verbs, necessity verbs, modal verbs, possibility verbs,
verbs denoting mental processes; evaluative adjectives; legal terminology;
collocations; clichés, etc.), syntactical features (subordinate clauses of
different types: conditional, time and concessive clauses; amplifiers;
interrogations; impersonal sentences; composite conjunctions; parentheses;
indirect speech; adverbial phrases; metaphors and comparisons, etc.) and
part-of-speech features (present, past and future forms of verbs; possessive
and personal pronouns; imperatives; gerund forms, etc.).
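A few of these properties lend themselves to simple surface counts. The sketch below is a minimal illustration under stated assumptions: the two word lists are hypothetical and deliberately small (the inventories used in the study are not given here), and the regular expressions cover only four-digit years and percentages.

```python
import re

# Hypothetical, deliberately small word lists chosen for illustration only.
MODAL_VERBS = {"shall", "should", "must", "may", "can", "might", "will"}
SUASIVE_VERBS = {"urge", "recommend", "request", "propose", "demand"}


def extract_features(text):
    """Count a handful of extra-textual and lexical cues in a text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        # Extra-linguistic data: four-digit years and percentages.
        "years": len(re.findall(r"\b(?:19|20)\d{2}\b", text)),
        "percentages": len(
            re.findall(r"\d+(?:\.\d+)?\s*(?:%|per cent)", text)
        ),
        # Lexical cues: exact matches against the word lists above.
        "modal_verbs": sum(token in MODAL_VERBS for token in tokens),
        "suasive_verbs": sum(token in SUASIVE_VERBS for token in tokens),
    }


features = extract_features(
    "In 2005 the Division will request a 10 per cent increase; "
    "States should comply by 2006."
)
```

Syntactical and part-of-speech features (clause types, verb tenses, imperatives) would require a tagger or parser and are outside what counts of this kind can capture.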
Thus, genre annotation of a special corpus representing a certain thematic area
makes it possible to analyze its statistically significant terminology, to
single out the functional set of genres and subgenres and to distinguish the
boundaries between them, and to describe the morpho-syntactical patterns and
extra- and intra-textual properties of corpus texts.
References
1. Verspoor, K., Cohen, K., Lanfranchi, A. (2012). A corpus of full-text journal articles is
a robust evaluation tool for revealing differences in performance of biomedical natural language
processing tools. BMC Bioinformatics, 13, 207-232 (in English).
2. Zakharov, V.P., Khohlova, M.V. (2014). Avtomaticheskoje vyjavleniye
terminologicheskih slovosochetaniy [Automatic identification of terminological phrases].
Strukturnaya i prikladnaya linguistika [Structural and Applied Linguistics], 10, 182-200
(in Russian).
3. Zakharov, V.P. (2015). Korpusnyj podkhod k postroeniyu tezaurusa i ontologii
[Corpus-Based Approach to Thesaurus and Ontology Construction]. Strukturnaya i prikladnaya
linguistika [Structural and Applied Linguistics], 11, 123-143 (in Russian).
4. Dementiev, V.V., Stepanova, N.B. (2016). Korpusnyje metody v issledovanii
rechevykh zhanrov [Corpus methods in the study of speech genres]. Vestnik RUDN. Serija:
Linguistika [Bulletin of Peoples' Friendship University of Russia. Linguistics], 20 (3), 57-76 (in Russian).
5. Sharoff, S. (2018). Functional Text Dimensions for annotation of Web corpora.
Corpora, 13 (1), 65-95 (in English).
6. Ziemski, M., Junczys-Dowmunt, M., Pouliquen, B. (2016). The United Nations parallel
corpus v1.0. Proceedings of the Tenth International Conference on Language Resources and
Evaluation (LREC’16). Portorož, Slovenia: European Language Resources Association (ELRA),
3530-3534 (in English).
7. Sharoff, S. (2021). Genre annotation for the Web: text-external and text-internal
perspectives. Register Studies, 3 (1), 1-32 (in English).
8. Berūkštienė, D. (2016). Legal discourse reconsidered: genres of legal texts.
Comparative Legilinguistics, 28, 89-117 (in English).
9. Swales, J. (1990). Genre analysis: English in academic and research settings.
Cambridge: Cambridge University Press (in English).
10. Lehrberger, J. (1986). Sublanguage analysis. Analyzing language in restricted domains:
sublanguage description and processing. New Jersey: Lawrence Erlbaum Associates, 19-38
(in English).
УДК 81.111’42
ББК 81.2 Англ.-5
DOI: 10.24412/2658-5138-2022-7-43-54
THE IMAGE OF THE STATE IN MASS MEDIA DISCOURSE
Alexandr A. Zaraiskiy, Doctor of Philology, Professor at the Department
of Foreign Languages. Saratov State Law Academy.
Saratov, Russia, [email protected]
It is hard to overestimate the role of mass media in the modern world. They sustain
the process of mass communication, that is, the interaction of all social actors,
and play a key role in the process of social cognition and self-cognition.
Mass media discourse is of key importance both in shaping public
opinion and in shaping an individual's picture of the world.
Mass media produce, reproduce and transmit value meanings to society,
thereby largely determining people's attitude to the reality
around them.
Keywords: discourse, mass media discourse, image of the state, evaluative
vocabulary, media text.
THE IMAGE OF THE STATE IN MASS MEDIA DISCOURSE
Alexandr A. Zaraiskiy, ORCID ID: 0000-0002-6928-713X, Saratov State Law
Academy, Saratov, Russia, [email protected]