Section 9
Convolutional neural networks and readability evaluation for Russian texts
D. A. Morozov
Novosibirsk State University
Email: d.morozov8@g.nsu.ru
DOI 10.24412/cl.35065.2021.1.02.68
Automatically estimating the reader age for which a text is suited is a task of current interest in applied
linguistics [1]. An algorithm that estimates the age at which a text will be both understandable and
interesting has a broad range of potential applications in education and recommendation systems.
The classical methods for evaluating text readability are linear regressions on a small number of
simple features, which makes them unreliable. The development of natural language processing methods and
the use of neural network algorithms can significantly improve estimation accuracy. In this paper, a new
algorithm based on convolutional neural networks is presented. The main advantage of this approach is the
set of chosen features: we use a combination of classical features such as average sentence length, semantic
vectors, and some manually constructed abstract features. The training sample is collected from experimental
data on the real preferences of Russian school students, which increases the practical value of our work.
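As an illustration of the classical features mentioned above, the following minimal Python sketch computes average sentence length and average word length for a text. The function name `classical_features`, the regex-based sentence splitting, and the sample sentence are assumptions made for this example; they are not taken from the paper, and the sketch does not cover the semantic vectors or the convolutional model itself.

```python
import re


def classical_features(text: str) -> dict:
    """Compute two simple readability features for a text.

    Assumption: sentences end with '.', '!', '?' or '…' followed by
    whitespace. Real tokenizers handle abbreviations and quotes better.
    """
    sentences = [s for s in re.split(r"(?<=[.!?…])\s+", text.strip()) if s]
    # \w matches Unicode letters in Python 3, so Cyrillic words are counted.
    words = re.findall(r"\w+", text)
    n_sentences = max(len(sentences), 1)  # avoid division by zero
    n_words = max(len(words), 1)
    return {
        "avg_sentence_length": len(words) / n_sentences,  # words per sentence
        "avg_word_length": sum(len(w) for w in words) / n_words,  # characters per word
    }


# Made-up sample text, used only to show the call.
sample = "Мама мыла раму. Кот спит!"
features = classical_features(sample)
print(features)
```

Features of this kind could be concatenated with other inputs before being passed to a model; how the paper actually combines them is not specified here.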
The reported study was funded by RFBR, project number 19-29-14224.
References
1. Glazkova Anna, Egorov Yury, Glazkov Maksim. A Comparative Study of Feature Types for Age-Based Text
Classification. Analysis of Images, Social Networks and Texts, 2021.
Computer system for searching for chemicals with predefined properties
A. L. Osipov, V. P. Trushina
Novosibirsk State University of Economics and Management
Email: alosip@mail.ru
DOI 10.24412/cl.35065.2021.1.02.69
An important area of scientific research is the search for relationships between the structures of substances
and their various properties, including biological ones. The problem of predicting new promising compounds is
solved using data mining and machine learning methods. These methods make it possible to screen out unsuitable
compounds and leave a small percentage of compounds that can be investigated experimentally. Methods and
models for predicting the physico-chemical, medicinal, and biological properties of organic substances using
factographic databases have been developed. A virtual screening procedure has been created, which includes
an automated review of the database of chemicals and the selection of those for which the desired properties
are predicted [1]. The developed mathematical modeling methods and computer technologies significantly
narrow the search area for chemicals with the required properties.
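The virtual screening procedure described above can be sketched in Python as a pass over a compound database that keeps only the records whose predicted property score reaches a threshold. Everything in this sketch is hypothetical: the function `virtual_screen`, the record layout, the `toy_predict` stand-in for a trained model, and the numeric values are made up for illustration and do not reproduce the authors' system.

```python
from typing import Callable, Iterable


def virtual_screen(
    compounds: Iterable[dict],
    predict: Callable[[dict], float],
    threshold: float,
) -> list:
    """Return (compound, score) pairs with score >= threshold, best first."""
    scored = [(c, predict(c)) for c in compounds]
    hits = [(c, s) for c, s in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy database with made-up identifiers and descriptor values.
database = [
    {"id": "C1", "descriptor": 0.9},
    {"id": "C2", "descriptor": 0.2},
    {"id": "C3", "descriptor": 0.6},
]


def toy_predict(compound: dict) -> float:
    # Stand-in for a trained property model (an assumption for this sketch).
    return compound["descriptor"]


hits = virtual_screen(database, toy_predict, threshold=0.5)
hit_ids = [c["id"] for c, _ in hits]
print(hit_ids)
```

Only the compounds that pass the threshold would then be forwarded to experimental investigation, which is how such a filter narrows the search area.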
References
1. Osipov A. L., Bobrov L. K. The use of statistical models of recognition in the virtual screening of chemical
compounds // Automatic Documentation and Mathematical Linguistics. 2012. Vol. 46. No. 4. P. 153-158.