Application of neural networks in education: opportunities and challenges
Sokolova Alla Germanovna
Candidate of Technical Sciences, Docent, Associate Professor at the Department of Foreign Languages and Professional Communication, National Research Moscow State University of Civil Engineering, [email protected]
Arkhipov Alexander Vladimirovich
Candidate of Geographical Sciences, Docent, Associate Professor at the Department of Foreign Languages and Professional Communication, National Research Moscow State University of Civil Engineering, [email protected]
Modern education has remained largely unchanged for many years against the backdrop of rapid change in various spheres of modern society. Providing future specialists with relevant, high-quality education is one of society's most pressing problems. However, the assessment of education quality remains almost unexplored, even though most universities realize that the problem cannot be solved without an internal information-analytical system. Creating such a system is therefore an urgent task of great importance for the management and development of higher education institutions. Modern approaches to data analysis in information technology will allow a university to manage its activities effectively. Online learning systems, with the hidden structures and patterns in their data, are invaluable in education because they make it possible to develop flexible, adaptive, individualized offerings of educational resources and to gain deeper understanding. Deep learning networks are currently regarded as the most promising tool in this direction, while choosing the most suitable deep network architecture in terms of performance and capabilities requires experience and professional skill.
Keywords: neural networks, education, learning path, recurrent networks, artificial intelligence, deep learning technology.
Introduction
What are neural networks?
As a rule, growth in student and staff numbers, the rapid development of the university, participation in priority government programs, and the ever-increasing volume of information in various departments all become prerequisites for a university to create its own information system. Another important condition is the existence of a university-wide computer network for collecting and processing information about employees, students, the university's scientific activity, its material and technical base, and the demand for graduates in the labor market.
However, it is no longer sufficient merely to develop information systems that collect data on the university's activities. Effective means are needed to analyze the collected information and assess the quality of education. This is the only way to transform a university information system into an information-analytical system. With such a system in place, the university gets an opportunity to assess the following aspects of education quality:
- faculty quality;
- quality of knowledge received by students;
- state of the material and technical base of the university;
- competitiveness of the university's graduates in the labor market.
One way to address this problem is to introduce technical aids, training systems, and Internet-based learning.
The development of learning systems is currently a popular and rapidly growing research area, owing to renewed interest in the practical use of artificial intelligence technologies and to the intensive development of Internet technologies, which give engineers productive development tools that did not exist before. As a result, a large number of studies have been published on the topic, hundreds of training systems have been developed, and distinctive methodological approaches have been implemented.
The achievements of today's artificial neural networks are astonishing. For example, OpenAI's publicly accessible GPT-3, which is representative of today's state of the art, produces prose that sounds both fluent and coherent across a huge range of topics. Cars now drive themselves in complicated traffic situations. Robots load and unload dishwashers without chipping a cup. AlphaGo, a program developed by DeepMind (a subsidiary of Alphabet), defeated the top professional Go player Lee Sedol in 2016. Networks are able to translate complex, highly idiomatic passages in the blink of an eye. They predict protein folding better than human experts. Near-perfect transcription of rapid-fire speech in real time is also possible. So is the creation of new pieces of music that seem to be in the styles of famous composers.
Currently, there are many approaches to building training systems, among which artificial neural networks occupy a leading place. Neural networks are computational structures that model simple biological processes, usually those associated with the human brain. Their main properties include the ability to learn, generalization, parallelism, distributed representation of information and computation, adaptability, low power consumption, contextual processing of information, and fault tolerance.
Artificial neural networks can solve problems such as image classification, clustering and categorization (unsupervised learning), function approximation, prediction and forecasting, optimization, content-addressable memory, pattern recognition, and control.
Given their increasing diffusion, deep learning networks have long been considered an important subject on which teaching efforts should be concentrated to support fast and effective training. Beyond that role, the availability of rich data from several sources underlines the potential of neural networks as an analysis tool for identifying critical aspects, planning upgrades and adjustments, and ultimately improving the learning experience. Analysis and forecasting methods have been widely used in this context, allowing policy makers, managers, and educators to make informed decisions. The capabilities of recurrent neural networks, in particular Long Short-Term Memory (LSTM) networks, in the analysis of natural language have led to their use in measuring the similarity of educational materials. Massive Open Online Courses (MOOCs) provide a rich variety of data about the learning behavior of online learners. The analysis of learning paths provides insights into the optimization of learning processes as well as the prediction of outcomes and performance. Another active area of research concerns the recommendation of suitable personalized, adaptive learning paths based on varied data sources, including even eye-movement tracking. In this way, the transition from passive to active learning can be achieved. This paper attempts to outline the challenges and opportunities of applying neural networks in the educational sector.
Learning useful representations from raw data means extracting relevant information in a compact form and removing redundant information as well as noise; in other words, constructing a simplified model that explains the observed data. Analysis of the obtained representation can highlight latent factors, disclose previously unseen relationships among variables, and ultimately help gain useful insight into the phenomenon being observed. Finding a good representation is crucial in many research fields where data come from several sources and are highly complex. Neural networks are a widely used and successful representation learning technique. As their name suggests, they are inspired by the structure of the cortex in the human brain. They consist of a number of units connected to form a directed graph (undirected in the case of Boltzmann machines). A unit takes as input a weighted sum of the outputs of the units connected to it and produces its output by applying a nonlinear activation function to that sum; typical activation functions are the hyperbolic tangent and the logistic sigmoid. The neural computation model has some attractive theoretical properties, and neural networks can be shown to be universal approximators (Goodfellow et al., 2016).
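The computation performed by a single unit can be illustrated with a minimal Python sketch. The input values, the weights, and the optional bias term are assumptions made for this illustration and do not come from any of the cited systems.

```python
import math

def logistic(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, bias=0.0, activation=math.tanh):
    """Apply the activation function to the weighted sum of the inputs.

    The bias term is an assumption of this sketch; the text above
    mentions only the weighted sum.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)

# Illustrative values only: outputs of three upstream units and their weights
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.1]

print(unit_output(inputs, weights))                       # hyperbolic tangent unit
print(unit_output(inputs, weights, activation=logistic))  # logistic sigmoid unit
```

A layer is simply a set of such units that share the same inputs but have different weights.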
Neural networks learn from a collection of training samples. Training is usually done by stochastic gradient descent, where the gradient of the loss function (which quantifies the prediction error) with respect to the network parameters is computed by the backpropagation algorithm. To keep the architecture simple, restrictions are applied to the topology of the network: units are arranged in layers, with connections only between units in adjacent layers, and the intermediate layers are called hidden layers. Neural networks with at least two (some authors say three) hidden layers are called deep learning networks. It is this hierarchical structure that gives deep networks the ability to build powerful representations. Each layer works on the intermediate representation constructed by the previous layers, so that internal representations reach increasing levels of "abstraction".
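The training procedure can be sketched as follows for a network with a single hidden layer of three tanh units and a linear output unit. The data set, the starting weights, the learning rate, and the number of epochs are all assumptions chosen for illustration. For reproducibility the samples are visited in a fixed order, whereas practical stochastic gradient descent usually shuffles them or uses mini-batches.

```python
import math

# Tiny illustrative regression data set: targets follow y = 0.5 * x
DATA = [(x / 4.0, 0.5 * x / 4.0) for x in range(-4, 5)]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: one tanh hidden layer, one linear output unit."""
    hidden = [math.tanh(w * x + b) for w, b in zip(w_hidden, b_hidden)]
    y_hat = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return hidden, y_hat

def mean_loss(params):
    """Mean squared prediction error over the data set."""
    return sum((forward(x, *params)[1] - y) ** 2 for x, y in DATA) / len(DATA)

def train(epochs=200, lr=0.1):
    # Fixed, arbitrary starting weights so that the run is reproducible
    w_hidden, b_hidden = [0.5, -0.3, 0.8], [0.0, 0.0, 0.0]
    w_out, b_out = [0.1, 0.2, -0.1], 0.0
    initial = mean_loss((w_hidden, b_hidden, w_out, b_out))
    for _ in range(epochs):
        for x, y in DATA:  # one parameter update per training sample
            hidden, y_hat = forward(x, w_hidden, b_hidden, w_out, b_out)
            d_out = 2.0 * (y_hat - y)  # gradient of the squared error w.r.t. y_hat
            for j, h in enumerate(hidden):
                # Backpropagate through the output weight and tanh (derivative 1 - h^2)
                d_pre = d_out * w_out[j] * (1.0 - h * h)
                w_out[j] -= lr * d_out * h
                w_hidden[j] -= lr * d_pre * x
                b_hidden[j] -= lr * d_pre
            b_out -= lr * d_out
    final = mean_loss((w_hidden, b_hidden, w_out, b_out))
    return initial, final

initial, final = train()
print(f"loss before training: {initial:.4f}, after training: {final:.6f}")
```

Each update moves the parameters a small step against the gradient of the error on one sample; the gradient for the hidden layer is obtained by propagating the output error backwards through the output weights.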
Results and findings. Application to the educational sector
Psychological studies of human and animal learning have been a conspicuous source of inspiration for machine learning paradigms. In its general meaning of automatically deriving knowledge from experience, where the experience is crystallized in data, machine learning is particularly attractive for the educational sector. There are two main reasons for this. Firstly, the educational environment is extremely complex, and few assumptions can be made about the data distribution. Secondly, vast amounts of data are becoming available for exploration. Useful applications of machine learning in education cover an array of objectives (Coelho & Silveira, 2017). Accurate monitoring of students' states during learning can support personalized, flexible, and adaptive learning, with a direct benefit for students and an increased retention rate for providers. Student modeling can be based on several data sources, including interaction logs, facial features, and eye movements.
The use of deep learning models in education gained momentum in 2015, when a prediction system for student performance was introduced (Guo et al., 2015). An attractive benefit of such a system is its ability to provide early warnings, so that students at risk can be identified while there is still time for corrective action. While applying deep learning and recurrent neural network (RNN) models in an educational context is clearly desirable, the scenario creates some unique challenges that need to be addressed. In particular, educational data are often inhomogeneous and redundant, especially in the detection of student boredom, and these issues should be dealt with in a timely manner.
Designing handcrafted features to represent student behavior can be challenging (Bosch & Paquette, 2017). Unsupervised autoencoders are trained to find data embeddings, that is, mappings to low-dimensional spaces that (a) improve the performance of classifiers and (b) can reveal interesting insights in the data by emphasizing previously unseen connections. Regardless of their use as building blocks in modular architectures of complex neural networks, the embeddings themselves can be analyzed separately, in search of clues about unexpected associations evidenced by spatial closeness in the simplified representation.
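The idea of learning an embedding without labels can be shown with a deliberately simplified sketch: a linear autoencoder that compresses two-dimensional samples into a one-dimensional code and reconstructs them. This is not the deep architecture used in the cited work; the data, the starting weights, and the hyperparameters are assumptions made for illustration.

```python
# Illustrative 2-D samples that lie on the line x2 = 2 * x1,
# so one number per sample is enough to describe them
SAMPLES = [(-1.0, -2.0), (-0.5, -1.0), (0.5, 1.0), (1.0, 2.0)]

def encode(x, w):
    """Map a 2-D sample to its 1-D embedding."""
    return w[0] * x[0] + w[1] * x[1]

def decode(z, v):
    """Reconstruct the 2-D sample from its 1-D embedding."""
    return (v[0] * z, v[1] * z)

def reconstruction_error(w, v):
    """Mean squared reconstruction error over all samples."""
    total = 0.0
    for x in SAMPLES:
        r = decode(encode(x, w), v)
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(SAMPLES)

def train_autoencoder(epochs=300, lr=0.02):
    w = [0.1, 0.1]  # encoder weights (arbitrary small starting values)
    v = [0.1, 0.1]  # decoder weights
    for _ in range(epochs):
        for x in SAMPLES:  # no labels are used: the target is the input itself
            z = encode(x, w)
            r = decode(z, v)
            err = (2.0 * (r[0] - x[0]), 2.0 * (r[1] - x[1]))  # dLoss/dr
            dz = err[0] * v[0] + err[1] * v[1]                # dLoss/dz
            for i in range(2):
                v[i] -= lr * err[i] * z
                w[i] -= lr * dz * x[i]
    return w, v

before = reconstruction_error([0.1, 0.1], [0.1, 0.1])
w, v = train_autoencoder()
print(f"reconstruction error: {before:.3f} -> {reconstruction_error(w, v):.6f}")
print("embeddings:", [round(encode(x, w), 3) for x in SAMPLES])
```

After training, samples that are close in the original space receive nearby codes, which is the property that makes embeddings worth inspecting on their own.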
In a personalized and adaptive learning environment, the learning path is not fixed but flexible: it is continuously adapted to the student's individual characteristics and knowledge level so that students can achieve their learning objectives in the shortest time possible. Customized recommendation systems make personalized learning paths possible for different individuals by capitalizing on the experience of others. Recommendation systems should be optimized in terms of diversity, novelty, and interaction intensity. In early recommendation systems, collaborative filtering derived recommendations for a learner from what had been preferred in the past by learners with similar tastes. To group learners with similar preferences, it is natural to think of clustering algorithms based on various similarity metrics (Pelánek, 2019). The sparsity and volume of the data, however, call for solutions that scale better. Kim et al. (2017) combined Probabilistic Matrix Factorization with a Convolutional Neural Network (CNN) to model contextual information and account for Gaussian noise. The features used to represent learning resources need to take some fundamental assumptions into account (Zhou et al., 2018). In particular, some knowledge is regarded as essential in a learning plan and must be included in any path related to that plan. Zhou et al. (2018) used an LSTM predictor for learning paths, in particular because of its ability to handle sequences of different lengths. In contrast, Kim et al. (2017) preferred a CNN to an LSTM or GRU because of its faster training. In fact, CNNs, owing to their fixed structure, can use simple backpropagation, whereas recurrent networks have to resort to backpropagation through time in order to capture long-term dependencies.
The relationship between learners, items, and tags can be represented by a tripartite graph, which was originally static and based on historical information. Recently, an approach has been proposed in which the interaction tripartite graph, modeling the ternary relation among learners, interaction behaviors, and learning content, is made dynamic (Hu et al., 2019). In this way, trending topics that attract much attention can easily propagate among learners. The weights in the dynamic interaction tripartite graph are initialized and then updated through an attention-driven CNN.
On online platforms, a large number of exercises are prepared and uploaded to assess the degree to which a learner has mastered a topic. The ability to find similar exercises, i.e., exercises sharing the same purpose, can substantially enrich learning. Automatically grouping exercises by similarity is far from trivial, because exercises usually contain heterogeneous data such as text and images, and similarity at the word level, and even at the concept level, can easily lead to erroneous grouping. For this task, a CNN and an attention-based LSTM have been combined (Liu et al., 2018). The CNN processes images, an embedding layer creates representations for concepts, and the attention-based LSTM produces the final semantic representation. Such a combination of components is indicative of an ongoing research trend. In future developments, subnetworks will either continue to be juxtaposed in a modular way, with each component dedicated to the portion of the input it handles best, or we may witness the development of new, hybridized architectures designed to process all the data natively.
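The early, similarity-based form of recommendation mentioned in this section, where a learner receives suggestions drawn from learners with similar tastes, can be sketched in a few lines. The learner names, resources, ratings, and the choice of cosine similarity are hypothetical and serve only as an illustration; the deep models discussed above replace this step with learned representations.

```python
import math

# Hypothetical ratings: learner -> {learning resource: rating on a 1-5 scale}
RATINGS = {
    "ann": {"intro_python": 5, "linear_algebra": 4, "statistics": 5},
    "boris": {"intro_python": 5, "linear_algebra": 5, "deep_learning": 4},
    "clara": {"art_history": 5, "statistics": 1},
}

def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating vectors stored as dicts."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(learner, ratings, top_n=1):
    """Rank resources the learner has not used by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == learner:
            continue
        sim = cosine_similarity(ratings[learner], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[learner]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("ann", RATINGS))  # ['deep_learning']
```

Because every pair of learners has to be compared, this approach scales poorly with sparse, high-volume data, which is the limitation that motivates the matrix factorization and neural approaches described above.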
Ethical problems of applying neural networks in education
The use of neural networks in the educational process can pose a threat to university academic communities with respect to plagiarism. For example, a student of the Russian State University for the Humanities wrote his Master's thesis in 24 hours using the ChatGPT neural network. The young man managed to defend the thesis with a grade of "poor", and the anti-plagiarism system rated the originality of the text at 82%. OpenAI launched the ChatGPT chatbot in autumn 2022. The neural network is capable of generating text, writing code, and answering questions.
Furthermore, lecture notes can already be produced with open-source models. The technology consists of two steps: first, the speech is converted into a transcript by a speech-to-text model; then a language model such as BERT is used to extract from the transcript the sentences that best represent the lecture. This can help course designers automate the creation of handouts for students. At the same time, students might use this technology to save time and effort, avoid tedious note-taking, and share the notes with groupmates instantly.
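The second step of this pipeline, extracting the most representative sentences from a transcript, can be sketched as follows. The sketch assumes that the transcript has already been produced by a speech-to-text model, and it scores sentences by simple word frequency as a stand-in for the BERT-based scoring mentioned above; the function name and the sample transcript are hypothetical.

```python
import re
from collections import Counter

def summarize(transcript, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    word_counts = Counter(word for words in tokenized for word in words)

    def score(words):
        # Average frequency, over the whole transcript, of the sentence's words
        return sum(word_counts[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(tokenized[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore lecture order
    return [sentences[i] for i in chosen]

transcript = (
    "Neural networks learn from data. "
    "Neural networks need training data. "
    "Lunch is at noon."
)
for sentence in summarize(transcript):
    print(sentence)
```

A production system would also remove stop words and compare sentence embeddings rather than raw word counts, but the structure of the step (score every sentence, keep the best few) stays the same.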
As for exams and tests, neural networks can write school essays and scientific articles with a very high percentage of text originality, and solve problems in physics, chemistry, and mathematics. For instance, Mikhail Pavlovets, a teacher at a Moscow lyceum, asked the GPT-3 neural network to write the final essay required for admission to the Unified State Exam. The topic of the essay was "Why might the advances of progress that give man convenience and comfort be dangerous to humanity?" Once the teacher had set the topic and the length of the text, the GPT-3 neural network produced an essay in English in two minutes, which was then translated into Russian by machine translation. Two teachers, unaware of the essay's origin, reviewed it, made minor corrections, and gave it a passing grade.
In summary, the inability to directly follow the line of reasoning of neural networks makes it easy and tempting to treat them as absolute and unbiased judges of the world. Neural networks are meant to be emotionless and efficient pattern finders that perfect and refine their performance on any given task. But it is still humans who make the decisions, as moral boundaries cannot be explicitly programmed into conventional neural networks.
Conclusions
Thus, the creation of an information-analytical system is an urgent task of great importance for the management and development of higher education institutions. Modern approaches to data analysis in information technology will allow a university to manage its activities effectively. It is also worth noting that building such systems requires significant effort and can only be accomplished by a team of highly qualified developers with the constant support of the university management.
Online learning systems, with the hidden structures and patterns in their data, are invaluable in education, as they make it possible to devise flexible, adaptive, customized offerings of educational resources and to gain deeper understanding. Deep learning networks are currently seen as the most promising tool in this endeavour, while selecting the most appropriate deep network architecture in terms of performance and capabilities requires expertise and professional skill.
References
1. Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks, 5(2), 157-166.
2. Bosch, N., & Paquette, L. (2017). Unsupervised deep autoencoders for feature extraction with educational data. Paper presented at the Deep Learning with Educational Data Workshop at the 10th International Conference on Educational Data Mining, Urbana, IL, USA.
3. Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1724-1734), Association for Computational Linguistics.
4. Coelho, O. B., & Silveira, I. (2017). Deep Learning applied to Learning Analytics and Educational Data Mining: A Systematic Literature Review. In Brazilian Symposium on Computers in Education (Simpósio Brasileiro de Informática na Educação-SBIE) (Vol. 28, No. 1, pp. 143-152).
5. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
6. Guo, B., Zhang, R., Xu, G., Shi, C., & Yang, L. (2015). Predicting students performance in educational data mining. In 2015 International Symposium on Educational Technology (ISET) (pp. 125-128), IEEE.
7. Ha, D., Dai, A., & Le, Q. V. (2016). Hypernetworks. arXiv preprint arXiv:1609.09106.
8. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition - CVPR 2016 - (pp. 770-778), IEEE.
9. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
10. Hu, Q., Han, Z., Lin, X., Huang, Q., & Zhang, X. (2019). Learning peer recommendation using attention-driven CNN with interaction tripartite graph. Information Sciences, 479, 231-249.
11. Kim, D., Park, C., Oh, J., & Yu, H. (2017). Deep hybrid recommender systems via exploiting document context and statistics of items. Information Sciences, 417, 72-87.
12. Liu, Q., Huang, Z., Huang, Z., Liu, C., Chen, E., Su, Y., & Hu, G. (2018). Finding similar exercises in online education systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 1821-1830), ACM.
13. Olah, C. (2015). Understanding LSTM networks. Retrieved 1 February, 2023, from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
14. Pelánek, R. (2019). Measuring Similarity of Educational Items: An Overview. IEEE Transactions on Learning Technologies. (Early Access: DOI:10.1109/TLT.2019.2896086).
15. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.) Advances in Neural Information Processing Systems 30 - NIPS 2017 - (pp. 5998-6008), Curran Associates, Inc.
16. Zhou, Y., Huang, C., Hu, Q., Zhu, J., & Tang, Y. (2018). Personalized learning full-path recommendation model based on LSTM neural networks. Information Sciences, 444, 135-152.
Application of neural networks in education: opportunities and challenges
Sokolova A.G., Arkhipov A.V.
Moscow State University of Civil Engineering
JEL classification: C10, C50, C60, C61, C80, C87, C90
Modern education has remained largely unchanged for many years against the backdrop of rapid change in various spheres of modern society. Providing future specialists with relevant, high-quality education is one of society's most pressing problems. However, the assessment of education quality remains almost unexplored, even though most universities realize that the problem cannot be solved without an internal information-analytical system. Creating such a system is therefore an urgent task of great importance for the management and development of higher education institutions. Modern approaches to data analysis in information technology will allow a university to manage its activities effectively. Online learning systems, with the hidden structures and patterns in their data, are invaluable in education because they make it possible to develop flexible, adaptive, customized offerings of educational resources and to gain deeper understanding. Deep learning networks are currently regarded as the most promising tool in this direction, while choosing the most appropriate deep network architecture in terms of performance and capabilities requires experience and professional skill.
Keywords: neural networks, education, learning path, recurrent networks,
artificial intelligence, deep learning.