

RUDN Journal of Studies in Literature and Journalism Вестник РУДН. Серия: Литературоведение. Журналистика

ISSN 2312-9220 (Print); ISSN 2312-9247 (Online) 2023 Vol.28 No. 2 355-367

http://journals.rudn.ru/literary-criticism

DOI: 10.22363/2312-9220-2023-28-2-355-367 EDN: RZMQIG

UDC 004.89

Scientific review/ Научный обзор

Transforming the future: a review of artificial intelligence models

Andrei A. Pugachev, Alina V. Kharchenko, Nikolai A. Sleptsov

RUDN University, 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation

Abstract. This article presents a comprehensive review of existing artificial intelligence models, focusing on fourteen prominent language and multimodal generative models from four rapidly evolving categories: Marketing, Copywriting, Image Improvement, and Social Media. As of May 2023, 1,523 AI models are available to end users, with notable Russian services such as Balaboba, GigaChat, and Kandinskiy 2.0 emerging as counterparts to popular foreign neural networks. The potential applications of these tools in various media production domains, including journalism, marketing, and copywriting, are explored. Particular attention is paid to language models, since they are the ones most closely connected not only to the media sphere but also to academic writing. Moreover, the authors delve into the ethical considerations associated with the use of AI models in professional settings, addressing potential challenges and concerns. The importance of responsible development, use, and regulation of AI technology is highlighted, as is the need for collaboration among researchers, governments, and private organizations to ensure ethical AI practices. The authors also outline the prospects for further development of AI models and related research, emphasizing the need to foster an environment of continuous learning for innovation that is inclusive and accessible. This approach will help maximize the benefits of AI while minimizing potential harm, paving the way for a more prosperous, equitable, and sustainable future. The presented materials can serve as an introduction to the emerging field of AI model development.

Keywords: deep learning, machine learning, neural networks, language model, generative model, AI storytelling

Conflicts of interest. The authors declare that there is no conflict of interest.

Article history: submitted February 9, 2023; revised March 10, 2023; accepted March 25, 2023.

For citation: Pugachev, A.A., Kharchenko, A.V., & Sleptsov, N.A. (2023). Transforming the future: A review of artificial intelligence models. RUDN Journal of Studies in Literature and Journalism, 28(2), 355-367. http://doi.org/10.22363/2312-9220-2023-28-2-355-367

© Pugachev A.A., Kharchenko A. V., Sleptsov N.A., 2023

This work is licensed under a Creative Commons Attribution 4.0 International License https://creativecommons.org/licenses/by-nc/4.0/legalcode


Introduction

In the rapidly evolving field of artificial intelligence (AI), we are witnessing unprecedented advancements that are revolutionizing the way we perceive and interact with technology. As AI models become increasingly sophisticated and permeate various domains, it is essential to gain a deeper understanding of the models driving this transformation. This article aims to provide a thorough examination of the groundbreaking AI models that form the foundation of this revolution, offering insights into their underlying principles, methodologies, and applications, as well as the scientific literature on the topic. An additional aim is to reveal AI's unique possibilities in the representation of reality.

From natural language processing to computer vision, AI has transcended disciplinary boundaries and has become an integral component of modern scientific research. We present an overview of the key AI models that have emerged over the years, elucidating their architectures, algorithms, and the challenges they address. By delving into the inner workings of these models, we hope to foster a greater appreciation for the technological marvels that are reshaping our world and, in turn, inspire further exploration and innovation in the field of AI.

The study of artificial intelligence goes back to 1950, when the British polymath Alan Turing began exploring the mathematical possibility of artificial intelligence in his paper "Computing Machinery and Intelligence". However, since this review deals mostly with neural networks, their history deserves a closer look first. The first neural networks applied to real-world problems were ADALINE and MADALINE (Multiple ADAptive LINear Elements). These models were designed to recognize binary patterns, which enabled them to predict the next bit when reading a stream of bits from a phone line. In 1982, enthusiasm for the field was rekindled when John Hopfield of Caltech presented the paper "Neural networks and physical systems with emergent collective computational abilities" to the National Academy of Sciences. Hopfield's approach used bidirectional connections between neurons to build more practical machines, whereas in earlier approaches the connections were unidirectional.

However, research focused on developing neural networks remains relatively slow: owing to the limitations of processors, neural networks can take weeks to train.

Discussion

For this review, we have selected existing AI models that can be or have already been implemented in various areas of the media industry: journalism, marketing, copywriting, etc. According to the AI model aggregator website Future Tools, as of May 2023 there are 1,523 AI tools available to end users. They are all categorized according to the functions they perform or the areas in which they can be used. We therefore selected the categories that could potentially contain tools suitable for media production workers: Text-To-Speech, Speech-To-Text, Copywriting, Image Improvement, Video Editing, Generative Video, Motion Capture, Text-To-Video, Voice Modulation, Podcasting, Music, Marketing, Social Media, Translation.

It should be noted that under the general name of "Others" we have combined the following categories: AI Detection, Chat, For Fun, Generative Code, Image Scanning, Productivity, Self-Improvement, Aggregators, Gaming, Inspiration, Prompt Guides, Avatar, Finance, Generative Art, Research. In the Figure we can see that from the list of categories we selected, most of the tools belong to the sections Marketing (205 models) and Copywriting (131). The Image Improvement and Social Media categories share third place with the same number of tools in each (59).

Figure. Models categorized by functions on the Future Tools website (pie chart; legible labels: Copywriting 8.6%, Marketing 13.5%, Others 54.1%, Voice Modulation, Generative Video, Image Improvement, Text-To-Video, Translation)

Source: compiled by the authors.
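The shares shown in the figure can be checked against the tool counts quoted in the text. The following is a minimal sketch that only redoes this arithmetic; the variable and function names are illustrative assumptions, and the counts are the ones reported above for May 2023.

```python
# Tool counts quoted in the text (Future Tools, May 2023).
TOTAL_TOOLS = 1523
category_counts = {
    "Marketing": 205,
    "Copywriting": 131,
    "Image Improvement": 59,
    "Social Media": 59,
}


def share_percent(count, total=TOTAL_TOOLS):
    """Return a category's share of all tools in percent, rounded to 0.1."""
    return round(100 * count / total, 1)


# Share of each selected category among all 1,523 tools.
shares = {name: share_percent(n) for name, n in category_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The Marketing and Copywriting results agree with the 13.5% and 8.6% labels that remain legible in the figure.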

The tools in the Marketing category can help in creating websites, generating logos, banners, posters and creating an identity, writing post-releases and other types of content, optimizing work with email newsletters and advertising networks. Most of the tools in the Copywriting category work with content: they help with creating, editing, rewriting, paraphrasing, summarizing, etc. Services from the Image Improvement category correspond to the name of the category: they allow users to edit images, change their quality and resolution, remove the background or unnecessary elements from a photo, as well as add new ones. Social Media category tools allow you to work directly with social media: they can help to create content for social networks and video hosting, optimize work with the audience and subscribers - to write and respond to comments and private messages.

From this, we decided, first, that it was necessary to discuss language models, since they are the ones most closely related not only to the media sphere but also to academic writing. Language models are based on probability theory: algorithms calculate the chance that a particular word will appear in the text. This requires taking into account the context, the style of speech and the meaning of words, but, like any machine, a language model can only work with numbers. The text is therefore converted into a numeric representation. This process is called encoding, and the resulting output is called an embedding. Each word in the dictionary has its own sequence number, and embeddings are formed from these numbers. We also included Russian AI models in order to compare their potential and functionality with those of foreign ones. We then describe popular text-to-image models, which are rapidly gaining popularity. Finally, we describe professionally oriented neural networks.
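The two ideas in this description, numbering the words of a dictionary and estimating the chance of the next word, can be illustrated with a toy example. This is a deliberately simplified sketch: the corpus is invented, and simple pair counts stand in for the neural networks and learned embedding vectors that real language models use.

```python
from collections import Counter, defaultdict

# Invented toy corpus, used only for illustration.
corpus = "the model reads the text and the model predicts the next word"
tokens = corpus.split()

# Give every distinct word a sequence number (its index in the vocabulary).
vocab = {}
for word in tokens:
    if word not in vocab:
        vocab[word] = len(vocab)

# Numeric form of the text: the list of word indices.
encoded = [vocab[word] for word in tokens]

# Count how often each word follows each preceding word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    follow_counts[prev][nxt] += 1


def next_word_probs(prev):
    """Estimate P(next word | previous word) from the pair counts."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


probs = next_word_probs("the")
print(encoded)
print(probs)
```

In this corpus the word "the" is followed by "model" twice and by "text" and "next" once each, so the estimated chance of "model" after "the" is one half.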

ChatGPT is the best-known language model under development by OpenAI. Generative Pre-trained Transformer 3, or GPT-3, is a large-scale text-generating language model with 175 billion parameters created by OpenAI. It has been trained on a very large quantity of data (Brown et al., 2020). ChatGPT draws on everything GPT-3 has learned to reply to user input in a conversational way.

As of today, GPT-3 can perform a wide variety of tasks. It can write and compile computer programs, compose fairy tales, write student essays and business pitches, compose music or teleplays, answer test questions (for example, the model recently passed exams from law and business schools1), write song lyrics and poetry, summarize and translate text, simulate chat rooms with multiple people, run a Linux-like system and play simple games such as tic-tac-toe or Battleship.

This model is already being used to simplify life not only for journalists but for other people as well. For example, a Russian student recently used ChatGPT to write his bachelor's thesis in the span of a day.2

Media specialists use ChatGPT as a news-aggregation tool and for data-driven journalism, fact-checking and social media management.

The ethical problems that arise in using such a model stem from the quality and truthfulness of the data ChatGPT provides. Although the developers try to avoid bias and to keep the model from providing dangerous information (drug recipes, firearm schematics and so on), some users can still bypass the restrictions.

Caktus positions itself as a content-generation tool with several options beyond writing. Its creators call it an "academic curated search engine", as it provides content on a range of topics, including science, technology, engineering and math, and school studies, as well as coding and professional services. Like most other language models, it is compared to ChatGPT, because it offers such functions as a summarizing tool, an essay writer and a paragraph writer, along with such unique tools as a humanizer (which makes AI-generated text look more human) and a movie scene analyzer. It might also be better at evading AI plagiarism detection than its competitor, and it can add citation sources to essays.

Caktus has a range of uses. It is not limited to essays, although it can produce a fairly detailed and conclusive one in just a couple of minutes. It can also create a resume or CV, write single paragraphs, bullet-pointed lists and personal statements for college admission, convert TED Talks on YouTube into essays, analyze a song, generate code (for example, in Python and JavaScript), and so on.

1 Murphy Kelly, S. (2023). ChatGPT passes exams from law and business schools. CNN Business. Retrieved April 21, 2023, from https://edition.cnn.com/2023/01/26/tech/chatgpt-passes-exams/index.html

2 Zhadan, A. (2023). How I wrote my thesis using ChatGPT and found myself in the middle of a dispute about neural networks in education. Journal Tinkoff. (In Russ.) Retrieved April 25, 2023, from https://journal.tinkoff.ru/neuro-diploma/

The obvious ethical problem with such a service concerns education and the assignments set by professors and teachers. Features such as "Humanize" can hinder the detection of AI-written text and make homework, for example, useless in some subjects.

NovelAI is a web-based, software-as-a-service interactive story generator. It uses modern and powerful AI algorithms to generate original and creative content in seconds. The service offers three modes to choose from: storyteller mode, text adventure mode, and image generation mode.

Storyteller mode: the main story generator feature of NovelAI is the Storyteller mode. In this mode, the user gives the AI some basic input called a "prompt", and it starts writing the story. This mode generates unique and creative content. It also offers full control over the content (easy edits, regenerated answers, and more).

Text Adventure mode: the text adventure mode is a text game that requires some imagination from the user. The user decides what the game setting and character are. The AI takes care of the actions based on the additional input.

Image generation mode: the developers decided that an AI story generator needs an image generator to show fantasy-style images in users' stories, novels, and tales. The image generator asks for simple text input, which it then uses to generate an image.

As with every generative model, the main concern of scientists is the unlimited range of scenarios users can come up with, including some involving illegal topics.

Character AI is a chatbot web application built on a neural language model that can generate human-like text responses and take part in contextual conversation. Built by former developers of Google's LaMDA (Language Model for Dialogue Applications), Noam Shazeer and Daniel De Freitas, the beta model was made available to the public in September 2022.

Character "personalities" are designed via descriptions written from the character's point of view and via its greeting message. They are further shaped by conversations turned into examples, by star ratings given to the character's messages, and by edits to those messages so that they fit the precise dialect and identity the user desires.

The main controversy around such a model is that users engage in sexual conversations with characters that may be underage.

GigaChat is a Russian multimodal neural network that can answer questions, maintain a dialogue, and generate texts, program code, and images. Some call it Russia's answer to the rapidly developing ChatGPT.

GigaChat has an open architecture, whereas many of its competitors are closed systems. It is positioned as an answer to ChatGPT because it knows Russian and can create images, which ChatGPT cannot do at the moment.

GigaChat consists of two AI models: NeONKA (NEural Omnimodal Network with Knowledge-Awareness) and Kandinsky 2.1. The first solves intellectual tasks, and the second creates images from text requests. According to the developers, the neural network was trained on the Christofari Neo supercomputer.

In the near future, the developer, Sber, plans to transfer the code of its neural network to another model, ru-GPT 3.5, with 13 billion parameters, and open it for public access. This model is also being actively integrated into the company's other products, such as banking apps.

The main ethical concerns about this service are the same as for ChatGPT, with one addition: some people think that a Russian service may build government bias into its replies.

Balaboba is not a service in itself, but a demonstration of technology that tries to explain the idea of language models accurately and clearly. The main task of this model is to generate the next word so that it matches the previous one grammatically and stylistically. Because the model has seen terabytes of texts, it solves this problem so well that it is able to communicate in different scenarios. This is how most language models, such as ChatGPT, work.

The service is based on YaLM (Yet another Language Model), created by Yandex. It was trained on some of the pages indexed by Yandex on the Russian Internet, such as Wikipedia articles, news articles, books, and open publications by users of social networks and forums. Repetitive, incomplete or unnatural texts were "cleaned" in the process. For the neural network to work correctly, it was loaded with three billion parameters, which it uses to assess whether a word it uses is grammatically and stylistically correct.

To learn how to write in a particular style, YaLM only needs a few examples. When a user selects a style, invisible samples are added in front of the text being typed, based on which Balaboba completes what is written.
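The mechanism of hidden style samples can be sketched as simple text concatenation before the model is called. This is only an illustration of the idea described above: the sample sentences, the style names and the `build_prompt` helper are hypothetical, and the actual samples and prompt format used by Balaboba are not given in this article.

```python
# Hypothetical style samples; the real hidden samples are not published here.
STYLE_EXAMPLES = {
    "proverb": [
        "Measure twice, cut once.",
        "A rolling stone gathers no moss.",
    ],
    "headline": [
        "City opens new bridge ahead of schedule.",
    ],
}


def build_prompt(style, user_text):
    """Prepend the hidden style samples to the user's text, one per line."""
    samples = STYLE_EXAMPLES.get(style, [])
    return "\n".join(samples + [user_text])


# The model would then be asked to continue this combined text.
prompt = build_prompt("proverb", "An early bird")
print(prompt)
```

Because the samples precede the user's text, a model that simply continues the text tends to stay in the same style, which is why only a few examples are needed.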

Midjourney is a text-to-image AI service developed by an independent research lab of the same name. The service allows users to generate images based on textual descriptions, creating a wide range of art forms, from realistic to abstract styles. The user chooses the style of the generated image through a myriad of possible prompts. Midjourney's AI is especially known for its high-quality, well-structured, and detailed images and is used in graphic design, artwork generation and sometimes even photo generation.

At present, Midjourney can only be accessed through the third-party messenger service Discord, either by opening a conversation with it or by adding Midjourney to a group. The image generation process consists of a few simple steps. Users enter the imagine command and type a description of what they want to see in the artwork. Midjourney then produces a set of four images, from which the user can choose one to "improve" by upscaling it. Midjourney can also describe an already uploaded image with text prompts that can later be used to generate new artwork, and it can seamlessly blend two pictures into one.

Recently, Midjourney and other image generation services completely restricted free generation because of the nature of some generated images. For example, some users may find pictures of Pope Francis (Jorge Mario Bergoglio) in various attires offensive.3

3 Landymore, F. (2023). Midjourney nixes free generations after AI pope images go viral. Futurism. Retrieved April 25, 2023, from https://futurism.com/the-byte/midjourney-excommunicates-freeloaders-after-pope-images

DALL-E and its updated version, DALL-E 2, are advanced neural models created and published by OpenAI, the company that created GPT-3 and GPT-4. They are designed to generate digital visuals, be it artwork, graphics or photorealistic images, based on textual prompts. The software's name combines the name of the robot from the Pixar movie "WALL-E" with that of the famous artist Salvador Dalí.

DALL-E was introduced to the public in January 2021 and was based on GPT-3. Despite the company's name, the source code of the model is closed. DALL-E 2 entered open beta, providing access to users, on July 20, 2022. During the beta-testing period, the number of free image generations per month was restricted, but users could buy additional generations. At the start of the period, access to the model was limited to a select number of users, as OpenAI was weighing the ethical and safety aspects of its model. Finally, on September 28, 2022, the model became available to the public for free.

Later, in November 2022, the DALL-E 2 API was released, which allowed developers to include DALL-E 2 in their own applications. Microsoft, for example, used this model in its search engine Bing, as well as in its internet browser Microsoft Edge. Although the API adopted a per-image pricing model, with the price depending on the image resolution, the OpenAI team offers discounts to companies that implement these models in their apps and services.

Stable Diffusion, launched in 2022, is a deep learning model focused on text-to-image generation. It primarily creates detailed visuals based on textual descriptions, but it is also versatile enough for other applications such as inpainting, outpainting, and image-to-image translation guided by text prompts. Stability AI, a startup, developed this model in partnership with several academic researchers and non-profit organizations.

Stable Diffusion functions as a latent diffusion model, a type of deep generative neural network. The model's code and weights are publicly available, and it operates on most consumer hardware equipped with a moderate GPU boasting at least 8 GB VRAM. This approach contrasts with earlier proprietary text-to-image models like DALL-E and Midjourney, which were only accessible through cloud services.

Kandinsky 2.1 is an artificial intelligence system engineered to produce distinctive art pieces. It derives its name from the renowned Russian painter Vasily Kandinsky, celebrated for his abstract creations. This AI employs deep neural networks and machine learning techniques to examine and pinpoint various artistic styles and movements' features. It subsequently generates its own artwork by merging these styles and crafting entirely novel visuals. The neural network has undergone training with a vast collection of art images spanning multiple styles and time periods. The image generation process encompasses several phases:

1. Style and movement analysis: identifying the key characteristics and distinct artistic styles, such as Cubism, Impressionism, Surrealism, and more.

2. Innovative combinations: fusing different styles and movements to form new, original, and unforeseen compositions.

3. Visual generation: Kandinsky converts these compositions into graphic images using machine learning and deep neural network algorithms.

Kandinsky signifies a novel phase in artistic evolution, sparking debates and discussions about the capacity of AI to produce authentic art. It is crucial to recognize that Kandinsky does not merely replicate existing styles and pieces but generates new visuals based on the analysis and amalgamation of diverse artistic elements.

Runway ML serves as an innovative software solution tailored for creative professionals, such as filmmakers, designers, visual effects and computer-generated imagery specialists, artists, programmers, musicians, students, and instructors. This all-encompassing video editing platform focuses on empowering video producers by offering machine learning-assisted tools for seamless editing without any coding requirements. Accessible directly through the user's web browser, the software streamlines video manipulation processes with a diverse range of features, such as masking, color adjustment, compositing, content generation, and visual effects.

Additionally, its rotoscoping capabilities allow users to transform any video into a green screen, streamlining the editing experience. The software's Inpainting function intelligently eliminates unnecessary elements from the video, ensuring efficient and cohesive edits. Through the platform's integrated multiband video stream format, creators can enhance their work using AI-generated analytics and metadata, resulting in more engaging visuals. Runway ML produces precise and distinctive depth maps, contributing to a heightened sense of realism in the imagery.

Furthermore, an embedded optical flow function assists users in comprehending object motion through relative movement analysis. Other noteworthy features encompass real-time previews and rapid rendering in multiple formats, such as PNG, ProRes, and more.

Using such advanced tools raises the ethical problem of streamlining the other professions in this sphere. For example, why should a company hire a professional video editor, if an AI can do the same in seconds?

Soundraw is an AI-driven music creation platform developed to assist video producers in crafting bespoke music for their projects. Based in Tokyo, Japan, and established in 2020, the platform allows users to choose a theme, mood, or genre and receive auto-generated tracks, which can be further customized in terms of structure, duration, and instrumentation. Soundraw's AI algorithms examine an extensive library of music patterns and structures to produce unique AI-generated music that caters to user preferences.

Upon accessing Soundraw, users are presented with a straightforward menu offering three selections: mood, genre, or theme. This offers a basic approach to music creation using the platform. For a more intricate experience, users can explore the "primary editor", where they can adjust or remove individual blocks, define the song's progression, and modify it to meet their specific requirements.

Mubert is a generative music streaming application, accessible via web browsers and mobile devices. The app showcases an assortment of music streams categorized by genres and corresponding activities, such as "sleep" and "work."

Each of Mubert's streams represents an ever-evolving "live" track that continually transforms. As the company highlights, "Generative music cannot be rewound, as it is produced in real-time."

Within the app, Mubert offers "like" and "dislike" buttons for users to express their opinions, which the company states are used to refine its algorithms based on user preferences. This feedback could be evaluated using machine learning algorithms to gauge the performance of samples globally or to suggest selections based on the rankings of other users. For instance, when a user selects "like", Mubert might look at other users who liked the current samples, determine which additional samples they enjoyed, and play those next. Similarly, if numerous users choose "dislike" while a specific sample is playing, Mubert's algorithms could exclude that sample from future playback.
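The two feedback rules suggested in this paragraph can be sketched with simple counting. This is a hypothetical illustration only: the feedback log, the function names and the dislike threshold are assumptions, and nothing here reflects how Mubert's algorithms are actually implemented.

```python
from collections import Counter

# Hypothetical feedback log: (user, sample, vote), vote is "like" or "dislike".
feedback = [
    ("ann", "s1", "like"), ("ann", "s2", "like"),
    ("bob", "s1", "like"), ("bob", "s3", "like"),
    ("cat", "s1", "like"), ("cat", "s3", "like"),
    ("ann", "s4", "dislike"), ("bob", "s4", "dislike"), ("cat", "s4", "dislike"),
]


def suggest_next(current, log):
    """Rank other samples by how many fans of `current` also liked them."""
    fans = {user for user, sample, vote in log
            if sample == current and vote == "like"}
    co_likes = Counter(
        sample for user, sample, vote in log
        if user in fans and vote == "like" and sample != current
    )
    return [sample for sample, _ in co_likes.most_common()]


def excluded_samples(log, threshold=3):
    """Return samples with at least `threshold` dislikes (assumed cutoff)."""
    dislikes = Counter(sample for _, sample, vote in log if vote == "dislike")
    return {sample for sample, n in dislikes.items() if n >= threshold}


suggestions = suggest_next("s1", feedback)
blocked = excluded_samples(feedback)
print(suggestions, blocked)
```

Here two of the three listeners who liked sample s1 also liked s3, so s3 is suggested before s2, while s4 is excluded after three dislikes.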

ElevenLabs has gained recognition for its web-based, AI-enhanced speech synthesis solution, capable of generating realistic spoken output by emulating emotional tone and inflection. The company emphasizes that its software is designed to adapt the delivery's intonation and tempo according to the input language's context. Via its beta platform, users can input text and produce audio files from a range of pre-existing voice options. Subscribers of the premium tier have the added benefit of uploading personalized voice samples to develop unique vocal styles.

Because AI-generated voice content is almost indistinguishable from real speech, many fear that the creation of so-called "deepfakes" can harm the people from whom the voice samples were taken.4

Ethical boundaries in AI models

The ethical principles of AI have attracted the attention of many scientists because of the rapid increase in its use in recent years.

Some scholars consider the ethical problems that may arise in the near future from the spread of AI, including those associated with artificial intelligence approaching human intelligence (Bostrom, Yudkowsky, 2014). Others focus on regulating the use of AI so that it benefits rather than harms human development (Dignum, 2018). Some scientists consider the ethical aspects of using artificial intelligence in education and research (Lund et al., 2023; Cordova, Vicari, 2023). Proposals for a code of ethics for artificial intelligence have been under consideration since 2017 (Boddington, 2017). The Russian academic community is also discussing the need for ethical regulation of artificial intelligence technologies in the media environment, a discussion first launched by M. Lukina et al. (2022).


In the past five years, private companies, research institutions and public sector organizations have issued principles and guidelines for ethical artificial intelligence. However, despite an apparent agreement that AI should be "ethical", there is debate about both what constitutes "ethical AI" and which ethical requirements, technical standards and best practices are needed for its realization. Anna Jobin, President of the Swiss Federal Media Commission, states that the best way to define ethical AI is "an AI that follows three principles: Justice, equity and fairness". For so-called conversational AIs, the main principles would also include the elimination and/or mitigation of "biased data".

4 Santanu, R. (2023). Internet up in arms as 4Chan user uses AI voice simulator to deepfake Emma Watson's voice, makes her read Hitler's autobiography. FandomWire. Retrieved April 27, 2023, from https://fandomwire.com/internet-up-in-arms-as-4chan-user-uses-ai-voice-simulator-to-deepfake-emma-watsons-voice-makes-her-read-hitlers-autobiography/

Because this paper describes not only the language models used in conversational AIs but also image, video, and voice or audio generation models, the ethical problems that arise need to be distinguished for each particular type of model.

Conclusion

As we found, the number and sheer variety of AI models grow by the day. Even while working on this paper, we had to correct the number of AI models from 1,505 to 1,523, as new ones were published. These models encompass a wide range of architectures, applications, and industries, showcasing the versatility and potential of AI in addressing various challenges and tasks. Among the many models, we selected a list of categories that could potentially contain tools suitable for media production workers. The number of tools in each category led us to identify the four most rapidly developing areas: Marketing, Copywriting, Image Improvement, and Social Media.

Despite the plethora of AI models, the development of more advanced, accurate, and efficient models remains a priority for researchers and engineers. As the field continues to mature, we can expect AI to further integrate into our daily lives, revolutionizing the way we live, work, and interact.

Collaboration among researchers, governments, and private organizations is crucial in ensuring responsible development, use, and regulation of AI technology. Additionally, it is important to address the ethical, societal, and economic implications of AI, such as fairness, privacy, and job displacement, in order to maximize the benefits while minimizing potential harm.

As we continue to explore the possibilities and limitations of AI, it is essential to foster an environment of continuous learning and innovation that is both inclusive and accessible. By doing so, we can better harness the power of AI to create a more prosperous, equitable, and sustainable future for all.

References

Benefo, E.O., Tingler, A., White, M., Cover, J., Torres, L., Broussard, C., Shirmohammadi, A., Pradhan, A.K., & Patra, D. (2022). Ethical, legal, social, and economic (ELSE) implications of artificial intelligence at a global level: A scientometrics approach. AI Ethics, 2, 667-682. https://doi.org/10.1007/s43681-021-00124-6

Boddington, P. (2017). Towards a code of ethics for artificial intelligence. Cham: Springer. https://doi.org/10.1007/978-3-319-60648-4

Bostrom, N., & Yudkowsky, E. (2014). The ethics of artificial intelligence. In K. Frankish & W. Ramsey (Eds.), The Cambridge Handbook of Artificial Intelligence (pp. 316-334). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139046855.020

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., & Agarwal, S. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901. https://doi.org/10.48550/arXiv.2005.14165

Coeckelbergh, M. (2021). Time machines: Artificial intelligence, process, and narrative. Philosophy & Technology, 34(4), 1623-1638. https://doi.org/10.1007/s13347-021-00479-y

Cordova, P.R., & Vicari, R.M. (2023, January). Practical ethical issues for artificial intelligence in education. Technology and Innovation in Learning, Teaching and Education: Third International Conference, TECH-EDU 2022, Lisbon, Portugal, August 31 - September 2, 2022, Revised Selected Papers (pp. 437-445). Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-22918-3_34

Dignum, V. (2018). Ethics in artificial intelligence: Introduction to the special issue. Ethics and Information Technology, 20(1), 1-3. https://doi.org/10.1007/s10676-018-9450-z

Hartmann, K., & Giles, K. (2020). The next generation of cyber-enabled information warfare. 2020 12th International Conference on Cyber Conflict (CyCon), Estonia, 2020, 233-250. https://doi.org/10.23919/CyCon49761.2020.9131716

Heaven, W.D. (2022). Language models like GPT-3 could herald a new type of search engine. In K. Martin (Ed.), Ethics of Data and Analytics (pp. 57-59). Auerbach Publications. https://doi.org/10.1201/9781003278290

Lukina, M.M., Zamkov, A.V., Krasheninnikova, M.A., & Kulchitskaya, D.Y. (2022). Artificial intelligence in the Russian media and journalism: the issue of ethics. Theoretical and Practical Issues of Journalism, 11(4), 680-694. (In Russ.) https://doi.org/10.17150/2308-6203.2022.11(4).680-694

Lund, B.D., Wang, T., Mannuru, N.R., Nie, B., Shimray, S., & Wang, Z. (2023). ChatGPT and a new academic reality: artificial intelligence-written research papers and the ethics of the large language models in scholarly publishing. Journal of the Association for Information Science and Technology, 74(5), 570-581. https://doi.org/10.1002/asi.24750

Luzi, L., Siahkoohi, A., Mayer, P.M., Casco-Rodriguez, J., & Baraniuk, R. (2022). Boomerang: Local sampling on image manifolds using diffusion models. https://doi.org/10.48550/arXiv.2210.12100

McKay, F., Williams, B.J., Prestwich, G., Bansal, D., Hallowell, N., & Treanor, D. (2022). The ethical challenges of artificial intelligence-driven digital pathology. The Journal of Pathology: Clinical Research, 8(3), 209-216. https://doi.org/10.1002/cjp2.263

Paek, S., & Kim, N. (2021). Analysis of worldwide research trends on the impact of artificial intelligence in education. Sustainability, 13(14), 7941. https://doi.org/10.3390/su13147941

Thorne, S. (2020). Hey Siri, tell me a story: Digital storytelling and AI authorship. Convergence, 26(4), 808-823. https://doi.org/10.1177/13548565209138

Zhou, K.Q., & Nabus, H. (2023). The ethical implications of DALL-E: Opportunities and challenges. Mesopotamian Journal of Computer Science, 2023, 17-23. https://doi.org/10.58496/MJCSC/2023/003

Zhu, P., Pang, C., Wang, S., Chai, Y., Sun, Y., Tian, H., & Wu, H. (2023). ERNIE-music: Text-to-waveform music generation with diffusion models. https://doi.org/10.48550/arXiv.2302.04456

Bio notes:

Andrei A. Pugachev, PhD student, Department of Mass Communication, Faculty of Philology, RUDN University, 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation. ORCID: 0000-0001-6722-2431. E-mail: [email protected]

Alina V. Kharchenko, lecturer, Department of Mass Communication, Faculty of Philology, RUDN University, 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation. ORCID: 0009-0001-8105-892X. E-mail: [email protected]

Nikolai A. Sleptsov, lecturer, Department of Mass Communication, Faculty of Philology, RUDN University, 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation. ORCID: 0000-0002-3447-8008. E-mail: [email protected]

