Research article in the field of computer and information sciences

CC BY

DEVELOPMENT OF THE INTELLECTUAL VOICE ASSISTANT WITH EXTENSIBILITY SUPPORT

Abstract

Intelligent voice assistants are relevant because they remove the need for direct physical contact with a device: the device accepts, interprets, and executes a voice command. The purpose of this article is to present the development process of a personal voice assistant capable of interacting with an operating system and Internet services. The development tools selected are the Delphi 10 Seattle programming environment, the Google Speech API voice recognition system, and the Ivona Voice Synthesis speech synthesis system. As a result of this research, an operating logic for a computer-based intelligent voice assistant is proposed. The material may be useful to developers of intelligent systems.

Keywords

intelligent systems, voice control, intelligent voice assistant, interaction interfaces, voice synthesis, working with APIs

AUTHORS

Elena N. Drozdova

PhD, Associate Professor, Department of Information and Control Systems, St. Petersburg State University of Industrial Technologies and Design. 18, Bolshaya Morskaya street, Saint-Petersburg, 191186, Russia. E-mail: endrozdova2@list.ru

Alexander N. Kovalenko

PhD, Associate Professor, Head of the Department of Information and Control Systems, St. Petersburg State University of Industrial Technologies and Design. 18, Bolshaya Morskaya street, Saint-Petersburg, 191186, Russia. E-mail: akovalenko@uprint.spb.ru

Victor S. Shavkunov

Quality Assurance Specialist, Saber Interactive Media Company, St. Petersburg, Russia. E-mail: vshavkunov@gmail.com, vshavkunov@saber3d.ru

Introduction

The concept of artificial intelligence, like that of intelligence itself, is rather fuzzy. In general, intelligence can be described as the combination of knowledge and thinking (Artificial intelligence, 2016). Artificial intelligence, in turn, denotes the ability of an artificial system to perform functions of a human. Within this field, the tasks of simulating human thought in hardware and software are posed.

By the middle of the 20th century the prerequisites for artificial intelligence had emerged: findings about how the human brain works, the foundations of the theory of algorithms, and the creation of the first computers. Computers far exceeded humans in speed of computation and depth of logic, which made it possible to pose new tasks.

Intelligent systems aimed at solving specific tasks are a special case of artificial intelligence. An intelligent system consists of three main units: the knowledge base, the inference mechanism, and the intelligent interface. The knowledge base stores knowledge about the specific task domain.

Intelligent systems known as intelligent personal assistants have recently become popular. Such an assistant is a mobile software agent that can carry out tasks (services) for its users. Tasks are solved on the basis of information entered by the user, data on the user's location, and information obtained from various Internet sources (weather, traffic, news, exchange rates, etc.). The most popular examples of such agents at the moment are Google Now, Microsoft Cortana, and Siri.

These systems are relevant because they bring interaction with electronic devices to a new level. Voice communication removes the need for direct physical contact with a device: the device accepts, interprets, and executes a command.

Futurologists and science fiction writers predicted the appearance of such systems, but only now has technology advanced far enough to realize this dream.

The personal assistant market is of great interest to IT corporations today. They invest millions of dollars in developing their own products, but most of these products, unfortunately, are available only on mobile platforms and do not support Russian. The assistants interact with the operating system of their own platform solely through the provided API, which strongly restricts their capabilities.

At the Department of Information and Control Systems of St. Petersburg State University of Industrial Technologies and Design, a working prototype of a Russian-speaking intelligent assistant with voice activation for PC-compatible computers has been developed (Drozdova, 2016). The development tools selected are the Delphi 10 Seattle programming environment, the Google Speech API voice recognition system, and the Ivona Voice Synthesis speech synthesis system. A possible future direction for the project is to move development to C#.

Experimental methods

We now consider the main stages of work on the project.

2.1. General research organization. Sprint priorities

For effective and flexible software development, it is best to organize the process in short steps (sprints). At the end of each sprint the achieved results are assessed and the plan for further steps is adjusted. In the first stages of sprint planning, the highest-priority tasks are taken from the backlog, which makes it possible to evaluate whether the result is achievable. Progress through a sprint consists of solving the set of tasks assigned to it. Each task is given a time estimate. When tasks from subsequent sprints are taken up, tasks that have already been solved are excluded (Sahota, 2012).

Within this research, sprints A1 and A2 and a backlog were defined. A short list of the tasks assigned to the first sprint is shown in Table 1.

TABLE 1. SPRINT A1

No. | Task type | Task priority | Task name | Time planned for the task
1 | Task | Blocker | Connect the Speech API | 2h
2 | Task | Blocker | Connect a voice synthesizer | 8h
3 | Task | Blocker | Set up the request classifier | 5h
4 | Bug | Critical | The classifier returns probability -inf for some requests | 2h
5 | Task | Critical | Train the classifier base | 5h
6 | Task | Major | Set up the user interface | 10h
7 | Bug | Critical | Remove 403/Forbidden in IdHTTP | 2h
8 | Task | Critical | Connect a sample API | 2h
9 | Task | Critical | Set up the JSON parser | 1h
10 | Task | Major | Study the possibility of connecting a multilayer neural network | 10h
11 | Task | Blocker | Implement audio handling via system devices | 3h
12 | Task | Minor | Prepare the list of service APIs | 5h
13 | Bug | Critical | Eliminate false triggering of the analyzer | 4h

A task is of type Bug or Task.

Task priority is set on the scale Blocker, Critical, Major, Minor, Trivial. Blocker: blocks further development. Critical: a critical task that breaks the operation of some part of the system. Major: an important task necessary for further progress. Minor: a task that does not affect the remaining processes. Trivial: a task that does not affect the overall quality of the product, such as a hardly noticeable error.

Figure 1 shows the distribution by priority of the tasks set in the first sprint.

BLOCKER: To connect Speech API; To connect IVONA Voice Synthesis; To set up the requests classifier; To realize operation with audio-device

CRITICAL: To exclude inf-result in the classifier; Disable classifier ^ database; To eliminate an error 403/Forbidden in IdHTTP; To connect API sample

MAJOR: To study a possibility of connection of a neural network; To set up UI

MINOR: To prepare the API list for switching on in the project

FIGURE 1. SPRINTS PRIORITIES
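The grouping of Table 1 by priority can be reproduced with a few lines of code. The following is a minimal sketch in Python (not the authors' Delphi code); the tuple layout and the function names `hours_by_priority` and `work_order` are illustrative assumptions, while the task numbers, priorities, and planned hours are taken from Table 1.

```python
from collections import defaultdict

# Rows of Table 1 as (task number, task type, priority, planned hours).
SPRINT_A1 = [
    (1, "Task", "Blocker", 2), (2, "Task", "Blocker", 8),
    (3, "Task", "Blocker", 5), (4, "Bug", "Critical", 2),
    (5, "Task", "Critical", 5), (6, "Task", "Major", 10),
    (7, "Bug", "Critical", 2), (8, "Task", "Critical", 2),
    (9, "Task", "Critical", 1), (10, "Task", "Major", 10),
    (11, "Task", "Blocker", 3), (12, "Task", "Minor", 5),
    (13, "Bug", "Critical", 4),
]

# Priority scale from the text, highest priority first.
PRIORITY_ORDER = ["Blocker", "Critical", "Major", "Minor", "Trivial"]


def hours_by_priority(tasks):
    """Sum the planned hours for each priority level that occurs."""
    totals = defaultdict(int)
    for _number, _kind, priority, hours in tasks:
        totals[priority] += hours
    return {p: totals[p] for p in PRIORITY_ORDER if p in totals}


def work_order(tasks):
    """Task numbers sorted by priority, then by position in the table."""
    ranked = sorted(tasks, key=lambda t: (PRIORITY_ORDER.index(t[2]), t[0]))
    return [number for number, *_rest in ranked]


if __name__ == "__main__":
    print(hours_by_priority(SPRINT_A1))
    print(work_order(SPRINT_A1))
```

Sorting by the index in `PRIORITY_ORDER` keeps the scale in one place, so adding a level only means extending that list.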

2.2. Access to audio recording devices

There are several ways to record audio from the microphone in the Delphi programming environment: TCaptureDeviceManager, Bass.dll, the Windows API, and third-party components.

For recording, the Windows API is the best choice in terms of support and performance because it is native to the operating system and provides low-level control of the audio stream. This method is not suitable for playback because, owing to the way the speech synthesis service works, the Windows API handles non-standard audio formats poorly. The NewAudioComponents library is therefore a good solution for playback: it offers great flexibility, support for memory streams, and ease of use.

Figure 2 shows program fragments for part of the required audio functionality.

// Code fragment. Recording
mciSendString('OPEN NEW TYPE WAVEAUDIO ALIAS speech', nil, 0, 0);
mciSendString('SET speech TIME FORMAT MS ' +  // time format: milliseconds
  'BITSPERSAMPLE 16 ' +                       // 16 bit
  'CHANNELS 1 ' +                             // mono
  'SAMPLESPERSEC 11025 ' +                    // 11025 Hz
  'BYTESPERSEC 11025',                        // 11025 bytes/s
  nil, 0, 0);
mciSendString('RECORD speech', nil, 0, 0);
(...)
mciSendString('STOP speech', nil, 0, 0);
mciSendString(PChar('SAVE speech "' + AppPath + '/record.wav"'), nil, 0, 0);
mciSendString('CLOSE speech', nil, 0, 0);

// Code fragment. Playback
st := TMemoryStream.Create;
GetSayStream(text, 'Male', st);  // fill the stream with synthesized speech
mp3in1.Stream := st;
dxaudioout1.Run;
(...)
ssst.Free;

FIGURE 2. PROGRAM FRAGMENTS IMPLEMENTING AUDIO RECORDING AND PLAYBACK

2.3. Preparing to work with the Speech API voice analyzer

Google Speech API is an application programming interface that provides access to the speech recognition technologies created by Google (Pultz, 2014).

API users get access only to voice recognition, which they invoke, as with Yandex SpeechKit, by sending a POST request over HTTP.

At the moment Google Speech API is in alpha testing, so no definite conclusions can be drawn about this software product.

Experiments showed that Google Speech API recognizes speech from WAV files with a much higher error rate than from files in FLAC format. Because the Windows API cannot write an audio stream directly to a FLAC file, the format has to be converted. The freeware program flac.exe is used for this task. It accepts the file name and the parameters required for converting an input file.
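The conversion step can be illustrated by calling the flac command-line encoder from a script. This is a minimal sketch in Python, not the authors' Delphi code: the function names and the default executable name `flac` are assumptions, and the only options used are `-f` (overwrite an existing output file) and `-o` (set the output file name).

```python
import subprocess
from pathlib import Path


def flac_command(wav_path, flac_exe="flac"):
    """Build the encoder command line for one WAV file."""
    wav = Path(wav_path)
    out = wav.with_suffix(".flac")
    # -f overwrites an existing output file; -o names the output file.
    return [flac_exe, "-f", "-o", str(out), str(wav)]


def convert_to_flac(wav_path, flac_exe="flac"):
    """Run the encoder and return the path of the FLAC file it wrote."""
    cmd = flac_command(wav_path, flac_exe)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[3])
```

Keeping the command construction separate from the call to `subprocess.run` makes it possible to inspect the arguments without having the encoder installed.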

For network operations, both Delphi's standard Indy 10 components and third-party libraries such as Synapse are used.

Because a secure HTTPS connection is required, the libeay32.dll and ssleay32.dll libraries were added to the project. These libraries belong to OpenSSL, an open-source cryptography package that implements the SSL/TLS protocols.

Working with the speech services requires SHA256 and HMAC-SHA256 hashing. This kind of hashing creates "fingerprints" of messages of arbitrary length and is intended for information security.

OpenSSL provides a range of hashing and encoding functions, but SHA-2 was not available through the libraries used in the project. Hashing was therefore implemented with Delphi's standard System.Hash unit. Note that this unit exists only in the newest versions of the Embarcadero Delphi environment.
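The two hashing operations mentioned above can be illustrated with a short sketch. It is written in Python with the standard `hashlib` and `hmac` modules rather than Delphi's System.Hash unit; the helper names `sha256_hex` and `hmac_sha256_hex` are illustrative assumptions.

```python
import hashlib
import hmac


def sha256_hex(message: bytes) -> str:
    """Return the SHA-256 digest of a message as a hex string."""
    return hashlib.sha256(message).hexdigest()


def hmac_sha256_hex(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 of a message under a secret key, as hex."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    print(sha256_hex(b"abc"))
    print(hmac_sha256_hex(b"secret", b"abc"))
```

A plain SHA-256 digest depends only on the message, whereas the HMAC variant also depends on the key, which is what allows a server to check that a request was signed by the key holder.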

The Synapse library is used to transmit the voice file to Google Speech API because it has more flexible settings and is more stable than Indy (How to do something, 2010).

2.4. Development of user interface

The main screen of the intelligent assistant was designed in Adobe Photoshop and consists of a black background with graphic elements.

Animation was created in Adobe After Effects, which allows fine control over the behavior of each object in a frame.

The result of this work is a series of AVI video files showing the different states of the intelligent system: idle, listening, performing a background operation, notifying the user of an event, and notifying the user of an error.

Effects such as Blur, Glow, and Color Overlay were applied to the visual style. The main form of the indicator was drawn in Adobe Photoshop.

The rendered videos are displayed by means of the TMediaPlayer component, which works with MCI messages similar to those used for audio recording. A TPanel component serves as the screen; all of its borders were removed by editing its parameters.

Animation changes in the program interface are handled by loading the required video files into the TMediaPlayer component at the right time and playing them.

Modern European Researches No 2 / 2017

2.5. Analysis of spoken commands

At this stage of development, spoken commands are analyzed with naive Bayes classification (Bazhenov, 2012). Several levels of Bayesian classification and a neural network are planned for later.

In the application this method is implemented with INI files, which serve as the training base. The main request analysis function, based on Bayesian classification, is shown in Fig. 3.

function GuessBaes(s: string): string;
var
  words: TStringList;
  i, j, topindex: integer;
  r, top, tmp: real;
  variks: TStringList;
begin
  variks := TStringList.Create;
  words := TStringList.Create;
  Baes_set_ini.ReadSections(variks);  // class names are the INI section names
  result := variks[0];
  words.Delimiter := ' ';
  words.DelimitedText := AnsiLowerCase(s);  // split the request into lower-case words
  D;
  top := -9999999;
  topindex := 0;
  r := 0;
  for i := 0 to Baes_kinds.Count - 1 do
  begin
    // accumulate the log-likelihood of every word for class i
    for j := 0 to words.Count - 1 do
    begin
      tmp := ln(Wc(words[j], Baes_kinds[i]) / (V + Lc(Baes_kinds[i]) + 1));
      if tmp > -9999999 then r := r + tmp else r := r - 7;
    end;
    // add the log prior of class i
    tmp := ln(Dc(Baes_kinds[i]) / D);
    if tmp > -9999999 then r := r + tmp else r := r - 7;
    // remember the best-scoring class
    if r > top then
    begin
      top := r;
      topindex := i;
    end;
    r := 0;
  end;
  result := variks[topindex];
  words.Free;
  variks.Free;
end;

FIGURE 3. BASIC FUNCTION OF REQUEST ANALYSIS BY THE METHOD OF BAYESIAN CLASSIFICATION
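The idea behind Fig. 3 can be shown in a self-contained form. The sketch below is in Python, not the authors' Delphi, and it makes two assumptions that differ from the figure: the training data are held in memory rather than in INI files, and unseen words are handled with add-one (Laplace) smoothing instead of the fixed penalty of 7. The class name `NaiveBayes` and the training phrases are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # class -> word -> count
        self.doc_counts = Counter()              # class -> training phrases
        self.vocabulary = set()                  # all words seen in training

    def train(self, label, phrase):
        words = phrase.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def score(self, label, phrase):
        """Log prior of the class plus smoothed log likelihood of each word."""
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        vocab_size = len(self.vocabulary)
        result = math.log(self.doc_counts[label] / total_docs)
        for word in phrase.lower().split():
            count = self.word_counts[label][word]
            # Add-one smoothing keeps the argument of log() above zero.
            result += math.log((count + 1) / (total_words + vocab_size))
        return result

    def classify(self, phrase):
        """Return the class with the highest score for the phrase."""
        return max(self.doc_counts, key=lambda label: self.score(label, phrase))


# Hypothetical training base, for illustration only.
clf = NaiveBayes()
clf.train("weather", "what is the weather today")
clf.train("weather", "will it rain tomorrow")
clf.train("music", "play the next song")
clf.train("music", "turn the music up")

if __name__ == "__main__":
    print(clf.classify("is it going to rain"))
    print(clf.classify("play some music"))
```

Working with sums of logarithms rather than products of probabilities avoids numerical underflow on long requests, and smoothing means the logarithm of zero never has to be treated as a special case.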

2.6. Speech synthesis

Once the subject and details of a request have been determined, the intelligent system needs to comment on the event and report it to the user. Voice feedback is the most natural and simple form of such notification.


The Ivona system, developed by Amazon, was selected for speech synthesis. Within this research, voice synthesizers from different developers were tested: Yandex, Google, Acapella-Group, Vokalizer, RHVoice, ESpeak, Festival. Ivona produced the best-sounding voice, the closest to natural speech.

The Ivona service provides up to 50,000 voice synthesis requests a month free of charge, which completely covers the needs of the intelligent environment: more than one and a half thousand requests a day, or roughly seventy requests an hour. Interaction with the service is implemented through an open API that comes with extensive technical documentation.

Fig. 4 shows a fragment of the program source code that sends a request to the Ivona system.

// build and hash the canonical request (the query string is empty)
canonicalRequest := PChar(method + eol + canonicalUri + eol + '' + eol +
  canonicalHeaders + eol + signedHeaders + eol + hashedRequestPayload);
hashedCanonicalRequest := THashSHA2.GetHashString(canonicalRequest, SHA256);
// build the string to sign and sign it with the derived key
stringToSign := algorithm + eol + requestDate + eol + credentialScope + eol +
  hashedCanonicalRequest;
signature := DigAsString(THashSHA2.GetHMACAsBytes(stringToSign,
  GetSignatureKey(SecretKey, dateStamp, regionName, serviceName), SHA256));
authorization := algorithm + ' Credential=' + AccessKey + '/' + dateStamp + '/' +
  regionName + '/' + serviceName + '/aws4_request, SignedHeaders=' +
  signedHeaders + ', Signature=' + signature;
// prepare the HTTPS client and the request headers
idhttp1 := TIdHTTP.Create(nil);
Application.ProcessMessages;
SSLProvider := TIdSSLIOHandlerSocketOpenSSL.Create(idhttp1);
idhttp1.IOHandler := SSLProvider;
idhttp1.Request.ContentType := contenttype;
idhttp1.Request.ContentLength := Length(requestPayload);
idhttp1.Request.CustomHeaders.AddValue('X-Amz-date', requestDate);
idhttp1.Request.CustomHeaders.AddValue('Authorization', authorization);
idhttp1.Request.CustomHeaders.AddValue('x-amz-content-sha256', hashedRequestPayload);
idhttp1.Request.Host := host;
try
  RequestBody := TStringStream.Create(requestPayload, TEncoding.UTF8);
  //resultStream := TMemoryStream.Create;
  Application.ProcessMessages;
  // send the request; the synthesized audio is written to Stream
  idhttp1.Post('https://' + host + canonicalUri, RequestBody, Stream);
  Application.ProcessMessages;
  RequestBody.Free;
  //stream.CopyFrom(resultStream, resultStream.Size);
  //resultStream.Free;
  idhttp1.Request.CustomHeaders.Clear;
except
end;
end;

FIGURE 4. FRAGMENT OF THE PROGRAM SOURCE CODE THAT SENDS A REQUEST TO THE IVONA SYSTEM

As the code shows, the interaction proceeds as follows:

• requestPayload, a set of synthesis parameters in JSON format, is created;

• requestPayload is hashed with SHA256;

• the credentialScope list is created;

• the canonicalHeaders list, which holds the headers sent to the server, is created;

• the complete canonicalRequest is assembled from the variables created earlier;

• stringToSign is generated from the request; this string is signed with the secret key by means of HMAC-SHA256 hashing;

• the final authorization value for transmission to the server is built in the variable «authorization»;

• the request is sent to the server with the POST method by means of the TIdHTTP component;

• the result of the request is written to a TMemoryStream variable and passed to the main program.
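The signing steps listed above can be sketched in compact form. The example is in Python rather than Delphi and follows the structure of Fig. 4; it is an illustration under stated assumptions, not the authors' implementation. The algorithm label `AWS4-HMAC-SHA256`, the key derivation chain in `signature_key`, and all sample values in the usage lines (keys, region, service name, host, URI, payload) are assumptions made for the example. The caller is assumed to pass `canonical_headers` with each header line already terminated by a newline.

```python
import hashlib
import hmac


def _hmac(key: bytes, message: str) -> bytes:
    """HMAC-SHA256 of a text message, returned as raw bytes."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def signature_key(secret_key, date_stamp, region, service):
    """Derive the signing key by chaining HMACs over date, region, service."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def authorization_header(access_key, secret_key, method, canonical_uri,
                         canonical_headers, signed_headers, payload,
                         request_date, region, service):
    """Build the Authorization header value for one request."""
    algorithm = "AWS4-HMAC-SHA256"
    date_stamp = request_date[:8]  # YYYYMMDD part of the timestamp
    hashed_payload = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # The third element is the (empty) canonical query string.
    canonical_request = "\n".join([method, canonical_uri, "",
                                   canonical_headers, signed_headers,
                                   hashed_payload])
    credential_scope = "/".join([date_stamp, region, service, "aws4_request"])
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    string_to_sign = "\n".join([algorithm, request_date, credential_scope,
                                hashed_request])
    key = signature_key(secret_key, date_stamp, region, service)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return (f"{algorithm} Credential={access_key}/{credential_scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}")


# Hypothetical values, for illustration only.
header = authorization_header(
    access_key="AKIDEXAMPLE", secret_key="not-a-real-secret", method="POST",
    canonical_uri="/CreateSpeech",
    canonical_headers="host:tts.example.com\nx-amz-date:20170101T120000Z\n",
    signed_headers="host;x-amz-date", payload='{"Input": {"Data": "hello"}}',
    request_date="20170101T120000Z", region="eu-west-1", service="tts")
```

Because the signature covers the hashed payload, the headers, and the timestamp, changing any of them after signing invalidates the request, which is why the hashing steps must be carried out in exactly this order.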

2.7. Interaction with external APIs

As a rule, most Internet services allow interaction with their API by means of REST requests. The data the developer needs are returned as a JSON or XML object (Introduction to JSON, 2012). The developer's task is to read this object and, in particular, the specific parameters needed to implement a feature in the application.

The proposed intelligent system implements interaction with the OpenWeatherMap service. Sending a GET request with the necessary parameters (such as coordinates, a request for metric units, and the developer's unique identifier) returns a JSON object of the kind shown in Fig. 5.

{"coord":{"lon":30.32,"lat":60.08},"weather":[{"id":803,"main":"Clouds","description":"broken clouds","icon":"04n"}],"base":"stations","main":{"temp":3.53,"pressure":1000,"humidity":64,"temp_min":2.22,"temp_max":6.11},"wind":{"speed":2.66,"deg":281.5},"rain":{},"clouds":{"all":56},"dt":1465518061,"sys":{"type":3,"id":77040,"message":0.0285,"country":"RU","sunrise":1465519000,"sunset":1465586433},"id":873882,"name":"Torfyanoye","cod":200}

FIGURE 5. JSON-OBJECT

Thus, the intelligent system can be asked about such parameters as coordinates, the unique identifier of the weather condition, air temperature, pressure, humidity, minimum and maximum temperatures, wind speed and direction, and others.

The code in Fig. 6 shows how the intelligent system interacts with the weather service:

idhttp1.Get('http://api.openweathermap.org/data/2.5/weather?lat=60.07&lon=30.33&APPID=SECRET_ID_NOT_FOR_PUBLIC_LISTINGS&units=metric', stream);
stream.Position := 0;
memo1.Lines.LoadFromStream(stream);
stream.Position := 0;
// parse the response body into a JSON object
FJSONObject := TJSONObject.ParseJSONValue(stream.ReadString(stream.Size)) as TJSONObject;
// temperature, pressure and humidity from the "main" object
JsonArray := FJSONObject.GetValue('main');
if Assigned(FJSONObject) then
  ss := TJSONObject(JsonArray).Get('temp').JsonValue.Value;
ss2 := TJSONObject(JsonArray).Get('pressure').JsonValue.Value;
ss3 := TJSONObject(JsonArray).Get('humidity').JsonValue.Value;
// weather condition identifier from the "weather" array
JsonArray := FJSONObject.GetValue('weather');
if Assigned(FJSONObject) then
  //ss4 := TJSONObject(JsonArray).Get('main').JsonValue.Value;
  ss4 := caseweatherid(StrToInt(TJSONObject(TJSONObject(JsonArray).Get(0)).Get('id').JsonValue.Value));
// wind speed and direction from the "wind" object
JsonArray := FJSONObject.GetValue('wind');
if Assigned(FJSONObject) then
  //ss4 := TJSONObject(JsonArray).Get('main').JsonValue.Value;
  ss5 := TJSONObject(JsonArray).Get('speed').JsonValue.Value;
ss6 := casewinddir(StrToFloat(TJSONObject(JsonArray).Get('deg').JsonValue.Value));
// spoken response in Russian: temperature, pressure, humidity, wind, condition
result := IntToStr(Round(StrToFloat(ss))) + ' градусов по Цельсию, давление ' + ss2 +
  ' гектопаскалей, влажность ' + ss3 + ' процентов, ветер ' + ss6 + ', ' +
  IntToStr(Round(StrToFloat(ss5))) + ' метров в секунду, ' + ss4;
stream.Free;
idhttp1.Free;
end;

FIGURE 6. THE PROGRAM CODE IMPLEMENTING INTERACTION OF THE INTELLIGENT SYSTEM WITH THE WEATHER SERVICE


Analysis of the code shows that the algorithm is as follows:

• creating a TIdHTTP object;

• sending a GET request by means of IdHTTP and receiving the JSON object in a TStringStream variable;

• reading the data from the string stream into a TJSONObject;

• parsing the data from JSON;

• generating the response.

This example shows interaction with an external service through its API. The general approach to connecting other APIs is similar.
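The parsing and response-generation steps can be tried without network access by working on a stored response. The sketch below is in Python, not Delphi, and uses an abridged copy of the object from Fig. 5. The function names, the eight-point compass table, and the use of the `description` field in place of the authors' `caseweatherid` lookup are assumptions made for illustration.

```python
import json

# Abridged copy of the response shown in Fig. 5.
SAMPLE = (
    '{"coord":{"lon":30.32,"lat":60.08},'
    '"weather":[{"id":803,"main":"Clouds","description":"broken clouds","icon":"04n"}],'
    '"main":{"temp":3.53,"pressure":1000,"humidity":64,"temp_min":2.22,"temp_max":6.11},'
    '"wind":{"speed":2.66,"deg":281.5},"name":"Torfyanoye","cod":200}'
)

COMPASS = ["north", "north-east", "east", "south-east",
           "south", "south-west", "west", "north-west"]


def wind_direction(degrees):
    """Map a bearing in degrees to the nearest of eight compass points."""
    return COMPASS[int((degrees % 360) / 45 + 0.5) % 8]


def weather_report(raw):
    """Turn a raw JSON response into one sentence for speech synthesis."""
    data = json.loads(raw)
    main, wind = data["main"], data["wind"]
    return (f"{round(main['temp'])} degrees Celsius, "
            f"pressure {main['pressure']} hectopascals, "
            f"humidity {main['humidity']} percent, "
            f"wind {wind_direction(wind['deg'])}, "
            f"{round(wind['speed'])} metres per second, "
            f"{data['weather'][0]['description']}")


if __name__ == "__main__":
    print(weather_report(SAMPLE))
```

Separating the parsing from the HTTP request means the response text can be checked against a saved sample before a real API key is used.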

Results

As a result of this research, an operating logic for the intelligent computer assistant is proposed. It consists of the following steps:

• activation by hotkey,

• opening the MCI device,

• waiting for the hotkey,

• closing the MCI file,

• conversion of the file to FLAC,

• transmission of the FLAC file to the speech analysis server,

• receiving the JSON response,

• extracting data from the JSON,

• passing the data to the classifier,

• the classifier consulting its training base,

• building the Bayesian probability model,

• extracting the required result from the model,

• executing the user's request,

• voice synthesis.

Thus, the developed intelligent assistant can perceive the user's speech, analyze it, and act according to the given patterns.

Discussion

To assess the novelty of the development, we review and compare popular intelligent assistants.

4.1. SIRI

SIRI (Speech Interpretation and Recognition Interface) is a personal assistant and question-answering system developed for iOS. The application uses natural language processing to answer questions and make recommendations. SIRI adapts to each user individually, learning the user's preferences over time.

The approximate algorithm of Siri's operation is as follows (How Apple's Siri really work, 2016):

• The digitized voice is immediately stored in the device's memory.

• The stored signal is sent over the wireless network, through the provider, to cloud servers, where a set of systems analyzes it to understand the language of the request.

• At the same time, the phone evaluates the command locally. Quite possibly its execution will not require the server at all: for example, the user can ask SIRI to play a selected song on the device. In this case, the phone sends the cloud server a command to cancel processing of the signal sent earlier.

• The server evaluates the user's speech and the words that make up the request; in particular, they are compared with a statistical model.

• The server translates the user's command into a set of internal commands and returns them to the user's device.

• If recognition is judged to be correct, the phone executes the command. Otherwise, it asks the user for clarification.

SIRI's main capabilities center on the dialogue interface, context recognition, and interaction with services. SIRI's voice recognition is based on voice technologies developed by Nuance Communications.

4.2. Cortana

Cortana is a virtual voice assistant with elements of artificial intelligence from Microsoft for Windows Phone 8.1, Microsoft Band, Windows 10, and Android. Versions for iOS and Xbox One have also been announced (Cortana (voice assistant), 2016).

The Cortana personal assistant is designed to anticipate the user's needs. If desired, it can be given access to personal data such as e-mail, the address book, and web search history, all of which it uses to anticipate those needs.

Cortana replaces the standard search engine and is invoked by pressing a button. A request can be typed on the keyboard or spoken. Cortana finds the required information using search results from Bing and Foursquare and the user's personal files.


This virtual assistant is not without a sense of humour: it can hold a conversation, sing songs, and tell jokes. It reminds the user in advance of a scheduled meeting, a friend's birthday, and other important events. Cortana also reports if a flight has been cancelled or if there are traffic jams on the roads.

Its interface has very flexible privacy settings that let the user define what kind of information may be provided to the virtual assistant. According to the developers, neither SIRI nor Google Now can boast such a level of control.

4.3. Google Now

Google Now is a universal assistant from Google, integrated into Chrome and Android and intended to deliver various kinds of information to the user. It can interact with the user through voice commands (GoogleNow: what is able and as to them to use, 2016). Google Now is an analog of the Siri and Cortana voice assistants.

The application presents information based on the user's current location, personal calendar information, search history, movement history, browsing history, etc. The user can adjust the requests to his or her needs and delete unnecessary cards. According to the developers, such an interface is the most convenient for continuously updated information.

At the moment Google Now has 36 information cards: Pedometer, Birthday, Concert, Currency, News, Actions, etc.

Google Now thus covers a wide range of services and provides a large amount of information for all occasions.

4.4. Comparison of personal assistants

Based on the short reviews above, we compare the systems in a summary table. Table 2 makes it possible to see at a glance the strengths and weaknesses of the intelligent assistants.

TABLE 2. COMPARATIVE TABLE OF PERSONAL ASSISTANTS

FEATURE | SIRI | CORTANA | GOOGLE NOW
VOICE ACTUATION | + | + | +
EXISTENCE OF THE VOICE OUTPUT | + | + | -
SUPPORT OF RUSSIAN | + | - | +
GRAPHIC INTERFACE | + | + | +
OPERATION WITH GEOLOCATION | + | + | +
SUPPORT OF EXTENSIONS | - | - | +
OPERATION ON MOBILE PLATFORMS | + | + | +
OPERATION ON STATIONARY PLATFORMS | - | + | -
OFFLINE OPERATION | - | - | -
WOMEN'S/MEN'S VOICE | w | w | -
INTELLECTUAL HINTS | + | + | +
INTERACTION WITH EXTERNAL SERVICES | + | + | +

Analysis of Table 2 and the material in Section 4.1 suggests taking the SIRI and CORTANA assistants as reference points: they have the greatest number of pluses in the table. Table 2 also shows which qualities of new systems need work: support for extensions, which increases the number of users of intelligent assistants; offline operation, which speeds up actions; and, for our country, support for Russian, which is very important to develop.
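The count of pluses can be checked mechanically. The sketch below, in Python, transcribes Table 2 and tallies the marks per assistant; the function name `count_marks` is illustrative, and treating the "w" entry in the voice row as a positive mark is an assumption about how the authors counted.

```python
ASSISTANTS = ("SIRI", "CORTANA", "GOOGLE NOW")

# Marks from Table 2, in the column order of ASSISTANTS.
TABLE_2 = {
    "Voice actuation": ("+", "+", "+"),
    "Voice output": ("+", "+", "-"),
    "Support of Russian": ("+", "-", "+"),
    "Graphic interface": ("+", "+", "+"),
    "Operation with geolocation": ("+", "+", "+"),
    "Support of extensions": ("-", "-", "+"),
    "Operation on mobile platforms": ("+", "+", "+"),
    "Operation on stationary platforms": ("-", "+", "-"),
    "Offline operation": ("-", "-", "-"),
    "Women's/men's voice": ("w", "w", "-"),
    "Intellectual hints": ("+", "+", "+"),
    "Interaction with external services": ("+", "+", "+"),
}


def count_marks(table, positive=("+",)):
    """Count, per assistant, the rows whose mark is in `positive`."""
    totals = dict.fromkeys(ASSISTANTS, 0)
    for marks in table.values():
        for name, mark in zip(ASSISTANTS, marks):
            if mark in positive:
                totals[name] += 1
    return totals


if __name__ == "__main__":
    print(count_marks(TABLE_2))                        # "+" only
    print(count_marks(TABLE_2, positive=("+", "w")))   # "w" counted as a plus
```

Counting only "+" gives each assistant 8 marks; counting "w" as well gives SIRI and CORTANA 9 each and GOOGLE NOW 8, which is the reading under which the first two lead.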

Conclusion

The choice among known basic solutions made it possible to propose a structure for the hardware and software of the intelligent assistant. The assistant is intended to perceive the user's speech, analyze it, and, on the basis of the connected databases, act according to the given patterns. In addition, the assistant can synthesize speech and announce the operations it performs.

The novelty of the development presented in this article, in comparison with similar intelligent systems, lies in its support for Russian, support for PC-compatible platforms, extensibility, and emphasis on interaction with Internet services and the operating system. The result is the image of a kind of "butler". This distinguishes the assistant from similar systems, which emphasize operation as a reference system.

Plans for further development and support of the application include: connecting higher-level intelligent analyzers, connecting third-party API services, monitoring user activity, implementing intelligent advice, interaction between different services, lower-level system management, implementing plug-in modules, and maintaining the application according to a spiral software life cycle model.

Recommendations

At the annual International Consumer Electronics Show (CES), a trend toward presenting products connected in one way or another with artificial intelligence is noticeable: programs that control the smart home, autonomous cars, virtual reality, interaction with smart electronics, and personal assistants.

This research will be useful to developers of intelligent systems, namely mobile software agents with voice activation. The development may also interest a wide range of specialists working in robotics and intelligent systems.

REFERENCES

Artificial intelligence (2016). In Wikipedia, the free encyclopedia. https://ru.wikipedia.org/wiki/Искусственный_интеллект

Bazhenov, D. (2012). Naive Bayes classifier. http://bazhenov.me/blog/2012/06/11/naive-bayes.html

Cortana (voice assistant) (2016). In Wikipedia, the free encyclopedia. https://ru.wikipedia.org/wiki/Кортана_(голосовая_помощница)

Drozdova, E. N. & Shavkunov, V. S. (2016). Development of the voice intellectual assistant for control of OS Windows and interaction with Internet services. Messenger of young scientists of SPGUTD, 2 (40-43), 2312-2048.

GoogleNow: what is able and as to them to use (2016). http://chezasite.com/android/google-now-faq-87818.html

How Apple's Siri really work (2016). http://www.zdnet.com/article/how-apples-siri-really-works

How to do something (2010). http://synapse.ararat.cz/doku.php/public:howto

Introduction to JSON (2012). http://www.json.org/json-ru.html

Pultz, M. (2014). Accessing Google Speech API. http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11

Sahota, M. (2012). An Agile Adoption And Transformation Survival Guide. InfoQ.
