On the solution of the problem of optimal placement of sound objects in n-dimensional timbral spaces. Research article, specialty: Mathematics

CC BY
Keywords
SONIFICATION / TIMBRAL SPACE / SPHERE PACKING PROBLEM / HIGHER-DIMENSIONAL SPACES / OPTIMIZATION

Abstract of the research article in mathematics. Author: Gleb Gendrikhovich Rogozinsky

Monitoring methods based on a non-speech sound representation of various data, i.e. sonification methods, are becoming relevant under the Industry 4.0 platform, which is characterized by a potentially significant level of information overload. A comprehensive approach to industrial sonification systems requires designing thesauri of sound objects in the corresponding timbral spaces, as well as a method for the optimal placement of these objects. The article considers a method based on the mathematical sphere packing problem and reduces purely sonification-related problems to the solution of this problem proposed in recent years. The possibility of such optimization makes it possible to build a theoretical basis for methods of developing message thesauri in industrial sound design and sonification systems.


ON SOLUTION OF OPTIMAL SOUND OBJECTS PLACEMENT PROBLEM IN N-DIMENSIONAL TIMBRAL SPACES

DOI 10.24411/2072-8735-2018-10087

Gleb G. Rogozinsky,

The Bonch-Bruevich St.Petersburg State University of Telecommunications, St.Petersburg, Russia, gleb.rogozinsky@gmail.com

Keywords: sonification, timbral space, sphere packing problem, higher-dimensional spaces, optimization

Industry 4.0 started to change existing technological paradigms even before its expected arrival. Over the last several years, we have produced more data than humanity had accumulated in its entire previous history. The Internet of Things, Ubiquitous Sensor Networks and Big Data form the backbone of the Cyber-Physical Systems paradigm, which will power Industry 4.0. Meanwhile, in such a futuristic environment we still depend heavily on a human operator inside complex structures. We are still not able to delegate critical operating functions to the machine world. This implies the necessity of developing new ways of human-oriented monitoring of complex systems, i.e. augmenting human abilities in the face of growing data and global complexity [1].

Monitoring methods based on a non-speech sound representation of various data, called sonification, are becoming relevant in the Industry 4.0 world, which is characterized by potentially significant informational overload. A comprehensive approach to industrial sonification systems demands developing thesauri of sound objects in the corresponding timbral space, as well as a method of optimal placement of such objects. The paper reviews an approach based on the Sphere Packing problem and reduces the sonification-related tasks to the solutions of that problem proposed in recent years. The possibility of applying such optimization provides a theoretical basis for thesauri design methods in the field of industrial sound design and sonification systems.

Information about author:

Gleb G. Rogozinsky, PhD, Deputy Head of Medialabs, The Bonch-Bruevich St.Petersburg State University of Telecommunications, St.Petersburg, Russia; Senior Researcher, Solomenko Institute of Transport Problems of the Russian Academy of Sciences, St.Petersburg, Russia

For citation:

Rogozinsky G.G. (2018). On solution of optimal sound objects placement problem in n-dimensional timbral spaces. T-Comm: Telecommunications and Transport, vol. 12, no. 5, pp. 54-58.

Introduction

In addition, the newly acquired ability to log literally anything, at any time and anywhere, i.e. the output and status of any process, creates new demands and problems for modern monitoring and control systems. The growing volume of information inspires new developments in human-machine interface design, since the most common and straightforward solutions, typically based on visual representation of the received data, turn out to be ineffective in cases of simultaneous visualization of thousands of objects. Attempts to visualize Big Data, such as representing complex networks and the data they contain as graphs, are useful in some ways, but reveal the main drawback of that class of methods: the use of the visual sensory system only. The human eye is able to distinguish about ten million colors, but it cannot keep many parameters in focus at the same time. The architecture of a visual presentation of a complex system is not always able to choose the optimal field of view correctly, or requires special additional development (revision, additional fixing, tuning). Software modeling tools can excessively modify the initial data array when rendering it.

The specific nature of monitoring complex systems raises the issue of developing fundamentally new interfaces for displaying big data. Considering the auditory interface along with the visual one seems self-evident, but is also supported by practice and theory. In other words, under conditions of informational overload we should research alternative ways of representing and understanding the constantly increasing volume of data.

One of the solutions for dealing with ever-growing data is the development of multimodal interfaces. Living in a world where visual displays dominate, we forget about the possibilities of the other sensory systems our body features. Research on auditory, tactile and olfactory representations is by no means novel, though applying such approaches to future technologies can give us a new edge in data control. Hearing is the second modality after vision in the amount of information it can provide. Hearing has its own peculiar features, which can be used effectively for understanding data.

Among the different ways of using hearing to obtain information, various sonification methods (representation of data with non-speech sound) [2] are the most popular and promising. The auditory representation of data has a long history [3], including altimeters and Geiger counters, although most systems of this kind use thesauri with a rather small number of sounds. Developing a sonification system for modern informational monitoring systems first demands a methodology based on system analysis, sound design, acoustics and data mining.

N-dimensional Timbral Spaces for Sonification Purposes

For the purposes of the theory of sonification, we intentionally use two different notions: the sound space and the timbral space.

We consider several types of sound spaces [4], among which the most popular are the generic Amplitude-Frequency-Time (AFT) space, the Pierre Schaeffer space (a subjectivized AFT space) [5], and the Dinov-Gibson space [6], [7], which is completely subjectivized.

The sound spaces, as we understand them, describe sound objects in known terms and together can theoretically describe any existing sound object. Meanwhile, for the purposes of multi-parametric sonification we should single out the exact parameters that define a given sound object. These parameters are defined in accordance with the sound synthesis algorithm used to synthesize the sound object.

Therefore, the dimensionality of a timbral space relates directly to the sound synthesis algorithm. The timbral spaces we refer to are discrete and finite-dimensional.

Compared to the 'classical' timbral spaces [8], [9], which are generally used for the timbres of acoustic instruments only, the timbral spaces we refer to in our research are intended for computer sound synthesis.

On the one hand, in such a situation we have to deal with a fairly large number of parameters; on the other hand, this simplifies the problem of formalizing timbre description, since the timbre is initially formalized as the totality of parameters of a given sound synthesis algorithm. This allows a timbral space to be developed from the array of input parameters of the sound synthesis and processing algorithm. Thus, a sound object corresponds to a point in a given timbral space.

From the standpoint of system analysis, we can distinguish two approaches to timbral space design. The first leads us to generalize the best-known sound synthesis algorithms and to use them as a basis for a generalized universal timbral space.

The evident positive aspect is the universality of such a timbral space and the theoretical possibility of applying it with various algorithms of sound synthesis and processing.

At the same time, we should consider the vast variety of sound synthesis algorithms. Such algorithms, at least from the theoretical point of view, are limited only by hardware, so a great variety of sound synthesis and sound processing methods exists. Therefore, we cannot expect to have a unified thesaurus in terms of which any sound object can be described.

An alternative approach to timbral space design is based on the sound synthesis algorithm itself. Compared to existing timbral spaces, designed for acoustic musical instruments with limited variability of timbral characteristics, the timbral spaces of computer sound synthesis and electronic/computer music can vary greatly. At the same time, assuming that the sound synthesis algorithm is known in advance, we possess a complete picture of the parameter set used to form the timbre under consideration. Considering that, we can conclude that timbral spaces designed in this way give a complete and accurate characterization of the derivable timbres.

On the other hand, each new sound synthesis algorithm will span a new timbral space, or its own new thesaurus, which limits this approach. However, the solution we introduce here is to form a set of protospaces, which are metasets for families of closely related thesauri.

Besides, given sufficient timbral variability within a given timbral space, it is expedient to use the same pre-selected timbral space to obtain various timbral groups of sounds.

Now we define some characteristics of timbral spaces, which allow us to establish some important relations for a better understanding of the sound.

Some Descriptors in Timbral Spaces

Within the 'classical' study of computer sound synthesis, the process of forming any timbre can be divided into the following generalized stages: generation of the core sound set, manipulation of the set to form the main timbre, and ancillary processing. For example, in the typical subtractive synthesis model, the first stage includes oscillators (mostly sawtooth and square waves); the second stage includes mixing, various modulations, and a filter section. The final stage normally includes various effects (FX), i.e. delays, reverbs, choruses, flangers, phasers, etc.

Typically, the sound effects can be regarded as a by-product of sound synthesis. Therefore, dissecting the timbral thesaurus into its components, we have

$T = T^{C} \cup T^{M} \cup T^{FX},$

where $T^{C}$ is the timbral core thesaurus, $T^{M}$ is the thesaurus of timbral element manipulation, and $T^{FX}$ is the FX thesaurus.

So we can define the level of similarity between one timbral space and another, in order to reduce and/or generalize the description.

The number of parameters of a timbral space defines the Timbral Space Dimensionality. The complete number of dimensions (the total number of parameters) can be significantly higher than the actual number of parameters used to create a given timbre with the corresponding sound synthesis algorithm. Therefore, we need to introduce the notion of Actual Timbral Space Dimensionality, defined from the parameters of the sound synthesis algorithm that are actually used.

By the critical step of dimension $i$ we mean the minimum possible increment of the value of the corresponding parameter along the axis $x_i$ of the N-dimensional timbral space $E^N$. Since we are limited by the framework of the sound synthesis field, we can state that the step is finite in each of the N dimensions. Moreover, various interfaces of sound synthesizers impose even more significant limits on the step resolution. In particular, in some cases this is due to the necessity of compatibility with the MIDI 1.0 standard, which defines a typical parameter resolution of $1/2^{7}$ (for 7-bit MIDI controllers), and $1/2^{14}$ in some less commonly used cases (Pitch Wheel Change messages being the exception).
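The relation between controller bit depth and the critical step can be illustrated with a minimal Python sketch. It assumes parameters normalized to the interval [0, 1]; the function names `critical_step` and `quantize` are hypothetical and are not part of the MIDI specification or of the original paper.

```python
from fractions import Fraction


def critical_step(bits: int) -> Fraction:
    """Smallest normalized increment for a controller with `bits` bits of resolution."""
    return Fraction(1, 2 ** bits)


def quantize(value: float, bits: int) -> int:
    """Map a normalized value in [0, 1] to an integer controller code.

    Assumption for illustration: the range [0, 1] is split into 2**bits
    equal cells, and the value 1.0 is clamped into the last cell.
    """
    levels = 2 ** bits
    code = int(value * levels)
    return max(0, min(levels - 1, code))


step_7bit = critical_step(7)    # 7-bit controller
step_14bit = critical_step(14)  # 14-bit message

print(step_7bit, step_14bit)
print(quantize(0.5, 7), quantize(1.0, 7))
```

With 7 bits a parameter axis has 128 distinguishable positions, while a 14-bit message gives 16384, so the same normalized axis contributes very differently to the size of the timbral space.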

Together with the step resolution along an axis, we should also consider the sensitivity resolution, or the sensitivity step. The forming of timbre in a multi-parametric medium is a complex and non-linear process, with a non-zero probability of situations in which more than one resolution step must be taken to produce the minimum perceptible change. This can happen due to the sound synthesis algorithm itself, as well as because of psychoacoustic factors.

The sensitivity resolution in general is a non-linear parameter that depends on the combination of other parameters. For example, at a relatively low low-pass (LP) filter cutoff, the higher components of the sound are weak and blurred (if audible at all), and our ear is not able to sense their changes distinctly. At the same time, if the LP filter is fully opened or bypassed, the same changes in the higher components of the sound spectrum can evoke a far more perceptible sensation.

Another important characteristic of a timbral space is its weight. In the general case, we define the weight $W$ of the timbral space $E^N$ as

$W_{E^N} = \prod_{i=1}^{N} A_i,$

where $A_i$ is the number of critical steps along dimension $i$ of the N-dimensional timbral space $E^N$.
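As a worked illustration of the weight as a product of per-dimension step counts, the following Python sketch computes it for a hypothetical timbral space. The function name `timbral_space_weight` and the example step counts are assumptions made for illustration only.

```python
from math import prod


def timbral_space_weight(steps_per_dimension):
    """Weight of a timbral space: the product of the numbers of critical
    steps along each dimension (one entry per dimension)."""
    if any(a < 1 for a in steps_per_dimension):
        raise ValueError("each dimension needs at least one step")
    return prod(steps_per_dimension)


# Hypothetical 4-parameter synthesis algorithm, each parameter driven
# by a 7-bit controller, i.e. 128 critical steps per axis.
weight = timbral_space_weight([128, 128, 128, 128])
print(weight)
```

Even this small hypothetical space contains 2**28 distinct points, which shows why the weight grows quickly with dimensionality.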

In practice, the weight of a timbral space should be defined from the sensitivity step, though this complicates the weight calculation because the steps are spread non-linearly: the number $a_i$ of sensitivity steps along dimension $i$ of the N-dimensional timbral space $E^N$ depends on the current point position inside $E^N$.

Some Constructions Over the N-dimensional Timbral Space $E^N$

In spite of the fact that any timbre is formally a point in the N-dimensional timbral space $E^N$, from the practical point of view, taking into account the critical step and the minimal sensitivity step, it is more expedient to refer to some neighborhood $S$ around the point $x \in E^N$. Within this neighborhood, we can consider the timbral sensitivity of human hearing to be the same to within $\varepsilon$.

Therefore, by the notion of 'timbre' we should mean the neighborhood $S$, and by one realization of the timbre, a point $x$ from that neighborhood $S$.

In the practical study of sound synthesis, strictly static sound objects are not as typical as dynamic ones. The latter are simply more interesting and less boring to the listener. In the process of forming dynamic sound objects, various sound synthesis parameters are affected by multiple non-periodic and periodic modulations.

Thus, in the more common case a sound object, as an element of a timbral space, should be described as a trajectory inside the given timbral space $E^N$.

The proposed sound and timbral spaces provide a description, or a set of descriptors, for sound objects, which can be used for sonification with many parameters. The totality of the given constructions in the timbral space $E^N$ shows the necessity of introducing a special metric to provide separation of elements. On a higher level, this is an important aspect of mapping cyber-physical models onto sound and timbral spaces for industrial sonification purposes.

The Problem of Optimal Placement of Timbre Objects

Consider the N-dimensional timbral space, which includes a subdomain $\Omega$. We assume that this subdomain is the subject of analysis. In addition, we assume that every element of this subdomain can be sensed, i.e. heard, by a human as a sound object. An artificial neural network can perform a cluster analysis of this subdomain $\Omega$ and group its elements according to the known sonification thesauri, resembling the corresponding human sensing and classification with a given accuracy.


An artificial neural network of arbitrary complexity can always be substituted by an equivalent 3-layer network. Such a network divides the given subdomain into polygons: the first layer divides the parametric space into half-spaces by cutting planes; the second layer forms convex polygons bounded by those cutting planes; the third layer builds polygonal groups, not necessarily convex, each of which corresponds to one output reaction of the network.
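The three-layer construction can be sketched in a few lines of Python. This is an illustrative model with hard-threshold units in two dimensions, not the network used by the author; the names `half_space`, `convex_cell`, `polygon_group` and `box` are hypothetical.

```python
def half_space(weights, bias):
    """First layer: indicator of the half-space weights . x + bias >= 0."""
    def indicator(point):
        return sum(w * x for w, x in zip(weights, point)) + bias >= 0
    return indicator


def convex_cell(half_spaces):
    """Second layer: intersection (logical AND) of half-spaces, a convex polygon."""
    def indicator(point):
        return all(h(point) for h in half_spaces)
    return indicator


def polygon_group(cells):
    """Third layer: union (logical OR) of convex cells, not necessarily convex."""
    def indicator(point):
        return any(c(point) for c in cells)
    return indicator


def box(x_min, x_max, y_min, y_max):
    """Axis-aligned rectangle built from four cutting lines."""
    return convex_cell([
        half_space((1, 0), -x_min),   # x >= x_min
        half_space((-1, 0), x_max),   # x <= x_max
        half_space((0, 1), -y_min),   # y >= y_min
        half_space((0, -1), y_max),   # y <= y_max
    ])


# An L-shaped (non-convex) group made of two convex cells.
group = polygon_group([box(0, 1, 0, 1), box(0, 3, 1, 2)])

print(group((0.5, 0.2)), group((2.8, 1.2)), group((1.65, 0.7)))
```

The two end points belong to the group while their midpoint does not, which shows that the third layer can produce non-convex regions from convex cells.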

For the sake of simplicity, we will assume that any polygonal group consists of a single polygon. The neural network has a maximum resolution of one group, so sound objects within one polygon are indistinguishable. In other words, within each polygon the reaction of the neural network remains the same.

Thus, the maximum possible number of sound objects is defined by the number of polygons allocated within the subdomain $\Omega$. It is natural to assume that the diameters of the polygons are small.

We now introduce a natural metric within the subdomain $\Omega$, which expresses the measure of dissimilarity between sound objects. Let the human recipient be informed through the auditory (sonification) channel with a thesaurus of length $K$, i.e. with $K$ different sound objects. We assume that $K$ is far below the limit of recognition, i.e. $K$ is much less than the total number of polygons. For the recipient's comfort and for the reliability of his or her decisions, the sound objects should be as dissimilar to each other as possible. With the natural metric, this problem reduces to finding a set of $M$ points in the timbral space such that the minimum distance between pairs of points is as large as possible. Next we give two important definitions.

Def 1. A path $\gamma$ between points $A$ and $B$ is a sequence of polygons $\Delta_1, \dots, \Delta_L$ such that $A \in \Delta_1$, $B \in \Delta_L$, and for any $i = 1, \dots, L-1$ the polygons $\Delta_i$ and $\Delta_{i+1}$ have a common edge. Then $L$ is the length of the path $\gamma$.

Def 2. The distance between points $A$ and $B$ of the subdomain $\Omega$ is the number

$\rho_0(A, B) = \min_{\gamma} L(\gamma),$

i.e. the minimum length among all possible paths between $A$ and $B$.

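The function $\rho_0$ is a half-metric on $\Omega$, since

$\rho_0(X, Y) = 0, \quad X, Y \in \Delta_m,$

i.e. two distinct points lying in the same polygon $\Delta_m$ are at zero distance.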
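The polygon-path distance can be computed by a breadth-first search over the adjacency graph of polygons. The sketch below is an illustration under stated assumptions: polygons are represented by integer labels, adjacency means sharing an edge, and the distance is counted as the number of edge crossings, so that two points in the same polygon are at distance zero. The function name `path_distance` and the example grid are hypothetical.

```python
from collections import deque


def path_distance(adjacency, start_cell, goal_cell):
    """Minimum number of shared-edge crossings between two cells,
    found by breadth-first search. Returns None if no path exists."""
    if start_cell == goal_cell:
        return 0
    seen = {start_cell}
    queue = deque([(start_cell, 0)])
    while queue:
        cell, dist = queue.popleft()
        for neighbour in adjacency.get(cell, ()):
            if neighbour == goal_cell:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical 2 x 3 grid of cells:
#   0 1 2
#   3 4 5
adjacency = {
    0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
    3: [0, 4], 4: [1, 3, 5], 5: [2, 4],
}

print(path_distance(adjacency, 0, 5))
```

Because breadth-first search explores cells in order of increasing number of crossings, the first time the goal cell is reached corresponds to a shortest path.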

To approximate the function $\rho_0$, we choose a full metric $\rho$ such that

$\rho(X, Y) = \rho_0(X, Y) + O(l),$

where $l$ is the maximum diameter of a polygon in the subdomain $\Omega$. Since the diameter of $\Delta_m$ tends to zero, we can assume that in practice the half-metric $\rho_0$ is indistinguishable from the full metric $\rho$. We take this metric as the model for solving the problem of choosing the maximum number of objects within the subdomain $\Omega$ of the timbral space $E^N$.

Then the problem of choosing sound objects with maximum dissimilarity reduces to choosing a set of points $O_1, \dots, O_M$ such that the function $\min_{i \neq j} \rho(O_i, O_j)$ is at its maximum.
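To make the max-min criterion concrete, the following Python sketch selects points from a finite candidate set with a greedy farthest-point heuristic. This is a swapped-in illustrative technique, not the method of the paper, and it does not guarantee an optimal solution in general. A Euclidean metric is assumed, and the names `greedy_dispersion` and `min_pairwise_distance` are hypothetical.

```python
from math import dist


def min_pairwise_distance(points):
    """Smallest distance between any two distinct points of the set."""
    return min(dist(p, q) for i, p in enumerate(points) for q in points[i + 1:])


def greedy_dispersion(candidates, m):
    """Pick m candidates, each time adding the one farthest from
    the points already chosen (heuristic for the max-min problem)."""
    if not 1 <= m <= len(candidates):
        raise ValueError("m must be between 1 and the number of candidates")
    chosen = [candidates[0]]
    while len(chosen) < m:
        best = max(
            (p for p in candidates if p not in chosen),
            key=lambda p: min(dist(p, q) for q in chosen),
        )
        chosen.append(best)
    return chosen


# Hypothetical 2-dimensional timbral subdomain sampled on a 5 x 5 grid.
candidates = [(x, y) for x in range(5) for y in range(5)]
objects = greedy_dispersion(candidates, 4)

print(objects, min_pairwise_distance(objects))
```

On this grid the heuristic ends up selecting the four corners, which are the most mutually distant candidates available.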

Suppose there exists a smooth change of variables from $x_1, \dots, x_N$ to $y_1, \dots, y_N$ on the subdomain $\Omega$ such that in the new coordinates $Y$ the metric $\rho$ becomes Euclidean. Then, in the $Y$ coordinates, the problem of choosing the maximum number of sound objects within the subdomain $\Omega$ of the timbral space $E^N$ becomes equivalent to the problem of packing $M$ balls of the maximum possible radius $\delta$ into $Y(\Omega)$. The centers of the $M$ balls correspond to objects of the timbral space positioned at pairwise distances of $2\delta$ or greater.

Therefore, for the optimal positioning of the desired sound objects we can rely on the results of Maryna Viazovska [10] obtained for the problem of hypersphere packing in higher-dimensional Euclidean spaces. In particular, Viazovska solved the packing problem in dimension 8 [10] and, with co-authors, in dimension 24 [11]. For these dimensions, the highest possible packing density is known.
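For reference, the optimal densities established in [10] and [11] have simple closed forms: $\pi^4/384$ in dimension 8 and $\pi^{12}/12!$ in dimension 24. The short Python sketch below evaluates them numerically; the function names are hypothetical.

```python
from math import factorial, pi


def e8_packing_density() -> float:
    """Optimal sphere packing density in dimension 8: pi**4 / 384."""
    return pi ** 4 / 384


def leech_packing_density() -> float:
    """Optimal sphere packing density in dimension 24: pi**12 / 12!."""
    return pi ** 12 / factorial(12)


print(round(e8_packing_density(), 4))
print(round(leech_packing_density(), 6))
```

The densities are roughly 0.25 and 0.002, which indicates how small a fraction of a high-dimensional timbral subdomain the balls around well-separated sound objects can cover even in the optimal arrangement.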

Thus, we have developed a formal approach to finding the densest positioning of sound objects in the timbral space at a given distance between them, such that it provides separation of the objects according to the recipient's thesaurus, i.e. the set of sound objects in the sonification system being developed on the basis of an N-dimensional timbral space.

Conclusion

The paper gives a solution to the problem of optimal placement of timbre objects in N-dimensional timbral spaces. We have shown that this problem is equivalent to the known problem of sphere packing in a higher-dimensional space. The existence of a solution to that problem provides a way to place a given number of timbral objects optimally in an N-dimensional timbral space and to construct the corresponding sonification thesauri.

Acknowledgments

I thank Dmitry Korikov, PhD, for assistance with the mathematical issues and for the comments that greatly improved the manuscript.

I would also like to express my gratitude to Prof. Dr. Alexander D. Sotnikov of The Bonch-Bruevich Saint-Petersburg State University of Telecommunications for sharing his pearls of wisdom with me during the course of this research.

References

1. Rogozinsky G.G., Sotnikov A.D. (2018). Principles of cyber-physical models mapping onto sonification sound spaces. 2018 Systems of Signals Generating and Processing in the Field of on Board Communications, March 2018.

2. Hermann T., Hunt A., Neuhoff J. (2011). The Sonification Handbook. Logos Verlag Berlin.

3. Worrall D. (2009). An Introduction to Data Sonification. The Oxford Handbook of Computer Music. Ed. Roger T. Dean. 624 p.

4. Rogozinsky G.G. (2018). On three classes of sound spaces for sonification systems design. T-Comm. Moscow, vol. 1, pp. 59-64.

5. Schaeffer P. (2012). In Search of a Concrete Music. CA: UC Press.

6. Gibson D. (2005). The Art of Mixing: A Visual Guide to Recording, Engineering and Production. Boston: Artistpro.

7. Dinov V.G. (2007). Zvukovaya kartina. Zapiski o zvukorejissure (The Picture of Sound. Sound Design Writings). St. Petersburg: Gelikon Press. (in Russian)

8. Aldoshina I.A., Pritts R. (2006). Muzikal'naya akustika (Musical acoustics). 3.9. Timbre. St. Petersburg: Kompozitor. (in Russian)

9. Wessel D. (1979). Timbre Space as a Musical Control Structure. Computer Music Journal.

10. Viazovska M. (2016). The sphere packing problem in dimension 8. arXiv:1603.04246.

11. Cohn H., Kumar A., Miller S.D., Radchenko D., Viazovska M. (2016). The sphere packing problem in dimension 24. arXiv:1603.06518.
