Development of mathematical models and methods of task distribution in distributed computing system

Scientific article in the specialty "Computer and Information Sciences"

CC BY

Abstract of the scientific article in computer and information sciences. Authors: Matteo Gaeta, Michael Konovalov, Sergey Shorgin

The article deals with certain aspects of Grid and Grid modeling. Grid is a distributed software-hardware environment based on a fundamentally new structure of computation and job flow management. To analyze the problems related to the logic of user-resource interaction, a general model scheme has been developed. Within that scheme the authors consider models that allow concrete mathematical problems to be formulated. Ways of solving these problems are discussed as well.


DEVELOPMENT OF MATHEMATICAL MODELS AND METHODS OF TASK DISTRIBUTION IN DISTRIBUTED COMPUTING SYSTEM¹

Matteo Gaeta

Department of Information Engineering and Applied Mathematics of the University of Salerno, Italy

Michael Konovalov

Institute of Informatics Problems, Russian Academy of Sciences, Moscow, Russia

Sergey Shorgin

Institute of Informatics Problems, Russian Academy of Sciences, Moscow, Russia


1. Grid concept and review

Today the world's scientific community regards Grid technologies as the most promising computing model for making use of geographically distributed resources. Grid is a software-hardware environment that provides reliable, stable, pervasive and inexpensive access to high-performance computing resources [1]. It is a distributed software-hardware environment with a fundamentally new organization of computing and of knowledge/data flow management. Grid concepts have given rise to a new model for organizing different forms of data processing (computing) by offering technologies for remote access to resources of different types regardless of their location within the global network environment. Hence, with Grid technology it has become possible to execute software units on one or several computers simultaneously; data stores with structured (databases) and unstructured (files) information, data sources (sensors, instruments, observations) and program-driven devices are being made accessible everywhere. The name Grid is an analogy with the electric power network (power grid), which provides universal access to electric energy [2].

The purpose of creating the Grid was to integrate a number of spatially distributed resources so that a wide class of applications could be run on any aggregated combination of these resources regardless of their location.

A Grid implementation is an infrastructure consisting of resources located in different places, telecommunication networks connecting these resources (networking resources), and software that is consistent across the whole infrastructure (middleware), which supports remote operations and performs control and management functions over the operational environment. Grid is created to bring the resources into general use. The resource owners and the users, acting in conformity with definite rules for providing and using the resources, form a virtual organization.

¹ This work was supported by the Russian Foundation for Basic Research, grant 05-07-90103.

Grid is a collective computing environment in which each resource has its owner, and access to resources is granted in a time- and space-shared mode to numerous members of the virtual organization. The virtual organization can be created dynamically and has a limited lifetime.

Thus, following [3], one can define Grid as a spatially distributed operating environment with flexible, secure and well-coordinated resource sharing for running applications in dynamically created virtual organizations.

The ideas of the Grid were brought together by the scientific community. At the initial stage, research and development of Grid focused on supporting high-throughput scientific and engineering computing tasks. As a result, a number of protocols were proposed. These included communication and secure authorization protocols as well as [...] protocol for accessing the remote computing, file and information resources. The set of protocols is sufficient for running and controlling tasks as well as for delivering input and output files. The protocols were implemented in the Globus Toolkit [4]. The Globus Toolkit (GT) and a number of software products developed on its basis have formed the software component of several large Grids, including DataGrid [5] and GriPhyN [6]. Applications for processing the results of experiments in nuclear physics were installed on these Grids. Integration of distributed resources became of utmost importance and proved extremely useful for processing large data volumes and solving massive computational problems.

The main provisions of the Grid software architecture standard proposed in [7] (OGSA, Open Grid Services Architecture) follow the object-oriented model and treat the Grid service as the key object. By remotely calling the methods of a service, the application receives a definite type of servicing. In this way, different functions, such as access to computing and storage resources, to databases and to any data-processing software, are unified.

The Grid-services architecture solves the interoperability problem of a distributed environment by standardizing the way service interfaces are described. In this regard OGSA is based on the Web-services standards [8].

Beginning with version 2.0, Globus Toolkit became a de facto Grid standard accepted by the scientific community as well as by the leading players of the IT industry [9]. Because GT has had the status of open software from the very beginning, considerable experience of its application in large-scale projects has been accumulated by now. Using the GT tools, different groups of specialists have developed additional services for file replication, authentication, task management, etc. In 2003 the first version of the toolkit based on the OGSA architecture, namely Globus Toolkit 3.0, was launched.

Grid implementation and development require solving a number of problems that cannot be solved without mathematical methods. The following directions of investigation can be distinguished in this field:

• Formalization of the construction of the Grid structure as a whole;

• Planning of flows in the network graph in order to grant a user both terminal and network resources;

• Forecasting of the situation (congestion of resources, task execution times, etc.);

• Formalization of the processes of resource search and granting, with the purpose of developing appropriate protocols;

• Adaptive resource management.

2. Modeling of tasks distribution in distributed resources network

In this article we deal with the last direction and consider certain aspects of the operation of systems consisting of a pre-assigned number of distributed computing resources and separate users, remote from the resources, who access them in order to accomplish newly arising tasks.

From the problem of carrying out complex computing on distributed resources, let us extract the key factors concerning the logic of organizing "user-resource" interaction.

We do not touch on the issues of software, technical or any other support of such interaction. Principal attention is given to the preparation (parallelization) of the task for servicing and to the selection of concrete resources for carrying out the computing. To analyze these problems, a general model scheme should be worked out. Within the scheme, different models allowing the formulation of concrete mathematical problems can be considered. Ways of solving these problems can be discussed as well.

2.1. Conceptual model development

The main purpose of the model is to provide a general logical scheme that describes the key directions within the problem of collective use of distributed computing resources, such as the analysis and optimization of the processes of task distribution and accomplishment.

The model must be designed so that the main problems that are amenable to mathematical formalization can be stated within it: calculating the characteristics of "user-resource" interaction, scheduling these processes and managing them in real time. The model must take into account the following factors:

- The system being modeled consists of the following components: 1) computing resources; 2) resource users; 3) a telecommunication network for exchanging information between resources and users.

- Resources are understood as any technical means and facilities capable of providing the users with processor time and access to main storage and read-only memory, software systems and databases. Depending on the situation, the role of a resource can be played by different objects, from a PC to powerful geographically distributed computing complexes.

- Resource users can be arbitrary entities, from individuals to intergovernmental organizations.

- The connecting telecommunication network can differ depending on the situation; it may be one or several local or global networks whose use makes interaction between the participants of the system under consideration possible.

- None of the system components is static; their characteristics change over periods shorter than the characteristic modeling time.

- All the participants of the system under consideration are independent of each other, and their activities are not necessarily confined to performing certain functions within the system under consideration.

- The main system participants can form sub-systems. This is a dynamic aggregation that can reflect physical peculiarities of the objects under consideration as well as peculiarities of a "virtual" nature.

- The system as a whole, its main participants (the resources and the users) and the sub-systems built from separate components all exhibit purposeful behavior. The purpose of the system's functioning, in the broadest sense, is the fullest and most advantageous utilization of resources for the maximal satisfaction of users' queries.

- The above-mentioned factors must be reflected in the model, preferably in a formalized, algorithmically precise form. At the same time, because of the breadth of the problems related to the collective use of computing resources, the model can admit informal, verbal descriptions along with purely analytical units and the use of formal mathematical apparatus.

2.2. Stating certain mathematical and algorithmic problems

2.2.1. A problem of optimal task partitioning. This problem statement was prompted by applied work carried out at the Computing Center of the Moscow State University, Russia [10], and must reflect the following qualitative features of a real object.

A resource user must accomplish a single task by using the services of several computing resources. The task cannot be completed within a time acceptable to the user on any single available resource. The user must divide the task into parts and send these fragments to different resources. Reducing the size of the forwarded portions lowers the risk of an undesirable interruption of task execution, which depends on the state of the resource. However, portions that are too small may be disadvantageous because their very large number results in unjustified additional time losses. Selecting the "partitioning diameter" (at the planning stage or in the course of task accomplishment) is the subject of an optimization problem.
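The trade-off between interruption risk and per-portion overhead can be illustrated with a minimal sketch. It rests on assumptions that are not part of the article: interruptions arrive as a Poisson process with rate `failure_rate`, an interrupted portion is restarted from scratch, every portion carries a fixed forwarding overhead `overhead`, and portions are processed one after another. Under these assumptions the expected time to complete one portion whose attempt lasts t is (e^(λt) - 1)/λ. The function names `expected_total_time` and `best_partition` are hypothetical.

```python
import math


def expected_total_time(total_work, parts, overhead, failure_rate):
    """Expected completion time when the task is cut into `parts` equal portions.

    Assumption (not from the article): each attempt lasts chunk + overhead,
    interruptions are Poisson with rate `failure_rate`, and an interrupted
    portion is restarted from the beginning.
    """
    attempt = total_work / parts + overhead
    # Expected time to finish one attempt of length t under restarts:
    # (exp(rate * t) - 1) / rate
    per_portion = math.expm1(failure_rate * attempt) / failure_rate
    return parts * per_portion


def best_partition(total_work, overhead, failure_rate, max_parts):
    """Scan 1..max_parts and return the number of portions with the least expected time."""
    return min(
        range(1, max_parts + 1),
        key=lambda n: expected_total_time(total_work, n, overhead, failure_rate),
    )
```

With very few portions the exponential term dominates (long attempts are likely to be interrupted); with very many portions the accumulated overhead dominates. The minimum lies in between, which is the "partitioning diameter" effect the problem statement describes.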

2.2.2. A static problem of task distribution among computing resources. This problem statement corresponds to the functioning of the upper level of Grid system management (the level of the resource broker). Its mathematical basis is the discrete combinatorial assignment problem.

The problem consists of finding the optimal schedule of task distribution among computing resources at the stage preceding the actual execution of these tasks.

The constraints of the model are the current or forecasted characteristics of the nodes of the information and computing network, which are determined at the lower level of the management system (the level of the local Grid resource manager). These characteristics can be, for instance, the number of processors, memory capacity, the local schedule of task execution, etc. The optimization criterion must contain factors reflecting the user's budgetary capabilities and the resource supplier's aspiration to maximize profit.
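As an illustration of the assignment problem at the broker level, the following sketch enumerates all one-to-one assignments of tasks to resources and keeps the cheapest. The cost matrix, the use of `math.inf` to mark pairs that violate a node constraint, and the function name `optimal_assignment` are assumptions made for the example; the article does not fix a concrete criterion.

```python
import math
from itertools import permutations


def optimal_assignment(cost):
    """Return (plan, total) minimizing the summed cost of a one-to-one assignment.

    cost[i][j] is the (hypothetical) cost of running task i on resource j;
    math.inf marks a pair that violates a node constraint such as memory size.
    plan[i] is the resource chosen for task i. If no feasible assignment
    exists, the result is (None, math.inf).
    """
    n = len(cost)
    best_plan, best_total = None, math.inf
    for plan in permutations(range(n)):
        total = sum(cost[task][res] for task, res in enumerate(plan))
        if total < best_total:
            best_plan, best_total = plan, total
    return best_plan, best_total
```

Exhaustive enumeration grows factorially with the number of tasks, so it is only suitable for very small instances; polynomial-time methods for the assignment problem, such as the Hungarian algorithm, would be the natural replacement in a real broker.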

2.2.3. A dynamic problem of resource selection management. This problem statement corresponds to the functioning of the lower level of Grid system management (the level of the local manager). Its mathematical basis is the theory of control of Markov sequences.

The problem consists of choosing a strategy of resource selection during the actual execution of the tasks. This strategy should take as its basis the task distribution schedule worked out at the upper level of the management system and adjust the schedule in real time on the basis of observations of the current state of the process.
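One very simple strategy of this kind can be sketched as a state-dependent rule: follow the static schedule unless the planned resource is observed to be overloaded, and otherwise fall back to the least loaded resource. The observed state (queue lengths), the `threshold` parameter and the function name `select_resource` are assumptions for illustration only; the article does not specify the control strategy.

```python
def select_resource(planned, queue_lengths, threshold):
    """Choose a resource for the next task from the currently observed state.

    planned:        resource suggested by the static (upper-level) schedule
    queue_lengths:  dict mapping resource name -> observed queue length
    threshold:      hypothetical load limit above which the plan is overridden

    The decision depends only on the current observation, not on history.
    """
    if queue_lengths[planned] <= threshold:
        return planned
    # The planned resource is overloaded: deviate to the least loaded one.
    return min(queue_lengths, key=queue_lengths.get)
```

An adaptive scheme would additionally tune `threshold` (or replace the rule altogether) from observed task completion times.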

To solve the problems listed above, methods and algorithms for their solution should be worked out. A computer model of the two-level system of interaction between resource suppliers and users should be built. The software system must implement the following models and algorithms:

- An algorithm for solving the assignment problem, applied to creating the optimal (static) schedule of resource distribution at the upper level of the management system.

- A process that simulates the execution of tasks from different users on distributed computing resources.

- An algorithm of adaptive resource selection management at the lower level of the management system, integrated into the simulation model of the task execution process.

2.2.4. A problem of selecting an efficient task servicing discipline. When allocating their resources for public use, the owners are interested in the most efficient use of these resources. Usually, when the resource distribution problem is solved, the effectiveness functions are taken to be increased system throughput or decreased task execution time. However, it is necessary to keep in mind that the importance of completing an applied task grows as its deadline approaches. Therefore, it is necessary to study service disciplines in which the decisions on task priority and on the allocation of shares of the resources depend on the current state.

A queueing system is considered here in which the calls are characterized by a number of parameters that influence the servicing duration. In particular, each call is characterized by a data volume that decreases in the course of servicing. At any instant the system can service only one query.

For the system under consideration one must use time-sharing service disciplines. Among the simplest algorithms of that type is Round Robin, where the calls are served in the order of their arrival, one time quantum each, cycle after cycle. It is natural to assume, however, that there exist algorithms with more complex but more efficient servicing schemes. Under such algorithms the servicing efficiency criterion includes not only the maximal data processing speed of the server but also responsiveness to the users' interests.
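The Round Robin discipline for a single server can be sketched as follows. The sketch assumes that all calls are present at time zero, that the server processes one unit of data per unit of time, and that switching between calls costs nothing; the function name `round_robin` is hypothetical.

```python
from collections import deque


def round_robin(volumes, quantum):
    """Simulate Round Robin and return {call index: completion time}.

    volumes: remaining data volume of each call, in arrival order
    quantum: maximum amount of service a call receives per turn
    """
    queue = deque(enumerate(volumes))
    clock = 0
    finish = {}
    while queue:
        idx, remaining = queue.popleft()
        served = min(quantum, remaining)
        clock += served
        remaining -= served
        if remaining > 0:
            queue.append((idx, remaining))  # back to the end of the queue
        else:
            finish[idx] = clock
    return finish
```

For example, `round_robin([3, 1, 2], 1)` completes the second call at time 2, the third at time 5 and the first at time 6.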

It is intended to use algorithms in which servicing priority is given to the queries that have been in the system for the longest time. In the beginning, all the queries are serviced for one time quantum.

If a query was not served completely during that period, it is moved to the next group, where more time quanta are allocated for servicing the queries, and so on. There are very many possible variants of the servicing discipline. To study different query servicing algorithms and to choose the most efficient servicing schemes, it is necessary to develop a simulation model that can be used to try out different variants and to select the values of the corresponding parameters.
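The grouping scheme described above resembles a multilevel feedback queue, and a small simulation of it might look as follows. Several details are assumptions, since the text leaves them open: all queries are present at time zero, the lowest non-empty group is always served first, a query leaving the last group re-enters that same group, and `quanta[k]` is the service allowance in group k. The function name `multilevel_feedback` is hypothetical.

```python
from collections import deque


def multilevel_feedback(volumes, quanta):
    """Simulate a multilevel feedback discipline; return {query index: completion time}.

    volumes: data volume of each query, in arrival order
    quanta:  service allowance per turn for each group, e.g. [1, 2, 4]
    """
    levels = [deque() for _ in quanta]
    for idx, vol in enumerate(volumes):
        levels[0].append((idx, vol))  # every query starts in the first group
    clock = 0
    finish = {}
    while any(levels):
        # Assumption: serve the lowest non-empty group first.
        level = next(k for k, q in enumerate(levels) if q)
        idx, remaining = levels[level].popleft()
        served = min(quanta[level], remaining)
        clock += served
        remaining -= served
        if remaining > 0:
            nxt = min(level + 1, len(levels) - 1)
            levels[nxt].append((idx, remaining))  # demote to the next group
        else:
            finish[idx] = clock
    return finish
```

Varying `quanta` and the rule that picks the next group is one way to compare the servicing variants mentioned in the text within a single simulation.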

Conclusion

Grid is a distributed software-hardware environment with a fundamentally new organization of computing and of task/data flow management. When a Grid infrastructure is created, the problem of organizing the efficient use of resources arises. Within this problem one can point out the need to develop a general model of the processes of task distribution among the computing resources of a distributed computing system and to create methods and algorithms for solving particular optimization problems. Among them are the problems of controlling the size of forwarded tasks, optimally selecting appropriate resources at the stages of task preparation and execution, determining efficient task servicing disciplines, etc. At present the authors are actively working in this direction. The relevant R&D results will be published in subsequent articles.

Bibliography

1. http://www.gridclub.ru/library/publication.2004-11-29.5830756248/publ file .

2. http://www.parallel.ru/info/education/msu_grid-intro.doc .

3. Foster I., Kesselman C., Tuecke S., "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", in International Journal of High Performance Computing Applications, 15 (3). 200-222. 2001. http://www.globus.org/research/papers/anatomy.pdf .

4. http://www.globus.org .

5. http://www.eu-datagrid.org .

6. http://www.griphyn.org .

7. Foster I., Kesselman C., Nick J., Tuecke S., "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration" http://www.globus.org/research/papers/ogsa.pdf .

8. S. Graham, S. Simeonov, T. Boubez, G. Daniels, D. Davis, Y. Nakamura, R. Neyama, Building Web Services with Java: Making Sense of XML, SOAP, WSDL, and UDDI, 2001.

9. http://www.globus.org/developer/news/20011112a.html .

10. http://x-com.parallel.ru .
