Sugisaka M., Loukianov A., Xiongfeng F., Kubik T., Kubik K.B. УДК 681.518
DEVELOPMENT OF AN ARTIFICIAL BRAIN FOR LIFEROBOT
1. Introduction.
The objective of our research is that the artificial brain has to provide the robot with abilities to perceive the environment, to interact with humans, to make intelligent decisions, and to learn new skills. Application areas of the LifeRobot are human welfare, service tasks, data collection and processing, etc.
The research issues of the LifeRobot prototype are navigation, learning, information processing, human-robot interface, and reasoning. In the navigation, robot localization is investigated. The problem is to estimate the robot location in the environment using sensor data. The localization issues are position tracking, global localization, and recovery from errors and incorrect location information.
In the navigation, planning is also investigated. The problem is to find how to move to the desired place based on the topological map of the environment. In the learning, skills are investigated. The learning issues are to provide the intelligent robot with an ability to adapt to changes in the environment and to eliminate the need for robot programming by allowing the robot to interactively acquire new skills. In the human-robot interface, tracking is considered. The problem is to find a face in the image taken by the robot cameras and trace its movements. These problems are under our investigation. This paper describes the new techniques developed in our research for the LifeRobot.
2. Research issues of LifeRobot.
The LifeRobot prototype developed at the Artificial Life and Robotics Laboratory, The University of Oita, is called "Tarou". The picture of Tarou is shown in Fig. 1. The main specifications are listed in Table 1. The research issues of the LifeRobot are shown in Fig. 2. As stated in Section 1, there are five main research issues, namely, navigation, information processing, learning, human-robot interface, and reasoning. These issues are concerned with the design of the artificial brain for the LifeRobot Tarou. In this paper, the techniques developed at the Artificial Life and Robotics Laboratory for the above research issues are described briefly.
3. Artificial brain of LifeRobot "Tarou".
The hardware is shown in Fig. 3: two CCD cameras, two computers, two DC motors, one stepping motor, six ultrasonic sensors, and 5 LEDs are installed. One portable computer is used for the voice recognition and command executing system (however, this computer was later replaced by the above two computers).
The analogy of brain functions between a human being and the robot is shown in Fig. 4(a) and (b).
Fig. 1. The picture of Tarou (front and side views).
Table 1.
The specifications of LifeRobot "Tarou", which is a mobile robot (names of makers and types are shown).
• 2 driving wheels (driven by DC motors controlled via I/O card, D/A converters, and PWM),
• 2 incremental optical encoders,
• 2 castor wheels,
• rotating head (rotated by a stepping motor),
• 2 CCD cameras (with zoom, pan and tilt features controlled by the SONY VISCA protocol),
• 4 status Light Emitting Diodes,
• 6 ultrasonic sensors,
• touch panel,
• keyboard,
• speakers and microphone,
• 3 computers, connected by an Ethernet network*.
* 2 on-board computers are used for robot control and vision processing (with a standard video card); 1 laptop is used for voice processing and cellular phone operation (with a sound card and a digital cellular modem).
Fig. 3. Hardware of LifeRobot "Tarou".
Fig. 4. Brain for human being (a). Brain for AlifeRobot "Tarou" (b).
Fig. 5. Software structure of artificial brain for Tarou.
The correspondence between the human brain and the robot brain is clear. However, the human brain is much more complex than the robot brain. The hardware artificial brain was developed first for the recognition and tracking system for moving objects [1,2] and thereafter for the mobile vehicle [3-5].
The artificial brain for Tarou consists of special software. The artificial brain is able to perform vision-based navigation [6-10], voice recognition, and face detection and recognition. The software structure of the artificial brain is shown in Fig. 5, and the behavioral table is illustrated in Fig. 6. In Fig. 5, the client threads in the upper part correspond to the perception and stimulus module, the motion controller with its three control subsystems in the lower part corresponds to the behavior module, and the main control loop with the shared memory between the client threads and the motion controller corresponds to the memory (knowledge) that causes reflex behavior in the artificial brain software of LifeRobot Tarou shown in Fig. 7. As shown in Fig. 7, the artificial brain consists of a memory, a perception and stimulus module, a thinking and decision module, and a behavior module connected to each other.
Fig. 6 shows the behavioral table. When an event occurs, it is referred to the behavioral table: a behavior-table checking action takes place, and thereafter action execution, which sends signals to the motor driver, the visual output, and the audio output. After the actions are executed, the behavioral table is updated. This main loop is repeated in the artificial brain software of Tarou. Fig. 8 shows the LifeRobot Tarou implementation of the scheme in Fig. 7.
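To illustrate this event-driven main loop, the following minimal Python sketch shows how a behavioral table can map events to actions and be revisited after each execution. It is our own simplified reconstruction, not the actual Tarou code; the event names and output functions are hypothetical.

```python
# Minimal sketch of an event-driven behavioral-table loop
# (hypothetical names, not the actual Tarou implementation).
import queue

def send_to_motor_driver(cmd):   # placeholder actuator output
    print("motor:", cmd)

def visual_output(msg):          # placeholder LED/touch-screen output
    print("visual:", msg)

def audio_output(msg):           # placeholder speech output
    print("audio:", msg)

# Behavioral table: event name -> list of (output channel, command) pairs.
behavior_table = {
    "obstacle_detected": [(send_to_motor_driver, "stop"), (audio_output, "obstacle ahead")],
    "voice_go":          [(send_to_motor_driver, "forward"), (visual_output, "moving")],
}

events = queue.Queue()           # filled by the perception/stimulus client threads

def main_loop():
    while True:
        try:
            event = events.get(timeout=0.1)          # wait for the next stimulus
        except queue.Empty:
            continue
        actions = behavior_table.get(event, [])      # behavior-table checking
        for channel, command in actions:             # action execution
            channel(command)
        # behavioral-table updating (e.g. habituation) would go here
```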
The artificial brain of Tarou produces control commands to the robot actuators via the wheel control subsystem and the head control subsystem, video camera control via the camera control subsystem, speech output, touch screen output, and status LEDs. On the other hand, the artificial brain receives an analog sound signal from the microphone or mobile phone through the robot communication subsystem and video image frames from the CCD cameras through the robot vision subsystem. Thus, the robot behaves autonomously to perform complex tasks instructed by a human being.
Fig. 6. Behavioral table.
Fig. 7. LifeRobot Tarou's artificial brain software.
Fig. 8. LifeRobot Tarou implementation of scheme in Fig. 7.
3.1. Vision based navigation.
There are two types of landmarks used in the navigation based on the two CCD cameras, which can capture nearly 24 images/s: one is the continuous landmark (guideline), and the other is common landmarks (circle, triangle, etc.).
The guideline is used for positioning, and the other landmarks can make the robot perform specific actions, such as turning or stopping. The robot moves at a minimum speed of 12 cm/s.
After an image is taken, the guideline and landmarks are extracted from that image. The robot acts based on the results of the image processing, and it has the capability to search for its target as fast as possible while it is moving [6-8,10].
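As an illustration of this processing step, the sketch below extracts a guideline candidate and simple shape landmarks from a camera frame using standard OpenCV thresholding and contour analysis. The threshold values and shape criteria are assumptions made for the example; this is not the algorithm of [6-8,10].

```python
# Illustrative guideline/landmark extraction with OpenCV
# (assumed thresholds and shape rules, not the method of [6-8,10]).
import cv2

def extract_guideline_and_landmarks(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    guideline, landmarks = None, []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h > 3 * w and h > 0.5 * frame_bgr.shape[0]:
            guideline = (x, y, w, h)              # long, thin region: guideline candidate
        else:
            approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
            if len(approx) == 3:
                landmarks.append(("triangle", (x, y, w, h)))   # e.g. "turn" marker
            elif len(approx) > 6:
                landmarks.append(("circle", (x, y, w, h)))     # e.g. "stop" marker
    return guideline, landmarks
```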
Three examples for localization, learning skills, and path planning are shown in Figs. 9-11, respectively. As shown in Fig. 9, localization is to estimate the robot location in the environment using sensor data.
Fig. 9. Mobile robot localization of Tarou.
Fig. 10. Learning skills of Tarou.
Fig. 11. Path planning.
The localization issues are position tracking, global localization, and recovery from errors and incorrect location information. For this problem, probabilistic landmark localization using a video camera has been developed [11]. For the precise explanation of the figures shown in Fig. 9, please refer to [11].
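The hybrid camera-based method of [11] is not reproduced here, but the general idea of probabilistic localization can be conveyed by a generic Monte Carlo (particle filter) sketch: a cloud of pose hypotheses is moved by odometry, reweighted by how well each explains a landmark observation, and resampled. The noise parameters and the distance-only measurement model below are assumptions chosen for illustration.

```python
# Generic Monte Carlo localization sketch (illustrative only;
# the measurement model in [11] differs).
import numpy as np

def predict(particles, v, w, dt, noise=(0.02, 0.02)):
    """Move particles (x, y, theta) by odometry (v, w) plus Gaussian noise."""
    n = len(particles)
    particles[:, 2] += w * dt + np.random.normal(0, noise[1], n)
    step = v * dt + np.random.normal(0, noise[0], n)
    particles[:, 0] += step * np.cos(particles[:, 2])
    particles[:, 1] += step * np.sin(particles[:, 2])
    return particles

def update(particles, measured_dist, landmark_xy, sigma=0.1):
    """Weight particles by how well they explain a measured distance to a known landmark."""
    expected = np.hypot(particles[:, 0] - landmark_xy[0], particles[:, 1] - landmark_xy[1])
    w = np.exp(-0.5 * ((expected - measured_dist) / sigma) ** 2) + 1e-12
    return w / w.sum()

def resample(particles, weights):
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx].copy()
```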
As shown in Fig. 10, skills are one of the artificial brain functions. As indicated in Fig. 10, the robot acts on the environment (gives an action to the environment) and the environment reacts to the robot (returns a state to the robot); skills are learned by building an internal model for both the supervised and unsupervised cases. The learning issues are how to provide an intelligent robot with the ability to adapt to changes in the environment and how to eliminate the need for robot programming by allowing the robot to interactively acquire new skills. Learning of visual tracking skills has been developed for visual servoing control [12]. As indicated in Fig. 10, the image is obtained from the CCD camera. The error between the object and the goal and its derivative are calculated and fed to the neural network controller as the state, which produces a control signal to the actuator. The experimental results are also shown in Fig. 10. For precise explanations of the contents, please refer to [12].
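A skeleton of such a controller is sketched below: the state is the image error and its derivative, the output is an actuator command, and a supervised update pulls the output toward a teacher signal. The network size, activation choices, and learning rule are assumptions for illustration, not the trained controller of [12].

```python
# Sketch of a neural-network visual tracking controller
# (sizes and learning rule are assumptions, not the controller of [12]).
import numpy as np

class TrackingController:
    def __init__(self, n_hidden=8, lr=0.01):
        self.W1 = np.random.randn(n_hidden, 2) * 0.1   # input: [error, d_error]
        self.W2 = np.random.randn(1, n_hidden) * 0.1   # output: control signal
        self.lr = lr

    def control(self, error, d_error):
        self.x = np.array([error, d_error])
        self.h = np.tanh(self.W1 @ self.x)
        return float(self.W2 @ self.h)                 # e.g. head pan velocity

    def learn(self, teacher_signal):
        """Supervised update toward a teacher command (e.g. a hand-tuned controller)."""
        y = float(self.W2 @ self.h)
        delta = y - teacher_signal
        self.W2 -= self.lr * delta * self.h[None, :]
        self.W1 -= self.lr * delta * (self.W2.ravel() * (1 - self.h ** 2))[:, None] * self.x[None, :]
```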
As shown in Fig. 11, path planning is to find out how to move from the present place to the desired place based on the topological map of the environment. A topological map is represented as a set of rules describing the robot environment in a natural way. Additional rules provide information about local navigation strategies, landmarks along paths, etc. A command plan is a list of executable commands constructed from the rules on request. In Fig. 12(a) an example of a map rule is given, and in Fig. 12(b) examples of other rules are shown: one is a partial navigation strategy and the other is a landmark used to determine the location. For precise explanations, please refer to [13]. A simplified sketch of rule-based plan construction is given after Fig. 12.
Fig. 12. (a) Navigation in an indoor environment (one example). (b) Navigation in an indoor environment (examples of other rules).
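The sketch below encodes a toy topological map as rules (edges with an attached local navigation strategy and a confirming landmark) and searches them to build a command plan. The rule format and place names are simplifications of our own, not the exact representation of [13].

```python
# Simplified rule-based topological planner
# (rule format is illustrative, not the representation used in [13]).
from collections import deque

# Each rule: (place_from, place_to, local_strategy, landmark_to_confirm_arrival)
rules = [
    ("room_101", "corridor_A", "follow_guideline", "door_sign_101"),
    ("corridor_A", "elevator",  "wall_following",   "elevator_doors"),
    ("corridor_A", "room_105",  "follow_guideline", "door_sign_105"),
]

def plan(start, goal):
    """Breadth-first search over the topological rules; returns a command plan."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        place, commands = frontier.popleft()
        if place == goal:
            return commands
        for frm, to, strategy, landmark in rules:
            if frm == place and to not in visited:
                visited.add(to)
                frontier.append((to, commands + [(strategy, to, landmark)]))
    return None

print(plan("room_101", "elevator"))
# [('follow_guideline', 'corridor_A', 'door_sign_101'),
#  ('wall_following', 'elevator', 'elevator_doors')]
```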
3.2. Voice recognition.
Tarou has a voice recognition and command executing system, which runs on two of the computers in the hardware, linked by the local network. The voice recognition system utilizes the IBM ViaVoice SDK (Software Developer Kit) for dictation and the IBM ViaVoice TTS (Text to Speech) SDK for the speech application.
By using this system, Tarou can now recognize the following commands in English and in Japanese:
1 Go ### meter(s)/centimeter(s).
2 Turn your head to the left/right.
3 Straighten your head.
4 Look up/down/straight.
5 Turn left/right ### degrees.
6 No (for canceling a command).
(### means a number)
In order for Tarou to recognize the above commands, the grammar file shown in Table 2 has been developed for the implementation [14]. For an easier understanding of the grammar file, please refer to [14].
IBM ViaVoice provides a tool for building a personal voice model. For example, if a voice model is made by a Japanese person (who speaks English with a Japanese accent), then it will be easier for the voice recognition system to understand other Japanese speakers.
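The grammar in Table 2 translates a spoken sentence into a compact command string; for example, "Go one hundred and twenty centimeters" becomes "gc120" and "Turn left ninety degrees" becomes "tld90". A hedged sketch of how such strings could be decoded on the robot side is given below; the function and field names are ours, not the actual command-executing code of [14].

```python
# Illustrative decoder for the compact command strings produced by the
# grammar in Table 2 (names are assumptions, not the code of [14]).
import re

def decode_command(cmd):
    if cmd == "s":
        return {"action": "stop"}
    if cmd == "y":
        return {"action": "confirm"}
    if cmd == "n":
        return {"action": "cancel"}
    m = re.fullmatch(r"g([cm])(\d+)", cmd)        # g + unit + number, e.g. "gc120"
    if m:
        unit, value = m.groups()
        meters = int(value) / 100.0 if unit == "c" else float(value)
        return {"action": "go", "distance_m": meters}
    m = re.fullmatch(r"t([lr])d(\d+)", cmd)       # t + direction + 'd' + degrees, e.g. "tld90"
    if m:
        direction, degrees = m.groups()
        return {"action": "turn",
                "direction": "left" if direction == "l" else "right",
                "degrees": int(degrees)}
    return {"action": "unknown", "raw": cmd}

print(decode_command("gc120"))   # {'action': 'go', 'distance_m': 1.2}
print(decode_command("tld90"))   # {'action': 'turn', 'direction': 'left', 'degrees': 90}
```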
3.3. Face detection and face recognition.
At first, as an example, face tracking is described; the real-time recognition system follows. As shown in Fig. 13, face tracking is to find a face in the image taken by the robot cameras and trace its movements. In this algorithm, CAMSHIFT (Continuously Adaptive Mean Shift) is used to search for and track a human face in the images of the video frame sequence. Firstly, a skin color model should be built to transfer a color image into a probability distribution image (an artificial feed-forward neural network (AFNN) can be used to build this model). Then, face tracking is performed by the following steps of CAMSHIFT (a brief code sketch is given after the steps).
1. Set the calculation region of the probability distribution to the whole image.
2. Choose the initial location of the 2D mean shift search window.
3. Calculate the color probability distribution in the 2D region at the search location, in an area slightly larger than the mean shift window size. For a precise explanation, please refer to [15].
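A condensed OpenCV-based sketch of these steps is shown below. It uses a hue histogram back-projection in place of the neural-network skin model mentioned above, and a fixed initial window, so it is only an approximation of the method of [15].

```python
# Condensed CAMSHIFT face-tracking sketch with OpenCV; a hue histogram
# stands in for the skin-colour probability model of [15].
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
track_window = (200, 150, 80, 80)                  # assumed initial window (x, y, w, h)

x, y, w, h = track_window                           # build colour model from the initial region
roi_hsv = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([roi_hsv], [0], None, [180], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    prob = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)   # probability image
    box, track_window = cv2.CamShift(prob, track_window, term)  # adaptive mean shift
    pts = cv2.boxPoints(box).astype(np.int32)
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
    cv2.imshow("face tracking", frame)
    if cv2.waitKey(30) & 0xFF == 27:                 # ESC to quit
        break
cap.release()
cv2.destroyAllWindows()
```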
We have built a real-time face recognition system which combines face detection (based on a neural network) and face recognition (based on Embedded Hidden Markov Models (EHMM)) techniques. It constantly takes images from the surroundings with the camera mounted on the robot head and finds faces in them. If a face is detected, the system tries to recognize it.
Table 2
Tarou's commands implementation: grammar file.
<digit>= oh -> 0|0|1|2|3|4|5|6|7|8|9.
<teen>= 10|11|12|13|14|15|16|17|18|19.
<deca>= twenty -> 2|thirty -> 3|forty -> 4|fifty -> 5|sixty -> 6|seventy -> 7|eighty -> 8|ninety -> 9.
<n2>= {<deca>} {<digit>} -> {1}{2}|{<deca>} -> {1}0|<teen>.
<n3>= {<digit>} (hundred and?)? {<n2>} -> {1}{2}|{<digit>} hundred -> {1}00.
<numb>= <n3>|<n2>|<digit>.
<g-unit>= centimeter -> c|centimeters -> c|meter -> m|meters -> m.
<r-unit>= degree -> d|degrees -> d.
<direction>= left -> l|right -> r.
<cmd-1>= Stop -> s.
<cmd-2>= Go {<numb>} {<g-unit>} -> g{2}{1}.
<cmd-3>= Turn {<direction>} {<numb>} {<r-unit>} -> t{1}{3}{2}.
<cmd-4>= Yes please -> y.
<cmd-5>= No -> n.
<sentence>= <cmd-1>|<cmd-2>|<cmd-3>|<cmd-4>|<cmd-5>.
Fig. 13. Face tracking.
The neural network is trained with many face examples to learn the concept of a face. It receives a 20x20 pixel region of the image as input and generates an output ranging from 0 to 1, signifying how close the region is to a face. We use it to examine whether a sub-image extracted from the input image is a face.
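The following sketch illustrates the scanning idea: each 20x20 window of a grayscale image is scored by a small network with a sigmoid output in [0, 1], and windows above a threshold are kept. The weights here are random placeholders and the stride and threshold are assumptions; this is not the trained detector described above.

```python
# Sliding-window face scoring sketch: each 20x20 patch is fed to a tiny network
# that outputs a value in [0, 1].  Weights are random placeholders, not the
# trained detector described in the text.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.05, (32, 400)), np.zeros(32)   # 20*20 = 400 inputs
W2, b2 = rng.normal(0, 0.05, (1, 32)), np.zeros(1)

def face_score(patch20x20):
    x = patch20x20.reshape(400) / 255.0
    h = np.tanh(W1 @ x + b1)
    return float(1.0 / (1.0 + np.exp(-(W2 @ h + b2))))  # sigmoid output in [0, 1]

def scan(image, stride=4, threshold=0.9):
    """Return bounding boxes of windows the network scores above the threshold."""
    boxes = []
    for y in range(0, image.shape[0] - 20, stride):
        for x in range(0, image.shape[1] - 20, stride):
            if face_score(image[y:y+20, x:x+20]) > threshold:
                boxes.append((x, y, 20, 20))
    return boxes
```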
For face recognition, we built a separate EHMM model for each person in the database. If a face is detected, it is matched against each model in the database. The model that yields the largest similarity value is reported as the matched person. The system has achieved a detection rate of more than 85% for frontal views or slightly rotated views of faces.
4. Conclusions.
In this paper, a part of the results obtained in the development of the artificial brain for LifeRobot Tarou was described briefly. The development of the artificial brain as an LSI chip (System on Chip) has been started as another project based on the LifeRobot artificial brain results. The concept of the artificial brain will also be used for biologically inspired dynamic bipedal humanoid robots driven by motors or artificial muscles, and this project is under investigation.
Finally, the idea of the artificial brain stated in this paper is depicted in Fig. 14.
REFERENCES
1. M. Sugisaka, Neurocomputer control in an artificial brain for tracking moving objects, Artificial Life and Robotics 1 (1) (1997) pp. 47-51.
2. M. Sugisaka, Fast pattern recognition using a neurocomputer, Artificial Life and Robotics 1 (2) (1997) pp. 69-72.
3. M. Sugisaka, X. Wang, T. Matsumoto, Control strategy for a smooth running of a mobile vehicle with neurocomputer, in: Proceedings of the Twelfth International Conference on Systems Engineering, Coventry, UK, September 9-11, 1997, pp. 664-669.
4. M. Sugisaka, Design of an artificial brain for robots, Artificial Life and Robotics 3 (1) (1999) pp. 7-14.
5. M. Sugisaka, X. Wang, J.J. Lee, Artificial brain for a mobile vehicle, Applied Mathematics and Computation 111 (2-3) (2000) pp. 137-145.
Fig. 14. The idea of the artificial brain stated in this paper.
6. C. Radix, M. Sugisaka, Further development of an "artificial brain", in: Proceedings of the 2000 Symposium on Mechatronics and Intelligent Mechanical System for 21 Century, Kyong Sang Nam-Do, Korea, October 4-7, 2000, pp. 107-110.
7. A. Loukianov, M. Sugisaka, Supervised learning technique for a mobile robot controller in a line tracking task, in: Proceedings of The Sixth International Symposium on Artificial Life and Robotics (AROB 6th), Tokyo, Japan, January 15-17, 2001, vol. 1, pp. 238-241.
8. C.A. Radix, A.A. Loukianov, M. Sugisaka, Evaluating motion on the Alife Robot prototype, in: Proceedings of the 32nd International Symposium on Robotics (ISR 2001), Seoul, Korea, April 19-21, 2001, pp. 714-719.
9. M. Sugisaka, Development of communication methods between artificial life robot and human, in: IEEE Region 10 International Conference on Electrical and Electronic Technology, CD-ROM, August 19-22, 2001, pp. 894-898.
10. J. Wang, M. Sugisaka, Study on visual-based indoor navigation for an Alife mobile robot, in: H. Selvaraj, V. Muthukumar (Eds.), Proceedings of the Fifteenth International Conference on Systems Engineering, Las Vegas, USA, August 6-8, 2002, pp. 390-396.
11. A.A. Loukianov, M. Sugisaka, A hybrid method for mobile robot probabilistic localization using a single camera, in: Proceedings of the International Conference on Control, Automation and Systems, Cheju University, Jeju, Korea, October 17-21, 2001, pp. 280-283.
12. A.A. Loukianov, M. Sugisaka, An approach for learning a visual tracking skill on a mobile robot, in: Proceedings of the SICE/ICASE Workshop "Control Theory and Applications", Nagoya, Japan, 2001, pp. 83-87.
13. T. Kubik, M. Sugisaka, Rule based robot navigation system working in an indoor environment, in: Proceedings of the XIV International Conference on Systems Science, vol. II, Wroclaw, Poland, 2001, pp. 212-219.
14. T. Kubik, M. Sugisaka, Use of a cellular phone in mobile robot voice control, PowerPoint presentation, internal reference, 2001, 26 pp.
15. X. Feng, Z. Wang, M. Sugisaka, Non-parametric density estimation with application to face tracking on mobile robot, in: Proceedings of the International Conference on Control, Automation and Systems, Cheju University, Jeju, Korea, October 17-21, 2001, pp. 426-429.