System for Transmitting HDTV over IP Networks Using Open-Source Software
Mateusz Starzak, Wojciech Zabierowski, Andrzej Napieralski
Abstract - The article presents the results of work on transmitting standard and non-standard high-definition television signals over IP networks for use in a medical environment. An example system for realtime signal acquisition, compression and broadcast is presented. It was built from off-the-shelf computer equipment as part of a video consulting system for medical (cardiological) purposes. To complete the task, a DirectShow filter imitating a video capture device driver for the Microsoft Windows operating system was created.
Index Terms - HDTV, DTV, digital video compression, IP, Internet
I. Introduction
Until recent years, realtime compression of video at higher than standard definition was out of reach for PC users. The growth of computational power in general-purpose microprocessors and the introduction of multi-core parallel processing made on-the-fly software compression and decompression possible [3]. In 2006, when the project started, realtime compression was already possible; however, only a few frame grabber devices could perform full-motion capture of HD video signals. Most of them were very expensive and, moreover, support for standard multimedia device access methods such as DirectShow under Windows or Video4Linux under Linux was nonexistent. The software supplied with the Matrox Solios XA frame grabber, which was available for tests and development, allowed capturing still images and video sequences to a file, but offered no further live video processing, including forwarding the stream to other applications.
II. Technical Requirements
• On the transmitter side, the system has to work under Windows XP.
Mateusz Starzak - Computer Centre, Technical University of Lodz
Wojciech Zabierowski - Department of Microelectronics and Computer Science, Technical University of Lodz
Andrzej Napieralski - Department of Microelectronics and Computer Science, Technical University of Lodz
• The system must ensure stable and uninterrupted transmission for a minimum of 12 hours.
• Input interfaces must include the Matrox Solios XA and the internal sound card of the transmitter PC.
• A standard XGA signal coming from the Kramer VP-725DS scaler will be processed.
• A video codec that ensures a smooth 1024x768 stream at 60 fps has to be used.
• Artifacts introduced by the video codec should be imperceptible to the user.
• Both parties of the video link have to see and hear each other. The system should allow uninterrupted conversations between them.
• Total bandwidth must not exceed 30 Mbit/s.
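To put the 30 Mbit/s limit in perspective, the uncompressed data rate of the required video stream can be estimated. The sketch below assumes 24 bits per pixel (8-bit RGB), which the requirements do not state; all constant names are illustrative, and audio and the return channel are ignored.

```python
# Rough estimate of the raw video data rate versus the bandwidth limit.
# ASSUMPTION: 24 bits per pixel (8-bit RGB); the article does not state
# the pixel format actually delivered by the frame grabber.
WIDTH, HEIGHT = 1024, 768   # XGA resolution from the requirements
FPS = 60                    # required frame rate
BITS_PER_PIXEL = 24         # assumed, see note above
LIMIT_MBIT_S = 30           # bandwidth limit from the requirements

raw_bit_s = WIDTH * HEIGHT * FPS * BITS_PER_PIXEL
raw_mbit_s = raw_bit_s / 1e6
min_ratio = raw_mbit_s / LIMIT_MBIT_S   # compression needed to fit the limit

print(f"raw stream: {raw_mbit_s:.1f} Mbit/s")
print(f"minimum compression ratio: {min_ratio:.1f}:1")
```

Under this assumption the raw stream is above 1 Gbit/s, so a compression ratio of several tens to one is the least that the codec has to deliver.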
III. Research
The lack of DirectShow (DS) and Video4Linux (V4L) support prevented frame grabber devices from being used with publicly available video encoding software, such as the open-source FFMPEG, MPEG4IP, VideoLAN and MEncoder or the proprietary Windows Media Encoder and Flash Media Encoder, as a means of webcasting HDTV signals.
Of the two possible approaches, writing an encoding application from scratch or creating a universal interface usable with any standard multimedia application, the latter was selected. The frame grabber was acquired with a software development kit for Microsoft Windows only. A brief investigation showed that the fastest way to implement the interface between the device and the software would be to create a userspace DirectShow filter. This would allow the VideoLAN Client application to be used as the transmitter and receiver of the audio-video stream.
Such a DirectShow filter [2] (a so-called “source filter”) was created in the course of the project using the MIL-Lite programming libraries for the Matrox frame grabber and the Microsoft Windows SDK, which includes sample filters that generate video streams (“PushSource” and “Ball”) and give a good idea of how to write a custom source filter. The missing information on registering the filter in the system registry, so that the system sees it as a video capture device, was obtained from the microsoft.public.win32.programmer.directx.video newsgroup.
R&I, 2012, №1
IV. The System
Two independent communication channels were established: the main HD stream from the operating room, and a lower-quality, low-bitrate return channel carrying video and sound from the conference venue. The technical crew was provided with text messaging over the same channel (Fig. 1).
Fig. 1. Complete video consulting system schema
There were three video sources on site in the operating room: an HDTV camera, a digital angiography device and an electrocardiography recorder, each with a different resolution and refresh rate. All input signals in the operating room were scaled down to 1024x768 resolution and 60 Hz refresh rate using a Kramer VP-725DS scaler/switcher (Fig. 2). This resolution was chosen because the multimedia projectors available at the remote location had a native XGA resolution. The resolutions also had to be unified because the frame grabber card used in the project provides no data from which the parameters of the incoming signal could be determined; the user has to know these parameters in advance to perform a successful capture.
Audio and video digitization was done on a dual-processor Intel Xeon based PC equipped with a Matrox Solios XA frame grabber and an AC’97-compatible sound card integrated on the motherboard.
The frame grabber is a 64-bit PCI-X card based on an Altera Stratix FPGA chipset. It has four independent 10-bit A/D converters at its inputs, which can be linked to form two dual-signal (S-video) inputs or a single three-signal YUV/RGB component video input. The maximum capture resolution is 1024x1024 pixels at a pixel clock rate of 65 MHz [5], provided the pixel clock signal is fed directly to the card. For a computer video signal with standard VESA timings and VGA signaling, the maximum resolution is 1024x768 at a 60 Hz refresh rate (the pixel clock rate of this signal is exactly 65 MHz) [4].
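The relation between the 65 MHz pixel clock and the 1024x768, 60 Hz mode can be checked with a short calculation. The sketch below assumes the total raster size of 1344x806 pixels (active picture plus blanking) for this mode; these totals are quoted from memory of the VESA timing table and should be verified against [4].

```python
# Refresh rate of the 1024x768 mode derived from its pixel clock.
# ASSUMPTION: total raster size (active + blanking) of 1344 x 806 pixels,
# as commonly listed for the VESA 1024x768 @ 60 Hz timing; verify against [4].
H_ACTIVE, V_ACTIVE = 1024, 768
H_TOTAL, V_TOTAL = 1344, 806
PIXEL_CLOCK_HZ = 65_000_000   # pixel clock stated in the text

pixels_per_frame = H_TOTAL * V_TOTAL
refresh_hz = PIXEL_CLOCK_HZ / pixels_per_frame
line_rate_khz = PIXEL_CLOCK_HZ / H_TOTAL / 1e3
active_fraction = (H_ACTIVE * V_ACTIVE) / pixels_per_frame

print(f"refresh rate: {refresh_hz:.3f} Hz")
print(f"line rate: {line_rate_khz:.3f} kHz")
print(f"active share of the raster: {active_fraction:.1%}")
```

With these totals the 65 MHz clock corresponds to a refresh rate just above 60 Hz, which is why XGA at 60 Hz sits exactly at the limit of the card when the clock is recovered from a VGA signal.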
[Diagram blocks: GE CardioLab, Shimadzu DAR-2400, Sony HDR-FX1E, Kramer VP-725DS, Matrox Solios XA]
Fig. 2. Video signal paths and interfaces. Blue - component video, orange - RGB
VideoLAN Client (version 0.8.6) was used for encoding the video and audio signals because of its flexibility. To minimize protocol overhead and benefit from the near-lossless network, the UDP protocol was used. After numerous tests with different codecs, multiplexers and bitrates, the bitrate was set at 8 Mbit/s, with the MPEG-4 Simple Profile codec for video and MPEG-1 Layer 1 at 192 kbit/s for audio, multiplexed into an MPEG Transport Stream. An example stream made using these parameters was shown to the medical staff, who decided that the picture does not lose any important details and can be used for medical diagnosis.
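A rough estimate of the resulting network bitrate can be made from the parameters above. The sketch below is illustrative only and rests on several assumptions that are not taken from the article: that 8 Mbit/s is the video bitrate alone, that seven 188-byte Transport Stream packets are carried per UDP datagram, that IPv4 without options is used, and that PES headers, PSI tables and stuffing are negligible.

```python
# Rough network bitrate of the HD stream (MPEG-TS over UDP).
# ASSUMPTIONS (not from the article): 8 Mbit/s is the video bitrate alone,
# 7 TS packets per UDP datagram, IPv4 without options, and PES/PSI/stuffing
# overhead is ignored.
VIDEO_KBIT_S = 8000
AUDIO_KBIT_S = 192
TS_PACKET, TS_HEADER = 188, 4        # bytes; fixed by the MPEG-TS format
TS_PER_DATAGRAM = 7                  # assumed packing
UDP_HEADER, IPV4_HEADER = 8, 20      # bytes
LIMIT_KBIT_S = 30_000                # bandwidth limit from the requirements

payload_kbit_s = VIDEO_KBIT_S + AUDIO_KBIT_S
ts_factor = TS_PACKET / (TS_PACKET - TS_HEADER)
datagram_payload = TS_PACKET * TS_PER_DATAGRAM
ip_factor = (datagram_payload + UDP_HEADER + IPV4_HEADER) / datagram_payload
network_kbit_s = payload_kbit_s * ts_factor * ip_factor

print(f"estimated network bitrate: {network_kbit_s / 1000:.2f} Mbit/s")
print(f"share of the limit: {network_kbit_s / LIMIT_KBIT_S:.0%}")
```

Under these assumptions the packetization overhead is a few percent, so the HD stream uses well under a third of the 30 Mbit/s budget.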
Fig. 3. Cardiac surgery seen from the conference venue. HD feed on the left, and the return channel on the right
Because only one frame grabber card was available, a software videoconferencing solution was considered for the return channel. Tests with various open-source H.323 and SIP clients were unsuccessful because of their low stability, so the free Skype software was chosen. Should the high-definition channel fail, the webcast could be continued using Skype’s video call function as an emergency fallback.
[Diagram blocks: VideoLAN Client, Skype]
Fig. 4. Final system architecture
The measured delay was approximately 2 seconds on the HD channel and under 100 ms on the return channel. Various sources, including [1], mention a tolerable latency of 200-300 ms, above which communication between two people starts to be perceived by them as “uncomfortable”.
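The measured delays can be set against the cited tolerance range with a few lines of arithmetic. In the sketch below the category labels and the function name are hypothetical, the return-channel delay is taken at its stated upper bound, and the round trip is assumed to be the sum of the two one-way delays.

```python
# Compare the measured delays with the tolerable range cited from [1].
# The category labels and the helper name are illustrative, not from the article.
HD_DELAY_MS = 2000        # "approximately 2 seconds"
RETURN_DELAY_MS = 100     # upper bound: "under 100 ms"
TOLERABLE_MS = (200, 300)
FPS = 60

def assess(delay_ms):
    """Classify a one-way delay against the 200-300 ms range."""
    low, high = TOLERABLE_MS
    if delay_ms < low:
        return "comfortable"
    if delay_ms <= high:
        return "borderline"
    return "uncomfortable"

hd_frames_behind = HD_DELAY_MS * FPS // 1000      # frames in flight at 60 fps
round_trip_ms = HD_DELAY_MS + RETURN_DELAY_MS     # assumed: sum of both directions

print(assess(HD_DELAY_MS), assess(RETURN_DELAY_MS))
print(f"HD channel lags by about {hd_frames_behind} frames")
print(f"question-to-visible-answer delay: up to about {round_trip_ms} ms")
```

The HD channel alone exceeds the cited range several times over, while the return channel stays below it.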
V. Conclusion
It has been shown that currently available PCs are able to perform realtime compression of HDTV signals using an open-source software solution. There are now devices on the market that allow VGA capture using DirectShow-compatible drivers, meaning that most of the effort described in the article could be spared. It was observed that the MPEG-4 encoder implementation in VideoLAN Client does not fill the defined bandwidth. The current choice of codecs could also be revised. As H.264 provides a lower bitrate while maintaining the same overall picture quality, it would be advisable to switch to this newer codec. What is more, x264, the open-source implementation of this codec, has been optimized for low-latency encoding since March 2010. Latency is a crucial factor for video conferencing and video consulting applications. The latency currently obtained makes interaction between the two locations quite challenging, and without some training it is not possible to have a fluent conversation. VideoLAN Client was not designed for low-latency processing, so further improvements would include replacing it with another application, possibly FFMPEG.
References
[1] B. Reeves, D. Voelker, Effects of Audio-Video Asynchrony on Viewer’s Memory, Evaluation of Content and Detection Ability.
[2] M. D. Pesce, Programming Microsoft DirectShow for Digital Video and Television. Microsoft Press, 2003.
[3] B. Furht, Handbook of Multimedia Computing. CRC Press LLC, 1999.
[4] Video Electronics Standards Association, VESA Monitor Timing Specifications, Version 1.0, Revision 11, 2007.
[5] Matrox Electronic Systems Ltd., Matrox Solios eA/XA technical specification.
Wojciech Zabierowski (Assistant Professor at the Department of Microelectronics and Computer Science, Technical University of Lodz) was born in Lodz, Poland, on April 9, 1975. He received the M.Sc. and Ph.D. degrees from the Technical University of Lodz in 1999 and 2008, respectively. He is an author or co-author of more than 70 publications: journal articles and, for the most part, papers in international conference proceedings. He has been a reviewer for six international conferences and has supervised more than 90 M.Sc. theses. His work focuses on internet technologies and the automatic generation of music, including linguistic analysis of musical structure.