CC BY
Rajnikant Rathod, M.S.Holia RT&A, Special Issue № 1 (60)

ANALYTICAL STUDIES RELATING TO BANDWIDTH Volume 16, January 2021

ANALYTICAL STUDIES RELATING TO BANDWIDTH EXTENSION FROM WIDEBAND TO SUPER WIDEBAND FOR NEXT GENERATION WIRELESS COMMUNICATION

Rajnikant Rathod, M. S. Holia

Gujarat Technological University rathod.45@gmail.com,msholia@bvmenginnering.ac.in

Abstract

In current communication systems, the speech (voice) signal reconstructed at the receiver sounds muffled, barely audible, and thin because its bandwidth is limited to 300 Hz-3.4 kHz. To restore the naturalness of the speech (voice) signal, narrowband (N.B.) speech encoders must be upgraded to wideband (W.B.) encoders supporting a bandwidth of 50 Hz-7 kHz. The migration from N.B. to fully W.B.-compatible systems takes a long time, because both the terminal and the network must be altered to make the N.B. system compatible with the W.B. system. In the meantime, a novel technique has been developed to widen the N.B. bandwidth of the speech (voice) signal at the handset (receiver) end to improve the final speech quality. The technique of recovering the original W.B. signal from a band-limited N.B. speech (voice) signal without actually transmitting the W.B. signal is called bandwidth extension (B.W.E.). The same concept applies to obtaining a super wideband (S.W.B.) signal from the W.B. signal. B.W.E. based on sinusoidal transform coding, linear prediction, and non-linear devices gives good results compared with the spectral folding/spectral translation approaches employed by various researchers. Various coding algorithms have also been developed for S.W.B. and full band (F.B.) to take full advantage of the growing telecommunications bandwidth, predominantly for the internet. In the proposed method, which is based on the source filter model, the fundamental idea adopted for B.W.E. is the separate extension of the spectral envelope and the residual signal. Each part is processed separately through a different speech enhancement procedure to obtain the high-band component, which is added to the resampled and delayed version of the signal to obtain the final extended output. The output is evaluated through subjective and objective measures, and the results are compared with a baseline algorithm and a next-generation super wideband coder to show that they are comparable with both. No algebraic evaluation is needed to obtain the missing high-band components from the original W.B. signal, and the method is highly efficient and introduces only a small delay.

Keywords: Super wideband, Linear prediction analysis, Enhanced voice service coder, Mean opinion score

I. Introduction

Third- and fourth-generation wireless communication (W.C.) systems have become popular in the world market because of their low-bit-rate coders. They were proposed to provide interactive multimedia communication, including teleconferencing, internet access, and an assortment of other services that become practicable with low complexity, low cost, low bit rate, and low processing delay. These are therefore the major attributes to consider when designing any particular coder. In digital telecommunication systems, there is always a demand to transmit the speech (voice) signal efficiently [1,3]. The public switched telephone network (P.S.T.N.) reduces the bandwidth (B.W.) of the transmitted speech (voice) signal from the range 50 Hz-7 kHz to 300 Hz-3.4 kHz. The reduced bandwidth leads to a muffled, barely audible, and thin sound. Listening tests conducted as per ITU-T recommendations make it clear that the speech (voice) B.W. affects the perceived speech quality [2].

The introduction of wideband (W.B.) communication systems motivated the transmission of W.B. signals with a cutoff frequency of at least 7 kHz for improved speech quality in terms of intelligibility and naturalness. The main obstacle to deploying W.B. coders and communication is upgrading the current narrowband (N.B.) coders and transmission to the W.B. system, where hardware and software upgrades and compatibility are always a major issue. Although W.B. coders offer better speech quality, a rapid replacement of the complete N.B. coding and transmission system is still not feasible because of the considerable infrastructure expenses incurred by network operators and by the customers who want to use the system. A decade has elapsed in changing over from existing N.B. systems to fully W.B.-compatible systems. The terminal and the network must be updated to make the N.B. system compatible with the W.B. system. During that span, a novel approach has been developed to widen the N.B. B.W. of the speech (voice) signal at the handset (receiver) end to improve the final speech quality [5]. A bandwidth extension (B.W.E.) system for speech signals is a compelling and reasonable option for obtaining wideband speech with excellent sound quality over existing communication infrastructures, viz. the P.S.T.N. and the Global System for Mobile Communication (G.S.M.) [1].

There are two types of bandwidth extension techniques for speech (voice). In the first, the blind bandwidth extension technique, the missing frequency components are estimated from the available N.B. speech components. No data about the missing frequencies need to be transmitted because the extension relies entirely on the N.B. speech signal, so these techniques can be implemented at the receiving end of the channel. The second, the non-blind bandwidth extension technique, depends on a steganography (data hiding) method. Most of the former methods generate W.B. speech using the source filter model (S.F.M.) representation [4], with an excitation signal and linear prediction (L.P.) coefficients for the spectral envelope. B.W.E. techniques based on data hiding embed high-frequency data into the N.B. speech bit stream, and the W.B. speech is then recovered at the user terminal with the help of this high-frequency data. A limitation of B.W.E. with side data is that the same technique must be supported at both ends of the transmission line [7,8]. In this paper, bandwidth extension of the speech (voice) signal based on the S.F.M. is carried out.

This paper is structured into six sections. Sect. 2 gives the problem definition and Sect. 3 details previously reported work. Sect. 4 discusses the theory of bandwidth extension from N.B. to W.B. and from W.B. to S.W.B. Sect. 5 presents the results obtained through a series of simulations in MATLAB. Finally, concluding remarks are given in Sect. 6.

II. Problem Definition

In present-day wired and wireless communication systems, the speech (voice) signal reconstructed at the end device sounds muffled and thin because the high-band (H.B.) spectral components are absent [1]. The limited bandwidth of 300 Hz-3.4 kHz reduces the quality and clarity of the voice signal because of the missing high-frequency components, which play a significant role particularly in consonant sounds [6]. To restore the naturalness of the speech (voice) signal, N.B. speech encoders must therefore be upgraded to W.B. encoders that support a bandwidth (B.W.) of 50 Hz-7 kHz.

In the last couple of decades, little research has been carried out on upgrading existing N.B. systems to fully W.B.-compatible systems. The terminal and the network must be modified and upgraded to make the N.B. system compatible with the W.B. system. During that span, a novel technique has been developed to widen the N.B. B.W. of the speech signal at the handset (receiver) end to improve the final speech quality. This technique converts original N.B. signals into artificial W.B. signals by estimating the missing high-frequency content from the existing low-frequency content. W.B. improves speech reproduction without reaching the natural quality of face-to-face conversation or the high quality of professionally recorded speech. Nowadays, various coding algorithms have been developed for super wideband (S.W.B.) (50 Hz-14 kHz) and full band (F.B.) (20 Hz-20 kHz) to take full advantage of the growing telecommunications bandwidth, predominantly for the internet. In next-generation wireless communication systems, numerous smart devices support premium speech communication services at S.W.B. The S.W.B. signal can be obtained from the W.B. signal by applying the concept of B.W.E., without increasing the data rate and without changing any network component. This paper focuses on a novel approach based on the S.F.M. and evaluates its performance on speech signals.

III. Previous works.

When studying bandwidth extension, several parameters and constraints need to be taken into consideration in order to propose a highly efficient algorithm that introduces only negligible latency. During the literature review, many research papers, journals, and other articles on bandwidth extension were consulted. A general overview of bandwidth extension is presented in many research papers [8-17]; this step led us to define the research problem. [18] gives a solution for reducing wideband speech coder bit rates by coding the parameters of wideband speech (voice) with a notable increase over the bit rates of N.B. coders. [19-22] discuss various approaches based on the linear prediction algorithm using L.P.C. coefficients, focus on the codebook mapping approach, model the bandwidth extension, and make subjective comparisons for various audio wave files. Performance assessment of the speech (voice) signal based on the sub-band filter and the source filter model is discussed in [17,23], where linear prediction is found to be an effective tool for bandwidth extension that plays a very significant role in data rate compression. Linear prediction analysis and synthesis and the associated stability criteria were also studied in detail. For next-generation wireless communication, bandwidth extension from the signal itself (without side information) is useful for data rate reduction, because no side information has to be sent to reproduce the signal at the receiver side. Codebook mapping [24,25], linear mapping [26], neural networks, etc. are used to estimate the missing components. [27-29] identify potential features of speech and evaluate their performance for B.W.E. applications. [30,31] propose a novel method for obtaining bandwidth-extended output at the far-end terminal without altering or upgrading the current N.B. system. In [17] the author reviews various methods for bandwidth extension. In [16] the author designs an S.F.M. based on the vocal tract to retrieve bandwidth-extended output.

IV. Discussion

I. B.W.E. from N.B. to W.B. Speech Conversion

In digital signal processing, signals are band-limited according to the sampling frequency used: N.B., W.B., and S.W.B. speech have sampling frequencies of 8 kHz, 16 kHz, and 32 kHz, respectively. The proposed method for W.B. to S.W.B. conversion is discussed on the basis of N.B. to W.B. speech conversion and the baseline algorithm (H.F.B.E.). These are blind methods because they estimate the missing H.F. components from the available L.F. components. The flow charts of the baseline and proposed algorithms show how the W.B./S.W.B. speech signal is obtained at the receiver side without actually transmitting the W.B./S.W.B. signal.

II. General Model for BWE

The fundamental idea adopted here for the bandwidth extension system is the separate extension of the spectral envelope and the residual signal, as depicted in "Fig. 1". First, the incoming telephone-band signal is analyzed through L.P.C. analysis. The spectral envelope is then extended on the basis of the telephone-band spectral envelope and further telephone-band short-time features. The extended envelope defines the characteristic of the shaping filter. The extended residual signal for this shaping filter is calculated from the telephone-band residual signal, as highlighted in "Fig. 1". According to the linear model of speech production, the synthetic signal is generated by driving the shaping filter with the extended residual signal. The power of the resulting synthetic signal has to be matched to the telephone-band signal power so that both signals can be added to build the desired wideband signal [6].
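The power-matching step at the end of this model can be illustrated with a minimal sketch. It assumes that power is compared over a whole frame (a practical system might compare it only within the telephone band), and the function names and sample values are hypothetical, not taken from the paper.

```python
import math


def mean_power(x):
    """Average power (mean of the squared samples) of a sequence."""
    return sum(v * v for v in x) / len(x)


def match_power(synthetic, reference):
    """Scale `synthetic` so that its mean power equals that of `reference`."""
    p_syn = mean_power(synthetic)
    if p_syn == 0.0:
        return list(synthetic)          # silent frame: nothing to scale
    gain = math.sqrt(mean_power(reference) / p_syn)
    return [gain * v for v in synthetic]


# Toy frames (hypothetical values).
telephone_band = [0.5, -0.4, 0.3, -0.2]
synthetic = [0.05, -0.02, 0.04, -0.01]
scaled = match_power(synthetic, telephone_band)
```

After scaling, the two frames carry the same mean power, so adding them does not let one band dominate the other.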

III. Baseline System Model for B.W.E.

As shown in "Fig. 2", the baseline system model accepts decoded data from any lossy audio decoder as input and recreates the high frequencies blindly, i.e. by using only the decoded audio signal and nothing else. Since the bandwidth of the input signal is unknown, it is first estimated by a real-time bandwidth detection process. After the highest frequency present in the signal at a given time has been detected, the decoded signal is divided into sub-bands up to half the detected highest frequency of the signal. Each sub-band signal is then passed individually through a non-linearity to generate harmonics. The generated harmonics are gain-scaled to achieve spectral envelope shaping and added back to the original signal [32,33].
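The way a non-linearity creates new high-frequency content can be seen in a small sketch. It assumes a full-wave rectifier (absolute value) as the non-linear device and a single 1 kHz tone as a stand-in for one sub-band; bandwidth detection, sub-band filtering, and gain scaling are left out, and all names are hypothetical.

```python
import math


def tone(freq_hz, fs_hz, n):
    """n samples of a unit-amplitude sine at freq_hz."""
    return [math.sin(2.0 * math.pi * freq_hz * i / fs_hz) for i in range(n)]


def magnitude_at(x, freq_hz, fs_hz):
    """Normalized magnitude of the DFT of x evaluated at one frequency."""
    re = sum(v * math.cos(2.0 * math.pi * freq_hz * i / fs_hz)
             for i, v in enumerate(x))
    im = sum(v * math.sin(2.0 * math.pi * freq_hz * i / fs_hz)
             for i, v in enumerate(x))
    return math.hypot(re, im) / len(x)


fs = 32000                      # assumed output sampling rate
n = 3200                        # 100 ms, a whole number of tone periods
x = tone(1000.0, fs, n)         # stand-in for one sub-band signal
y = [abs(v) for v in x]         # full-wave rectifier as the non-linearity

before = magnitude_at(x, 2000.0, fs)   # essentially zero: no 2 kHz content
after = magnitude_at(y, 2000.0, fs)    # clearly non-zero: new harmonic
```

The rectified tone contains energy at twice the input frequency that the input did not have, which is the effect the baseline exploits to fill the missing upper band.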

IV. Proposed System Model for B.W.E.

"Fig. 3" depicts the block diagram of the proposed S.F.M.-based algorithm for W.B. to S.W.B. speech (voice) conversion. The block diagram is divided into four main parts, as described below:

(1) The first and most important step is to acquire the W.B. input signal from the pre-processing stage and to perform framing and windowing on it.

(2) In the second step, the output of the first step is processed by the L.P. algorithm, which divides the signal into two parts, the spectral information and the residual error signal. The missing H.F. components are thus estimated from the available L.F. components only.

(3) In the third step, the original L.F. components are extracted from the input W.B. frame by zero insertion.

(4) In the final step, the L.F. and H.F. components are added together to obtain the estimated S.W.B. output.
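Step (3) can be sketched as follows. The sketch assumes an upsampling factor of 2 (16 kHz W.B. to 32 kHz S.W.B.) and uses a three-tap triangular filter as a hypothetical stand-in for the low-pass filter; the paper does not specify the filter that was actually used.

```python
def zero_insert(x, factor=2):
    """Upsample by inserting factor-1 zeros after every input sample."""
    out = [0.0] * (len(x) * factor)
    out[::factor] = x
    return out


def fir_filter(x, taps):
    """Causal FIR filter; the output has the same length as the input."""
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * x[n - k]
        out.append(acc)
    return out


wb_frame = [1.0, 2.0, 3.0, 4.0]          # toy W.B. samples (hypothetical)
up = zero_insert(wb_frame, 2)            # zeros between the original samples
smooth = fir_filter(up, [0.5, 1.0, 0.5]) # assumed interpolation low-pass
```

Zero insertion alone mirrors the spectrum into the new upper band; the low-pass filter removes that image so that only the original L.F. content remains at the higher sampling rate.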


V. Detailed Analysis of Proposed System Model

Framing

Partitioning the speech signal into frames is the first basic component of the proposed approach. On the whole, a speech (voice) signal is not stationary, but it is typically stationary within windows of 20 ms. Therefore the signal is divided into frames of 20 ms, which corresponds to n1 samples:

$$ n_1 = t_{st} \cdot f_s \qquad (1) $$

where t_st is the frame duration and f_s is the sampling frequency.
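As a worked instance of equation (1), the sketch below computes the frame length for 20 ms frames and splits a signal into frames. It assumes non-overlapping frames and drops a trailing partial frame; both choices, and the function names, are illustrative assumptions.

```python
def samples_per_frame(frame_ms, fs_hz):
    """Equation (1): number of samples in a frame of frame_ms milliseconds."""
    return int(round(frame_ms * fs_hz / 1000.0))


def split_into_frames(signal, frame_len):
    """Non-overlapping frames; a trailing partial frame is dropped."""
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


n1 = samples_per_frame(20, 16000)        # 20 ms at 16 kHz -> 320 samples
signal = [0.0] * 1300                    # placeholder signal (hypothetical)
frames = split_into_frames(signal, n1)   # four complete frames
```

At the W.B. sampling rate of 16 kHz a 20 ms frame holds 320 samples; at 32 kHz the same duration holds 640.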

Fig. 1. General Model for B.W.E. [9] (input: telephone-band signal)

Fig. 2. High Frequency Bandwidth Extension (H.F.B.E.) Algorithm [32] (blocks: Bandwidth Detection, Sub-band Filtering, Non-Linear Processing, Post-Processing, Delay; input: Bandwidth Limited Input (WB); output: Bandwidth Extended / S.W.B. Output)

"Fig. 4" depicts a pictorial representation of framing the signal into four frames of 20 ms.

Windowing

During frame blocking, signal discontinuities may arise at the beginning and end of each frame. To reduce these discontinuities, the next step is windowing. The window function is non-zero only inside a chosen interval and evaluates to zero outside it. When multiplied with the original speech frame, a tapered window attenuates the beginning and the end of the frame toward zero and thereby minimizes spectral distortion at both ends. The simplest window function, the rectangular window, is given by

$$ \omega(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (2) $$

A more commonly used, somewhat smoother function (the Hamming window) is defined as

$$ \omega(n) = \begin{cases} 0.54 - 0.46\cos\left(\dfrac{2\pi n}{N-1}\right), & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (3) $$

where n is the sample number within a frame and N is the total number of samples in the window; the window length is typically 10-30 ms [6].
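The two window functions can be written directly from their definitions. This is a minimal sketch with hypothetical function names; the frame length of 320 samples assumes a 20 ms frame at 16 kHz.

```python
import math


def rectangular_window(N):
    """Rectangular window: 1 for 0 <= n <= N-1."""
    return [1.0] * N


def hamming_window(N):
    """Hamming window: 0.54 - 0.46*cos(2*pi*n/(N-1)) for 0 <= n <= N-1."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1))
            for n in range(N)]


def apply_window(frame, window):
    """Sample-by-sample product of a frame and a window of equal length."""
    return [s * w for s, w in zip(frame, window)]


N = 320                                   # 20 ms at 16 kHz (assumed)
w = hamming_window(N)
windowed = apply_window([1.0] * N, w)     # constant test frame (hypothetical)
```

Note that the Hamming window does not reach exactly zero at its edges: the first and last coefficients equal 0.54 - 0.46 = 0.08, which still attenuates the frame boundaries strongly compared with the rectangular window.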

Linear Prediction

Linear prediction (L.P.) is the heart of the bandwidth extension algorithm for N.B. to W.B. and for W.B. to super wideband (S.W.B.). In linear prediction, the present sample is estimated as a linear combination of past samples [32]. "Fig. 5" represents the all-pole spectral shaping synthesis filter. Typically, three filters, namely the glottal pulse model G(z), the vocal tract (V.T.) model V(z), and the radiation model R(z), are used to model speech production. The glottal pulse model (G.P.M.) shapes the pulse train before it is used as input to V(z). The three models together can be represented by a single transfer function H(z), i.e.:

$$ H(z) = G(z)\,V(z)\,R(z) $$

where H(z) is called the synthesis filter and is shown in "Fig. 5". The synthesis filter can be represented as the inverse of the analysis filter A(z), i.e.:

$$ H(z) = \frac{1}{A(z)} $$

In this way, a voice signal can be parameterized suitably and precisely [10].

H.F. & L.F. Component Addition to Get the Estimated S.W.B. Speech Signal

As discussed in Step 2 of the proposed system model for B.W.E., after framing, windowing, and L.P. analysis the signal is divided into two parts, the spectral envelope estimate and the residual error signal. Both parts are processed separately as shown in "Fig. 3" and shaped by the shaping filter and a high-pass filter to obtain the H.F. component. The band-limited signal at the input is resampled via zero insertion to obtain the L.F. component. The two components are then added via the overlap-add method to obtain the estimated S.W.B. output.

Fig. 3. Block diagram of the proposed approach for W.B. to S.W.B. (blocks: Input from Preprocessing, Windowing, Zero Insertion, LP, LPF, $1 + \sum_{k=1} a_k^{wb} z^{-k}$, $H(z)|_{z=e^{j\omega}}$, FFT, HPF, IFFT, Synchronization; output: Estimated SWB, Xswb)

Fig. 4. Pictorial Representation for Framing [7].

Fig. 5. All Pole Spectral Shaping Synthesis Filter

"Fig. 6" depicts the proposed flow for W.B. to S.W.B. conversion and for the subjective and objective measurements. As shown in the flow diagram, the simulation parameters are defined first and the required filters are loaded. The AMR-WB, E.V.S., and S.W.B. speech files are then read with the audio read function and passed through the H.F.B.E. algorithm function shown in "Fig. 7". In the H.F.B.E. baseline algorithm, the W.B. signal is first upsampled via zero insertion and a low-pass filter. Bandwidth detection, sub-band filtering, and non-linear devices (N.L.D.) are then used to extract the highest octave from the upsampled W.B. signal and to compute its absolute value. After that, post-processing with filtering captures the required parts of the spectrum and omits the rest. Finally, the result is added to a delayed version of the input to obtain the required S.W.B. output.

"Fig. 8" depicts the flow chart of the proposed algorithm for obtaining the estimated S.W.B. signal. The basic difference between the baseline and the proposed approach is that in the baseline only the AMR-WB input, filter, and gain parameters are processed, while the proposed approach processes three additional parameters: the L.P.C. order, the number of NFFT points, and the S.W.B. window length. In the proposed method, framing of the input W.B. signal, separation of each framed signal into a spectral envelope and a residual error, and estimation of the missing H.B. components are done in a loop that runs over all frames. After delay adjustment, the H.B. and N.B. components are added to obtain the estimated S.W.B. signal. Spectrograms and objective and subjective measurements are then produced for the estimated S.W.B. signal with respect to the input wave files.

All experiments reported here were carried out using voice recordings from the CMU database [34] and the TSP database [35] at different sampling rates fs.
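The per-frame separation into spectral envelope and residual error used by the proposed flow can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is one common way to compute L.P. coefficients and is an assumption here; the paper does not state which solver was used. The convention A(z) = 1 + sum of a[k] z^-k is assumed, and the test frame is a hypothetical signal whose exact coefficients are known.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n-k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Coefficients of A(z) = 1 + sum_{k=1..order} a[k] z^-k and the final
    prediction-error energy, from autocorrelation values r[0..order].
    Assumes r[0] > 0 (a non-silent frame)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lp_residual(x, a):
    """Residual e[n] = sum_k a[k] * x[n-k], i.e. x filtered by A(z)."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


# Toy frame (hypothetical): impulse response of 1 / (1 - 0.9 z^-1).
frame = [0.9 ** n for n in range(200)]
order = 2                  # the paper reports L.P. orders 12, 16 and 24
a, err = levinson_durbin(autocorrelation(frame, order), order)
residual = lp_residual(frame, a)
```

For this test frame the recursion recovers a[1] close to -0.9 and a[2] close to 0, and the residual collapses to the single exciting impulse. The coefficients describe the spectral envelope, and the residual carries what the envelope does not explain.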

Data Pre-Processing and Assessment for W.B. to S.W.B.

"Fig. 9" demonstrates the data pre-processing and assessment for W.B. to S.W.B. The TSP and CMU ARCTIC databases were downsampled to S.W.B. signals so that both databases have a common fs of 32 kHz. Downsampling can be done with the ResampAudio tool contained in the AFsp package [36]. The voice level of all utterances in both databases was set to -26 dBov [37] to produce Xswb. Next-generation voice coder (E.V.S.) encoding [38] is then applied to produce Xevs. Xswb is also downsampled to 16 kHz and passed through a band-pass filter (B.P.F.) [39] as per Recommendation P.341 to obtain Xwb, to which AMR-WB coding [40] is applied to produce Xamr (Xwb in Fig. 10 is replaced by Xamr). As shown in "Fig. 3", Xamr is the input to the baseline/proposed algorithm, which produces the estimated S.W.B. output. The proposed B.W.E. algorithm is evaluated and compared with AMR-WB and next-generation voice coder (E.V.S.) processed voice signals and with the baseline H.F.B.E. algorithm [32].

Subjective & Objective Measurement for W.B. to S.W.B.

For quality measurement, which is highly subjective in nature, subjective evaluation based on comparison mean opinion score (C.M.O.S.) and mean opinion score (M.O.S.) ratings is performed on various speech files [28,41]. In each test, the bandwidth-extended signals are compared with Xevs and Xehbe. Each test was carried out by 56 listeners, of whom 28 were male and 28 were female. They were asked to compare the quality of 13 randomly ordered pairs of speech signals X and Y, where either X or Y was processed with the proposed algorithm and the other with the baseline or the next-generation coder. Each listener rated the selected wave files on a seven-point scale from -3 to 3, where -3 means much worse, -2 worse, -1 slightly worse, 0 the same quality, 1 slightly better, 2 better, and 3 much better. The listeners were between 21 and 50 years of age. As shown in Table 1, each listener gave a rating on the scale of -3 to 3 for each of the 13 wave files, and the ratings of all listeners were then averaged to obtain the M.O.S. score of the proposed algorithm and of the baseline/next-generation coder algorithm. The table here is prepared for the proposed coder; a table can be prepared in the same way for the baseline/next-generation coder algorithm. The samples were played over good-quality Logitech headphones. The speech files used for quality measurement are available online.

For intelligibility measurement, which is highly objective in nature, P.E.S.Q. can be characterized as

$$ y = 0.999 + \frac{4.999 - 0.999}{1 + e^{-1.3669\,x + 3.8224}} \qquad (1) $$

where x is the raw P.E.S.Q. output (ITU P.862.2) and y is the M.O.S.-L.Q.O. score. The expression above gives the M.O.S.-L.Q.O. score equivalent to a P.E.S.Q. score, as discussed in the subjective measurement for W.B. to S.W.B. Conversely, the expression below can be used to obtain the P.E.S.Q. output from the M.O.S.-L.Q.O. score [41]:

$$ x = \frac{4.6607 - \ln\left(\dfrac{4.999 - y}{y - 0.999}\right)}{1.4945} \qquad (2) $$
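The mapping and its inverse can be sketched as a pair of functions. By default the sketch uses the constants of the forward expression (1.3669 and 3.8224) in both directions, so that the two functions are exact inverses of each other. This is an assumption made for the example: the inverse expression printed in the text uses 4.6607 and 1.4945, which, as far as we can tell, are the constants of the ITU-T P.862.1 narrowband mapping; they can be passed as `a=1.4945, b=4.6607`. The function names are hypothetical.

```python
import math


def pesq_to_mos_lqo(x, a=1.3669, b=3.8224):
    """Map a raw P.E.S.Q. score x to M.O.S.-L.Q.O. with a logistic curve."""
    return 0.999 + (4.999 - 0.999) / (1.0 + math.exp(-a * x + b))


def mos_lqo_to_pesq(y, a=1.3669, b=3.8224):
    """Inverse mapping; valid for 0.999 < y < 4.999."""
    return (b - math.log((4.999 - y) / (y - 0.999))) / a


raw = 2.5                              # example raw score (hypothetical)
mos = pesq_to_mos_lqo(raw)
recovered = mos_lqo_to_pesq(mos)       # round trip returns the raw score
```

Because the curve is logistic, the mapped score always stays strictly between 0.999 and 4.999, which is why the inverse is only defined on that open interval.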

V. Result Analysis

From the simulation results obtained in MATLAB, in terms of subjective measurement (mean opinion score), the result of the proposed coder is comparable with the baseline algorithm and the next-generation super wideband E.V.S. coder, as seen in "Fig. 10". Using equation (2) of the P.E.S.Q. mapping, the M.O.S.-L.Q.O. score can be converted into a P.E.S.Q. score, and in terms of P.E.S.Q. score, as for the mean opinion score, the proposed S.F.M.-based method again gives results comparable with the baseline algorithm and the next-generation super wideband E.V.S. coder. The comparative analysis of all algorithms for various wave files is shown in Tables 1 and 2, and from these results it can be concluded that the proposed coder is comparable with the baseline algorithm and the next-generation super wideband E.V.S. coder. The spectrograms of the input and output speech files for the baseline, the proposed, and the next-generation coder show that in the bandwidth-limited input signal (N.B./W.B.) the required spectral components are absent because the H.B. information is missing, and that the result obtained by the proposed coder is comparable with the baseline algorithm and the next-generation super wideband E.V.S. coder, as shown in "Fig. 11" and "Fig. 12".

VI. Conclusion

In this paper, an approach to B.W.E. based on the S.F.M. (L.P.) for W.B. to S.W.B. conversion is discussed, with measurements based on subjective and objective criteria. Experiments on various speech files, together with a comparison of the spectrograms of the estimated S.W.B. signal against the baseline algorithm and the next-generation super wideband (E.V.S.) coder algorithm, indicate that the results obtained are comparable with both. The spectrograms also show that in the bandwidth-limited input signal (N.B./W.B.) the required spectral components are absent because the H.B. information is missing.

Fig. 6. Proposed Flow for W.B. to S.W.B. & Subjective and Objective Measurement (steps: load filters; read speech files; band-limit the extended speech file to 15 kHz; remove a few samples at the start and end to remove inconsistencies in the initial and last 2 frames; subjective & objective measures for the given speech files; plot and analyze various spectrograms)

Fig. 7. Baseline Algorithm Proposed Flow (steps: define input and output parameters; define filter delays; up-sample the input WB signal; extract the highest octave from the up-sampled WB signal & find the absolute value using function; synthesize extended speech)

Fig. 10. MOS-LQO for PROPOSED, HFBE, LP_order=12,16,24 & EVS Algorithm for one SWB wav file (file: bdl arctic a0001.wav; bars: Proposed Algorithm with LP-12, LP-16, and LP-24, HFBE Algorithm, EVS Algorithm, all with Gain = Unity; vertical axis: MOS-LQO, ticks from 1.1 to 1.14)

TABLE I. M.O.S.-L.Q.O. for Proposed, H.F.B.E., LP_order=12,16,24 & E.V.S. algorithm for 13 different speech files

Speech Files    Mean Opinion Score
                Proposed (LP_12)   Proposed (LP_16)   Proposed (LP_24)   HFBE (Baseline)   EVS
MA01_01         1.469              1.477              1.473              1.816             4.644
MA01_02         2.116              2.128              2.123              2.573             4.644
MA01_03         1.726              1.739              1.729              2.189             4.644
MA01_04         1.572              1.562              1.558              1.992             4.644
MA01_05         1.926              1.959              1.953              2.388             4.644
FA01_01         1.143              1.145              1.142              1.223             4.644
FA01_03         1.186              1.189              1.182              1.287             4.644
FA01_04         1.145              1.15               1.142              1.241             4.644
FA01_05         1.234              1.239              1.231              1.365             4.644
arctic_a0002    1.369              1.379              1.37               1.606             4.644
arctic_a0003    1.24               1.242              1.233              1.37              4.644
arctic_a0004    1.552              1.568              1.562              1.834             4.644
arctic_a0005    1.696              1.7                1.69               1.964             4.644

TABLE II. P.E.S.Q. for Proposed, H.F.B.E., LP_order=12,16,24 & E.V.S. algorithm for 13 different speech files

Speech Files    PESQ
                Proposed (LP_12)   Proposed (LP_16)   Proposed (LP_24)   HFBE (Baseline)   EVS
MA01_01         1.335              1.34               1.336              1.8               1.094
MA01_02         2.114              2.125              2.121              2.48              1.094
MA01_03         1.712              1.721              1.718              2.17              1.094
MA01_04         1.509              1.528              1.524              1.991             1.094
MA01_05         1.913              1.925              1.918              2.34              1.094
FA01_01         1.126              1.128              1.123              0.73              1.094
FA01_03         0.165              0.17               0.164              0.92              1.094
FA01_04         1.138              1.142              1.139              0.78              1.094
FA01_05         0.77               0.78               0.772              1.114             1.094
arctic_a0002    1.128              1.137              1.131              1.54              1.094
arctic_a0003    0.78               0.79               0.782              1.138             1.094
arctic_a0004    1.46               1.48               1.470              1.7               1.094
arctic_a0005    1.66               1.66               1.63               1.958             1.094

Fig.11. Spectrogram for speech file A1.wav

Fig. 12. Spectrogram for speech file MA01_01.wav

References

[1] T.Rappaport,"WirelessCommunications:PrinciplesandPractice", Prentice-Hall, 1996..

[2] P. Jax and P. Vary "Bandwidth extension of speech signals: a catalyst for the introduction of wideband speech coding" IEEE Communications Magazine,, vol. 44, 2006, pp.5106-5111.

[3] Pooja Gajjar, Ninad Bhatt, Yogeshwar Kosta,ABE of Speech & its Applications in Wireless Communication Systems: A review, IEEE Computer society, 2012,pp.563-568.

[4] Vijay K. Garg & Joseph E. Wilkes, "principles and application of GSM",Pearson Education, 2004.

[5] P.Jax and P. Vary, On artificial bandwidth extension of telephone speech, Signal Process., 2003 ,vol. 83, pp. 1707-1719.

[6] Laura Laaksonen, "ABE of narrowband speech - enhanced speech quality and intelligibility in mobile device", Doctoral Dissertations 64/2013, Aalto University, 2013.

[7] Wai C. Chu, "Speech Coding Algorithms: Foundation and Evolution of Standardized Coders", Wiley, 2003.

[8] "BWE of Narrowband Speech using Linear Prediction", Aalborg University, Institute of Electronic Systems, 2004.

[9] Ulrich Kornagel, "Techniques for ABE of telephone speech", Signal Processing (Elsevier), vol. 86, 2006, pp. 1296-1306, Germany.

[10] Peter Jax and P. Vary, "BWE of Speech Signals: A Catalyst for the Introduction of Wideband Speech Coding?", RWTH Aachen University, IEEE Communications Magazine, May 2006.

[11] N. Prasad, "Bandwidth Extension of speech signal: A comprehensive Review", MECS, 2016.

[12] Ninad S. Bhatt, "Implementation and Performance Evaluation of CELP based GSM AMR NB coder over ABE", IEEE, 2015.

[13] Ninad S. Bhatt, "Simulation and overall comparative evaluation of performance between different techniques for high band feature extraction based on ABE", Int. Journal of Speech Technology, 2016.

[14] A. Sagi and D. Malah, "Bandwidth extension of telephone speech aided by data embedding", EURASIP Journal on Advances in Signal Processing, 2007(1), 2007.

[15] S. Chen and H. Leung, "BWE by data hiding and phonetic classification", in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Honolulu, Hawaii, USA, 2007, vol. 4, pp. 593-596.

[16] G. Gandhimathian, "Analysis On SFM Based ABE", Journal of Theoretical and Applied Information Technology, 2014.

[17] Janki Patel, "Bandwidth Extension of Speech Signal: A Review", IJERECE, 2018.

[18] Schnitzler, J., "A 13.0 kbit/s WB codec based on SB-ACELP", in Proc. ICASSP, 1998, vol. 1, pp. 157-160.

[19] Makhoul, J., Berouti, M., "High frequency generation in speech coding system", in Proc. ICASSP, 1979, pp. 428-431.

[20] Carl, H., and Heute, U., "Bandwidth Enhancement of narrow band speech signals", Signal Processing VII: Theories and Applications, EUSIPCO, 1994, vol. 2, pp. 1178-1181.

[21] Yoshida, Y., and Abe, M., "An algorithm to reconstruct the wideband speech from NB speech on code book mapping", in Proc. ICSLP, 1994, pp. 1591-1594.

[22] Jax, P., and Vary, P., "WB Extension of speech using HMM", in Proc. IEEE Workshop on Speech Coding, 2000.

[23] R. N. Rathod, M. S. Holia and N. S. Bhatt, "BWE and Quality Evaluation of Speech Signal Based on QMF and SFM Using Simulink and Matlab", IJRAR, 2019, pp. 404-411.

[24] CCITT, "7 kHz Audio Coding Within 64 kBit/s", Recommendation G.722, 1988, Vol. III.4 of Blue Book, Melbourne.

[25] Cheng, Y. M., O'Shaughnessy, D., Mermelstein, P., "Statistical Recovery of WB from NB Speech", IEEE Transactions on Speech and Audio Processing, 1994, vol. 2, no. 4, pp. 544-548.

[26] Chennoukh, S., Gerrits, A., Miet, G., and Sluijter, R., "Speech enhancement via frequency BWE using LSF", Proc. IEEE Int. Conf. on Acoustics, Speech, Signal Processing, 2001, vol. 1, pp. 665-668.

[27] Gandhimathi, G., Narmadh, C., and Lakshmi, C., "Simulation of NB Speech Signal using BPN Networks", International Journal of Computer Applications (0975-8887), 2010, vol. 5, pp. 38-42.

[28] D. Zaykovskiy and B. Iser, "Comparison of neural networks and linear mapping in an application for bandwidth extension", in Proc. of Int. Conf. on Speech and Computer (SPECOM), 2005, pp. 1-4.

[29] Gandhimathi, G., Jayakumar, S., "Speech Enhancement Using Artificial Bandwidth Extension Algorithm in Multicast Conferencing through Cloud Services", Information Technology Journal, ISSN 1812-5638, 2013, pp. 1-8.


[30] N. Bhatt, Y. Kosta, "A novel approach for artificial bandwidth extension of speech signals by LPC technique over proposed GSM FR NB coder using high band feature extraction and various extension of excitation methods", International Journal of Speech Technology, 2015, 18(1), pp. 57-64.

[31] Ninad S. Bhatt, "A novel approach for artificial bandwidth extension of speech signals by LPC technique over proposed GSM FR NB coder using high band feature extraction and various extension of excitation methods", Springer, 2014.

[32] M. Arora, "High Quality Blind BWE of Audio for Portable Player Applications", Audio Engineering Society, Paris, France, 2006.

[33] Erik Larsen and Ronald M. Aarts, "Audio Bandwidth Extension", John Wiley and Sons, 2004.

[34] J. Kominek, A. Black, "CMU ARCTIC databases", [Online]: http://festvox.org/cmuarctic/index.html.

[35] P. Kabal, "TSP Speech Database", 2002, pp. 02-10, [Online]: http://mmsp.ece.mcgill.ca/Documents/Data/.

[36] ITU, 2012. [Online]: https://www.itu.int/rec/T-REC-P.501.

[37] "ITU-T Recommendation P. 56, Objective measurement of active speech level," ITU, 2011.

[38] "Codec for Enhanced Voice Services; ANSI C Code (fixed point) (3GPP TS 26.442 ver. 13.3.0 rel. 13)," 2016.

[39] "ITU-T Recommendation G. 191, Software Tool Library 2009 User's Manual," ITU, 2009.

[40] "ANSI-C Code for the AMR-WB Speech Codec (3GPP TS 26.173 ver. 13.1.0 rel. 13)," 2016.

[41] "ITU-T Recommendation P. 800: Methods for subjective determination of transmission quality," ITU, 1996.
