Document retrieved from a search engine cache. Original document address: http://imaging.cmc.msu.ru/pub/2010.DSPA.Lyubimov_Lukin.CWM.ru.pdf
Modified: Tue Apr 6 23:00:00 2010
Indexed: Mon Oct 1 19:34:48 2012

_____________________________________________________________________________________________

nels requires the introduction of so-called guard bands and does not allow the potentially achievable transmission rate to be realized. Thus, developing a method for synthesizing finite-duration signals that are optimal in the sense of maximum energy concentration in specified frequency ranges is a pressing problem. This work considers the problem of forming channel signals in exactly this formulation. The method forms the channel signal from eigenvectors whose coefficients are the information bits of the original signal. The eigenvectors are computed from a subband matrix calculated for a given frequency range. The bit sequence must be bipolar: this prevents an eigenvector from being lost through multiplication by a zero coefficient. The coefficients may take any values, which increases the information transmission rate. The probability of correct reception is comparable to that of binary phase shift keying, which has the highest noise immunity among existing methods. The high noise immunity of the optimal method follows from the fact that the information bits are carried by eigenvectors of subband matrices, which are known to be mutually orthogonal. Transmission security is provided by permuting the eigenvectors when the channel signal is formed: restoring the data at the receiving end requires knowing the exact positions of the rearranged eigenvectors, so the key of this protection method is a map of those positions. The method significantly improves the efficiency of frequency resource use by minimizing the fraction of energy outside the given frequency range, and also significantly reduces interference between adjacent channels.
In addition, the generated channel signal has noise immunity comparable to that of binary phase shift keying, the most noise-resistant method, with no loss in information transmission speed.
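
The scheme summarized above (bipolar bits used as coefficients of mutually orthogonal eigenvectors) can be illustrated with a minimal sketch. It assumes that the substrip (subband) matrix is the usual band-energy matrix with entries (sin(w2 (i-k)) - sin(w1 (i-k))) / (pi (i-k)); the text does not give the formula, so this choice, the function names, the band limits, and the sizes `n` and `k` are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def subband_matrix(n, w1, w2):
    """n x n matrix whose quadratic form x @ A @ x is the energy of x in w1 <= |w| <= w2.

    Assumed form of the matrix; w1 and w2 are normalized angular frequencies in [0, pi].
    """
    idx = np.arange(n)
    m = idx[:, None] - idx[None, :]          # index differences i - k
    safe = np.where(m == 0, 1, m)            # avoid division by zero on the diagonal
    a = (np.sin(w2 * m) - np.sin(w1 * m)) / (np.pi * safe)
    a[m == 0] = (w2 - w1) / np.pi            # limit of the expression as i -> k
    return a

def encode(bits, basis):
    """Map bits 0/1 to bipolar -1/+1 and use them as eigenvector coefficients."""
    coeffs = 2.0 * np.asarray(bits) - 1.0
    return basis @ coeffs

def decode(signal, basis):
    """Project onto each eigenvector; the sign of the projection gives the bit."""
    return (basis.T @ signal > 0).astype(int)

n, k = 64, 8                                  # hypothetical signal length and bit count
a = subband_matrix(n, 0.2 * np.pi, 0.4 * np.pi)
_, vecs = np.linalg.eigh(a)                   # symmetric matrix: orthonormal eigenvectors
basis = vecs[:, -k:]                          # eigenvectors with the largest eigenvalues
bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])
signal = encode(bits, basis)
recovered = decode(signal, basis)
in_band = signal @ a @ signal / (signal @ signal)   # fraction of energy inside the band
```

Because the eigenvectors are orthonormal, each projection returns exactly the bipolar coefficient that was transmitted, which is why a zero coefficient would make a bit unrecoverable.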

[The Russian-language title, author list, and introduction did not survive text extraction. Surviving fragments: blind («blind») and non-blind («non-blind») methods; SBR [2]; AAC+; mp3PRO; XM Satellite Radio; Digital Radio Mondiale [2]; a figure of 50%; citations [1], [3], [4], [5], [6,7].]

_____________________________________________________________________________________________

Digital signal processing and its applications

_____________________________________________________________________________________________

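[Russian text lost in extraction; the passage refers twice to Fig. 1, whose caption is not recoverable.]

The input signal is band-limited to the cutoff frequency $f_{cut}$. It is analyzed with a short-time Fourier transform (STFT) using 30 ms windows with a 15 ms overlap.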

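[The Russian text between the formulas did not survive extraction. The formulas below are reconstructed from the surviving fragments. The accent that distinguishes the spectrum of the distorted signal and the symbol of the weight in (8) were lost, so $\hat{y}_t$ and $\alpha_t(k)$ are stand-ins. The relation between $s$ and 1 stated after (4) was also lost.]

Each windowed frame $x_t^w$ of the input signal is transformed as

$$y_t = M x_t^w, \quad t = 1, 2, \ldots, T, \quad M \in \mathbb{R}^{N \times N} \qquad (3)$$

The "rough" high-frequency content is generated by a nonlinear function $F$:

$$z_t = F(x_t^w), \quad t = 1, 2, \ldots, T \qquad (4)$$

where $F(x) = |x|^s$. With functions $g_1$ and $g_2$ applied before and after $F$:

$$z_t = g_2\left(F\left(g_1(x_t^w)\right)\right) \qquad (5)$$

$$\hat{y}_t = M z_t, \quad t = 1, 2, \ldots, T \qquad (6)$$

$$s_t(k) = \begin{cases} y_t(k), & 1 \le k \le k_{cut} \\ \hat{y}_t(k), & k > k_{cut} \end{cases} \qquad t = 1, 2, \ldots, T, \quad k \le \frac{N}{2} + 1 \qquad (7)$$

$$k_{cut} = \frac{f_{cut} N}{f_s}$$

where $f_s$ is the sampling frequency.

$$s_t(k) = \begin{cases} y_t(k), & 1 \le k \le k_{cut} \\ \alpha_t(k)\, \hat{y}_t(k), & k > k_{cut} \end{cases} \qquad t = 1, 2, \ldots, T, \quad k \le \frac{N}{2} + 1 \qquad (8)$$

The weights $\alpha_t(k)$ shape the generated high band; its energy is computed in 24 critical bands.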
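
Equations (7) and (8) keep the input spectrum up to the cutoff bin and fill the bins above it with the generated high band, unweighted in (7) and weighted in (8). A minimal sketch under stated assumptions: bins are 0-based as returned by a real FFT (the paper counts from 1), and the names `splice`, `y_rough`, and `gain`, as well as the numeric values, are hypothetical.

```python
import numpy as np

def splice(y, y_rough, gain, f_cut, fs, n):
    """Combine the low band of y with the weighted high band of y_rough.

    y, y_rough: arrays of shape (frames, n // 2 + 1) holding one-sided spectra.
    gain: per-frame, per-bin weights; all ones reproduces equation (7).
    """
    k_cut = int(f_cut * n / fs)              # cutoff bin, k_cut = f_cut * N / f_s
    s = y.copy()                             # bins 0..k_cut come from the input signal
    s[:, k_cut + 1:] = gain[:, k_cut + 1:] * y_rough[:, k_cut + 1:]
    return s, k_cut

n, fs, f_cut = 8, 8000.0, 2000.0             # toy values chosen for readability
y = np.ones((1, n // 2 + 1), dtype=complex)
y_rough = 2.0 * np.ones_like(y)
gain = 3.0 * np.ones((1, n // 2 + 1))
s7, k_cut = splice(y, y_rough, np.ones_like(gain), f_cut, fs, n)   # equation (7)
s8, _ = splice(y, y_rough, gain, f_cut, fs, n)                     # equation (8)
```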


_____________________________________________________________________________________________ Proceedings of the 12th International Conference

_____________________________________________________________________________________________

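[Russian text lost in extraction. Surviving fragments: the functions $q_i(x)$; references to [6] and to CWM [8]; captions of Fig. 2 and Fig. 3 (not recoverable); the values 44.1, 5.5, and 100; comparisons involving equations (7) and (8) and the method of [4]; an acknowledgment mentioning a 2009-2013 program.]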

1. Erik Larsen, Ronald M. Aarts, "Audio Bandwidth Extension: Application of Psychoacoustics, Signal Processing and Loudspeaker Design", John Wiley & Sons, ISBN: 0-470-85864-8, September 2004.

2. M. Dietz, L. Liljeryd, K. Kjorling, and O. Kunz, "Spectral Band Replication, a novel approach in audio coding", 112th AES convention, Munich, Germany, May 2002.
3. Chi-Min Liu, Wen-Chieh Lee, and Han-Wen Hsu, "High Frequency Reconstruction for Band-Limited Audio Signals", DAFX'03, London, UK, September 2003.
4. Manish Arora, Joonhyun Lee, and Sangil Park, "High Quality Blind Bandwidth Extension of Audio for Portable Player Applications", 120th AES convention, Paris, France, May 2006.
5. Chatree Budsabathon, Akinori Nishihara, "Bandwidth Extension with Hybrid Signal Extrapolation for Audio Coding", IEICE Trans. Fundamentals, Vol. E90-A, No. 8, August 2007.
6. N. Enbom and W.B. Kleijn, "Bandwidth Expansion of Speech Based on Vector Quantization of the Mel Frequency Cepstral Coefficients", in IEEE Workshop on Speech Coding, Porvoo, Finland, 1999.
7. M. Nilsson, H. Gustafsson, S.V. Andersen, and W.B. Kleijn, "Gaussian mixture model based mutual information estimation between frequency bands in speech", ICASSP '02, IEEE International Conference on, 2002.
8. Neil Gershenfeld, "Cluster-Weighted Modeling: Probabilistic Time Series Prediction, Characterization and Synthesis", Nature of Mathematical Modeling, MIT Press, 1998.

BANDWIDTH EXTENSION FOR AUDIO SIGNALS USING CLUSTER-WEIGHTED MODELING
Lyubimov N., Lukin A.
Moscow Lomonosov State University, Dept. of Computational Mathematics and Cybernetics
Laboratory of Mathematical Methods of Image Processing

Introduction
Bandwidth extension is the process of re-synthesizing missing frequency components in order to improve the subjective quality of an audio signal. Bandwidth extension methods can be found in modern perceptual audio coding standards, such as mp3PRO and AAC+. Such methods can be blind, when no information about the missing signal components is available, or non-blind, when certain information about the missing components is available during the synthesis stage. A typical algorithm flowchart for both blind and non-blind methods looks as follows:
1. Time-frequency decomposition,
2. "Rough" generation of high-frequency spectral content,
3. Shaping of the energy spectrum envelope of the high-frequency content,
4. Synthesis of the resulting signal from the time-frequency representation.

Algorithm description
In this work, a new algorithm for blind bandwidth extension is proposed. It is capable of accurate prediction of high-frequency energy envelopes using a Cluster-Weighted Model for MFCC coefficients of the audio signal. A bandwidth-reduced audio signal x(t) is input to the algorithm. It is transformed using an STFT with 30-ms Hann windows that overlap by 15 ms. A nonlinear distortion (waveshaping) is used to generate "rough" high-frequency components in the time domain: z(t) = |x(t)|^s. Aliasing is reduced by the use of oversampling. These high-frequency components are transformed using a similar STFT filter bank. To finally shape the resulting high-frequency signal, its energy is computed in 24 critical bands, and a regression model is developed to predict the shape of the high-frequency energy envelope from the low-frequency energy envelope. In this work, Cluster-Weighted Modeling is proposed for such prediction.
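
The analysis front end described above (waveshaping with oversampling, an STFT with 30-ms Hann windows at a 15-ms hop, and energies in 24 critical bands) can be sketched as follows. This is a minimal illustration, not the authors' code: the exponent `s=2.0`, the oversampling factor, the FFT-based resampler, and the Zwicker band edges are assumptions, since the summary gives none of these values.

```python
import numpy as np

# Zwicker critical-band edges in Hz (assumed; the text only says "24 critical bands").
BARK_EDGES = np.array([0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                       1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
                       9500, 12000, 15500], dtype=float)

def resample_fft(x, m):
    """Resample x to m samples by zero-padding or truncating its spectrum."""
    spec = np.fft.rfft(x)
    out = np.zeros(m // 2 + 1, dtype=complex)
    k = min(len(spec), len(out))
    out[:k] = spec[:k]
    return np.fft.irfft(out, m) * (m / len(x))

def waveshape(x, s=2.0, oversample=4):
    """z = |x|^s, computed at a higher rate to limit aliasing (s is a placeholder value)."""
    up = resample_fft(x, len(x) * oversample)
    z = np.abs(up) ** s
    return resample_fft(z, len(x))

def stft(x, fs, win_ms=30.0, hop_ms=15.0):
    """Hann-windowed frames -> one-sided spectra; also returns the bin frequencies."""
    n = int(round(fs * win_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))
    window = np.hanning(n)
    count = 1 + (len(x) - n) // hop
    frames = np.stack([x[i * hop:i * hop + n] * window for i in range(count)])
    return np.fft.rfft(frames, axis=1), np.fft.rfftfreq(n, 1.0 / fs)

def band_energies(spectra, freqs):
    """Sum the power of each frame inside each of the 24 critical bands."""
    power = np.abs(spectra) ** 2
    bands = np.searchsorted(BARK_EDGES, freqs, side="right") - 1   # band index per bin
    out = np.zeros((spectra.shape[0], 24))
    for b in range(24):
        out[:, b] = power[:, bands == b].sum(axis=1)
    return out

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 3000 * t)             # band-limited test tone at 3 kHz
z = waveshape(x)                             # with s = 2 this creates a 6 kHz component
spectra, freqs = stft(z, fs)
energy = band_energies(spectra, freqs)
```

With `s = 2` the distortion of a 3 kHz tone yields a constant term plus a 6 kHz component, which makes the effect of waveshaping easy to inspect in `energy`.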
The model is trained on original full-bandwidth audio signals to determine a set of clusters of low-frequency envelopes and the corresponding high-frequency envelopes. During the evaluation stage, each input low-frequency envelope is represented as a weighted sum of several "cluster" envelopes, and the corresponding high-frequency envelope is predicted as a weighted sum of the modeled high-frequency envelopes. Once the desired high-frequency envelope is calculated, the shape of the "rough" high-frequency signal is transformed to match the desired envelope using the STFT filter bank.

Results and conclusion
The algorithm has been evaluated on speech and music using a subjective evaluation protocol with several listeners. The proposed method has been compared with other methods, such as linear extrapolation of the high-frequency envelope (see full references within the paper). The proposed algorithm has shown the highest subjective quality results. Evaluations have shown that the algorithm is more effective on speech and solo music than on polyphonic music. A possible cause of this effect is the introduction of intermodulation components by the nonlinear distortion. In future work, we plan to apply source separation techniques to process signal components individually.
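
The prediction step of the algorithm description, in which the high-frequency envelope is a weighted sum of per-cluster envelopes, can be sketched as below. The sketch assumes diagonal Gaussian input densities and one local linear model per cluster, a common form of cluster-weighted modeling. The function name, the parameter layout, and the toy values are hypothetical, and training is omitted.

```python
import numpy as np

def cwm_predict(x, means, variances, priors, weights, biases):
    """Expected output E[y | x] of a cluster-weighted model.

    means, variances: (M, Dx) parameters of the input Gaussians (diagonal, assumed).
    priors: (M,) cluster probabilities.
    weights: (M, Dy, Dx) and biases: (M, Dy) define the local linear models.
    """
    # log p(x | cluster) + log p(cluster) for every cluster
    log_lik = -0.5 * np.sum((x - means) ** 2 / variances
                            + np.log(2 * np.pi * variances), axis=1)
    log_post = log_lik + np.log(priors)
    post = np.exp(log_post - log_post.max())     # subtract the max for stability
    post /= post.sum()                           # posterior cluster weights
    local = weights @ x + biases                 # (M, Dy) per-cluster predictions
    return post @ local, post

# Toy model: two clusters, 2-D input feature, 1-D output.
means = np.array([[0.0, 0.0], [4.0, 4.0]])
variances = np.ones((2, 2))
priors = np.array([0.5, 0.5])
weights = np.zeros((2, 1, 2))                    # zero slope: constant per-cluster output
biases = np.array([[1.0], [3.0]])

y_near, post_near = cwm_predict(np.array([0.0, 0.0]), means, variances, priors, weights, biases)
y_mid, post_mid = cwm_predict(np.array([2.0, 2.0]), means, variances, priors, weights, biases)
```

An input that lies halfway between the two cluster centers receives equal weights, so its prediction is the average of the two cluster outputs; an input at a cluster center is dominated by that cluster.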
