April 2004, Volume 13, Number 1

J. Novotny, P. Sovka, J. Uhlir
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment

This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficient (MFCC) analysis of speech. The main aim is to analyze and search for the optimal settings of the SCRS with respect to additive noise robustness, without the use of additional techniques for additive noise reduction. The analysis focuses on the appropriate setting of the MFCC computation, the adjustment of the silence model, and the grammar selection possibilities. It is shown that the correct performance of the SCRS strictly depends on an appropriate adjustment of the silence model. The ability to adapt the silence model is confirmed. When the SNR is higher than 15 dB, suitable performance of the SCRS can be guaranteed without any modification of the triphone speech models by: 1. the optimal setting of the MFCC computation, 2. the proper adaptation of the silence model. The assumption that a speech command recognition system is used in an environment where the SNR is higher than 15 dB is fulfilled in many applications.
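
As a rough illustration of the 15 dB condition mentioned in the abstract, the sketch below computes the SNR in dB from separately available speech and noise samples and compares it with the threshold. The function names and the assumption that speech and noise can be measured separately are illustrative only and are not taken from the paper.

```python
import math


def mean_power(samples):
    """Average power of a sequence of real-valued samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, noise):
    """SNR in dB, assuming speech and noise samples are available separately."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def meets_snr_condition(speech, noise, threshold_db=15.0):
    """True when the SNR exceeds the threshold (15 dB in the abstract)."""
    return snr_db(speech, noise) > threshold_db


# Toy data: unit-power "speech" and noise with one hundredth of that power.
speech = [1.0, -1.0] * 4
quiet_noise = [0.1, -0.1] * 4
loud_noise = [0.5, -0.5] * 4
```

With the toy data, `quiet_noise` gives an SNR of about 20 dB and satisfies the condition, while `loud_noise` gives about 6 dB and does not.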

R. Kolar, R. Jirik, J. Jan
Estimator Comparison of the Nakagami-m Parameter and its Application in Echocardiography

This article compares various estimators of the m parameter of the Nakagami distribution. This distribution has been used in many engineering applications, and we present another possible application in biomedical engineering: ultrasound tissue characterization in echocardiography. Matlab 6.5 was used as a tool for fast and efficient scientific research.
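
For orientation, one widely used estimator of the m parameter is the moment-based (inverse normalized variance) estimator, m = (E[R^2])^2 / Var(R^2). The sketch below is a minimal Python illustration of that estimator on synthetic data. It is not necessarily one of the estimators compared in the paper, and the function names, seed, and sample size are assumptions.

```python
import math
import random


def sample_nakagami(m, omega, n, rng):
    """Draw n Nakagami(m, omega) envelope samples.

    Uses the fact that if G ~ Gamma(shape=m, scale=omega/m),
    then sqrt(G) follows a Nakagami distribution.
    """
    return [math.sqrt(rng.gammavariate(m, omega / m)) for _ in range(n)]


def estimate_m_moments(envelope):
    """Moment-based estimate: m = (E[R^2])^2 / Var(R^2)."""
    power = [r * r for r in envelope]
    n = len(power)
    mean = sum(power) / n
    var = sum((p - mean) ** 2 for p in power) / n
    return mean * mean / var


# Synthetic check with a known parameter (m = 2, omega = 1).
rng = random.Random(0)
envelope = sample_nakagami(m=2.0, omega=1.0, n=20000, rng=rng)
m_hat = estimate_m_moments(envelope)
```

With this many samples, `m_hat` should land close to the true value of 2; on real ultrasound envelope data the estimate would of course depend on the tissue region analyzed.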

R. Simik
Improvement of Watson's DVQ Metric

An improvement of Watson's DVQ (Digital Video Quality) metric is introduced. This metric was chosen for its easy implementation, which results from using the DCT (Discrete Cosine Transform) to decompose video into spatial channels. The metric is extended with a segmentation tool, which is used for weighting the masked differences.

J. Tuckova, J. Santarius
Neural Network Program Package for Prosody Modeling

This contribution describes a program for one part of automatic Text-to-Speech (TTS) synthesis. Some experiments (for example [14]) documented a considerable improvement in the naturalness of synthetic speech, but this approach requires completing the input feature values by hand, which takes a lot of time for large files. Prosody therefore needs to be improved by other approaches that use only automatically classified features (input parameters). An artificial neural network (ANN) approach is used for modeling the prosody parameters. The program package contains all modules necessary for text and speech signal pre-processing, neural network training, sensitivity analysis, and result processing, as well as a module for creating the input data protocol for the Czech speech synthesizer ARTIC [1].

R. Hovancak, D. Levicky
Digital Image Watermarking in Color Models Using DCT Transformation

In recent years, access to multimedia data has become much easier due to the rapid growth of the Internet. While this is usually considered an improvement to everyday life, it also makes unauthorized copying and distribution of multimedia data much easier, and therefore presents a challenge in the field of copyright protection. Digital watermarking, which inserts copyright information into the data, has been proposed to solve this problem. In this paper, two original watermarking schemes based on the DCT are proposed for ownership verification and authentication of color images. Some color models used in the process of watermark embedding and extraction are also described.
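
To make the general idea of DCT-domain watermarking concrete, the sketch below hides one bit in a single DCT coefficient of an 8-sample block by quantizing that coefficient to an even or odd multiple of a step size. This is a generic quantization-based illustration, not either of the schemes proposed in the paper. The 1D block, the coefficient index `k`, and the step size are all assumptions chosen for brevity.

```python
import math


def dct(x):
    """Orthonormal 1D DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out


def embed_bit(block, bit, k=3, step=8.0):
    """Force coefficient k to an even (bit 0) or odd (bit 1) multiple of step."""
    c = dct(block)
    q = round(c[k] / step)
    if q % 2 != bit:
        q += 1
    c[k] = q * step
    return idct(c)


def extract_bit(block, k=3, step=8.0):
    """Read the bit back from the parity of the quantized coefficient."""
    return round(dct(block)[k] / step) % 2


# Hypothetical row of pixel intensities from one color channel.
pixels = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
marked_one = embed_bit(pixels, 1)
marked_zero = embed_bit(pixels, 0)
```

A larger step makes the embedded bit more robust to small distortions at the cost of a more visible change; a practical scheme for color images would work on 2D blocks of a chosen color component.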

J. Mihalik, V. Michalcin
3D Motion Estimation and Texturing of Human Head Model

This paper deals with 3D motion estimation of a wire-frame head model based on analysis of the parameters of the 3D global motion of the real human head in each frame of a video sequence. The proposed algorithm for 3D global motion estimation is given by the solution of 6 linear equations for three feature points extracted from the real human head in each frame. Next, an algorithm is presented for texturing the 3D wire-frame model of the human head after its estimated global motion. Texturing is carried out by a two-dimensional affine transform directly in the synthesized frames. Both proposed algorithms can achieve a very low bit rate in model-based image coding.
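
The texturing step relies on a two-dimensional affine transform, which is fully determined by three point correspondences (for example, the vertices of one wire-frame triangle before and after motion). The sketch below recovers such a transform by solving the resulting 6 linear equations as two 3x3 systems. It illustrates only this standard construction, not the paper's 3D motion estimation algorithm, and all names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 system a * x = b by Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; the transform is not unique")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_triangles(src, dst):
    """Return ((a, b, c), (d, e, f)) with x' = a*x + b*y + c, y' = d*x + e*y + f."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = solve3(a, [p[0] for p in dst])
    row_y = solve3(a, [p[1] for p in dst])
    return tuple(row_x), tuple(row_y)


def apply_affine(transform, point):
    """Map a 2D point through the affine transform."""
    (a, b, c), (d, e, f) = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Triangle vertices in the texture image and in the synthesized frame.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
transform = affine_from_triangles(src, dst)
```

Once the transform for a triangle is known, every texture pixel inside it can be mapped with `apply_affine`, which is why three feature points per triangle are sufficient.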

M. Laipert, M. Vlcek, J. Vrbata
Contribution to the Chebyshev Approximations of the Normalized Low-Pass Prototype

The standard approximation algorithms are well described in the literature, but the descriptions of some equiripple approximations have deficiencies. Chebyshev and inverse Chebyshev approximations in particular are often wrongly interpreted or implemented. In this paper, we present all formulas for computing Chebyshev approximations in a standard form. The transformations necessary for circuit implementation are also presented in analytical form.
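
As a point of reference, the sketch below evaluates the common textbook formulas for the Chebyshev (type I) normalized low-pass prototype: ripple factor, minimum order, pole locations, and attenuation. These are the widely published forms and are not claimed to match the notation or derivation used in the paper. The function names and the example specification (1 dB passband ripple, 40 dB stopband attenuation at twice the passband edge) are assumptions.

```python
import math


def ripple_factor(ap_db):
    """Ripple factor epsilon for a passband ripple of ap_db decibels."""
    return math.sqrt(10.0 ** (ap_db / 10.0) - 1.0)


def chebyshev_order(ap_db, as_db, omega_s):
    """Minimum order for stopband attenuation as_db at normalized omega_s > 1."""
    ratio = math.sqrt((10.0 ** (as_db / 10.0) - 1.0)
                      / (10.0 ** (ap_db / 10.0) - 1.0))
    return math.ceil(math.acosh(ratio) / math.acosh(omega_s))


def chebyshev_poles(n, ap_db):
    """Left-half-plane poles of the normalized prototype (passband edge at 1)."""
    a = math.asinh(1.0 / ripple_factor(ap_db)) / n
    poles = []
    for k in range(1, n + 1):
        theta = (2 * k - 1) * math.pi / (2 * n)
        poles.append(complex(-math.sinh(a) * math.sin(theta),
                             math.cosh(a) * math.cos(theta)))
    return poles


def attenuation_db(n, ap_db, omega):
    """Attenuation 10*log10(1 + eps^2 * T_n(omega)^2) for omega >= 0."""
    eps = ripple_factor(ap_db)
    if omega <= 1.0:
        t = math.cos(n * math.acos(omega))
    else:
        t = math.cosh(n * math.acosh(omega))
    return 10.0 * math.log10(1.0 + (eps * t) ** 2)


# Example specification (assumed values).
order = chebyshev_order(ap_db=1.0, as_db=40.0, omega_s=2.0)
poles = chebyshev_poles(order, ap_db=1.0)
```

For this specification the formulas give a fifth-order prototype; the attenuation at the passband edge equals the specified ripple, and the stopband requirement is met with some margin because the order is rounded up.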

B. Taha-Ahmed, M. Calvo-Ramon, L. de Haro-Ariet
The Capacity and Interference Statistics of High Car Traffic W-CDMA Street Cross-Shaped Micro-Cells (Uplink Analysis)

Since interference is related to the capacity and performance of a W-CDMA system, it is necessary to investigate the interference characteristics (the mean value and the variance). The uplink capacity and the interference statistics of the sectors of a cross-shaped W-CDMA microcell have therefore been analyzed using a geometry with 17 microcells. A single-slope propagation model with a lognormal shadowing factor has been used in the analysis. The cells have been assumed to lie in city streets with high car traffic. The capacity and the interference statistics of the sectors have been studied for different sector ranges and different side-lobe levels. The results show that the capacity increases as the sector range increases and as the side-lobe level of the antennas used is reduced.

B. Taha-Ahmed, M. Calvo-Ramon, L. de Haro-Ariet
The Performance of W-CDMA Highways Infostations

The expected value of the signal-to-noise ratio of W-CDMA infostations is derived. A model of 5 cells is used to analyze the system performance. The infostations are assumed to be located in rural zones. The performance of the infostations is studied for different breakpoint distances, different infostation separations, different numbers of users per infostation, and different bit rates.

