
    Bit rates in audio source coding

    The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a masked error spectrum, prescribing how quantization noise must be distributed over the audio spectrum to obtain a minimal bit rate and inaudible coding errors. This result can be used not only to estimate performance bounds but also directly in audio coding systems. Subband coding applications to magnetic recording and transmission are discussed in some detail. Performance bounds for this type of subband coding system are derived.

    Reducing Audible Spectral Discontinuities

    In this paper, a common problem in diphone synthesis is discussed, viz., the occurrence of audible discontinuities at diphone boundaries. Informal observations show that spectral mismatch is most likely the cause of this phenomenon. We first set out to find an objective spectral measure for discontinuity. To this end, several spectral distance measures are related to the results of a listening experiment. Then, we studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities. The number of additional diphones is limited by clustering consonant contexts that have a similar effect on the surrounding vowels, on the basis of the best-performing distance measure. A listening experiment has shown that the addition of these context-sensitive diphones significantly reduces the number of audible discontinuities.

    Time-scale and pitch modifications of speech signals and resynthesis from the discrete short-time Fourier transform

    The modification methods described in this paper combine characteristics of PSOLA-based methods and algorithms that resynthesize speech from its short-time Fourier magnitude only. The starting point is a short-time Fourier representation of the signal. In the case of duration modification, portions of this representation, which in voiced speech correspond to pitch periods, are removed or inserted. In the case of pitch modification, pitch periods are shortened or extended in this representation, and a number of pitch periods are inserted or removed, respectively. Since it is an important tool for both duration and pitch modification, the resynthesis-from-short-time-Fourier-magnitude-only method of Griffin and Lim (1984) and Griffin et al. (1984) is reviewed and adapted. Duration and pitch modification methods and their results are presented.

    Biometric Authentication System on Mobile Personal Devices

    We propose a secure, robust, and low-cost biometric authentication system on the mobile personal device for the personal network. The system consists of the following five key modules: 1) face detection; 2) face registration; 3) illumination normalization; 4) face verification; and 5) information fusion. For the complicated face authentication task on devices with limited resources, the emphasis is largely on the reliability and applicability of the system. Both theoretical and practical considerations are taken into account. The final system is able to achieve an equal error rate of 2% under challenging testing protocols. The low hardware and software cost makes the system well adaptable to a large range of security applications.

    Binary Biometric Representation through Pairwise Adaptive Phase Quantization

    Extracting binary strings from real-valued biometric templates is a fundamental step in template compression and protection systems, such as fuzzy commitment, fuzzy extractor, secure sketch, and helper data systems. Quantization and coding is the straightforward way to extract binary representations from arbitrary real-valued biometric modalities. In this paper, we propose a pairwise adaptive phase quantization (APQ) method, together with a long-short (LS) pairing strategy, which aims to maximize the overall detection rate. Experimental results on the FVC2000 fingerprint and the FRGC face database show reasonably good verification performance.

    Extraction of vocal-tract system characteristics from speech signals

    We propose methods to track natural variations in the characteristics of the vocal-tract system from speech signals. We are especially interested in the cases where these characteristics vary over time, as happens in dynamic sounds such as consonant-vowel transitions. We show that the selection of appropriate analysis segments is crucial in these methods, and we propose a selection based on estimated instants of significant excitation. These instants are obtained by a method based on the average group-delay property of minimum-phase signals. In voiced speech, they correspond to the instants of glottal closure. The vocal-tract system is characterized by its formant parameters, which are extracted from the analysis segments. Because the segments are always at the same relative position in each pitch period, in voiced speech the extracted formants are consistent across successive pitch periods. We demonstrate the results of the analysis for several difficult cases of speech signals.

    Forensic Face Recognition: A Survey

    Apart from a few papers that focus on the forensic aspects of automatic face recognition, little has been published on the topic, in contrast to the literature on developing new techniques and methodologies for biometric face recognition. In this report, we review forensic facial identification, which is the forensic experts' way of manual facial comparison. Then we review well-known works in the domain of forensic face recognition. Some of these papers describe general trends in forensics [1] and guidelines for manual forensic facial comparison and the training of face examiners who will be required to verify the outcome of an automatic forensic face recognition system [2]. Others propose a theoretical framework for the application of face recognition technology in forensics [3] and automatic forensic facial comparison [4, 5]. The Bayesian framework is discussed in detail, and we elaborate how it can be adapted to forensic face recognition. Several issues related to court admissibility and system reliability are also discussed. Until now, no operational system has been available that automatically compares an image of a suspect with a mugshot database and provides a result usable in court. Biometric face recognition can in most cases be used for forensic purposes, but the issues related to integrating the technology with the legal system remain to be solved. There is a great need for multidisciplinary research that will integrate face recognition technology with existing legal systems. In this report we present a review of the existing literature in this domain and discuss various aspects of and requirements for forensic face recognition systems, focusing in particular on the Bayesian framework.

    Transparent Face Recognition in the Home Environment

    The BASIS project is about the secure application of transparent biometrics in the home environment. Due to transparency and home-setting requirements, there is variance in the appearance of the subject. Another problem that needs attention is the extraction of features. The quality of the extracted features depends not only on proper preprocessing of the input data but also on the suitability of the extraction algorithm for this problem. Possible approaches to address problems due to transparency requirements are the use of active appearance models in face recognition, smart segmentation, multi-camera solutions, and tracking. In this paper an inventory of problems and possible solutions will be given.