Viseme-based Lip-Reading using Deep Learning
Research in automated lip reading is a rich discipline with many facets that have been the subject of investigation, including audio-visual data, feature extraction, classification networks and classification schemas. The most advanced and up-to-date lip-reading systems can predict entire sentences with thousands of different words, and the majority of them use ASCII characters as the classification schema. The classification performance of such systems, however, has been insufficient, and the need to cover an ever-expanding range of vocabulary using as few classes as possible is a challenge.
The work in this thesis contributes to the area concerning classification schemas by proposing an automated lip reading model that predicts sentences using visemes as a classification schema.
This is an alternative schema to using ASCII characters, which is the conventional class system used to predict sentences. This thesis provides a review of the current trends in deep learning-based automated lip reading and analyses a gap in the research endeavours of automated lip-reading by contributing towards work done in the region of classification schema. A whole new line of research is opened up whereby an alternative way to do lip-reading is explored and in doing so, lip-reading performance results for predicting sentences from a benchmark dataset are attained which improve upon the current state-of-the-art.
In this thesis, a neural network-based lip reading system is proposed. The system is lexicon-free and uses purely visual cues. With only a limited number of visemes as classes to recognise, the system is designed to lip read sentences covering a wide range of vocabulary and to recognise words that may not be included in system training. The lip-reading system predicts sentences as a two-stage procedure, with visemes being recognised in the first stage and words being classified in the second stage. The second stage therefore has to overcome both the one-to-many mapping problem posed in lip-reading, where one set of visemes can map to several words, and the problem of visemes being confused or misclassified in the first stage.
To develop the proposed lip-reading system, a number of tasks have been performed in this thesis. These include the classification of continuous sequences of visemes, and the proposal of viseme-to-word conversion models that are both effective in their conversion performance of predicting words, and robust to the possibility of viseme confusion or misclassification. The initial system reported has been tested on the challenging BBC Lip Reading Sentences 2 (LRS2) benchmark dataset, attaining a word accuracy rate of 64.6%. Compared with the state-of-the-art works in lip reading sentences reported at the time, the system achieved a significantly improved performance.
The lip reading system is further improved by using a language model that has been demonstrated to be effective at discriminating between homopheme words and robust to incorrectly classified visemes. This yields an improved performance in predicting spoken sentences from the LRS2 dataset, with a word accuracy rate of 79.6%. This is still better than another lip-reading system trained and evaluated on the same dataset, which attained a word accuracy rate of 77.4% and is, to the best of our knowledge, the next best observed result attained on LRS2.
Antibacterial effect of essential oils of two plants Eucalyptus camaldulensis and Artemisia herba alba on some bacterial strains
Essential oils are secondary plant metabolites and have many therapeutic properties. The aim of our study is to determine the antibacterial effect of the essential oils of two plants cultivated in a semi-arid region located in the Northeast of Algeria (Tebessa): Eucalyptus camaldulensis (Myrtaceae) and Artemisia herba alba (Asteraceae). The yields of essential oils of the two plants were 1.45 ± 0.026 and 1.21 ± 0.061 g/100 g of the dry matter of the aerial part, respectively. The test of the antibacterial effect is based on the diffusion method on solid medium (sensitivity); this method allows us to determine the susceptibility or resistance of an organism to the sample studied. Our study reveals that E. camaldulensis essential oil had very strong activity on all bacterial strains tested, except on Pseudomonas aeruginosa and Enterococcus faecalis, for which there was no inhibitory effect. However, A. herba alba essential oil had very strong activity on all bacterial strains tested except on Pseudomonas aeruginosa. The MIC of Artemisia essential oil ranged between 0.08 and 1.57 µL/mL, with the lowest activity against S. aureus and P. mirabilis (1.57 µL/mL) and the highest activity against E. faecalis, E. coli, and K. pneumoniae (0.09 µL/mL). The MIC of the second plant's essential oil ranged between 0.08 and 0.36 µL/mL, with the lowest activity against P. mirabilis (0.36 µL/mL) and the highest against S. saprophyticus and E. coli (0.08 µL/mL). Statistical analysis shows that the two plants have the same efficacy against S. saprophyticus, while E. faecalis, K. pneumoniae and P. mirabilis are affected more by the essential oil of A. herba alba. E. camaldulensis, in turn, is more effective than A. herba alba against S. aureus and E. coli. Therefore, the essential oils of E. camaldulensis and A. herba alba suggest avenues for further non-clinical and clinical studies.
An Effective Conversion of Visemes to Words for High-Performance Automatic Lipreading.
As an alternative approach, viseme-based lipreading systems have demonstrated promising performance results in decoding videos of people uttering entire sentences. However, the overall performance of such systems has been significantly affected by the efficiency of the conversion of visemes to words during the lipreading process. As shown in the literature, the issue has become a bottleneck of such systems where the system's performance can decrease dramatically from a high classification accuracy of visemes (e.g., over 90%) to a comparatively very low classification accuracy of words (e.g., only just over 60%). The underlying cause of this phenomenon is that roughly half of the words in the English language are homophemes, i.e., a set of visemes can map to multiple words, e.g., "time" and "some". In this paper, aiming to tackle this issue, a deep learning network model with an Attention based Gated Recurrent Unit is proposed for efficient viseme-to-word conversion and compared against three other approaches. The proposed approach features strong robustness, high efficiency, and short execution time. The approach has been verified with analysis and practical experiments of predicting sentences from benchmark LRS2 and LRS3 datasets. The main contributions of the paper are as follows: (1) a model is developed which is effective in converting visemes to words and discriminating between homopheme words, and is robust to incorrectly classified visemes; (2) the model proposed uses few parameters and, therefore, little overhead and time are required to train and execute it; and (3) an improved performance is achieved in predicting spoken sentences from the LRS2 dataset, with an attained word accuracy rate of 79.6%, an improvement of 15.0% compared with the state-of-the-art approaches.
Recurrent Neural Networks for Decoding Lip Read Speech
The success of automated lip reading has been constrained by the inability to distinguish between homopheme words, which are words that have different characters but produce the same lip movements (e.g. “time” and “some”), despite being intrinsically different. Different words can often have different phonemes (units of sound) that produce exactly the same viseme, the visual equivalent of a phoneme. Through the use of a Long Short-Term Memory network with word embeddings, we can distinguish between homopheme words, i.e. words that produce identical lip movements. The neural network architecture achieved a character accuracy rate of 77.1% and a word accuracy rate of 72.2%.
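The homopheme problem described in these abstracts can be illustrated with a minimal sketch. The phoneme-to-viseme classes and the toy lexicon below are hypothetical simplifications chosen for illustration; they are not the viseme set or vocabulary used in the listed works.

```python
from collections import defaultdict

# Hypothetical, heavily simplified phoneme-to-viseme classes (illustration only).
PHONEME_TO_VISEME = {
    "t": "T", "s": "T",             # assumed to look alike on the lips
    "ay": "AH", "ah": "AH",         # assumed to share one open-mouth viseme
    "m": "P", "p": "P", "b": "P",   # bilabial consonants
}

# Hypothetical toy lexicon: word -> phoneme sequence.
LEXICON = {
    "time": ["t", "ay", "m"],
    "some": ["s", "ah", "m"],
    "buy": ["b", "ay"],
    "my": ["m", "ay"],
    "pie": ["p", "ay"],
}


def to_visemes(phonemes):
    """Map a phoneme sequence to its viseme sequence."""
    return tuple(PHONEME_TO_VISEME[p] for p in phonemes)


def group_homophemes(lexicon):
    """Group words by viseme sequence; groups with more than one word are homophemes."""
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        groups[to_visemes(phonemes)].append(word)
    return dict(groups)


groups = group_homophemes(LEXICON)
for visemes, words in groups.items():
    print(visemes, "->", words)
```

Under these assumed classes, "time" and "some" collapse to the same viseme sequence, so a viseme-to-word stage needs sentence context (for example a language model) to choose between them.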
Contour Mapping for Speaker-Independent Lip Reading System
In this paper, we demonstrate how an existing deep learning architecture for automatically lip reading individuals can
be adapted so that it can be made speaker independent, and by doing so, improved accuracies can be achieved on a
variety of different speakers. The architecture itself is multi-layered consisting of a convolutional neural network, but if
we are to apply an initial edge detection-based stage to pre-process the image inputs so that only the contours are
required, the architecture can be made to be less speaker favourable.
The neural network architecture achieves good accuracy rates when trained and tested on some of the same speakers
in the “overlapped speakers” phase of simulations, where word error rates of just 1.3% and 0.4% are achieved when
applied to two individual speakers respectively, as well as character error rates of 0.6% and 0.3%. The “unseen speakers”
phase fails to achieve as good an accuracy, with greater recorded word error rates of 20.6% and 17.0% when tested on
the two speakers with character error rates of 11.5% and 8.3%.
The variation in size and colour of different people’s lips will result in different outputs at the convolution layer of a
convolutional neural network as the output depends on the pixel intensity of the red, green and blue channels of an input
image so a convolutional neural network will naturally favour the observations of the individual whom the network was
tested on. This paper proposes an initial “contour mapping stage” which makes all inputs uniform so that the system can
be speaker independent.
Keywords: Lip Reading, Speech Recognition, Deep Learning, Facial Landmarks, Convolutional Neural Networks,
Recurrent Neural Networks, Edge Detection, Contour Mapping
Immiscible thermo-viscous fingering in Hele-Shaw cells
We investigate immiscible radial displacement in a Hele-Shaw cell with a temperature dependent viscosity using two coupled high resolution numerical methods. Thermal gradients created in the domain through the injection of a low viscosity fluid at a different temperature to the resident high viscosity fluid can lead to the formation of unstable thermo-viscous fingers, which we explore in the context of immiscible flows. The transient, multi-zone heat transfer is evaluated using a newly developed auxiliary radial basis function-finite collocation (RBF-FC) method, which locally captures variation in flux and field variable over the moving interface, without the need for ghost node extrapolation. The viscosity couples the transient heat transfer to the Darcy pressure/velocity field, which is solved using a boundary element - RBF-FC method, providing an accurate and robust interface tracking scheme for the full thermo-viscous problem.
We explore the thermo-viscous problem space using systematic numerical experiments, revealing that the early stage finger growth is controlled by the pressure gradient induced by the varying temperature and mobility field. In hot injection regimes, negative temperature gradients normal to the interface act to accelerate the interface, promoting finger bifurcation and enhancing the viscous fingering instability. Correspondingly, cold injection regimes stabilise the flow compared to isothermal cases, hindering finger formation. The interfacial mobility distribution controls the late stage bifurcation mode, with non-uniformities induced by the thermal diffusivity creating alternate bifurcation modes. Further numerical experiments reveal the neutral stability of the thermal effects on the fingering evolution, with classical viscous fingering dynamics eventually dominating the evolution. We conclude the paper with a mechanistic summary of the immiscible thermo-viscous fingering regime, providing the first detailed analysis of the thermal problem in immiscible flows.
Swelling-induced changes in coal microstructure due to supercritical CO2 injection
©2016. American Geophysical Union. All Rights Reserved. Enhanced coalbed methane recovery and CO2 geostorage in coal seams are severely limited by the permeability decrease caused by CO2 injection and the associated coal matrix swelling. Typically, it is assumed that matrix swelling leads to coal cleat closure and that, as a consequence, permeability is reduced. However, this assumed mechanism has not yet been directly observed. Using a novel in situ reservoir condition X-ray microcomputed tomography flooding apparatus, for the first time we observed such microcleat closure induced by supercritical CO2 flooding in situ. Furthermore, fracturing of the mineral phase (embedded in the coal) was observed; this fracturing was induced by the internal swelling stress. We conclude that coal permeability is drastically reduced by cleat closure, which in turn is caused by coal matrix swelling, itself caused by flooding with supercritical CO2.
Comparison of CO2 trapping in highly heterogeneous reservoirs with Brooks-Corey and van Genuchten type capillary pressure curves
Geological heterogeneities essentially affect the dynamics of a CO2 plume in
subsurface environments. Previously we showed how the dynamics of a CO2 plume
is influenced by the multi-scale stratal architecture in deep saline
reservoirs. The results strongly suggest that representing small-scale features
is critical to understanding capillary trapping processes. Here we present the
result of simulation of CO2 trapping using two different conventional
approaches, i.e. Brooks-Corey and van Genuchten, for the capillary pressure
curves. We showed that capillary trapping and dissolution rates are very
different for the Brooks-Corey and van Genuchten approaches when heterogeneity
and hysteresis are both represented.
Comment: 10 pages, 6 figures
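For readers unfamiliar with the two capillary pressure models compared in this abstract, the following is a minimal sketch of their standard textbook forms as functions of effective wetting-phase saturation. The parameter values are illustrative assumptions and are not taken from the paper; hysteresis and heterogeneity, which the abstract identifies as decisive, are not modelled here.

```python
def brooks_corey_pc(se, entry_pressure, lam):
    """Brooks-Corey capillary pressure: Pc = Pe * Se**(-1/lambda)."""
    if not 0.0 < se <= 1.0:
        raise ValueError("effective saturation must lie in (0, 1]")
    return entry_pressure * se ** (-1.0 / lam)


def van_genuchten_pc(se, alpha, n):
    """van Genuchten capillary pressure with the common choice m = 1 - 1/n."""
    if not 0.0 < se <= 1.0:
        raise ValueError("effective saturation must lie in (0, 1]")
    m = 1.0 - 1.0 / n
    return (se ** (-1.0 / m) - 1.0) ** (1.0 / n) / alpha


# Illustrative parameters only (assumed, not from the paper).
ENTRY_PRESSURE_PA = 5000.0   # Brooks-Corey entry pressure
LAMBDA = 2.0                 # Brooks-Corey pore-size distribution index
ALPHA_PER_PA = 1.0e-4        # van Genuchten scaling parameter
N = 2.0                      # van Genuchten shape parameter

for se in (1.0, 0.75, 0.5, 0.25):
    bc = brooks_corey_pc(se, ENTRY_PRESSURE_PA, LAMBDA)
    vg = van_genuchten_pc(se, ALPHA_PER_PA, N)
    print(f"Se={se:.2f}  Brooks-Corey={bc:10.1f} Pa  van Genuchten={vg:10.1f} Pa")
```

One structural difference is visible at full saturation: the Brooks-Corey curve starts at a finite entry pressure, while the van Genuchten curve starts at zero.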
A numerical study of dynamic capillary pressure effect for supercritical carbon dioxide-water flow in porous domain
This is the accepted version of the following article: DAS, D.B. ... et al., 2014. A numerical study of dynamic capillary pressure effect for supercritical carbon dioxide-water flow in porous domain. AIChE Journal, 60 (12), pp. 4266-4278, which has been published in final form at http://dx.doi.org/10.1002/aic.14577
Numerical simulations for core-scale capillary pressure (Pc)–saturation (S) relationships have been conducted for a supercritical carbon dioxide-water system at temperatures between 35°C and 65°C at a domain pressure of 15 MPa as typically expected during geological sequestration of CO2. As the Pc-S relationships depend on both S and the time derivative of saturation (∂S / ∂t), yielding what is known as the ‘dynamic capillary pressure effect’ or simply ‘dynamic effect’, this work specifically attempts to determine the significance of these effects for supercritical carbon dioxide-water flow in terms of a coefficient, namely the dynamic coefficient (τ). The coefficient establishes the speed at which capillary equilibrium for supercritical CO2-water flow is reached. The simulations in this work involved the solution of the extended version of Darcy’s law, which represents the momentum balance for individual fluid phases in the system, the continuity equation for fluid mass balance, as well as additional correlations for determining the capillary pressure as a function of saturation, and the physical properties of the fluids as a function of temperature. The simulations were carried out for 3D cylindrical porous domains measuring 10 cm in diameter and 12 cm in height. τ was determined by measuring the slope of a best-fit straight line plotted between (i) the differences in dynamic and equilibrium capillary pressures (Pc,dyn – Pc,equ) against (ii) the time derivative of saturation (dS/dt), both at the same saturation value. The results show rising trends for τ as the saturation values reduce, with noticeable impacts of temperature at 50% saturation of aqueous phase.
This means that the time to attain capillary equilibrium for the CO2-water system increases as the saturation decreases. From a practical point of view, it implies that the time to capillary equilibrium during geological sequestration of CO2 is an important factor and should be accounted for while simulating the flow processes, e.g., to determine the CO2 storage capacity of a geological aquifer. In this task, one would require both the fundamental understanding of the dynamic capillary pressure effects for supercritical CO2-water flow as well as τ values. These issues are addressed in this article.
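The slope-based estimate of τ described in this abstract can be sketched as an ordinary least-squares fit. The data points below are synthetic and the function names are hypothetical. The sketch also assumes the commonly used sign convention Pc,dyn − Pc,equ = −τ ∂S/∂t, so that τ is the negative of the fitted slope; the abstract itself only states that τ is obtained from the slope.

```python
def least_squares_slope(x, y):
    """Slope of the best-fit straight line y = a + b*x (ordinary least squares)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def dynamic_coefficient(ds_dt, pc_dyn, pc_equ):
    """Estimate tau from samples taken at one fixed saturation value.

    Assumes Pc,dyn - Pc,equ = -tau * dS/dt, so tau = -slope.
    """
    delta_pc = [dyn - equ for dyn, equ in zip(pc_dyn, pc_equ)]
    return -least_squares_slope(ds_dt, delta_pc)


# Synthetic samples (not from the article), all at the same saturation value.
ds_dt = [-1.0e-4, -2.0e-4, -3.0e-4, -4.0e-4]   # 1/s, saturation decreasing
pc_equ = [5000.0, 5000.0, 5000.0, 5000.0]      # Pa, equilibrium capillary pressure
pc_dyn = [5100.0, 5200.0, 5300.0, 5400.0]      # Pa, dynamic capillary pressure

tau = dynamic_coefficient(ds_dt, pc_dyn, pc_equ)
print(f"tau = {tau:.3e} Pa.s")
```

Repeating the fit at several saturation values would give τ as a function of saturation, which is the trend the abstract reports.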
