Page 208 - DCAP208_Management Support Systems

Unit 12: Applications of Neural Network




          12.2.7 Speechreading (Lipreading)

          As part of the research program Neuroinformatik, the IPVR develops a neural speechreading
          system for use in a workstation user interface. The system has three main parts: a face
          tracker (by Marco Sommerau), lip modeling and speech processing (by Michael Vogt), and the
          development and application of SNNS (the Stuttgart Neural Network Simulator) for neural
          network training (by Günter Mamier).
          Automatic speechreading relies on robust lip image analysis. This approach uses no special
          illumination or lip make-up; the analysis works directly on true-color video images. The
          system supports real-time tracking and storage of the lip region, followed by robust off-line
          lip model matching. The proposed model is based on cubic outline curves. A neural classifier
          detects the visibility of teeth edges and other attributes. At this stage, the edge between the
          closed lips is modeled automatically where applicable, based on the neural network's decision.

          To achieve high flexibility during lip-model development, a model description language has
          been defined and implemented. The language allows the definition of general edge models
          based on knots and edge functions. Inner model forces stabilize the overall model shape.
          User-defined image processing functions may be applied along the model edges. These functions
          and the inner forces together contribute to an overall energy function.
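
          How inner forces and image terms might combine into one energy value can be sketched in
          Python. This is a minimal illustration, not the IPVR model description language: the
          spring-like inner term, the `toy_image_term` function, the weights `alpha` and `beta`, and
          the choice to evaluate the image term only at the knots (rather than along the whole edge)
          are all assumptions made for the example.

```python
import math

def inner_energy(knots, rest_length):
    """Spring-like inner term (assumed form): penalizes neighboring knots
    whose distance deviates from rest_length."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        distance = math.hypot(x1 - x0, y1 - y0)
        total += (distance - rest_length) ** 2
    return total

def image_energy(knots, image_term):
    """Sum of a user-defined image function, evaluated at each knot only."""
    return sum(image_term(x, y) for x, y in knots)

def total_energy(knots, rest_length, image_term, alpha=1.0, beta=1.0):
    """Weighted sum of the inner term and the image term."""
    return (alpha * inner_energy(knots, rest_length)
            + beta * image_energy(knots, image_term))

def toy_image_term(x, y):
    """Hypothetical stand-in for an edge detector: lowest along the line y = 2."""
    return (y - 2.0) ** 2

knots = [(0.0, 2.0), (1.0, 2.0), (2.0, 2.0)]
print(total_energy(knots, 1.0, toy_image_term))  # prints 0.0
```

          A model that sits exactly on the toy edge with evenly spaced knots has zero energy; any
          displacement of a knot raises one or both terms.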

             Caution  Adaptation of the model is done by gradient descent or by simulated-annealing-like
             algorithms.
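
          The adaptation step can be illustrated with a generic simulated annealing loop in Python.
          This is a textbook-style sketch, not the algorithm used in the IPVR system: the function
          `anneal`, the geometric cooling schedule, the step size, and the quadratic `toy_energy` are
          assumptions chosen to keep the example small.

```python
import math
import random

def anneal(energy, params, steps=2000, t_start=1.0, t_end=1e-3,
           step_size=0.1, seed=0):
    """Minimize energy(params) by simulated annealing; returns the best state seen."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    current = list(params)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        # Propose a small random change to one parameter.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.uniform(-step_size, step_size)
        e_candidate = energy(candidate)
        delta = e_candidate - e_current
        # Always accept improvements; accept worse states with
        # probability exp(-delta / t), which shrinks as t falls.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best

def toy_energy(p):
    """Hypothetical energy with its minimum at all parameters equal to 1."""
    return sum((v - 1.0) ** 2 for v in p)

start = [0.0, 3.0, -2.0]
best, e_best = anneal(toy_energy, start)
print(toy_energy(start), e_best)
```

          Unlike plain gradient descent, the occasional acceptance of a worse state lets the search
          leave shallow local minima while the temperature is still high.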
          Figure 12.13 shows one configuration of the lip model, consisting of an upper lip edge and a
          lower lip edge. The model edges are defined by Bézier functions. Outer control knots stabilize
          the position of the corners of the mouth.
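
          A single model edge of this kind can be sketched in Python as a cubic Bézier curve. The
          text mentions cubic outline curves and Bézier functions but gives no formulas, so the cubic
          Bernstein form below, the helper names, and the knot coordinates are assumptions for
          illustration only.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Cubic Bernstein basis weights; they sum to 1 for every t.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)

def sample_edge(p0, p1, p2, p3, n=5):
    """Evaluate the curve at n evenly spaced parameter values."""
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]

# Hypothetical upper lip edge: the end knots sit at the mouth corners,
# the two inner knots pull the curve upward.
left_corner, right_corner = (0.0, 0.0), (4.0, 0.0)
upper_lip = sample_edge(left_corner, (1.0, 2.0), (3.0, 2.0), right_corner)
print(upper_lip[2])  # midpoint of the edge: (2.0, 1.5)
```

          The curve always passes through its first and last knots, which is why fixing those two
          knots pins the edge to the corners of the mouth while the inner knots shape the arc.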

                               Figure 12.13: Configuration of the Lip Model

          Source: http://tralvex.com/pub/nap/#CoEvolution of Neural Networks for Control of Pursuit & Evasion
          The model interpreter continuously measures the model knot positions and the color blends
          along the model edges while the model adapts to an utterance. The resulting parameters can
          then be used for speech recognition in later steps.




              Task  Analyze the use of Bézier functions.


                                           LOVELY PROFESSIONAL UNIVERSITY                                   201