Volume 23, Issue 11 (November 2000)

Making Faces



DIANA PHILLIPS MAHONEY

A bad face makes a lasting impression. One of the most significant challenges to successful character animation is building a believable face: not necessarily a perfect face, just a believable one. Not only is the infinite range of human facial expressions difficult to re-create, but our perceptual capabilities are such that we can immediately detect a digital face that is ever so slightly off. We may not be able to articulate just how or why it is off, but we know instinctively that it is. Add to the challenge the need to build and animate a facial model for real-time animation or interaction, such as a game or virtual reality experience, and the likelihood of success using existing tools and techniques diminishes even further.

Character animation experts in the MIRALab at the University of Geneva in Switzerland are tackling the face challenge head on. Under the direction of Nadia Magnenat-Thalmann, researcher Won-Sook Lee has developed a fast, efficient method for building an animatable face model for virtual environments from photographs. At the heart of the system is a feature-detection algorithm that identifies key features, or points, in each of two photographs (a front and a side view of the same face) and establishes correspondences between those feature points and points on a generic 3D model.

The first step in the process involves building a generic, animation-ready 3D facial model with standard modeling software. Next, the researchers generate a 2D wireframe for each photographic view. The wireframes comprise hundreds of points representing the areas of significant structural variation that give a face its unique character, such as around the nose, eyes, mouth, and cheeks. The wireframes are scaled to compensate for any size discrepancy between the two photographs. Coordinate values are predefined for structural characteristics that appear in both views; these values serve as controls for orienting the image data from the two views to each other and to the corresponding coordinate points on the 3D model.
To build an animation-ready face, MIRALab researchers have developed an algorithm for detecting key features in front- and side-view photos, which are correlated to create a 2D wireframe. The feature points in the wireframe are then correlated to a generic 3D model.
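In rough terms, the view-alignment step might look like the following minimal sketch. It is not the MIRALab code: the feature-point arrays, the choice of two shared reference features used to scale the side view, and the orthographic treatment of the photographs are all assumptions made for illustration.

```python
import numpy as np

def combine_views(front_pts, side_pts, ref_ids):
    """Build rough 3D feature points from two 2D photo views.

    front_pts : (N, 2) array of (x, y) pixel positions in the front photo
    side_pts  : (N, 2) array of (z, y) pixel positions in the side photo
    ref_ids   : indices of two features visible in both views (e.g. an eye
                corner and the chin tip), used to scale the side view
    """
    # Vertical span between the two reference features in each view
    front_span = front_pts[ref_ids[1], 1] - front_pts[ref_ids[0], 1]
    side_span = side_pts[ref_ids[1], 1] - side_pts[ref_ids[0], 1]

    # Compensate for the size discrepancy between the photographs
    side_scaled = side_pts * (front_span / side_span)

    # x and y come from the front view, depth (z) from the scaled side view
    return np.column_stack([front_pts[:, 0],
                            front_pts[:, 1],
                            side_scaled[:, 0]])
```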




Once the wireframes are complete, the feature-detection algorithm extracts feature points from the data and associates them with control points on the generic model using an active contour-finding technique known as a snake algorithm. To fit a parametric curve to an edge in an image, the snake algorithm locates candidate contours by considering the "strength" of the edges and their spatial distribution, then iteratively optimizes the contour to pin down the exact position of the feature. For the application at hand, the Swiss researchers employ an enhanced version of the snake technique that incorporates additional parameters, such as elasticity and rigidity. The system also uses parameters for color characteristics, which are critical for such components as hair, facial shadows, wrinkles, and spots. Because the automatic analysis can break down in low-contrast areas, the researchers have designed a user interface for interactive correction when necessary.
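A bare-bones version of the underlying snake idea, stripped of the color parameters and interactive correction described above, might look like the following greedy formulation, in which each contour point moves to the neighboring pixel that best trades edge strength against elasticity and rigidity. The function, its parameters, and the energy weights are illustrative assumptions, not details from the MIRALab system.

```python
import numpy as np

def greedy_snake(edge_map, contour, alpha=0.5, beta=0.5, iters=100):
    """Greedy active contour ("snake") on a 2D edge-strength image.

    edge_map : 2D array in which larger values mean stronger edges
    contour  : (N, 2) array of (row, col) points placed roughly near the feature
    alpha    : elasticity weight (penalizes stretching)
    beta     : rigidity weight (penalizes bending)
    """
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    pts = contour.astype(float).copy()
    rows, cols = edge_map.shape
    for _ in range(iters):
        for i in range(len(pts)):
            prev_p, next_p = pts[i - 1], pts[(i + 1) % len(pts)]
            best_pt, best_e = pts[i].copy(), np.inf
            for dr, dc in offsets:
                cand = pts[i] + (dr, dc)
                r, c = int(cand[0]), int(cand[1])
                if not (0 <= r < rows and 0 <= c < cols):
                    continue
                elastic = np.sum((cand - prev_p) ** 2)             # stretching
                rigid = np.sum((prev_p - 2 * cand + next_p) ** 2)  # bending
                energy = alpha * elastic + beta * rigid - edge_map[r, c]
                if energy < best_e:
                    best_pt, best_e = cand, energy
            pts[i] = best_pt
    return pts
```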

The challenge in relating the 3D feature points to the generic model lies in the fact that the smooth surface of the latter is defined by more than a thousand points, versus the 160 extracted from the photographs. To bridge the discrepancy, the researchers use nonlinear and free-form deformation techniques in which the 3D feature points become the control points for the deformation. The new surface is, in essence, an interpolation of the displacements of the control points.
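One common way to realize such a step, sketched below under the assumption of a Gaussian radial-basis-function interpolant rather than whatever specific deformation the researchers use, is to interpolate the displacements measured at the roughly 160 feature points and apply the resulting smooth field to every vertex of the generic mesh.

```python
import numpy as np

def deform_mesh(vertices, generic_feats, detected_feats, sigma=20.0):
    """Warp a generic head mesh so its feature points match the detected ones.

    vertices       : (V, 3) vertices of the generic model
    generic_feats  : (F, 3) feature points on the generic model
    detected_feats : (F, 3) corresponding 3D points recovered from the photos
    sigma          : width of the Gaussian radial basis functions (assumed)
    """
    disp = detected_feats - generic_feats          # control-point displacements

    def rbf(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    # Solve for RBF weights so the field reproduces the known displacements
    weights = np.linalg.solve(rbf(generic_feats, generic_feats), disp)

    # Evaluate the smooth displacement field at every mesh vertex
    return vertices + rbf(vertices, generic_feats) @ weights
```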
A reconstructed head developed from picture data is mapped with image textures using a multi-plane projection technique that identifies common feature lines among the planes and transfers the points on those lines to 2D image space.




When the 3D model has been suitably modified, textures are mapped onto its surface to increase realism. The textures, however, do not come directly from the input images, because neither view on its own contains enough information. Texture mapping requires both a texture image and texture coordinates, and each point on the head needs one or more coordinates. Thus, a deformation process is employed to connect the side and front views.

The texture images are mapped to the head model using a technique that involves projecting the 3D model onto three planes, identifying common feature lines, then transferring the points located on those lines to the combined 2D image space.
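The following sketch conveys only the flavor of that projection, using a single front plane and a single side plane; the third plane, the shared feature lines, and the blending across them are omitted, and the layout of the combined texture image is an assumption rather than a detail of the MIRALab technique.

```python
import numpy as np

def texture_coords(vertices, angle_limit=np.radians(50)):
    """Assign each vertex a (u, v) coordinate from a front or side projection.

    vertices    : (V, 3) head vertices, centered on the y-axis, +z facing forward
    angle_limit : vertices rotated further than this from the front are
                  projected onto the side plane instead (assumed threshold)
    Returns (uv, use_side), where use_side marks vertices textured from the
    side photograph.
    """
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]

    def normalize(a):
        return (a - a.min()) / (a.max() - a.min() + 1e-9)

    # Orthographic projections onto the front (x, y) and side (z, y) planes
    front_uv = np.column_stack([normalize(x), normalize(y)])
    side_uv = np.column_stack([normalize(z), normalize(y)])

    # Angle around the vertical axis decides which photograph a vertex samples
    yaw = np.abs(np.arctan2(x, z))
    use_side = yaw > angle_limit

    uv = np.where(use_side[:, None], side_uv, front_uv)
    return uv, use_side
```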

In addition to photographic data, the system can also use range data from laser scanners or stereoscopic cameras. Unfortunately, while such techniques can match the shape of the face more accurately, the voluminous datasets they generate rule out the speeds needed for real-time animation, and the cost of data acquisition can be prohibitive. In contrast, using pictures from only two views keeps the dataset for an individual head small and is economical.

Currently, the researchers are fine-tuning the algorithm to achieve higher accuracy. They expect the technique, once refined, will be useful in a range of applications, from video conferencing to facial surgery simulation. They are also working on extending the system to whole-body reconstruction.

Diana Phillips Mahoney is chief technology editor of Computer Graphics World.