EP1177528A1 - Method and apparatus for controlling reactions of a computer-generated personality - Google Patents
Method and apparatus for controlling reactions of a computer-generated personality
- Publication number
- EP1177528A1 (EP99937371A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- character
- image
- deltaset
- morph
- vision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2213/00—Indexing scheme for animation
- G06T2213/12—Rule based animation
Definitions
- the present invention pertains to automated methods and apparatuses for the controlling and transforming of two- and three-dimensional images. More particularly, the present invention relates to methods and apparatuses for changing the elements of image through the use of one or more sets of modification data in real time.
- the computer system 810 includes a system unit having a processor 811, such as a Pentium® processor manufactured by Intel Corporation, Santa Clara, California.
- the processor is coupled to system memory 812 (e.g., Random Access Memory (RAM)) via a bridge circuit 813.
- the bridge circuit 813 couples the processor 811 and system memory 812 to a bus 814, such as one operated according to the Peripheral Component Interconnect standard (Version 2.1, 1995, PCI Special Interest Group, Portland, Oregon).
- the system unit 810 also includes a graphics adapter 815 coupled to the bus 814 which converts data signals from the bus into information for output at a display 820, such as a cathode ray tube (CRT) display, active matrix display, etc.
- a graphical image can be displayed at display 820.
- the graphical image can be created internally to the computer system 810 or can be input via an input device 830 (such as a scanner, video camera, digital camera, etc.).
- a graphical image is stored as a number of two-dimensional picture elements or "pixels," each of which can be displayed.
- graphical images e.g., of a person's face
- a graphical image can be changed by allowing the user to "move" (e.g., with a cursor movement device such as a mouse) the two-dimensional location of one or more pixels (for example, with Adobe Photoshop Version 3.0.5 (Adobe Systems, Inc., San Jose, California)).
- the other pixels around the one that is being moved are filled in with new data or other pixel data from the graphical image.
- the graphical image of the person's face can be modified using this product by making the person's nose larger or smaller.
- This two-dimensional phenomenon is analogous to stretching and warping a photograph printed on a "rubber sheet".
- Morphing programs typically work by allowing the operator to select points on the outline of the specific starting image and then to reassign each of these points to a new location, thereby defining the new outline of the desired target image.
- the computer then performs the morph by: (1) smoothly moving each of these points along a path from start to finish, and (2) interpolating the movement of all the other points within the image as the morph takes place.
- a first region of a first graphical image is identified and then it is modified based on a first set of predetermined modification data.
- a variety of applications can be performed according to further embodiments of the present invention.
- the morph (e.g., the application of modification data) for a first starting image can be readily applied to other starting images.
- the morphs automatically impart desired characteristics in a custom manner to a multiplicity of starting images.
- a method of the present invention described herein automates this process.
- the morphs of the present invention enable a wide variety of human, animal, or other characters to be rendered chimp-like using a single "chimp" morph. An example of this is shown in Fig.
- modification data includes deltasets and deltazones described in more detail below.
- deltasets or zones categorically identify regions, feature by feature within differing starting images so that these images are uniquely altered to preserve the automated morph's desired effect. Because a single morph can be applied to a number of different starting images, the morph exists as a qualitative entity independently from the images it acts upon. This independence creates an entirely new tool, a morph library, a collection of desired alterations or enhancements which can be generically used on any starting image to create specific desired effects as illustrated in the above "chimpification" example. Second, once an image has been morphed to add a particular characteristic or quality, the resulting image can be subjected to a different morph to add a second characteristic or quality. Fig.
- the additive property of the automated, additive morphing system can be used in a number of ways to bring new functionality and scope to the morphing process. Five distinct additive properties of automated, additive morphs will be described below along with their practical application.
- morphs can be provided that allow a graphical image character to speak, move, emote, etc.
- a moving morph can be created during which a character can continue speaking, moving, and emoting by cross-applying an automated, additive morph to a sequence of morphs (a "morph sequence").
- the morph sequence that is known in the art (such as what is shown in programs by Dr. Michael Cohen at the University of California at Santa Cruz and products of Protozoa, Inc. (San Francisco, California)) allows for computer-generated characters to move their mouths in a manner which approximates speech by running their characters through a sequence of "Viseme" morphs.
- a Viseme is the visual equivalent of a phoneme, i.e., the face one makes when making a phonetic sound.
- Such programs use a specific initial image of a character at rest, and a collection of target images. Each target image corresponds to a particular facial position or "Viseme" used in common speech.
- Fig. 3 shows how these target images can be strung together in a morph sequence to make an animated character approximate the lip movements of speech. This figure shows the sequence of Viseme endpoints which enable the character to mouth the word "pony". It is important to note that the creation and performance of this sequence does not require the special properties of the morphing system presented herein.
- Fig. 4 shows the cross-addition of an automated, additive morph to the "pony" morph sequence described above in Fig. 3.
- the four vertical columns of character pictures represent the progressive application of the "chimp" morph described earlier (from left to right multiplying the modification data by multiplication values of 0%, 33%, 66%, 100% prior to application to the starting image).
- Because the "chimp" morph is nonspecific as to its starting point (as are all automated additive morphs according to the present invention), it is possible to increasingly apply the "chimp" morph while changing the starting point within the morph sequence, producing the progression shown in the darkened diagonal of squares. This diagonal progression, shown in horizontal fashion at the bottom of Fig. 4, yields a character which can speak while this character is morphing. This is the underlying structure of the moving morph. Traditional morphs (being specific rather than generic) cannot be cross-applied in this manner. Characters created using the methods of the present invention can be made to not only speak, but also emote and react from morph sequences.
- characters can remain fully functional during an automated, additive morph rather than being required to "freeze frame" until the morph has been completed as do the morphs of the prior art.
- An additional benefit of this cross-additive procedure is that morphs can be stopped at any point to yield a fully functional, consistent new character which is a hybrid of the starting and final characters.
- the methods of the present invention provide for parametric character creation in which newly-created characters automatically speak, move, and emote using modification data stored in a database or library.
- (1) morphs can exist as qualitative attributes independent of any particular starting image; and (2) morphs can be applied, one after the other, to produce a cumulative effect.
- appearance parameters (e.g., length or width of nose, prominence of jaw, roundness of face, etc.)
- these attributes can be selectively applied in such a way as to create any desired face from one single starting image.
- a multiracial starting character is defined and a morph library of appearance parameters is created which can be used to adjust the character's features so as to create any desired character.
- Fig. 5 shows an example of this process.
- the parameter adjustments in this figure are coarse and cartoon-like so as to yield clearly visible variations. In realistic character generation, a much larger number of parameters can be more gradually applied.
- the first three morphs shown in this illustration are "shape morphs”.
- the coloration or "skin" which is laid over the facial shape is changed rather than the shape itself. This step can be used to create the desired hair color, eye color, skin tone, facial hair, etc. in the resultant character.
- the parametric character creation described above can be combined with the moving morph, also described above, to create characters which automatically speak, emote and move.
- This dual application is illustrated in Fig. 6, wherein not only the underlying structure, but also the full speaking and emoting functionality of the original character are automatically transferred to the new character.
- the character shown in Fig. 6 not only contains a mutable physical appearance, but also a full set of Visemes, emotions, and computer triggered autonomous and reactive behavior. All of these functions can be automatically transferred to a vast range of characters which can be created using parametric character creation. This represents an exponential savings in animation time and cost over existing procedures which require custom creation of not only the character itself, but every emotion, Viseme, blink, etc. that the new character makes.
- morph sequences can be used simultaneously to combine different behavioral sequences.
- Fig. 7 illustrates the simultaneous utilization of an emoting sequence and a speaking sequence.
- the Viseme sequence required to make the character say "pony" has been added to an emotive morph sequence (center column) in such a manner that the timing of each sequence is preserved.
- the resultant sequence (right column) creates a character which can simultaneously speak and react with emotions.
- This procedure can also be used to combine autonomous emotive factors (a computer-generated cycling of deltas representing different emotions or "moods") with reactive factors (emotional deltas triggered by the proximity of elements within the character's environment which have assigned emotive influences on the character). Such procedures can be used to visualize the interplay between conscious and subconscious emotions.
- a computer-generated character (or characters) in a multidimensional, computer-generated environment can be automatically made to seem aware of objects in that environment.
- This apparent “awareness” of the character includes the ability to "look around” for objects in the environment, to turn its eyes, head and the remainder of the body to "focus” on an object once it enters the character's "field of vision", and to create an apparent emotional response to that object. All of this behavior can be automatically created in a computer-generated character using the method described below.
- Fig. 1 shows an example of an automated morph according to an embodiment of the present invention.
- Fig. 2 shows an example of an automated, additive morph according to an embodiment of the present invention.
- Fig. 3 shows an example of a morph sequence that can be performed according to an embodiment of the present invention.
- Fig. 4 shows an example of a moving morph according to an embodiment of the present invention.
- Fig. 5 shows an example of parametric character creation according to an embodiment of the present invention.
- Fig. 6 shows an example of automatic behavioral transference according to an embodiment of the present invention.
- Fig. 7 shows an example of behavioral layering according to an embodiment of the present invention.
- Fig. 8 is a computer system that is known in the art.
- Fig. 9 is a general block diagram of an image transformation system of the present invention.
- Figs. 10 a-d are polygonal models used for the presentation of a graphical image of a human head or the like.
- Figs. 11 a-f are polygonal images showing the application of deltaset in accordance with an embodiment of the present invention.
- Figs. 12 a-g are graphical images of a person's head that are generated in accordance with an embodiment of the present invention.
- Fig. 13 shows an input device for controlling the amount of transformation that occurs when applying a deltaset to an image.
- Figs. 14 a-d are graphical images of a person's head that are generated in accordance with an embodiment of the present invention.
- Fig. 15 shows a communication system environment for an exemplary method of the present invention.
Detailed Description
- modification data is generated that can be applied to a starting image so as to form a destination image.
- the modification data can be difference values that are generated by determining the differences between first and second images. Once these differences are determined they can be stored and later applied to any starting image to create a new destination image without the extensive frame-by-frame steps described above with respect to morphing performed in the motion picture industry. These difference values can be created on a vertex-by-vertex basis to facilitate the morphing between shapes that have an identical number of vertices. Alternatively, difference values can be assigned spatially, so that the location of points within the starting image determines the motion within the automated morph.
- An example of the vertices-based embodiment of the present invention includes the generation of a first image (e.g., a neutral or starting image) comprising a first number of vertices, each vertex having a spatial location (e.g., in two- or three-dimensional space) and a second image is generated (e.g., a target or destination image) having an equal number of vertices.
- a difference between a first one of the vertices of the first image and a corresponding vertex of the second image is determined, representing the difference in location between the two vertices.
- the difference is then stored in a memory device (e.g., RAM, hard disc drive, etc.).
- Difference values for all corresponding vertices of the first and second images can be created using these steps and stored as a variable array (referred to herein as a deltaset).
- the deltaset can then be applied to the first image to create the second image by moving the vertices in the first image to their corresponding locations in the second image.
- a multiplication or ratio value can be multiplied by the entries in the deltaset and applied to the first image so that an intermediate graphical image is created.
- the deltaset can be applied to any starting image having an equal number of vertices. This allows the user to create new destination images without performing, again, the mathematical calculations used to create the original deltaset.
- the system 900 includes a library or database of deltasets 931.
- the library of deltasets 931 can be stored in the system memory 912 or any other memory device, such as a hard disc drive 917 coupled to bus 914 via a Small Computer System Interface (SCSI) host bus adapter 918 (see Fig. 8).
- deltasets are variable arrays of position change values that can be applied to the vertices of a starting image.
- the deltaset information is composed and cached in device 932 (e.g., processor 811 and system memory 812 of Fig.
- Both the starting and second images can then be displayed at display 920, or any other output device (memory) or sent to file export (e.g., the Internet system).
- Inputs to the system 900 of Fig. 9 include a variety of user controls 937, autonomous behavior control 938, and face tracker data input 939 which will be further described below.
- Other inputs can come from other systems such as the so-called World Wide Web (WWW).
- audio data can be supplied by audio data input device 940 which can be supplied to deltaset caching and composing device 932.
- the neutral geometry 933 is based on the image of a person's head that has been captured using any of a variety of known methods (e.g., video, scanner, etc.).
- the image data of the person's head is placed onto a polygonal model 1051.
- the polygonal model comprises a plurality of vertices 1052 and connections 1053 that extend between the vertices.
- Each polygon 1054 of the polygonal model is defined by three or more vertices 1052.
- an example is discussed below using simple polygons (e.g., a square, a triangle, a rectangle, and a circle).
- Each polygon has an identifiable shape. For example, looking at Fig. 11a, a square polygon is shown having eight vertices (points 1100 to 1107) in two-dimensional space. By moving individual vertices, the square polygon can be converted into a number of other polygon shapes such as a rectangle (Fig. 11b), a circle (Fig. 11c) and a triangle (Fig. 11d; where vertices 1100, 1101, and 1107 all occupy the same point in two-dimensional space).
- a deltaset is a set of steps that are taken to move each vertex (1100 to 1107) from a starting polygon to a target or destination polygon. For example, the steps that are taken from the square polygon of Fig.
- the deltaset defines the path taken by each vertex in transforming the starting polygon to the destination polygon.
- the deltaset defines the difference in position of corresponding vertices in the starting and target polygons.
- deltasets can be created for the transformation of the square polygon of Fig. 11a to the circle polygon of Fig. 11c and of the square polygon of Fig. 11a to the triangle polygon of Fig. 11d.
- the deltaset is created by transforming a starting polygon shape into another; however, one skilled in the art will appreciate that deltasets can be created that are not based on specific starting and target shapes, but created in the abstract. Moreover, once a deltaset is created, it can be used on any starting shape to create a new shape. For example, the deltaset used to transform the square polygon of Fig. 11a to the rectangle polygon of Fig. 11b (for convenience, referred to as Deltaset1) can be used on the circle polygon of Fig. 11c. Thus, the circle polygon of Fig. 11c becomes the starting shape and, after applying Deltaset1, would become the ellipse polygon of Fig. 11e (i.e., the target shape).
- Deltasets can also be combined (e.g., added together) to create new deltasets.
- Deltaset1, Deltaset2 (i.e., the transform from the square of Fig. 11a to the circle of Fig. 11c)
- Deltaset3 (i.e., the transform from the square of Fig. 11a to the triangle of Fig. 11d)
- Deltaset4: applying Deltaset4 to the starting square polygon of Fig. 11a achieves the target shape of Fig. 11f.
- the starting polygon, destination polygon, and deltaset must have the same number of vertices. Additional algorithms would be necessary to transform between shapes or objects having a differing number of vertices.
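The additive behavior described above can be sketched in C. This is a minimal illustration, not the patent's own listing: the names `point2`, `deltaset8`, `deltaset_add`, and `deltaset_apply` are hypothetical, and the fixed count of eight vertices simply mirrors the square, rectangle, and circle example.

```c
#include <assert.h>

#define NUM_POINTS 8  /* every shape in the polygon example has eight vertices */

typedef struct { float x, y; } point2;             /* hypothetical 2D datapoint */
typedef struct { point2 d[NUM_POINTS]; } deltaset8; /* one delta per vertex */

/* Combine two deltasets entry by entry; `out` may alias `a` or `b`. */
void deltaset_add(deltaset8 *out, const deltaset8 *a, const deltaset8 *b)
{
    assert(out && a && b);
    for (int i = 0; i < NUM_POINTS; i++) {
        out->d[i].x = a->d[i].x + b->d[i].x;
        out->d[i].y = a->d[i].y + b->d[i].y;
    }
}

/* Move each vertex of an eight-vertex shape by its delta. */
void deltaset_apply(point2 shape[NUM_POINTS], const deltaset8 *ds)
{
    assert(shape && ds);
    for (int i = 0; i < NUM_POINTS; i++) {
        shape[i].x += ds->d[i].x;
        shape[i].y += ds->d[i].y;
    }
}
```

Because every entry is a plain offset, applying the combined deltaset once moves each vertex to the same place as applying the two source deltasets one after the other, up to floating-point rounding. The equal-vertex-count requirement is enforced here by the fixed array size.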
- An additional method for moving vertices can be derived from the deltaset method wherein the motion to the points of a deltaset are interpolated such that a continuous field of motion is created.
- deltazones can be used to morph images irrespective of their particular triangle strip set because the one-to-one correspondence between movements and vertices upon which deltasets rely is replaced by a dynamical system of motion which operates on any number of vertices by moving them in accordance with their original location.
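One way to picture a deltazone is as a motion field sampled at each vertex's original location. The source does not say how the field is interpolated, so the sketch below swaps in inverse-square-distance weighting purely as an assumption; `control_point`, `deltazone_motion`, and `deltazone_apply` are hypothetical names.

```c
#include <assert.h>

typedef struct { float x, y; } vec2;

/* One sample of the motion field: where it sits and how far it moves. */
typedef struct { vec2 at; vec2 delta; } control_point;

/* Blend the control deltas at position p, weighting each by 1/distance^2.
   (Assumed interpolation scheme; the patent text does not specify one.) */
vec2 deltazone_motion(const control_point *cp, int n, vec2 p)
{
    vec2 sum = {0.0f, 0.0f};
    float wsum = 0.0f;
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        float dx = p.x - cp[i].at.x;
        float dy = p.y - cp[i].at.y;
        float d2 = dx * dx + dy * dy;
        if (d2 == 0.0f)
            return cp[i].delta;  /* p lies exactly on a control point */
        float w = 1.0f / d2;
        sum.x += w * cp[i].delta.x;
        sum.y += w * cp[i].delta.y;
        wsum += w;
    }
    sum.x /= wsum;
    sum.y /= wsum;
    return sum;
}

/* Move any number of vertices according to where each one starts. */
void deltazone_apply(const control_point *cp, int n,
                     vec2 *verts, int nverts, float amount)
{
    for (int i = 0; i < nverts; i++) {
        vec2 m = deltazone_motion(cp, n, verts[i]);
        verts[i].x += amount * m.x;
        verts[i].y += amount * m.y;
    }
}
```

Note that `nverts` is independent of `n`: the field is defined by the control points, so a mesh with a different vertex count can reuse it unchanged.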
- The datatype structure for a deltaset (deltaSet_Type) is similar to that for a basic shape object, and the pseudocode is shown in Table I.
- the deltaSet_Type and shape_Type variables each include an array of [numpoints] values. Each value is a position of a vertex for the shape_Type variable and a delta value for the deltaSet_Type variable.
- DeltaSet_Calc (deltaSet_Type *dset, shape_Type *src, shape_Type *dest) { int i; int numpts; dataPoint_Type delta;
- numpts = src->numPoints; deltaSet_SetNumPts (dset, numpts);
- delta is used to temporarily store the difference in position between the source (src) and destination (dest) for each of the vertices in the shape. Each delta value is then stored in a deltaset array (dset). Once a deltaset array is created, it can be easily applied to any starting shape having an equal number of vertices to form a new target shape.
- deltaSet_Apply (deltaSet_Type *dset, shape_Type *dest, float amount)
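The calculation and application routines whose signatures appear above survive only as fragments, so the following is a hedged reconstruction rather than the original tables. The struct layouts, the `MAX_POINTS` capacity, and the loop bodies are assumptions; only the function names, parameter order, and the `numPoints` field come from the fragments.

```c
#include <assert.h>

#define MAX_POINTS 64  /* illustrative fixed capacity (assumption) */

typedef struct { float x, y; } dataPoint_Type;  /* two-dimensional datapoint */

typedef struct {
    int numPoints;
    dataPoint_Type points[MAX_POINTS];  /* vertex positions */
} shape_Type;

typedef struct {
    int numPoints;
    dataPoint_Type deltas[MAX_POINTS];  /* per-vertex position changes */
} deltaSet_Type;

/* Store, for each vertex, the offset that carries src onto dest. */
void DeltaSet_Calc(deltaSet_Type *dset, const shape_Type *src,
                   const shape_Type *dest)
{
    assert(src->numPoints == dest->numPoints);
    assert(src->numPoints <= MAX_POINTS);
    dset->numPoints = src->numPoints;
    for (int i = 0; i < src->numPoints; i++) {
        dset->deltas[i].x = dest->points[i].x - src->points[i].x;
        dset->deltas[i].y = dest->points[i].y - src->points[i].y;
    }
}

/* Move each vertex of dest by amount times its delta.
   amount = 1.0 applies the full morph; 0.5 stops halfway. */
void deltaSet_Apply(const deltaSet_Type *dset, shape_Type *dest, float amount)
{
    assert(dset->numPoints == dest->numPoints);
    for (int i = 0; i < dset->numPoints; i++) {
        dest->points[i].x += amount * dset->deltas[i].x;
        dest->points[i].y += amount * dset->deltas[i].y;
    }
}
```

A typical use would compute a deltaset once from a square and a rectangle, then call `deltaSet_Apply` on a copy of any other shape with the same vertex count.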
- the pseudo-code of Table IV shows two utility routines that are used for creating a new, blank deltaset and to set the number of datapoints.
- deltaSet_Type rect_dset, circ_dset;
- shape_SetNumPoints (&square, 8); shape_SetNumPoints (&rectangle, 8); shape_SetNumPoints (&circle, 8);
- a datapoint is defined as a two-dimensional vector and the square, rectangle, and circle shapes are defined as eight points with abscissa and ordinate values.
- Deltasets are then calculated for the transition from square to rectangle and from square to circle.
- the resulting deltasets (rect_dset and circ_dset) represent differences between abscissa and ordinate values of the respective starting and target images.
- the deltasets can then be applied to a starting shape (in this example, the starting image, newshape, is set to the square shape of Fig. 11a).
- the rect_dset deltaset is applied to the square shape to form an intermediate shape, and then the circ_dset deltaset is applied to this intermediate shape to form the destination shape that is shown in Fig. 11e.
- a deltaset representing a transformation from the square shape of Fig. 11a to the triangle shape of Fig. 11d is created and applied to the ellipse shape shown in Fig. 11e.
- the deltasets example, above, can be easily extended to a three-dimensional representation.
- the example can also be expanded to more intricate and complex applications such as in three-dimensional space and facial animation.
- several additional features can be added. For example, certain motions of the face are limited to certain defined areas, such as blinking of the eyes. Accordingly, a deltaset for an entire face would be mostly 0's (indicating no change) except for the eyes and eyelids, thus isolating these areas for change.
- the deltaset datatype can be changed so that only nonzero values are stored. Thus during the execution of the Deltaset_apply routine, only the points that change are acted upon, rather than every point in the graphical representation.
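A sparse layout of the kind described above might look like the following sketch. The names `sparse_entry`, `sparse_from_dense`, and `sparse_apply` are hypothetical; the source only states that nonzero values alone are stored and acted upon.

```c
#include <assert.h>

typedef struct { float x, y, z; } vec3;

/* One stored change: which vertex moves, and by how much. */
typedef struct { int index; vec3 delta; } sparse_entry;

/* Copy only the nonzero deltas into `out`; returns the number written. */
int sparse_from_dense(const vec3 *dense, int n, sparse_entry *out, int cap)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (dense[i].x != 0.0f || dense[i].y != 0.0f || dense[i].z != 0.0f) {
            assert(count < cap);
            out[count].index = i;
            out[count].delta = dense[i];
            count++;
        }
    }
    return count;
}

/* Touch only the vertices that actually change. */
void sparse_apply(const sparse_entry *e, int count,
                  vec3 *verts, int nverts, float amount)
{
    for (int k = 0; k < count; k++) {
        assert(e[k].index >= 0 && e[k].index < nverts);
        vec3 *v = &verts[e[k].index];
        v->x += amount * e[k].delta.x;
        v->y += amount * e[k].delta.y;
        v->z += amount * e[k].delta.z;
    }
}
```

For a blink deltaset, where only eyelid vertices move, the loop in `sparse_apply` runs over a handful of entries instead of the whole face.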
- An embodiment of facial animation is described below with reference to the pseudocode example of Table VI.
- shape_Type neutralFace, overallMorphface, blinkFace, emoteFaces[], speakFaces[], newShapeFace; deltaSet_Type overall_dset, blink_dset, emote_dsets[], speak_dsets[];
- the animated face image comprises three-dimensional datapoints.
- "NeutralFace” is a starting image that will be changed based on one or more deltasets.
- the neutralface image is shown in Fig. 12a with eyes looking straight ahead and no expression.
- "OverallMorphFace” is a different face from NeutralFace.
- OverallMorphFace is in the image of a cat shown in Fig. 12b.
- a face showing a completed facial movement is "blinkFace” which shows the same face as NeutralFace but with the eyes closed (see Fig. 12c).
- "EmoteFaces" is an array of the neutralFace augmented to show one or more emotions. For example, Fig.
- Fig. 12d shows the neutralFace emoting happiness
- Fig. 12e shows neutralFace emoting anger
- "SpeakFaces" is an array of faces showing expressions of different phonemes. A phoneme, or viseme, is a speech syllable used to form spoken words (e.g., the "oo", "ae", "l", and "m" sounds).
- Fig. 12f shows neutralFace expressing the phoneme "oo”.
- the amount of transformation or morphing can be controlled by multiplication or multiplier values: overallMorphAmount, blinkAmount, emoteAmounts[], and speakAmounts[].
- blinkAmount is set to 1.0
- applying a deltaset for blinking to neutralFace of Fig. 12a will achieve the face of Fig. 12c (i.e., 100% of the blink is applied).
- Numbers less than or greater than 1.0 can be selected for these variables.
- Deltasets are then created for transforming the neutralFace image.
- deltaset overall_dset is created for the changes between neutralFace (Fig. 12a) and OverallMorphFace (Fig. 12b);
- deltaset blink_dset is created for the changes between neutralFace (Fig. 12a) and blinkFace (Fig. 12c);
- deltasets emote_dsets[] are created between neutralFace (Fig. 12a) and each emotion expression image (e.g., the "happy" emoteFace[] of Fig. 12d and the "angry" emoteFace[] of Fig. 12e);
- deltasets speak_dsets[] are created between neutralFace (Fig. 12a) and each phoneme expression image (e.g., the "oo" speakFace[] of Fig. 12f).
- the amounts for each deltaset transformation are calculated (e.g., the values for overallMorphAmount, blinkAmount, emoteAmounts[], and speakAmounts[]). For the emoteAmounts[] and speakAmounts[] arrays, these values are mostly zero.
- the new facial image to be created is stored in newShapeFace and is originally set to the NeutralFace image. Then, the deltasets that were calculated above, are applied to the newShapeFace in amounts set in transformation variables calculated above.
- overallMorphAmount is set to 0.5 (i.e., halfway between neutralFace and OverallMorphFace); blinkAmount is set to 1.0 (i.e., full blink - eyes closed); emoteAmount[] for "happy" is set to 1.0, while all other emoteAmount[] values are set to 0; and speakAmount[] for the phoneme "oo" is set to 1.0 while all other speakAmount[] values are set to 0.
- The resulting image based on these variables is shown in Fig. 12g.
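The layering step, in which newShapeFace starts as the neutral face and each deltaset is then applied in its own amount, can be sketched as follows. The four-vertex mesh and the names `face_shape`, `face_deltaset`, `face_apply`, and `face_compose` are assumptions made to keep the example small; a real face would have many more three-dimensional datapoints.

```c
#include <assert.h>

#define FACE_POINTS 4  /* tiny stand-in for a real face mesh (assumption) */

typedef struct { float x, y, z; } vec3;
typedef struct { vec3 p[FACE_POINTS]; } face_shape;
typedef struct { vec3 d[FACE_POINTS]; } face_deltaset;

/* Add amount * delta to every vertex of dest. */
static void face_apply(face_shape *dest, const face_deltaset *ds, float amount)
{
    for (int i = 0; i < FACE_POINTS; i++) {
        dest->p[i].x += amount * ds->d[i].x;
        dest->p[i].y += amount * ds->d[i].y;
        dest->p[i].z += amount * ds->d[i].z;
    }
}

/* Start from the neutral face, then layer each deltaset in its own amount.
   Deltasets whose amount is zero are skipped, since most amounts are zero. */
void face_compose(face_shape *newShapeFace, const face_shape *neutralFace,
                  const face_deltaset *dsets, const float *amounts, int count)
{
    assert(newShapeFace && neutralFace);
    *newShapeFace = *neutralFace;
    for (int k = 0; k < count; k++) {
        if (amounts[k] != 0.0f)
            face_apply(newShapeFace, &dsets[k], amounts[k]);
    }
}
```

With the deltasets ordered as overall morph, blink, "happy", and "oo", the settings quoted above would correspond to amounts of 0.5, 1.0, 1.0, and 1.0.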
- the deltasets that have been created can now be applied to another starting image (i.e., an image other than neutralFace shown in Fig. 12a) without recalculation.
- a deltaset can be created between neutralFace and OverallMorphFace which signifies the changes between a male human face (shown in Fig. 12a) and the face of a cat (shown in Fig. 12b).
- In Fig. 14a, a neutral, male human face is shown without application of this deltaset.
- Fig. 14b shows the effects of the application of this deltaset
- The underlying polygonal models for Figs. 14a and 14b are shown in Figs. 10a and 10b, respectively. As seen in Figs. 10a and 10b, vertices of the first image are shown to move to different positions in the destination image. Referring back to Figs. 14a and 14b, one skilled in the art will appreciate that the color of each pixel can also change in accordance with a deltaset storing the difference in color for each pixel in the human and cat images of these figures.
- the deltaset described above can be applied to a neutral, female human face (see Fig. 14c) to form a new destination image (see Fig. 14d).
- the variables can be input using graphical sliders shown in Figs. 13 a-d.
- a first deltaset represents the difference between a starting image with lips in a first position and a target image with lips in a second, higher position.
- a second deltaset represents the difference between a starting image with jaw in a first position and a target image with the jaw in a second, jutted-out position.
- a third deltaset represents the difference between a starting image with relatively smooth skin and a target image with old (i.e., heavily textured) skin. Referring to Figs.
- the amount by which each of these first, second, and third deltasets is applied to the neutral image of Fig. 13a is determined by the placement of one or more sliders 1301-1303.
- the deltaset is not applied at all (i.e., the deltaset multiplied by 0.0 is applied to the image).
- the deltaset multiplied by 1.0 is applied to the image and if it is placed to the left, the deltaset multiplied by -1.0 is applied to the image.
- In Fig. 13a, sliders 1301-03 are in a central position.
- slider 1301 is moved (e.g., with a mouse, not shown) to the right causing the first deltaset (multiplied by 1.0) to be applied to the neutral image of Fig. 13a (thus, the lips are moved up some distance).
- slider 1302 is moved to the left, and the second deltaset described above (multiplied by -1.0) is applied to the image of Fig. 13b (thus, the jaw is recessed).
- slider 1303 is moved to the right causing the third deltaset (multiplied by 1.0) to be applied to the image of Fig. 13c.
- the sliders 1301-03 can have intermediate values between -1.0 and 1.0 or can have values beyond this range.
- a communication system is shown.
- a first component (such as server 1510) is coupled via a transmission medium 1509 to a second component (such as client 1511 coupled to a display 1512).
- the transmission medium 1509 is the so-called Internet system that has a varying, but limited bandwidth.
- the server 1510 and client 1511 are computer systems similar to system 801 of Fig. 8.
- a first image (e.g., a person's face) is transmitted over the transmission medium 1509 from the server 1510 to the client as well as any desired deltasets (as described above). Some code may also be sent, operating as described herein.
- the image and deltasets can be stored at the client 1511 and the image can be displayed at display 1512.
- for the server 1510 to change the image at the client 1511, an entire new image need not be sent. Rather, the multiplication values for the deltasets (e.g., the values controlled by sliders 1301-03 in Fig. 13) can be sent over the transmission medium 1509 to cause the desired change to the image at display 1512.
- the system of Fig. 15 can be used as a video phone system where the original image that is sent is that of the speaking party at the server 1510 over the transmission medium 1509 (e.g., plain old telephone system (POTS)). Speech by the user at the server 1510 can be converted into phonemes that are then converted into multiplication values that are transmitted over the transmission medium 1509 with the voice signal to facilitate the "mouthing" of words at the client 1511.
- a graphical image of a human can be made to express emotions by applying a deltaset to a neutral, starting image of the human. If the expression of emotions is autonomous, the computer graphical image of the human will seem more life-like. It could be concluded that humans fit into two categories or extremes: one that represents a person who is emotionally unpredictable (i.e., expresses emotions randomly), such as an infant, perhaps; and one that has preset reactions to every stimulation. According to an embodiment of the present invention, an "emotional state space" is created that includes a number of axes, each corresponding to one emotion.
- element 937 provides input for changing the neutral image based on the expression of emotions.
- An example of pseudo-code for the expression of emotions is shown in Table VIII. In this pseudo-code, two emotions are selected: one that is to be expressed and one that is currently fading from expression.
- emoteNum is the number of emotions in the library: float emoteAmounts[]; int emoteNum;
- nextAmount: select a new random amount of emotion
- objectReactionLevel[i]: metric which incorporates object's visibility, speed, speed towards viewer, inherent emotional reactivity (how exciting it is), and distance to center of vision;
- mainObject: index of largest value in objectReactionLevel
- currAmount nextAmount
- emoteAmounts [nextAmount] currAmount
- emoteAmounts[] is an array of values for the current expression of one of "emoteNum" emotions.
- a value is set (e.g., between -1.0 and 1.0) to indicate the current state of the graphical image (e.g., Fig. 12D shows neutralFace emoting "happy" with a value of 1.0).
- the nextEmote variable stores the level of the next emotion to be expressed.
- the lastEmote variable stores the level of the emotion that is currently being expressed, and is also fading away.
- the number of seconds for this emotion to fade to 0.0 is stored in the variable decaySecs.
- the number of seconds for the next emotion to be expressed after the current emotion amount goes to 0.0.
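The fade of the currently expressed emotion can be sketched in Python as follows. This is a minimal illustration that assumes a linear fade (the text does not state the shape of the decay) and treats `last_emote` as an index into `emote_amounts`. The function names `fade_step` and `fade_emotion` and the time step `dt` are hypothetical.

```python
from typing import List


def fade_step(amount: float, start_amount: float, decay_secs: float, dt: float) -> float:
    """Move `amount` toward 0.0 so that a fade from `start_amount` lasts `decay_secs`.

    Assumption: the fade is linear in time.
    """
    if decay_secs <= 0.0:
        return 0.0
    step = abs(start_amount) * dt / decay_secs
    if abs(amount) <= step:
        return 0.0  # clamp so the value never overshoots past neutral
    return amount - step if amount > 0.0 else amount + step


def fade_emotion(
    emote_amounts: List[float],
    last_emote: int,
    start_amount: float,
    decay_secs: float,
    dt: float,
) -> bool:
    """Fade one entry of emote_amounts in place; return True once it reaches 0.0."""
    emote_amounts[last_emote] = fade_step(
        emote_amounts[last_emote], start_amount, decay_secs, dt
    )
    return emote_amounts[last_emote] == 0.0
```

Calling `fade_emotion` once per frame until it returns True gives the point at which the delay before the next emotion would begin.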
- the objects that are around the graphic image of the person are analyzed to determine which object is of most interest (e.g., by assigning weighted values based on the object's visibility, speed, speed towards the graphical image of the person, the inherent emotional reactivity of the object, and its distance to center of vision for the graphic image of the person).
- Each object has a data structure that includes a predefined emotion, a degree of reactivity and position. For example, a gun object, would elicit a "fear" emotion with a high degree of reactivity depending on how close it is (i.e., distance) to the person.
- a nextEmotion and a nextAmount are selected based on the object, and the random numbers referenced above determine whether that nextEmotion is to be expressed by the human image.
- the human image expresses emotions that are more lifelike in that they are somewhat random, yet can occur in response to specific stimuli.
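The object analysis and the random gating described above might be sketched in Python as below. The specification names the factors that feed the reaction metric but not how they are combined, so the weighted sum, the weight values in `WEIGHTS`, the `SceneObject` fields, and the rule that a stronger reaction is more likely to be expressed are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SceneObject:
    name: str
    emotion: str           # predefined emotion the object elicits
    reactivity: float      # inherent emotional reactivity, 0..1
    visibility: float      # 0 (hidden) .. 1 (fully visible)
    speed: float           # normalized speed, 0..1
    approach_speed: float  # normalized speed towards the viewer, 0..1
    offcenter: float       # normalized distance from center of vision, 0..1


# Hypothetical weights; the specification does not give values.
WEIGHTS = {"visibility": 0.3, "speed": 0.1, "approach": 0.2,
           "reactivity": 0.5, "offcenter": 0.2}


def reaction_level(obj: SceneObject) -> float:
    """Weighted sum of the factors; objects far from the center of vision score lower."""
    return (WEIGHTS["visibility"] * obj.visibility
            + WEIGHTS["speed"] * obj.speed
            + WEIGHTS["approach"] * obj.approach_speed
            + WEIGHTS["reactivity"] * obj.reactivity
            - WEIGHTS["offcenter"] * obj.offcenter)


def main_object_index(objects: List[SceneObject]) -> int:
    """Index of the largest reaction level (mainObject in the pseudo-code)."""
    levels = [reaction_level(o) for o in objects]
    return max(range(len(objects)), key=levels.__getitem__)


def choose_next_emotion(objects: List[SceneObject],
                        rng: random.Random) -> Optional[Tuple[str, float]]:
    """Return (emotion, amount) for the main object, or None if the random gate rejects it."""
    if not objects:
        return None
    main = objects[main_object_index(objects)]
    level = min(1.0, max(0.0, reaction_level(main)))
    if rng.random() < level:
        return main.emotion, level
    return None
```

Passing a seeded `random.Random` keeps the behavior reproducible while still letting weak stimuli be ignored some of the time.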
- an input device 830 is provided for the input of data for the creation of graphic images to be output to display 820.
- the input device 830 can be a variety of components including a video camera, a magnetic tracker monitor, etc.
- selected points are tracked on a person's face.
- These devices output a stream of information that is commensurate with the coordinates of a number of select locations on a person's face as they move (see element 939 in Fig. 9). For example, six locations around the mouth, one on each eyelid, one on each eyebrow, and one on each cheek can all be tracked and output to the computer system of Fig. 8.
- a neutral three-dimensional model of a person is created as described above.
- a test subject e.g., a person
- a set of markers on his/her face (as described above).
- three three-dimensional model faces are created, one for each 3D axis (e.g., the x, y and z axes).
- Each of these models is the same as the neutral model except that the specific marker is moved a known distance (e.g., one inch or other unit) along one of the axes.
- for each marker there is a contorted version of the neutral image where the marker is moved one unit only along the x-axis; a second image where the marker is moved one unit only along the y-axis; and a third image where the marker is moved one unit only along the z-axis.
- Deltasets are then created between the neutral image and each of the three contorted versions for each marker.
- the input stream of marker positions is received from the input device 830.
- the neutral image is then modified with the appropriate deltaset(s) rather than directly with the input positions. If marker data is only in two dimensions, then only two corresponding distorted models are needed (and only two deltasets are created for that marker). Movement of one marker can influence the movement of other points in the neutral model (to mimic real-life or as desired by the user). Also, the movement of a marker in one axis may distort the model in more than one axis (e.g., movement of the marker at the left eyebrow in a vertical direction may have vertical and horizontal effects on the model).
- An example of pseudocode for implementing the input of marker positions is shown in Table IX.
- DeltaSet markerDeltaSets [numMarkers] [numDimensions] ;
- numMarkers is the number of discrete locations being tracked on the source face. Typically 6-14, but
- markerDisplacements is an array of vectors with one vector
- markerDeltaSets is a 2D array of DeltaSets of size
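The marker-driven modification can be sketched in Python along the following lines. This is a hedged illustration, not the content of Table IX: it assumes each entry of `marker_deltasets[marker][axis]` is a sparse map from vertex index to the displacement produced by moving that marker one unit along that axis, and that `marker_displacements` holds one vector per marker measured from its neutral position. The function name `apply_marker_displacements` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]
# Hypothetical sparse layout: vertex index -> displacement per unit of marker movement
DeltaSet = Dict[int, Vector]


def apply_marker_displacements(
    neutral: Sequence[Vector],
    marker_deltasets: Sequence[Sequence[DeltaSet]],
    marker_displacements: Sequence[Vector],
) -> List[Vector]:
    """Deform the neutral model by scaling each per-axis deltaset by the marker's movement."""
    result = [list(v) for v in neutral]
    for marker, displacement in enumerate(marker_displacements):
        # A 2D tracker simply supplies two components and two deltasets per marker.
        for axis, amount in enumerate(displacement):
            for index, delta in marker_deltasets[marker][axis].items():
                for k in range(len(delta)):
                    result[index][k] += amount * delta[k]
    return [tuple(v) for v in result]
```

Because each deltaset can touch any vertex along any axis, a marker can move neighboring points and a vertical marker movement can produce a horizontal effect, as the text describes.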
- a computer-generated character (or characters) in a multidimensional, computer-generated environment can be automatically made to seem aware of objects in that environment.
- This apparent "awareness" of the character includes the ability to "look around" for objects in the environment, to turn its eyes, head and the remainder of the body to "focus" on an object once it enters the character's "field of vision", and to create an apparent emotional response to that object. All of this behavior can be automatically created in a computer-generated character using the method described below.
- the computer-generated character is assigned a list of other objects (or characters) in the environment that the character can be "aware" of.
- the list could include such objects as a flower, a gun, a chair, etc.
- This list can also contain information describing how the character is to react emotionally to the object and what factors are to be used for determining the importance of this object relative to the rest of the list. Virtually any variable can be used to determine an importance parameter, for example, proximity, velocity, likeability, etc.
- the character's "field of vision" is specified to be a pyramid or cone-shaped region radiating from the character's eyes in a direction from the pupils. As used herein, this region will be referred to as the character's "view cone".
- the environmental position of this view cone is calculated from the position of the character's feet (or base) through the use of user-defined rotation limits and a series of transformations applied to a hierarchical structure representing the person's skeleton.
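Once the apex and axis of the view cone are known in environment coordinates, testing whether an object has entered the cone reduces to an angle comparison. The Python sketch below assumes a circular cone described by a half-angle; the function name `in_view_cone` and the parameter `half_angle_deg` are hypothetical, and the skeleton transformations that produce `eye` and `gaze` are not shown.

```python
import math
from typing import Sequence


def in_view_cone(
    eye: Sequence[float],
    gaze: Sequence[float],
    target: Sequence[float],
    half_angle_deg: float,
) -> bool:
    """Return True if `target` lies within `half_angle_deg` of the gaze direction."""
    to_target = [t - e for t, e in zip(target, eye)]
    dist = math.sqrt(sum(c * c for c in to_target))
    gaze_len = math.sqrt(sum(c * c for c in gaze))
    if dist == 0.0 or gaze_len == 0.0:
        return False  # direction is undefined, so treat the object as not seen
    cos_angle = sum(a * b for a, b in zip(to_target, gaze)) / (dist * gaze_len)
    return cos_angle >= math.cos(math.radians(half_angle_deg))
```

Comparing cosines avoids an inverse trigonometric call per object, which matters when the character's whole object list is checked every frame.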
- the character is made to track the object with the objective being to modify the eye and body orientation so that the object is centered within the view cone.
- the eye can initiate the tracking using a nonlinear velocity profile to change its orientation.
- the velocity profile can have "bell" shape where the eye can move toward the object slowly at first, increasing speed to a maximum, then reducing speed to zero (when the object is centered in the cone).
- Additional joints are then gradually included in an additive process using feedback from the eye (difference between eye's current position and the ideal position in the center of the view cone). For example, as the view cone moves closer to the object, the head of the character can begin to move toward the object (followed by the remainder of the body, if desired).
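A bell-shaped velocity profile and the gradual inclusion of further joints could be sketched in Python as follows. The raised-cosine curve is one possible "bell" shape chosen for illustration, and the rule that the head starts contributing after the eye has closed a fixed fraction of its initial error is likewise an assumption. The names `bell_velocity`, `bell_progress`, `joint_weight`, and `start_fraction` are hypothetical.

```python
import math


def bell_velocity(t: float, duration: float) -> float:
    """Raised-cosine speed: zero at the start, maximal at duration/2, zero at the end.

    The curve integrates to 1.0 over [0, duration], so it scales any rotation.
    """
    if t <= 0.0 or t >= duration:
        return 0.0
    return (1.0 - math.cos(2.0 * math.pi * t / duration)) / duration


def bell_progress(t: float, duration: float) -> float:
    """Fraction of the eye rotation completed at time t (integral of bell_velocity)."""
    if t <= 0.0:
        return 0.0
    if t >= duration:
        return 1.0
    x = t / duration
    return x - math.sin(2.0 * math.pi * x) / (2.0 * math.pi)


def joint_weight(eye_error: float, initial_error: float, start_fraction: float = 0.5) -> float:
    """Involvement (0..1) of an additional joint, driven by feedback from the eye.

    The joint stays idle until the eye has closed `start_fraction` of its
    initial error, then ramps in linearly as the error goes to zero.
    """
    if initial_error == 0.0:
        return 0.0  # the object is already centered, so no tracking is needed
    closed = 1.0 - abs(eye_error) / abs(initial_error)
    weight = (closed - start_fraction) / (1.0 - start_fraction)
    return min(1.0, max(0.0, weight))
```

The same `joint_weight` function can be reused with a larger `start_fraction` for the torso, so that the eyes lead, the head follows, and the remainder of the body joins last.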
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Processing Or Creating Images (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US9352398P | 1998-07-21 | 1998-07-21 | |
| US93523P | 1998-07-21 | | |
| PCT/US1999/016553 WO2001037218A1 (en) | 1998-07-21 | 1999-07-21 | Method and apparatus to control responsive action by a computer-generated character |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP1177528A1 (de) | 2002-02-06 |
Family
ID=22239407
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| EP99937371A (Withdrawn), published as EP1177528A1 (de) | Method and apparatus to control responsive action by a computer-generated character | 1998-07-21 | 1999-07-21 |
Country Status (2)
| Country | Link |
|---|---|
| EP (1) | EP1177528A1 (de) |
| WO (1) | WO2001037218A1 (de) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114091639A (zh) * | 2021-11-26 | 2022-02-25 | 北京奇艺世纪科技有限公司 | Interactive expression generation method and apparatus, electronic device, and storage medium |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5659625A (en) * | 1992-06-04 | 1997-08-19 | Marquardt; Stephen R. | Method and apparatus for analyzing facial configurations and components |
| US5611037A (en) * | 1994-03-22 | 1997-03-11 | Casio Computer Co., Ltd. | Method and apparatus for generating image |
1999
- 1999-07-21: EP application EP99937371A, published as EP1177528A1 (de), status: not active (Withdrawn)
- 1999-07-21: WO application PCT/US1999/016553, published as WO2001037218A1 (en), status: not active (Ceased)
Non-Patent Citations (1)
| Title |
|---|
| See references of WO0137218A1 * |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2001037218A1 (en) | 2001-05-25 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20010221 |
| | AK | Designated contracting states | Kind code of ref document: A1. Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| | 18D | Application deemed to be withdrawn | Effective date: 20040804 |