JPH0452479B2

Info

Publication number: JPH0452479B2
Application number: JP58067397A
Authority: JP
Inventors: Keiko Ayukawa
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1983-04-16
Filing date: 1983-04-16
Publication date: 1992-08-21
Also published as: JPS59192294A

Description

[Detailed Description of the Invention]

(1) Technical Field of the Invention

The present invention relates to a conversational voice and singing voice generator in which a singing voice synthesis section, which generates a singing voice from musical score information, is added in a switchable manner to a conversational speech synthesis section, which generates conversational speech from conversation information by means of a rule synthesizer.

(2) Prior Art and Problems

Various devices of the following kind have been proposed and put into practical use. Conversation information comprising a phonetic character string and an accent pattern is input to a rule synthesizer and matched against a dictionary that stores phoneme data for each phonetic character. The rule synthesizer outputs spectral envelope information representing the phonemes corresponding to the conversation information, pitch information representing accent and intonation (that is, the frequency of the sound expressed as a number of sampling intervals), and amplitude information representing sound intensity. These are sent to a speech synthesizer to generate conversational speech. However, no device is known that generates a singing voice in combination with, or by switching from, such conversational speech.

Such a device would have various uses, for example reciting lyrics as conversational speech and then switching to a singing voice, or setting ordinary conversational speech to music and generating it in an operatic style.

(3) Object of the Invention

The object of the present invention is to provide a conversational voice and singing voice generator that has a function for switching the output between conversational speech and singing voice, can output both kinds of voice, and can generate a more natural singing voice.

(4) Structure of the Invention

To achieve the above object, the conversational voice and singing voice generator of the present invention comprises a conversational speech synthesis section and a singing voice synthesis section.

The conversational speech synthesis section inputs conversation information to a rule synthesizer, outputs spectral envelope information representing the phonemes corresponding to that information, pitch information representing accent and intonation, and amplitude information representing sound intensity, and sends these to a speech synthesizer to generate conversational speech.

The singing voice synthesis section receives musical score information and reads pitch information corresponding to each scale note from a pitch information storage section. It sends to the speech synthesizer the pitch information synthesized for the notes, together with amplitude information read from an amplitude information section according to the sound intensity. It also sends the timbre and the time length of each note to the rule synthesizer through respective changeover switches, so that the rule synthesizer outputs only spectral envelope information.

The device is characterized in that opening both changeover switches simultaneously generates conversational speech only, and closing both simultaneously generates a singing voice in which the conversational speech is set to music.

(5) Embodiment of the Invention

FIG. 1 is a diagram explaining the configuration of an embodiment of the present invention. In the figure, a storage section 1 stores conversation information 1-1, which consists of a phonetic character string representing lyrics or the like and the accent type used in conversation, and musical score information 1-2, which indicates the timbre, scale notes, and intensity to be applied to it. For conversational speech, the conversation information 1-1 is input to the rule synthesizer 2, which outputs the spectral envelope information representing the phonemes corresponding to the conversation information, the pitch information representing accent and intonation, and the amplitude information representing sound intensity, as described above. These are sent to the speech synthesizer 8, which generates a synthesized conversational speech signal that is emitted from the speaker 9.

To set this conversational speech to music, the musical score information 1-2, consisting of timbre, scale notes, and sound intensity, is read from the storage section 1. The scale notes are fed to the pitch conversion section 3, and the pitch values stored in advance in the pitch information storage section 4 are read out. For example, when synthesis uses 8 kHz sampling, the pitch period is expressed in units of 125 μs, and each note can be represented by a pitch value that is a multiple of this unit. The frequency of "fa", for instance, is 349 Hz, so its pitch value is 23, which gives the approximation 1/(125 μs × 23) ≈ 348 Hz. The pitch values converted in this way for each note of the musical score information 1-2 are sent to the pitch synthesis section 6 and combined, which smooths the transitions between notes. FIGS. 2a and 2b show this: FIG. 2a is the waveform for the notes do, re, and mi before pitch synthesis, and FIG. 2b is the waveform after pitch synthesis.
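The pitch-value arithmetic can be checked with a short sketch. The Python below is illustrative only: the function names are hypothetical, and the frequencies for do, re, and mi are assumed equal-tempered values. Only the 8 kHz sampling rate and the 349 Hz figure for fa come from the description.

```python
SAMPLE_RATE_HZ = 8000                           # 8 kHz sampling, as in the embodiment
SAMPLE_PERIOD_US = 1_000_000 / SAMPLE_RATE_HZ   # 125 microseconds per sample

# Note frequencies in Hz. Only "fa" = 349 Hz is taken from the text;
# the other values are assumed equal-tempered frequencies for illustration.
NOTE_FREQ_HZ = {"do": 261.63, "re": 293.66, "mi": 329.63, "fa": 349.0}


def pitch_value(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of sampling periods closest to one period of freq_hz."""
    return round(sample_rate_hz / freq_hz)


def realized_frequency(value, sample_rate_hz=SAMPLE_RATE_HZ):
    """Frequency actually produced by a pitch period of `value` samples."""
    return sample_rate_hz / value


for name, freq in NOTE_FREQ_HZ.items():
    n = pitch_value(freq)
    print(f"{name}: pitch value {n}, realized {realized_frequency(n):.1f} Hz")
# "fa" yields pitch value 23 and about 348 Hz, matching the description.
```

Because the pitch period must be a whole number of 125 μs steps, the realized frequency is only an approximation of the target note, and the error grows for higher notes.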

The vibrato section 7 adds vibrato to the output of the pitch synthesis section 6. FIGS. 3a and 3b show this: FIG. 3a is the waveform before vibrato, and FIG. 3b is the waveform after vibrato. This makes it possible to generate a voice that sounds close to natural. The resulting pitch information is sent to the speech synthesizer 8 together with the amplitude information that is read out when the sound intensity in the musical score information is input to the amplitude information section 5.
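One way to picture the pitch synthesis and vibrato stages is as operations on a pitch contour. The sketch below is an assumption-laden illustration, not the method of the patent: the description does not say how the transitions are smoothed or what shape the vibrato has, so a moving average and a sinusoidal modulation are swapped in here. The function names, the frame rate, the pitch values 31, 27, and 24 for do, re, and mi, and the vibrato depth and rate are all hypothetical.

```python
import math


def step_contour(pitch_values, frames_per_note):
    """Piecewise-constant contour: each pitch value is held for frames_per_note frames."""
    return [float(v) for v in pitch_values for _ in range(frames_per_note)]


def smooth(contour, window):
    """Moving average (assumed technique) that rounds off the steps between notes."""
    half = window // 2
    out = []
    for i in range(len(contour)):
        lo = max(0, i - half)
        hi = min(len(contour), i + half + 1)
        out.append(sum(contour[lo:hi]) / (hi - lo))
    return out


def add_vibrato(contour, depth, rate_hz, frame_rate_hz):
    """Superimpose a sinusoidal modulation (assumed shape) of +/- depth on the contour."""
    return [
        v + depth * math.sin(2 * math.pi * rate_hz * i / frame_rate_hz)
        for i, v in enumerate(contour)
    ]


# Assumed pitch values for do, re, mi at 8 kHz sampling; 20 frames per note.
steps = step_contour([31, 27, 24], frames_per_note=20)
smoothed = smooth(steps, window=9)
with_vibrato = add_vibrato(smoothed, depth=0.3, rate_hz=6.0, frame_rate_hz=100.0)
```

Here `steps` plays the role of FIG. 2a, `smoothed` that of FIG. 2b and FIG. 3a, and `with_vibrato` that of FIG. 3b.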

Meanwhile, the timbre in the musical score information, that is, information on the proportion of harmonics relative to the fundamental, and the time length of each note from the pitch conversion section 3, that is, the duration that differs for a half note, a quarter note, and so on, are input to the rule synthesizer 2 through the changeover switches 10₁ and 10₂, respectively.

When the changeover switches 10₁ and 10₂ are closed, the timbre and time length from the musical score information are applied to the spectral envelope information created from the conversation information 1-1, and the rule synthesizer 2 outputs only spectral envelope information to the speech synthesizer 8. There it is combined with the pitch information and amplitude information described above to synthesize a singing voice, which is emitted from the speaker 9.

As described above, when the switches 10₁ and 10₂ are opened, only conversational speech is generated from the speech synthesizer 8. When they are closed, a singing voice is generated in which the conversation information (lyrics) is set to music according to the musical score information.
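The two operating modes can be summarized as a routing decision about where each input to the speech synthesizer 8 comes from. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, the two switches are modeled as a single flag because the description has them opened or closed together, and the strings only label the signal paths rather than carry real data.

```python
from dataclasses import dataclass


@dataclass
class SynthInput:
    """Labels for the three inputs that reach the speech synthesizer (8)."""
    envelope_source: str
    pitch_source: str
    amplitude_source: str


def route(switches_closed: bool) -> SynthInput:
    """Select the signal paths for the given state of switches 10-1 and 10-2."""
    if switches_closed:
        # Singing mode: the rule synthesizer supplies only the spectral envelope,
        # shaped by the timbre and note durations from the score.
        return SynthInput(
            envelope_source="rule synthesizer, with score timbre and note duration",
            pitch_source="score: pitch conversion, pitch synthesis, vibrato",
            amplitude_source="score: amplitude information section",
        )
    # Conversation mode: the rule synthesizer supplies all three signals.
    return SynthInput(
        envelope_source="rule synthesizer",
        pitch_source="rule synthesizer: accent and intonation",
        amplitude_source="rule synthesizer",
    )


speaking = route(switches_closed=False)
singing = route(switches_closed=True)
```

In both modes the spectral envelope still originates from the conversation information, which is why the same lyrics can be either spoken or sung.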

In this embodiment, the pitch synthesis section 6 and the vibrato section 7 are provided to make the singing voice more natural, but any other effects may also be added.

(6) Effects of the Invention

As explained above, according to the present invention, a singing voice synthesis section that generates a singing voice from musical score information is added in a switchable manner to a conversational speech synthesis section that generates conversational speech by feeding conversation information to a rule synthesizer.

As a result, the device can output the conversation information alone as speech, or output that same conversation information (the lyrics) as a singing voice. Applied to fields such as lyric writing, composition, and practice exercises, it can serve a variety of purposes.

[Brief explanation of drawings]

FIG. 1 is a diagram explaining the configuration of an embodiment of the present invention, and FIGS. 2a, 2b, 3a, and 3b are characteristic diagrams showing the operation of the main parts of the embodiment of FIG. 1. In the figures, 1 is a storage section, 1-1 is conversation information, 1-2 is musical score information, 2 is a rule synthesizer, 3 is a pitch conversion section, 4 is a pitch information storage section, 5 is an amplitude information section, 6 is a pitch synthesis section, 7 is a vibrato section, 8 is a speech synthesizer, 9 is a speaker, and 10₁ and 10₂ are changeover switches.

Claims

[Scope of Claims]

1. A conversational voice and singing voice generator comprising: a conversational speech synthesis section that inputs conversation information to a rule synthesizer, outputs spectral envelope information representing the phonemes corresponding to that information, pitch information representing accent and intonation, and amplitude information representing sound intensity, and sends these to a speech synthesizer to generate conversational speech; and a singing voice synthesis section that receives musical score information, reads pitch information corresponding to each scale note from a pitch information storage section, sends to the speech synthesizer the pitch information synthesized for the notes together with amplitude information read from an amplitude information section according to the sound intensity, and sends the timbre and the time length of each note to the rule synthesizer through respective changeover switches so that the rule synthesizer outputs only spectral envelope information; characterized in that opening both changeover switches simultaneously generates conversational speech only, and closing both simultaneously generates a singing voice in which the conversational speech is set to music.