JPH04162000A

JPH04162000A - Voice compounding method by borrowing consonant

Info

Publication number: JPH04162000A
Application number: JP2287902A
Authority: JP
Inventors: Koichi Kobayashi; 剛一小林; Hiroyuki Otsu; 大津　裕幸; Kazuo Chiiro; 千色　一男; Kazumasa Hoshikawa; 干川　一匡
Original assignee: Iwaki Electronics Co Ltd
Current assignee: Iwaki Electronics Co Ltd
Priority date: 1990-10-25
Filing date: 1990-10-25
Publication date: 1992-06-05

Abstract

PURPOSE: To reduce the required memory capacity by calling up and combining the speech data of a consonant part and of the vowel part corresponding to it to synthesize a speech waveform. CONSTITUTION: For the syllables of the Japanese syllable table other than the five vowels 'A' to 'O' and 'N', the speech data of one consonant part per row and of the vowel parts of all syllables are registered individually, and the speech data of a consonant part and of the vowel part corresponding to it are called up and combined to synthesize a speech waveform. The vowel parts in the syllable table are not shared but are each registered separately, while similar consonant parts are shared (borrowed consonants). This compresses the amount of speech data while preventing a loss of speech quality.

Description

DETAILED DESCRIPTION OF THE INVENTION [Field of Industrial Application] The present invention relates to a method for synthesizing Japanese speech. More specifically, it relates to a speech synthesis method using borrowed consonants, which compresses the amount of speech data by synthesizing each syllable in a given row of the syllable table from one consonant part of that row combined with the selected vowel part of that row.

The invention is useful in equipment such as speech synthesis output devices that output various messages; it provides high speech quality while reducing memory capacity.

[Prior Art] One method of Japanese speech synthesis is waveform coding. Speech is registered in units of sentences, phrases (words), or syllables (units of the 50-sound table), and these units are combined to reproduce and output various utterances. The basic scheme for registering these speech signals in a form close to the original speech signal is PCM (pulse code modulation); DPCM (differential pulse code modulation) and ADPCM (adaptive differential pulse code modulation) are schemes that do this more efficiently.
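
To make the relationship between PCM and DPCM concrete, the following is a minimal Python sketch that is not taken from the patent. It stores each sample as the difference from the previous sample and reconstructs the PCM samples by accumulating those differences. The function names `dpcm_encode` and `dpcm_decode` are illustrative, and the differences are left unquantized for simplicity; a practical DPCM coder quantizes them to fewer bits, and ADPCM additionally adapts the quantizer step size.

```python
from typing import List


def dpcm_encode(samples: List[int]) -> List[int]:
    """Return the sample-to-sample differences (first difference is taken from 0)."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def dpcm_decode(differences: List[int]) -> List[int]:
    """Rebuild the PCM samples by accumulating the differences."""
    previous = 0
    samples = []
    for difference in differences:
        previous += difference
        samples.append(previous)
    return samples


# Hypothetical 8-bit PCM samples, used only for illustration.
pcm = [128, 130, 133, 131, 128]
diffs = dpcm_encode(pcm)        # [128, 2, 3, -2, -3]
restored = dpcm_decode(diffs)   # identical to pcm
```

After the first sample the differences are small numbers, which is why they can be stored in fewer bits than the samples themselves.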

Registering speech waveforms in units of sentences or words has the advantage that the intonation of the recorded speech can be reproduced as is. However, because the number of registered sentences and words is limited, only fixed phrases can be reproduced, and the range of applications is narrow.

With syllable units (units of the 50-sound table), by contrast, intonation must be assigned to each syllable, which is tedious, and intonation as natural as in the method above is difficult to achieve. However, because there is no restriction on the sentences that can be reproduced, this approach is expected to find increasingly wide use.

When speech is registered in syllable units, the total number of registered items, covering the plain syllables (unvoiced and voiced) and the contracted syllables (yōon), comes to somewhat more than one hundred, depending on the registration method. Every syllable other than the five vowels "a" to "o" and "n" is formed from a consonant and a vowel. Figures 1 and 2 show the speech waveforms of "sa" and "shi" in the sa row. As shown, the consonant part appears first and the vowel part follows it. The vowel part consists of a growth portion, a mature portion, and an envelope.

Table 1 shows the syllable table (the 50-sound table extended to cover voiced and contracted syllables).

[Table 1]

The ta row ("ta" to "to") of the syllable table contains two kinds of consonant. If the row is pronounced with the consonant of "ta", the sounds become "ta", "ti", "tu", "te", "to", as shown in Table 1, whereas "chi" and "tsu" have different consonants.

Similarly, if the da row is pronounced with a common consonant, it becomes "da", "di", "du", "de", "do". In the syllable table, "ji" (ヂ) and "zu" (ヅ) are simply pronounced the same as "ji" (ジ) and "zu" (ズ) of the za row.

From the above, the consonants of the syllables used in Japanese reduce to roughly 27 kinds. There are only five vowels: "a", "i", "u", "e", "o". A simple estimate therefore suggests that all Japanese can be uttered by combining a total of 33 sounds: 5 vowels, 27 consonants, and "n". This scheme is in fact used in low-quality speech synthesis output devices.

[Problems to be Solved by the Invention] Although this speech synthesis method reduces the memory capacity needed to register speech waveforms, its quality is low, and the speech it produces does not rise above a mechanical sound.

For a speech synthesis output device of the single-syllable registration type to maintain a reasonable level of quality, all of the syllables listed in Table 1, or an equivalent set, must be registered and used. This, however, increases the memory capacity: even for a minimal device using 8-bit PCM at a sampling frequency of 10 MHz, the required memory capacity reaches 128 Kbytes.
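
As a rough check on these figures, the sketch below relates memory capacity to total playback time for linear PCM. It is illustrative arithmetic rather than part of the patent: the function name `playback_seconds` is hypothetical, the capacity is taken as 128 Kbytes with 1 Kbyte = 1024 bytes, and the 10 kHz case is included only as an assumed alternative for comparison with the 10 MHz figure given in the text.

```python
def playback_seconds(memory_bytes: int, sampling_hz: float, bits_per_sample: int = 8) -> float:
    """Total seconds of linear PCM audio that fit in memory_bytes."""
    bytes_per_second = sampling_hz * bits_per_sample / 8
    return memory_bytes / bytes_per_second


# Assumption: 128 Kbytes, with 1 Kbyte taken as 1024 bytes.
MEMORY_BYTES = 128 * 1024

# Sampling frequency as given in the text.
at_10_mhz = playback_seconds(MEMORY_BYTES, 10_000_000)

# Assumed alternative rate, for comparison only.
at_10_khz = playback_seconds(MEMORY_BYTES, 10_000)

print(f"10 MHz: {at_10_mhz:.4f} s, 10 kHz: {at_10_khz:.4f} s")
```

Under these assumptions, 128 Kbytes holds about 13 ms of 8-bit PCM at 10 MHz and about 13 s at 10 kHz.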

An object of the present invention is to eliminate the drawbacks of the prior art described above and to provide a speech synthesis method that reduces the required memory capacity without degrading speech quality.

[Means for Solving the Problems] The present invention, which achieves the above object, is a method in which, for the syllables on the Japanese syllable table other than the five vowels "a" to "o" and "n", the speech data of one consonant part per row and of the vowel parts of all syllables are registered individually, and the speech data of a consonant part and of the vowel part corresponding to it are called up and combined to synthesize a speech waveform. That is, to utter any syllable within a row, the invention combines the one consonant part that represents that row with the individual vowel part of the syllable. Because the consonant part is used in common, the method is called "speech synthesis by borrowed consonants".

The present invention applies to syllable-registration devices, such as PCM, ADPCM, and DPCM devices, that reproduce the sound as PCM at the final stage of playback.

[Operation] Speech quality is poor in the synthesis scheme that registers 33 vowel and consonant sounds because each vowel in each row of the syllable table (the extended 50-sound table) is affected by its consonant. For example, the vowel (-a) of "ka" and the vowel (-a) of "sa" are clearly different. By contrast, the consonants within the same row of the syllable table are in many cases very similar.

The present invention focuses on this point. That is, rather than sharing the vowel parts of the syllable table, the invention registers each of them separately and shares the similar consonant parts (borrowed consonants), thereby compressing the amount of speech data while preventing a loss of speech quality.

[Embodiment] An embodiment of the present invention is described below. On the Japanese syllable table (the 50-sound table extended to cover voiced syllables, contracted syllables, and the like), the five vowels "a" to "o" and the syllable "n" are registered as they are. For the other syllables, the speech data of one consonant part per row and of the vowel parts of all syllables are registered individually. For example, for the syllables of the sa row shown in Figure 1, only the consonant part of any one of the syllables "sa" to "so" is registered as the consonant part of the sa row. The vowel parts (-a) to (-o) of the syllables of the sa row are all registered separately.

The selected consonant part and the data of the corresponding vowel part are then called up and combined to synthesize the speech waveform. During playback, to utter "sushi" (スシ), for example, the syllables are synthesized as follows: consonant part of the sa row + vowel (-u) of the sa row = "su"; consonant part of the sa row + vowel (-i) of the sa row = "shi".
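
The registration and recall steps of this embodiment can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the table names (`WHOLE_SYLLABLES`, `ROW_CONSONANT`, `SYLLABLE_VOWEL`, `ROW_OF`), the romanized keys, and the short integer lists standing in for PCM waveform data are all hypothetical, only the sa row is filled in, and the parts are joined by plain concatenation.

```python
from typing import Dict, List

# Hypothetical placeholder data: each list stands in for PCM samples.
# The five vowels and "n" are registered whole.
WHOLE_SYLLABLES: Dict[str, List[int]] = {
    "a": [1], "i": [2], "u": [3], "e": [4], "o": [5], "n": [6],
}

# One consonant part per row (here only the sa row).
ROW_CONSONANT: Dict[str, List[int]] = {"sa-row": [90, 91, 92]}

# The vowel part of every syllable is registered separately.
SYLLABLE_VOWEL: Dict[str, List[int]] = {
    "sa": [10, 11], "shi": [20, 21], "su": [30, 31],
    "se": [40, 41], "so": [50, 51],
}

# Which row each syllable belongs to.
ROW_OF: Dict[str, str] = {s: "sa-row" for s in SYLLABLE_VOWEL}


def synthesize_syllable(syllable: str) -> List[int]:
    """Return the waveform of one syllable."""
    if syllable in WHOLE_SYLLABLES:
        return list(WHOLE_SYLLABLES[syllable])
    # Borrow the row's shared consonant part, then append the syllable's own vowel part.
    return ROW_CONSONANT[ROW_OF[syllable]] + SYLLABLE_VOWEL[syllable]


def synthesize(syllables: List[str]) -> List[int]:
    """Concatenate the waveforms of a sequence of syllables."""
    waveform: List[int] = []
    for syllable in syllables:
        waveform.extend(synthesize_syllable(syllable))
    return waveform


sushi = synthesize(["su", "shi"])
# [90, 91, 92, 30, 31, 90, 91, 92, 20, 21]
```

Both "su" and "shi" begin with the same stored consonant samples and differ only in the vowel part that follows.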

In an actual device, intonation is additionally applied so that the speech is uttered with high quality.

[Effects of the Invention] As described above, for each syllable other than the five vowels and "n", the present invention uses the consonant part in common within each row of the syllable table, and the amount of speech data is compressed accordingly. In other words, the required memory capacity is reduced.
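
The size of this saving can be illustrated with a small Python calculation for a single row of the syllable table. The byte counts below are hypothetical values chosen only to show the arithmetic; the patent does not state the lengths of the consonant and vowel parts, and the function names are likewise illustrative.

```python
def bytes_full_registration(syllables_per_row: int, consonant_bytes: int, vowel_bytes: int) -> int:
    """Every syllable stores its own consonant part and its own vowel part."""
    return syllables_per_row * (consonant_bytes + vowel_bytes)


def bytes_borrowed_consonant(syllables_per_row: int, consonant_bytes: int, vowel_bytes: int) -> int:
    """One shared consonant part per row, plus a separate vowel part per syllable."""
    return consonant_bytes + syllables_per_row * vowel_bytes


# Hypothetical sizes: 5 syllables in the row, 300-byte consonant part, 700-byte vowel part.
full = bytes_full_registration(5, 300, 700)        # 5000
borrowed = bytes_borrowed_consonant(5, 300, 700)   # 3800
saving = 1 - borrowed / full                       # about 0.24
```

With these assumed sizes the row shrinks from 5000 to 3800 bytes. The actual saving depends on how long the consonant part is relative to the vowel part.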

Because the consonant parts in the same row are in many cases very similar, sharing them as described above seldom lowers speech quality.

When speech waveforms are stored as PCM data, the amount of data becomes a problem in relation to the capacity of the semiconductor memory that stores it. Because the present invention minimizes memory capacity without lowering speech quality, it contributes directly to reducing the cost of speech synthesis output devices.

[Brief Description of the Drawings]

FIG. 1 is a speech waveform diagram of the "sa" sound, and FIG. 2 is a speech waveform diagram of the "shi" sound. Patent applicant: Iwaki Electronics Co., Ltd.

Claims

[Claims]

1. A speech synthesis method using borrowed consonants, characterized in that, for the syllables on the Japanese syllable table other than the five vowels "a" to "o" and "n", the speech data of one consonant part per row and of the vowel parts of all syllables are registered individually, and the speech data of a consonant part and of the vowel part corresponding to it are called up and combined to synthesize a speech waveform.