CN115577112B

CN115577112B - Event extraction method and system based on type-aware gated attention mechanism

Info

Publication number: CN115577112B
Application number: CN202211576463.3A
Authority: CN
Inventors: 朱婷婷; 杨瀚
Original assignee: Chengdu Sobey Digital Technology Co Ltd
Current assignee: Chengdu Sobey Digital Technology Co Ltd
Priority date: 2022-12-09
Filing date: 2022-12-09
Publication date: 2023-04-18
Anticipated expiration: 2042-12-09
Also published as: CN115577112A

Abstract

The invention relates to the technical field of information extraction and discloses an event extraction method and system based on a type-aware gated attention mechanism. The event extraction method uses gated information guidance so that different information flows to trigger words under different event categories, and noise unrelated to the trigger words is filtered out. The invention solves problems in the prior art such as low accuracy of event argument extraction and poor role classification.

Description

An event extraction method and system based on type-aware gated attention mechanism

Technical Field

The present invention relates to the technical field of information extraction, and in particular to an event extraction method and system based on a type-aware gated attention mechanism.

Background Art

Event extraction is a fundamental and highly challenging task in the field of information extraction. It usually comprises two tasks: event detection and event argument extraction. More specifically, event detection comprises two subtasks, trigger word detection and event classification, and event argument extraction comprises two subtasks, argument detection and role classification. In recent years, with the continuous development of deep learning, deep-learning-based event extraction methods have improved to some extent, but the difficulties of event extraction have not yet been fully resolved.

At present, most event extraction methods focus on the overlapping-argument scenario and ignore the overlapping-trigger-word scenario and the problem of trigger word ambiguity. In other words, not only may an argument play different roles in different events or in the same event, but a trigger word may also have multiple event types.

In addition, event argument extraction is more difficult than event detection. Many methods try to use role information to improve event argument extraction, for example role frequency (importance) and role correlations such as hierarchical concept relations and grammatical relations between roles. Role frequency ignores the relationships between roles, while the other role correlations must be summarized and generalized from human experience and do not apply to some data, so they bring little improvement to event argument extraction.

Summary of the Invention

To overcome the shortcomings of the prior art, the present invention provides an event extraction method and system based on a type-aware gated attention mechanism, which solve problems in the prior art such as low accuracy of event argument extraction and poor role classification.

The technical solution adopted by the present invention to solve the above problems is as follows:

An event extraction method based on a type-aware gated attention mechanism uses gated information guidance so that different information flows to trigger words under different event categories, and filters out noise unrelated to the trigger words.

As a preferred technical solution, the method includes the following steps:

S1, text vectorization: the sample is input into a language-model-based text vectorization layer to obtain the text vectorization result. Each element of the sample is one character of the text X and has a corresponding vectorization result; together these form the vectorization result of the text X. R denotes the real numbers, d denotes the dimension of the vectors, and R^d denotes a d-dimensional real vector;

S2, event detection: the text vectorization result is input into an event detection module that integrates the type-aware gated attention mechanism, so as to complete the two subtasks of trigger word detection and event classification;

S3, argument extraction: for each trigger word under each event type in the result of trigger word detection and event classification from step S2, an argument extraction module that integrates learnable role interaction parameters completes the two subtasks of argument extraction and argument role classification.

As a preferred technical solution, the event detection module integrating the type-aware gated attention mechanism in step S2 includes the following sub-modules connected in series: a trigger word extraction layer and a gated attention event classification layer.

As a preferred technical solution, the construction of the trigger word extraction layer includes the following steps:

S21, first compute by formula, for each character of the input text, the probability that it is the start character or the end character of a trigger word. The formula uses learnable network parameters with sigmoid as the activation function, and yields for each character of the input text the probability that the character is the start character of a trigger word and the probability that it is the end character of a trigger word;

S22, the results of S21 are filtered against preset thresholds to obtain two position sets: the set of start-character positions of trigger words and the set of end-character positions of trigger words;

S23, combining the result of step S22, the trigger word set is obtained by the nearest-match principle, where t is a candidate trigger word, s is the position of the start character of candidate trigger word t in the text X, and the matched end position is the element of the end-character position set closest to s.
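Steps S22 and S23 can be illustrated with a minimal sketch. It assumes that the start and end probabilities of S21 are already available as plain lists, that a position passes the filter when its probability is strictly greater than the threshold, and that "closest" means the nearest end position at or after the start position. The function name, the default thresholds and these tie-breaking choices are illustrative assumptions, not taken from the patent.

```python
def extract_spans(start_probs, end_probs, start_threshold=0.5, end_threshold=0.5):
    """Pair start and end positions by a nearest-match rule (illustrative)."""
    # S22: keep positions whose probability exceeds the preset threshold.
    starts = [i for i, p in enumerate(start_probs) if p > start_threshold]
    ends = [i for i, p in enumerate(end_probs) if p > end_threshold]

    spans = []
    for s in starts:
        # S23 (assumed reading): nearest end position at or after the start.
        candidates = [e for e in ends if e >= s]
        if candidates:
            spans.append((s, min(candidates)))
    return spans


if __name__ == "__main__":
    start_probs = [0.9, 0.1, 0.2, 0.8, 0.1]
    end_probs = [0.1, 0.9, 0.1, 0.1, 0.7]
    print(extract_spans(start_probs, end_probs))
```

A start position with no end position at or after it is dropped in this sketch; the patent text does not say how such cases are handled.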

The construction of the gated attention event classification layer includes the following steps:

S24, in the gated information filtering layer, an event category semantic vector is defined for each event category, and the corresponding gate vector under that event category is computed by a gating unit with a learnable weight parameter and a learnable bias parameter;

S25, combining the result of S24, the context information is filtered under each event category using the element-wise product: the vector of each character of the input text is multiplied element-wise by the gate vector of the event category, giving the information-filtered vector of that character under that event category;
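The gating of S24 and S25 can be sketched as follows. The sketch assumes that the gate vector is obtained by applying a sigmoid to a linear map of the event category semantic vector, which is a common form for a gating unit with a weight and a bias; the patent text available here does not show the formula, so the activation, the names `gate_vector` and `filter_context`, and the plain-list representation are all assumptions.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gate_vector(type_embedding, weight, bias):
    """S24 (assumed form): gate = sigmoid(weight @ type_embedding + bias)."""
    return [
        sigmoid(sum(w * c for w, c in zip(row, type_embedding)) + b)
        for row, b in zip(weight, bias)
    ]


def filter_context(hidden_states, gate):
    """S25: element-wise product of each character vector with the gate."""
    return [[g * h for g, h in zip(gate, vec)] for vec in hidden_states]


if __name__ == "__main__":
    gate = gate_vector([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
    print(filter_context([[2.0, 4.0], [6.0, 8.0]], gate))
```

Because each event category has its own semantic vector, the same character vectors are filtered differently for each category, which is what lets different information reach the trigger word under different categories.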

S26, in the attention information fusion layer, an attention calculation function is used to obtain, under each event category, the importance score of each character of the input text with respect to the trigger word t;

S27, combining the importance scores computed in S26, the final information aggregation result related to each trigger word is obtained under each event category; the result is the information aggregation vector related to the trigger word t under that event category after the attention information fusion layer;

S28, in the event classification layer, the event type of the trigger word t is determined from the information aggregation vector related to t obtained in step S27, using an event category determination unit with a learnable weight parameter w and a bias parameter, where w^T denotes the transpose of w.

Given a preset threshold, every event category whose result satisfies the threshold condition is judged to be an event category of the trigger word t, so that each trigger word t finally has a set of event categories.
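The thresholded decision of S28 amounts to multi-label classification: a trigger word may receive several event categories. The sketch below assumes that the determination unit computes sigmoid(w^T v + b) on the aggregation vector v of each category, that w and b are shared across categories, and that a category is accepted when the score is strictly greater than the threshold. The sigmoid, the sharing of parameters and the names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify_trigger(aggregated_by_type, w, b, threshold=0.5):
    """Return every event category whose score exceeds the threshold."""
    accepted = set()
    for event_type, vec in aggregated_by_type.items():
        score = sigmoid(sum(wi * vi for wi, vi in zip(w, vec)) + b)
        if score > threshold:
            accepted.add(event_type)
    return accepted


if __name__ == "__main__":
    # Hypothetical category names and aggregation vectors.
    aggregated = {"Attack": [2.0, 0.0], "Transfer": [-2.0, 0.0]}
    print(classify_trigger(aggregated, w=[1.0, 0.0], b=0.0))
```

Scoring each category independently, rather than with a softmax over categories, is what allows an overlapping trigger word to belong to more than one event type.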

As a preferred technical solution, the attention calculation function in step S26 is defined in terms of the representation vector of the trigger word t, which is computed from the representation vector of the start character of t and the representation vector of the end character of t. The scoring function used inside the attention calculation applies the tanh activation function, where V^T denotes the transpose of V, the remaining parameter is a weight, and [;;] denotes the concatenation of vectors.
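The attention of S26 and the aggregation of S27 can be sketched as additive attention. The text states only that the score uses tanh, a vector V, a weight, and a three-way concatenation. Everything else in the sketch is an assumption: the trigger representation is taken as the mean of its start and end character vectors, the three concatenated parts are the filtered character vector, the trigger vector and their element-wise product, the scores are normalized with a softmax, and the aggregation is a weighted sum.

```python
import math


def trigger_vector(start_vec, end_vec):
    # Assumption: mean of the start and end character representations.
    return [(a + b) / 2.0 for a, b in zip(start_vec, end_vec)]


def score(char_vec, trig_vec, weight, v):
    # Assumed concatenation: [h ; t ; h * t].
    concat = char_vec + trig_vec + [h * t for h, t in zip(char_vec, trig_vec)]
    hidden = [math.tanh(sum(w * x for w, x in zip(row, concat))) for row in weight]
    return sum(vi * hi for vi, hi in zip(v, hidden))  # V^T tanh(W [;;])


def attention_weights(filtered, trig_vec, weight, v):
    raw = [score(h, trig_vec, weight, v) for h in filtered]
    top = max(raw)
    exps = [math.exp(r - top) for r in raw]  # softmax, shifted for stability
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(filtered, weights):
    # S27 (assumed form): importance-weighted sum of filtered character vectors.
    dim = len(filtered[0])
    return [sum(w * h[k] for w, h in zip(weights, filtered)) for k in range(dim)]


if __name__ == "__main__":
    filtered = [[1.0, 3.0], [3.0, 5.0]]
    t = trigger_vector(filtered[0], filtered[1])
    weights = attention_weights(filtered, t, [[0.0] * 6, [0.0] * 6], [1.0, 1.0])
    print(weights, aggregate(filtered, weights))
```

The input to the attention is the category-filtered character vectors, so the resulting aggregation vector differs from one event category to another for the same trigger word.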

As a preferred technical solution, the construction of the argument extraction module integrating learnable role interaction parameters in step S3 includes the following steps:

S31, the trigger word representation vector is integrated into the context by formula. The formula uses the mean and the standard deviation computed from the information-filtered vectors of the input text under the event category, and yields, for each character of the input text, the vector under that event category after information filtering and fusion with the information of the trigger word t. It has a scaling parameter and a shift parameter, each computed by its own linear layer with a weight parameter and a bias parameter;
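The description of S31 (mean, standard deviation, and a scaling and shift parameter each produced by a linear layer) matches the shape of conditional layer normalization. The sketch below follows that reading. It assumes that the statistics are taken over the feature dimensions of each character vector, that both linear layers take the trigger word representation as input, and that a small epsilon is added for numerical safety; none of these details is confirmed by the text, and all names are illustrative.

```python
import math


def linear(weight, bias, vec):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def conditional_layer_norm(char_vec, trig_vec, w_scale, b_scale, w_shift, b_shift,
                           eps=1e-12):
    """Normalize a filtered character vector, conditioned on the trigger vector."""
    mean = sum(char_vec) / len(char_vec)
    variance = sum((x - mean) ** 2 for x in char_vec) / len(char_vec)
    std = math.sqrt(variance + eps)

    # Scaling and shift parameters come from two separate linear layers.
    scale = linear(w_scale, b_scale, trig_vec)
    shift = linear(w_shift, b_shift, trig_vec)
    return [g * (x - mean) / std + s for g, x, s in zip(scale, char_vec, shift)]


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    out = conditional_layer_norm([1.0, 3.0], [2.0, 4.0],
                                 zeros, [1.0, 1.0], zeros, [0.0, 0.0])
    print(out)
```

With the scale layer fixed to output ones and the shift layer to output zeros, as in the usage lines, the function reduces to ordinary layer normalization; the trigger word influences the result only through the two linear layers.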

S32, for each role under the event category, the probability that each character of the input text is the start character or the end character of an argument is computed by formula. The formula yields the probability that a character of the input text is the start character of an argument under a given role of the trigger word t with the given event category, and the probability that it is the end character of such an argument; its parameters are two weight parameters and two bias parameters;

S33, a learnable role interaction matrix is defined, and a judgment function is designed over it. The function involves the indicator function under the event category, the weight parameter and bias parameter of a first linear layer, and the bias parameter of a second linear layer.

The output of the judgment function is used as a weight, namely the weight of each role under the event category, to correct the results computed in step S32. The corrected values are the final probability that a character of the input text is the start character of an argument under a given role of the trigger word t with the given event category, and the final probability that it is the end character of such an argument.

After training, the judgment function learns not only the relationships between events and roles but also the relationships among roles;
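One possible reading of S33 is sketched below, with heavy assumptions. The indicator function is taken to be a one-hot vector over event categories, the first linear layer maps it to one value per role, the role interaction matrix acts as the weight of the second linear layer, a sigmoid turns the output into per-role weights, and the correction of the S32 probabilities is a multiplication by the role weight. The patent text available here does not show the formula, so the activations, the shapes and every name in the code are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(weight, bias, vec):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def role_weights(event_index, num_event_types, w1, b1, role_interaction, b2):
    """Per-role weights for one event category (assumed two-layer form)."""
    indicator = [1.0 if k == event_index else 0.0 for k in range(num_event_types)]
    hidden = linear(w1, b1, indicator)               # first linear layer
    logits = linear(role_interaction, b2, hidden)    # role interaction matrix
    return [sigmoid(z) for z in logits]


def correct_probabilities(probs_by_role, weights):
    """Scale each role's start (or end) probabilities by that role's weight."""
    return [[w * p for p in probs] for w, probs in zip(weights, probs_by_role)]


if __name__ == "__main__":
    weights = role_weights(0, 2, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
                           [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
    print(correct_probabilities([[0.8, 0.2], [0.4, 0.6]], weights))
```

In this reading, the off-diagonal entries of the role interaction matrix are where co-occurrence between roles would be learned, since they let the evidence for one role raise or lower the weight of another.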

S34, the results of S33 are filtered against preset thresholds to obtain two position sets for each role: the set of start-character positions of arguments under the role and the set of end-character positions of arguments under the role;

S35, combining the result of step S34, the argument set under each role is obtained by the nearest-match principle, where the start of an argument is the position of its start character in the text X and the matched end position is the element of the end-character position set closest to that start position.

As a preferred technical solution, the event extraction method is trained with a loss function defined over the following quantities: the event type of the trigger word t in the input text; the argument whose role is r for the trigger word t with event type c in the input text; D, the set of all input samples; the probability that, in sample x, the trigger word is t, the event type is c and the argument of role r is the given argument; the probability that the trigger word in sample x is t; the probability that the event type of the trigger word t in sample x is c; the probability that the argument of role r is the given argument when the event category is c and the trigger word is t in sample x; and the set of all events in sample x.
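The quantities listed for the loss suggest a negative log-likelihood in which the joint probability of a trigger word, its event type and an argument factorizes into the three conditional probabilities named above. The sketch below follows that reading; the exact form of the loss is not visible in the text, so the factorization, the summation over all events of all samples and the data layout are assumptions.

```python
import math


def event_extraction_loss(dataset):
    """Negative log-likelihood over all events of all samples (assumed form).

    `dataset` is a list of samples; each sample is a list of events; each event
    is a tuple (p_trigger, p_type_given_trigger, p_argument_given_type_trigger).
    """
    total = 0.0
    for events in dataset:
        for p_trigger, p_type, p_argument in events:
            # log p(t, c, a | x) = log p(t | x) + log p(c | x, t) + log p(a | x, c, t)
            total -= math.log(p_trigger) + math.log(p_type) + math.log(p_argument)
    return total


if __name__ == "__main__":
    dataset = [[(0.9, 0.8, 0.7)], [(0.6, 0.9, 0.5), (0.8, 0.8, 0.9)]]
    print(event_extraction_loss(dataset))
```

Under this factorization the three stages (trigger word detection, event classification, argument extraction) are trained jointly, since each contributes one additive term to the same objective.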

As a preferred technical solution, in step S1, the language model is a BERT model.

An event extraction system based on a type-aware gated attention mechanism is used to implement the event extraction method described above and comprises the following modules connected in sequence:

Text vectorization module: used to input the sample into a language-model-based text vectorization layer to obtain the text vectorization result. Each element of the sample is one character of the text X and has a corresponding vectorization result; together these form the vectorization result of the text X. R denotes the real numbers, d denotes the dimension of the vectors, and R^d denotes a d-dimensional real vector;

Event detection module: used to input the text vectorization result into an event detection module that integrates the type-aware gated attention mechanism, so as to complete the two subtasks of trigger word detection and event classification;

Argument extraction module: used, for each trigger word under each event type in the result of trigger word detection and event classification completed by the event detection module, to complete the two subtasks of argument extraction and argument role classification with an argument extraction module that integrates learnable role interaction parameters.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention uses gated information guidance so that different information flows to trigger words under different event categories. It effectively filters out noise unrelated to the trigger words and integrates other relevant information, which better resolves trigger word ambiguity in event extraction scenarios containing overlapping trigger words, handles the overlapping-trigger-word scenario, and improves event classification. At the same time, the method takes into account the generally neglected role co-occurrence relationship and models it by introducing learnable role interaction parameters, further improving argument extraction and role classification.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of the steps of an event extraction method based on a type-aware gated attention mechanism according to the present invention;

FIG. 2 is a schematic diagram of the model structure of the event detection module integrating the type-aware gated attention mechanism in a specific embodiment of the present invention.

Detailed Description

The present invention is described in further detail below with reference to embodiments and drawings, but the embodiments of the present invention are not limited thereto.

The present invention proposes an event extraction method based on a type-aware gated attention mechanism. The method designs a type-aware gated attention mechanism which, unlike a pure attention mechanism, uses gated information guidance so that different information flows to trigger words under different event categories. It effectively filters out noise unrelated to the trigger words and integrates other relevant information, which better resolves trigger word ambiguity in event extraction scenarios containing overlapping trigger words, handles the overlapping-trigger-word scenario, and improves event classification. At the same time, the method takes into account the generally neglected role co-occurrence relationship and models it by introducing learnable role interaction parameters, further improving argument extraction and role classification.

Example 1

As shown in FIG. 1, an event extraction method based on a type-aware gated attention mechanism includes the following steps:

S1, the sample is input into a text vectorization layer based on a language model such as BERT to obtain the text vectorization result, in which each character of the text has a corresponding vectorization result;

S2, the text vectorization result is input into an event detection module that integrates the type-aware gated attention mechanism, so as to complete the two subtasks of trigger word detection and event classification;

S3, for each trigger word belonging to each event type in the result of S2, an argument extraction module that integrates learnable role interaction parameters completes the two subtasks of argument extraction and argument role classification.

Example 2

On the basis of Example 1, as shown in FIG. 2, the event detection module integrating the type-aware gated attention mechanism in step S2 includes the following sub-modules in series: a trigger word extraction layer and a gated attention event classification layer.

The construction of the trigger word extraction layer includes the following steps:

S21, the probability that each character of the input text is the start character or the end character of a trigger word is computed by formula. The formula uses learnable network parameters with sigmoid as the activation function, and yields for each character of the input text the probability that the character is the start character of a trigger word and the probability that it is the end character of a trigger word;

S22, the results of S21 are filtered against preset thresholds to obtain two position sets, which are respectively the set of start-character positions and the set of end-character positions of trigger words;

S23, combining the result of step S22, the trigger word set is obtained by the nearest-match principle, where the start position is the position of the start character of the candidate trigger word t in the text X and the matched end position is the element of the end-character position set closest to that start position.

S24, in the gated information filtering layer, an event category semantic vector is defined for each event category, and the corresponding gate vector under that event category is computed by a gating unit with a learnable weight parameter and a learnable bias parameter;

S25, combining the result of S24, the context information is filtered under each event category using the element-wise product: the vector of each character of the input text is multiplied element-wise by the gate vector of the event category, giving the information-filtered vector of that character under that event category;

S26, in the attention information fusion layer, after the operation described in step S25, an attention calculation function is used to obtain, under each event category, the importance score of each character of the input text with respect to the trigger word. The function is defined in terms of the representation vector of the trigger word, which is computed by formula, and a further scoring function, which is likewise defined by formula. The result after the attention information fusion layer is the information aggregation vector related to the trigger word under the event category.

Claims

1. An event extraction method based on a type-aware gated attention mechanism, characterized in that gated information guidance is used so that different information flows to trigger words under different event categories, and noise unrelated to the trigger words is filtered out;

the method comprises the following steps:

S1, text vectorization: the sample is input into a language-model-based text vectorization layer to obtain the text vectorization result, wherein each element of the sample is one character of the text X and has a corresponding vectorization result, these together form the vectorization result of the text X, R denotes the real numbers, d denotes the dimension of the vectors, and R^d denotes a d-dimensional real vector;

S2, event detection: the text vectorization result is input into an event detection module that integrates the type-aware gated attention mechanism, so as to complete the two subtasks of trigger word detection and event classification;

S3, argument extraction: for each trigger word under each event type in the result of trigger word detection and event classification from step S2, an argument extraction module that integrates learnable role interaction parameters completes the two subtasks of argument extraction and argument role classification;

the event detection module integrating the type-aware gated attention mechanism in step S2 includes the following sub-modules connected in series: a trigger word extraction layer and a gated attention event classification layer;

the construction of the trigger word extraction layer includes the following steps:

S21, first computing by formula, for each character of the input text, the probability that it is the start character or the end character of a trigger word, wherein the formula uses learnable network parameters with sigmoid as the activation function and yields for each character of the input text the probability that the character is the start character of a trigger word and the probability that it is the end character of a trigger word;

S22, filtering the results of S21 against preset thresholds to obtain two position sets, namely the set of start-character positions of trigger words and the set of end-character positions of trigger words;

S23, combining the result of step S22, obtaining the trigger word set by the nearest-match principle, wherein t is a candidate trigger word, s is the position of the start character of candidate trigger word t in the text X, and the matched end position is the element of the end-character position set closest to s;

the construction of the gated attention event classification layer includes the following steps:

S24, in the gated information filtering layer, defining an event category semantic vector for each event category and computing the corresponding gate vector under that event category by a gating unit with a learnable weight parameter and a learnable bias parameter;

S25, combining the result of S24, filtering the context information under each event category using the element-wise product, wherein the vector of each character of the input text is multiplied element-wise by the gate vector of the event category to give the information-filtered vector of that character under that event category;

S26, in the attention information fusion layer, using an attention calculation function to obtain, under each event category, the importance score of each character of the input text with respect to the trigger word;

S27, combining the importance scores computed in S26, obtaining under each event category the final information aggregation result related to each trigger word, namely the information aggregation vector related to the trigger word under that event category after the attention information fusion layer;

S28, in the event classification layer, determining the event type of the trigger word t from the information aggregation vector related to t obtained in step S27, using an event category determination unit with a learnable weight parameter and a bias parameter;

given a preset threshold, every event category whose result satisfies the threshold condition is judged to be an event category of the trigger word, so that each trigger word finally has a set of event categories.

2. The event extraction method based on a type-aware gated attention mechanism according to claim 1, characterized in that the attention calculation function in step S26 is defined in terms of the representation vector of the trigger word t, which is computed from the representation vector of the start character of t and the representation vector of the end character of t; the scoring function used inside the attention calculation applies the tanh activation function, wherein V^T denotes the transpose of V, the remaining parameter is a weight, and [;;] denotes the concatenation of vectors.

3. The event extraction method based on a type-aware gated attention mechanism according to claim 2, characterized in that the construction of the argument extraction module integrating learnable role interaction parameters in step S3 comprises the following steps:

S31, integrating the trigger word representation vector into the context by formula, wherein the formula uses the mean and the standard deviation computed from the information-filtered vectors of the input text under the event category and yields, for each character of the input text, the vector under that event category after information filtering and fusion with the information of the trigger word t, and wherein the formula has a scaling parameter and a shift parameter, each computed by its own linear layer with a weight parameter and a bias parameter;

S32, computing by formula, for each role under the event category, the probability that each character of the input text is the start character or the end character of an argument, wherein the formula yields the probability that a character of the input text is the start character of an argument under a given role of the trigger word t with the given event category and the probability that it is the end character of such an argument, and its parameters are two weight parameters and two bias parameters;

S33, defining a learnable role interaction matrix and designing a judgment function over it, wherein the function involves the indicator function under the event category, the weight parameter and bias parameter of a first linear layer, and the bias parameter of a second linear layer;

the output of the judgment function is used as a weight, namely the weight of each role under the event category, to correct the results computed in step S32, the corrected values being the final probability that a character of the input text is the start character of an argument under a given role of the trigger word t with the given event category and the final probability that it is the end character of such an argument;

after training, the judgment function learns not only the relationships between events and roles but also the relationships among roles;

S34, filtering the results of S33 against preset thresholds to obtain two position sets for each role, namely the set of start-character positions of arguments under the role and the set of end-character positions of arguments under the role;

S35, combining the result of step S34, obtaining the argument set under each role by the nearest-match principle, wherein the start of an argument is the position of its start character in the text X and the matched end position is the element of the end-character position set closest to that start position.

4. The event extraction method based on a type-aware gated attention mechanism according to claim 3, characterized in that the loss function of the event extraction method is defined over the following quantities: the event type of the trigger word t in the input text; the argument whose role is r for the trigger word t with event type c in the input text; D, the set of all input samples; the probability that, in sample x, the trigger word is t, the event type is c and the argument of role r is the given argument; the probability that the trigger word in sample x is t; the probability that the event type of the trigger word in sample x is c; the probability that the argument of role r is the given argument when the event category is c and the trigger word is t in sample x; and the set of all events in sample x.

5. An event extraction method based on type-aware gated attention mechanism according to any one of claims 1 to 4, characterized in that in step S1, the language model is a BERT model.

6. An event extraction system based on a type-aware gated attention mechanism, characterized in that it is used to implement the event extraction method based on a type-aware gated attention mechanism according to any one of claims 1 to 5 and comprises the following modules connected in sequence:

Text vectorization module: used to input the sample into a language-model-based text vectorization layer to obtain the text vectorization result, wherein each element of the sample is one character of the text X and has a corresponding vectorization result;

Event detection module: used to input the text vectorization result into an event detection module that integrates the type-aware gated attention mechanism, so as to complete the two subtasks of trigger word detection and event classification;

Argument extraction module: used, for each trigger word under each event type in the result of trigger word detection and event classification completed by the event detection module, to complete the two subtasks of argument extraction and argument role classification with an argument extraction module that integrates learnable role interaction parameters;

the event detection module integrating the type-aware gated attention mechanism includes the following sub-modules connected in series: a trigger word extraction layer and a gated attention event classification layer;