EP4367847A1 - System and method for producing electronic chats - Google Patents

System and method for producing electronic chats

Info

Publication number
EP4367847A1
Authority
EP
European Patent Office
Prior art keywords
electronic chat
electronic
messages
conversations
gaussian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP22838474.9A
Other languages
German (de)
English (en)
Inventor
Jan STADERMANN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Open Text Holdings Inc
Original Assignee
Open Text Holdings Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/389,187 (US11595337B2)
Priority claimed from US17/389,190 (US11700224B2)
Priority claimed from US17/389,194 (US12314658B2)
Application filed by Open Text Holdings Inc
Publication of EP4367847A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/214 Monitoring or handling of messages using selective forwarding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1822 Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission

Definitions

  • This disclosure relates generally to processing of electronic chat data. More particularly, this disclosure relates to methods and systems for processing electronic chat data for electronic discovery. Even more particularly, this disclosure relates to adaptively splitting electronic chats.
  • Electronic discovery generally refers to the collection, processing, analysis, classification, review, and production of electronically stored information (ESI) in legal proceedings.
  • E-discovery tools use a document paradigm for ESI. While determining document boundaries for many forms of electronic data, such as files, is relatively straightforward, some forms of ESI, such as electronic chat data, do not lend themselves well to the document paradigm.
  • an organization subject to discovery will provide criteria to the electronic chat service and the electronic chat service provider will return the electronic chat meeting the criteria.
  • the current solution is to treat each electronic chat as a document for purposes of e-discovery, even though an electronic chat may include a great number of messages on a wide variety of topics over a long period of time.
  • the e-discovery tool stores the entire electronic chat as a single document (for example, an XML document) and then indexes that document as a whole for searching.
  • Treating an electronic chat as a document in e-discovery presents challenges for the subsequent processing and analysis of the electronic chat.
  • the parties often agree to a set of keywords to be used to search for relevant documents.
  • a search for documents containing the keywords may locate the document embodying the entire electronic chat, even if only a few messages of the electronic chat contain the keyword.
  • One embodiment of a computer-implemented method comprises a computer processor receiving an electronic chat — for example, an electronic chat meeting a chat query criterion — the electronic chat embodying a set of electronic chat messages.
  • the method can further include the computer processor adaptively splitting the set of electronic chat messages from the electronic chat into a set of conversations, each conversation in the set of conversations comprising a subset of electronic chat messages from the set of electronic chat messages.
  • Each conversation in the set of conversations can be stored, for example, as a separate document.
  • each electronic chat message embodied in the electronic chat has associated metadata.
  • adaptively splitting the set of electronic chat messages into the set of conversations comprises clustering the set of electronic chat messages into clusters based on the associated metadata of the electronic chat messages from the set of electronic chat messages.
  • each electronic chat message embodied in the electronic chat has a timestamp.
  • adaptively splitting the set of electronic chat messages into the set of conversations comprises clustering the set of electronic chat messages into clusters based on the timestamps of the electronic chat messages from the set of electronic chat messages.
  • messages are adaptively split into the set of conversations based on the time gaps between adjacent messages in the electronic chat.
  • One embodiment can comprise the computer processor determining a set of time gaps between adjacent messages from the set of electronic chat messages and determining a set of models that model the set of time gaps.
  • determining the set of models comprises determining a single Gaussian distribution of the set of time gaps and learning, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions.
  • a best model can be determined from the set of models.
  • selecting the model from the set of models comprises determining a Bayesian information criterion for each model in the set of models and selecting the best model from the set of models based on the Bayesian information criteria for the set of models.
  • the electronic chat is not split into multiple conversations if the single Gaussian distribution is selected as the best model.
  • adaptive splitting of the set of electronic chat messages into the set of conversations can be performed based on the Gaussian mixture model if the Gaussian mixture model is selected as the best model.
  • performing the adaptive splitting of the set of electronic chat messages into the set of conversations based on the Gaussian mixture model comprises: selecting a time gap from the set of time gaps and determining a probability of the selected time gap for each Gaussian distribution in the mixture of Gaussian distributions to produce a set of probabilities for the selected time gap. Based on a determination that a highest probability from the set of probabilities for the selected time gap is for the highest mean value Gaussian distribution represented by the Gaussian mixture model, the electronic chat can be split into a new conversation at the selected time gap. In accordance with one embodiment, the electronic chat is not split at the selected time gap if the highest probability from the set of probabilities for the selected time gap is not for the highest mean value Gaussian distribution represented by the Gaussian mixture model.
  • One embodiment includes receiving, by an electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages having a timestamp, determining a set of time gaps between the chat messages, determining a set of models that model the set of time gaps and selecting an optimum model from the set of models.
  • the electronic chat received is based on a chat query criterion.
  • Determining the set of models can comprise determining a single Gaussian distribution of the set of time gaps and determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions. Based on selecting the single Gaussian distribution as the optimum model, it can be determined that the electronic chat comprises a single conversation. Based on selecting the Gaussian mixture model as the optimum model, an adaptive splitting of the set of electronic chat messages into a set of conversations can be performed based on the Gaussian mixture model.
  • determining a Gaussian mixture model representing a mixture of Gaussian distributions comprises learning the Gaussian mixture model by modeling the mixture of Gaussian distributions. Further, according to one embodiment, determining the Gaussian mixture model includes setting a maximum number of Gaussian components and modeling a set of Gaussian distributions from 2 through the maximum number of Gaussian components. Learning the Gaussian mixture model may comprise using an expectation maximization technique to learn the Gaussian distributions of the Gaussian mixture model.
  • selecting an optimum model from the set of models further comprises determining a Bayesian information criterion for each model in the set of models and selecting the optimal model from the set of models based on the Bayesian information criteria for the set of models.
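
The BIC-based selection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the standard definition BIC = p·ln(n) − 2·ln(L), where lower is better, and counts 3k − 1 free parameters for a one-dimensional mixture of k Gaussians. The time gaps and the "learned" mixture parameters are hypothetical values chosen for the example.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def log_likelihood(data, weights, means, stds):
    """Log-likelihood of the data under a (possibly one-component) mixture."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))
        total += math.log(density)
    return total

def bic(data, weights, means, stds):
    """BIC = p * ln(n) - 2 * ln(L); assumes p = 3k - 1 free parameters."""
    k = len(means)
    n_params = 3 * k - 1  # k means, k standard deviations, k - 1 free weights
    return n_params * math.log(len(data)) - 2.0 * log_likelihood(data, weights, means, stds)

# Hypothetical time gaps in seconds: six short gaps and three long ones.
gaps = [30.0, 45.0, 60.0, 50.0, 40.0, 55.0, 3600.0, 4000.0, 3800.0]
mean = sum(gaps) / len(gaps)
std = math.sqrt(sum((g - mean) ** 2 for g in gaps) / len(gaps))

candidates = {
    "single": ([1.0], [mean], [std]),
    # Hypothetical parameters standing in for a learned two-component mixture.
    "mixture": ([2 / 3, 1 / 3], [46.7, 3800.0], [10.0, 163.0]),
}
scores = {name: bic(gaps, *params) for name, params in candidates.items()}
best = min(scores, key=scores.get)  # the lowest BIC wins
print(best, scores)
```

With these example values the mixture has the lower BIC, so the chat would go on to the splitting step; if the single Gaussian had scored lower, the chat would be kept as one conversation.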
  • One embodiment may include the electronic discovery system determining a highest mean value distribution from the mixture of Gaussian distributions of the Gaussian mixture model.
  • Adaptive splitting of the set of electronic chat messages into the set of conversations based on the Gaussian mixture model may include selecting a time gap from the set of time gaps, determining a probability of the selected time gap for each Gaussian distribution in the mixture of Gaussian distributions to produce a set of probabilities for the selected time gap, and based on a determination that a highest probability from the set of probabilities for the selected time gap is for the highest mean value distribution, splitting the electronic chat based on the selected time gap to produce the set of conversations.
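
A minimal sketch of this splitting rule follows. It assumes the mixture has already been learned and that the "probability" of a gap under each Gaussian is the weighted component density; the source does not specify the exact quantity. The function names, gaps and mixture parameters are hypothetical.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def find_split_points(gaps, weights, means, stds):
    """Return indices i such that a new conversation starts at message i."""
    top = max(range(len(means)), key=lambda j: means[j])  # highest-mean Gaussian
    splits = []
    for i in range(1, len(gaps)):  # the first message has no prior message
        scores = [w * normal_pdf(gaps[i], m, s) for w, m, s in zip(weights, means, stds)]
        best = max(range(len(scores)), key=lambda j: scores[j])
        if best == top:  # the gap is best explained by the long-gap Gaussian
            splits.append(i)
    return splits

def split_messages(messages, splits):
    """Cut the message list into conversations at the given indices."""
    bounds = [0] + list(splits) + [len(messages)]
    return [messages[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical gaps in seconds (gaps[i] is the delay before message i).
gaps = [0.0, 30.0, 45.0, 3600.0, 20.0, 50.0, 4000.0, 35.0]
messages = [f"m{i}" for i in range(len(gaps))]
splits = find_split_points(gaps, weights=[0.7, 0.3], means=[40.0, 3800.0], stds=[15.0, 300.0])
conversations = split_messages(messages, splits)
print(splits)         # [3, 6]
print(conversations)  # [['m0', 'm1', 'm2'], ['m3', 'm4', 'm5'], ['m6', 'm7']]
```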
  • Another embodiment may include receiving, by an electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages having a timestamp, determining a set of time gaps between the chat messages, determining a set of models that model the set of time gaps, and selecting an optimum model from the set of models. Determining the set of models may include determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions. Embodiments may further include performing an adaptive splitting of the set of electronic chat messages based on selecting the Gaussian mixture model as the optimum model and text analysis of the electronic chat. The adaptive splitting may include splitting the set of electronic chat messages into a set of conversations based on the Gaussian mixture model, performing text analysis on the set of conversations based on one or more identified chat subject matters, and splitting the set of conversations based on the one or more identified chat subject matters.
  • the chat subject matter is a set of chat subject matters within a parent chat subject matter grouping. Further, according to one embodiment, receiving the electronic chat comprising a set of electronic chat messages is based on a chat query criterion identifying the parent chat subject matter grouping.
  • the chat subject matter is a plurality of chat subject matters.
  • One embodiment may include applying, by a text mining and classification engine, a text analysis on the electronic chat to derive the plurality of chat subject matters for the electronic chat. Further, one embodiment may include splitting the set of conversations by identifying corresponding chat messages characterized by one of the chat subject matters.
  • Embodiments improve on computer-implemented technologies, such as e-discovery systems, that collect and process chat data.
  • Adaptively splitting a chat into multiple conversations increases the precision of downstream processes, such as search indexing and machine-learning based classification, and increases processing efficiency by reducing the downstream processing of content not related to a particular search or classification.
  • Embodiments further provide a mechanism to split messages into conversations that does not require content analysis.
  • Mechanisms described herein allow adaptive splitting to be implemented without requiring the overhead of content analysis as some embodiments do not require analyzing the content of the chat messages. Further, adaptive splitting can be implemented based on unsupervised learning from the chat being analyzed and does not require a large historical training data set of messages for training.
  • a hierarchy of processing may be used in which adaptive splitting according to mechanisms described herein is used to split chats into conversations, and the conversations are used to train content-based predictive models or are classified by content-based predictive models.
  • FIG. 1 is a diagrammatic representation of one embodiment of a computing ecosystem comprising an e-discovery computer system connected to an electronic chat system;
  • FIG. 2 is a diagrammatic representation of one embodiment of splitting an electronic chat
  • FIG. 3A illustrates an example set of time gap data
  • FIG. 3B is a chart illustrating a plot of a single Gaussian distribution and a mixture of two Gaussian distributions determined from the data of FIG. 3A;
  • FIG. 3C is a chart illustrating potential split points based on the data of FIG. 3A;
  • FIG. 4 is a flowchart illustrating one embodiment of a method for processing an electronic chat
  • FIG. 5 is a flow chart illustrating one embodiment of adaptive splitting of an electronic chat into conversations
  • FIG. 6 is a flow chart illustrating one embodiment of electronic chat splitting into separate conversations based on time gap analysis and text analysis of an electronic chat
  • FIG. 7 is a diagrammatic representation of one embodiment of a networked computing environment.
  • FIG. 1 is a diagrammatic representation of a computing ecosystem comprising an e-discovery computer system 100 connected by a network 102 to an electronic chat system 104, such as an online collaboration platform or other computer system that provides an electronic chat service.
  • Electronic chat system 104 is a cloud-based or other online collaboration platform that provides an electronic chat service.
  • Electronic chat system 104 comprises a database 106 of electronic chat data 108.
  • electronic chat system 104 is owned and operated independently from the organizations or other entities that utilize the electronic chat service.
  • a particular organization’s ESI may include electronic chat data contained in database 106 over which that organization does not have control or to which the organization does not have direct access. Access to the organization’s electronic chat data may be achieved through an application programming interface (API) or other interface or access mechanism that is pertinent to the electronic chat service.
  • E-discovery computer system 100 includes components that serve to retrieve electronic chat data from electronic chat system 104 and segregate electronic chats received from electronic chat system 104 into logical groupings of related messages (referred to herein as conversations) from the electronic chat for further processing.
  • e-discovery computer system 100 includes electronic chat interface 110, such as an API or other interface, to interface with electronic chat system 104 and electronic chat splitter component 112 to segregate electronic chats into conversations.
  • the conversations produced by electronic chat splitter component 112 can be leveraged by other tools.
  • the conversations in a data store 120 may be indexed by an indexing engine 118 for searching via a search engine 116.
  • E-discovery computer system 100 may further include a variety of e-discovery tools to review, redact, analyze, classify, or otherwise process documents or conversations.
  • a user of e-discovery computer system 100 may submit a query for electronic chat data meeting particular criteria, such as electronic chat data associated with a particular custodian or electronic chat data meeting date criteria.
  • electronic chat interface 110 can be utilized to send an electronic chat query for electronic chats meeting particular criteria to electronic chat system 104 and receive responsive electronic chats in return.
  • electronic chat system 104 can return an electronic chat responsive to the electronic chat search criteria.
  • Various mechanisms may be used to return an electronic chat.
  • electronic chat interface 110 may receive each electronic chat as a corresponding file or data stream.
  • each electronic chat may be received as a corresponding XML file or XML stream.
  • an electronic chat that meets the electronic chat search criteria will include all the messages in the electronic chat, even if the individual messages do not meet the electronic chat search criteria.
  • E-discovery computer system 100 may thus receive an electronic chat 130 — by way of example, but not limitation, an XML file or XML stream — that includes any number of messages by any number of participants, over a potentially large period of time.
  • electronic chat 130 is stored as a single electronic chat document 132 embodying all the messages from the electronic chat, which can then be indexed or otherwise processed as an individual document embodying the entire electronic chat.
  • Embodiments described herein include an electronic chat splitter component 112 that processes received electronic chats to determine n conversations embodied in a particular electronic chat and, if n is greater than one, segregates the conversations for further processing.
  • electronic chat splitter component 112 processes electronic chat 130 to extract conversations 134a-134n.
  • electronic chat splitter component 112 stores the n extracted conversations as separate files or other data structures for further processing.
  • electronic chat splitter component 112 stores conversations 134a-134n as separate conversation documents 136a-136n for further processing.
  • electronic chat splitter component 112 stores each conversation 134a-134n extracted from electronic chat 130 as an individual file — for example, an XML file.
  • Downstream processes may thus process the conversations extracted from an electronic chat.
  • indexing engine 118 which may be a component of or separate from search engine 116, separately indexes the documents 136a-136n as individual documents such that the extracted conversations are individually represented in the index 140.
  • index 140 may, for example, associate terms with individual conversation documents 136a-136n (which may also be considered electronic chat documents) instead of, or in addition to, associating the terms with electronic chat document 132 as a whole. Consequently, when a user using an e-discovery tool 114 searches for documents including “term1”, search engine 116 will return a reference to conversation document 136a (and any other documents containing the term according to index 140).
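
As a rough illustration of indexing conversations individually, the sketch below builds a simple term-to-document mapping over separate conversation documents. The document contents, the whitespace tokenization and the `build_index` helper are assumptions made for the example; only the document labels and the search term come from the text.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercased term to the set of conversation documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

# Hypothetical conversation documents split from one chat.
documents = {
    "136a": "budget term1 review",
    "136b": "lunch plans",
    "136n": "term1 follow-up",
}
index = build_index(documents)
hits = sorted(index["term1"])
print(hits)  # ['136a', '136n']
```

A search for the term returns only the conversations that contain it, rather than a single document holding the whole chat.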
  • Although index 140 is illustrated as an inverse index, index 140 may comprise other types of indexes in addition to, or as an alternative to, an inverse index.
  • the documents 136a-136n may be independently classifiable by machine learning classifiers (e.g., machine learning classifier 115) utilized by the e-discovery tools.
  • FIG. 2 is a diagrammatic representation of splitting an electronic chat 200.
  • Electronic chat 200 may be received, for example, as a file, a data stream or according to another format.
  • an electronic chat is a logical container containing any number of messages (e.g., message 202a-message 202m), by any number of different participants, created over a period of time.
  • Electronic chat 200 may include some electronic chat metadata 204 that is common to all the messages in the electronic chat.
  • Electronic chat metadata 204 may include, for example, an electronic chat identifier that uniquely identifies the electronic chat from other electronic chats maintained by the electronic chat service.
  • Each message may also include message metadata (e.g., message metadata 208a . . .
  • the message metadata of a message may include, for example, a user identifier to identify the participant who sent the message, a message id that uniquely identifies the message from other messages in the electronic chat or other messages stored by the electronic chat service and a timestamp indicating, for example, when the message was created or sent.
  • the message content contains the content of the message created by the participant.
  • An electronic chat splitter component (e.g., electronic chat splitter component 112) applies rules to determine a number n of conversations represented by the messages in electronic chat 200.
  • the electronic chat splitter may be configured with a minimum number of messages per conversation such that a split will not occur if a resulting conversation would have fewer than the configured number of messages.
  • If n is greater than one, the electronic chat splitter component segregates the messages based on conversation to create n conversations (e.g., conversation 220a . . . conversation 220n) from the electronic chat 200.
  • If the electronic chat splitter component determines that the messages of electronic chat 200 represent a single conversation, electronic chat 200 can be stored as a single conversation. If the electronic chat splitter component determines that the messages of electronic chat 200 represent multiple conversations, then the electronic chat splitter component splits the messages into the appropriate number of conversations.
  • each conversation includes conversation metadata (e.g., conversation metadata 222a . . . conversation metadata 222n) and messages from the electronic chat 200 from which the conversations were created.
  • the conversation metadata may include, for example, an indication of the electronic chat 200 from which the conversation was created or other metadata that links the conversations created from a particular chat, the identity of the conversation to uniquely identify it from other conversations (e.g., other conversations created from the same electronic chat or other conversations in the system).
  • the conversation metadata may include all or a portion of the electronic chat metadata.
  • each of conversation metadata 222a . . . conversation metadata 222n may include all or a portion of electronic chat metadata 204.
  • If the messages of electronic chat 200 represent a single conversation, the electronic chat splitter component stores all the messages from that electronic chat as a single conversation. If the messages of electronic chat 200 represent multiple conversations, then each conversation created from electronic chat 200 will contain a respective subset of messages from the electronic chat 200 from which the conversation was created.
  • the electronic chat splitter component stores each conversation (e.g., conversation 220a . . . conversation 220n) created from electronic chat 200 as a separate logical entity. Even more particularly, in some embodiments, each conversation is stored as a separately indexable data structure. In a document-centric e-discovery system, each conversation may be stored as a separate document according to the storage paradigm of the e-discovery system. For example, each conversation may be stored as a separate file in some embodiments (e.g., an XML file or other file).
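
A minimal sketch of storing each conversation as its own XML file, using Python's standard `xml.etree.ElementTree`. The element names, attribute names, identifier scheme and message fields are all hypothetical; the source does not define an XML schema.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def conversation_to_xml(chat_id, conversation_id, messages):
    """Build an XML tree for one conversation (hypothetical schema)."""
    root = ET.Element("conversation", {"id": conversation_id, "chatId": chat_id})
    for msg in messages:
        attrs = {"id": msg["id"], "user": msg["user"], "timestamp": msg["timestamp"]}
        ET.SubElement(root, "message", attrs).text = msg["content"]
    return ET.ElementTree(root)

def store_conversations(chat_id, conversations, directory):
    """Write one XML file per conversation and return the file paths."""
    paths = []
    for n, messages in enumerate(conversations):
        conversation_id = f"{chat_id}-{n}"
        path = os.path.join(directory, f"{conversation_id}.xml")
        tree = conversation_to_xml(chat_id, conversation_id, messages)
        tree.write(path, encoding="utf-8", xml_declaration=True)
        paths.append(path)
    return paths

# Hypothetical conversations already split from a chat.
conversations = [
    [{"id": "m1", "user": "u1", "timestamp": "2021-01-01T09:00:00", "content": "Hi"}],
    [{"id": "m2", "user": "u2", "timestamp": "2021-01-02T14:00:00", "content": "New topic"}],
]
out_dir = tempfile.mkdtemp()
paths = store_conversations("chat-200", conversations, out_dir)
print(paths)
```

Carrying the chat identifier on every conversation keeps the link back to the originating chat, in line with the conversation metadata described above.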
  • the electronic chat splitter component may use a number of mechanisms, based on any number of dimensions of metadata, to split an electronic chat into conversations. Examples include, but are not limited to, machine learning techniques such as k-means clustering, Gaussian mixture models, or other unsupervised hard or soft clustering techniques or other machine learning models.
  • the electronic chat splitting component adaptively splits electronic chats into conversations based on the time gaps represented in the electronic chat.
  • the electronic chat splitter component applies a model that embodies the assumptions that the probability distribution for time gaps within a conversation is Gaussian and the range in values of time gaps between messages varies between conversations.
  • the time gaps between messages can be determined from the timestamps of the messages (e.g., from the message metadata 208a-208n).
  • In FIG. 3A, a graph illustrating a set of example data for an electronic chat that contains seventy-five messages is provided.
  • Each datapoint ($x_i$) represents the time delay (y-axis) between when message_i and the prior message (message_{i-1}) from the electronic chat were received (due to the scale, certain time gaps appear to be zero, when they may in fact be several seconds or minutes).
  • datapoint 300 represents the first message in the chat, which has a delay of zero seconds as there was no prior message in the chat.
  • Datapoint 302 represents the time gap (e.g., 3676 seconds) between message_2 (the second message in the electronic chat) and message_1 (the first message in the electronic chat).
  • datapoint 304 represents the time gap (e.g., 53 seconds) between message_3 (the third message in the electronic chat) and message_2 (the second message in the electronic chat).
  • datapoint 306 represents the time gap (e.g., 101287 seconds) between message_4 (the fourth message in the electronic chat) and message_3 (the third message in the electronic chat) and so on.
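
A minimal sketch of deriving the time-gap datapoints from message timestamps. The timestamps are hypothetical, constructed only so that the resulting gaps match the example values above (0, 3676, 53 and 101287 seconds); `time_gaps` is an illustrative helper name, not from the source.

```python
from datetime import datetime, timedelta

def time_gaps(timestamps):
    """Return the gap in seconds before each message; the first gap is 0."""
    if not timestamps:
        return []
    gaps = [0.0]  # no prior message exists for the first message
    for prev, cur in zip(timestamps, timestamps[1:]):
        gaps.append((cur - prev).total_seconds())
    return gaps

# Hypothetical timestamps chosen to reproduce the example gaps.
start = datetime(2021, 1, 1, 9, 0, 0)
offsets = [0, 3676, 3729, 105016]  # cumulative seconds since the first message
timestamps = [start + timedelta(seconds=s) for s in offsets]

gaps = time_gaps(timestamps)
print(gaps)  # [0.0, 3676.0, 53.0, 101287.0]
```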
  • an electronic chat splitter component may determine a single Gaussian distribution and a Gaussian mixture model modelling a mixture of k Gaussian components.
  • FIG. 3B is a chart illustrating a plot of a single Gaussian distribution (line 350) and a plot of a Gaussian mixture model modelling a mixture of two Gaussian distributions (line 352) determined based on the time differences of FIG. 3A.
  • the x-axis represents time difference
  • the y-axis represents the log-likelihood.
  • Various criteria can be used to select which model (e.g., a single Gaussian distribution or a Gaussian mixture model modelling a mixture of k Gaussian components) best models the data.
  • If the chat is best modelled by the single Gaussian distribution, the chat may be stored and be considered to be a single conversation. If the chat is best modelled by a Gaussian mixture model, the electronic chat splitter component determines potential split points in the chat based on the Gaussian mixture model. According to one embodiment, the electronic chat splitter component determines the Gaussian distribution represented by the Gaussian mixture model that has the highest mean value and identifies split points based on the datapoints that have the highest probability for the Gaussian distribution with the highest mean value.
  • FIG. 3C illustrates a set of potential split points, including split point 402, split point
  • split point 402 indicates, for example, that the chat should potentially be split before the message corresponding to datapoint 306 such that message_1 through message_3 are in one conversation and message_4 is the first message in a new conversation.
  • the split points are selected because the corresponding datapoint $x_i$ has the highest probability for the highest mean value Gaussian distribution of a Gaussian mixture model that models the data of FIG. 3A.
  • Additional message splitting rules may also be applied. For example, it may be desired in some embodiments that a conversation have at least a minimum number of messages. According to one embodiment, if a proposed split point would result in a conversation with fewer than a required number of messages, the proposed split point may be ignored when splitting the chat into conversations.
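
One way to apply the minimum-size rule is sketched below. The greedy left-to-right policy and the `filter_split_points` name are assumptions made for illustration; the source only states that a split producing a too-small conversation may be ignored.

```python
def filter_split_points(splits, n_messages, min_messages):
    """Drop proposed split points that would yield a conversation that is too short.

    Each split index s means a new conversation would start at message s.
    """
    kept = []
    start = 0  # index of the first message in the conversation being built
    for s in sorted(splits):
        long_enough_before = s - start >= min_messages
        long_enough_after = n_messages - s >= min_messages
        if long_enough_before and long_enough_after:
            kept.append(s)
            start = s
    return kept

# Hypothetical proposals for an 8-message chat with at least 3 messages per conversation.
kept = filter_split_points([3, 6], n_messages=8, min_messages=3)
print(kept)  # [3]  (splitting at 6 would leave only 2 trailing messages)
```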
  • FIG. 4 is a flowchart illustrating one embodiment of a method for processing an electronic chat.
  • the method of FIG. 4 may be implemented through execution of computer readable program code embodied on a non-transitory computer readable medium.
  • the electronic chat splitter component receives an electronic chat — by way of example, but not limitation, an XML file or XML stream — that includes any number of messages by any number of participants, over a potentially large period of time (step 502).
  • the electronic chat splitter component applies rules to determine whether to split the electronic chat into multiple conversations (step 504).
  • the electronic chat splitter component may be configured to only split electronic chats that have greater than a threshold number of messages, are larger than a particular size or meet other criteria.
  • the electronic chat splitting component according to some embodiments splits electronic chats based on the time gaps between the messages in the electronic chat. To this end, the time gaps between adjacent messages can be determined to produce a series of datapoints comprising the time gaps (step 506).
  • the electronic chat splitter component can then determine a statistical model of the time gaps.
  • the electronic chat splitter component determines a Gaussian distribution of the time gaps; that is, it determines the standard deviation ($\sigma$) or variance ($\sigma^2$) and mean ($\mu$) of the time gaps from the chat (step 508).
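
A minimal sketch of fitting the single Gaussian. It assumes the maximum-likelihood (population) variance and that the zero gap of the first message is included as a datapoint; neither choice is specified in the source. The four gaps are the example values quoted for FIG. 3A.

```python
import math

def fit_single_gaussian(gaps):
    """Return the mean and population variance of the time gaps."""
    n = len(gaps)
    mean = sum(gaps) / n
    variance = sum((g - mean) ** 2 for g in gaps) / n
    return mean, variance

def gaussian_log_likelihood(gaps, mean, variance):
    """Log-likelihood of the gaps under the fitted Gaussian."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * variance) - (g - mean) ** 2 / (2.0 * variance)
        for g in gaps
    )

gaps = [0.0, 3676.0, 53.0, 101287.0]  # first four example gaps, in seconds
mean, variance = fit_single_gaussian(gaps)
std = math.sqrt(variance)
log_lik = gaussian_log_likelihood(gaps, mean, variance)
print(mean)  # 26254.0
```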
  • the electronic chat splitter component also learns one or more Gaussian mixture models from the time gap data determined from the chat (step 510).
  • the Gaussian mixture model may model any number of Gaussian components — that is, Gaussian distributions — and the electronic chat splitter component may learn any number of Gaussian mixture models.
  • a Gaussian mixture model of k Gaussians may be represented by:
    $$p(x) = \sum_{i=0}^{k-1} \pi_i \, \mathcal{N}(x \mid \mu_i, \sigma_i)$$
    where $\pi_i$ is a weighting factor for the i-th Gaussian $\mathcal{N}$ represented by the model, $x$ represents data, $\mu_i$ represents the mean of the i-th Gaussian and $\sigma_i$ is the standard deviation for the i-th Gaussian.
  • the chat splitter component learns a standard deviation ($\sigma_i$) or variance ($\sigma_i^2$), mean ($\mu_i$) and weighting factor ($\pi_i$).
  • the chat splitter component learns ($\sigma_0$, $\mu_0$, $\pi_0$) for the first Gaussian distribution and ($\sigma_1$, $\mu_1$, $\pi_1$) for the second Gaussian distribution.
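
The weighted-sum form of a Gaussian mixture density described here can be evaluated as in the sketch below. The function names and the parameter values are illustrative assumptions, not values from the source.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Weighted sum of the component densities; the weights should sum to 1."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

# Hypothetical two-component mixture: short gaps and long gaps, in seconds.
weights, means, stds = [0.7, 0.3], [40.0, 3800.0], [15.0, 300.0]
density_short = mixture_pdf(40.0, weights, means, stds)
density_long = mixture_pdf(3800.0, weights, means, stds)
print(density_short, density_long)
```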
  • Expectation Maximization (EM) techniques can be used to learn the Gaussian distributions of the Gaussian mixture model.
  • training a Gaussian mixture model using EM often begins with a “guess” of the standard deviation ($\sigma$) or variance ($\sigma^2$) and mean ($\mu$) for each distribution represented by the model.
  • the chat splitter component begins with a guess for ($\sigma_0$, $\mu_0$, $\pi_0$) and a guess for ($\sigma_1$, $\mu_1$, $\pi_1$).
  • the guesses may be hardcoded, determined from the data through various techniques known or developed in the art, provided by configuration, or otherwise determined.
  • the electronic chat splitter component determines the mean and variance for the single Gaussian distribution and “guesses” the means and variances for the Gaussian components of a Gaussian mixture model by moving the mean and scaling the variance from the single Gaussian distribution.
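One way to derive starting values by moving the mean and scaling the variance is sketched below. The patent does not say how far the means are moved or by how much the variance is scaled, so the `shift` and `scale` parameters, the symmetric spread, and the equal starting weights are all assumptions made for illustration.

```python
import math


def initial_guesses(mean, variance, k, shift=1.0, scale=0.5):
    """Derive k starting (sigma, mu, pi) triples from a single-Gaussian fit.

    shift and scale are hypothetical tuning knobs, not values from the source.
    """
    sigma = math.sqrt(variance)
    guesses = []
    for i in range(k):
        # Spread the component means symmetrically around the single mean.
        offset = (i - (k - 1) / 2) * shift * sigma
        # Shrink the variance and start with equal weights.
        guesses.append((math.sqrt(variance * scale), mean + offset, 1.0 / k))
    return guesses


guesses = initial_guesses(mean=60.0, variance=600.0, k=2)
```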
  • the current means, standard deviations, and weighting factors for the Gaussian distributions of the mixture model are used to determine the probabilities that each Gaussian of the Gaussian mixture model is responsible for a datapoint (referred to as responsibilities).
  • a responsibility is calculated for each time gap datapoint for each Gaussian of the mixture model being learned. For example, when learning a mixture model that represents two Gaussian distributions using one hundred datapoints, the expectation step generates two hundred responsibilities, one for each datapoint for each Gaussian.
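The expectation step can be sketched as follows, reproducing the count from the example: one hundred datapoints and two Gaussians give two hundred responsibilities. The `(sigma, mu, pi)` tuple layout, the function names, and the synthetic data are assumptions; the uniform fallback for a zero total density is a defensive choice that is not described in the source.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def responsibilities(data, components):
    """Return one row per datapoint holding one responsibility per Gaussian."""
    rows = []
    for x in data:
        weighted = [pi * gaussian_pdf(x, mu, sigma) for sigma, mu, pi in components]
        total = sum(weighted)
        if total == 0.0:
            # Every density underflowed: fall back to equal responsibilities.
            rows.append([1.0 / len(components)] * len(components))
        else:
            rows.append([w / total for w in weighted])
    return rows


data = [float(i) for i in range(100)]                # 100 synthetic datapoints
components = [(10.0, 20.0, 0.5), (10.0, 80.0, 0.5)]  # two Gaussians
resp = responsibilities(data, components)
```

Each row sums to one, so a row can be read as the probability that each Gaussian generated that datapoint.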
  • the responsibilities for the datapoints with respect to each Gaussian curve are used to improve the guess of each Gaussian distribution’s mean, standard deviation and weighting factor and thus learn better values for ($\sigma_0$, $\mu_0$, $\pi_0$) . . . ($\sigma_{k-1}$, $\mu_{k-1}$, $\pi_{k-1}$).
  • the values for means, standard deviations and the weighting factors learned in an iteration of the maximization step can then be used as the current values for the means, standard deviations, and weighting factors for the Gaussian distributions of the mixture model in a next iteration of the EM steps.
  • the EM steps can be repeated until a stopping condition is reached, such as a certain number of iterations being performed, a convergence condition being reached, or another condition being met.
  • the electronic chat splitter component may thus include a Gaussian mixture model comprising a trained ($\sigma$, $\mu$, $\pi$) for each Gaussian distribution represented by the mixture model.
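The EM iterations described above can be sketched end to end for one-dimensional time gaps. This is an illustrative implementation, not the patented one: the stopping rule (a log-likelihood change below `tol`, or `max_iter` iterations), the `min_sigma` floor, the starting values, and all names are assumptions.

```python
import math


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def log_likelihood(data, components):
    """Log-likelihood of the data under a mixture of (sigma, mu, pi) components."""
    total = 0.0
    for x in data:
        density = sum(pi * gaussian_pdf(x, mu, sigma) for sigma, mu, pi in components)
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def fit_gmm_em(data, components, max_iter=200, tol=1e-6, min_sigma=1e-3):
    """Refine initial (sigma, mu, pi) guesses; return (components, log-likelihood)."""
    previous = log_likelihood(data, components)
    for _ in range(max_iter):
        # Expectation step: responsibility of each Gaussian for each datapoint.
        resp = []
        for x in data:
            weighted = [pi * gaussian_pdf(x, mu, sigma) for sigma, mu, pi in components]
            total = sum(weighted)
            if total == 0.0:
                resp.append([1.0 / len(components)] * len(components))
            else:
                resp.append([w / total for w in weighted])
        # Maximization step: re-estimate each Gaussian from its responsibilities.
        updated = []
        for j in range(len(components)):
            nj = max(sum(row[j] for row in resp), 1e-300)
            mu = sum(row[j] * x for row, x in zip(resp, data)) / nj
            var = sum(row[j] * (x - mu) ** 2 for row, x in zip(resp, data)) / nj
            updated.append((max(math.sqrt(var), min_sigma), mu, nj / len(data)))
        components = updated
        current = log_likelihood(data, components)
        if abs(current - previous) < tol:  # convergence condition
            break
        previous = current
    return components, current


gaps = [10.0, 12.0, 11.0, 9.0, 10.0, 13.0, 1000.0, 1010.0, 990.0]
start = [(5.0, 0.0, 0.5), (50.0, 900.0, 0.5)]  # hypothetical initial guesses
fitted, ll = fit_gmm_em(gaps, start)
```

On this synthetic data the two components settle on the short within-conversation gaps and the long between-conversation gaps.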
  • step 510 can be repeated to learn multiple Gaussian mixture models.
  • the chat splitter component applies model selection criteria to select a model of the chat (step 512).
  • the models generated at step 508 and step 510 are compared using the Bayesian Information Criterion (BIC).
  • a weighting criterion in the BIC computation can be configured to control the sensitivity of splits to prevent or reduce awkward splits (e.g., splits in which only a single message or only some other small number of messages is split into a conversation).
  • the weighting criterion adds a penalty to mixtures with more Gaussians, thus reducing the likelihood of splits in general. This is a soft parameter, as splits may still happen if the data suggests them.
  • the BIC expression can be stated as: $$\mathrm{BIC}(M_i) = \log \mathcal{L}(X_i, M_i) - \lambda \, \tfrac{1}{2} \, \#(M_i) \, \log(N_i)$$ where $X_i$ is an observation sequence ($x_i$ is one particular vector value), $N_i$ is the total number of observations in the sequence, $M_i$ is a model with a certain number of free parameters to estimate from the data, given by $\#(M_i)$, which accounts for the complexity of the model, $\log \mathcal{L}(X_i, M_i)$ is the log-likelihood of the data given the considered model, and $\lambda$ is a design parameter (weighting criterion) that may be optimized to change the effect of the penalty term.
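A penalized-likelihood comparison of this kind can be sketched as below. The sketch assumes the common form in which the penalty is subtracted from the log-likelihood, so the highest score wins; the free-parameter count of 3k - 1 for a one-dimensional mixture of k Gaussians (k means, k variances, k - 1 independent weights) is also an assumption, as are the function names and the sample log-likelihoods.

```python
import math


def bic_score(log_likelihood, num_free_params, num_observations, lam=1.0):
    """Log-likelihood minus a complexity penalty weighted by lam."""
    return log_likelihood - lam * 0.5 * num_free_params * math.log(num_observations)


def free_params(k):
    """Free parameters of a 1-D mixture of k Gaussians (assumed: 3k - 1)."""
    return 3 * k - 1


def select_model(candidates, num_observations, lam=1.0):
    """candidates maps k -> log-likelihood; return the k with the highest score."""
    return max(
        candidates,
        key=lambda k: bic_score(candidates[k], free_params(k), num_observations, lam),
    )


# Hypothetical log-likelihoods for a single Gaussian (k=1) and a mixture (k=2).
candidates = {1: -100.0, 2: -97.0}
strict = select_model(candidates, num_observations=100, lam=1.0)
lenient = select_model(candidates, num_observations=100, lam=0.1)
```

With the full penalty the single Gaussian is kept and no split occurs; lowering `lam` lets the small likelihood gain of the two-component mixture win, which illustrates how the weighting criterion tunes split sensitivity.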
  • the electronic chat splitter component determines whether to split the chat into multiple conversations based on the model selected (step 514). If the single Gaussian distribution determined at step 508 is selected at step 512, then the entire chat is stored as a single entity (e.g., single document) (step 515). If a Gaussian mixture model representing the mixture of k Gaussian distributions is selected, an initial determination of the potential conversations can be made (step 516). For example, potential split points may be determined.
  • the electronic chat splitter splits the chat into conversations based on the time delay data and the selected Gaussian mixture model.
  • the electronic chat splitter component applies text analysis at step 517 based on conversation splits applied in step 516. In other embodiments, the text analysis is not performed. Additional rules may be applied to further determine how the chat is split into conversations (step 518). As one example, rules may be applied to prevent a conversation from having less than a threshold number of messages or to prevent splitting the last message (or some number of messages) into a separate conversation.
  • if a potential split point would violate the rules, the electronic chat splitter can ignore the potential split point. If the potential split point would not violate the rules, the potential split point can be used as an actual split point.
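Filtering potential split points against a minimum-conversation-size rule can be sketched as follows. The sketch assumes a split point is the index of the first message of a new conversation and that the only rule is a minimum message count; `apply_split_rules` and `min_messages` are hypothetical names.

```python
def apply_split_rules(potential_splits, num_messages, min_messages=2):
    """Keep only split points that leave every conversation long enough."""
    accepted = []
    start = 0  # index of the first message of the current conversation
    for split in sorted(potential_splits):
        if split - start < min_messages:
            continue  # the conversation before the split would be too short
        if num_messages - split < min_messages:
            continue  # the trailing conversation would be too short
        accepted.append(split)
        start = split
    return accepted


actual_splits = apply_split_rules([1, 5, 9], num_messages=10, min_messages=2)
```

Here the split at index 1 would isolate a single leading message and the split at index 9 would isolate the last message, so only the split at index 5 is kept.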
  • Awkward splits (e.g., a single message or some other small number of messages split from the rest) may be avoided through a variety of mechanisms.
  • the minimum number of messages of any result conversation can be configured.
  • a weighting factor in the BIC computation can be configured to control the sensitivity of splits.
  • the electronic chat splitter component splits the conversation at the determined actual split points and stores the conversations determined from the chat (step 520).
  • the electronic chat splitter stores each conversation as a separate file or other data structures for further processing. Even more particularly, in some embodiments, each conversation is stored as a separately indexable data structure.
  • a common identifier can be stored (e.g., in conversation metadata) to link conversations so that all conversations created from the same root chat can be located.
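Storing each conversation as its own record with a shared identifier can be sketched as below. The record layout and the field names `root_chat_id` and `conversation_index` are hypothetical; the patent only states that a common identifier can be kept in conversation metadata.

```python
import uuid


def build_conversation_records(messages, split_points, root_chat_id=None):
    """Cut messages at the split points and tag each piece with a common ID."""
    root_chat_id = root_chat_id or str(uuid.uuid4())
    bounds = [0] + sorted(split_points) + [len(messages)]
    records = []
    for index, (start, end) in enumerate(zip(bounds, bounds[1:])):
        records.append({
            "root_chat_id": root_chat_id,      # links siblings from the same chat
            "conversation_index": index,
            "messages": messages[start:end],
        })
    return records


records = build_conversation_records(
    ["m0", "m1", "m2", "m3", "m4"], split_points=[2], root_chat_id="chat-001"
)
```

Because every record carries the same root identifier, all conversations created from one chat can be located with a single lookup.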
  • FIG. 4 is provided by way of example and not limitation. Various steps may be repeated, steps performed in different orders, steps omitted, and additional or alternative steps performed.
  • FIG. 5 is a flow chart illustrating one embodiment of splitting a chat (e.g., at step 516).
  • the method of FIG. 5 may be implemented through execution of computer readable program code embodied on a non-transitory computer readable medium.
  • adaptive splitting may be based on a highest mean value distribution from the mixture of Gaussian distributions represented by the Gaussian mixture model.
  • the electronic chat splitter determines the Gaussian distribution from the Gaussian mixture model that has the highest mean (m) value.
  • the distribution from the Gaussian mixture model that has the highest mean value represents the largest time gaps with the chat document, which may be assumed to be breaks between conversations.
  • the electronic chat splitter component can iterate or otherwise process the time gaps determined for the set of messages in the electronic chat.
  • a datapoint representing a time gap between adjacent messages is selected (step 602).
  • the electronic chat splitter component determines the probability that the selected datapoint $x_i$ belongs to each Gaussian represented by the selected Gaussian mixture model (step 604). For example, if the Gaussian mixture model selected at step 512 represents the mixture of two Gaussian distributions, the electronic chat splitter component determines the probability that the selected datapoint $x_i$ belongs to each of the two Gaussian distributions represented by the Gaussian mixture model, thus producing a set of probabilities for the datapoint $x_i$.
  • the electronic chat splitter component determines if a time gap represents a change in conversation (step 606). According to one embodiment, if $x_i$ has the highest probability for the Gaussian distribution with the highest mean value, the electronic chat splitter identifies $x_i$ as representing a potential split point (step 608). If the potential split point is used as an actual split point (e.g., based on rules applied at step 518), message $i$ can be determined to be the first message of a new conversation. If the highest probability for $x_i$ does not correspond to the Gaussian distribution with the largest mean value, $x_i$ is not identified as representing a potential split point.
  • the steps may be repeated for each of the time gap datapoints corresponding to the chat.
  • the electronic chat splitting component will stop adaptive splitting when fewer than some threshold number of datapoints remain to ensure that a conversation with only a single message (or some other small number of messages) is not created.
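The adaptive splitting loop of FIG. 5 can be sketched as follows. The sketch assumes that the probability of a datapoint belonging to a Gaussian is proportional to its weighted density, that gap `i` lies between messages `i` and `i + 1` (zero-based), and that `min_remaining` implements the stop threshold; these choices and all names are illustrative assumptions.

```python
import math


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def potential_split_points(gaps, components, min_remaining=1):
    """Return indices of messages that may start a new conversation.

    components is a list of (sigma, mu, pi) tuples from the selected model.
    """
    # Index of the Gaussian with the highest mean, i.e. the longest gaps.
    top = max(range(len(components)), key=lambda j: components[j][1])
    splits = []
    for i, x in enumerate(gaps):
        if len(gaps) - i < min_remaining:
            break  # too few datapoints left: stop adaptive splitting
        scores = [pi * gaussian_pdf(x, mu, sigma) for sigma, mu, pi in components]
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == top:
            splits.append(i + 1)  # the message after this gap starts a conversation
    return splits


gaps = [10.0, 12.0, 1000.0, 11.0, 9.0]
model = [(1.5, 10.5, 0.8), (10.0, 1000.0, 0.2)]  # hypothetical trained mixture
splits = potential_split_points(gaps, model)
```

The returned indices are only potential split points; rules such as a minimum conversation size may still discard some of them.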
  • FIG. 5 is provided by way of example and not limitation. Various steps may be repeated, steps performed in different orders, steps omitted, and additional or alternative steps performed.
  • the electronic chat splitter applies text analysis at step 517 on conversation splits applied in step 516.
  • the text analysis can provide additional insight for splitting the chat into conversations.
  • an electronic chat 630 is portrayed comprising a set of chat messages characterized by a time (e.g., calendar date 632 and time 637), and a chat user 639 for a chat message 635 in the electronic chat 630.
  • a time gap Gaussian analysis can be applied to determine an initial set of chat 630 message splits, resulting in two conversations 640 and 650.
  • the time gap Gaussian-based split results in two potential conversations, which occur on different calendar dates, namely, June 15, 2021, and June 17, 2021.
  • the chat messages may be further differentiated by chat subject matter.
  • three subject matters are computed, namely, “legal language”, “trademark”, and “patent”.
  • Such chat subject matter identification in the text analysis step 517 may be performed by a text-mining and classification engine.
  • the electronic chat splitter component further processes the messages corresponding to the two potential conversations 640, 650 to determine whether to sub-split the conversations based on chat subject matter.
  • the electronic chat splitter component further splits conversation 650 into two sub-conversations 651 and 660, wherein conversation 651’s chat subject matter equals “trademark” and conversation 660’s chat subject matter equals “patent”.
  • time gap Gaussian analysis and text analysis on a chat can result in a more precise, accurate, and useful split into conversations 640, 651, and 660.
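Sub-splitting a conversation by subject matter can be sketched as below. The sketch assumes each message already carries a subject label produced by some text-mining and classification engine, which is not modeled here, and that a sub-conversation is a run of consecutive messages with the same label; the message texts are invented for illustration.

```python
from itertools import groupby


def sub_split_by_subject(conversation, subject_of):
    """Split a conversation into runs of consecutive messages sharing a subject."""
    return [list(group) for _, group in groupby(conversation, key=subject_of)]


# Hypothetical labeled messages standing in for one time-gap-based conversation.
conversation = [
    {"text": "Is the logo registered?", "subject": "trademark"},
    {"text": "Yes, since last year.", "subject": "trademark"},
    {"text": "What about the filing?", "subject": "patent"},
    {"text": "The application is pending.", "subject": "patent"},
]
sub_conversations = sub_split_by_subject(conversation, lambda m: m["subject"])
```

Grouping only consecutive messages keeps each sub-conversation contiguous in time, which matches splitting an already time-bounded conversation into two parts.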
  • chat splitting may be reversed wherein text analysis step 517 is first performed and then enhanced by the Gaussian-based analysis.
  • some embodiments may perform only the text analysis in step 517.
  • One embodiment includes receiving, by an electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages in the set of electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages from the set of electronic chat messages; determining a set of models that model the set of time gaps, selecting an optimum model from the set of models; based on selecting the single Gaussian distribution as the optimum model, determining that the electronic chat comprises a single electronic chat message or based on selecting the Gaussian mixture model as the optimum model, performing an adaptive splitting of the set of electronic chat messages into a set of conversations based on the Gaussian mixture model.
  • One embodiment includes a method of electronic chat production in an electronic discovery system, comprising: receiving, by an electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages in the set of electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages from the set of electronic chat messages; determining a set of models that model the set of time gaps, wherein determining the set of models comprises: determining a single Gaussian distribution of the set of time gaps; and determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions; selecting an optimum model from the set of models; and based on selecting the Gaussian mixture model as the optimum model, performing an adaptive splitting of the set of electronic chat messages into a set of conversations based on the Gaussian mixture model.
  • receiving the electronic chat comprising the set of electronic chat messages is based on a chat query criterion.
  • determining the Gaussian mixture model representing the mixture of Gaussian distributions comprises: learning the Gaussian mixture model by modeling a mixture of Gaussian distributions.
  • learning the Gaussian mixture model further comprises: setting a maximum number of Gaussian components; and modeling a set of Gaussian distributions from 2 through the maximum number of Gaussian components.
  • learning the Gaussian mixture model further comprises: using an expectation maximization technique to learn the Gaussian distributions of the Gaussian mixture model.
  • selecting the optimum model from the set of models further comprises: determining a Bayesian information criterion for each model in the set of models and selecting the optimal model from the set of models based on the Bayesian information criteria for the set of models.
  • the method further comprises: determining a highest mean value distribution from the mixture of Gaussian distributions of the Gaussian mixture model, wherein performing the adaptive splitting comprises: adaptively splitting of the set of electronic chat messages into the set of conversations based on the Gaussian mixture model, comprising: selecting a time gap from the set of time gaps; determining a probability of the selected time gap for each Gaussian distribution in the mixture of Gaussian distributions to produce a set of probabilities for the selected time gap; and based on a determination that a highest probability from the set of probabilities for the selected time gap is for the highest mean value distribution, splitting the electronic chat based on the selected time gap to produce the set of conversations.
  • One embodiment includes a computer program product comprising a non-transitory, computer-readable medium storing a set of computer executable instructions, the set of computer executable instructions including instructions for: receiving, by an electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages in the set of electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages from the set of electronic chat messages; determining a set of models that model the set of time gaps, wherein determining the set of models comprises: determining a single Gaussian distribution of the set of time gaps; and determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions; selecting an optimum model from the set of models; based on selecting the single Gaussian distribution as the optimum model, determining that the electronic chat comprises a single electronic chat message; and based on selecting the Gaussian mixture model as the optimum model, performing an adaptive splitting of the set of electronic chat
  • One embodiment includes an electronic discovery system comprising: a processor; a non-transitory, computer-readable medium storing a set of computer executable instructions that are executable by the processor, the set of computer executable instructions including instructions for: receiving an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages in the set of electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages from the set of electronic chat messages; determining a set of models that model the set of time gaps, wherein determining the set of models comprises: determining a single Gaussian distribution of the set of time gaps; and determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions; selecting an optimum model from the set of models; based on selecting the single Gaussian distribution as the optimum model, determining that the electronic chat comprises a single electronic chat message; and based on selecting the Gaussian mixture model as the optimum model, performing an adaptive splitting of the set of electronic chat messages into
  • One embodiment includes receiving, by an electronic discovery system, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages in the set of electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages from the set of electronic chat messages, based on selecting a Gaussian mixture model as a model of the time gaps, splitting the set of electronic chat message into a set of conversations based on the Gaussian mixture model; performing a text analysis on the set of conversations based on a chat subject matter identified in the set of electronic chat messages; and splitting the set of conversations based on the chat subject matter.
  • a method of electronic chat production in an electronic discovery system comprises: receiving, by the electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages in the set of electronic chat messages; determining a set of models that model the set of time gaps, wherein determining the set of models comprises: determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions; selecting an optimum model from the set of models; based on selecting the Gaussian mixture model as the optimum model and a text analysis of the electronic chat, performing an adaptive splitting of the set of electronic chat messages, comprising: splitting the set of electronic chat message into a set of conversations based on the Gaussian mixture model; performing the text analysis on the set of conversations based on a chat subject matter identified in the set of electronic chat messages; and splitting the set of conversations based on the chat subject matter.
  • the chat subject matter is a set of chat subject matters within a parent chat subject matter grouping and wherein receiving the electronic chat comprising a set of electronic chat messages is based on a chat query criterion identifying the parent chat subject matter grouping.
  • determining a Gaussian mixture model representing a mixture of Gaussian distributions comprises: learning the Gaussian mixture model by modeling a mixture of Gaussian distributions.
  • learning the Gaussian mixture model further comprises: setting a maximum number of Gaussian components; and modeling a set of Gaussian distributions from 2 through the maximum number of Gaussian components.
  • learning the Gaussian mixture model further comprises: using an expectation maximization technique to learn the Gaussian distributions of the Gaussian mixture model.
  • selecting an optimum model from the set of models further comprises: determining a Bayesian information criterion for each model in the set of models and selecting the optimal model from the set of models based on the Bayesian information criteria for the set of models.
  • the chat subject matter is a plurality of chat subject matters, further comprising, by the electronic discovery system: applying, by a text mining and classification engine, a text analysis on the electronic chat to derive the plurality of chat subject matters for the electronic chat; and splitting the set of conversations by identifying corresponding chat messages characterized by the chat subject matter.
  • a computer program product comprising a non-transitory, computer-readable medium storing thereon a set of computer-executable instructions, the set of computer-executable instructions comprising instructions for: receiving, by an electronic discovery system executing on a computer processor, an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages in the set of electronic chat messages; determining a set of models that model the set of time gaps, wherein determining the set of models comprises: determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions; selecting an optimum model from the set of models; based on selecting the Gaussian mixture model as the optimum model and a text analysis of the electronic chat, performing an adaptive splitting of the set of electronic chat messages, comprising: splitting the set of electronic chat message into a set of conversations based on the Gaussian mixture model; performing the text analysis on the set of conversations
  • an electronic discovery system comprises: a processor; a non-transitory, computer-readable medium storing thereon a set of computer-executable instructions executable by the processor, the set of computer-executable instructions comprising instructions for: receiving an electronic chat comprising a set of electronic chat messages, each of the electronic chat messages having a timestamp; determining a set of time gaps between the electronic chat messages in the set of electronic chat messages; determining a set of models that model the set of time gaps, wherein determining the set of models comprises: determining, using the set of time gaps, a Gaussian mixture model representing a mixture of Gaussian distributions; selecting an optimum model from the set of models; based on selecting the Gaussian mixture model as the optimum model and a text analysis of the electronic chat, performing an adaptive splitting of the set of electronic chat messages, comprising: splitting the set of electronic chat messages into a set of conversations based on the Gaussian mixture model; performing the text analysis on the set of conversations based on a chat subject
  • FIG. 7 is a diagrammatic representation of one embodiment of a computing environment 700.
  • computing environment 700 includes a computer system 702 that connects to electronic chat system 704 and electronic chat system 706 via a network 708.
  • a single system is shown for computer system 702, electronic chat system 704 and electronic chat system 706.
  • each of computer system 702, electronic chat system 704 and electronic chat system 706 may comprise a plurality of computers (not shown) interconnected to each other over network 708.
  • Computer system 702 comprises a computer processor 710 and associated memory 714.
  • Computer processor 710 may be an integrated circuit for processing instructions.
  • computer processor 710 may comprise one or more cores or micro-cores of a processor.
  • Memory 714 may include volatile memory, non-volatile memory, semi-volatile memory or a combination thereof.
  • Memory 714 may include RAM, ROM, flash memory, a hard disk drive, a solid-state drive, an optical storage medium (e.g., CD-ROM), or other computer-readable memory or combination thereof.
  • Memory 714 may implement a storage hierarchy that includes cache memory, primary memory or secondary memory. In some embodiments, memory 714 may include storage space on a data storage array.
  • Computer system 702 may also include input/output (“I/O”) devices 718, such as a keyboard, monitor, printer, electronic pointing device (e.g., mouse, trackball, stylus, etc.), or the like.
  • Computer system 702 may also include a communication interface 719, such as a network interface card, to interface with network 708, which may be a local LAN, a WAN such as the Internet, mobile network, or other type of network or combination thereof.
  • Network 708 may represent a combination of wired and wireless networks that may be utilized for various types of network communications.
  • Memory 714 may store instructions executable by computer processor 710.
  • memory 714 may include code executable to provide an electronic chat splitter component.
  • memory 714 provides instructions for an e-discovery system.
  • computer system 702 may be one embodiment of an e-discovery computer system 100.
  • Data store 720, which may be part of or separate from memory 714, may comprise one or more database systems, file store systems, or other systems to store various data used by computer system 702.
  • Each of the computers in FIGURE 7 may have more than one CPU, ROM, RAM, HD, I/O, or other hardware components. Portions of the methods described herein may be implemented in suitable software code that may reside within memory 714 or other computer-readable memory.
  • Embodiments can be implemented or practiced in a variety of computer system configurations including, without limitation, multi-processor systems, network devices, mini-computers, mainframe computers, data processors, and the like.
  • Embodiments can be employed in distributed computing environments, where tasks or modules are performed by remote processing devices, which are linked through a communications network such as a LAN, WAN, and/or the Internet.
  • program modules or subroutines may be located in both local and remote memory storage devices. These program modules or subroutines may, for example, be stored or distributed on computer-readable media, stored as firmware in chips, as well as distributed electronically over the Internet or over other networks (including wireless networks).
  • Example chips may include Electrically Erasable Programmable Read-Only Memory (EEPROM) chips.
  • Embodiments described herein can be implemented in the form of control logic in software or hardware or a combination of both.
  • the control logic may be stored in an information storage medium, such as a computer-readable medium, as a plurality of instructions adapted to direct an information processing device to perform a set of steps disclosed in the various embodiments.
  • Steps, operations, methods, routines or portions thereof described herein can be implemented using a variety of hardware, such as CPUs, application specific integrated circuits, programmable logic devices, field programmable gate arrays, optical, chemical, biological, quantum or nanoengineered systems, or other mechanisms.
  • Computer-readable program code may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer-readable medium.
  • the computer-readable program code can be operated on by a processor to perform steps, operations, methods, routines or portions thereof described herein.
  • a “computer-readable medium” is a medium capable of storing data in a format readable by a computer and can include any type of data storage medium that can be read by a processor.
  • non-transitory computer-readable media can include, but are not limited to, volatile and non-volatile computer memories, such as RAM, ROM, hard drives, solid state drives, data cartridges, magnetic tapes, floppy diskettes, flash memory drives, optical data storage devices, and compact-disc read-only memories.
  • computer-readable instructions or data may reside in a data array, such as a direct attach array or other array.
  • the computer-readable instructions may be executable by a processor to implement embodiments of the technology or portions thereof.
  • a “processor” includes any hardware system, mechanism or component that processes data, signals or other information.
  • a processor can include a system with a general-purpose central processing unit, multiple processing units, dedicated circuitry for achieving functionality, or other systems. Processing need not be limited to a geographic location or have temporal limitations. For example, a processor can perform its functions in “real-time,” “offline,” in a “batch mode,” etc. Portions of processing can be performed at different times and at different locations, by different (or the same) processing systems.
  • Any suitable programming language can be used to implement the routines, methods or programs of embodiments of the invention described herein, including R, Python, C, C++, Java, JavaScript, HTML, or any other programming or scripting code, etc. Communications between computers implementing embodiments can be accomplished using any electronic, optical, radio frequency signals, or other suitable methods and tools of communication in compliance with known network protocols.
  • Any particular routine can execute on a single computer processing device or multiple computer processing devices, a single computer processor or multiple computer processors. Data may be stored in a single storage medium or distributed through multiple storage mediums. In some embodiments, data may be stored in multiple databases, multiple filesystems or a combination thereof.
  • the term “or” as used herein is generally intended to mean “and/or” unless otherwise indicated.
  • a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
  • a term preceded by “a” or “an” includes both singular and plural of such term, unless clearly indicated within the claim otherwise (i.e., that the reference “a” or “an” clearly indicates only the singular or only the plural).
  • the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
  • any examples or illustrations given herein are not to be regarded in any way as restrictions on, limits to, or express definitions of, any term or terms with which they are utilized. Instead, these examples or illustrations are to be regarded as being described with respect to one particular embodiment and as illustrative only. Those of ordinary skill in the art will appreciate that any term or terms with which these examples or illustrations are utilized will encompass other embodiments which may or may not be given therewith or elsewhere in the specification and all such embodiments are intended to be included within the scope of that term or terms. Language designating such nonlimiting examples and illustrations includes, but is not limited to: “for example,” “for instance,” “e.g.,” “in one embodiment.”

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

L'invention concerne des systèmes, des procédés et des progiciels informatiques destinés à la division adaptative de dialogues électroniques. Un système de découverte électronique comporte un processeur informatique et un support non transitoire lisible par ordinateur sur lequel est incorporé un ensemble d'instructions informatiques exécutables par le processeur informatique. L'ensemble d'instructions informatiques comprend des instructions visant à: envoyer une interrogation de dialogue à un service distant de dialogue électronique; recevoir un dialogue électronique en réponse à l'interrogation de dialogue, le dialogue électronique incorporant un ensemble de messages de dialogue électronique; diviser de manière adaptative l'ensemble de messages de dialogue électronique en un ensemble de conversations, chaque conversation dans l'ensemble de conversations comportant un sous-ensemble de messages de dialogue électronique de l'ensemble de messages de dialogue électronique; et stocker chaque conversation de l'ensemble de conversations en tant que document distinct.
EP22838474.9A 2021-07-09 2022-07-08 System and method for electronic chat production Withdrawn EP4367847A1 (fr)
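
The adaptive splitting step named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the split points are chosen, so the rule used here (start a new conversation when the pause between consecutive messages exceeds a multiple of the chat's own median pause) is an assumption made for illustration and is not presented as the claimed method. The names ChatMessage, split_into_conversations, to_documents, factor, and min_gap_seconds are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List


@dataclass
class ChatMessage:
    """Hypothetical record for one electronic chat message."""
    sender: str
    sent_at: datetime
    text: str


def split_into_conversations(
    messages: List[ChatMessage],
    factor: float = 3.0,             # assumed multiplier on the median pause
    min_gap_seconds: float = 300.0,  # assumed floor so rapid chats are not over-split
) -> List[List[ChatMessage]]:
    """Split a chat wherever the pause is long relative to this chat's typical pause."""
    ordered = sorted(messages, key=lambda m: m.sent_at)
    if not ordered:
        return []
    # Pause (in seconds) before each message except the first.
    gaps = [
        (later.sent_at - earlier.sent_at).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    # The threshold adapts to the chat: it scales with the median pause.
    threshold = max(factor * median(gaps), min_gap_seconds) if gaps else min_gap_seconds
    conversations = [[ordered[0]]]
    for gap, message in zip(gaps, ordered[1:]):
        if gap > threshold:
            conversations.append([message])    # long pause: start a new conversation
        else:
            conversations[-1].append(message)  # short pause: same conversation
    return conversations


def to_documents(conversations: List[List[ChatMessage]]) -> List[str]:
    """Render each conversation as its own text document (one string per conversation)."""
    return [
        "\n".join(f"{m.sent_at.isoformat()} {m.sender}: {m.text}" for m in conversation)
        for conversation in conversations
    ]


if __name__ == "__main__":
    start = datetime(2022, 7, 8, 9, 0)
    chat = [
        ChatMessage("alice", start + timedelta(minutes=offset), f"message {i}")
        for i, offset in enumerate((0, 1, 2, 180, 181))
    ]
    parts = split_into_conversations(chat)
    print([len(p) for p in parts])  # prints [3, 2]
```

In this sketch the three-hour pause is the only gap above the threshold, so the five messages fall into two conversations, and to_documents yields one text document for each.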

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202163220391P 2021-07-09 2021-07-09
US17/389,187 US11595337B2 (en) 2021-07-09 2021-07-29 System and method for electronic chat production
US17/389,190 US11700224B2 (en) 2021-07-09 2021-07-29 System and method for electronic chat production
US17/389,194 US12314658B2 (en) 2021-07-09 2021-07-29 System and method for electronic chat production
PCT/US2022/036537 WO2023283433A1 (fr) 2021-07-09 2022-07-08 System and method for electronic chat production

Publications (1)

Publication Number Publication Date
EP4367847A1 (fr) 2024-05-15

Family

ID=84801012

Family Applications (1)

Application Number Title Priority Date Filing Date
EP22838474.9A Withdrawn EP4367847A1 (fr) System and method for electronic chat production 2021-07-09 2022-07-08

Country Status (2)

Country Link
EP (1) EP4367847A1 (fr)
WO (1) WO2023283433A1 (fr)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160019565A1 (en) * 2014-07-15 2016-01-21 International Business Machines Corporation Predicting the business impact of tweet conversations
US11568231B2 (en) * 2017-12-08 2023-01-31 Raytheon Bbn Technologies Corp. Waypoint detection for a contact center analysis system
US10606954B2 (en) * 2018-02-15 2020-03-31 International Business Machines Corporation Topic kernelization for real-time conversation data
US11677705B2 (en) * 2019-04-23 2023-06-13 International Business Machines Corporation Enriched message embedding for conversation deinterleaving
US11095577B2 (en) * 2019-07-01 2021-08-17 Open Text Corporation Conversation-enabled document system and method
WO2021222455A1 (fr) * 2020-04-28 2021-11-04 Open Text Holdings, Inc. Systems and methods for identifying conversation roles

Also Published As

Publication number Publication date
WO2023283433A1 (fr) 2023-01-12

Similar Documents

Publication Publication Date Title
US10459971B2 (en) Method and apparatus of generating image characteristic representation of query, and image search method and apparatus
US12177178B2 (en) System and method for electronic chat production
US20250252255A1 (en) System and Method for Electronic Chat Production
US12341741B2 (en) System and method for electronic chat production
CN103154991A (zh) Credit risk collection
Rosa et al. Twitter topic fuzzy fingerprints
EP4046054A1 (fr) Automatic summarization of transcripts
Lawless et al. Cluster explanation via polyhedral descriptions
Ma et al. Social media event prediction using DNN with feedback mechanism
Paramesh et al. A Deep Learning based IT Service Desk Ticket Classifier Using CNN.
Almeida et al. Filtering spams using the minimum description length principle
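CN111104422A (zh) Training method, apparatus, device, and storage medium for a data recommendation model
CN120654082A (zh) Development method for a deep-learning-based intelligent public opinion analysis and prediction system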
Roelands et al. Classifying businesses by economic activity using web-based text mining
CN120067455A (zh) Patent information push management system and method
Kotha A Study on the Impact of Preprocessing Steps on Machine Learning Model Fairness
Trivedi et al. A modified content-based evolutionary approach to identify unsolicited emails
Sarwar et al. Revolutionizing Business Intelligence with AI Insights and Strategies
EP4367847A1 (fr) System and method for electronic chat production
AU2020343118B2 (en) A text classification method
Memari Predicting the Stock Market Using News Sentiment Analysis
Ma Text classification on imbalanced data: Application to Systematic Reviews Automation
Trabelsi et al. A probabilistic approach for events identification from social media RSS feeds
US20250272495A1 (en) Sentiment-based zoom control at a user interface
US20250272503A1 (en) Distilled generative ai-based topic & sentiment modeling

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20240207

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20250201