WO2017194214A1 - System for retrieving privacy-filtered information from transaction data - Google Patents

System for retrieving privacy-filtered information from transaction data

Info

Publication number
WO2017194214A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
transaction data
privacy
query
raw
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/EP2017/053921
Other languages
English (en)
Inventor
Barak Chizi
Jeroen D'HAEN
Tomas KLINGER
Laura NURSKI
Johan THIJS
Erik VAN GOOLEN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
KBC Group NV
Original Assignee
KBC Group NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by KBC Group NV
Priority to US16/301,241 (US10798066B2)
Priority to EP17705439.2A (EP3455816A1)
Publication of WO2017194214A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Definitions

  • the invention relates to a system for retrieving privacy-filtered information from transaction data and other data sources.
  • In today's globally networked society, person-specific data is created at a dazzling pace. Part of this data is transaction data. Transaction data comprises, amongst others, financial transaction data, which is confidential by nature and therefore cannot be made publicly available in its raw, original form. A known solution to guarantee confidentiality is to report only aggregate statistical information at a very general level. This, however, eliminates much of the potential the data offers. Producing anonymous data that remains specific enough to be useful is often very difficult, and practice today tends either to incorrectly assume that confidentiality is maintained when it is not, or to produce data that is practically useless.
  • US 7,269,578 discloses a concept for anonymizing data according to an anonymity criterion called k-anonymity. Given person-specific data organized in fields and entries, the data is said to have the k-anonymity property if the information for each person contained in the data cannot be distinguished from at least k-1 individuals whose information also appears in the data set.
  • the concept disclosed in US 7,269,578 is however limited to the anonymization of data, without consideration of the further processing nor the specific nature of the data.
  • the present invention is not limited to anonymizing data and includes several other key aspects concerned with the processing of the data.
  • the concept disclosed in this document is adapted for data of a specific nature, i.e. transaction data and data relating to it.
  • US 8,626,705 discloses a concept for determining aggregated transaction level data for specific group characteristics. The method involves an aggregator server determining a plurality of aggregates from parsed transaction data. The concept disclosed in US 8,626,705 is limited to the assembly of aggregates and does not tackle the problem of anonymity. In contrast, the invention disclosed in this document provides a systematic approach to aggregation that incorporates anonymity as an integral part, both through tokenization (local anonymization) and through a complementary anonymization step that takes into account the entire transaction data aggregate.
  • US 2014/0089041 discloses an apparatus for identifying misclassified customers in a customer database. The apparatus may include a receiver configured to receive information corresponding to a plurality of customers and information corresponding to a plurality of transactions.
  • the apparatus may additionally include a processor configured to calculate a mean transaction value and a standard deviation from the mean transaction value, wherein the mean transaction value is calculated using the plurality of transactions.
  • WO 2010/141270 discloses a system and method to summarize transaction data via cluster analysis and factor analysis.
  • a method includes identifying at least one set of clusters based on a cluster analysis of transaction records to group entities, identifying a plurality of factors based on a factor analysis of the transaction records to reduce correlations in spending variables, classifying an entity according to the at least one set of clusters, and computing values of the factors based on the transaction records of the entity.
  • the concept disclosed in WO 2010/141270 is limited to the aggregation of transaction data and does not address the problem of anonymity. In contrast, the present invention systematically incorporates anonymity in its modus operandi.
  • the invention aims to provide a method to derive tailored transaction data from raw transaction data, addressing both the need for aggregation and the need for anonymization. Furthermore, the invention aims to provide a computing system, a tailored transaction data product and a computer program product relating to said method. Relatedly, the invention aims to provide a computing system for obtaining a privacy-filtered response to a query of a user.

Summary of the invention
  • the present invention relates to a computing system for obtaining a privacy-filtered response to a query of a system user, the computing system comprising
  • the server comprising a server processor, tangible non-volatile server memory, server program code present on said server memory for instructing said server processor;
  • the computer-readable medium comprising a database, said database comprising privacy settings comprising a privacy threshold;
  • said device comprising a device processor, tangible non-volatile device memory, device program code present on said device memory for instructing said device processor;
  • said server is configured for receiving raw transaction data from an external source such as a raw transaction database or a raw transaction feed, said raw transaction data comprising a plurality of raw transactions associated with a plurality of users, wherein said server is configured for receiving said query of said system user via said device; said computing system carrying out a method for obtaining said privacy-filtered response to said query of said system user, said query relating to a company comprising one or more stores, said company relating to a plurality of products/services offered to one or more users via said one or more stores, at least one of said plurality of products/services relating to one or more brands, said query comprising query-related information such as a store name or a brand name, said method comprising the steps of: receiving said query from said system user via said device, said query relating to at least one store and/or at least one brand; querying said raw transaction
  • step (D) returning said response to said system user via said device; characterized in that, said database comprises business data, said business data comprising user information and/or company information; in that said response concerns said privacy-filtered response; and in that said processing in step (C) comprises the steps of (C.1) extending said raw query results with extension data based at least on said business data and preferably based on said query-related information, obtaining enriched transaction data;
  • the system user is the person or entity providing the query to the system
  • users are the persons or entities associated with raw transactions present in the raw transaction data.
  • the system maintains a database containing additional business data, which may in itself be confidential. Therefore, it might be advantageous to use this business data without revealing it entirely to the system user.
  • the privacy filtering is done "as late as possible", and "in one run", corresponding to step (C.2).
  • said one run may comprise one or more consecutive steps. For example, store names, brand names and other data such as age or average income can still be associated to an individual user if desired just before privacy filtering takes place.
  • the minimal number of users to which the more general entry applies is linked directly or indirectly to the privacy threshold; in a preferred embodiment described below this concerns a parameter relating to k-anonymity and/or t-closeness.
  • although the privacy-filtered response no longer contains references to individual users, the entries associated with specific fields such as stores and brands will be calculated more accurately than in a system where the privacy filtering is done partly or entirely before the extending of data, for instance by providing transaction data that is already privacy-filtered beforehand.
  • a related advantage of the present system is that the level of detail that is preserved in the privacy-filtered response is in itself adjustable, via said privacy threshold.
  • a low privacy threshold corresponds to a lower level of privacy and a higher level of detail
  • a high privacy threshold corresponds to a higher level of privacy and a lower level of detail.
  • the privacy threshold concerns a set of two or more values
  • at least one of said two or more values corresponds to a lower level of privacy and to a higher level of detail if it is set low, and to a higher level of privacy and a lower level of detail if it is set high.
  • the privacy-filtered response adheres to t-closeness with parameter value t if all aggregates considered in the privacy-filtered response adhere to t-closeness.
  • an aggregate adheres to t-closeness with parameter value t if the distance between the distribution of a sensitive field in the aggregate and the sensitive feature in the whole dataset is not larger than t.
  • the distance metric used for measuring distance can be chosen appropriately for the dataset at hand, for instance the earth mover's distance metric.
  • the whole dataset concerns the entire combination of all raw transaction data and all business data; in an alternative embodiment, the whole dataset is only a subset of said entire combination. Taking into account the properties of parameters k and t, a relation can be chosen between the privacy threshold and said parameters.
  • the privacy threshold comprises two distinct values equal to the parameter k and the parameter t, respectively.
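The combined k and t test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the described system: the names `PrivacyThreshold`, `distribution`, `ordered_emd` and `passes` are hypothetical, each input value is assumed to belong to one distinct user, and the earth mover's distance is computed for a sensitive field with ordered categories using the usual normalised cumulative-difference form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivacyThreshold:
    """Hypothetical container for the two threshold values."""
    k: int    # minimum number of users per aggregate (k-anonymity)
    t: float  # maximum distance to the whole-dataset distribution (t-closeness)


def distribution(values, ordered_categories):
    """Relative frequency of each category, in the given order."""
    total = len(values)
    return [sum(1 for v in values if v == c) / total for c in ordered_categories]


def ordered_emd(p, q):
    """Earth mover's distance between two distributions over the same
    ordered categories: sum of absolute cumulative differences / (m - 1)."""
    m = len(p)
    if m < 2:
        return 0.0
    running, cost = 0.0, 0.0
    for p_i, q_i in zip(p, q):
        running += p_i - q_i
        cost += abs(running)
    return cost / (m - 1)


def passes(aggregate_values, dataset_values, ordered_categories, threshold):
    """True if the aggregate has at least k values (one per user, by assumption)
    and its distribution lies within distance t of the whole dataset."""
    if len(aggregate_values) < threshold.k:
        return False
    p = distribution(aggregate_values, ordered_categories)
    q = distribution(dataset_values, ordered_categories)
    return ordered_emd(p, q) <= threshold.t
```

In this sketch a larger k or a smaller t both make the test stricter; how the single "privacy threshold" of the text maps onto the two parameters is left open, as in the description.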
  • said query relates to a store-specific selection of raw transactions relevant to at least one store; whereby said business data comprises demographic data and/or financial data and/or profile data and/or habit data relating to users associated with raw transactions belonging to said store-specific selection.
  • said query relates to a brand-specific selection of raw transactions relevant to at least one brand; and in that said business data comprises demographic data and/or financial data and/or profile data and/or habit data relating to users associated with raw transactions belonging to said brand-specific selection.
  • This provides the system user with insights on the activities and the market share relating to a specific brand or group of brands.
  • a related example involving a market insights module is discussed below. Thereby, reference is made to Figure 20.
  • said filtering in step (C.2) comprises the steps of
  • step (iv) performing an anonymity test for each of said threshold-tested aggregates and obtaining anonymity-tested transaction data, moving to step (v) if negative and jumping to step (vi) in the opposite case;
  • step (v) reducing the level of detail of said anonymity-tested transaction data and obtaining aggregation-ready transaction data, jumping to step (ii);
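The loop formed by the anonymity test and the reduction of detail can be sketched as below. Only steps (iv) and (v) are reproduced in the text above, so everything else here is an assumption made for illustration: the row layout, the hypothetical `generalise_age` ladder, the use of a distinct-user count per aggregate as the anonymity test, and the decision to withhold failing aggregates once no further generalisation is possible.

```python
MAX_LEVEL = 2  # hypothetical number of generalisation steps available


def generalise_age(age, level):
    """Hypothetical generalisation ladder for a single quasi-identifier."""
    if level == 0:
        return str(age)                # exact age
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"      # ten-year band
    return "*"                         # fully generalised


def privacy_filter(rows, k):
    """rows: list of (user_token, store, age) tuples.

    Returns (level_of_detail_used, {aggregate_key: distinct_user_count}).
    """
    level = 0
    while True:
        # Aggregate at the current level of detail, counting distinct users
        # so that repeated transactions of one user are not counted twice.
        users_per_group = {}
        for user, store, age in rows:
            key = (store, generalise_age(age, level))
            users_per_group.setdefault(key, set()).add(user)
        counts = {key: len(users) for key, users in users_per_group.items()}

        # Anonymity test: every aggregate must cover at least k users.
        if all(n >= k for n in counts.values()):
            return level, counts
        if level == MAX_LEVEL:
            # Nothing left to generalise: withhold the failing aggregates.
            return level, {key: n for key, n in counts.items() if n >= k}

        # Test failed: reduce the level of detail and aggregate again.
        level += 1
```

With six users aged 21 to 38 in one store and k = 3, exact ages fail the test, while ten-year bands pass, so the response is returned at the coarser level.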
  • said system is further configured for generating a visualization belonging to said privacy-filtered response, said visualization comprising a comparison with respect to two or more fields comprised in said privacy-filtered response.
  • said system further comprises a web interface comprising a graphical user interface (GUI) for display to said system user via said device.
  • the server comprising a server processor, tangible non-volatile server memory, server program code present on said server memory for instructing said server processor;
  • said device comprising said computer program product, said computer program product comprising at least one computer-readable medium comprising computer-readable program portions, said program portions containing instructions for executing a device method for obtaining said privacy-filtered response to said query of said system user, said query relating to a company comprising one or more stores, said company relating to a plurality of products/services offered to one or more users via said one or more stores, at least one of said plurality of products/services relating to one or more brands, said query comprising query-related information such as a store name or a brand name, said device method comprising the steps of: (01) receiving said query from said system user via said device, said query relating to at least one store and/or at least one brand;
  • step (04) receiving a response on said device from said server; characterized in that, said database comprises business data, said business data comprising user information and/or company information; in that said response concerns said privacy-filtered response; and in that said processing in step (03) comprises the steps of
  • the querying of the data does not take place on the raw transaction data. Instead, the raw transaction data is first extended by means of said business data, obtaining a first data set. The query is only executed on this first data set, obtaining a second data set. This second data set is then fed to the privacy-filtering step. Also in this alternative embodiment, privacy filtering takes place in a single run and "as late as possible".
  • Figure 1 illustrates a first embodiment relating to the present invention.
  • Figure 2 shows a second embodiment relating to aspects of the present invention.
  • Figure 3 shows a third embodiment relating to aspects of the present invention.
  • Figure 4 shows a fourth embodiment relating to aspects of the present invention.
  • Figure 5 shows a fifth embodiment relating to aspects of the present invention.
  • Figure 6 shows a sixth embodiment relating to aspects of the present invention.
  • Figure 7 shows a seventh embodiment relating to aspects of the present invention.
  • Figure 8 shows an eighth embodiment relating to aspects of the present invention.
  • Figure 9 shows a first view of a ninth embodiment relating to aspects of the present invention.
  • Figure 10 shows a second view of the ninth embodiment relating to aspects of the present invention.
  • Figure 11 shows a tenth embodiment relating to aspects of the present invention.
  • Figure 12 shows an eleventh embodiment relating to aspects of the present invention.
  • Figure 13 shows a twelfth embodiment relating to aspects of the present invention.
  • Figure 14 shows a thirteenth embodiment relating to aspects of the present invention.
  • Figure 15 shows a fourteenth embodiment relating to aspects of the present invention.
  • Figure 16 shows a fifteenth embodiment relating to aspects of the present invention.
  • Figure 17 shows a sixteenth embodiment relating to aspects of the present invention.
  • Figure 18 shows a seventeenth embodiment relating to aspects of the present invention.
  • Figure 19 shows an eighteenth embodiment relating to aspects of the present invention.
  • Figure 20 shows a nineteenth embodiment relating to aspects of the present invention.
  • a compartment refers to one or more than one compartment.
  • the value to which the modifier "about" refers is itself also specifically disclosed.
  • business insight relates to quantitative and/or qualitative observations that are provided by the system in response to the query of the system user.
  • said privacy-filtered response may comprise one or more business insights.
  • The term "level-of-interest (LOI) entity" is interchangeable with "aggregate" and "microaggregate".
  • The terms "field" and "attribute" are interchangeable.
  • The term "entity" refers to the content of a field, whereby said content may be a numerical value but also another type of value such as a Boolean variable or a character string.
  • The terms "privacy filtering", "privacy mechanism" and "anonymity concept" are interchangeable.
  • The term "client" may refer to either a system user or a user, depending on the context.
  • "Client" may for instance refer to the role of a customer in a store.
  • "Client" may for instance refer to the role of a system provider client using the system that is provided by a system provider.
  • The term "customer" may refer to a user, for instance in the role of a customer visiting a store.
  • The terms "business data" and "other data" are interchangeable in this document, and relate to the concept of a "digital channel".
  • % by weight refers to the relative weight of the respective component based on the overall weight of the formulation.
  • Data anonymity encompasses several problems.
  • One of the problems is that one cannot judge the anonymity of data in a simple intuitive way. In some instances it may be simple, e.g. when a bank account number is contained within the data. In such a case, any person with inside knowledge (e.g., a staff member of a store) may look for further information on a specific user that is a customer of that store.
  • a zip code may identify individuals since it may be almost unique or even unique within a given context, e.g. for a single store.
  • the unique feature, such as the zip code, may not be known beforehand, and depends on the context. It could be based on some detail or combination of details available to a person with inside knowledge, or on knowledge about the data source obtained from some other source.
  • tokenizing e.g. a bank account number makes identifying individuals more difficult, but still cannot guarantee the data are anonymous. If a person with inside knowledge (e.g., a staff member of a shop where several transactions took place) has access to the tokenized version of the bank account number together with a plurality of transactions associated with the bank account number, then said person may identify individual users by the combination of transactions (e.g. articles bought). In one embodiment of the present invention, this problem is circumvented by letting more than one token correspond with a single bank account number.
  • Determining an appropriate bin size to realize anonymity is not a simple task. It depends on the frequencies of characteristics found within the data as well as within other sources that can be used for re-identification. In addition, the motivation and effort required to re-identify released data must be considered in cases where virtually all possible users can be identified. For example, if data are released that map each transaction to ten possible users, and the ten users can be identified, then all ten users may be contacted or visited in an effort to locate the actual user. Likewise, if the mapping is 1 in 100, all 100 could be phoned because visits may be impractical, and if the mapping is 1 in 1,000, a direct mail campaign could be employed. The amount of effort the recipient is willing to spend depends on their motivation.
  • this effect is countered amongst others by choosing the threshold parameter in the threshold-test in step (C) sufficiently large.
  • this effect is countered by setting a privacy threshold sufficiently large by setting the parameter k and/or the parameter t to appropriate values.
  • CDR call detail records
  • xDR Call/Transaction/Session
  • the invention allows system users to extract business insights from the transactional and other data sources with queries.
  • An example of such a query is "what is the profile of males between 20-25 in store A". This leads to potential privacy issues and concerns. If queries are too specific, individuals can be singled out. This occurs when a set of features is combined in such a way that it can only point to a specific individual user. An extreme case is a query that provides the national identification number of an individual user, so that this individual user can be uniquely identified. Identifying features such as social security numbers, names, addresses, etc. are called key attributes or key fields. Key attributes should always be deleted and/or filtered from datasets as they pose clear privacy concerns.
  • Besides key attributes, data may contain quasi-identifiers and sensitive attributes.
  • Quasi-identifiers are features such as zip code, gender, age, etc.
  • Sensitive attributes are features such as medical data or income. Quasi-identifiers pose a privacy issue when their combination allows the singling out of an individual user.
  • a combination of quasi-identifiers creates groups. If a group contains only a single user, then the sensitive attributes of that group can be uniquely linked to a single user.
  • k-anonymity is applied to prevent the possibility of singling out a single user, preferably in combination with t-closeness.
  • Each group or aggregate of individual users, regardless of the definition of the group or aggregate, should relate to at least k individual users.
  • k-anonymity is ensured by means of microaggregation, possibly in combination with t-closeness.
  • Microaggregation consists of two steps: partitioning and aggregation. Partitioning groups the raw microdata into clusters based on specific quasi identifiers.
  • Aggregation computes a value for every other (sensitive) attribute and replaces the original value with its aggregated value.
  • This could be, for example, the mean or median (e.g. median income per store) for numeric data and the mode or proportion for categorical data (e.g. proportion of singles per store).
  • Applied to the query example "what is the profile of males between 20-25 in store A", one could, for example, get the following business insights: "Males between 20-25 in store A have an average income of 2 400 euro, 12 % are single and they spent 23 euro on average."
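The two microaggregation steps, partitioning on quasi-identifiers and aggregating the sensitive attributes, can be sketched as follows. The record layout, the field names and all values are invented for illustration, each record is assumed to describe one distinct user, and dropping partitions with fewer than k users is one possible way of enforcing k-anonymity rather than the method prescribed by the text.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical enriched records (one per user); every value is made up.
RECORDS = [
    {"store": "A", "gender": "M", "age_band": "20-25", "income": 2300, "single": True,  "spend": 20.0},
    {"store": "A", "gender": "M", "age_band": "20-25", "income": 2400, "single": False, "spend": 23.0},
    {"store": "A", "gender": "M", "age_band": "20-25", "income": 2500, "single": False, "spend": 26.0},
    {"store": "A", "gender": "F", "age_band": "20-25", "income": 2600, "single": True,  "spend": 31.0},
]


def microaggregate(records, quasi_identifiers, k):
    """Partition on the quasi-identifiers, then aggregate each partition.

    Partitions with fewer than k users are dropped rather than reported.
    """
    # Step 1: partitioning.
    partitions = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        partitions[key].append(rec)

    # Step 2: aggregation of the sensitive attributes.
    result = {}
    for key, group in partitions.items():
        if len(group) < k:
            continue  # too few users to satisfy k-anonymity
        result[key] = {
            "users": len(group),
            "mean_income": mean(r["income"] for r in group),            # numeric: mean
            "share_single": sum(r["single"] for r in group) / len(group),  # categorical: proportion
            "mean_spend": mean(r["spend"] for r in group),
        }
    return result


insights = microaggregate(RECORDS, ("store", "gender", "age_band"), k=3)
```

With k = 3, only the partition of males aged 20-25 in store A is reported; the single female record forms a partition of one user and is withheld.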
  • the system takes into account that the more specific a query gets, the smaller each outcome group or cluster will be. Therefore, the system is provided with a privacy mechanism.
  • the privacy mechanism may include any or any combination of k-anonymity, t-closeness, k-concealment and possibly other privacy mechanisms.
  • k-anonymity prevents that insights on groups with less than k individual users are presented. Yet, queries on other groups can be compared to draw conclusions on the omitted group. This is especially clear when considering the example illustrated in Figure 17 and discussed below. Accordingly, in a preferred embodiment of the system, microaggregation and/or aggregation is performed with a k-anonymity check across subgroups of the same group, e.g. a group corresponding to a company branch, to prevent probabilistic conclusions.
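The risk of drawing conclusions about an omitted subgroup by comparing the remaining ones can be illustrated with a small sketch. It does not reproduce the example of Figure 17; it applies complementary suppression, a general disclosure-control technique chosen here for illustration. The function name, the subgroup labels and the counts are hypothetical, and the subgroups are assumed to be disjoint and to be published alongside the group total.

```python
def publishable_subgroups(user_counts, k):
    """user_counts: subgroup name -> number of distinct users (disjoint subgroups).

    Returns the set of subgroups that may be reported next to the group total.
    """
    visible = {name for name, n in user_counts.items() if n >= k}
    hidden = set(user_counts) - visible
    hidden_users = sum(user_counts[name] for name in hidden)

    # If the withheld subgroups together cover fewer than k users, subtracting
    # the visible subgroups from the group total would expose them. Withhold
    # the smallest visible subgroup as well so the remainder covers >= k users.
    if hidden and hidden_users < k and visible:
        smallest = min(visible, key=lambda name: (user_counts[name], name))
        visible.remove(smallest)
    return visible
```

For counts of 2, 40 and 55 users and k = 5, the subgroup of 2 users is withheld, and so is the subgroup of 40, because otherwise the total minus the two reported subgroups would reveal the size of the withheld one.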
  • privacy filtering relates to t-closeness, which may or may not be considered in combination with k-anonymity. This is motivated by the fact that a k-anonymity check alone is not sufficient to prevent probabilistic conclusions.
  • the distribution of sensitive information can provide important information as well. Important is to what degree the distribution within a microaggregated group differs from the population. For example, if in a group 50% of the people have a specific disease, it does not seem to provide any sensitive information concerning that group. However, if in the overall population 99.99 % of the people do not have the disease, the 50 % is of course highly informative. As such, each microaggregated distribution should be compared to the distribution of the whole data.
  • This is called the t-closeness principle.
  • Each group is considered to have t-closeness if the distance between a sensitive feature in the group and the sensitive feature in the whole dataset is not larger than t.
  • the way to calculate this distance depends on the case at hand and the features.
  • the value of t, and for that matter the value of k as well, is a trade-off between privacy on the one hand and utility on the other. In the extreme case where no difference is allowed between the group feature distribution and the whole-data feature distribution, no utility, i.e. no business insight, is retained.
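The disease example above can be worked through numerically. Since the text leaves the choice of distance open, this sketch assumes the total variation distance, which is only one possible choice; the function names and the value of t are illustrative.

```python
def total_variation(p, q):
    """Half the L1 distance between two distributions over the same outcomes."""
    return 0.5 * sum(abs(p[x] - q[x]) for x in p)


def satisfies_t_closeness(group, population, t):
    """True if the group distribution is within distance t of the population."""
    return total_variation(group, population) <= t


# Whole dataset: 99.99 % of the people do not have the disease.
population = {"disease": 0.0001, "no_disease": 0.9999}
# Microaggregated group: 50 % of the people have the disease.
group = {"disease": 0.5, "no_disease": 0.5}

distance = total_variation(group, population)  # about 0.4999
```

Under this metric the group lies almost half the maximum possible distance from the population, so it would be rejected for any moderate t such as 0.2. Setting t to 0 accepts only groups whose distribution equals that of the whole dataset, which is the extreme case in which no business insight is retained.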
  • the system comprises one or more modules.
  • the modules can be configured for running in any or any combination of the following operational modes: 1. As an internal app for system users such as branch workers, to be able to provide insights to bank clients.
  • the system comprises a customer profiling module.
  • the users concern customers, and the system provides intelligence on the customers of a specific company.
  • the customer profiling module can run one-off analyses, a batch mode or real-time streaming.
  • Said customer profiling module preferably allows the system user to set up his/her branches/franchises for which he/she wants to receive the information or for which he/she wants a comparison.
  • the module preferably returns metrics on the branch performance and/or customer profiles and/or customer loyalty.
  • the system user can use this information for marketing strategy, performance monitoring or campaign monitoring purposes. For example, the system user might want to see the evolution in sales or the customer base over time in one or more of his/her branches/franchises.
  • the system considers the transactions of all the shop's customers, enriches it with their profiles such as age, purchasing power, area they are from or the products they have and runs the privacy checks. If those are OK, the system may return the insights to the system user. Furthermore, in a preferred embodiment, if the system user wants to see profiles of individual groups of shopping users (e.g. the loyal ones, the wealthy ones or the high-spenders), the system considers the transactions of all the shop's clients, enriches it with their profiles, microaggregates them into groups and runs the privacy checks. In an alternative embodiment, the system user may consider the information on what types of shops his/her customers usually go to or split its customer base according to shopping behavior.
  • POI profiles are created by categorizing the POIs into higher level categories according to a categorization method and calculating shopping profiles on a customer basis. Subsequently, the system may follow its standard path, enriching the transactions in the given store of interest, (micro)aggregating and running the privacy checks. After those checks are passed, the insight may be provided to the system user.
  • An example of a detailed schema of said customer profiling module is given in Figure 19 and is discussed below.
  • the system comprises a market insight module.
  • a market insight module allows the system user to obtain intelligence about the market environment. In a preferred embodiment, this may run in a one-off analysis and/or a batch mode and/or real-time streaming.
  • a key matter is the definition of the market which depends on good categorization of all the POIs. For example, the system user might be interested in comparison of his POI to the competition or in seeing his total share of consumer budget. In this case all the transactions are taken into account and after the enrichment and privacy check, they can be aggregated to the POI level. Using the additional POI characteristics, the POIs are correctly categorized.
  • the insights are not created only for his/her own POI but also for the whole reference group (e.g. all the POIs of the same category) so that the system can make comparisons between these two.
  • the system user may get an insight about his customer base being younger or wealthier than the customer base of the whole industry.
  • the market insight module is configured such that the privacy-filtered response can be further split on interesting profiles, e.g. the system user may be interested in how the characteristics of his most loyal customers compare with the most loyal customers of the competition.
  • An example of a detailed schema of said market insight module is given in Figure 20 and is discussed below.
  • a method for deriving tailored transaction data from raw transaction data capable of aggregating as well as anonymizing the transaction data in a single unified approach.
  • the present invention provides a method for deriving tailored transaction data from raw transaction data, wherein said raw transaction data comprises a plurality of raw transactions associated with a plurality of users and is organized in fields and entries, said method comprising the steps of
  • step (d) performing an anonymity test for each of said threshold-tested aggregates and obtaining anonymity-tested transaction data, moving to step (e) if negative and jumping to step (f) in the opposite case;
  • this data representation is a relational SQL database.
  • this data representation may be a plurality of simple tables with a number of rows and columns, whereby each row is associated with a single instance, e.g. a single user or a single transaction, whereby each column is associated with a single field representing a distinct feature with feature name, e.g.
  • said raw transaction data may be provided either in batch or in the form of a stream or feed.
  • the raw transaction data may concern for instance a database file or a static database that is available as input to be used.
  • the raw transaction data may be delivered one by one as input to be used, e.g. after transfer over a network.
  • said tokenizing in step (a) is aimed at excluding all identifying data, i.e. data that could identify an individual entity.
  • said tokenizing in step (i) is aimed at this.
  • this may concern a bank account number.
  • Tokenizing said identifying data is done by obfuscating the original data, in particular the entries for which the associated field is known to be privacy-sensitive and is therefore preferably anonymized.
  • tokenizing is done by means of a hashing algorithm that converts original raw entries associated with privacy-sensitive fields into tokens which cannot be easily connected to the original entries, unless by means of a token index.
  • the hashing algorithm's functioning is such that two tokens generated by the hashing algorithm are non-identical unless the original entries are identical.
  • two tokens generated by the hashing algorithm are always non-identical.
  • the hashing algorithm is associated with a token index or hashing table which allows two-way conversion from original raw entry to token and vice versa, whereby said hashing table or token index is confidential and is not available publicly.
  • Said tokenization has the effect of promoting the anonymity of users. Indeed, by tokenizing privacy-sensitive entries present in the raw transaction data, a data set is obtained which no longer contains explicit direct references to individuals. While full anonymity comprises more than only tokenization, necessitating a separate anonymizing step (d) and/or step (iv), tokenization does yield a data set that is less privacy-sensitive.
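The tokenization described above can be illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA256) as the hashing algorithm and a plain dictionary as the confidential token index; the source names neither, and the key, field names and account values below are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored confidentially.
SECRET_KEY = b"replace-with-a-confidential-key"


def tokenize(entry: str, token_index: dict) -> str:
    """Return a token for a privacy-sensitive entry and record it in the index."""
    # Identical entries always give identical tokens; distinct entries give
    # distinct tokens, barring hash collisions.
    token = hmac.new(SECRET_KEY, entry.encode("utf-8"), hashlib.sha256).hexdigest()
    # The index allows conversion from token back to entry and must stay confidential.
    token_index[token] = entry
    return token


token_index = {}
raw = [
    {"account": "ACC-001", "amount": 12.50},  # made-up example rows
    {"account": "ACC-002", "amount": 40.00},
    {"account": "ACC-001", "amount": 7.25},
]
# The privacy-sensitive field is replaced by its token in the output.
tokenized = [
    {"token": tokenize(row["account"], token_index), "amount": row["amount"]}
    for row in raw
]
```

Because the same account always maps to the same token, transactions of one user remain linkable to each other without exposing the account number.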
  • step (b) comprises aggregating the transactions to make an abstraction of individual information.
  • step (ii) comprises aggregating the transactions to make an abstraction of individual information.
  • the raw transaction data is aggregated so that data belongs to one or more aggregates.
  • the aggregates are chosen such that they correspond to a certain point-of-interest level.
  • this may e.g. be a store, a chain of stores, a group or a sector.
  • This step is important because it helps structure the transaction data, grouping the data according to aggregates that are meaningful to the recipients of the tailored transaction data. Further, it is an important pre-processing step with respect to the threshold test performed in step (c) and/or step (iii).
  • Step (c) comprises a threshold test using a predefined threshold specifying a minimum number of users per aggregate.
  • step (iii) comprises a threshold test using a predefined threshold specifying a minimum number of users per aggregate. Specifically, the test verifies for each aggregate whether the number of users in an aggregate exceeds some value N, keeping only aggregates for which this is the case, and discarding aggregates for which this threshold value is not reached.
  • the different aggregates identify different stores, and the transaction data of a store is discarded if the number of unique customers visiting that store is too low. This has a beneficial effect for anonymity, since an excessively low number of users in a given aggregate may allow the identity of individuals to be inferred from the tailored transaction data, which is to be prevented for reasons of privacy.
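The aggregation and threshold test can be sketched as follows. This is an illustrative reading, assuming each tokenized transaction carries a `store` and a `token` field (hypothetical names) and that an aggregate is kept only when its number of unique users exceeds N.

```python
from collections import defaultdict


def threshold_test(transactions, n_min):
    """Group transactions per store; keep stores with more than n_min unique users."""
    by_store = defaultdict(list)
    for tx in transactions:
        by_store[tx["store"]].append(tx)
    kept = {}
    for store, txs in by_store.items():
        # Count users, not transactions: repeat visits by one user do not help.
        unique_users = {tx["token"] for tx in txs}
        if len(unique_users) > n_min:
            kept[store] = txs
    return kept


transactions = [
    {"store": "A", "token": "t1", "amount": 10.0},
    {"store": "A", "token": "t2", "amount": 20.0},
    {"store": "A", "token": "t3", "amount": 30.0},
    {"store": "B", "token": "t1", "amount": 5.0},
    {"store": "B", "token": "t1", "amount": 6.0},
]
kept = threshold_test(transactions, n_min=2)
```

Store B has two transactions but only one unique user, so it is discarded, while store A passes with three users.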
  • Step (d) comprises an anonymity test.
  • step (iv) comprises an anonymity test.
  • this concerns the k-anonymity test as is known from literature and cited in this document.
  • this comprises a t-closeness test, possibly combined with k-anonymity.
  • For k-anonymity, a certain value of k is chosen, where the level of attained anonymity increases for increasing k. Possible values for k are between 2 and 100, endpoints included, although larger values are also possible.
  • k is between 3 and 50; more preferably between 5 and 25; even more preferably between 7 and 20; most preferably equal to 9, 10 or 11.
  • the threshold-tested transaction data is tested for level of anonymity according to the k-anonymity concept, implying that a value of each entry within at least one field of the tested data occurs at least k times, and wherein a value of k is such that entries of the output data source match a specified anonymity requirement.
  • an anonymity test not based on k-anonymity such as k-concealment may be used. If the test is positive, and anonymity is found to be sufficient, an iteration is made to step (f), i.e. the categorization. Likewise, if the test is positive, and anonymity is found to be sufficient, an iteration is made to step (vi), i.e. the categorization.
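A minimal k-anonymity check, in the common formulation where every combination of quasi-identifier values must occur at least k times, might look like the sketch below. The choice of quasi-identifier fields and the sample records are assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in combos.values())


records = [
    {"age_band": "20-39", "gender": "M", "spend": 12.0},
    {"age_band": "20-39", "gender": "M", "spend": 30.0},
    {"age_band": "40-59", "gender": "F", "spend": 18.0},
    {"age_band": "40-59", "gender": "F", "spend": 22.0},
]
passes_k2 = is_k_anonymous(records, ["age_band", "gender"], k=2)
passes_k3 = is_k_anonymous(records, ["age_band", "gender"], k=3)
```

Here each combination occurs exactly twice, so the data passes for k = 2 and fails for k = 3.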
  • step (e) and/or step (v) the level of detail of the anonymity-tested transaction data is reduced.
  • the reduction of the level of detail may comprise any or any combination of the following : suppressing entry values, replacing entry values, changing the bin size (or, related, the feature granularity) of entry values.
  • aggregation-ready transaction data is obtained and fed back to step (b) and/or step (ii). This allows a new cycle of aggregation, threshold testing and anonymity testing, to be repeated until the specified anonymity requirement is met.
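The feedback loop of reducing detail and re-testing can be sketched for a single numeric field. The sketch assumes the level of detail is reduced only by widening the age bin, and that the candidate bin widths are fixed in advance; both are illustrative simplifications.

```python
from collections import Counter


def bin_age(age, width):
    """Map an age to a bin label such as '20-29' for width 10."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_until_anonymous(ages, k, widths=(1, 5, 10, 20)):
    """Widen the age bins step by step until every bin holds at least k records."""
    for width in widths:
        binned = [bin_age(a, width) for a in ages]
        if binned and min(Counter(binned).values()) >= k:
            return width, binned
    # No candidate width met the requirement: release nothing.
    return None, []


ages = [23, 24, 27, 31, 38, 39]
width, binned = generalize_until_anonymous(ages, k=3)
```

With these ages, widths 1 and 5 leave bins holding fewer than three records, and width 10 is the first to satisfy k = 3.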
  • step (a) to (e) realize an advanced method of privacy protection, ensuring that after the aggregation it is no longer possible to identify individual customers by reverse engineering the aggregate statistics.
  • step (i) to (v) realize an advanced method of privacy protection, ensuring that after the aggregation it is no longer possible to identify individual customers by reverse engineering the aggregate statistics.
  • the present invention provides a method for deriving tailored transaction data from raw transaction data, wherein said raw transaction data comprises raw additional data relating to said user and/or the product/service to which said raw transactions relate. This allows enriched data to be obtained, leading to enriched tailored transaction data and improved business insights for recipients of the tailored transaction data.
  • the present invention includes enhanced tokenizing in step (a), comprising exclusion of privacy sensitive data.
  • the present invention includes enhanced tokenizing in step (i), comprising exclusion of privacy sensitive data. This is beneficial for the anonymity of users.
  • said raw transactions are tokenized independently of said raw additional data and wherein said raw additional data is tokenized independently of said raw transactions, obtaining tokenized transactions and tokenized additional data, respectively.
  • said aggregation-ready transaction data is obtained after joining said tokenized transactions with said tokenized additional data, optionally by using said token to link said tokenized transactions to said tokenized additional data. This yields enriched data, which leads to enriched tailored transaction data.
  • summary data is for instance acquired at the store level both from the transaction data itself (e.g. mean value spent) or from the joined data sources (e.g. personal data of customers performing transactions there).
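As a sketch of the enrichment described above, the two sources can be tokenized independently and then joined on the token. The field names, the excluded features and the use of unkeyed SHA-256 are assumptions for brevity; an unkeyed hash of a guessable account number would be weak in practice.

```python
import hashlib

# Hypothetical privacy-sensitive features to exclude from every source.
EXCLUDED = {"first_name", "last_name", "address"}


def token_for(account):
    """Derive a token from the linking key (illustrative unkeyed hash)."""
    return hashlib.sha256(account.encode("utf-8")).hexdigest()


def tokenize_source(rows, key="account", excluded=EXCLUDED):
    """Replace the linking key by its token and drop excluded features."""
    out = []
    for row in rows:
        kept = {f: v for f, v in row.items() if f != key and f not in excluded}
        kept["token"] = token_for(row[key])
        out.append(kept)
    return out


transactions = [{"account": "ACC-001", "store": "A", "amount": 12.5}]
clients = [{"account": "ACC-001", "first_name": "Ann", "last_name": "Example",
            "address": "1 Example Street", "age": 34}]

tok_tx = tokenize_source(transactions)
tok_clients = {c["token"]: c for c in tokenize_source(clients)}
# Join on the token: each transaction gains the client attributes that survived.
enriched = [{**tx, **tok_clients.get(tx["token"], {})} for tx in tok_tx]
```

Because both sources derive the token from the same key in the same way, the join works without either side exposing the account number.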
  • said raw transaction data comprises real-time data, i.e. data that is received real-time over a network. This yields tailored transaction data that is more up to date, and therefore of higher potential business value to the recipients of said tailored transaction data or said privacy-filtered response.
  • said threshold-tested transaction data obtained in step (c) comprises aggregate-linking data.
  • said threshold-tested transaction data obtained in step (iii) comprises aggregate-linking data.
  • said aggregate-linking data is optionally obtained by using said token to link a first portion of said aggregated transaction data belonging to a first provisional aggregate to a second portion of said aggregated transaction data belonging to a second provisional aggregate. This provides for further enrichment of the data, leading to better business insight for the recipient of the tailored transaction data.
  • said rule used in step (f) allows automated categorization.
  • said rule used in step (vi) allows automated categorization.
  • said automated categorization comprising the use of any or any combination or any value derived of the following : external data comprising a merchant category code (MCC), said raw transaction data, said tokenized transactions, said tokenized additional data, said aggregate-linking data, said aggregation-ready transaction data, said provisional aggregate, said threshold-tested transaction data, said threshold-tested aggregate, said anonymity-tested transaction data.
  • said tailored transaction data comprises a characteristic of said users and/or characteristics of the product/service to which a transaction comprised in said aggregated transaction data relates, said characteristic including a segmentation based on a criterion taking into account any or any combination or any value derived of the following : external data comprising a merchant category code (MCC), said raw transaction data, said tokenized transactions, said tokenized additional data, said aggregate-linking data, said aggregation-ready transaction data, said provisional aggregate, said threshold-tested transaction data, said threshold-tested aggregate, said anonymity-tested transaction data, said categorized transaction data, a timestamp of said raw transactions, a sequence of a first and second transaction of a first user and a second user.
  • the present invention provides a computing system according to the present invention comprising a processor, tangible, non-transitory memory and instructions on said memory instructing said processor to execute said method, a display to visualize said tailored transaction data and/or a printer to produce a print-out of said tailored transaction data and/or a storage medium to store an electronic data file comprising said tailored transaction data, whereby said computing system is configured to execute said method.
  • the present invention provides a tailored transaction data product produced by a computing system as explained above, said tailored transaction data product comprising any or any combination of the following : said print-out of said tailored transaction data, said electronic data file comprising said tailored transaction data.
  • the present invention provides a computer program product to execute the methods explained above, whereby said computer program product comprises at least one computer-readable medium comprising computer-readable program portions, whereby said program portions contain instructions for execution of said method.
  • the present invention relates to following points 1 to 13.
  • Method for deriving tailored transaction data from raw transaction data wherein said raw transaction data comprises a plurality of raw transactions associated with a plurality of users and is organized in fields and entries, said method comprising the steps of (a) tokenizing said raw transaction data with a token, obtaining aggregation-ready transaction data;
  • step (d) performing an anonymity test for each of said threshold-tested aggregates and obtaining anonymity-tested transaction data, moving to step (e) if negative and jumping to step (f) in the opposite case;
  • step (e) reducing the level of detail of said anonymity-tested transaction data and obtaining aggregation-ready transaction data, jumping to step (b);
  • Method according to point 2-3 wherein said raw transactions are tokenized independently of said raw additional data and wherein said raw additional data is tokenized independently of said raw transactions, obtaining tokenized transactions and tokenized additional data, respectively;
  • Method according to point 4 wherein said aggregation-ready transaction data is obtained after joining said tokenized transactions with said tokenized additional data, optionally by using said token to link said tokenized transactions to said tokenized additional data.
  • Method according to point 1-5 wherein said raw transaction data comprises real-time data, i.e. data that is received real-time over a network.
  • step (f) allows automated categorization, said automated categorization comprising the use of any or any combination or any value derived of the following : external data comprising a merchant category code (MCC), said raw transaction data, said tokenized transactions, said tokenized additional data, said aggregate-linking data, said aggregation-ready transaction data, said provisional aggregate, said threshold-tested transaction data, said threshold-tested aggregate, said anonymity-tested transaction data.
  • a computer program product to execute a method according to point 10 whereby said computer program product comprises at least one computer-readable medium comprising computer-readable program portions, whereby said program portions contain instructions for execution of said method.
  • Figure 1 illustrates a first embodiment relating to the present invention, in a case where the transactions concern financial transactions. It displays three main aspects summarizing the key aspects of the methods disclosed in this document.
  • a first aspect, "User privacy protection", relates to privacy filtering according to the present invention, such as said filtering in step (C.2) and/or step (i) to (vii) and/or step (a) to (e) according to the present invention.
  • a second aspect, “Categorization of transactions”, relates to step (vi) and/or step (f) according to the present invention.
  • a third aspect, “Generating business insights”, corresponds to the generation of tailored transaction data, and relates to step (vii) and/or step (g) of a method according to the present invention.
  • Figure 2 shows a second embodiment relating to aspects of the present invention. Specifically, said second embodiment relates to a "User privacy protection" aspect of the methods disclosed in this document.
  • the starting point is a plurality of raw financial transactions (1.1).
  • This data is tokenized as to exclude all data that could identify an individual entity (1.2, e.g. a bank account number is tokenized).
  • alternative data sources (1.3 such as client characteristics and product ownerships) are tokenized as well (1.4, e.g. tokenizing bank account number and client number, but also excluding features such as name and address, e.g. street + house number).
  • the alternative data sources are joined to the transactions data to enrich it (1.5).
  • the tokenization entails the first layer of the user privacy protection.
  • the second layer of privacy protection is aggregating the transactions to make an abstraction of individual information.
  • the enriched transaction data is aggregated to a point-of-interest level (1.6, e.g. a store, a chain of stores, a group or a sector - in the rest of the description, the example of store will be used to refer to a point of interest). Only when transactions of more than N unique clients (a predefined threshold) are observed in this store, aggregate statistics are calculated for the store, otherwise the transactions are discarded.
  • a third and final layer of privacy protection validates on our customer base that after the aggregation it is no longer possible to identify individual customers by reverse engineering the aggregate statistics (e.g.
  • summary data is acquired at the store level originating from the transaction data itself (e.g. mean value spent) and/or from the joined data sources (e.g. personal data of customers performing transactions there).
  • Figure 3 shows a third embodiment relating to aspects of the present invention.
  • three layers occur which correspond to the three layers mentioned for the second embodiment.
  • the main difference with the second embodiment is that in the third embodiment, while the anonymity concept may be any or any combination of the concepts mentioned in this document, the preferred anonymity concept is k-anonymity.
  • the anonymity concept is k-anonymity.
  • no preference is given for the anonymity concept.
  • Figure 4 shows a fourth embodiment relating to aspects of the present invention, relating to a "Categorization of transactions" aspect of the methods disclosed in this document.
  • Figure 4 shows a flow of categorization for the fourth embodiment, comprising a specific type of categorization referred to as tagging.
  • tags concern specific details about entities whereas categories that are not tags serve to assign a broad grouping of entities.
  • the fourth embodiment concerns a specific example with stores as point-of-interests. To be able to aggregate to a higher level than a store, the stores are categorized into meaningful groups (mainly based on business purpose of the store but also on other dimensions such as level of luxury or geo-location).
  • LSM living standards measure
  • marital status. It is also possible to identify where the customers usually shop before and after visiting the store.
  • These insights can be provided by performing a tailor-made analysis (3.1) or industrialized by creating a dashboard tool (3.2).
  • abstraction can be made on an industry level and trends can thus be extracted (3.3).
  • Figure 6 shows a sixth embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document.
  • raw transactions enter the system in streaming or batch mode.
  • the other data sources enter the system in batch, as they tend to be of a static nature (e.g., demographics).
  • a hashing algorithm (or any related tokenization algorithm) creates a token based on a primary key that can link multiple data sources (such as bank account number).
  • the other data sources include an exclusion step as well, to remove privacy-sensitive features such as first and last names.
  • Figure 7 shows a seventh embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document. Specifically, Figure 7 illustrates an embodiment of the tokenization operation as carried out on example data.
  • account numbers present in the raw transaction data can be obfuscated by applying a form of tokenization whereby said account numbers are hashed, in this case yielding unique alphanumeric strings that are in a bijective relation with the original raw account numbers.
  • Figure 8 shows an eighth embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document. Specifically, Figure 8 illustrates another embodiment of the tokenization operation as carried out on example data. As illustrated in Figure 8, tokenization may comprise both the hashing of account numbers and the exclusion of certain privacy-sensitive features. In this example, the features "Firstname”, “Lastname” and “Address" are excluded.
  • Figure 9 shows a first view of a ninth embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document.
  • LOI level-of-interest
  • the specific LOI entity is added to the data (e.g., a store identifier is added to each enriched transaction).
  • a gatekeeper groups transactions on LOI and lets them through for further analyses as soon as the number of transactions exceeds a predefined threshold.
  • Summary statistics are calculated on the LOI (e.g., average amount spent, proportion of males).
  • Tokens can be used to track purchase links between different LOI entities (e.g., on average, if a client goes to store A, to which other stores does this client go as well).
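The purchase links mentioned above can be derived from tokens alone. The following sketch computes, for visitors of a given store, the share who also visit each other store; the field names and the definition of a link as a simple co-visit share are assumptions.

```python
from collections import Counter, defaultdict


def co_visits(transactions, store):
    """For clients of `store`, return the share who also visit each other store."""
    stores_by_token = defaultdict(set)
    for tx in transactions:
        stores_by_token[tx["token"]].add(tx["store"])
    # Keep only the store sets of clients seen at the store of interest.
    visitors = [s for s in stores_by_token.values() if store in s]
    if not visitors:
        return {}
    counts = Counter(other for s in visitors for other in s if other != store)
    return {other: n / len(visitors) for other, n in counts.items()}


transactions = [
    {"token": "t1", "store": "A"}, {"token": "t1", "store": "B"},
    {"token": "t2", "store": "A"}, {"token": "t2", "store": "B"},
    {"token": "t2", "store": "C"},
    {"token": "t3", "store": "A"},
    {"token": "t4", "store": "C"},
]
links = co_visits(transactions, "A")
```

In this made-up data, two of the three clients of store A also visit store B, and one also visits store C. Such shares would still pass through the gatekeeper before release.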
  • Figure 10 shows a second view of said ninth embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document.
  • a k-anonymity check is performed. If an aggregated transaction passes the check, it proceeds to the categorization part of the system. If it does not pass the check, two options can be used. Either features are dropped, after which the k-anonymity check is performed again, or the feature granularity, comprising also the bin size, is altered. The latter takes place in the original enriched transactions, meaning the transactions need to be re-aggregated.
  • k-anonymity is combined with or replaced by t-closeness to realize a similar aim.
  • Figure 11 shows a tenth embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document.
  • the tenth embodiment relates to the reduction of the level of detail of transaction data.
  • the reduction of the level of detail may comprise any or any combination of the following : suppressing entry values, replacing entry values, changing the bin size (or, related, the feature granularity) of entry values.
  • the aggregation-ready transaction data is subject to anonymity testing one or more times, reducing the level of detail until a specified anonymity requirement is finally met.
  • the bin size of the age is altered, moving from single unit granularity to multiples of twenty. Due to this reduction of the level of detail, in terms of age, there is no identifiable difference anymore between the record associated with token "5BAD9EM" and the record associated with token "AA09MNJ".
  • Figure 12 shows an eleventh embodiment relating to aspects of the present invention, relating to a "User privacy protection" aspect of the methods disclosed in this document.
  • the eleventh embodiment relates to the reduction of the level of detail of transaction data in a way complementary to the tenth embodiment.
  • two features are dropped to reduce the level of detail : "Prop. Males”, short for "Proportion of Males", and "Prop. Single”, short for "Proportion of Singles”.
  • Figure 13 shows a twelfth embodiment relating to aspects of the present invention, relating to a "Categorization” aspect of the methods disclosed in this document.
  • a rule management tool is on top of the aggregated transactions.
  • the categories attached to the transactions in this tool can originate from three sources: (1) A mapping from MCC/Golden Pages/ Activity Codes/... to the set of predefined categories, (2) A generalization step that uses both the aggregated transactions and the mapping in (1) (e.g., "...PIZZA" is category Restaurant), and (3) A step in which experts can interact with the rule management tool to add, alter and maintain rules (originating from (1) and (2)).
  • Rule L1 is applied only if there is no rule L2 or L3 and rule L2 is applied only if there is no rule L3.
  • Figure 14 shows a thirteenth embodiment relating to aspects of the present invention, relating to a "Categorization" aspect of the methods disclosed in this document, specifically the definition of a rule.
  • Different data sources can be mapped to a predefined set of categories, which in turn defines a rule.
  • Figure 15 shows a fourteenth embodiment relating to aspects of the present invention, relating to a "Categorization” aspect of the methods disclosed in this document.
  • the aggregated transactions that have an L1 rule can be used to create an L2 rule.
  • the transaction is cleaned. This means that the LOI entity (store name in the example) is cleaned and, for example, all non-alphabetic characters are dropped.
  • parts are extracted from the name. Typically, the store name is split in separate words.
  • Third, the parts are converted into words.
  • a filter is applied that defines a rule. For example, the proportion of the most frequent category should be more than 0.8 and the total spent should be larger than 140k. Furthermore, additional filters can be added to make sure the second most frequent category is below a predefined threshold.
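The derivation of an L2 rule from L1-categorized aggregates can be sketched as below. The cleaning step, the split into words and the upper-casing are simplified stand-ins for the steps named in the text; the thresholds 0.8 and 140k come from the example above, and all store names and amounts are made up.

```python
import re
from collections import Counter, defaultdict


def clean(name):
    """Replace non-alphabetic characters by spaces and normalize the case."""
    return re.sub(r"[^A-Za-z ]+", " ", name).upper()


def derive_l2_rules(aggregates, min_share=0.8, min_spend=140_000):
    """Propose word -> category rules from aggregates already categorized by L1."""
    counts = defaultdict(Counter)  # word -> how often each category occurs
    spent = defaultdict(float)     # word -> total amount spent
    for agg in aggregates:
        for word in set(clean(agg["name"]).split()):
            counts[word][agg["category"]] += 1
            spent[word] += agg["total_spent"]
    rules = {}
    for word, per_category in counts.items():
        category, n = per_category.most_common(1)[0]
        share = n / sum(per_category.values())
        # Keep the word only if one category dominates and enough money is involved.
        if share > min_share and spent[word] > min_spend:
            rules[word] = category
    return rules


aggregates = [
    {"name": "PIZZA 4 YOU", "category": "Restaurant", "total_spent": 90_000.0},
    {"name": "City Pizza!", "category": "Restaurant", "total_spent": 80_000.0},
    {"name": "City Garage", "category": "Garage", "total_spent": 200_000.0},
]
rules = derive_l2_rules(aggregates)
```

The word "CITY" is rejected because it is split evenly between two categories, and "YOU" because too little was spent, whereas "PIZZA" and "GARAGE" become rules.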
  • Figure 16 shows a fifteenth embodiment relating to aspects of the present invention, relating to a "Categorization" aspect of the methods disclosed in this document, specifically the relation between rules. Here too, there is an authority level between the rules: L1 < L2 < L3. Rule L1 is applied only if there is no rule L2 or L3, and rule L2 is applied only if there is no rule L3. Rules L1 and L2 can be adjusted by an expert who controls rule L3. Furthermore, even completely new rules can be created for uncategorized transactions.
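The authority ordering between the three rule levels amounts to a simple precedence lookup. In this sketch each level is modelled as a function returning a category or None; the MCC-to-category mapping, the store identifiers and the expert override are hypothetical examples.

```python
L1_MAPPING = {"5411": "Grocery"}      # hypothetical MCC -> category mapping
L3_OVERRIDES = {"s3": "Wholesale"}    # hypothetical expert decisions per store


def l1(tx):
    """Level 1: mapping from an external code to a predefined category."""
    return L1_MAPPING.get(tx["mcc"])


def l2(tx):
    """Level 2: generalized rule based on a word in the store name."""
    return "Restaurant" if "PIZZA" in tx["name"].split() else None


def l3(tx):
    """Level 3: expert rule, highest authority."""
    return L3_OVERRIDES.get(tx["id"])


def categorize(tx):
    """Apply the highest-authority rule that yields a category."""
    for rule in (l3, l2, l1):
        category = rule(tx)
        if category is not None:
            return category
    return None  # uncategorized; an expert may add a new rule


results = [
    categorize({"id": "s1", "name": "LUIGI PIZZA", "mcc": "5411"}),
    categorize({"id": "s2", "name": "CORNER SHOP", "mcc": "5411"}),
    categorize({"id": "s3", "name": "PIZZA SUPPLIES", "mcc": "5411"}),
    categorize({"id": "s4", "name": "UNKNOWN", "mcc": "0000"}),
]
```

Checking the levels from L3 down to L1 guarantees that a lower-level rule is only used when no higher-level rule applies.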
  • Figure 17 shows a sixteenth embodiment relating to aspects of the present invention.
  • Two stores, A and B, and a group of users are considered.
  • the gender is known to the system, as well as information regarding their age group.
  • one first launches a query to receive the mean income of clients in store B.
  • the query may see this income split according to gender.
  • as store B is targeted mainly at males, a sufficiently large number of users may belong to this category. This is indicated with a "+" symbol in Figure 17.
  • the group of store B - females is smaller than k. This is indicated with a symbol "*" in Figure 17. Hence, its result is omitted.
  • microaggregation and/or aggregation is performed with a k-anonymity check across subgroups of the same branch to prevent probabilistic conclusions, optionally in combination with t-closeness.
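The Figure 17 scenario can be sketched as a split query in which any subgroup smaller than k is omitted. The `strict` option is one possible reading of a check across subgroups of the same branch, and is an assumption: it withholds the whole split when a sibling subgroup is too small, because a published total combined with the remaining subgroup could otherwise reveal the suppressed value by subtraction.

```python
from statistics import mean


def split_query(records, field, k, strict=False):
    """Mean income per subgroup; subgroups with fewer than k users yield None."""
    groups = {}
    for r in records:
        groups.setdefault(r[field], []).append(r["income"])
    result = {g: (mean(v) if len(v) >= k else None) for g, v in groups.items()}
    if strict and any(v is None for v in result.values()):
        # One sibling is too small: withhold the whole split.
        return {g: None for g in result}
    return result


store_b = [  # made-up clients of store B
    {"gender": "M", "income": 3000},
    {"gender": "M", "income": 3200},
    {"gender": "M", "income": 3400},
    {"gender": "F", "income": 5000},
]
by_gender = split_query(store_b, "gender", k=3)
by_gender_strict = split_query(store_b, "gender", k=3, strict=True)
```

In the non-strict result the male mean is released and the female result is omitted, matching the "+" and "*" markings described for Figure 17.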
  • Figure 18 shows a seventeenth embodiment relating to aspects of the present invention. It illustrates an embodiment with a particular way in which privacy protection is embedded in the system.
  • the gatekeeper also referred to as privacy filter or privacy mechanism in this document, prevents the "publication" of privacy sensitive data.
  • the combination of k-anonymity and t-closeness makes sure that the privacy of individual users is maintained, while still yielding sufficiently detailed output in the privacy-filtered response, allowing for business insights.
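A t-closeness check for a categorical sensitive attribute can be sketched as follows. The source does not specify a distance measure; this sketch assumes the variational distance between each group's distribution and the overall distribution, and assumes every group is non-empty.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each value in a non-empty list."""
    counts = Counter(values)
    return {v: c / len(values) for v, c in counts.items()}


def variational_distance(p, q):
    """Half the sum of absolute differences between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)


def is_t_close(groups, t):
    """True if every group's distribution is within distance t of the overall one."""
    overall = distribution([v for g in groups for v in g])
    return all(variational_distance(distribution(g), overall) <= t for g in groups)


mixed = [["low", "high"], ["low", "high"]]
skewed = [["low", "low"], ["high", "high"]]
mixed_ok = is_t_close(mixed, t=0.1)
skewed_ok = is_t_close(skewed, t=0.2)
```

The skewed groups could each satisfy k-anonymity for k = 2 and still reveal the income band of every member, which is the kind of leak t-closeness is meant to limit.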
  • Figure 19 shows an eighteenth embodiment relating to aspects of the present invention. It concerns an example of a detailed schema of said customer profiling module. The schema can be described according to following stages 1 to 4.
  • the general settings such as the privacy parameters that can differ according to legislations in different countries.
  • these include e.g. the parameter k for k-anonymity, the parameter t for t-closeness
  • system user input such as how the points of interest are defined in the transactions, which types of customers and output attributes he/she is interested in
  • customer data to enrich the transactions, e.g. the demographic and financial data of the customers making the transactions, their profiles, habits or other transactions
  • the system identifies the points of interest directly given their ids (e.g. given the combination of id and name of the terminal)
  • the transaction data is subsequently enriched by joining with the other data.
  • the privacy mechanism is applied.
  • the privacy mechanism takes into account the settings parameters, and only the attributes and filters that comply with them are allowed to pass through. Some of the attributes/filters might need to be more generalized to be allowed to pass.
  • a filter refers to a desired attribute or field that is specified by the system user, and may or may not appear as such in the raw transaction data or the other data.
  • This type of "filtering" is to be distinguished from the concept of "privacy filtering" as described in this document. Examples of such a filter are a gender, e.g. "Male", or a day of the week, e.g. "Sunday", as shown in Figure 19.
  • the attributes that passed the privacy mechanism land in the attribute layer. This is a stage where all attributes are stored that give information about e.g. sales in individual stores of the brand of the system user, profiles of customers in these shops in different age groups or at different times of the day.
  • the attributes can be prepared on different levels of detail, e.g. one store, one region or all the stores of interest.
  • the comparison layer can create new metrics from combinations of attributes, e.g. performance of a shop versus all shops in the brand, average age of all men versus all women.
  • it can also forecast sales in the next period or watch trends by comparing sales in different points in history.
  • the visualization layer prepares the insights to be consumed by the system user in forms of charts, tables and written insights.
  • the insights are served via an interactive computer application and/or via a generated report and/or via consultation with a bank expert.
  • Figure 20 shows a nineteenth embodiment relating to aspects of the present invention. It concerns an example of a detailed schema of said market insights module.
  • the schema can be described according to the following stages 1 to 4.
  • the main inputs to the system are
  • the general settings such as the privacy parameters that can differ according to legislations in different countries.
  • these include e.g. the parameter k for k-anonymity, the parameter t for t-closeness
  • system user input such as how the points of interest are defined in the transactions, which types of customers and output attributes he/she is interested in
  • customer data to enrich the transactions, e.g. the demographic and financial data of the customers making the transactions, their profiles, habits or other transactions
  • the system identifies the points of interest directly given their ids (e.g. given the combination of id and name of the terminal) or based on a categorization procedure.
  • the transaction data is subsequently enriched by joining with the other data.
  • the privacy mechanism is applied.
  • the privacy mechanism takes into account the settings, and only the attributes and filters that comply with it are allowed to pass through. Some of the attributes/filters might need to be more generalized to be allowed to pass.
  • a filter refers to a desired attribute or field that is specified by the system user, and may or may not appear as such in the raw transaction data or the other data. This type of "filtering" is to be distinguished from the concept of "privacy filtering". Examples of such a filter are a gender, e.g. "Male” or “Female", or a day of the week, e.g. "Sunday", as shown in Figure 20.
  • the attributes that passed the privacy mechanism land in the attribute layer. This is a stage where all attributes are stored that give information about e.g. sales in individual stores of the brand of the system user, sales of other brands, profiles of customers in own or other shops in different age groups or at different times of the day.
  • the attributes can be prepared on different levels of detail, e.g. one store, one region or all the stores in one category/industry.
  • the comparison layer can create new metrics from combinations of attributes, e.g. the market share of a given brand, average age of all men versus all women. Preferably, it can also forecast sales in the next periods or watch trends by comparing sales in different points in history for different market players.
  • the visualization layer prepares the insights to be consumed by the system user in forms of charts, tables and written insights.
  • the insights can be served via an interactive computer application and/or via a generated report and/or via consultation with a bank expert.
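Two of the comparison-layer metrics mentioned above, market share and a period-over-period trend, can be sketched from attributes that already passed the privacy mechanism. The brand names and sales figures are made up, and the relative-change formula is one simple choice among many.

```python
def market_share(sales_by_brand, brand):
    """Share of one brand in the total sales of its category."""
    total = sum(sales_by_brand.values())
    return sales_by_brand[brand] / total if total else 0.0


def relative_change(previous, current):
    """Relative change in sales between two periods."""
    return (current - previous) / previous


# Hypothetical privacy-filtered sales per brand in one category.
sales = {"own_brand": 250.0, "brand_x": 500.0, "brand_y": 250.0}
share = market_share(sales, "own_brand")
trend = relative_change(previous=100.0, current=110.0)
```

Both metrics operate only on aggregated values, so they add no privacy risk beyond that of the attributes they combine.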

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention concerns a computing system for obtaining a privacy-filtered response to a query of a user, the computing system comprising a server, the server comprising a server processor, tangible non-volatile server memory, and server program code present on said server memory for instructing said server processor; a computer-readable medium, the computer-readable medium comprising a database, said database comprising privacy settings comprising a privacy threshold; and a device, said device comprising a device processor, tangible non-volatile device memory, and device program code present on said device memory for instructing said device processor. Said server is configured to receive raw transaction data from an external source such as a raw transaction database or a raw transaction source.
PCT/EP2017/053921 2016-05-13 2017-02-21 System for retrieving privacy-filtered information from transaction data Ceased WO2017194214A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/301,241 US10798066B2 (en) 2016-05-13 2017-02-21 System for retrieving privacy-filtered information from transaction data
EP17705439.2A EP3455816A1 (fr) 2016-05-13 2017-02-21 System for retrieving privacy-filtered information from transaction data

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662335934P 2016-05-13 2016-05-13
US62/335,934 2016-05-13
EP16169721 2016-05-13
EP16169721.4 2016-05-13

Publications (1)

Publication Number Publication Date
WO2017194214A1 (fr) 2017-11-16

Family

ID=55969052

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/053921 Ceased WO2017194214A1 (fr) System for retrieving privacy-filtered information from transaction data 2016-05-13 2017-02-21

Country Status (2)

Country Link
EP (1) EP3455816A1 (fr)
WO (1) WO2017194214A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7269578B2 (en) 2001-04-10 2007-09-11 Latanya Sweeney Systems and methods for deidentifying entries in a data source
WO2010141270A2 (fr) 2009-06-01 2010-12-09 Visa U.S.A. Systems and methods to summarize transaction data
US8626705B2 (en) 2009-11-05 2014-01-07 Visa International Service Association Transaction aggregator for closed processing
US20140089041A1 (en) 2012-09-27 2014-03-27 Bank Of America Corporation Two sigma intelligence
US20140281572A1 (en) * 2013-03-14 2014-09-18 Mitsubishi Electric Research Laboratories, Inc. Privacy Preserving Statistical Analysis on Distributed Databases
WO2015077542A1 (fr) * 2013-11-22 2015-05-28 The Trustees Of Columbia University In The City Of New York Database privacy protection devices, methods, and systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
EUROSTAT: "MANUAL ON DISCLOSURE CONTROL METHODS", 1 January 1996 (1996-01-01), XP055288634, Retrieved from the Internet <URL:http://ec.europa.eu/eurostat/ramon/statmanuals/files/manual_on_disclosure_control_methods_1996.pdf> [retrieved on 20160714] *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11087026B2 (en) 2019-02-19 2021-08-10 International Business Machines Corporation Data protection based on earth mover's distance
US12174993B1 (en) * 2020-06-30 2024-12-24 Cable Television Laboratories, Inc. Systems and methods for advanced privacy protection of personal information
WO2022112158A1 (fr) * 2020-11-24 2022-06-02 Collibra Nv Systems and methods for data analysis
US12056763B2 (en) 2020-11-24 2024-08-06 Collibra Belgium Bv Systems and methods for data enrichment
CN114297711A (zh) * 2021-12-27 2022-04-08 电子科技大学广东电子信息工程研究院 A data security protection method based on a cloud server

Also Published As

Publication number Publication date
EP3455816A1 (fr) 2019-03-20

Legal Events

Date Code Title Description
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase (Ref country code: DE)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17705439; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2017705439; Country of ref document: EP; Effective date: 20181213)