WO2008122181A1 - Method and device for updating parameters, and method and device for displaying a related keyword - Google Patents

Method and device for updating parameters, and method and device for displaying a related keyword

Info

Publication number
WO2008122181A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
feature value
primary
search
searched
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2007/070573
Other languages
English (en)
French (fr)
Inventor
Lei Pan
Yuanhu Yao
Zhen Yang
Tianji Zhang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to US12/594,930 (US8676811B2)
Priority to EP07801005A (EP2136302A4)
Priority to JP2010502405A (JP5238800B2)
Publication of WO2008122181A1
Priority to US14/160,364 (US8874588B2)
Priority to US14/496,256 (US9135370B2)
Legal status: Ceased

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3322Query formulation using system suggestions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation
    • G06F16/90324Query formulation using system suggestions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles

Definitions

  • the present invention relates to the field of data processing, and in particular, to a method and apparatus for generating update parameters, and a method and apparatus for displaying related keywords.
  • a search tool or engine searches the existing index database and returns the search results.
  • an existing search tool or engine also displays, in the current page or the search result page, one or more related keywords for the keyword (i.e., the main keyword) input by the user.
  • the present invention provides a method and apparatus for generating update parameters to make the use of keywords more in line with the user's usage trends.
  • another technical problem solved by the present invention is to provide a method and apparatus for displaying related keywords to ensure that the user can obtain related keywords simply and comprehensively.
  • the embodiment of the present invention discloses a method for generating an update parameter, which includes:
  • Statistics are performed on the search keywords to obtain a primary keyword, a related keyword, the number of times the primary keyword and the related keyword are simultaneously searched, and the number of times the primary keyword is searched separately;
  • the second feature value is an update parameter determining a manner in which the related keyword is displayed.
  • the method further includes:
  • the main keyword, the related keyword, and the second feature value are recorded to form a keyword information table.
  • the step of calculating the search keyword, the step of calculating the first feature value, and the step of calculating the second feature value are multi-thread parallel operations.
  • the method further includes: removing the search keyword that meets the filtering rule.
  • the second characteristic value is calculated by the following steps:
  • the content of the keyword information table further includes the first feature value corresponding to the main keyword.
  • the search keyword includes a search keyword used by the search user and a release keyword used by the publishing user.
  • An embodiment of the present invention further discloses an apparatus for generating an update parameter, including:
  • an obtaining unit configured to acquire the search keywords used by users in the preset time period;
  • a statistical unit configured to perform statistics on the search keywords to obtain the main keyword, the related keyword, the number of times the primary keyword and the related keyword are simultaneously searched, and the number of times the primary keyword is individually searched;
  • a first calculating unit configured to calculate a first feature value according to the number of times the primary keyword is individually searched;
  • a second calculating unit configured to calculate a second feature value according to the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched, where the second feature value is an update parameter that determines the manner in which the related keyword is displayed.
  • the device further includes:
  • Recording unit configured to record the main keyword, the related keyword and the second feature value to form a keyword information table.
  • the device further includes:
  • a filtering unit connected to the statistical unit and configured to remove search keywords that match the filtering rule.
  • the second calculating unit comprises:
  • a correlation calculation subunit configured to calculate a correlation degree according to the number of times the main keyword and the related keyword are simultaneously searched;
  • an acquisition and calculation subunit configured to acquire the first feature value from the cache, and calculate the second feature value according to the first feature value and the correlation degree.
  • the device further includes:
  • an adding unit connected to the recording unit and configured to record the corresponding first feature value in the keyword information table.
  • the embodiment of the invention further discloses a method for displaying related keywords, including:
  • the second feature value is calculated from the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched, and the first feature value is obtained according to the number of times the primary keyword is searched separately; the number of times the primary keyword and the related keyword are simultaneously searched and the number of times the primary keyword is searched separately are obtained by performing statistics on the search keywords;
  • the method further includes: displaying related keywords whose second feature value is less than a certain threshold.
  • the related keywords whose second feature value is greater than or equal to a certain threshold are acquired from the keyword information table,
  • where the keyword information table includes a primary keyword, a related keyword, and a second feature value.
  • An embodiment of the present invention further discloses an apparatus for displaying related keywords, including:
  • an interface unit configured to send a request for extracting the corresponding related keywords according to the main keyword input by the user;
  • a related keyword acquiring unit configured to acquire, according to the request, related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched, and the first feature value is obtained according to the number of times the primary keyword is searched separately; the number of times the primary keyword and the related keyword are simultaneously searched and the number of times the primary keyword is searched separately are obtained by performing statistics on the search keywords;
  • a first display unit configured to display the related keywords.
  • the device further includes:
  • the second display unit is connected to the related keyword acquisition unit, and is configured to display related keywords whose second feature value is smaller than a certain threshold value.
  • the embodiment of the present invention performs statistics and calculation based on the search keywords used by users in the preset time period, which helps guarantee the timeliness of the keywords, and uses the second feature value as the parameter that determines how related keywords are displayed, so that the related keywords that match the current usage trend are preferentially provided to the user, giving the user a better operating experience.
  • FIG. 1 is a flow chart of an embodiment of a method of generating an update parameter of the present invention
  • FIG. 2 is a structural diagram of an embodiment of an apparatus for generating an update parameter according to the present invention
  • FIG. 3 is a flow chart of an embodiment of a method for displaying related keywords according to the present invention.
  • FIG. 4 is a structural diagram of an embodiment of an apparatus for displaying related keywords according to the present invention.
  • the embodiment of the invention updates the correlation coefficient between the primary keyword and the corresponding related keyword, and controls the output of the corresponding related keyword to make the related keyword more in line with the user's search requirement.
  • Referring to FIG. 1, a flowchart of an embodiment of a method for generating an update parameter according to the present invention is shown, which specifically includes the following steps:
  • Step 101 Obtain a search keyword used by the user in the preset time period
  • the preset time period may be preset by a person skilled in the art according to requirements, for example, for a shopping website, in order to make the corresponding product keyword conform to the usage trend of the user, the time period may be preset to one week or one month, and the like.
  • the search keyword may be from a database, a script program, a local program, a user input history, a storage unit of a client, a server, or other device, and the like, which is not limited by the present invention.
  • Step 102 Perform statistics on the search keyword to obtain a primary keyword, a related keyword, a number of times the primary keyword and the related keyword are simultaneously searched, and a number of times the primary keyword is searched separately;
  • the keyword content in the prior art is fixed, was formed long ago, and is updated slowly, so it does not keep pace with the real-time updating or usage of content on the network and is inconsistent with users' needs. Therefore, by performing statistics on the main keywords and related keywords, the user can be guaranteed to obtain keywords that match their usage trends.
  • the primary keyword, the related keyword, the number of times the primary keyword and the related keyword are simultaneously searched, and the number of times the primary keyword is searched separately may be obtained by any method in the prior art; for example, a search keyword is taken as a primary keyword, a keyword that is searched simultaneously with the primary keyword is taken as a related keyword, and then the number of times the primary keyword and the related keyword are simultaneously searched and the number of times the primary keyword is searched separately are counted.
  • the basic process of the Apriori algorithm is as follows: (1) scan the transaction database to find all items whose support is not less than the minimum support, that is, the frequent itemsets L1; (2) join the items in L1 to form candidate itemsets; (3) scan the transaction database and filter the candidates joined from L1 to find L2, whose support is not less than the minimum support; (4) join the itemsets in L2; (5) scan the transaction database and filter the candidates joined from L2 to find L3, whose support is not less than the minimum support; and so on.
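The scan-and-join loop described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes each transaction is the set of keywords from one search session, and the function name `apriori` and the sample sessions are hypothetical. It omits the usual subset-pruning optimization and relies on support counting alone to discard infrequent candidates.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # (1) Scan once to find the frequent 1-itemsets, L1.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # (2)/(4) Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # (3)/(5) Scan the transactions and keep candidates with enough support.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical sessions: each set holds the keywords searched together.
sessions = [
    {"bike", "e bike", "scooter"},
    {"bike", "e bike"},
    {"bike", "scooter"},
    {"mp3"},
]
result = apriori(sessions, min_support=2)
```

With these sample sessions and a minimum support of 2, the frequent 2-itemsets pair "bike" with "e bike" and with "scooter", which is the kind of primary/related keyword pairing the statistics step looks for.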
  • the obtained search keywords are as shown in Table 1; the main keywords obtained and the number of times they are searched separately are as shown in Table 2; the related keywords that are searched simultaneously with the main keywords, and the number of times they are simultaneously searched, are shown in Table 3.
  • the corresponding primary keywords, related keywords, the number of times the primary keywords and related keywords are simultaneously searched, and the number of times the primary keywords are individually searched are obtained.
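As a simpler alternative to Apriori, the counting in Step 102 can be sketched with two counters. The sketch assumes that "simultaneously searched" means "appearing in the same session" and that "searched separately" means "the number of sessions containing the keyword"; both interpretations, and the name `count_keywords`, are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def count_keywords(sessions):
    """Count per-keyword searches and ordered (primary, related) co-occurrences."""
    single = Counter()  # times each primary keyword is searched
    pair = Counter()    # times a (primary, related) pair shares a session
    for keywords in sessions:
        unique = set(keywords)
        single.update(unique)
        # Every keyword acts as the primary keyword for the others in its session.
        pair.update(permutations(sorted(unique), 2))
    return single, pair


# Hypothetical sessions for illustration only.
sessions = [["bike", "e bike"], ["bike", "scooter", "e bike"], ["bike"]]
single, pair = count_keywords(sessions)
```

Here `single["bike"]` is 3 and `pair[("bike", "e bike")]` is 2, mirroring the two kinds of counts that the method feeds into the feature value calculations.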
  • the embodiment may further comprise the step of: removing the search keywords that meet the filtering rules.
  • the filtering rule may be preset by a person skilled in the art according to experience or needs; for example, assuming the filtering rule is set to remove items in which the main keyword and the related keyword are simultaneously searched fewer than 2 times, the items that satisfy the condition are shown in Table 4.
  • the filtering rule may also be set to remove search keywords that are illegal keywords or that contain illegal characters or illegal words, which is not limited by the present invention.
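The two example filtering rules can be sketched as a single pass over the pair counts. The function name `apply_filter`, its parameters, and the word-level matching of illegal words are assumptions for illustration; the source leaves the rule entirely to the implementer.

```python
def apply_filter(pair_counts, min_count=2, illegal_words=frozenset()):
    """Drop pairs searched together fewer than min_count times or containing illegal words."""
    kept = {}
    for (primary, related), count in pair_counts.items():
        if count < min_count:
            continue  # rule 1: too few simultaneous searches
        words = set(primary.split()) | set(related.split())
        if words & illegal_words:
            continue  # rule 2: contains an illegal word
        kept[(primary, related)] = count
    return kept


# Hypothetical counts; "badword" stands in for any disallowed term.
pair_counts = {
    ("bike", "e bike"): 2,
    ("bike", "scooter"): 1,
    ("bike", "badword bike"): 5,
}
filtered = apply_filter(pair_counts, min_count=2, illegal_words=frozenset({"badword"}))
```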
  • Step 103 Calculate a first feature value according to the number of times the primary keyword is searched separately.
  • the first feature value may be understood as a value indicating a keyword popularity.
  • Main and related keywords are the same as the main keyword alone.
  • the popularity base is an intermediate value of the number of times the primary keywords are searched separately. For example, if 10% of the primary keywords are searched separately 10 times, 80% are searched separately 20 times, and 10% are searched separately 50 times, then 20 is the popularity base.
  • the popularity base can be preset by a person skilled in the art according to experience or needs, and the present invention does not limit this.
  • the first feature value and the method for calculating the first feature value can also be set by a person skilled in the art according to needs or experience.
  • the above method is only used as an example, and the present invention is not limited thereto.
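One way to read "intermediate value" is as the median of the separate-search counts; that reading is an assumption, as is the helper name `popularity_base`. The sketch below reproduces the 10% / 80% / 10% example with ten hypothetical primary keywords. It deliberately stops at the popularity base, because the formula that turns it into the first feature value is not recoverable from the text.

```python
from statistics import median


def popularity_base(separate_counts):
    """Return the median of the per-keyword separate-search counts (assumed reading)."""
    return median(separate_counts)


# Ten hypothetical primary keywords: 10% searched 10 times,
# 80% searched 20 times, 10% searched 50 times.
counts = [10] + [20] * 8 + [50]
base = popularity_base(counts)
```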
  • Step 104 Calculate a second feature value according to the first feature value and a number of times the primary keyword and the related keyword are simultaneously searched, and the second feature value is an update parameter that determines a manner in which the related keyword is displayed.
  • the second feature value, calculated from the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched, may be regarded as a representation of the correlation coefficient.
  • Sub-step A1 calculating a correlation degree according to the number of times the primary keyword and the related keyword are simultaneously searched;
  • Sub-step A2 Acquire a first feature value in the cache, and calculate a second feature value according to the first feature value and the correlation.
  • the correlation degree is calculated according to the number of times the main keyword and the related keyword are simultaneously searched, wherein a formula for calculating the correlation degree is:
  • the correlation base is an intermediate value of the number of times the primary keyword and the related keyword are simultaneously searched, and may be preset by a person skilled in the art according to experience or needs, and the present invention does not limit this.
  • the first feature value may be saved in the cache; when the second feature value is calculated, the corresponding first feature value is obtained directly from the cache, and the second feature value is calculated according to the first feature value and the correlation degree.
  • obtaining data from the cache is much faster than obtaining data from a database or other device. Therefore, this preferred embodiment gives the present invention better computing performance.
  • the first feature value may be saved in the cache in the form of a hash table, in a file format, or in other manners; to make the first feature value easier to retrieve, the primary keywords may also be optimized, for example by sorting them in ascending order. Of course, the present invention does not limit the optimization method.
  • the first feature value and the correlation degree may also be weighted separately, and the weighted result is used as the second feature value. For example:
  • the first feature value of the keyword e bike is 0.05
  • the weight of the first feature value is 0.4
  • the correlation between the primary keyword bike and the related keyword e bike is 0.2
  • the weight of the correlation is 0.6.
  • the weights can be preset by a person skilled in the art according to experience or needs, and can also be changed arbitrarily according to the needs of the user, which is not limited by the present invention. To keep the calculation results consistent, the sum of the weight values can be set to 1, or to another value.
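The weighted combination can be illustrated with the example figures above. The sketch assumes that "weighted result" means a weighted sum, and it models the cache as a plain dictionary (a hash table); the names `first_value_cache` and `second_feature_value` are hypothetical.

```python
def second_feature_value(first_value, correlation, w_first=0.4, w_corr=0.6):
    """Combine the first feature value and the correlation as a weighted sum (assumed form)."""
    return w_first * first_value + w_corr * correlation


# Cache of first feature values, kept as a hash table keyed by keyword.
first_value_cache = {"e bike": 0.05}

# Correlation 0.2 between the primary keyword "bike" and the related keyword "e bike".
value = second_feature_value(first_value_cache["e bike"], 0.2)
```

Under these assumptions the result is 0.4 × 0.05 + 0.6 × 0.2, which is approximately 0.14.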
  • the second feature value is an update parameter that determines a manner in which the related keyword is displayed.
  • for example, related keywords whose second feature value is greater than or equal to a certain threshold are displayed preferentially or fixedly, and related keywords whose second feature value is less than the threshold are displayed in a round-robin manner or not displayed.
  • the manner in which the related keywords are displayed according to the second characteristic value may be arbitrarily set by a person skilled in the art according to needs or experience, and the present invention does not limit this.
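A minimal sketch of one such display rule splits the related keywords into a fixed group and a rotating group around the threshold. The function name `split_by_threshold` and the sample scores are hypothetical.

```python
def split_by_threshold(related_scores, threshold):
    """Split related keywords into fixed (>= threshold) and rotating (< threshold) groups."""
    fixed = [kw for kw, score in related_scores.items() if score >= threshold]
    rotating = [kw for kw, score in related_scores.items() if score < threshold]
    return fixed, rotating


# Hypothetical second feature values for the primary keyword "bike".
scores = {"e bike": 0.3, "scooter": 0.2, "vehicle": 0.1}
fixed, rotating = split_by_threshold(scores, threshold=0.2)
```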
  • the statistical step of the search keyword, the first eigenvalue calculation and the second eigenvalue calculation step are multi-thread parallel operations, thereby effectively improving the calculation performance and the calculation efficiency of the system.
  • multi-threading is a mechanism that allows multiple instruction streams to be executed concurrently in a program; each instruction stream is called a thread, and the threads are independent of each other.
  • the execution of multiple threads is concurrent, that is, logically "simultaneous"; in other words, a multi-threaded operation has N execution bodies at the same time, driven jointly by several different threads of execution.
  • the thread that performs statistics on the search keywords and the thread that calculates the feature values (including the first feature value and the second feature value) run in parallel and process the corresponding keywords cyclically, thereby effectively improving the efficiency of the embodiment of the present invention.
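The cooperation between a statistics thread and a calculation thread can be sketched as a producer and consumer sharing a queue. This is only an illustration: `run_pipeline` is a hypothetical name, and the `score` callable is a placeholder for whatever feature value formula an implementer chooses, since the text does not fix one.

```python
import queue
import threading
from collections import Counter


def run_pipeline(sessions, score):
    """Run a statistics thread and a calculation thread concurrently."""
    tasks = queue.Queue()
    results = {}

    def statistics_thread():
        # Count how many sessions contain each keyword, then hand the counts over.
        counts = Counter(kw for keywords in sessions for kw in set(keywords))
        for keyword, count in counts.items():
            tasks.put((keyword, count))
        tasks.put(None)  # sentinel: no more keywords

    def calculation_thread():
        # Process keywords one by one until the sentinel arrives.
        while (task := tasks.get()) is not None:
            keyword, count = task
            results[keyword] = score(count)

    threads = [
        threading.Thread(target=statistics_thread),
        threading.Thread(target=calculation_thread),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Placeholder scoring: count divided by an assumed popularity base of 20.
results = run_pipeline([["bike", "e bike"], ["bike"]], score=lambda c: c / 20)
```

The queue decouples the two stages, so the calculation thread can start consuming keywords while the statistics thread is still emitting them.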
  • the embodiment of the present invention may further include the steps of: recording the primary keyword, the related keyword, and the second feature value.
  • the record may be kept in a table, in a file, or in any other manner.
  • the keyword information table is formed by recording the main keyword, the related keyword, and the second feature value; at the next update, it is only necessary to clear the data in the keyword information table and refill it with the relevant data according to the method described in the embodiment of the present invention.
  • the update may be a periodic update, a real-time update, or an alternate update, for example, an update is initiated every month, or may be arbitrarily updated by those skilled in the art, and the present invention is not limited thereto.
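The clear-and-refill update of the keyword information table can be sketched with an in-memory dictionary standing in for the table. The nested layout (primary keyword to related keyword to second feature value) and the name `refresh_table` are assumptions for illustration.

```python
def refresh_table(table, rows):
    """Clear the keyword information table and refill it from freshly computed rows."""
    table.clear()
    for primary, related, second_value in rows:
        table.setdefault(primary, {})[related] = second_value
    return table


# Hypothetical stale table and newly computed rows.
table = {"old keyword": {"stale": 0.1}}
refresh_table(table, [("bike", "e bike", 0.14), ("bike", "scooter", 0.3)])
```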
  • the present invention may further comprise the step of recording the corresponding first feature value in the keyword information table, so that the user can obtain a direct search prompt when searching, thereby effectively improving the intelligence of the search tool.
  • the users typically include buyer users and seller users, in which case the search keywords may include the search keywords used by searching users and the release keywords used by publishing users.
  • in the following example, the search keywords include the search keywords used by searching users (first search keywords) and the release keywords used by publishing users (second search keywords).
  • Step A Acquire the first search keywords from a first script program for the preset time period, where the first search keywords are the keywords used by a user between opening and closing the browser. For example, when a user opens the browser once, searches through the search box multiple times, and inputs a plurality of keywords, those keywords are the first search keywords obtained in this example. By performing statistics on the first search keywords, a first primary keyword, a first related keyword, the number of times the first primary keyword and the first related keyword are simultaneously searched, and the number of times the first primary keyword is searched separately are obtained, as shown in Table 6:
  • Step B Acquire the second search keywords from a second script program for the preset time period, where the second search keywords are the keywords input by a user when releasing a product; because a user can usually enter three or more keywords when releasing a product, those keywords are obtained. By performing statistics on the second search keywords, a second primary keyword, a second related keyword, the number of times the second primary keyword and the second related keyword are simultaneously searched, and the number of times the second primary keyword is searched separately are obtained, as shown in Table 7:
  • Step C Calculate the first feature value:
  • Step D Calculate the relevance:
  • because the first feature values of the first keyword and the second keyword are the same, only one first feature value is used in the calculation in order to improve calculation efficiency; obviously, both first feature values could instead be included in the calculation with different weights, and the same calculation result could still be obtained.
  • Step E Record the main keyword, the related keyword, and the second feature value, and form a keyword information table as shown in Table 8:
  • Referring to FIG. 2, a block diagram of an embodiment of an apparatus for generating update parameters of the present invention is shown, including:
  • the obtaining unit 201, configured to acquire the search keywords used by users in the preset time period;
  • the statistical unit 202, configured to perform statistics on the search keywords to obtain the main keyword, the related keyword, the number of times the primary keyword and the related keyword are simultaneously searched, and the number of times the primary keyword is searched separately;
  • the first calculating unit 203, configured to calculate a first feature value according to the number of times the primary keyword is searched separately;
  • the second calculating unit 204, configured to calculate a second feature value according to the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched, where the second feature value is an update parameter that determines the manner in which the related keyword is displayed.
  • the device may further include a recording unit configured to record the primary keyword, the related keyword, and the second feature value to form a keyword information table.
  • the statistical unit, the first computing unit and the second computing unit are configured to process multi-thread parallel operations.
  • the apparatus may further include a filtering unit configured to remove the search keywords that meet the filtering rule.
  • the second calculating unit comprises:
  • a correlation calculation subunit configured to calculate a correlation degree according to the number of times the main keyword and the related keyword are simultaneously searched;
  • an acquisition and calculation subunit configured to acquire the first feature value from the cache, and calculate the second feature value according to the first feature value and the correlation degree.
  • the apparatus may further include an adding unit configured to record the corresponding first feature value in the keyword information table.
  • the search keyword includes a search keyword used by the search user and a release keyword used by the publishing user.
  • because the device embodiment for generating the update parameter shown in FIG. 2 corresponds to the method embodiment for generating the update parameter described above, its description is relatively brief. For details, refer to the description of the corresponding part in the foregoing description.
  • Referring to FIG. 3, a flowchart of an embodiment of a method for displaying related keywords according to the present invention is shown, which includes the following steps:
  • Step 301 Send a request for extracting the corresponding related keywords according to the primary keyword input by the user;
  • Step 302 Acquire, according to the request, related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched; the first feature value is obtained according to the number of times the primary keyword is searched separately; the number of times the primary keyword and the related keyword are simultaneously searched and the number of times the primary keyword is searched separately are obtained by performing statistics on the search keywords;
  • Step 303 Display the related keywords.
  • the user can use an input device, such as a keyboard or a tablet, to input the main keyword in the search bar or the toolbar, and then click the OK button, press the Enter key or the TAB key, or use another trigger, so that the local program or the script program of the search page issues an extraction request for the related keywords corresponding to the main keyword.
  • the second characteristic value can be understood as a representation of a correlation coefficient.
  • the correlation degree is first calculated according to the number of times the primary keyword and the related keyword are simultaneously searched; for example, the formula for calculating the correlation is:
  • the correlation base is an intermediate value of the number of times the primary keyword and the related keyword are simultaneously searched, and may be preset by a person skilled in the art according to experience or needs, and the present invention does not limit this.
  • the result calculated based on the first feature value and the correlation can be used as the second feature value.
  • the first feature value and the correlation degree may be respectively weighted, and the weighted result is used as the second feature value.
  • the weight can be preset by a person skilled in the art according to experience or needs, and can also be arbitrarily changed according to the needs of the user, and the present invention does not limit this.
  • the sum of the plurality of weight values may be set to 1.
  • a related keyword whose second feature value is greater than or equal to a certain threshold can be understood as a related keyword that is fixedly displayed every time related keywords are displayed for the primary keyword. Because such related keywords have a second feature value greater than or equal to the threshold, they correlate well with the main keyword; displaying them every time gives the user recommendations that are closer to the user's usage habits and more in line with the user's usage trends, so that the user gets a better operating experience.
  • for example, suppose that for the primary keyword bike, the related keywords whose second feature value is greater than or equal to a certain threshold (such as 0.2) are: electric bike, mountain bike, e bike, e bicycle, suspension bike, scooter, motorcycle, electric scooter, gas scooter, and vehicle. Then, for search tools that display a fixed set of related keywords, these 10 related keywords appear on the relevant page for every search of the primary keyword bike; for search tools that display related keywords in groups, these 10 related keywords are fixedly present in every related keyword group.
  • the related keywords whose second feature value is less than a certain threshold may be displayed according to any rule, or may not be displayed.
  • for example, a search tool such as GOOGLE that displays only a fixed number of related keywords shows only a fixed number of related keywords whose second feature value is greater than or equal to a certain threshold, and does not display related keywords whose second feature value is less than the threshold; a search tool that supports round-robin display or full display of related keywords can display all related keywords according to any rule, which is not limited by the present invention.
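Steps 301 to 303 can be sketched as a lookup in the keyword information table followed by a threshold filter and a cap on the number of keywords shown. The table layout, the function name `related_keywords_for`, the default threshold of 0.2, and the descending sort by second feature value are assumptions; the text only requires that keywords at or above the threshold be eligible for display.

```python
def related_keywords_for(primary, table, threshold=0.2, limit=10):
    """Return up to `limit` related keywords whose second feature value meets the threshold."""
    scores = table.get(primary, {})
    chosen = [(kw, score) for kw, score in scores.items() if score >= threshold]
    # Assumed ordering: strongest related keywords first.
    chosen.sort(key=lambda item: item[1], reverse=True)
    return [kw for kw, _ in chosen[:limit]]


# Hypothetical keyword information table.
keyword_table = {
    "bike": {"e bike": 0.3, "vehicle": 0.1, "scooter": 0.25, "mountain bike": 0.5},
}
shown = related_keywords_for("bike", keyword_table, threshold=0.2, limit=2)
```

A primary keyword that is missing from the table simply yields an empty list, so the display step has nothing to render.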
  • Referring to FIG. 4, a block diagram of an embodiment of an apparatus for displaying related keywords according to the present invention is shown, including:
  • the interface unit 401, configured to send a request for extracting the corresponding related keywords according to the main keyword input by the user;
  • the related keyword obtaining unit 402, configured to acquire, according to the request, related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from the first feature value and the number of times the primary keyword and the related keyword are simultaneously searched, and the first feature value is obtained according to the number of times the main keyword is searched separately; the number of times the main keyword and the related keyword are simultaneously searched and the number of times the main keyword is searched separately are obtained by performing statistics on the search keywords;
  • the first display unit 403, configured to display the related keywords.
  • the device in this embodiment may further include a second display unit configured to display related keywords whose second feature value is less than a certain threshold.
  • because the device embodiment for displaying related keywords in FIG. 4 corresponds to the embodiment of the method for displaying related keywords shown in FIG. 3, its description is relatively brief. For details, refer to the description of the corresponding part in the foregoing description.
  • In summary, the embodiments of the present invention compile statistics and perform calculations on the search keywords used by users within a preset time period, which helps keep the keywords timely; and because the second feature value serves as the parameter that determines how a related keyword is displayed, related keywords that match current usage trends are offered to the user first, giving the user a better operating experience;
  • second, by establishing a keyword information table, the present invention only needs to update the data in the table at update time, which improves system processing efficiency;
  • third, the search keyword statistics step and the feature value calculation steps run as multi-threaded parallel operations, which effectively improves the computing performance and efficiency of the system;
  • in addition, by storing the first feature value in a cache, the present invention further improves computing performance and efficiency; the present invention also records the first feature value and, when the primary keyword is displayed, provides the first feature value to the user as a reference;
  • finally, the embodiments of the present invention require no special confidential algorithm on the part of the service provider, are simple to implement, and have a low development cost.
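The display flow described above (units 401 to 403 plus the optional second display unit) can be sketched roughly as follows. This is only an illustrative sketch: the table layout, the function names, and the scores are assumptions and do not come from the patent; only the 0.2 threshold is taken from the example above.

```python
# Hypothetical keyword information table:
# primary keyword -> {related keyword: second feature value}. All scores are made up.
KEYWORD_TABLE = {
    "bike": {"electric bike": 0.35, "mountain bike": 0.28, "scooter": 0.20, "helmet": 0.12},
}

def obtain_related(primary, threshold=0.2, table=KEYWORD_TABLE):
    """Role of the related keyword obtaining unit (402): split keywords by the threshold."""
    scored = table.get(primary, {})
    ranked = sorted(scored, key=scored.get, reverse=True)
    fixed = [kw for kw in ranked if scored[kw] >= threshold]
    optional = [kw for kw in ranked if scored[kw] < threshold]
    return fixed, optional

def display(primary, show_below_threshold=False):
    """Role of the interface unit (401) and the display units: request, then show."""
    fixed, optional = obtain_related(primary)
    # The first display unit always shows `fixed`; the optional second
    # display unit may append the keywords that fall below the threshold.
    return fixed + optional if show_below_threshold else fixed

print(display("bike"))
```

Whether the below-threshold keywords are shown at all is left open by the description, which is why the sketch exposes it as a flag.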

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

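Method and apparatus for generating update parameters, and method and apparatus for displaying related keywords
This application claims priority to Chinese Patent Application No. 200710095848.7, filed with the Chinese Patent Office on April 10, 2007 and entitled "Method and apparatus for generating update parameters, and method and apparatus for displaying related keywords", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of data processing, and in particular to a method and apparatus for generating update parameters and a method and apparatus for displaying related keywords.
Background
With the rapid growth of text and multimedia content on the Internet and other data networks and systems, users increasingly rely on keyword-based search tools to find the information they need. Typically, the user enters keywords for the information or files sought into a search tool or engine, which searches its existing index database and returns the results. Existing search tools or engines usually also display, on the current page or the results page, one or more related keywords associated with the keyword the user entered (the primary keyword).
Most users looking for information online start from a search engine, usually by entering keywords. As the pace of society quickens and cultures continue to collide and merge, many fixed keywords fall far short of covering the needs of today's users; in particular, as information expands rapidly, current keyword search cannot satisfy them. Because keyword content is fixed, was compiled early, and is updated slowly, it does not keep up with changes in online content. For example, for the primary keyword "clothes", the related keywords obtained by the prior art are typically "sportswear", "down jacket", and so on, whereas, as the seasons change, the related keywords users actually need may be "spring wear", "summer wear", "T-shirt", and so on. The keywords of the prior art clearly do not match users' usage trends.
It can be seen that keyword search in the prior art cannot meet users' needs; that is, the keywords offered do not follow the trend of user usage.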
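Summary of the Invention
The present invention provides a method and apparatus for generating update parameters, so that the keywords offered better match users' usage trends.
Correspondingly, another technical problem solved by the present invention is to provide a method and apparatus for displaying related keywords, so that users can obtain related keywords simply and comprehensively.
To solve the above problems, an embodiment of the present invention discloses a method for generating update parameters, including:
obtaining the search keywords used by users within a preset time period;
compiling statistics on the search keywords to obtain primary keywords, related keywords, the number of times a primary keyword and a related keyword are searched together, and the number of times the primary keyword is searched alone;
calculating a first feature value from the number of times the primary keyword is searched alone;
calculating a second feature value from the first feature value and the number of times the primary keyword and the related keyword are searched together, the second feature value being the update parameter that determines how the related keyword is displayed.
Preferably, the method further includes:
recording the primary keyword, the related keyword, and the second feature value to form a keyword information table.
Preferably, the search keyword statistics step, the step of calculating the first feature value, and the step of calculating the second feature value run as multi-threaded parallel operations.
Preferably, before the step of calculating the first feature value, the method further includes:
removing search keywords that match a filtering rule.
Preferably, the second feature value is calculated by the following steps:
calculating a relevance from the number of times the primary keyword and the related keyword are searched together;
obtaining the first feature value from a cache, and calculating the second feature value from the first feature value and the relevance.
Preferably, the keyword information table further includes:
the first feature value corresponding to the primary keyword.
Preferably, the search keywords include search keywords used by searching users and publishing keywords used by publishing users.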
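An embodiment of the present invention further discloses an apparatus for generating update parameters, including:
an obtaining unit, configured to obtain the search keywords used by users within a preset time period;
a statistics unit, configured to compile statistics on the search keywords to obtain primary keywords, related keywords, the number of times a primary keyword and a related keyword are searched together, and the number of times the primary keyword is searched alone;
a first calculation unit, configured to calculate a first feature value from the number of times the primary keyword is searched alone;
a second calculation unit, configured to calculate a second feature value from the first feature value and the number of times the primary keyword and the related keyword are searched together, the second feature value being the update parameter that determines how the related keyword is displayed.
Preferably, the apparatus further includes:
a recording unit, configured to record the primary keyword, the related keyword, and the second feature value to form a keyword information table.
Preferably, the apparatus further includes:
a filtering unit, connected to the statistics unit and configured to remove search keywords that match a filtering rule.
Preferably, the second calculation unit includes:
a relevance calculation subunit, configured to calculate a relevance from the number of times the primary keyword and the related keyword are searched together;
an obtaining and calculation subunit, configured to obtain the first feature value from a cache and calculate the second feature value from the first feature value and the relevance.
Preferably, the apparatus further includes:
an adding unit, connected to the recording unit and configured to record the corresponding first feature value in the keyword information table.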
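An embodiment of the present invention further discloses a method for displaying related keywords, including:
issuing, according to the primary keyword input by the user, a request to extract the corresponding related keywords;
obtaining, according to the request, the related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from a first feature value and the number of times the primary keyword and the related keyword are searched together, the first feature value is calculated from the number of times the primary keyword is searched alone, and both counts are obtained by compiling statistics on search keywords;
displaying the related keywords.
Preferably, the method further includes: displaying related keywords whose second feature value is less than the threshold.
Preferably, the related keywords whose second feature value is greater than or equal to the threshold are obtained from a keyword information table, the keyword information table including primary keywords, related keywords, and second feature values.
An embodiment of the present invention further discloses an apparatus for displaying related keywords, including:
an interface unit, configured to issue, according to the primary keyword input by the user, a request to extract the corresponding related keywords;
a related keyword obtaining unit, configured to obtain, according to the request, the related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from a first feature value and the number of times the primary keyword and the related keyword are searched together, the first feature value is calculated from the number of times the primary keyword is searched alone, and both counts are obtained by compiling statistics on search keywords;
a first display unit, configured to display the related keywords.
Preferably, the apparatus further includes:
a second display unit, connected to the related keyword obtaining unit and configured to display related keywords whose second feature value is less than the threshold.
As can be seen from the above solutions, the embodiments of the present invention compile statistics and perform calculations on the search keywords used by users within a preset time period, which helps keep the keywords timely; and because the second feature value serves as the parameter that determines how a related keyword is displayed, related keywords that match current usage trends are offered to the user first, giving the user a better operating experience.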
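Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of a method for generating update parameters according to the present invention;
FIG. 2 is a structural diagram of an embodiment of an apparatus for generating update parameters according to the present invention;
FIG. 3 is a flowchart of an embodiment of a method for displaying related keywords according to the present invention;
FIG. 4 is a structural diagram of an embodiment of an apparatus for displaying related keywords according to the present invention.
Detailed Description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.
The embodiments of the present invention update the association coefficient between a primary keyword and its related keywords and use it to control the output of those related keywords, so that the related keywords better match users' search needs.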
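Referring to FIG. 1, a flowchart of an embodiment of a method for generating update parameters according to the present invention is shown, which includes the following steps:
Step 101: obtain the search keywords used by users within a preset time period.
The preset time period may be set by a person skilled in the art as needed. For example, for a shopping website, the period may be preset to one week, one month, or the like, so that the product keywords follow users' usage trends. The search keywords may come from a database, a script, a local program, the user's input history, or a storage unit of a client, server, or other device; the present invention does not limit this.
Step 102: compile statistics on the search keywords to obtain primary keywords, related keywords, the number of times a primary keyword and a related keyword are searched together, and the number of times the primary keyword is searched alone.
Because the keyword content in the prior art is fixed, was compiled early, and is updated slowly, it does not reflect real-time changes in online content or how actively keywords are used, and therefore does not meet users' needs. Compiling statistics on primary keywords and related keywords ensures that users obtain keywords that match their usage trends. In practice, the primary keywords, the related keywords, and the two counts may be obtained by any existing method. For example, each search keyword is taken as a primary keyword, the keywords searched together with that primary keyword are taken as its related keywords, and then the number of times the primary keyword and each related keyword are searched together and the number of times the primary keyword is searched alone are counted.
A statistical method based on the Apriori algorithm is described below as an example. The basic procedure of the Apriori algorithm is: (1) scan the transaction database and find all items whose support is not less than the minimum support, i.e., the frequent itemset L1; (2) join the items in L1; (3) scan the transaction database and filter the sets obtained from L1 to find L2, the sets whose support is not less than the minimum support; (4) join L2; (5) scan the transaction database and filter the sets obtained from L2 to find L3, the sets whose support is not less than the minimum support; and so on. The search keywords obtained are shown in Table 1; the primary keywords and the number of times each is searched alone are shown in Table 2; the related keywords searched together with each primary keyword, and the number of times they are searched together, are shown in Table 3.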
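Table 1
Figure imgf000008_0001
Table 2
Primary keyword | Number of times searched alone
beer | 3
peanuts | 2
chewing gum | 2
napkins | 1
milk | 2
sugar | 1
Table 3
Primary keyword, related keyword | Number of times searched together
beer, peanuts | 1
beer, chewing gum | 2
beer, napkins | 1
beer, milk | 1
beer, sugar | 0
peanuts, chewing gum | 1
peanuts, napkins | 0
peanuts, milk | 0
peanuts, sugar | 0
chewing gum, napkins | 1
chewing gum, milk | 0
chewing gum, sugar | 0
napkins, milk | 0
napkins, sugar | 0
milk, sugar | 1
Once the statistics have been compiled according to the above rules, the corresponding primary keywords, related keywords, the number of times each primary keyword and related keyword are searched together, and the number of times each primary keyword is searched alone are obtained.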
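The level-wise counting described above can be illustrated with a small sketch. It is a simplified reading of the Apriori procedure, not the patent's implementation: the session data is hypothetical (Table 1 is only available as an image), "searched alone" is interpreted as the support count of a single keyword, and candidates are joined from the surviving items without Apriori's subset-pruning step, which makes no difference at the pair level.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: each set holds the keywords searched together by one user.
sessions = [
    {"beer", "chewing gum"},
    {"beer", "peanuts"},
    {"beer", "chewing gum", "napkins"},
    {"milk", "sugar"},
]

def frequent_itemsets(sessions, min_support=1, max_size=2):
    """Return [L1, L2, ...], each mapping a frozenset of keywords to its support."""
    levels = []
    # L1: single keywords whose support reaches the minimum support.
    counts = Counter(frozenset([kw]) for s in sessions for kw in s)
    current = {items: n for items, n in counts.items() if n >= min_support}
    size = 1
    while current and size <= max_size:
        levels.append(current)
        size += 1
        # Join step: build larger candidates only from keywords that survived.
        items = sorted(set().union(*current))
        candidates = [frozenset(c) for c in combinations(items, size)]
        # Scan step: count each candidate's support, then filter by minimum support.
        counts = Counter(c for s in sessions for c in candidates if c <= s)
        current = {c: n for c, n in counts.items() if n >= min_support}
    return levels

levels = frequent_itemsets(sessions)
print(levels[0][frozenset({"beer"})])                 # 3
print(levels[1][frozenset({"beer", "chewing gum"})])  # 2
```

Here `levels[0]` plays the role of Table 2 (single-keyword counts) and `levels[1]` the role of Table 3 (pair counts), except that pairs with zero support are simply absent.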
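Of course, the above method is only an example; a person skilled in the art may, based on experience or need, use any other method of mining association rules, and the present invention does not limit this.
Preferably, this embodiment may further include the step of removing search keywords that match a filtering rule. The filtering rule may be preset by a person skilled in the art based on experience or need. Suppose that, for the example above, the filtering rule removes the items for which the number of times the primary keyword and the related keyword are searched together is less than 2; the items that satisfy the condition are then as shown in Table 4.
Table 4
Figure imgf000009_0001
The filtering rule may also be set to remove search keywords that are illegal keywords, or that contain illegal characters or illegal words; the present invention does not limit this.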
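A filtering step of this kind might look like the sketch below. The minimum count of 2 follows the example above; the banned-word set and the character rule (letters, digits, underscores, and spaces only) are assumptions chosen purely for illustration.

```python
import re

# Assumed character rule: word characters and spaces only.
VALID_KEYWORD = re.compile(r"^[\w ]+$")

def filter_pairs(pair_counts, min_count=2, banned=frozenset()):
    """Keep pairs searched together at least min_count times and made of valid keywords."""
    kept = {}
    for (primary, related), count in pair_counts.items():
        if count < min_count:
            continue  # searched together too rarely
        if primary in banned or related in banned:
            continue  # illegal keyword
        if not (VALID_KEYWORD.match(primary) and VALID_KEYWORD.match(related)):
            continue  # contains illegal characters
        kept[(primary, related)] = count
    return kept

pair_counts = {
    ("beer", "chewing gum"): 2,
    ("beer", "peanuts"): 1,
    ("beer", "<script>"): 5,
}
print(filter_pairs(pair_counts))
```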
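Step 103: calculate a first feature value from the number of times the primary keyword is searched alone.
So that keyword trends better match users' search needs, the first feature value can be understood as a value expressing the popularity of a keyword. In that case, the first feature value can be obtained by comparing the number of times the primary keyword is searched alone with a preset popularity base. One formula for calculating the first feature value is: first feature value = number of times the primary keyword is searched alone / preset popularity base. Table 5 is used as an example.
Table 5
Primary keyword | Related keywords | Times primary and related keywords searched together | Times primary keyword searched alone
bike | E bike, city bike, bicycle | 2, 1, 1 | 2
e bike | bike, city bike, bicycle | 1, 1, 1 | 2
city bike | bike, e bike | 1, 1 | 1
In the table above, if the preset popularity base is 20, the first feature value of the keyword bike is 2/20 = 0.1, and the first feature value of the keyword e bike is 1/20 = 0.05. Preferably, the popularity base is the median of the numbers of times the primary keywords are searched alone. For example, if 10% of the primary keywords are searched alone 10 times, 80% are searched alone 20 times, and 10% are searched alone 50 times, 20 is taken as the popularity base. A person skilled in the art may preset it based on experience or need; the present invention does not limit this.
Of course, the first feature value and the method of calculating it may also be set by a person skilled in the art based on need or experience; the above method is only an example, and the present invention does not limit this.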
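As a rough sketch of the formula above, the first feature value can be computed with an explicit base or with the median as the default base. The function name and the sample data are assumptions; the counts 2 and 1 and the base 20 reproduce the arithmetic in the text.

```python
from statistics import median

def first_feature_values(single_counts, base=None):
    """first feature value = times searched alone / popularity base."""
    if base is None:
        # Preferred choice in the description: the median of the single-search counts.
        base = median(single_counts.values())
    return {kw: count / base for kw, count in single_counts.items()}

# Arithmetic from the text: 2/20 = 0.1 and 1/20 = 0.05.
values = first_feature_values({"bike": 2, "e bike": 1}, base=20)
print(values)

# Median example from the text: 10% at 10, 80% at 20, 10% at 50 gives a base of 20.
sample_counts = [10] * 1 + [20] * 8 + [50] * 1
print(median(sample_counts))
```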
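Step 104: calculate a second feature value from the first feature value and the number of times the primary keyword and the related keyword are searched together, the second feature value being the update parameter that determines how the related keyword is displayed.
Based on the above understanding of the embodiments, in practice the association coefficient between a primary keyword and a related keyword needs to be obtained. In this embodiment, the second feature value, calculated from the first feature value and the number of times the primary keyword and the related keyword are searched together, serves as one form of that association coefficient. So that the second feature value fully reflects keyword usage trends, in this embodiment the second feature value is preferably calculated by the following sub-steps:
Sub-step A1: calculate a relevance from the number of times the primary keyword and the related keyword are searched together;
Sub-step A2: obtain the first feature value from the cache, and calculate the second feature value from the first feature value and the relevance.
That is, the relevance is first calculated from the number of times the primary keyword and the related keyword are searched together, where one formula for calculating the relevance is:
relevance = number of times the primary keyword and the related keyword are searched together / preset relevance base
According to the data in Table 5, assuming the preset relevance base is 10, the relevance of the primary keyword bike and the related keyword e bike is the number of times bike and e bike appear together in the search-relevance intermediate table divided by the base, i.e., 2/10 = 0.2. Preferably, the relevance base is the median of the numbers of times primary keywords and related keywords are searched together. A person skilled in the art may preset it based on experience or need; the present invention does not limit this.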
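To improve calculation efficiency, in this embodiment the first feature value may be stored in a cache; when the second feature value is calculated, the corresponding first feature value is read directly from the cache, and the second feature value is calculated from it and the relevance. Reading data from a cache is much faster than reading it from a database or another device, so this preferred embodiment gives the invention better computing performance. The values may be stored in the cache as a hash table, as a file, or in some other form; to make the first feature value easier to retrieve, optimizations such as sorting the primary keywords in ascending order may also be applied. The present invention does not limit the optimization method.
To obtain a second feature value that more accurately matches users' needs, the first feature value and the relevance are preferably each weighted, and the weighted result is taken as the second feature value. Continuing the earlier example, suppose the first feature value of the keyword e bike is 0.05 with a weight of 0.4, and the relevance of the primary keyword bike and the related keyword e bike is 0.2 with a weight of 0.6; the second feature value of the primary keyword bike and the related keyword e bike is then 0.05*0.4 + 0.2*0.6 = 0.14.
The weights may be preset by a person skilled in the art based on experience or need, and may also be changed freely according to users' needs; the present invention does not limit this. To keep the calculation results consistent, the weights may be set to sum to 1, or to another value.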
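The cached lookup and the weighted sum can be sketched together as follows. A plain dictionary stands in for the hash-table cache, and the function and variable names are assumptions; the numbers (base 10, weights 0.4 and 0.6, first feature value 0.05) are the ones used in the worked example.

```python
# Hypothetical in-memory cache of first feature values (a dict acts as the hash table).
first_feature_cache = {"bike": 0.1, "e bike": 0.05}

def relevance(co_count, relevance_base=10):
    """relevance = times searched together / relevance base."""
    return co_count / relevance_base

def second_feature_value(keyword, co_count, w_first=0.4, w_relevance=0.6,
                         relevance_base=10, cache=first_feature_cache):
    """Weighted sum of the cached first feature value and the relevance."""
    first = cache[keyword]  # read from the cache instead of recomputing
    return w_first * first + w_relevance * relevance(co_count, relevance_base)

# bike and e bike searched together twice: 0.05*0.4 + 0.2*0.6
value = second_feature_value("e bike", co_count=2)
print(round(value, 2))  # 0.14
```

Following the worked example, the first feature value looked up here is the one given for e bike; a different pairing of keyword and cache key would be an equally plausible reading of the text.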
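Of course, other methods of calculating the second feature value adopted by a person skilled in the art are also feasible, and the present invention does not limit this.
The second feature value is the update parameter that determines how the related keyword is displayed. For example, related keywords whose second feature value is greater than or equal to a certain threshold are displayed preferentially or permanently, and related keywords whose second feature value is less than the threshold are displayed in rotation or not displayed. How related keywords are displayed according to the second feature value may be set freely by a person skilled in the art based on need or experience; the present invention does not limit this.
Preferably, the search keyword statistics step, the first feature value calculation, and the second feature value calculation step run as multi-threaded parallel operations, which effectively improves the computing performance and efficiency of the system.
Multithreading is a mechanism that allows multiple instruction streams to execute concurrently within a program; each instruction stream is called a thread, and the threads are independent of one another. The threads execute concurrently, that is, logically "at the same time"; in other words, multi-threaded operation means that N execution bodies exist at once and work together along several different lines of execution. In the embodiments of the present invention, the thread that compiles search keyword statistics and the threads that calculate feature values (both the first and the second feature values) run in parallel and process the keywords in a loop, which effectively improves the computing performance and efficiency of the embodiments.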
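One possible reading of this arrangement is a small producer and consumer pipeline: a statistics thread emits keyword counts while a feature-value thread consumes them. The sketch below is an assumption about the structure, not the patent's design; the queue, the sentinel, and the function names are all illustrative. In CPython, threads mainly help when the work is I/O-bound, so any speed-up for pure computation is not guaranteed.

```python
import queue
import threading
from collections import Counter

def run_pipeline(sessions, popularity_base=20):
    counted = queue.Queue()
    results = {}

    def statistics_worker():
        # Statistics thread: count each keyword, then hand the counts over.
        counts = Counter(kw for keywords in sessions for kw in keywords)
        for kw, n in counts.items():
            counted.put((kw, n))
        counted.put(None)  # sentinel: no more work

    def feature_worker():
        # Feature-value thread: loop over incoming keywords until the sentinel.
        while True:
            item = counted.get()
            if item is None:
                break
            kw, n = item
            results[kw] = n / popularity_base

    threads = [threading.Thread(target=statistics_worker),
               threading.Thread(target=feature_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run_pipeline([["bike", "e bike"], ["bike"]]))
```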
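Preferably, the embodiments of the present invention may further include the step of recording the primary keyword, the related keyword, and the second feature value. The recording may be done in a table, in a file, or in any other way. More preferably, the primary keyword, the related keyword, and the second feature value are recorded to form a keyword information table; at the next update, it is only necessary to clear the data in the keyword information table and fill in the data again according to the method described in the embodiments. The update may be periodic, real-time, or an alternation of the two, for example one update per month, or it may be triggered at will by a person skilled in the art; the present invention does not limit this.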
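The clear-and-refill update of the keyword information table can be sketched with an in-memory SQLite database. The table name, the column names, and the 0.08 score are assumptions for illustration; 0.14 and 0.12 are the second feature values from the worked examples.

```python
import sqlite3

def refresh_keyword_table(conn, rows):
    """Replace the table contents with fresh (primary, related, second value) rows."""
    with conn:  # one transaction: clear the old data, then refill
        conn.execute(
            "CREATE TABLE IF NOT EXISTS keyword_info ("
            "primary_kw TEXT, related_kw TEXT, second_value REAL, "
            "PRIMARY KEY (primary_kw, related_kw))"
        )
        conn.execute("DELETE FROM keyword_info")
        conn.executemany("INSERT INTO keyword_info VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
refresh_keyword_table(conn, [("bike", "e bike", 0.14)])
# Next update cycle: old rows are cleared and the new values written.
refresh_keyword_table(conn, [("bike", "e bike", 0.12), ("bike", "city bike", 0.08)])
print(conn.execute("SELECT COUNT(*) FROM keyword_info").fetchone()[0])  # 2
```

Doing the delete and the inserts in one transaction means readers never see a half-filled table, which is one way to make a periodic refresh safe.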
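To provide a more intuitive display, so that users receive direct search hints while searching, the present invention may preferably further include the step of recording the corresponding first feature value in the keyword information table, which effectively makes the search tool more intelligent.
In practice, different keywords may need to be provided to different categories of users. For example, the users of a shopping website usually include buyers and sellers; in that case, the search keywords may include search keywords used by searching users and publishing keywords used by publishing users.
To help a person skilled in the art better understand the present invention, the case in which the search keywords include search keywords used by searching users (first search keywords) and publishing keywords used by publishing users (second search keywords) is described in detail below as an example: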
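Step A: obtain the first search keywords from a first script for the preset time period. The first search keywords are the keywords a user searches with between opening and closing the browser. For example, if during one browser session the user searches several times in the search box and enters several keywords, those keywords are the first search keywords obtained in this example. Compiling statistics on the first search keywords yields the first primary keywords, the first related keywords, the number of times a first primary keyword and a first related keyword are searched together, and the number of times the first primary keyword is searched alone, as shown in Table 6:
Figure imgf000013_0001
Figure imgf000013_0002
Step B: obtain the second search keywords from a second script for the preset time period. The second search keywords are the keywords a user enters when publishing a product; because three or more keywords can usually be published, those keywords are obtained. The second search keywords yield the second primary keywords, the second related keywords, the number of times a second primary keyword and a second related keyword are searched together, and the number of times the second primary keyword is searched alone, as shown in Table 7:
Table 7
Figure imgf000013_0003
Step C: calculate the first feature value.
With a preset popularity base of 20, the popularity of bike among the first primary keywords is 2/20 = 0.1, and the popularity of e bike among the first related keywords is 1/20 = 0.05; the first feature values of the second primary keywords and related keywords are the same as above.
Step D: calculate the relevance.
With a preset relevance base of 10, the first relevance of the primary keyword bike and the related keyword e bike is 2/10 = 0.2 and the second relevance is 1/10 = 0.1. Taking empirical weights of 0.2 for the first feature value, 0.3 for the first relevance, and 0.5 for the second relevance, the second feature value of the primary keyword bike and the related keyword e bike is 0.2*0.05 + 0.3*0.2 + 0.5*0.1 = 0.12.
In the above calculation, because the first keywords and the second keywords have the same first feature value, only one first feature value is used, to improve calculation efficiency. Using both first feature values, each with its own weight, would give the same result.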
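The two-source calculation generalizes to a weighted sum over any number of terms. The sketch below reproduces the arithmetic above and also checks the closing remark that splitting the first-feature weight across two equal first feature values leaves the result unchanged. The function name is an assumption, as is the even 0.1/0.1 split of the 0.2 weight.

```python
def weighted_score(terms):
    """Sum value * weight over (value, weight) pairs."""
    return sum(value * weight for value, weight in terms)

first_value = 1 / 20         # first feature value of e bike
search_relevance = 2 / 10    # first relevance (searching users)
publish_relevance = 1 / 10   # second relevance (publishing users)

# One shared first feature value, weights 0.2 / 0.3 / 0.5.
single = weighted_score([
    (first_value, 0.2), (search_relevance, 0.3), (publish_relevance, 0.5),
])

# Two identical first feature values sharing the same total weight of 0.2.
split = weighted_score([
    (first_value, 0.1), (first_value, 0.1),
    (search_relevance, 0.3), (publish_relevance, 0.5),
])

print(round(single, 2), round(split, 2))  # 0.12 0.12
```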
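Step E: record the primary keyword, the related keyword, and the second feature value to form the keyword information table shown in Table 8:
Table 8
Figure imgf000014_0001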
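Referring to FIG. 2, a structural block diagram of an embodiment of an apparatus for generating update parameters according to the present invention is shown, including:
an obtaining unit 201, configured to obtain the search keywords used by users within a preset time period;
a statistics unit 202, configured to compile statistics on the search keywords to obtain primary keywords, related keywords, the number of times a primary keyword and a related keyword are searched together, and the number of times the primary keyword is searched alone;
a first calculation unit 203, configured to calculate a first feature value from the number of times the primary keyword is searched alone;
a second calculation unit 204, configured to calculate a second feature value from the first feature value and the number of times the primary keyword and the related keyword are searched together, the second feature value being the update parameter that determines how the related keyword is displayed.
Preferably, the apparatus may further include a recording unit, configured to record the primary keyword, the related keyword, and the second feature value to form a keyword information table.
Preferably, the statistics unit, the first calculation unit, and the second calculation unit are configured to run as multi-threaded parallel operations.
Preferably, the apparatus may further include a filtering unit, configured to remove search keywords that match a filtering rule.
Preferably, the second calculation unit includes:
a relevance calculation subunit, configured to calculate a relevance from the number of times the primary keyword and the related keyword are searched together;
an obtaining and calculation subunit, configured to obtain the first feature value from a cache and calculate the second feature value from the first feature value and the relevance.
Preferably, the apparatus may further include an adding unit, configured to record the corresponding first feature value in the keyword information table.
Preferably, the search keywords include search keywords used by searching users and publishing keywords used by publishing users.
Because the apparatus embodiment for generating update parameters shown in FIG. 2 corresponds to the method embodiment for generating update parameters described above, its description is relatively brief; for details, refer to the corresponding parts earlier in this specification.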
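Referring to FIG. 3, a method for displaying related keywords according to the present invention is shown, including the following steps:
Step 301: issue, according to the primary keyword input by the user, a request to extract the corresponding related keywords;
Step 302: obtain, according to the request, the related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from a first feature value and the number of times the primary keyword and the related keyword are searched together, the first feature value is calculated from the number of times the primary keyword is searched alone, and both counts are obtained by compiling statistics on search keywords;
Step 303: display the related keywords.
When using a search tool or search engine, the user enters a primary keyword in the search bar or toolbar with an input device such as a keyboard or handwriting tablet, and then, by clicking OK, pressing Enter or TAB, or some other trigger, causes a local program or a script on the search page to issue a request to extract the related keywords corresponding to the primary keyword.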
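In practice, the second feature value can be understood as one form of association coefficient. So that the second feature value fully reflects keyword usage trends, in this embodiment the relevance is preferably first calculated from the number of times the primary keyword and the related keyword are searched together. For example, one formula for calculating the relevance is:
relevance = number of times the primary keyword and the related keyword are searched together / preset relevance base
Preferably, the relevance base is the median of the numbers of times primary keywords and related keywords are searched together. A person skilled in the art may preset it based on experience or need; the present invention does not limit this.
The result calculated from the first feature value and the relevance serves as the second feature value.
As another embodiment, the first feature value and the relevance may each be weighted, and the weighted result taken as the second feature value. The weights may be preset by a person skilled in the art based on experience or need, and may also be changed freely according to users' needs; the present invention does not limit this. To keep the calculation results consistent, the weights may be set to sum to 1.
Of course, other methods of calculating the second feature value adopted by a person skilled in the art are also feasible, and the present invention does not limit this.
In practice, the related keywords whose second feature value is greater than or equal to a certain threshold can be understood as the related keywords that are always displayed whenever related keywords are displayed for that primary keyword. Because their second feature value is greater than or equal to the threshold, they are closely associated with the primary keyword; the user therefore receives strong recommendations every time related keywords are displayed, recommendations that are closer to the user's habits and usage trends, which gives the user a better operating experience. For example, for the primary keyword bike, the related keywords whose second feature value is greater than or equal to a certain threshold (such as 0.2) are: electric bike, mountain bike, e bike, e bicycle, suspension bike, scooter, motorcycle, electric scooter, gas scooter, vehicle. For a search tool that displays a fixed set of related keywords, these 10 related keywords always appear on the related page for each search of the primary keyword bike; for a search tool that displays related keywords in rotation, these 10 related keywords are always present in every related-keyword group table, so that when the user issues a request to extract the related keywords for bike, whichever group table the request count corresponds to, these 10 related keywords appear in the group table presented to the user.
Preferably, related keywords whose second feature value is less than the threshold may be displayed according to any rule, or not displayed. For example, for a search tool such as GOOGLE that displays only a fixed number of related keywords, only a fixed number of related keywords whose second feature value is greater than or equal to the threshold may be displayed, and related keywords whose second feature value is less than the threshold are not displayed; for search tools that can display related keywords in rotation or display all of them, all related keywords may be displayed according to any rule. The present invention does not limit this.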
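The rotation behaviour described above, where high-scoring keywords sit in every group table and the rest rotate, could be sketched as follows. The grouping scheme, the group size, and all scores are assumptions for illustration; only the 0.2 threshold comes from the text.

```python
def build_groups(scores, threshold=0.2, rotating_per_group=2):
    """Build related-keyword group tables: fixed keywords appear in every group."""
    fixed = [kw for kw, v in scores.items() if v >= threshold]
    rotating = [kw for kw, v in scores.items() if v < threshold]
    groups = [
        fixed + rotating[i:i + rotating_per_group]
        for i in range(0, len(rotating), rotating_per_group)
    ]
    return groups or [fixed]  # no rotating keywords: a single group of fixed ones

def group_for_request(groups, request_number):
    """Pick the group table that corresponds to the n-th request."""
    return groups[request_number % len(groups)]

# Hypothetical second feature values for the primary keyword bike.
scores = {"electric bike": 0.3, "scooter": 0.2, "helmet": 0.1, "pump": 0.05, "lock": 0.02}
groups = build_groups(scores)
print(group_for_request(groups, 0))
print(group_for_request(groups, 3))
```

Whichever group a request maps to, the keywords at or above the threshold are present, which is the property the description asks for.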
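Referring to FIG. 4, a structural block diagram of an embodiment of an apparatus for displaying related keywords according to the present invention is shown, including:
an interface unit 401, configured to issue, according to the primary keyword input by the user, a request to extract the corresponding related keywords;
a related keyword obtaining unit 402, configured to obtain, according to the request, the related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from a first feature value and the number of times the primary keyword and the related keyword are searched together, the first feature value is calculated from the number of times the primary keyword is searched alone, and both counts are obtained by compiling statistics on search keywords;
a first display unit 403, configured to display the related keywords.
Preferably, the apparatus of this embodiment may further include a second display unit, configured to display related keywords whose second feature value is less than the threshold.
Because the apparatus embodiment for displaying related keywords shown in FIG. 4 corresponds to the method embodiment for displaying related keywords shown in FIG. 3, its description is relatively brief; for details, refer to the corresponding parts earlier in this specification.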
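It can be seen that the embodiments of the present invention compile statistics and perform calculations on the search keywords used by users within a preset time period, which helps keep the keywords timely; and because the second feature value serves as the parameter that determines how a related keyword is displayed, related keywords that match current usage trends are offered to the user first, giving the user a better operating experience.
Second, by establishing a keyword information table, the present invention only needs to update the data in the table at update time, which improves system processing efficiency.
Third, the search keyword statistics step and the feature value calculation steps of the present invention run as multi-threaded parallel operations, which effectively improves the computing performance and efficiency of the system.
In addition, by storing the first feature value in a cache, the present invention further improves the computing performance and efficiency of the system; the present invention also records the first feature value and, when the primary keyword is displayed, provides the first feature value to the user as a reference.
Finally, the embodiments of the present invention require no special confidential algorithm on the part of the service provider, are simple to implement, and have a low development cost.
The method and apparatus for generating update parameters and the method and apparatus for displaying related keywords provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method and apparatus of the present invention. A person of ordinary skill in the art may, following the ideas of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.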

Claims

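1. A method for generating update parameters, characterized by comprising:
obtaining the search keywords used by users within a preset time period;
compiling statistics on the search keywords to obtain primary keywords, related keywords, the number of times a primary keyword and a related keyword are searched together, and the number of times the primary keyword is searched alone;
calculating a first feature value from the number of times the primary keyword is searched alone;
calculating a second feature value from the first feature value and the number of times the primary keyword and the related keyword are searched together, the second feature value being the update parameter that determines how the related keyword is displayed.
2. The method according to claim 1, characterized in that the method further comprises:
recording the primary keyword, the related keyword, and the second feature value to form a keyword information table.
3. The method according to claim 1, characterized in that the search keyword statistics step, the step of calculating the first feature value, and the step of calculating the second feature value are multi-threaded parallel operations.
4. The method according to claim 1, 2 or 3, characterized in that, before the step of calculating the first feature value, the method further comprises:
removing search keywords that match a filtering rule.
5. The method according to claim 1, characterized in that the second feature value is calculated by the following steps:
calculating a relevance from the number of times the primary keyword and the related keyword are searched together;
obtaining the first feature value from a cache, and calculating the second feature value from the first feature value and the relevance.
6. The method according to claim 2, characterized in that the keyword information table further includes:
the first feature value corresponding to the primary keyword.
7. The method according to claim 1, 2, 3, 5 or 6, characterized in that the search keywords include search keywords used by searching users and publishing keywords used by publishing users.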
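8. An apparatus for generating update parameters, characterized by comprising:
an obtaining unit, configured to obtain the search keywords used by users within a preset time period;
a statistics unit, configured to compile statistics on the search keywords to obtain primary keywords, related keywords, the number of times a primary keyword and a related keyword are searched together, and the number of times the primary keyword is searched alone;
a first calculation unit, configured to calculate a first feature value from the number of times the primary keyword is searched alone;
a second calculation unit, configured to calculate a second feature value from the first feature value and the number of times the primary keyword and the related keyword are searched together, the second feature value being the update parameter that determines how the related keyword is displayed.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
a recording unit, configured to record the primary keyword, the related keyword, and the second feature value to form a keyword information table.
10. The apparatus according to claim 8 or 9, characterized in that the apparatus further comprises:
a filtering unit, connected to the statistics unit and configured to remove search keywords that match a filtering rule.
11. The apparatus according to claim 8, characterized in that the second calculation unit comprises:
a relevance calculation subunit, configured to calculate a relevance from the number of times the primary keyword and the related keyword are searched together;
an obtaining and calculation subunit, configured to obtain the first feature value from a cache and calculate the second feature value from the first feature value and the relevance.
12. The apparatus according to claim 9, characterized in that the apparatus further comprises:
an adding unit, connected to the recording unit and configured to record the corresponding first feature value in the keyword information table.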
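13. A method for displaying related keywords, characterized by comprising:
issuing, according to the primary keyword input by the user, a request to extract the corresponding related keywords;
obtaining, according to the request, the related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from a first feature value and the number of times the primary keyword and the related keyword are searched together, the first feature value is calculated from the number of times the primary keyword is searched alone, and both counts are obtained by compiling statistics on search keywords;
displaying the related keywords.
14. The method according to claim 13, characterized in that the method further comprises:
displaying related keywords whose second feature value is less than the threshold.
15. The method according to claim 13, characterized in that obtaining the related keywords whose second feature value is greater than or equal to the threshold comprises: obtaining, from a keyword information table, the related keywords whose second feature value is greater than or equal to the threshold, the keyword information table including primary keywords, related keywords, and second feature values.
16. An apparatus for displaying related keywords, characterized by comprising:
an interface unit, configured to issue, according to the primary keyword input by the user, a request to extract the corresponding related keywords;
a related keyword obtaining unit, configured to obtain, according to the request, the related keywords whose second feature value is greater than or equal to a certain threshold, where the second feature value is calculated from a first feature value and the number of times the primary keyword and the related keyword are searched together, the first feature value is calculated from the number of times the primary keyword is searched alone, and both counts are obtained by compiling statistics on search keywords;
a first display unit, configured to display the related keywords.
17. The apparatus according to claim 16, characterized in that the apparatus further comprises:
a second display unit, connected to the related keyword obtaining unit and configured to display related keywords whose second feature value is less than the threshold.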
PCT/CN2007/070573 2007-04-10 2007-08-28 Procédé et dispositif pour la mise à jour de paramètres, et procédé et dispositif pour afficher un mot-clé associé Ceased WO2008122181A1 (fr)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US12/594,930 US8676811B2 (en) 2007-04-10 2007-08-28 Method and apparatus of generating update parameters and displaying correlated keywords
EP07801005A EP2136302A4 (en) 2007-04-10 2007-08-28 METHOD AND DEVICE FOR UPDATING PARAMETERS, AND METHOD AND DEVICE FOR DISPLAYING A KEYWORD THEREOF
JP2010502405A JP5238800B2 (ja) 2007-04-10 2007-08-28 更新パラメータを生成および相関するキーワードを表示するための方法および装置
US14/160,364 US8874588B2 (en) 2007-04-10 2014-01-21 Method and apparatus of generating update parameters and displaying correlated keywords
US14/496,256 US9135370B2 (en) 2007-04-10 2014-09-25 Method and apparatus of generating update parameters and displaying correlated keywords

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200710095848.7 2007-04-10
CN2007100958487A CN101286150B (zh) 2007-04-10 2007-04-10 生成更新参数的方法和装置、展示相关关键词的方法和装置

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US12/594,930 A-371-Of-International US8676811B2 (en) 2007-04-10 2007-08-28 Method and apparatus of generating update parameters and displaying correlated keywords
US14/160,364 Continuation US8874588B2 (en) 2007-04-10 2014-01-21 Method and apparatus of generating update parameters and displaying correlated keywords

Publications (1)

Publication Number Publication Date
WO2008122181A1 true WO2008122181A1 (fr) 2008-10-16

Family

ID=39830463

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/070573 Ceased WO2008122181A1 (fr) 2007-04-10 2007-08-28 Procédé et dispositif pour la mise à jour de paramètres, et procédé et dispositif pour afficher un mot-clé associé

Country Status (5)

Country Link
US (3) US8676811B2 (zh)
EP (1) EP2136302A4 (zh)
JP (2) JP5238800B2 (zh)
CN (1) CN101286150B (zh)
WO (1) WO2008122181A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191119A (zh) * 2019-12-16 2020-05-22 绍兴市上虞区理工高等研究院 一种基于神经网络的科技成果自学习方法及装置

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286150B (zh) 2007-04-10 2010-09-15 阿里巴巴集团控股有限公司 生成更新参数的方法和装置、展示相关关键词的方法和装置
CN102033886B (zh) * 2009-09-25 2012-07-04 香港纺织及成衣研发中心 一种织物检索方法及系统
US9262482B2 (en) * 2010-04-19 2016-02-16 Facebook, Inc. Generating default search queries on online social networks
CN102654862B (zh) * 2011-03-01 2016-02-17 腾讯科技(深圳)有限公司 信息相关性分析方法和装置
TW201322022A (zh) 2011-11-24 2013-06-01 Alibaba Group Holding Ltd 分散式資料流處理方法及其系統
CN103455487B (zh) * 2012-05-29 2018-07-06 腾讯科技(深圳)有限公司 一种搜索词的提取方法及装置
US20160306887A1 (en) * 2013-12-03 2016-10-20 Beijing Qihoo Technology Company Limited Methods, apparatuses and systems for linked and personalized extended search
JP6114707B2 (ja) * 2014-02-28 2017-04-12 富士フイルム株式会社 商品検索装置、商品検索システム、サーバシステム及び商品検索方法
JP5847867B2 (ja) * 2014-03-14 2016-01-27 ヤフー株式会社 広告主向け参考情報提供装置
US9703859B2 (en) * 2014-08-27 2017-07-11 Facebook, Inc. Keyword search queries on online social networks
US9754037B2 (en) 2014-08-27 2017-09-05 Facebook, Inc. Blending by query classification on online social networks
US9990441B2 (en) * 2014-12-05 2018-06-05 Facebook, Inc. Suggested keywords for searching content on online social networks
CN105302879B (zh) * 2015-10-12 2019-03-08 百度在线网络技术(北京)有限公司 用于确定用户需求的方法与装置
CN105302894A (zh) * 2015-10-21 2016-02-03 中国石油大学(华东) 一种基于并行关联规则的舆情热点跟踪方法与跟踪装置
CN108845992B (zh) * 2015-10-30 2022-08-26 上海智臻智能网络科技股份有限公司 计算机可读存储介质及问答交互方法
CN107291707A (zh) * 2016-03-30 2017-10-24 阿里巴巴集团控股有限公司 确定词组关联度的方法、品牌竞争度的方法及其装置
CN106970952A (zh) * 2017-03-09 2017-07-21 浙江中诚工程管理科技有限公司 报表生成方法及系统
CN107330672B (zh) * 2017-07-03 2021-02-26 北京拉勾科技有限公司 一种基于相似度的信息处理方法、装置及计算设备
EP3679488B1 (en) * 2017-09-08 2024-08-28 Open Text SA ULC System and method for recommendation of terms, including recommendation of search terms in a search system
CN110489652B (zh) * 2019-08-23 2022-06-03 重庆邮电大学 基于用户行为检测的新闻推荐方法、系统及计算机设备
CN111324804B (zh) * 2020-02-21 2023-09-22 抖音视界有限公司 搜索关键词推荐模型生成方法、关键词推荐方法与装置
US11016980B1 (en) * 2020-11-20 2021-05-25 Coupang Corp. Systems and method for generating search terms
CN112835919B (zh) * 2021-02-24 2022-04-26 武汉联影医疗科技有限公司 医学数据库更新方法、装置、计算机设备和存储介质
CN113221004A (zh) * 2021-05-21 2021-08-06 珠海金山网络游戏科技有限公司 一种关键词展示方法及装置
CN117370340A (zh) * 2023-09-26 2024-01-09 重庆赛力斯新能源汽车设计院有限公司 一种数据处理方法、装置、电子设备及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1299488A (zh) * 1998-03-16 2001-06-13 Nbci新西兰有限责任合伙公司 改进的搜索引擎
JP2003323433A (ja) * 2002-05-07 2003-11-14 Hitachi Ltd 情報端末およびサーバ
CN1922605A (zh) * 2003-12-26 2007-02-28 松下电器产业株式会社 辞典制作装置以及辞典制作方法

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06314296A (ja) * 1993-03-02 1994-11-08 Fujitsu Ltd 情報検索システム
JPH0756948A (ja) * 1993-08-09 1995-03-03 Fuji Xerox Co Ltd 情報検索装置
JPH09319767A (ja) 1996-05-29 1997-12-12 Oki Electric Ind Co Ltd 類義語辞書登録方法
US6006225A (en) 1998-06-15 1999-12-21 Amazon.Com Refining search queries by the suggestion of correlated terms from prior searches
US6144958A (en) 1998-07-15 2000-11-07 Amazon.Com, Inc. System and method for correcting spelling errors in search queries
US6523028B1 (en) * 1998-12-03 2003-02-18 Lockhead Martin Corporation Method and system for universal querying of distributed databases
JP2002007450A (ja) * 2000-06-16 2002-01-11 Matsushita Electric Works Ltd 検索支援システム
JP3563682B2 (ja) 2000-09-12 2004-09-08 日本電信電話株式会社 次検索候補単語提示方法および装置と次検索候補単語提示プログラムを記録した記録媒体
WO2003075186A1 (en) * 2002-03-01 2003-09-12 Paul Jeffrey Krupin A method and system for creating improved search queries
US7853557B2 (en) * 2002-06-14 2010-12-14 Siebel Systems, Inc. Method and computer for responding to a query according to the language used
US7240049B2 (en) * 2003-11-12 2007-07-03 Yahoo! Inc. Systems and methods for search query processing using trend analysis
US8572233B2 (en) * 2004-07-15 2013-10-29 Hewlett-Packard Development Company, L.P. Method and system for site path evaluation using web session clustering
US7440947B2 (en) 2004-11-12 2008-10-21 Fuji Xerox Co., Ltd. System and method for identifying query-relevant keywords in documents with latent semantic analysis
US7406465B2 (en) 2004-12-14 2008-07-29 Yahoo! Inc. System and methods for ranking the relative value of terms in a multi-term search query using deletion prediction
EP1866738A4 (en) * 2005-03-18 2010-09-15 Search Engine Technologies Llc SEARCH ENGINE THAT APPLIES USER REPORTS TO IMPROVE SEARCH RESULTS
US8135728B2 (en) 2005-03-24 2012-03-13 Microsoft Corporation Web document keyword and phrase extraction
JP4774806B2 (ja) * 2005-05-25 2011-09-14 セイコーエプソン株式会社 ファイル検索装置、印刷装置、ファイル検索方法及びそのプログラム
US7739708B2 (en) 2005-07-29 2010-06-15 Yahoo! Inc. System and method for revenue based advertisement placement
US7558787B2 (en) * 2006-07-05 2009-07-07 Yahoo! Inc. Automatic relevance and variety checking for web and vertical search engines
US7779003B2 (en) 2006-07-17 2010-08-17 Siemens Medical Solutions Usa, Inc. Computerized search system for medication and other items
US7937403B2 (en) * 2006-10-30 2011-05-03 Yahoo! Inc. Time-based analysis of related keyword searching
WO2008083211A1 (en) 2006-12-29 2008-07-10 Thomson Reuters Global Resources Information-retrieval systems, methods, and software with concept-based searching and ranking
US8285745B2 (en) 2007-03-01 2012-10-09 Microsoft Corporation User query mining for advertising matching
CN101286150B (zh) 2007-04-10 2010-09-15 阿里巴巴集团控股有限公司 生成更新参数的方法和装置、展示相关关键词的方法和装置
CN101477542B (zh) * 2009-01-22 2013-02-13 阿里巴巴集团控股有限公司 一种抽样分析方法、系统和设备

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2136302A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191119A (zh) * 2019-12-16 2020-05-22 绍兴市上虞区理工高等研究院 一种基于神经网络的科技成果自学习方法及装置
CN111191119B (zh) * 2019-12-16 2023-12-12 绍兴市上虞区理工高等研究院 一种基于神经网络的科技成果自学习方法及装置

Also Published As

Publication number Publication date
US20150106391A1 (en) 2015-04-16
US9135370B2 (en) 2015-09-15
CN101286150A (zh) 2008-10-15
US20100121860A1 (en) 2010-05-13
US8874588B2 (en) 2014-10-28
JP6066077B2 (ja) 2017-01-25
CN101286150B (zh) 2010-09-15
JP5238800B2 (ja) 2013-07-17
HK1120897A1 (zh) 2009-04-09
EP2136302A4 (en) 2010-11-24
JP2010524099A (ja) 2010-07-15
JP2013152744A (ja) 2013-08-08
EP2136302A1 (en) 2009-12-23
US20140136551A1 (en) 2014-05-15
US8676811B2 (en) 2014-03-18

Similar Documents

Publication Publication Date Title
WO2008122181A1 (fr) Procédé et dispositif pour la mise à jour de paramètres, et procédé et dispositif pour afficher un mot-clé associé
KR100699977B1 (ko) 데이터베이스 검색 시스템에서 관련 검색을 식별하기 위한방법 및 장치
CA2601768C (en) Search engine that applies feedback from users to improve search results
CN101276361B (zh) 一种显示相关关键词的方法及系统
US8583633B2 (en) Using reputation measures to improve search relevance
CN112632359A (zh) 信息推荐方法、装置、电子设备和存储介质
CN102930054A (zh) 数据搜索方法及系统
CN101031915A (zh) 利用基于用户信息和情境自动生成的链接的增强的文档浏览
CN103034680B (zh) 针对终端设备的数据交互方法及装置
US20150215271A1 (en) Generating suggested domain names by locking slds, tokens and tlds
CN102915380A (zh) 用于对数据进行搜索的方法和系统
CN101124081A (zh) 基于信誉的搜索
CN103020128B (zh) 与终端设备交互数据的方法与装置
GB2509766A (en) Website analysis
CN112579854A (zh) 信息处理方法、装置、设备和存储介质
US20150347423A1 (en) Methods for completing a user search
CN101546308A (zh) 一种基于检索过期的网页搜索方法及其系统
CN103617278A (zh) 一种地址栏搜索的控制方法及装置
CN111611491A (zh) 搜索词推荐方法、装置、设备及可读存储介质
JP5072792B2 (ja) 情報量に応じたページを優先的に表示する検索方法、プログラム及びサーバ
CN112016017A (zh) 确定特征数据的方法和装置
WO2018201596A1 (zh) 密码输入方法、装置、计算机可读存储介质和终端设备
HK1120897B (zh) 生成更新参数的方法和装置、展示相关关键词的方法和装置
JP2013196315A (ja) 情報処理装置及び方法
JP4534690B2 (ja) 文書検索装置および方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07801005

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2007801005

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2010502405

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 12594930

Country of ref document: US