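WO2011052526A1 - Specific content determination program, specific content determination device, specific content determination method, recording medium, content generation device, and related content insertion device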
- Publication number
- WO2011052526A1 (PCT/JP2010/068820)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- content
- web page
- specific
- constituting
- blog
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/40—Business processes related to social networking or social networking services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
Definitions
- the present invention relates to a technical field for extracting contents constituting a Web page.
- Non-Patent Document 1 discloses a technology that, when a user specifies the URL of image data, acquires the image data corresponding to that URL from the Web and automatically creates a banner based on the acquired image data.
- content according to the purpose of a Web site is posted on each Web page constituting the Web site.
- the details of the Web pages constituting a Web site are basically related to each other, but each page may have its own characteristics.
- what determines the details of a Web page is the content (for example, text data, image data, etc.) that constitutes it. Therefore, among the content constituting a Web page, there may be content that characterizes the page, that is, content specific to the Web page.
- the technology of Non-Patent Document 1 extracts content specific to a Web page, but it does not do so automatically; the user must specify the content manually, so content specific to a Web page cannot be extracted easily. Consequently, when the user cannot judge which content is specific to the Web page, or when the chosen content is biased by the user's preferences, the content specific to the Web page cannot be extracted accurately. In addition, when the number of target Web pages is large, the user's workload becomes enormous.
- the present invention has been made in view of the above points, and an object thereof is to provide a specific content determination device, a specific content determination method, a specific content determination program, and the like capable of easily extracting content specific to a Web page from the content constituting the Web page.
- the invention according to claim 1 causes a computer to function as: extraction means for extracting the content constituting a designated Web page; calculation means for calculating the appearance frequency of each content constituting the designated Web page; and determination means for determining, based on the calculated appearance frequency, content that is specific to the Web page among the contents constituting the designated Web page.
- the appearance frequency of each content constituting the designated Web page is calculated.
- content with a lower appearance frequency is content that seldom appears on pages other than the designated Web page. Content specific to the designated Web page can therefore be identified based on the appearance frequency, and content specific to the Web page can be extracted easily.
- the determination means causes the computer to function so as to determine that the content having the lowest appearance frequency among the contents constituting the designated Web page is content specific to the Web page.
- the content specific to the designated Web page is specified.
- the determination means causes the computer to function so as to determine that content whose appearance frequency is equal to or less than a predetermined value, among the contents constituting the designated Web page, is content specific to the Web page.
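The two determination rules described above (lowest appearance frequency, or appearance frequency at or below a predetermined value) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary representation of frequencies, and the sample values are hypothetical and do not come from the patent text.

```python
def lowest_frequency(freq):
    """Return every content whose appearance frequency equals the minimum."""
    if not freq:
        return []
    lowest = min(freq.values())
    return [content for content, count in freq.items() if count == lowest]


def at_or_below(freq, threshold):
    """Return every content whose appearance frequency is at or below threshold."""
    return [content for content, count in freq.items() if count <= threshold]


# Hypothetical appearance frequencies for the contents of one designated page.
freq = {
    "header": 120,
    "navigation": 120,
    "product name": 2,
    "product description": 1,
    "copyright": 120,
}

specific_by_minimum = lowest_frequency(freq)
specific_by_threshold = at_or_below(freq, threshold=2)
```

The threshold rule can return several contents at once, which may be useful when a page has more than one characteristic element; the minimum rule returns more than one only when there is a tie.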
- the calculation means causes the computer to function so as to calculate the appearance frequency of each content on a plurality of Web pages included in a predetermined site.
- since the appearance frequency of each content constituting the designated Web page is calculated over a plurality of Web pages included in a predetermined site, content that is used in common across the site can be determined not to be content specific to the Web page, and the accuracy of determining content specific to the Web page can be increased.
- the extraction means extracts, for each Web page of a predetermined type included in the predetermined site, the content constituting that Web page and stores content information indicating the extracted content in storage means in advance, and the calculation means causes the computer to function so as to calculate, based on the stored content information, the appearance frequency of each content constituting the designated Web page.
- since the appearance frequency of each content constituting the designated Web page is calculated based on the content information stored in advance for Web pages of the predetermined type, the appearance frequency is calculated accurately, and the accuracy of determining content specific to the Web page can be increased.
- the extraction means extracts the content constituting a Web page in units of content groups each composed of one or more contents, the calculation means calculates the appearance frequency of each content group constituting the designated Web page, and the determination means causes the computer to function so as to determine the content group that is specific to the Web page among the content groups constituting the designated Web page.
- content specific to the Web page is determined in units of content groups. For example, when contents that are displayed together as a unit on the Web page, or that are related to each other, are treated as a content group, content specific to the Web page can be extracted as a coherent unit.
- the extraction means causes the computer to function so as to extract content groups based on document data that is described in a predetermined markup language and indicates the content constituting the Web page.
- since the content group is extracted based on the document data indicating the content constituting the Web page, the content group can be extracted accurately.
- the extraction means causes the computer to function so as to determine a content group based on a predetermined tag in the document data indicating the content.
- the content group is extracted based on a predetermined tag. Therefore, when content specific to a Web page and content that is not specific are each grouped by the predetermined tag, the accuracy of determining content specific to the Web page can be increased.
- the computer is further caused to function as generation means for generating new content based on the content determined to be the specific content.
- new content is generated based on content specific to a Web page. For example, it is possible to generate content indicating the characteristics of details posted on the Web page.
- the generation means causes the computer to function so as to adjust the display size of the content determined to be the specific content to match a preset display size and to generate new content including the content whose display size has been adjusted.
- the generation means causes the computer to function so as to generate new content that is reproduced with an effect applied to the content determined to be the specific content.
- the computer is further caused to function as insertion means for inserting related content, which is related to the content determined to be the specific content, into the designated Web page.
- since the content related to the content determined to be the specific content is inserted into the designated Web page, information related to the characteristics of the Web page can be added to the Web page.
- the determination means determines that text data of a blog article, included as content constituting the designated Web page, is content specific to the Web page, and the insertion means causes the computer to function so as to extract a feature word of the Web page from the text data of the blog article determined to be specific content and to insert related content related to the feature word into the Web page.
- since the text data of each blog article contains details specific to that article, the text data of each blog article can be extracted as specific content. Thereby, information related to the details of the blog posted on the Web page can be added to the Web page.
- the invention according to claim 14 comprises: extraction means for extracting the content constituting a designated Web page; calculation means for calculating the appearance frequency of each content constituting the designated Web page; and determination means for determining, based on the calculated appearance frequency, content specific to the Web page among the contents constituting the designated Web page.
- the determination means determines that the content having the lowest appearance frequency among the contents constituting the designated Web page is content specific to the Web page.
- the determination means determines that content whose appearance frequency is equal to or less than a predetermined value, among the contents constituting the designated Web page, is content specific to the Web page.
- the calculation means calculates the appearance frequency of each content on a plurality of Web pages included in a predetermined site.
- the extraction means extracts, for each Web page of a predetermined type included in the predetermined site, the content constituting that Web page and stores content information indicating the extracted content in storage means in advance, and the calculation means calculates, based on the stored content information, the appearance frequency of each content constituting the designated Web page.
- the extraction means extracts the content constituting a Web page in units of content groups each composed of one or more contents, the calculation means calculates the appearance frequency of each content group constituting the designated Web page, and the determination means determines the content group that is specific to the Web page among the content groups constituting the designated Web page.
- the extraction means extracts content groups based on document data that is described in a predetermined markup language and indicates the content constituting the Web page.
- the extraction means determines a content group based on a predetermined tag in the document data indicating the content.
- the invention according to claim 22 comprises: an extraction step of extracting the content constituting a designated Web page; a calculation step of calculating the appearance frequency of each content constituting the designated Web page; and a determination step of determining, based on the calculated appearance frequency, content specific to the Web page among the contents constituting the designated Web page.
- the invention according to claim 23 is a recording medium on which a specific content determination program is recorded so as to be readable by a computer, the program causing the computer to function as: extraction means for extracting the content constituting a designated Web page; calculation means for calculating the appearance frequency of each content constituting the designated Web page; and determination means for determining, based on the calculated appearance frequency, content specific to the Web page among the contents constituting the designated Web page.
- a content generation device comprises the specific content determination device according to any one of claims 14 to 21 and generation means for generating new content based on the content determined to be specific content by the specific content determination device.
- the generation means adjusts the display size of the content determined to be the specific content to match a preset display size and generates new content including the content whose display size has been adjusted.
- the generation means generates new content that is reproduced with an effect applied to the content determined to be specific content.
- a related content insertion device comprises the specific content determination device according to any one of claims 14 to 21 and insertion means for inserting related content, which is related to the content determined to be specific content by the specific content determination device, into the designated Web page.
- the specific content determination device determines that text data of a blog article, included as content constituting the designated Web page, is content specific to the Web page, and the insertion means extracts a feature word of the designated Web page from the text data of the blog article determined to be specific content by the specific content determination device and inserts related content related to the feature word into the Web page.
- content with a lower appearance frequency is content that seldom appears on pages other than the designated Web page. Content specific to the designated Web page can therefore be identified based on the appearance frequency, and content specific to the Web page can be extracted easily.
- FIG. 1 is a diagram illustrating an example of a schematic configuration of a shopping system S according to the present embodiment.
- the shopping system S includes a content generation server 1 as an example of the specific content determination device and the content generation device, a shopping server 2, a management terminal 3, a plurality of store terminals 4, and a plurality of user terminals 5.
- the content generation server 1, the shopping server 2, each store terminal 4, and each user terminal 5 exchange data with each other using, for example, TCP/IP as a communication protocol via the network NW.
- the network NW is constructed by, for example, the Internet, a dedicated communication line (for example, a CATV (Community Antenna Television) line), a mobile communication network (including base stations and the like), a gateway, and the like.
- the content generation server 1 and the management terminal 3 are connected via a network such as a LAN (Local Area Network). Note that the content generation server 1 and the shopping server 2 may be similarly connected via a network such as a LAN.
- the shopping server 2 is a Web server that transmits a Web page constituting the shopping site in response to a request from the store terminal 4 or the user terminal 5. Moreover, the shopping server 2 registers the product sold by a shopping site based on the request from the store terminal 4, and produces
- the HTML document of the product detail page is an example of document data
- image data that is the material of the product detail page
- the store terminal 4 is a terminal device used by employees of stores that sell products on the shopping site.
- a personal computer or the like is used as the store terminal 4.
- the user terminal 5 is a terminal device used by a user who purchases a product at a shopping site.
- for example, a personal computer, a PDA, a mobile phone, or the like is used as the user terminal 5.
- the content generation server 1 is a server device that, based on a request from the management terminal 3 or the store terminal 4, generates Flash content (Flash being software standardized by Adobe Systems) showing the characteristics of the designated product detail page (and thus the characteristics of the product).
- the generated Flash content is, for example, a banner image of a product, a slide show content introducing the product, a moving image content, or the like.
- the Flash content is, for example, posted on a website operated by a store, or used as a material of a web page constituting a shopping site.
- the content generation server 1 includes a material extraction DB 101, extracts the content (image data, text data, etc. described in an HTML document) serving as Web material that constitutes each product detail page registered in the product detail page DB 201, and registers the extraction result in the material extraction DB 101. Then, the content generation server 1 identifies content specific to the product detail page from the content extracted from the designated product detail page, and generates Flash content based on the identified content.
- the management terminal 3 is a terminal device used by the system administrator of the shopping system S.
- a personal computer or the like is used as the management terminal 3.
- FIG. 2 is a block diagram illustrating an example of a schematic configuration of the content generation server 1 according to the present embodiment.
- FIG. 3 is a diagram showing an outline of processing from when a Web page is designated until Flash content is generated.
- FIG. 4 is a diagram illustrating a configuration example of a Web page.
- FIG. 5 is a diagram illustrating an example of a DOM tree generated from an HTML document.
- FIG. 6 is a diagram illustrating an example of the content of information registered in the material extraction DB 101.
- the content generation server 1 includes an operation unit 11, a display unit 12, a communication unit 13, a drive unit 14, a storage unit 15 as an example of storage means, an input/output interface unit 16, and a system control unit 20.
- the system control unit 20 and the input / output interface unit 16 are connected via a system bus 21.
- the operation unit 11 includes, for example, a keyboard and a mouse, and receives an operation instruction from a system administrator or the like, and outputs the instruction content to the system control unit 20 as an instruction signal.
- the display unit 12 includes, for example, a CRT (Cathode Ray Tube) display, a liquid crystal display, and the like, and displays information such as characters and images.
- the communication unit 13 is connected to a network NW or the like and controls a communication state with the shopping server 2, the management terminal 3, the store terminal 4, the user terminal 5, and the like.
- the drive unit 14 reads data from a disk DK such as a flexible disk, a CD (Compact Disc), or a DVD (Digital Versatile Disc), and records data on the disk DK.
- the storage unit 15 is configured by, for example, a hard disk drive or the like, and stores various programs, data, and the like.
- the material extraction DB 101 is constructed in the storage unit 15.
- the input/output interface unit 16 performs interface processing between the units from the operation unit 11 through the storage unit 15 and the system control unit 20.
- the system control unit 20 includes a CPU (Central Processing Unit) 17, a ROM (Read Only Memory) 18, a RAM (Random Access Memory) 19, and the like.
- the system control unit 20 controls each unit of the content generation server 1 by the CPU 17 reading and executing various programs stored in the ROM 18 and the storage unit 15.
- the system control unit 20 functions as an extraction unit, a calculation unit, a determination unit, and a generation unit by executing content generation software (an example of a specific content determination program).
- the content generation software or the like may be acquired from another server device or the like via the network NW, for example, or may be recorded on a disk DK such as a CD-ROM and read via the drive unit 14.
- Content generation software is a program for generating Flash content based on content specific to the product detail page.
- the content generation software includes a manager unit, a material extraction engine, a SWF (ShockWave Flash Object) generation engine, and the like.
- the manager unit is software that controls the execution of the material extraction engine and the SWF generation engine and provides a GUI (Graphical User Interface) to the users of the content generation software (store employees and system administrators).
- the material extraction engine is software for extracting content as a Web material from an HTML document of a product detail page and determining content specific to the product detail page. Content extraction is performed in units of content blocks (an example of a content group) described later.
- the SWF generation engine is software for generating Flash content based on one or more given contents (Web material).
- when a rich Internet application other than Flash content is generated as new content, for example, a Microsoft Silverlight (trademark) generation engine may be applied instead of the SWF generation engine.
- software that generates a script that realizes a dynamic page using a technology such as Ajax (Asynchronous JavaScript (registered trademark) + XML) may be applied.
- the system control unit 20 acquires and analyzes an HTML document registered in the product detail page DB 201 from the shopping server 2, and extracts content that is a Web material in units of content blocks. Then, as the extraction result, content block correspondence information (an example of content information) is registered in the material extraction DB 101 for each extracted content block (1). This process is performed in advance before the generation of the Flash content, and basically all HTML documents registered in the product detail page DB 201, that is, all product detail pages constituting the shopping site are extracted.
- the URL of the HTML document of the product detail page for which the Flash content is generated is specified by the system administrator or the store employee (2).
- the system control unit 20 acquires an HTML document from the shopping server 2 based on the designated URL, and extracts a content block.
- the system control unit 20 refers to the material extraction DB 101 and calculates, for every extracted content block, its appearance frequency across all product detail pages.
- the appearance frequency calculated may be the number of appearances (frequency) or the ratio of the number of appearances to all content blocks of all product detail pages (relative frequency).
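As a rough sketch of the two measures mentioned here, the following counts block occurrences across pages and derives the relative frequency from those counts. The mapping from URL to block keys, the function names, and the sample data are hypothetical. The text does not state how two blocks are judged to be the same, so plain string keys stand in for blocks.

```python
from collections import Counter


def appearance_counts(site_blocks):
    """Count how many times each block occurs across all pages of the site."""
    counts = Counter()
    for blocks in site_blocks.values():
        counts.update(blocks)
    return counts


def relative_frequencies(counts):
    """Divide each count by the total number of blocks on all pages."""
    total = sum(counts.values())
    return {block: n / total for block, n in counts.items()}


# Hypothetical stored extraction results: URL -> blocks found on that page.
site_blocks = {
    "/item/1": ["header", "nav", "desc-1", "footer"],
    "/item/2": ["header", "nav", "desc-2", "footer"],
}

counts = appearance_counts(site_blocks)
relative = relative_frequencies(counts)

# Frequencies of the blocks that make up the designated page only.
page_counts = {block: counts[block] for block in site_blocks["/item/1"]}
```

Either measure ranks the blocks of a single page in the same order, because the relative frequency is the count divided by a constant.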
- the system control unit 20 determines the content block specific to the product detail page corresponding to the designated URL based on the appearance frequency. Specifically, the system control unit 20 determines that the content block with the lowest appearance frequency is the content block specific to the product detail page corresponding to the designated URL (3).
- the system control unit 20 acquires content included in the content block determined to be a specific content block from the product detail page DB 201 via the shopping server 2.
- the system control unit 20 generates Flash content based on the acquired content (4). Then, the system administrator or the store employee downloads the generated Flash content (5). Note that the Flash content may be appropriately modified by the system administrator or store employee before downloading the Flash content.
- on the product detail page, the contents serving as Web material are displayed in certain groups.
- Each group corresponds to a content block.
- Each content is divided into content blocks by a DIV tag and a TABLE tag (an example of a predetermined tag) described in the HTML document. That is, each content is blocked (grouped) by the DIV tag and the TABLE tag.
- the DIV tag and the TABLE tag are referred to as “blocked tags”.
- the content block 301 is, for example, a content block in the header portion of the page, and is composed of a text A and an image a.
- the content block 302 is, for example, a content block of a navigation part for moving to Web pages related to products of various categories, and is composed of, for example, text B, text C, and text D indicating links to other Web pages.
- the content block 303 is, for example, a content block on which information related to a product is displayed, and includes a text E indicating a heading such as a product name, a content block 304, and a content block 305. In this way, the content blocks may be nested, that is, have a hierarchical structure.
- the content included in the content block 303 is only the text E, and the content block 304 and the content block 305 are independent of the content block 303.
- the content block 304 is, for example, a content block indicating details of a product, and includes a text F indicating a detailed description, an image b as an image of the product, and an image c.
- the content block 305 is, for example, a content block indicating general precautions when purchasing a product, and includes a text G and a text H.
- the content block 306 is, for example, a content block indicating copyright display, and is composed of text I.
- content blocks 301, 302, 305 and 306 appear relatively frequently on product detail pages other than the product detail page shown in FIG.
- on the other hand, the appearance frequencies of the content block 303 (text E) and the content block 304 are lower than those of the content blocks 301, 302, 305, and 306. Therefore, for example, the content block 303 or the content block 304 is determined to be a content block specific to the product detail page.
- FIG. 5 shows the HTML document of the product detail page shown in FIG. 4 in a DOM (Document Object Model) tree, that is, a tree structure.
- a DIV node indicating a DIV tag and a TABLE node indicating a TABLE tag are nodes that block each content into content blocks (hereinafter referred to as “blocked nodes”).
- the system control unit 20 searches the DOM tree by depth-first search and determines the content blocks. Specifically, when the system control unit 20 finds a blocked node, it groups the contents defined in the nodes of the subtree rooted at that node into one content block.
- when blocked nodes are nested, the content block corresponding to the subtree rooted at the higher-level blocked node (hereinafter referred to as the “upper subtree”) is divided into a content block corresponding to the subtree rooted at the lower-level blocked node (hereinafter referred to as the “lower subtree”) and a content block corresponding to the portion of the upper subtree excluding the lower subtree (for example, content block 304 and content block 303).
- the former content block is hierarchically lower than the latter content block.
- the hierarchy of the content blocks 301, 302, 303, and 306 is 1, and the hierarchy of the content blocks 304 and 305 is 2. In other words, the lower the hierarchy value, the higher the hierarchy.
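The block extraction described above can be approximated with the following sketch. It is not the patented implementation: instead of building a DOM tree it uses Python's standard `html.parser.HTMLParser`, whose document-order callbacks visit elements in the same order as a depth-first search. The class name, the dictionary layout, and the sample markup are assumptions, and the markup is assumed to be well formed.

```python
from html.parser import HTMLParser

BLOCKING_TAGS = {"div", "table"}  # tags that delimit content blocks


class BlockExtractor(HTMLParser):
    """Collect text and image URLs into blocks delimited by DIV/TABLE tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # every block, in discovery order
        self.open_blocks = []  # stack of blocks whose tag is still open

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKING_TAGS:
            # A nested blocking tag starts a separate block one level deeper.
            block = {"level": len(self.open_blocks) + 1, "contents": []}
            self.blocks.append(block)
            self.open_blocks.append(block)
        elif tag == "img" and self.open_blocks:
            src = dict(attrs).get("src")
            if src:
                self.open_blocks[-1]["contents"].append(("image", src))

    def handle_endtag(self, tag):
        if tag in BLOCKING_TAGS and self.open_blocks:
            self.open_blocks.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_blocks:
            # Content goes to the innermost open block only.
            self.open_blocks[-1]["contents"].append(("text", text))


def extract_blocks(html):
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Hypothetical markup loosely modeled on the header and product blocks.
page = (
    "<div>Text A<img src='a.png'></div>"
    "<div>Text E"
    "<div>Text F<img src='b.png'><img src='c.png'></div>"
    "<div>Text G</div>"
    "</div>"
)
blocks = extract_blocks(page)
```

Because content is always appended to the innermost open block, the outer block in the example keeps only "Text E", which mirrors the split between the upper subtree and its lower subtrees.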
- the system control unit 20 registers content block correspondence information indicating the extraction result in the material extraction DB 101.
- the content block correspondence information (reference numeral 401) is registered for each content block.
- the content block correspondence information includes the URL of the extraction-source HTML document (reference numeral 402) and block configuration information (reference numeral 403).
- Each extracted content is set in the block configuration information.
- the URL of the image data is set as the src attribute of the IMG node indicating the IMG tag in the DOM tree.
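One possible in-memory shape for a content block correspondence record is sketched below. The class name, the field names, and the `key` helper are hypothetical. In particular, comparing blocks by their full list of contents is an assumption made for this example, since the text does not say how identical blocks on different pages are matched.

```python
from dataclasses import dataclass, field


@dataclass
class ContentBlockInfo:
    source_url: str                               # URL of the extraction-source HTML document
    contents: list = field(default_factory=list)  # block configuration information

    def key(self):
        """Hashable form of the block contents, independent of the source page."""
        return tuple(self.contents)


# The same shipping notice found on two hypothetical product detail pages.
a = ContentBlockInfo("/item/1", [("text", "Shipping notes"), ("image", "/img/logo.png")])
b = ContentBlockInfo("/item/2", [("text", "Shipping notes"), ("image", "/img/logo.png")])
```

Here `a.key() == b.key()` even though the records come from different pages, which is the property a frequency count over stored records would rely on.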
- FIG. 7 is a flowchart illustrating a processing example in the material extraction processing of the system control unit 20 of the content generation server 1 according to the present embodiment.
- the material extraction process is started, for example, periodically or when a request for executing the material extraction process is transmitted from the management terminal 3 based on the operation of the system administrator.
- the system control unit 20 analyzes all HTML documents registered in the product detail page DB 201. Therefore, for example, the system control unit 20 acquires information on a list of HTML documents registered in the product detail page DB 201 from the shopping server 2 in advance, and acquires an HTML document based on the information on the list.
- the HTML document of the product detail page may be sequentially acquired by following links one after another from the HTML document of the top page of the shopping site.
- the system control unit 20 initializes the material extraction DB 101 (step S1). Specifically, when content block correspondence information is registered in the material extraction DB 101, the system control unit 20 deletes all content block correspondence information from the material extraction DB 101.
- the system control unit 20 specifies the URL of the HTML document of the product detail page to be acquired first among all the product detail pages (step S2), transmits a request containing the specified URL to the shopping server 2, and acquires the HTML document from the shopping server 2 (step S3).
- the system control unit 20 designates the acquired HTML document and executes a one-page extraction process described later (step S4). In this one-page extraction process, a content block is extracted from the acquired HTML document, and content block correspondence information is registered.
- the system control unit 20 determines whether or not the content blocks of all product detail pages have been extracted (step S5). If there is a product detail page from which content blocks have not yet been extracted (step S5: NO), the system control unit 20 specifies the URL of the HTML document of the next product detail page (step S6), and the process returns to step S3. When the system control unit 20 has repeated steps S3 to S6 and extracted the content blocks of all the product detail pages (step S5: YES), the material extraction process ends.
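The loop formed by steps S1 to S6 can be summarized in a few lines. The sketch below is a simplification under stated assumptions: `fetch` and `extract_one_page` are hypothetical callables standing in for the request to the shopping server 2 and for the one-page extraction process, and a plain list stands in for the material extraction DB 101.

```python
def run_material_extraction(page_urls, fetch, extract_one_page):
    """Rebuild the extraction store from scratch for the given pages."""
    material_db = []                                     # step S1: start from an empty store
    for url in page_urls:                                # steps S2 and S6: first, then next URL
        html = fetch(url)                                # step S3: acquire the HTML document
        material_db.extend(extract_one_page(url, html))  # step S4: register the blocks
    return material_db                                   # step S5: every page has been processed


# Hypothetical stand-ins so the sketch runs without a network connection.
pages = {"/item/1": "<div>A</div>", "/item/2": "<div>B</div>"}
db = run_material_extraction(
    pages,
    fetch=pages.__getitem__,
    extract_one_page=lambda url, html: [(url, html)],
)
```

Passing the two operations in as arguments keeps the loop itself independent of how pages are retrieved or parsed.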
- the system control unit 20 does not have to initialize the material extraction DB 101 and re-register all of the content block correspondence information. For example, without initializing the material extraction DB 101, the system control unit 20 may generate content block correspondence information for product detail pages newly created since the previous material extraction process and additionally register it in the material extraction DB 101, and may regenerate and update the content block correspondence information for product detail pages updated since the previous material extraction process.
- FIG. 8 is a flowchart showing a processing example in the one-page extraction process of the system control unit 20 of the content generation server 1 according to the present embodiment.
- the system control unit 20 first generates a DOM tree of the acquired HTML document on the RAM 19 (step S21).
- the system control unit 20 sets the number of blocks NUM to 0 and the hierarchy LV to 0 (step S22).
- the number of blocks NUM is the number of content blocks discovered so far.
- the hierarchy LV is the nesting level of the content block to which the currently searched node belongs in the DOM tree. NUM and LV are both global variables and can be accessed from the one-page extraction process and the tree search process described later.
- the system control unit 20 designates the root node of the DOM tree (step S23) and executes tree search processing (step S24).
- the tree search process can be recursively called. By this tree search process, all content blocks are extracted from the Web page, and content block correspondence information is generated.
- the system control unit 20 registers each piece of content block correspondence information generated by the tree search process in the material extraction DB 101 (step S25). After completing this process, the system control unit 20 ends the one-page extraction process.
- FIG. 9 is a flowchart showing a processing example in the tree search process of the system control unit 20 of the content generation server 1 according to the present embodiment.
- the system control unit 20 first determines the type of the designated node (step S31). At this time, when the designated node type is a DIV node or a TABLE node (blocked node), that is, when a content block is found (step S31: DIV or TABLE), the process proceeds to step S32.
- In step S32, the system control unit 20 adds 1 to NUM (the number of content blocks found so far) and adds 1 to the hierarchy LV.
- the system control unit 20 sets NUM in the block number BN [LV] (step S33).
- the block number BN [LV] is the block number of the content block indicated by the hierarchy LV to which the currently searched node belongs. This block number is assigned in the order of discovery of content blocks.
- BN [LV] is a global variable.
- the system control unit 20 initializes the content block correspondence information corresponding to the content block with the block number BN [LV] (step S34). Specifically, the system control unit 20 sets an area for storing the content block correspondence information on the RAM 19 and sets the URL of the acquired HTML document in the area.
- In step S35, the system control unit 20 determines whether any child node of the designated node has not yet been searched. If there is such a child node (step S35: YES), the system control unit 20 proceeds to step S36.
- In step S36, the system control unit 20 designates one of the unsearched child nodes and executes the tree search process on it (step S37). After completing that tree search process, the system control unit 20 returns to step S35.
- When the system control unit 20 has repeated steps S35 to S37 and finished the tree search process for all child nodes (step S35: NO), it proceeds to step S38. The system control unit 20 also proceeds to step S38 when the designated node has no child node. In step S38, the system control unit 20 subtracts 1 from the hierarchy LV and ends the tree search process.
- When the type of the designated node in step S31 is a text node (step S31: text), the system control unit 20 additionally sets the content (text data) of the designated node in the block configuration information of the content block correspondence information corresponding to the content block with the block number BN [LV] (step S39). After completing this process, the system control unit 20 ends the tree search process.
- When the type of the designated node in step S31 is an IMG node (step S31: IMG), the system control unit 20 acquires the URL of the image data set as the src attribute of the designated node and additionally sets the acquired URL in the block configuration information of the content block correspondence information corresponding to the content block with the block number BN [LV] (step S40). After completing this process, the system control unit 20 ends the tree search process.
- When the type of the designated node in step S31 is none of the DIV node, the TABLE node, the text node, and the IMG node (step S31: Other), the system control unit 20 determines whether any child node of the designated node has not yet been searched (step S41). If there is such a child node (step S41: YES), the system control unit 20 designates one of the unsearched child nodes (step S42) and executes the tree search process on it (step S43). After completing that tree search process, the system control unit 20 returns to step S41.
- When the system control unit 20 has finished the tree search process for all child nodes of the designated node, or when the designated node has no child node (step S41: NO), the tree search process ends.
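The recursive tree search in steps S31 to S43 can be sketched as below. This is a simplified illustration, not the patented implementation: the `Node` class is a hypothetical stand-in for a DOM node, a Python list named `stack` replaces the global variables LV and BN [LV] (its length plays the role of LV and its last element the role of BN [LV]), and text or images found outside any DIV or TABLE block are simply skipped, which is an assumption because the specification does not describe that case.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                     # "div", "table", "text", "img", or another tag
    children: list = field(default_factory=list)
    value: str = ""                               # text data, or the src URL of an img node


def extract_blocks(root, url):
    """Return {block number: {"url": ..., "contents": [...]}} for one page."""
    blocks = {}   # content block correspondence information, keyed by block number
    stack = []    # stack[-1] corresponds to BN [LV]; len(stack) corresponds to LV

    def search(node):
        if node.kind in ("div", "table"):         # S31: a content block is found
            number = len(blocks) + 1              # S32: NUM + 1
            stack.append(number)                  # S32 / S33: LV + 1, BN [LV] = NUM
            blocks[number] = {"url": url, "contents": []}    # S34
            for child in node.children:           # S35 to S37
                search(child)
            stack.pop()                           # S38: LV - 1
        elif node.kind in ("text", "img"):        # S39 / S40
            if stack:                             # assumption: ignore content outside blocks
                blocks[stack[-1]]["contents"].append((node.kind, node.value))
        else:                                     # S41 to S43: other tags
            for child in node.children:
                search(child)

    search(root)
    return blocks


# Usage: a nested block keeps its own contents separate from its parent block.
page = Node("body", [
    Node("div", [Node("text", value="A"), Node("img", value="a.png")]),
    Node("div", [
        Node("text", value="E"),
        Node("div", [Node("text", value="F"), Node("p", [Node("text", value="G")])]),
    ]),
])
blocks = extract_blocks(page, "http://shop.example/item1")
```

In this example the outer second block holds only the text E, while F and G belong to the nested third block, which mirrors how nested content blocks are treated as independent of their parent.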
- FIG. 10 is a flowchart illustrating a processing example in the content generation processing of the system control unit 20 of the content generation server 1 according to the present embodiment.
- The content generation process is started when a request to execute it is transmitted from the management terminal 3 in response to an operation by the system administrator, or from the store terminal 4 in response to an operation by a store employee.
- The system control unit 20 receives the designated URL from the management terminal 3 or the store terminal 4 (step S51). Next, the system control unit 20 acquires an HTML document from the shopping server 2 by transmitting a request containing the received URL to the shopping server 2 (step S52).
- The system control unit 20 designates the acquired HTML document and executes the specific content block determination process described later (step S53).
- In this process, content blocks are extracted from the acquired HTML document, and the content block specific to the HTML document is determined.
- The system control unit 20 acquires each content constituting the content block determined to be specific (step S54). Text data is acquired from the content block correspondence information corresponding to that content block. For image data, the system control unit 20 acquires the URL of the image data from the same content block correspondence information and transmits a request containing that URL to the shopping server 2, thereby acquiring the image data registered in the product detail page DB 201.
- the system control unit 20 designates all acquired contents and executes a later-described Flash content generation process (step S55).
- the system control unit 20 transmits the Flash content generated in the Flash content generation process to the management terminal 3 or the store terminal 4 that is the generation request source (Step S56). After completing this process, the system control unit 20 ends the content generation process.
- FIG. 11 is a flowchart illustrating a processing example in the specific content block determination process of the system control unit 20 of the content generation server 1 according to the present embodiment.
- As in the one-page extraction process, the system control unit 20 first generates a DOM tree of the acquired HTML document (step S61), sets the number of blocks NUM and the hierarchy LV to 0 (step S62), designates the root node of the DOM tree (step S63), and executes the tree search process (step S64).
- the system control unit 20 sets 1 to the block number i (step S65).
- the system control unit 20 calculates the appearance frequency of the content block with the block number i (step S66).
- Specifically, the system control unit 20 compares the block configuration information of the content block correspondence information i (the content block correspondence information generated in the tree search process for the content block with block number i) with the block configuration information of each piece of content block correspondence information registered in the material extraction DB 101.
- When the contents of the block configuration information match, the system control unit 20 counts one appearance. In this comparison, the system control unit 20 may ignore the order of the contents in the block configuration information.
- The system control unit 20 may also count one appearance when all of the contents set in the block configuration information of content block correspondence information registered in the material extraction DB 101 match a part of the contents set in the block configuration information of the content block correspondence information i.
- When comparing text data set in the block configuration information, the system control unit 20 need not check whether the sentences indicated by the text data match exactly; instead, it may compare the substantial content expressed by the sentences. For example, the system control unit 20 may extract words from each piece of text data by morphological analysis and compare the extracted words. It may then determine that the text data match when all the words match, or when the words match at a predetermined ratio or more. In this way, the system control unit 20 compares the block configuration information of the content block correspondence information i with the block configuration information of all the content block correspondence information registered in the material extraction DB 101 and calculates the appearance frequency.
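The word-based comparison of text data can be illustrated with the following sketch. Several parts are assumptions: the regular-expression tokenizer is only a stand-in for morphological analysis (real Japanese text would need a proper morphological analyzer), and the "predetermined ratio" is interpreted here as the share of common words among all distinct words in both texts, which the specification does not define.

```python
import re


def words(text):
    # Stand-in tokenizer (assumption): NOT real morphological analysis.
    return set(re.findall(r"\w+", text.lower()))


def texts_match(text_a, text_b, min_ratio=1.0):
    """Treat two texts as matching when enough of their words coincide.

    min_ratio=1.0 requires all words to match; a lower value accepts
    texts whose words match at that ratio or more.
    """
    a, b = words(text_a), words(text_b)
    if not a and not b:
        return True                     # two empty texts are treated as equal
    ratio = len(a & b) / len(a | b)     # assumed definition of the match ratio
    return ratio >= min_ratio


# Usage
same = texts_match("Free shipping on all orders", "free shipping on ALL orders!")
strict = texts_match("red cotton shirt", "blue cotton shirt")
loose = texts_match("red cotton shirt", "blue cotton shirt", min_ratio=0.5)
```

With these inputs, the first pair differs only in case and punctuation, and the second pair shares two of four distinct words, so it matches only under the relaxed ratio.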
- Next, the system control unit 20 adds 1 to the block number i (step S67) and determines whether the block number i is larger than the number of blocks NUM (step S68). If the block number i is equal to or less than NUM (step S68: NO), the system control unit 20 returns to step S66. Once the appearance frequencies of all the content blocks extracted in the tree search process have been calculated (step S68: YES), the system control unit 20 proceeds to step S69.
- In step S69, the system control unit 20 compares the appearance frequencies of all the content blocks, from block number 1 to the block number given by NUM, and determines that the content block with the lowest appearance frequency is the specific content block. After completing this process, the system control unit 20 ends the specific content block determination process.
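Steps S65 to S69 amount to counting, for each block of the designated page, how many registered blocks have the same contents, and then picking the block with the smallest count. The sketch below assumes that each block is a list of hashable contents (text strings or image URLs), that order is ignored by comparing multisets, and that a plain list stands in for the material extraction DB 101. Indices are 0-based here, unlike the 1-based block numbers in the text, and ties are resolved in favor of the earliest block, which is an assumption.

```python
from collections import Counter


def appearance_frequency(block, registered_blocks):
    """Count registered blocks whose contents equal `block`, ignoring order (S66)."""
    target = Counter(block)
    return sum(1 for other in registered_blocks if Counter(other) == target)


def find_specific_block(page_blocks, registered_blocks):
    """Return (index of the lowest-frequency block, list of all frequencies)."""
    frequencies = [appearance_frequency(b, registered_blocks)
                   for b in page_blocks]                               # S65 to S68
    lowest = min(range(len(page_blocks)), key=lambda i: frequencies[i])  # S69
    return lowest, frequencies


# Usage: header and navigation recur across pages, the product block does not.
registered = [
    ["Header", "logo.png"], ["Home", "Cart"], ["Camera X100", "x100.jpg"],
    ["logo.png", "Header"], ["Home", "Cart"], ["Kettle K2", "k2.jpg"],
]
page = [["Header", "logo.png"], ["Home", "Cart"], ["Camera X100", "x100.jpg"]]
index, freqs = find_specific_block(page, registered)
```

Here the header counts twice even though its contents are registered in a different order on the second page, because the comparison ignores order.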
- FIG. 12 is a flowchart showing a processing example in the Flash content generation processing of the system control unit 20 of the content generation server 1 according to the present embodiment.
- In the Flash content generation process described below, Flash content that displays a slideshow of each content included in the content block determined to be the specific content block is generated.
- the system control unit 20 first adjusts the display size of each designated content (step S71). For example, the system control unit 20 adjusts the number of vertical and horizontal pixels of the image data and adjusts the font size of the text data so as to match the actual display size at the time of Flash content playback. In addition, when the display size of the content is too large compared to the actual display size when the Flash content is played back, the system control unit 20 divides the content into a plurality of pieces. Further, the system control unit 20 combines a plurality of contents into one when the display size of the content is too small compared to the actual display size at the time of Flash content reproduction.
- the system control unit 20 determines the display order of each content (step S72).
- the display order of each content is basically the same as the order in which the contents were set in the content block correspondence information during the tree search process. That is, content described nearer the top of the HTML document is displayed earlier.
- the system control unit 20 determines a transition method for each content (step S73). That is, the system control unit 20 determines an effect (display effect) to be applied when switching the content to be displayed in the slide show display. Examples of effects include fade-in / fade-out, slide, random block, wipe, and no effect.
- The system control unit 20 generates Flash content from the contents adjusted in step S71, according to the display order and transition methods determined in steps S72 and S73 (step S74). After completing this process, the system control unit 20 ends the Flash content generation process.
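The planning part of steps S71 to S73 can be sketched as a function that turns the acquired contents into an ordered list of slides. This does not generate Flash content; it only builds a hypothetical slide plan. The character limit used to divide long text, the rule of rotating through the listed effects, and the dictionary layout are all assumptions made for illustration, and image resizing and the merging of small contents are omitted.

```python
from itertools import cycle

# Effect names taken from the examples in the text; the rotation rule is assumed.
TRANSITIONS = ["fade", "slide", "random block", "wipe", "none"]


def split_text(text, max_chars):
    """Divide text that is too long for one slide into pieces (part of S71)."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def plan_slideshow(contents, max_chars=40):
    """Build an ordered slide plan from (kind, value) pairs in document order."""
    slides = []
    for kind, value in contents:                  # S72: keep the document order
        pieces = split_text(value, max_chars) if kind == "text" else [value]
        slides.extend((kind, piece) for piece in pieces)
    effects = cycle(TRANSITIONS)                  # S73: assumed rotation of effects
    return [{"order": n, "kind": kind, "value": value, "transition": next(effects)}
            for n, (kind, value) in enumerate(slides, start=1)]


# Usage: a 50-character text is divided into two slides, followed by the image.
plan = plan_slideshow(
    [("text", "x" * 50), ("img", "http://shop.example/a.jpg")],
    max_chars=40,
)
```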
- The system control unit 20 of the content generation server 1 extracts the content constituting the product detail page corresponding to the specified URL, calculates the appearance frequency of each content constituting that product detail page, and determines that, among those contents, the content with the lowest appearance frequency is the content specific to the product detail page.
- The content with the lowest appearance frequency is content that rarely appears outside the specified product detail page. Determining the content with the lowest appearance frequency therefore identifies the content specific to the specified product detail page, so that content can be extracted easily.
- The system control unit 20 of the content generation server 1 generates Flash content based on the content determined to be specific to the product detail page.
- The system control unit 20 of the content generation server 1 calculates the appearance frequency of each content across a plurality of product detail pages included in the shopping site.
- Because the appearance frequency of each content constituting the designated product detail page is calculated across a plurality of Web pages included in the shopping site, content commonly used throughout the shopping site can be determined not to be specific content, which increases the determination accuracy.
- The system control unit 20 of the content generation server 1 extracts the content constituting the product detail page for all the product detail pages constituting the shopping site and registers content block correspondence information indicating the extracted content in the material extraction DB 101 in advance. It then calculates the appearance frequency of each content constituting the product detail page corresponding to the specified URL based on the content block correspondence information registered in the material extraction DB 101.
- The system control unit 20 of the content generation server 1 extracts the content constituting the product detail page in units of content blocks, each composed of one or more contents, calculates the appearance frequency of each content block constituting the product detail page corresponding to the specified URL, and determines that the content block with the lowest appearance frequency among them is the content block specific to the product detail page.
- Therefore, when a product detail page is divided into, for example, a header part, a navigation part, a part showing product details, a part showing general precautions when purchasing goods, and a part showing copyright, a content block specific to the product detail page can be extracted.
- The system control unit 20 of the content generation server 1 extracts the content constituting the product detail page based on the HTML document of the product detail page and determines the content blocks based on the DIV tags or TABLE tags in the HTML document.
- A DIV tag identifies one or more contents that were explicitly grouped into a block when the HTML document was created, and a TABLE tag identifies one or more contents that are grouped and displayed in table form. Therefore, for example, when content specific to the product detail page and non-specific content are separated into blocks by these tags, the accuracy of determining the content specific to the Web page can be increased.
- In the embodiment described above, the content block correspondence information corresponding to each content block constituting the designated product detail page is compared with all the content block correspondence information registered in the material extraction DB 101 to calculate each appearance frequency.
- That is, the appearance frequency is calculated over a range covering all the product detail pages included in the shopping site.
- However, it is not necessary to target all product detail pages. For example, a target store may be specified and the appearance frequency calculated over a range covering all product detail pages of the specified store. Alternatively, for example, a predetermined number of product detail pages may be targeted.
- content blocks may be extracted for each product detail page necessary for calculating the appearance frequency.
- In the embodiment described above, Flash content is generated when the URL of the HTML document of a product detail page is specified by the system administrator or a store employee.
- Alternatively, for example, when a new product detail page is created or an existing product detail page is updated, Flash content for the newly created or updated product detail page may be generated.
- In the embodiment described above, the content block with the lowest appearance frequency is treated as the content specific to the Web page.
- Alternatively, the N content blocks with the lowest appearance frequencies, from the lowest to the Nth lowest (N is a natural number of 2 or more), may be treated as content blocks specific to the Web page. This can be applied, for example, when the number of content blocks necessary for the desired processing is predetermined and is two or more.
- Similarly, when the number of contents (not content blocks) necessary for the desired processing is predetermined to be two or more and the content included in the content block with the lowest appearance frequency alone is insufficient, the content block with the second lowest appearance frequency may additionally be treated as content specific to the Web page. When the content included in the lowest and second-lowest blocks together is still insufficient, further blocks may be added in the same way.
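The variation in which blocks are added in ascending order of appearance frequency until enough contents are available can be sketched as follows. The function name, the representation of blocks as lists of contents, and the tie-breaking rule (blocks with equal frequency keep their original order) are assumptions made for this illustration.

```python
def select_specific_blocks(blocks, frequencies, required_contents):
    """Pick blocks from lowest appearance frequency upward until the
    selected blocks together hold at least `required_contents` contents.

    Returns the indices of the selected blocks in selection order. If the
    page does not hold enough contents, every block is returned.
    """
    ranked = sorted(range(len(blocks)), key=lambda i: frequencies[i])  # stable sort
    chosen, total = [], 0
    for i in ranked:
        chosen.append(i)
        total += len(blocks[i])
        if total >= required_contents:
            break
    return chosen


# Usage: three contents are needed, so the two rarest blocks are selected.
blocks = [["Header", "logo.png"], ["Title", "photo1.jpg"], ["Spec table"], ["Copyright"]]
frequencies = [120, 1, 2, 120]
picked = select_specific_blocks(blocks, frequencies, required_contents=3)
```

Selecting a fixed number N of blocks instead corresponds to simply taking the first N entries of the ranked order.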
- In the embodiment described above, Flash content is generated using content specific to the Web page.
- However, content other than Flash content (for example, moving image data, still image data, or an electronic document) may be generated.
- the use of content unique to a Web page is not limited to the generation of new content.
- image data specific to a Web page may be determined, and the image data determined to be specific image data may be displayed in a search result or the like as image data representing the Web page.
- FIG. 13 is a diagram showing an example of a schematic configuration of the blog system BS according to the present embodiment. In FIG. 13, elements similar to those in FIG.
- the blog system BS includes a blog server 6 as an example of a specific content determination device and a related content insertion device, a management terminal 3, and a plurality of user terminals 5.
- the blog server 6 and each user terminal 5 can transmit / receive data to / from each other using, for example, TCP / IP as a communication protocol via the network NW.
- the blog server 6 and the management terminal 3 are connected via a network such as a LAN.
- the blog server 6 is a Web server that transmits a Web page constituting the blog service site in response to a request from the user terminal 5.
- the user can manage his / her blog on the blog service site.
- the registered user (blogger) can access the blog service site and update his / her blog (add blog articles (records for each blog)). Therefore, the blog server 6 generates or updates a blog page on which one or a plurality of blog articles are posted as a blog Web page in accordance with the update of the blog.
- the blog server 6 includes a blog page DB 601 and registers the blog page in the blog page DB 601.
- the blog server 6 inserts advertising content (an example of related content) into the blogger's blog page designated by the system administrator.
- the advertisement content includes, for example, text data of advertisement text, banner image data, moving image data, a rich internet application (RIA) generated by Adobe Flash (trademark), Silverlight (trademark), and the like.
- the advertising content inserted into each blog page is content indicating an advertisement related to a product or service related to the blog article posted on the target blog page. Therefore, the blog server 6 includes an advertisement DB 602 in which a plurality of advertisement contents are registered. Then, the blog server 6 extracts a blog article from the blog page, further extracts a feature word from the blog article, and selects advertisement content related to the extracted feature word.
- the user terminal 5 is a terminal device used by a user as a blogger or a user browsing a blog.
- As the user terminal 5, for example, a personal computer, a PDA, a mobile phone, or the like is used.
- the management terminal 3 is a terminal device used by the system administrator of the blog system BS.
- a personal computer or the like is used as the management terminal 3.
- FIG. 14 is a block diagram showing an example of a schematic configuration of the blog server 6 according to the present embodiment.
- FIG. 15 is a diagram showing an overview of processing from when a blogger is designated until the insertion of advertisement content on a blog page.
- FIG. 16 is a diagram illustrating a configuration example of a Web page.
- FIG. 17 is a diagram illustrating an example of a DOM tree generated from an HTML document.
- FIG. 18 is a diagram illustrating an example of the content of the content block correspondence information stored in the storage unit 65.
- the blog server 6 includes an operation unit 61, a display unit 62, a communication unit 63, a drive unit 64, a storage unit 65 as an example of a storage unit, an input / output interface unit 66, and a system control unit 70.
- the system control unit 70 and the input / output interface unit 66 are connected via a system bus 71.
- the operation unit 61 includes, for example, a keyboard, a mouse, and the like, and receives an operation instruction from a system administrator or the like, and outputs the instruction content to the system control unit 70 as an instruction signal.
- the display unit 62 includes, for example, a CRT display, a liquid crystal display, and the like, and displays information such as characters and images.
- the communication unit 63 is connected to the network NW or the like, and controls the communication state with the management terminal 3, the user terminal 5, and the like.
- the drive unit 64 reads data from a disk DK such as a flexible disk, a CD, or a DVD, and records data on the disk DK.
- the storage unit 65 is constituted by, for example, a hard disk drive or the like, and stores various programs and data.
- In the storage unit 65, a blog page DB 601 and an advertisement DB 602 are constructed.
- In the blog page DB 601, each blog page constituting the blog service site (the HTML document (an example of document data) of the blog page and the image data used as material for the blog page) is registered in association with, for example, the URL of the page and the user ID, which is identification information of the blogger.
- In the advertisement DB 602, a plurality of advertisement contents are registered in association with keywords related to the products or services advertised by the advertisement contents.
- the URL of the content is also registered in association with it.
- the URL of the Web page is also registered in association with the Web page related to the advertisement target product or service.
- the input / output interface unit 66 performs interface processing between the operation unit 61 to the storage unit 65 and the system control unit 70.
- the system control unit 70 includes a CPU 67, a ROM 68, a RAM 69, and the like.
- the system control unit 70 controls each unit of the blog server 6 by the CPU 67 reading and executing various programs stored in the ROM 68 and the storage unit 65.
- the system control unit 70 functions as an extraction unit, a calculation unit, a determination unit, and an insertion unit by executing advertisement content insertion software (an example of a specific content determination program).
- The advertisement content insertion software may be acquired from another server device or the like via the network NW, or may be recorded on a disk DK such as a CD-ROM and read via the drive unit 64.
- Advertising content insertion software is a program for inserting advertising content into a blog page.
- the advertisement content insertion software includes a manager section, a material extraction engine, a sentence analysis engine, an advertisement selection section, and the like.
- the manager unit controls execution of the material extraction engine, the sentence analysis engine, and the advertisement selection unit.
- the material extraction engine is software for extracting content as a Web material from an HTML document of a blog page and determining content specific to the blog page. Content extraction is performed in units of content blocks (an example of a content group).
- a blog article including content specific to the article corresponds to a content block specific to the blog page.
- the sentence analysis engine is software for extracting feature words of the blog page from the blog article extracted as content unique to the blog page.
- the advertisement selection unit is software for selecting advertisement content related to the blog page using the extracted feature words as keywords.
- the user ID of the target blogger is designated by the system administrator (1).
- the system control unit 70 acquires and analyzes HTML documents of all blog pages corresponding to the designated user ID from the blog page DB 601, and extracts content as Web material in units of content blocks.
- content block correspondence information (an example of content information) is generated for each extracted content block (2).
- the system control unit 70 calculates the appearance frequency of each extracted content block on all blog pages corresponding to the specified user ID.
- the appearance frequency calculated in the present embodiment is, for example, the number of appearances (frequency).
- The system control unit 70 determines the content blocks specific to each blog page based on the appearance frequency. Specifically, for each blog page, the system control unit 70 determines that a content block whose appearance frequency is equal to or less than a predetermined threshold is a content block specific to that blog page (3).
- The system control unit 70 performs analysis such as morphological analysis on each content block determined to be a specific content block, that is, on each blog article, and extracts a feature word for each blog page (4).
- a word having the highest appearance frequency is used as a feature word.
- the system control unit 70 refers to the advertisement DB 602 and selects advertisement content related to the extracted feature word (5). Then, the system control unit 70 inserts, into the HTML document of the blog page, a description (such as a tag or the data itself) that causes the selected advertisement content to be displayed in the blog page (6).
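Steps (4) to (6) of the overview can be illustrated with the sketch below. It makes several assumptions that are not in the specification: a regular-expression tokenizer with a small English stop-word list stands in for morphological analysis, the advertisement DB 602 is modeled as a list of dictionaries with hypothetical `keywords`, `url`, and `text` fields, and the advertisement is inserted as a DIV element immediately before the closing `body` tag.

```python
import re
from collections import Counter
from html import escape

# Hypothetical stop-word list; a real system would rely on morphological analysis.
STOP_WORDS = frozenset({"the", "a", "and", "of", "to", "in", "is", "i"})


def feature_word(article_text):
    """Return the most frequent non-stop word of a blog article (step 4)."""
    tokens = [w for w in re.findall(r"\w+", article_text.lower())
              if w not in STOP_WORDS]
    return Counter(tokens).most_common(1)[0][0] if tokens else None


def select_advertisement(word, advertisement_db):
    """Return the first advertisement registered under the keyword (step 5)."""
    for ad in advertisement_db:
        if word in ad["keywords"]:
            return ad
    return None


def insert_advertisement(html, ad):
    """Insert a tag displaying the advertisement before </body> (step 6)."""
    tag = '<div class="ad"><a href="{}">{}</a></div>'.format(
        escape(ad["url"], quote=True), escape(ad["text"]))
    return html.replace("</body>", tag + "</body>", 1)


# Usage with hypothetical data
article = "I bought a new camera. The camera takes great photos, and the camera bag is light."
ads = [
    {"keywords": {"kettle"}, "url": "http://ads.example/kettle", "text": "Kettles"},
    {"keywords": {"camera", "lens"}, "url": "http://ads.example/camera", "text": "Cameras"},
]
word = feature_word(article)
ad = select_advertisement(word, ads)
page = insert_advertisement("<html><body><p>post</p></body></html>", ad)
```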
- each content as a Web material is displayed for each certain group (group) on the blog page.
- Each group corresponds to a content block.
- Each content is divided into content blocks by a DIV tag and a TABLE tag (an example of a predetermined tag) described in the HTML document. That is, each content is blocked (grouped) by the DIV tag and the TABLE tag.
- the content block 701 is, for example, a content block in the header portion of the page, and includes a text A and an image a.
- the content block 702 is, for example, a content block of a navigation part for moving to another web page, and is composed of, for example, text B, text C, and text D indicating links to other web pages.
- the content block 703 is, for example, a content block corresponding to a blog display area, and includes a text E indicating a headline such as a blog, a content block 704, and a content block 705.
- the content blocks may be nested, that is, have a hierarchical structure.
- the content included in the content block 703 is only the text E, and the content block 704 and the content block 705 are independent of the content block 703.
- Each of the content blocks 704 and 705 is one blog article.
- the content block 704 is composed of text F and G indicating the title and body of the blog article.
- the content block 705 includes texts H, I, and J indicating the title and body of the blog article and images b and c registered by the blogger in relation to the blog article.
- the content block 706 is, for example, a content block indicating copyright display, and is composed of text I.
- content blocks 701, 702, 703, and 706 appear relatively frequently on blog pages other than the blog page shown in FIG.
- the content block 704 and the content block 705 are basically used only for the blog page. Therefore, it is determined that the content block 704 or the content block 705 is a content block unique to the blog page.
- In the present embodiment, it is necessary to determine a content block corresponding to a blog article that includes content specific to that article as a specific content block.
- A plurality of blog articles including such specific content may be included in one page. For this reason, all content blocks whose appearance frequency is equal to or lower than a predetermined threshold are treated as specific content blocks.
- For example, the threshold value is set to one appearance. Then, a blog article including specific content is determined to be a specific content block, and a blog article including only content identical to that of other blog articles is not.
- Since the appearance frequency of content blocks common to each blog page, such as the header portion, the navigation portion, and the copyright display portion, is two or more, these are not determined to be specific content blocks.
- the threshold value is stored in the storage unit 65 in advance.
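The threshold-based determination for blog pages can be sketched as follows. The sketch assumes that each content block is represented as a tuple of its contents, that two blocks are the same only when their tuples are equal (including order), and that the appearance frequency is the number of times a block appears across all of the blogger's pages, including the page being examined. The page keys and sample contents are hypothetical.

```python
from collections import Counter


def specific_blocks_by_threshold(pages, threshold=1):
    """For each page, keep the blocks whose appearance count across all
    pages is equal to or less than `threshold`.

    pages: dict mapping a page URL to a list of blocks (tuples of contents).
    """
    counts = Counter(block for blocks in pages.values() for block in blocks)
    return {url: [block for block in blocks if counts[block] <= threshold]
            for url, blocks in pages.items()}


# Usage: header, navigation, and copyright blocks appear on both pages,
# so only the blog articles remain.
pages = {
    "blog/1": [("Header", "logo.png"), ("Home", "Archive"),
               ("Trip to Kyoto", "We visited two temples."), ("(c) blogger",)],
    "blog/2": [("Header", "logo.png"), ("Home", "Archive"),
               ("New camera", "I bought a camera."), ("(c) blogger",)],
}
result = specific_blocks_by_threshold(pages, threshold=1)
```

Unlike the lowest-frequency rule of the first embodiment, this rule can return several blocks for one page, or none at all.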
- FIG. 17 shows the HTML document of the blog page shown in FIG. 16 as a DOM tree, that is, a tree structure.
- illustration of tag nodes that are not necessary for the description of the present embodiment is omitted.
- When the content blocks are extracted as in the first embodiment, the system control unit 70 temporarily stores the content block correspondence information indicating the extraction result in the storage unit 65. As shown in FIG. 18, the content block correspondence information (reference numeral 401) is stored for each content block.
- Since feature words are extracted from a content block determined to be specific to the blog page, that is, from a blog article, it is sufficient to extract text data; image data need not be extracted.
- FIG. 19 is a flowchart showing a processing example in the advertisement content insertion process of the system control unit 70 of the blog server 6 according to the present embodiment.
- the advertisement content insertion process is started, for example, when a request for execution of the advertisement content insertion process is transmitted from the management terminal 3 based on the operation of the system administrator.
- the system control unit 70 receives the designated user ID from the management terminal 3 as shown in FIG. (Step S101).
- the system control unit 70 sets 0 to the block number NUM (step S102).
- the block number NUM is the number of content blocks that have been discovered at the present time. NUM is a global variable and can be accessed from a one-page extraction process and a tree search process described later.
- the system control unit 70 acquires the HTML document of the first blog page corresponding to the received user ID from the blog page DB 601 (step S103).
- the system control unit 70 designates the acquired HTML document and executes a one-page extraction process described later (step S104). In this one-page extraction process, a content block is extracted from the acquired HTML document, and the content block correspondence information is stored.
- The system control unit 70 determines whether or not the content blocks of all the blog pages corresponding to the received user ID have been extracted (step S105). If there is a blog page from which no content block has been extracted (step S105: NO), the system control unit 70 acquires the HTML document of the next blog page from the blog page DB 601 (step S106) and proceeds to step S104. When the system control unit 70 has repeated steps S104 to S106 and extracted the content blocks of all the blog pages (step S105: YES), it proceeds to step S107.
- In step S107, the system control unit 70 specifies the HTML document of the first blog page corresponding to the received user ID.
- the system control unit 70 designates the acquired HTML document and executes a specific content block determination process described later (step S108).
- a content block is extracted from the specified HTML document, and a content block specific to the blog page is determined.
- the system control unit 70 extracts feature words of the blog page from each text data constituting the content block determined to be unique (step S109).
- the system control unit 70 inserts an advertisement page related to the blog page into the blog page based on the extracted feature words (step S110).
- the system control unit 70 uses the extracted feature word as a keyword, refers to the advertisement DB 602, and selects advertisement content corresponding to the keyword.
- The system control unit 70 inserts the rule for the selected advertisement content at a predetermined position in the specified HTML document. For example, when the advertising content includes text data, the system control unit 70 adds the content of the text data to the HTML document.
- For example, when the advertising content includes image data, the system control unit 70 adds an IMG tag for displaying the image data to the HTML document. Further, for example, the system control unit 70 adds to the HTML document link information pointing to the Web page for the advertised product or service.
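The keyword lookup and insertion described above might look like the following sketch. Everything named here is an assumption made for illustration: the in-memory `AD_DB` dictionary stands in for the advertisement DB 602, the `<!-- ad-slot -->` comment stands in for the "predetermined position", and the example URLs are placeholders.

```python
from html import escape

# Hypothetical stand-in for the advertisement DB 602: keyword -> ad record.
AD_DB = {
    "camera": {
        "text": "New compact cameras",
        "image": "https://example.com/ad/camera.png",
        "link": "https://example.com/shop/camera",
    },
}

AD_MARKER = "<!-- ad-slot -->"  # assumed marker for the predetermined position

def build_ad_html(ad):
    """Build the markup for one ad: text, an IMG tag, and a link if present."""
    parts = []
    if "text" in ad:
        parts.append(escape(ad["text"]))
    if "image" in ad:
        parts.append('<img src="%s" alt="">' % escape(ad["image"], quote=True))
    body = " ".join(parts)
    if "link" in ad:
        body = '<a href="%s">%s</a>' % (escape(ad["link"], quote=True), body)
    return '<div class="ad">%s</div>' % body

def insert_ad(html_doc, feature_words):
    """Insert the ad for the first feature word that has an entry in AD_DB."""
    for word in feature_words:
        ad = AD_DB.get(word)
        if ad is not None:
            return html_doc.replace(AD_MARKER, AD_MARKER + build_ad_html(ad), 1)
    return html_doc  # no matching ad: leave the document unchanged

doc = "<body><p>post</p><!-- ad-slot --></body>"
updated = insert_ad(doc, ["lens", "camera"])
```

Escaping the ad text and attribute values keeps the inserted markup well-formed even if a database entry contains characters such as `<` or `"`.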
- After inserting the advertisement content rule into the specified HTML document, the system control unit 70 updates the HTML document registered in the blog page DB 601 with that HTML document (step S111).
- The system control unit 70 determines whether or not advertisement content has been inserted into all the blog pages corresponding to the received user ID (step S112). If there is a blog page into which no advertising content has been inserted (step S112: NO), the system control unit 70 specifies the HTML document of the next blog page (step S113) and proceeds to step S108. When the system control unit 70 has repeated steps S108 to S113 and inserted the advertising content into all the blog pages (step S112: YES), it deletes all the content block correspondence information stored in the storage unit 65 (step S114). After completing this process, the system control unit 70 ends the advertisement content insertion process.
- FIG. 20 is a flowchart showing a processing example in the one-page extraction process of the system control unit 70 of the blog server 6 according to this embodiment.
- the system control unit 70 first generates a DOM tree of the acquired HTML document on the RAM 69 (step S121).
- the system control unit 70 sets 0 in the hierarchy LV (step S122).
- the hierarchy LV is a hierarchy of content blocks to which the currently searched node belongs in the DOM tree.
- LV is a global variable and can be accessed from the one-page extraction process and tree search process.
- the system control unit 70 designates the root node of the DOM tree (step S123) and executes tree search processing (step S124). Since the processing contents of the tree search processing are the same as those in the first embodiment, detailed description thereof is omitted.
- The system control unit 70 stores each piece of content block correspondence information generated by the tree search process in the storage unit 65 (step S125). After completing this process, the system control unit 70 ends the one-page extraction process.
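As a rough illustration of the one-page extraction, the sketch below walks an HTML document with Python's standard `html.parser` and groups text and image references by their enclosing DIV or TABLE tag. It is not the tree search process of the first embodiment, which is not reproduced in this section: the class name, the record layout, and the choice to number top-level blocks as level 1 are assumptions, and well-formed nesting is assumed.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"div", "table"}  # tags assumed to delimit content blocks

class BlockExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.num = 0      # counterpart of NUM: blocks discovered so far
        self.stack = []   # block numbers of the currently open block tags
        self.blocks = {}  # block number -> {"level", "texts", "images"}

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.num += 1
            self.stack.append(self.num)
            # The nesting depth plays the role of the hierarchy LV.
            self.blocks[self.num] = {"level": len(self.stack), "texts": [], "images": []}
        elif tag == "img" and self.stack:
            self.blocks[self.stack[-1]]["images"].append(dict(attrs).get("src"))

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            # Text belongs to the innermost open block.
            self.blocks[self.stack[-1]]["texts"].append(text)

def extract_blocks(html_doc):
    parser = BlockExtractor()
    parser.feed(html_doc)
    parser.close()
    return parser.blocks

sample_html = (
    '<div>Header<div>Title F<img src="b.png">Body G</div></div>'
    "<table><tr><td>Copyright</td></tr></table>"
)
blocks = extract_blocks(sample_html)
```

The resulting dictionary is one possible shape for the content block correspondence information: one record per block, keyed by block number.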
- FIG. 21 is a flowchart illustrating a processing example in the specific content block determination process of the system control unit 70 of the blog server 6 according to the present embodiment.
- As in the one-page extraction process, the system control unit 70 first generates a DOM tree of the designated HTML document (step S161), sets 0 for the block number NUM and the hierarchy LV (step S162), designates the root node of the DOM tree (step S163), and executes the tree search process (step S164).
- the system control unit 70 sets 1 to the block number i (step S165).
- the system control unit 70 calculates the appearance frequency of the content block with the block number i (step S166).
- Specifically, the system control unit 70 calculates the appearance frequency by comparing the block configuration information of the content block correspondence information with block number i, generated by the tree search process in step S164, with the block configuration information of each piece of content block correspondence information stored in the storage unit 65.
- the appearance frequency calculation method is the same as in the first embodiment.
- The system control unit 70 determines whether or not the calculated appearance frequency is equal to or less than the threshold stored in the storage unit 65 (step S167). If the appearance frequency is equal to or lower than the threshold (step S167: YES), the system control unit 70 determines that the content block with the block number i is one of the specific content blocks (step S168). That is, the system control unit 70 adds the content block with the block number i to the content blocks specific to the blog page corresponding to the designated HTML document.
- When the appearance frequency is greater than the threshold (step S167: NO), or when the process of step S168 is completed, the system control unit 70 adds 1 to the block number i (step S169) and determines whether the block number i is larger than the value of the block number NUM (step S170). When the block number i is equal to or less than the value of the block number NUM (step S170: NO), the system control unit 70 proceeds to step S166. When the block number i is larger than the value of the block number NUM (step S170: YES), the system control unit 70 ends the specific content block determination process.
- In the specific content block determination process, the system control unit 70 extracts the content blocks by the tree search process in step S164. However, in the one-page extraction process (step S104 in FIG. 19) executed from the advertisement content insertion process, content blocks have already been extracted for all blog pages corresponding to the received blogger user ID, and the resulting content block correspondence information is stored in the storage unit 65, so the content blocks need not be extracted again. In that case, based on the URL of the designated HTML document, the content block correspondence information of each content block constituting the blog page to which the HTML document corresponds can be acquired from the storage unit 65.
- the advertising content is inserted into the blog page of the designated blogger.
- Alternatively, the advertising content may be inserted at the timing when the blog is updated.
- FIG. 22 is a flowchart showing a processing example in the blog update process of the system control unit 70 of the blog server 6 according to a modification of the present embodiment.
- the same steps as those in FIG. 19 are denoted by the same step numbers.
- the blogger accesses the blog service site by operating the user terminal 5, and logs in to the blog service site by entering his user ID and password.
- the blog server 6 issues a session ID to the user terminal 5, and manages the session ID and the user ID in association with each other. Since the request from the user terminal 5 to the blog server 6 includes a session ID, the blog server 6 can specify which blogger the request is from.
- the user terminal 5 transmits blog article data (text data such as title and body text, image data, etc.) to the blog server 6, as shown in FIG.
- the system control unit 70 of the blog server 6 receives the blog article data (step S171).
- the system control unit 70 acquires the HTML document of the blog page to be updated from the blog page DB 601 from among the blog pages corresponding to the blogger user ID (step S172).
- The system control unit 70 updates the acquired HTML document based on the received blog article data (step S173). For example, the system control unit 70 adds a TABLE tag or DIV tag for the blog article to the acquired HTML document, and adds the title of the received blog article, the text data of the body text, and the like between the tags.
- the system control unit 70 updates the HTML document registered in the blog page DB 601 with the HTML document to which the blog article data has been added (step S174).
- the system control unit 70 extracts content blocks from all blog pages corresponding to the blogger user ID (steps S103 to S106).
- The system control unit 70 designates the HTML document updated in step S173, executes the specific content block determination process (step S108), and extracts feature words of the blog page from each piece of text data constituting the content block determined to be specific (step S109).
- The system control unit 70 deletes the existing advertisement content rule from the specified HTML document (step S175), and inserts the related advertisement content rule using the extracted feature word as a keyword (step S110). That is, the system control unit 70 changes the advertising content displayed on the blog page.
- The system control unit 70 updates the HTML document registered in the blog page DB 601 with the HTML document in which the advertisement content rule is inserted (step S111), and deletes all content block correspondence information from the storage unit 65 (step S114).
- the processing when a new blog page has to be generated along with the blog update may be basically the same as the processing described above. However, since the advertisement content has not yet been inserted into the newly generated blog page, the advertisement content rule is not deleted in step S175.
- When the threshold is set to one, a content block (blog article) that appears only once is extracted as content specific to the blog page, and feature words are extracted from the text data of the extracted blog article.
- In that case, the number of words extracted from the text data may be small. If a sufficient number of words cannot be extracted, it may be impossible to determine which word is a feature word, or to determine it accurately. Therefore, by raising the threshold and loosening the condition for determining content specific to the blog page, the number of blog articles targeted for feature word extraction is increased. As a result, feature words can be extracted.
- The system control unit 70 of the blog server 6 initially sets the threshold to one and determines the content blocks specific to the blog page, thereby extracting the blog articles that appear once, and then attempts to extract feature words. If it determines that no feature word can be extracted, the system control unit 70 changes the threshold to two and extracts the blog articles and feature words again. If it still determines that no feature word can be extracted, the system control unit 70 changes the threshold to three and repeats the extraction. The system control unit 70 continues such processing until a feature word can be extracted. That is, the threshold is raised when the processing based on the extraction result of the specific content blocks cannot be performed normally.
- The processing may be interrupted once the threshold has been raised to a certain extent. For example, if the threshold rises to the number of blog pages corresponding to the specified blogger, the content blocks used in common on every blog page would be extracted, so the processing may be interrupted when the threshold reaches the number of pages.
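The threshold-raising loop described above can be sketched as follows. This is only an illustration under stated assumptions: blocks are tuples of text strings, the feature word extractor is passed in as a callback, and `toy_feature_words` (a word counts as a feature word if it occurs at least twice in the candidate blocks) is a hypothetical stand-in for the real extraction, which this section does not specify.

```python
from collections import Counter

def extract_with_adaptive_threshold(pages, target, extract_feature_words):
    """Raise the threshold from 1 until feature words are found, stopping
    before the threshold reaches the number of pages."""
    counts = Counter(block for blocks in pages.values() for block in blocks)
    threshold = 1
    while threshold < len(pages):  # at len(pages), blocks shared by every page would qualify
        candidates = [b for b in pages[target] if counts[b] <= threshold]
        words = extract_feature_words(candidates)
        if words:
            return threshold, words
        threshold += 1
    return None, []  # processing interrupted without a result

def toy_feature_words(blocks, min_count=2):
    # Hypothetical extractor: words occurring at least min_count times.
    words = Counter(w.lower() for block in blocks for text in block for w in text.split())
    return sorted(w for w, c in words.items() if c >= min_count)

pages = {
    "p1": [("header",), ("camera review",), ("camera tips",)],
    "p2": [("header",), ("camera tips",), ("travel notes",)],
    "p3": [("header",), ("cooking diary",)],
}

print(extract_with_adaptive_threshold(pages, "p1", toy_feature_words))  # (2, ['camera'])
```

For page `p1`, a threshold of one yields a single article with no repeated word, so the threshold is raised to two, at which point "camera" is found in two candidate blocks.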
- the system administrator may predetermine that a content block that appears only once per a predetermined number of pages of a blog page is a content block specific to the blog page.
- the number of appearances as the threshold value may be changed in proportion to the number of blog pages corresponding to the designated blogger.
- Other users can register comments on a blog article registered by a blogger, and the comments can be viewed together with the blog article.
- the text data of this comment is also one of the contents constituting the blog page.
- The system control unit 70 of the blog server 6 adds a blocking tag to the HTML document of the blog page so that the text data of a comment forms a content block independent of the text data of the blog article and of other comments. The system control unit 70 then extracts the text data of the comment as a content block and, if the extracted text data of the comment has specific content, inserts advertising content related to the comment into the blog page.
- The contents of a plurality of comments may be divided into contents that appear frequently and contents that do not, for example, majority opinions and minority opinions. In this case, the majority opinions can be regarded as general opinions and not very characteristic content. The minority opinions, on the other hand, are unique opinions and can be regarded as content specific to the blog page. In such a case, it is desirable to extract the comments that express minority opinions as content specific to the blog page.
- the number of majority opinions and the number of minority opinions are relative and vary with the total number of comments.
- If the absolute frequency (the number of appearances) is used as the appearance frequency and the threshold is set to, for example, one, contents that do not appear frequently (minority opinions) may not be appropriately extracted. Therefore, the relative frequency is used as the appearance frequency, and the threshold is set to a predetermined ratio.
- The threshold at this time can be set arbitrarily. For example, when the contents of the extracted content blocks are divided into N patterns (N is an integer of 2 or more), the threshold may be set within a range of less than 1/N in order to distinguish minority opinions. As described above, the system control unit 70 may change the threshold according to the situation at that time.
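A small sketch of the relative-frequency rule follows. It assumes that each comment has already been assigned a pattern label (how comments are classified into patterns is not specified here), and the function name and sample comments are hypothetical.

```python
from collections import Counter

def minority_comments(comments, threshold):
    """comments: list of (pattern_label, text) pairs.
    Returns the texts whose pattern has a relative frequency <= threshold."""
    if not comments:
        return []
    counts = Counter(label for label, _ in comments)
    total = len(comments)
    n_patterns = len(counts)
    # The threshold is expected to lie in the range below 1/N.
    if not 0 < threshold < 1 / n_patterns:
        raise ValueError("threshold must be greater than 0 and less than 1/N")
    return [text for label, text in comments if counts[label] / total <= threshold]

comments = [("agree", "Agree #%d" % i) for i in range(8)] + [
    ("disagree", "I see it differently"),
    ("disagree", "Not convinced"),
]

print(minority_comments(comments, threshold=0.25))
# ['I see it differently', 'Not convinced']
```

Here the "disagree" pattern has a relative frequency of 0.2, so it passes a threshold of 0.25, whereas an absolute threshold of one appearance would have missed both comments.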
- Other systems in which comments and the like can be registered for articles, as with blogs, include, for example, Twitter (trademark), in which other users can register tweets that follow a tweet registered by a certain user, and electronic bulletin boards.
- The system control unit 70 of the blog server 6 extracts the contents constituting each blog page designated in turn through the designation of its HTML document, calculates the appearance frequency of each content constituting the designated blog page, and determines that, among the contents constituting the designated blog page, content whose appearance frequency is equal to or less than a predetermined threshold is content specific to the blog page.
- Since content with a lower appearance frequency is content that rarely appears outside the designated blog page, determining whether the appearance frequency is equal to or less than the threshold identifies all content satisfying the condition as content specific to the designated blog page. Therefore, content specific to the blog page can be easily extracted.
- system control unit 70 of the blog server 6 inserts advertisement content related to content specific to the designated blog page into the blog page.
- When text data of a blog article is included as content constituting the designated blog page, the system control unit 70 of the blog server 6 determines that the text data is content specific to the blog page, extracts feature words of the blog page from the text data of the blog article, and inserts into the blog page the advertisement content associated in advance with the feature word used as a keyword.
- an advertisement related to the content of the blog posted on the blog page can be added to the blog page.
- system control unit 70 of the blog server 6 calculates the appearance frequency of each content on a plurality of blog pages included in the blog service site.
- the appearance frequency of each content constituting the specified blog page is calculated on a plurality of Web pages (for example, a plurality of blog pages corresponding to the specified blogger user ID) included in the blog service site. Therefore, it is possible to determine that the content used in common in the blog service site is not unique content, and the determination accuracy can be improved.
- The system control unit 70 of the blog server 6 extracts the contents constituting the blog page in units of content blocks each composed of one or more contents, calculates the appearance frequency of each content block constituting the designated blog page, and determines that, among the content blocks constituting the designated blog page, a content block whose appearance frequency is equal to or less than the threshold is a content block specific to the blog page.
- a content block specific to a blog page can be extracted.
- system control unit 70 of the blog server 6 extracts the content constituting the blog page based on the HTML document of the blog page, and determines the content block based on the DIV tag or the TABLE tag in the HTML document.
- The DIV tag identifies one or more contents that were explicitly grouped into a block when the HTML document was created, and the TABLE tag identifies one or more contents that are grouped into a block and displayed in table format. Therefore, for example, when the content specific to the blog page and the non-specific content are separated into blocks by these tags, the accuracy of determining the content specific to the Web page can be raised.
- The appearance frequency of each content block was calculated by comparing the content block correspondence information of each content block constituting the designated blog page with the content block correspondence information of the content blocks constituting all the blog pages corresponding to the designated blogger user ID.
- the frequency of appearance in the range targeting all the blog pages corresponding to the specified blogger was calculated.
- the target range is not limited to this. For example, blog pages corresponding to a predetermined number of pages may be targeted, or all blog pages constituting the blog service site may be targeted.
- Advertising content showing an advertisement for a product or service was inserted into the Web page as content related to the content specific to the Web page, but the related content is not limited to advertising content.
- For example, image data (a still image or a moving image) related to content, such as a blog article, that has been determined to be specific content may be inserted.
- a database for image data is constructed, and image data and keywords are associated and registered in the database.
- the keyword associated with the image data is a word indicating an image represented by the image data or a word related to the image.
- feature words are extracted from the content determined to be unique content, and related image data is selected from the database using the extracted feature words as keywords.
- the URL of the selected image data is inserted as a background attribute in the BODY tag of the target HTML document, or an IMG tag for displaying the selected image data is inserted at a predetermined position of the target HTML document.
- The use of content specific to a Web page is not limited to inserting related content into the Web page.
- new content may be generated based on content unique to the Web page.
- text data and image data are extracted as content constituting a Web page, but the content to be extracted is not limited to these. For example, it may be content displayed on a Web page or content that is played back when a Web page is displayed (for example, moving image data, audio data, electronic document, etc.). Further, only a predetermined type of content may be extracted.
- Contents enclosed by a DIV tag or a TABLE tag were grouped and extracted as a content block, but the tags used to group content are not limited to these.
- Content specific to a Web page is extracted in units of content blocks, but each content may instead be extracted individually.
- The specific content determination device of the present invention is applied to a server device, but it may also be applied to a terminal device.
- The document data of the present invention is applied to an HTML document, but it may be applied to any data that is described in a markup language and indicates the content constituting a Web page (for example, an XHTML (Extensible HyperText Markup Language) document).
- The content that constitutes the product detail page on the shopping site and the content that constitutes the blog page on the blog service site are extracted, but the target sites and page types are not limited to these.
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Tourism & Hospitality (AREA)
- Human Resources & Organizations (AREA)
- Databases & Information Systems (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Game Theory and Decision Science (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Description
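Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. The embodiment described below is one in which the present invention is applied to a server device that generates new content based on content specific to a Web page, extracted from the Web pages of a shopping site, in a shopping system in which products are bought and sold through electronic exchanges over a network.
First, the configuration and general functions of the shopping system S according to the present embodiment will be described with reference to FIG. 1.
Next, the configuration and functions of the content generation server 1 will be described with reference to FIG. 2.
Next, the operation of the shopping system S will be described with reference to FIGS. 7 to 12.
FIG. 7 is a flowchart showing a processing example in the material extraction process of the system control unit 20 of the content generation server 1 according to the present embodiment.
FIG. 10 is a flowchart showing a processing example in the content generation process of the system control unit 20 of the content generation server 1 according to the present embodiment.
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. The embodiment described below is one in which the present invention is applied to a server device that transmits blog pages in a blog system that provides a blog service.
First, the configuration and general functions of the blog system BS according to the present embodiment will be described with reference to FIG. 13.
Next, the configuration and functions of the blog server 6 will be described with reference to FIG. 14.
Next, the operation of the blog system BS will be described with reference to FIGS. 19 to 21.
Next, a modification of the present embodiment will be described with reference to FIG. 22.
In the description so far, one appearance was set as the threshold used to determine content specific to a blog page, but a value of two or more may be set as the threshold.
In the description so far, the number of appearances (absolute frequency) was used as the appearance frequency for determining content specific to a blog page, but the ratio of the number of appearances to all content blocks of the blog pages corresponding to the designated blogger (relative frequency) may be used instead.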
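2 Shopping server
3 Management terminal
4 Store terminal
5 User terminal
11 Operation unit
12 Display unit
13 Communication unit
14 Drive unit
15 Storage unit
16 Input/output interface unit
17 CPU
18 ROM
19 RAM
20 System control unit
21 System bus
101 Material extraction DB
201 Product detail page DB
NW Network
S Shopping system
6 Blog server
61 Operation unit
62 Display unit
63 Communication unit
64 Drive unit
65 Storage unit
66 Input/output interface unit
67 CPU
68 ROM
69 RAM
70 System control unit
61 System bus
601 Blog page DB
602 Advertisement DB
BS Blog system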
Claims (28)
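- A specific content determination program that causes a computer to function as:
extraction means for extracting the contents constituting a designated Web page;
calculation means for calculating the appearance frequency of each content constituting the designated Web page; and
determination means for determining, based on the calculated appearance frequency, content that is specific to the Web page from among the contents constituting the designated Web page.
- The specific content determination program according to claim 1, which causes the computer to function such that
the determination means determines that, among the contents constituting the designated Web page, the content with the lowest appearance frequency is content specific to the Web page.
- The specific content determination program according to claim 1, which causes the computer to function such that
the determination means determines that, among the contents constituting the designated Web page, content whose appearance frequency is equal to or less than a predetermined value is content specific to the Web page.
- The specific content determination program according to any one of claims 1 to 3, which causes the computer to function such that
the calculation means calculates the appearance frequency of each content on a plurality of Web pages included in a predetermined site.
- The specific content determination program according to any one of claims 1 to 4, which causes the computer to function such that
the extraction means extracts, for each Web page of a predetermined type included in the predetermined site, the contents constituting the Web page, and stores content information indicating the extracted contents in storage means in advance, and
the calculation means calculates the appearance frequency of each content constituting the designated Web page based on the stored content information.
- The specific content determination program according to any one of claims 1 to 5, which causes the computer to function such that
the extraction means extracts the contents constituting a Web page in units of content groups each composed of one or more contents,
the calculation means calculates the appearance frequency of the content groups constituting the designated Web page, and
the determination means determines, from among the content groups constituting the designated Web page, a content group that is specific to the Web page.
- The specific content determination program according to claim 6, which causes the computer to function such that
the extraction means extracts the content groups based on document data that is described in a predetermined markup language and indicates the contents constituting a Web page.
- The specific content determination program according to claim 7, which causes the computer to function such that
the extraction means defines the content groups based on predetermined tags in the document data indicating the contents.
- The specific content determination program according to any one of claims 1 to 8, which further causes the computer to function as
generation means for generating new content based on the content determined to be specific content.
- The specific content determination program according to claim 9, which causes the computer to function such that
the generation means adjusts the display size of the content determined to be specific content to fit a preset display size, and generates new content including the content whose display size has been adjusted.
- The specific content determination program according to claim 9 or claim 10, which causes the computer to function such that
the generation means generates new content in which the content determined to be specific content is played back with an effect applied.
- The specific content determination program according to any one of claims 1 to 8, which further causes the computer to function as
insertion means for inserting, into the designated Web page, related content that is related to the content determined to be specific content.
- The specific content determination program according to claim 12, which causes the computer to function such that
the determination means, when text data of a blog article is included as content constituting the designated Web page, determines that the text data is content specific to the Web page, and
the insertion means extracts feature words of the designated Web page from the text data of the blog article determined to be specific content by the specific content determination device, and inserts related content that is related to the feature words into the Web page.
- A specific content determination device comprising:
extraction means for extracting the contents constituting a designated Web page;
calculation means for calculating the appearance frequency of each content constituting the designated Web page; and
determination means for determining, based on the calculated appearance frequency, content that is specific to the Web page from among the contents constituting the designated Web page.
- The specific content determination device according to claim 14, wherein
the determination means determines that, among the contents constituting the designated Web page, the content with the lowest appearance frequency is content specific to the Web page.
- The specific content determination device according to claim 14, wherein
the determination means determines that, among the contents constituting the designated Web page, content whose appearance frequency is equal to or less than a predetermined value is content specific to the Web page.
- The specific content determination device according to any one of claims 14 to 16, wherein
the calculation means calculates the appearance frequency of each content on a plurality of Web pages included in a predetermined site.
- The specific content determination device according to any one of claims 14 to 17, wherein
the extraction means extracts, for each Web page of a predetermined type included in the predetermined site, the contents constituting the Web page, and stores content information indicating the extracted contents in storage means in advance, and
the calculation means calculates the appearance frequency of each content constituting the designated Web page based on the stored content information.
- The specific content determination device according to any one of claims 14 to 18, wherein
the extraction means extracts the contents constituting a Web page in units of content groups each composed of one or more contents,
the calculation means calculates the appearance frequency of the content groups constituting the designated Web page, and
the determination means determines, from among the content groups constituting the designated Web page, a content group that is specific to the Web page.
- The specific content determination device according to claim 19, wherein
the extraction means extracts the content groups based on document data that is described in a predetermined markup language and indicates the contents constituting a Web page.
- The specific content determination device according to claim 20, wherein
the extraction means defines the content groups based on predetermined tags in the document data indicating the contents.
- A specific content determination method comprising:
an extraction step of extracting the contents constituting a designated Web page;
a calculation step of calculating the appearance frequency of each content constituting the designated Web page; and
a determination step of determining, based on the calculated appearance frequency, content that is specific to the Web page from among the contents constituting the designated Web page.
- A recording medium on which is recorded, in a computer-readable manner, a specific content determination program that causes a computer to function as:
extraction means for extracting the contents constituting a designated Web page;
calculation means for calculating the appearance frequency of each content constituting the designated Web page; and
determination means for determining, based on the calculated appearance frequency, content that is specific to the Web page from among the contents constituting the designated Web page.
- A content generation device comprising:
the specific content determination device according to any one of claims 14 to 21; and
generation means for generating new content based on the content determined to be specific content by the specific content determination device.
- The content generation device according to claim 24, wherein
the generation means adjusts the display size of the content determined to be specific content to fit a preset display size, and generates new content including the content whose display size has been adjusted.
- The content generation device according to claim 24 or claim 25, wherein
the generation means generates new content in which the content determined to be specific content is played back with an effect applied.
- A related content insertion device comprising:
the specific content determination device according to any one of claims 14 to 21; and
insertion means for inserting, into the designated Web page, related content that is related to the content determined to be specific content by the specific content determination device.
- The related content insertion device according to claim 27, wherein
the specific content determination device, when text data of a blog article is included as content constituting the designated Web page, determines that the text data is content specific to the Web page, and
the insertion means extracts feature words of the designated Web page from the text data of the blog article determined to be specific content by the specific content determination device, and inserts related content that is related to the feature words into the Web page.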
Priority Applications (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/504,831 US20120216107A1 (en) | 2009-10-30 | 2010-10-25 | Characteristic content determination program, characteristic content determination device, characteristic content determination method, recording medium, content generation device, and related content insertion device |
| KR1020127014075A KR101640051B1 (ko) | 2009-10-30 | 2010-10-25 | 특유 콘텐츠 판정 장치, 특유 콘텐츠 판정 방법, 기록 매체, 콘텐츠 생성 장치 및 관련 콘텐츠 삽입 장치 |
| KR1020147026766A KR20140127360A (ko) | 2009-10-30 | 2010-10-25 | 특유 콘텐츠 판정 장치, 특유 콘텐츠 판정 방법, 기록 매체, 콘텐츠 생성 장치 및 관련 콘텐츠 삽입 장치 |
| BR112012010120A BR112012010120A2 (pt) | 2009-10-30 | 2010-10-25 | dispositivo e método de determinação de conteúdo característico |
| EP10826658.6A EP2482247A4 (en) | 2009-10-30 | 2010-10-25 | PROGRAM FOR DETERMINING CHARACTERISTIC CONTENT, DEVICE FOR DETERMINING CHARACTERISTIC CONTENT, METHOD FOR DETERMINING CHARACTERISTIC CONTENT, RECORDING MEDIUM, CONTENT MANAGEMENT DEVICE AND CORRESPONDING CONTENTINSTALLATION APPARATUS |
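| CN201080048923.4A CN102598038B (zh) | 2009-10-30 | 2010-10-25 | Characteristic content data determination device, characteristic content data determination method, content data generation device, and related content data insertion device |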
| US14/696,992 US10614134B2 (en) | 2009-10-30 | 2015-04-27 | Characteristic content determination device, characteristic content determination method, and recording medium |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
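| JP2009-250646 | 2009-10-30 | | |
| JP2009-250594 | 2009-10-30 | | |
| JP2009250646A JP5462591B2 (ja) | 2009-10-30 | 2009-10-30 | Characteristic content determination device, characteristic content determination method, characteristic content determination program, and related content insertion device |
| JP2009250594A JP5462590B2 (ja) | 2009-10-30 | 2009-10-30 | Characteristic content determination device, characteristic content determination method, characteristic content determination program, and content generation device |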
Related Child Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/504,831 A-371-Of-International US20120216107A1 (en) | 2009-10-30 | 2010-10-25 | Characteristic content determination program, characteristic content determination device, characteristic content determination method, recording medium, content generation device, and related content insertion device |
| US14/696,992 Continuation-In-Part US10614134B2 (en) | 2009-10-30 | 2015-04-27 | Characteristic content determination device, characteristic content determination method, and recording medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2011052526A1 (ja) | 2011-05-05 |
Family
ID=43921948
Family Applications (1)
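| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2010/068820 Ceased WO2011052526A1 (ja) | 2009-10-30 | 2010-10-25 | Characteristic content determination program, characteristic content determination device, characteristic content determination method, recording medium, content generation device, and related content insertion device |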
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20120216107A1 (ja) |
| EP (1) | EP2482247A4 (ja) |
| KR (2) | KR101640051B1 (ja) |
| CN (1) | CN102598038B (ja) |
| BR (1) | BR112012010120A2 (ja) |
| WO (1) | WO2011052526A1 (ja) |
Families Citing this family (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2008092079A2 (en) | 2007-01-25 | 2008-07-31 | Clipmarks Llc | System, method and apparatus for selecting content from web sources and posting content to web logs |
| JP5938170B2 (ja) * | 2011-06-08 | 2016-06-22 | キヤノン株式会社 | Image processing apparatus, control method therefor, and program |
| US9430583B1 (en) | 2011-06-10 | 2016-08-30 | Salesforce.Com, Inc. | Extracting a portion of a document, such as a web page |
| US9223769B2 (en) | 2011-09-21 | 2015-12-29 | Roman Tsibulevskiy | Data processing systems, devices, and methods for content analysis |
| KR101990450B1 (ko) * | 2012-03-08 | 2019-06-18 | 삼성전자주식회사 | Method and apparatus for extracting body text on a web page |
| US9753926B2 (en) * | 2012-04-30 | 2017-09-05 | Salesforce.Com, Inc. | Extracting a portion of a document, such as a web page |
| US9548042B2 (en) * | 2012-06-28 | 2017-01-17 | Adobe Systems Incorporated | Responsive document breakpoints systems and methods |
| US10354294B2 (en) * | 2013-08-28 | 2019-07-16 | Google Llc | Methods and systems for providing third-party content on a web page |
| WO2015100518A1 (en) | 2013-12-31 | 2015-07-09 | Google Inc. | Systems and methods for converting static image online content to dynamic online content |
| US20150254219A1 (en) * | 2014-03-05 | 2015-09-10 | Adincon Networks LTD | Method and system for injecting content into existing computerized data |
| US10628875B2 (en) * | 2016-06-28 | 2020-04-21 | Facebook, Inc. | Product page classification |
| US11373198B2 (en) * | 2016-12-02 | 2022-06-28 | Honda Motor Co., Ltd. | Evaluation device, evaluation method, and evaluation program |
| US10984166B2 (en) * | 2017-09-29 | 2021-04-20 | Oracle International Corporation | System and method for extracting website characteristics |
| CN110059272B (zh) * | 2018-11-02 | 2023-08-15 | 创新先进技术有限公司 | Page feature recognition method and apparatus |
| JP6625259B1 (ja) * | 2019-07-11 | 2019-12-25 | 株式会社ぐるなび | Information processing device, information processing method, and program |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2003308461A (ja) * | 2002-04-12 | 2003-10-31 | Toyo Kitchen & Living Co Ltd | Electronic explanation system for combination-unit plans, and electronic explanation system for system kitchen plans |
| JP2006146506A (ja) * | 2004-11-18 | 2006-06-08 | Image:Kk | Web site update system, Web site update method, and Web site update program |
| JP2006259965A (ja) * | 2005-03-16 | 2006-09-28 | Sony Corp | Information processing apparatus and method, and program |
| JP2007080061A (ja) * | 2005-09-15 | 2007-03-29 | Univ Of Tsukuba | Web page search method and Web page clustering method |
| JP2008130032A (ja) * | 2006-11-24 | 2008-06-05 | Sharp Corp | Content extraction device, method, program, and recording medium |
| WO2008108515A1 (en) * | 2007-03-05 | 2008-09-12 | Nr Systems, Inc. | System for advertising using meta-blog web page and profit creating method with it |
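| JP2009199513A (ja) * | 2008-02-25 | 2009-09-03 | Nec Corp | Illegal information detection device, illegal information detection method, and illegal information detection program |
| JP2009205499A (ja) * | 2008-02-28 | 2009-09-10 | Nec Corp | Web page identification device, Web page identification method, and Web page identification program |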
Family Cites Families (60)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4095739B2 (ja) * | 1999-04-16 | 2008-06-04 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Website browsing method, website browsing system, computer, and storage medium |
| US6665665B1 (en) * | 1999-07-30 | 2003-12-16 | Verizon Laboratories Inc. | Compressed document surrogates |
| US6718363B1 (en) * | 1999-07-30 | 2004-04-06 | Verizon Laboratories, Inc. | Page aggregation for web sites |
| US20020010622A1 (en) * | 2000-07-18 | 2002-01-24 | Fumino Okamoto | System and method capable of appropriately managing customer information and computer-readable recording medium having customer information management program recorded therein |
| FR2826761B1 (fr) * | 2001-06-27 | 2003-10-17 | Canon Kk | Method for analyzing a document represented in a markup language |
| US7203899B2 (en) * | 2002-04-12 | 2007-04-10 | Xerox Corporation | Systems and methods for assessing user success rates of accessing information in a collection of contents |
| US20050091106A1 (en) * | 2003-10-27 | 2005-04-28 | Reller William M. | Selecting ads for a web page based on keywords located on the web page |
| US20040193698A1 (en) * | 2003-03-24 | 2004-09-30 | Sadasivuni Lakshminarayana | Method for finding convergence of ranking of web page |
| MXPA06004513A (es) * | 2003-10-21 | 2006-09-04 | Intellectual Property Bank | Document characteristic analysis device for a document to be examined |
| US20050149880A1 (en) * | 2003-11-06 | 2005-07-07 | Richard Postrel | Method and system for user control of secondary content displayed on a computing device |
| US7725487B2 (en) * | 2003-12-01 | 2010-05-25 | National Institute Of Information And Communications Technology | Content synchronization system and method of similar web pages |
| US7260568B2 (en) * | 2004-04-15 | 2007-08-21 | Microsoft Corporation | Verifying relevance between keywords and web site contents |
| US7392474B2 (en) * | 2004-04-30 | 2008-06-24 | Microsoft Corporation | Method and system for classifying display pages using summaries |
| CN1702651A (zh) * | 2004-05-24 | 2005-11-30 | 富士通株式会社 | Method and apparatus for identifying files containing a specific type of information |
| US20060015401A1 (en) * | 2004-07-15 | 2006-01-19 | Chu Barry H | Efficiently spaced and used advertising in network-served multimedia documents |
| US20070011155A1 (en) * | 2004-09-29 | 2007-01-11 | Sarkar Pte. Ltd. | System for communication and collaboration |
| JP2006099423A (ja) * | 2004-09-29 | 2006-04-13 | Hitachi Software Eng Co Ltd | Text mining server and program |
| US7725502B1 (en) * | 2005-06-15 | 2010-05-25 | Google Inc. | Time-multiplexing documents based on preferences or relatedness |
| US20070027772A1 (en) * | 2005-07-28 | 2007-02-01 | Bridge Well Incorporated | Method and system for web page advertising, and method of running a web page advertising agency |
| US8229914B2 (en) * | 2005-09-14 | 2012-07-24 | Jumptap, Inc. | Mobile content spidering and compatibility determination |
| US7962463B2 (en) * | 2005-10-31 | 2011-06-14 | Lycos, Inc. | Automated generation, performance monitoring, and evolution of keywords in a paid listing campaign |
| KR100755677B1 (ko) * | 2005-11-02 | 2007-09-05 | 삼성전자주식회사 | Apparatus and method for conversational speech recognition using topic domain detection |
| US7630964B2 (en) * | 2005-11-14 | 2009-12-08 | Microsoft Corporation | Determining relevance of documents to a query based on identifier distance |
| US7603619B2 (en) * | 2005-11-29 | 2009-10-13 | Google Inc. | Formatting a user network site based on user preferences and format performance data |
| US8239754B1 (en) * | 2006-04-07 | 2012-08-07 | Adobe Systems Incorporated | System and method for annotating data through a document metaphor |
| US7624103B2 (en) * | 2006-07-21 | 2009-11-24 | Aol Llc | Culturally relevant search results |
| JP4913154B2 (ja) * | 2006-11-22 | 2012-04-11 | 春男 林 | Document analysis device and method |
| US7877384B2 (en) * | 2007-03-01 | 2011-01-25 | Microsoft Corporation | Scoring relevance of a document based on image text |
| US8244750B2 (en) * | 2007-03-23 | 2012-08-14 | Microsoft Corporation | Related search queries for a webpage and their applications |
| WO2008142800A1 (ja) * | 2007-05-24 | 2008-11-27 | Fujitsu Limited | Information retrieval program, recording medium storing the program, information retrieval device, and information retrieval method |
| US8526405B2 (en) * | 2007-06-13 | 2013-09-03 | Apple Inc. | Routing network requests based on requesting device characteristics |
| CN101855612A (zh) * | 2007-06-21 | 2010-10-06 | 概要软件有限责任公司 | System and method for compending blogs |
| US9323827B2 (en) * | 2007-07-20 | 2016-04-26 | Google Inc. | Identifying key terms related to similar passages |
| US8463779B2 (en) * | 2007-10-30 | 2013-06-11 | Yahoo! Inc. | Representative keyword selection |
| US7769749B2 (en) * | 2007-11-13 | 2010-08-03 | Yahoo! Inc. | Web page categorization using graph-based term selection |
| US8145526B2 (en) * | 2007-11-20 | 2012-03-27 | Daniel Redlich | Revenue sharing system that incentivizes content providers and registered users and includes payment processing |
| US7984145B2 (en) * | 2008-01-24 | 2011-07-19 | Pm Investigations, Inc. | Notification of suspicious electronic activity |
| US8886660B2 (en) * | 2008-02-07 | 2014-11-11 | Siemens Enterprise Communications Gmbh & Co. Kg | Method and apparatus for tracking a change in a collection of web documents |
| US7970760B2 (en) * | 2008-03-11 | 2011-06-28 | Yahoo! Inc. | System and method for automatic detection of needy queries |
| US9690786B2 (en) * | 2008-03-17 | 2017-06-27 | Tivo Solutions Inc. | Systems and methods for dynamically creating hyperlinks associated with relevant multimedia content |
| CN101246498B (zh) * | 2008-03-27 | 2010-07-14 | 腾讯科技(深圳)有限公司 | Method for searching news web pages |
| US20140006922A1 (en) * | 2008-04-11 | 2014-01-02 | Alex Smith | Comparison output of electronic documents |
| US20090313127A1 (en) * | 2008-06-11 | 2009-12-17 | Yahoo! Inc. | System and method for using contextual sections of web page content for serving advertisements in online advertising |
| US20090313579A1 (en) * | 2008-06-13 | 2009-12-17 | International Business Machines Corporation | Systems and methods involving favicons |
| JP5226401B2 (ja) * | 2008-06-25 | 2013-07-03 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Apparatus and method for supporting retrieval of document data |
| US20100058440A1 (en) * | 2008-08-27 | 2010-03-04 | Yahoo! Inc. | Interaction with desktop and online corpus |
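| JP4650552B2 (ja) * | 2008-10-14 | 2011-03-16 | ソニー株式会社 | Electronic apparatus, content recommendation method, and program |
| CN101382962B (zh) * | 2008-10-29 | 2011-03-02 | 西北工业大学 | Automatic document summarization method based on shallow analysis that takes the level of concept abstraction into account |
| TWI390177B (zh) * | 2008-11-24 | 2013-03-21 | Inst Information Industry | Scenic spot recommendation device and method, and storage medium |
| CN101477563B (zh) * | 2009-01-21 | 2010-11-10 | 北京百问百答网络技术有限公司 | Short-text clustering method and system, and data processing device therefor |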
| US20100192055A1 (en) * | 2009-01-27 | 2010-07-29 | Kutano Corporation | Apparatus, method and article to interact with source files in networked environment |
| US8719308B2 (en) * | 2009-02-16 | 2014-05-06 | Business Objects, S.A. | Method and system to process unstructured data |
| US8676798B1 (en) * | 2009-09-30 | 2014-03-18 | BloomReach Inc. | Query generation for searchable content |
| US20110099133A1 (en) * | 2009-10-28 | 2011-04-28 | Industrial Technology Research Institute | Systems and methods for capturing and managing collective social intelligence information |
| US7716205B1 (en) * | 2009-10-29 | 2010-05-11 | Wowd, Inc. | System for user driven ranking of web pages |
| US8577887B2 (en) * | 2009-12-16 | 2013-11-05 | Hewlett-Packard Development Company, L.P. | Content grouping systems and methods |
| CA2817136C (en) * | 2010-11-10 | 2018-06-26 | Rakuten, Inc. | Related-word registration and information processing device, method, recording medium and system |
| JP2013037624A (ja) * | 2011-08-10 | 2013-02-21 | Sony Computer Entertainment Inc | Information processing system, information processing method, program, and information storage medium |
| US8990202B2 (en) * | 2011-11-03 | 2015-03-24 | Corefiling S.A.R.L. | Identifying and suggesting classifications for financial data according to a taxonomy |
| US20130246436A1 (en) * | 2012-03-19 | 2013-09-19 | Russell E. Levine | System and method for document indexing and drawing annotation |
2010
- 2010-10-25: KR application KR1020127014075A, published as KR101640051B1 (ko), status: Active
- 2010-10-25: KR application KR1020147026766A, published as KR20140127360A (ko), status: Withdrawn
- 2010-10-25: WO application PCT/JP2010/068820, published as WO2011052526A1 (ja), status: Ceased
- 2010-10-25: EP application EP10826658.6A, published as EP2482247A4 (en), status: Ceased
- 2010-10-25: CN application CN201080048923.4A, published as CN102598038B (zh), status: Active
- 2010-10-25: BR application BR112012010120A, published as BR112012010120A2 (pt), status: Application Discontinuation
- 2010-10-25: US application US13/504,831, published as US20120216107A1 (en), status: Abandoned
Non-Patent Citations (2)
| Title |
|---|
| AUTOMATIC BANNER CREATION, 21 October 2009 (2009-10-21), Retrieved from the Internet <URL:http://hyperbannermaker.com/> |
| See also references of EP2482247A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN102598038A (zh) | 2012-07-18 |
| EP2482247A1 (en) | 2012-08-01 |
| BR112012010120A2 (pt) | 2016-06-07 |
| KR101640051B1 (ko) | 2016-07-15 |
| KR20140127360A (ko) | 2014-11-03 |
| CN102598038B (zh) | 2015-02-18 |
| EP2482247A4 (en) | 2014-11-19 |
| US20120216107A1 (en) | 2012-08-23 |
| KR20120088792A (ko) | 2012-08-08 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
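| CN102598038B (zh) | Characteristic content data determination device, characteristic content data determination method, content data generation device, and related content data insertion device | |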
| US11675969B2 (en) | Dynamic native content insertion | |
| US11182823B2 (en) | Automated creative extension selection for content performance optimization | |
| CN102859518B (zh) | Information processing device and information processing method | |
| AU2014399168B2 (en) | Automated click type selection for content performance optimization | |
| US20090049062A1 (en) | Method for Organizing Structurally Similar Web Pages from a Web Site | |
| US20180157763A1 (en) | System and method for generating an electronic page | |
| US11625448B2 (en) | System for superimposed communication by object oriented resource manipulation on a data network | |
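| CN104598556A (zh) | Search method and apparatus | |
| CN104077388A (zh) | Search-engine-based summary information extraction method and apparatus, and search engine | |
| CN105874449A (zh) | Systems and methods for extracting and generating images for display content | |
| WO2013077029A1 (ja) | Search device, search method, search program, and recording medium | |
| CN103336794A (zh) | Method and device for providing corresponding presentation information in a target page | |
| CN103164423A (zh) | Method and device for determining the browser kernel type used to render a web page | |
| JP5462591B2 (ja) | Characteristic content determination device, characteristic content determination method, characteristic content determination program, and related content insertion device | |
| KR101091991B1 (ko) | Apparatus and method for providing advertisements | |
| JP2022126427A (ja) | Information processing device, information processing method, and information processing program | |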
| US10614134B2 (en) | Characteristic content determination device, characteristic content determination method, and recording medium | |
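| JP2020135392A (ja) | Information processing device, information processing method, and information processing program | |
| CN114218515A (zh) | Web digital object extraction method and system based on content segmentation | |
| JP5462590B2 (ja) | Characteristic content determination device, characteristic content determination method, characteristic content determination program, and content generation device | |
| JP6505200B2 (ja) | Automated click type selection for content performance optimization | |
| KR101372580B1 (ko) | Method, terminal device, server, and computer-readable recording medium for providing a browser UI | |
| KR20120107891A (ko) | Method and apparatus for providing customized advertisements | |
| JP2010152441A (ja) | Information retrieval device, information retrieval method, information retrieval processing program, and information retrieval system | |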
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 201080048923.4; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10826658; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2010826658; Country of ref document: EP |
| | WWE | WIPO information: entry into national phase | Ref document number: 13504831; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 1201001931; Country of ref document: TH |
| | ENP | Entry into the national phase | Ref document number: 20127014075; Country of ref document: KR; Kind code of ref document: A |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112012010120; Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 112012010120; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20120427 |