WO2013100287A1 - Method and device for processing data, method for collecting data, and method for providing information - Google Patents

Method and device for processing data, method for collecting data, and method for providing information

Info

Publication number
WO2013100287A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
data
location
interest
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2012/004997
Other languages
English (en)
Korean (ko)
Inventor
송하윤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020120060839A (patent KR101365993B1)
Application filed by Individual
Priority to US14/369,585 (patent US9846736B2)
Publication of WO2013100287A1

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 - Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/02 - Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using radio waves
    • G01S5/0284 - Relative positioning

Definitions

  • The present invention relates to a big data pattern technique: a large volume of time-stamped data is clustered, and the pattern the data represents is extracted and expressed by providing transition probabilities between the clustered regions.
  • Big data mainly refers to data sets so large that ordinary software tools and computer systems have difficulty collecting, managing, storing, retrieving, sharing, analyzing, and visualizing them within an acceptable amount of time.
  • the size of the big data may be in the terabyte, exabyte, or zettabyte range.
  • Big data can exist in a variety of fields. Examples include web logs, RFID, sensor networks, social networks, social data, internet text and documents, internet search indexing, astronomy, meteorology, genomics, biogeochemistry, biology, military surveillance, medical records, photographic records, video records, and e-commerce.
  • Position data of a moving object may also be an example of big data.
  • Realistic human mobility models can lead to accurate research results or high value-added products. There has therefore been a demand for realistic human movement pattern information in various fields of research and industry.
  • For example, the spread pattern of infectious diseases, or of viruses on the Internet, is likely to be affected by human movement patterns, and related studies have been conducted.
  • In youth protection, the places that young people frequently visit have been investigated to determine the impact of those places, and such studies may be based on mobility data collected by portable GPS devices carried by the young people.
  • the research on human mobility can be divided into two groups, personal model and group model.
  • The personal model is influenced by personal parameters such as gender, age, and occupation, and also by the psychological parameters of a person. For example, the habits or tendencies of people choosing a road have been studied, and the results indicated that a straight, clean road was preferred.
  • social trajectory-based studies have been conducted to identify where people often go based on social networks.
  • the group movement model can be clearly found in military groups. In this group, the leader can greatly influence the movement pattern of the group.
  • The present invention aims to provide a technique that filters a plurality of big data according to a user's purpose, clusters the data into a plurality of clusters, and provides the probability of transition between the clusters, in order to extract and express the patterns inherent in the big data.
  • the present invention is not limited by the above-mentioned objects.
  • A data processing method for achieving the above object is provided.
  • The method includes: receiving current data about an object of interest; and determining a first area to which the current data is mapped among a plurality of predetermined information areas.
  • the plurality of information areas are generated by processing a plurality of data collected in advance by using a probability-based clustering algorithm.
  • the 'data processing method' may be referred to as 'data region mapping method'.
  • In another aspect, a data processing method comprises: receiving a plurality of data relating to the object of interest; and generating a data-pattern for a plurality of information areas by processing the plurality of data with a probability-based clustering algorithm.
  • This 'data processing method' may also be referred to as a 'data-pattern generating method'.
  • The data processing method may further include: receiving one or more new data regarding the object of interest; and refining the information about the plurality of information areas using the one or more new data.
  • the plurality of data may include an attribute regarding time.
  • The information area may be a union of cluster-regions corresponding to the attribute of interest and regions of collected data not belonging to any cluster-region.
  • The generating may include: filtering error data among the data based on an attribute of the data; dividing the plurality of data into a plurality of initial information areas; refining the initial information areas into a plurality of second information areas using probability-based clustering; extracting statistics of the data belonging to each of the second information areas; and, using the statistics, expressing the relationship between the cluster-regions as a probabilistic function and expressing the cluster-regions as mathematical states.
  • Here, each mathematical state refers to one cluster-region.
  • The probabilistic function may express a relation probability between the cluster-regions.
  • A probability function determined according to the properties of the data may be used as the equation.
  • A data processing apparatus includes a communication unit and a processing unit.
  • The communication unit is configured to receive current data about an object of interest.
  • The processing unit is configured to determine a first area to which the current data is mapped among a plurality of predetermined information areas.
  • The plurality of information areas are generated by processing a plurality of data collected in advance by using a probability-based clustering algorithm.
  • In another aspect, the communication unit is configured to receive a plurality of data regarding the object of interest, and the processing unit is configured to generate a data-pattern for a plurality of information areas by processing the plurality of data with a probability-based clustering algorithm.
  • The plurality of information areas may be a union of cluster-regions corresponding to the attributes of interest of the plurality of data and a region of collected data not belonging to any cluster-region.
  • The processing unit is configured to: filter error data among the plurality of data based on the attributes of the data; divide the plurality of data into a plurality of initial information areas; refine the initial information areas into a plurality of second information areas using probability-based clustering; extract statistics of the data belonging to each of the second information areas; and, using the statistics, express the relationship between the cluster-regions as a probabilistic function and express the cluster-regions as mathematical states.
  • the object of interest is a user
  • the data is the location data of the user, the location data including time, latitude, and longitude information (or the geolocation coordinate information data specifying a location on the earth in the same manner as the latitude and longitude).
  • the information area may be a location area.
  • A method for collecting information about an object of interest is provided. The method comprises: requesting from a server information about an object of interest for which the relation probability, that is, the probability that the attribute of interest of the object will fall within the category of the attribute of interest of a specific information area, satisfies a predetermined rule; and receiving the information about the object of interest from the server.
  • Whether the relation probability satisfies the predetermined rule is determined by the server performing a process comprising: receiving information about current data of the object of interest; determining a first information area to which the current data is mapped among the plurality of information areas; and determining whether the relation probability of the attribute of interest falling within the category of the attribute of interest of the specific information area, given that it currently falls within the category of the attribute of interest of the first information area, satisfies the rule.
  • the plurality of information areas may be generated by processing a plurality of data about the interest collected in advance using a probability-based algorithm.
  • In another aspect, a method for collecting information about an object of interest comprises: requesting from a server information about a relation probability that an attribute of interest of data about the object of interest belongs to the category of the attribute of interest of a specific information area among a plurality of predetermined information areas; and receiving the information about the relation probability from the server.
  • The relation probability may be obtained by the server determining a first information area to which current data of the object of interest is mapped among the plurality of information areas, and extracting information about the relation probability that the attribute of interest will fall within the category of the attribute of interest of the specific information area, given that it falls within the category of the attribute of interest of the first information area.
  • the plurality of information areas may be generated by processing a plurality of data about the object of interest collected in advance using a probability-based clustering algorithm.
  • A method of providing information about an object of interest includes: receiving current data of the object of interest; determining a first information area to which the current data is mapped among a plurality of predetermined information areas; and providing information about a relation probability that the attribute of interest of the data will fall within the category of the attribute of interest of a second information area belonging to the plurality of information areas, given that it falls within the category of the attribute of interest of the first information area.
  • the plurality of information areas are generated by processing a plurality of data about the object of interest collected in advance using a probability-based clustering algorithm.
  • the information about the relationship probability may be information about whether the relationship probability exceeds a predetermined threshold.
  • In one embodiment, the object of interest is a user; the data is location data of the user, including time, latitude, and longitude information; the information area is a location area; and the attribute of interest is the latitude and longitude.
  • A method for determining a user location comprises: receiving a current location of a user; and determining, among a plurality of predetermined location areas, a first location area that includes the current location, wherein the plurality of location areas are generated by processing a plurality of previously collected location information with a probability-based clustering algorithm.
  • A method of generating a location information model comprises: receiving a plurality of location information about a user; and generating information about a plurality of location areas by processing the plurality of location information with a probability-based clustering algorithm.
  • A location information processing server comprises a communication unit and a processing unit, wherein the communication unit is configured to receive a current location of the user, and the processing unit is configured to determine a first location area to which the current location is mapped among a plurality of predetermined location areas.
  • The plurality of location areas are generated by processing a plurality of previously collected location information using a probability-based clustering algorithm.
  • In another aspect, a location information processing server comprises a communication unit and a processing unit, wherein the communication unit is configured to receive a plurality of location information about a user, and the processing unit is configured to generate information about a plurality of location areas by processing the plurality of location information with a probability-based clustering algorithm.
  • A method of collecting user information comprises: requesting from a server information about a user whose probability of moving to a specific location area, which includes a predetermined specific location, satisfies a predetermined rule; and receiving the information about the user from the server. Whether the moving probability satisfies the predetermined rule is determined by the server performing a process comprising: receiving information on the current location of the user; determining a first location area that includes the current location among a plurality of location areas; and determining whether the probability of moving from the first location area to the specific location area satisfies the predetermined rule.
  • In another aspect, a method of collecting user information comprises: requesting from a server information about the probability that a user moves to a specific location area belonging to a plurality of predetermined location areas; and receiving the information about the probability from the server. The probability is obtained by determining a first location area that includes the current location of the user among the plurality of location areas, and calculating the probability of the user moving from the first location area to the specific location area.
  • A method of providing user movement information comprises: receiving a current location of a user; determining a first location area that includes the current location among a plurality of predetermined location areas; and providing information regarding the probability of the user moving from the first location area to a second location area belonging to the plurality of location areas, wherein the plurality of location areas are generated by processing a plurality of previously collected location information with a probability-based clustering algorithm.
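  • The lookup described in the methods above (map the current location to a first location area, then read off the probability of moving to a target area and compare it with a rule) can be sketched as follows. This is a minimal illustration, not the patented implementation: the area names, centre coordinates, the 1.5 km radius, the transition table, and the "greater than a threshold" rule are all hypothetical values chosen for the example.

```python
import math

def distance_km(a, b):
    """Great-circle (haversine) distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def locate_area(position, areas, radius_km):
    """Return the name of the nearest area centre within radius_km, else None.

    None stands for a position that belongs to no cluster (for example, in transit).
    """
    nearest = min(areas, key=lambda name: distance_km(position, areas[name]))
    return nearest if distance_km(position, areas[nearest]) <= radius_km else None

def move_probability(position, target, areas, transitions, radius_km=1.5):
    """Probability of moving from the area containing `position` to `target`."""
    first = locate_area(position, areas, radius_km)
    if first is None:
        return None
    return transitions.get(first, {}).get(target, 0.0)

def meets_rule(probability, threshold):
    """Hypothetical rule: the probability must exceed a fixed threshold."""
    return probability is not None and probability > threshold

# Hypothetical data: two area centres and a transition table between them.
AREAS = {"A": (37.55, 126.92), "B": (37.45, 126.70)}
TRANSITIONS = {"A": {"A": 0.6, "B": 0.4}, "B": {"A": 0.2, "B": 0.8}}

p = move_probability((37.551, 126.921), "B", AREAS, TRANSITIONS)
print(p, meets_rule(p, 0.3))
```

  • A server could answer the "information about the user" request by running `meets_rule` for every tracked user and returning only those for whom it holds; that design choice is likewise an assumption of this sketch.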
  • According to the present invention, a technique can be provided that clusters a plurality of big data into a plurality of clusters and provides the transition probability between the clusters in the form of a probability function.
  • the present invention is not limited by the above effects.
  • FIG. 2 shows the original data for the clustering experiments, obtained from the data sets collected for a student's daily movements.
  • FIG. 4 is a visualization of the result of FIG. 3 and shows a cluster located on an actual map.
  • FIG. 5 illustrates a result of generating a Markov Chain-type mobility model based on a result of identifying major locations for student daily life using GPS trajectory data and EM clustering algorithm.
  • FIGS. 6A, 6B, and 6C illustrate a method of providing location information according to an embodiment of the present invention.
  • FIG. 7 is for explaining a user information processing method according to another embodiment of the present invention.
  • FIGS. 8A and 8B illustrate the steps described with reference to FIG. 7 as specific examples.
  • FIG. 9 is for explaining a user information processing method according to another embodiment of the present invention.
  • FIG. 10 shows an example of an internal structure of the server shown in FIG. 6A.
  • FIG. 11 is an example showing on a map the big data (location data of the first experimenter in Seoul) collected according to an embodiment of the present invention.
  • FIG. 13 illustrates an example of a result of clustering the big data shown in FIG. 11 on a map.
  • FIG. 14 is another example in which all of the results of clustering the big data shown in FIG. 11 are shown on a map.
  • FIG. 15 shows, on a map, an example of the result of clustering the Seoul portion of the big data shown in FIG. 12.
  • FIG. 16 shows, on a map, an example of the result of clustering the big data shown in FIG. 12 in the center of Jeju City on Jeju Island.
  • FIG. 17 shows, on a map, an example of the result of clustering the Jeju-do portion of the big data shown in FIG. 12.
  • FIG. 18 is an example of a probability density function used to cluster big data obtained from the first experimenter of FIG. 11.
  • FIG. 19 is an example of a probability density function used to cluster big data obtained from the second experimenter of FIG. 12.
  • FIG. 20 illustrates a pattern expressed by CTMC extracted from big data obtained from the first experimenter of FIG. 11.
  • FIGS. 21A and 21B illustrate a pattern expressed by CTMC extracted from big data obtained from the second experimenter of FIG. 12.
  • FIG. 22 shows detailed information about 13 clusters generated from big data obtained from the second experimenter of FIG. 12.
  • The idea of the present invention is presented on the basis of a collective, abstract concept called big data.
  • As an example, the present specification uses data relating to the position and/or movement of a person below.
  • the location data described below may be regarded as an example of big data.
  • The mobility of a student's daily life was captured based on the complex human mobility model structure.
  • The student's daily movement patterns were collected by the student carrying a commercially available portable GPS device.
  • The GPS data collected by the device was stored for more than 20 days.
  • The location data includes tuples of <time, latitude, longitude> as well as some non-essential information.
  • The data may contain inherent position errors. Because of the inherent errors of the GPS device and the effects of the surrounding environment, the collected data may indicate that the GPS device has moved even when it has been stationary for a while. Errors caused by the environment can be very serious. In particular, when the device is inside a building, large and persistent position errors can be recorded because the GPS signal is weakened, blocked, or distorted by the building or other obstacles.
  • Grouping data that is concentrated in significant number and duration is called clustering.
  • an embodiment of the present invention may use a clustering technique.
  • the clustered data may be represented by location information of a building name, a street name, a city name, a university, and the like.
  • the collected GPS data is clustered.
  • Clustering techniques based on the EM algorithm can be used to build personal mobility models, and a clustering method adapted to recent research findings such as the Levy Walk can be used.
  • An EM clustering algorithm and an EM clustering calibration method used in an embodiment of the present invention will be described.
  • The results of the experiment based on the collected GPS data are then described.
  • The EM algorithm was first introduced by Hartley et al. in 1958 and by Dempster et al. in 1977.
  • The EM algorithm starts from an initial model and iteratively refines it against the data set until it reaches the maximum-likelihood model, which is taken as the optimal model.
  • The probability that an object belongs to each component of a mixture model is repeatedly recalibrated until the model becomes optimal, and the suitability of the model can be judged by the log-likelihood function.
  • the EM algorithm is a probability based clustering algorithm.
  • At each iteration, the parameter θ(t) is given and the next step computes the parameter θ(t+1).
  • Each iteration is divided into an expectation (E) phase and a maximization (M) phase.
  • In the E phase, the algorithm defines the expected value Q of the likelihood function given θ(t), as shown in [Equation 2].
  • In the M phase, the algorithm calculates θ(t+1) by maximizing Q, as shown in [Equation 3].
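  • For reference, the textbook form of the two EM phases is given below. It is offered as the standard formulation, which may differ in notation from [Equation 2] and [Equation 3] of the original; the symbols X (observed data), Z (unobserved cluster assignments), and L (complete-data likelihood) are assumptions of this sketch.

```latex
% E phase: expected complete-data log-likelihood under the current parameters
Q\left(\theta \mid \theta^{(t)}\right)
  = \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\left[\log L(\theta; X, Z)\right]

% M phase: parameters for the next iteration
\theta^{(t+1)} = \operatorname*{arg\,max}_{\theta}\; Q\left(\theta \mid \theta^{(t)}\right)
```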
  • an appropriate EM clustering method is first defined.
  • Human movement data collected from GPS devices are used, and accordingly, a probabilistic model for EM clustering is determined.
  • A normal (Gaussian) distribution could be used, but it may not be suitable for an embodiment of the present invention.
  • With a normal distribution, very inaccurate clustering results can be obtained, because human mobility exhibits a heavy-tailed distribution known as the Levy Walk.
  • The movement of people is usually concentrated in areas (residence areas, i.e. clusters) within one or two kilometers for a certain time (the residence period), and the transitions between residence areas follow a power-law distribution (the transition period of human movement).
  • Accordingly, a power-law-like distribution similar to the exponential distribution may be used.
  • This distribution may be referred to as a modified exponential distribution in the present invention, where the parameter is the distance of human movement from the center of the living area. Equation 4 below shows a probability distribution used in one embodiment of the present invention.
  • Here, lambda is a controllable parameter representing the average radius of the cluster; it can be fixed to a constant value or calculated by an appropriate algorithm.
  • X represents the distance between the person's current location and the center of the cluster.
  • The movement speed of a person can also be considered. During a residence period, a person moving at 10 km/h or less may be regarded as staying. The speed of a person at a specific time can be calculated from location data such as GPS, and the speed threshold of 10 km/h is set from the maximum walking speed of a person. One or more GPS records with a speed of 10 km/h or more may be regarded as belonging to a transition period, during which the person is considered to be in a moving state.
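  • The speed test described above can be sketched in a few lines. The sketch assumes fixes are (time, latitude, longitude) tuples sorted by time and uses the haversine great-circle distance, which the source does not specify; it also assigns a speed of exactly 10 km/h to the staying state, since the text leaves that boundary case open.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0     # mean Earth radius
SPEED_THRESHOLD_KMH = 10.0   # threshold stated in the text

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def label_fixes(fixes, threshold_kmh=SPEED_THRESHOLD_KMH):
    """Label each consecutive pair of fixes as 'stay' or 'move'.

    fixes: list of (datetime, lat, lon) tuples sorted by strictly increasing time.
    Returns one label per pair, i.e. len(fixes) - 1 labels.
    """
    labels = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        if hours <= 0:
            raise ValueError("fixes must have strictly increasing timestamps")
        speed_kmh = haversine_km(la0, lo0, la1, lo1) / hours
        labels.append("move" if speed_kmh > threshold_kmh else "stay")
    return labels

start = datetime(2012, 1, 1, 9, 0, 0)
fixes = [
    (start, 37.00, 127.00),
    (start + timedelta(hours=1), 37.00, 127.01),               # under 1 km in an hour
    (start + timedelta(hours=1, minutes=1), 37.01, 127.01),    # over 1 km in a minute
]
print(label_fixes(fixes))
```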
  • the diameter of a cluster may be defined as a maximum of several kilometers in consideration of the normal walking distance of a person and the maximum size of a building complex. If the cluster diameter is set large, a rough movement pattern of a person can be identified. If the cluster diameter is set small, a more accurate human movement pattern can be obtained. However, depending on the cluster diameter, the amount of computation can explode. As a result, in the experiment according to an embodiment of the present invention, the cluster diameter was selected as 2 km or 3 km based on the initial stage experiment. In consideration of these basic parameters, clustering for human mobility may be performed in the following steps.
  • Equation 4 calculates the probability that each location information point belongs to a cluster.
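  • Equation 4 itself is not reproduced in this text. As a stand-in, the sketch below uses the ordinary exponential density with mean lambda, which matches the description of lambda as the average cluster radius and X as the distance to the cluster center, but it is an assumption and not the patent's modified exponential distribution. The function names and the equal default cluster weights are likewise hypothetical.

```python
import math

def exp_density(distance_km, lam_km):
    """Exponential density with mean lam_km, evaluated at a distance from the centre."""
    if distance_km < 0 or lam_km <= 0:
        raise ValueError("distance must be >= 0 and lambda must be > 0")
    return math.exp(-distance_km / lam_km) / lam_km

def memberships(distances_km, lam_km, weights=None):
    """E-step style membership probabilities of one point over all clusters.

    distances_km: distance from the point to each cluster centre.
    weights: optional mixing weight per cluster (equal weights if omitted).
    """
    k = len(distances_km)
    if weights is None:
        weights = [1.0 / k] * k
    scores = [w * exp_density(d, lam_km) for w, d in zip(weights, distances_km)]
    total = sum(scores)
    if total == 0.0:
        raise ValueError("point is too far from every cluster to score")
    return [s / total for s in scores]

# A point 0.5 km from one centre and 5 km from another, with lambda = 1 km.
print(memberships([0.5, 5.0], 1.0))
```

  • In a full EM loop these memberships would be recomputed at every E phase, and the cluster centers (and optionally lambda) updated from them at every M phase.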
  • Position data about the daily life of the student, obtained from positioning systems, was used.
  • Two positioning systems in actual service were used: the Global Positioning System (GPS) and the 3G base station positioning system.
  • The general positioning technique based on 3G base stations (3GBS) is of doubtful usefulness, because it tends to produce location errors (frequent changes of base station) owing to the weakness of the Radio Signal Strength (RSS) method.
  • Current GPS systems also have an obvious problem: the GPS signal is weakened or distorted inside buildings.
  • FIG. 1 shows the results of a basic experiment for extracting mobility data, an example of big data.
  • Each combination of positioning system (GPS or 3GBS) and setting (inside or outside a building) is shown.
  • '# of data' is the total number of records obtained in each experimental situation.
  • For example, 2187 location records were acquired in the GPS, outdoor situation.
  • Whenever a change in position was detected between two consecutive records, the data was regarded as an error, because the positioning device was fixed at the experimental position.
  • The distance between two such consecutive records is the error distance.
  • The error distance is given in meters.
  • The Average column shows the mean of the error distances together with their standard deviation, and Maximum shows the maximum error distance.
  • The last column, Error Ratio, shows the ratio of errors in each subtest.
  • The results of the GPS-indoor subtest are interesting. When the Garmin GPS device used cannot acquire a GPS signal, it automatically estimates its position from past speed data, so the indoor results may be meaningless. In other words, if the device loses the GPS signal in a building, it merely estimates its current location. This "user-friendly estimation" policy of the device manufacturer causes large errors in the location data, and the experimental results are therefore not realistic.
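  • The error statistics described above can be computed along the following lines. The sketch assumes a stationary receiver, treats any non-zero displacement between consecutive fixes as an error, and takes the error ratio to be the share of consecutive pairs that show a displacement; the source does not define the ratio precisely, so that definition is an assumption, as are the function names.

```python
import math
import statistics

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

def distance_m(a, b):
    """Haversine distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def error_stats(fixes):
    """Summarise position jitter of a stationary receiver.

    fixes: list of (lat, lon) points recorded while the device did not move.
    """
    if len(fixes) < 2:
        raise ValueError("need at least two fixes")
    jumps = [distance_m(p, q) for p, q in zip(fixes, fixes[1:])]
    errors = [d for d in jumps if d > 0.0]  # any apparent movement is an error
    if not errors:
        return {"count": len(fixes), "mean_m": 0.0, "std_m": 0.0,
                "max_m": 0.0, "error_ratio": 0.0}
    return {
        "count": len(fixes),
        "mean_m": statistics.mean(errors),
        "std_m": statistics.pstdev(errors),
        "max_m": max(errors),
        "error_ratio": len(errors) / len(jumps),
    }

# Hypothetical fixes: the reported position jumps once by 0.0001 degrees of latitude.
sample = [(37.0, 127.0), (37.0, 127.0), (37.0001, 127.0), (37.0001, 127.0)]
print(error_stats(sample))
```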
  • Clustering experiments were performed on the data set collected for the student's daily movements, shown in FIG. 2. A handheld GPS device was used, and the results were visualized with MapSource and Google Earth. The data was collected for a month and visually verified on Google Maps by the volunteer who collected it. The data collection area is the metropolitan area of Seoul, Korea, including the student's home in Incheon, the student's university in Seoul, and trips to Bucheon and Gimpo. In FIG. 2, GPS data collected at one-minute intervals is shown using Google Earth. As expected, numerous location points are concentrated in two areas, home and company. This GPS data set was clustered using an EM-based clustering technique, and as a result several key clusters were recognized. The detailed results and examination of this experiment are described below.
  • FIG. 3 shows the numerical results for the clustered location data set. Six clusters were recognized and verified by the person who collected the location data. FIG. 3 shows cluster information including the center position of each cluster, the standard deviation of the positions of the cluster members, the maximum cluster radius at the initialization stage, the average radius of each cluster, the standard deviation of the radius, the average velocity of the cluster members, the time of stay in each cluster, the ratio of the time of stay in each cluster, and the number of location records in each cluster. As expected, the largest clusters are found at the home in Incheon and in the company area around Mapo in Seoul, while the other, smaller clusters correspond to downtown Bucheon, restaurants in Gimpo, rare visits to central Seoul, and the area south of Seoul known as Gangnam, visited for a dental appointment.
  • FIG. 4 is a visualization of the result of FIG. 3 and shows the clusters located on an actual map. For example, a human can intuitively see that cluster #1 represents Hongik University. Such results become more human-friendly if the numerical location data can be automatically mapped to institution names.
  • One embodiment of the present invention can provide a model built by clustering some or all of the data into a plurality of cluster-regions, expressing the cluster-regions as mathematical states, and expressing the relationship between the cluster-regions as a probabilistic function.
  • the Markov Chain model described above can be regarded as a form of this model.
  • FIG. 5 shows the position data of an object (a person or a group) and six states associated with a cluster obtained by using the clustering algorithm described above, and transition probabilities between the states.
  • state 305 represents a state where the object is located in cluster # 5 named Gimpo
  • state 302 represents a state where the object is located in cluster # 2 named Incheon.
  • State 305 can transition only to state 302 or remain in state 305, so these two probabilities sum to 1.000.
  • State 302 can transition only to state 305, state 303, or state 301, or remain in state 302, with probabilities 0.0113, 0.0057, 0.1089, and 0.8741 respectively.
  • the other states 301, 306, 304 can be similarly described.
  • For example, an object in cluster #6 moves to cluster #1 with probability 0.4132; that is, the probability of transitioning from state 306 to state 301 is 0.4132.
  • Store #1, located in cluster #1, can therefore benefit from sending marketing information to observation targets in cluster #6, stimulating their willingness to buy.
  • In contrast, the probability that an object located in cluster #6 moves immediately to cluster #2 is zero; that is, the probability of an immediate transition from state 306 to state 302 is zero. Store #2, located in cluster #2, therefore gains relatively little from immediately delivering its marketing information to observation targets in cluster #6.
  • If the locations of the observation target are divided into a plurality of groups (clusters) and the transition probability between the groups is known, various services can be developed based on information about the current location of the observation target. As described above, some of the location information collected from the observation target may not belong to any of the clusters, and that portion of the information may imply that the observation target is in motion.
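  • One simple way to obtain such transition probabilities is to count transitions in the time-ordered sequence of cluster labels and normalise each row. The sketch below is a discrete-time estimate under that assumption; the source does not state how the probabilities of FIG. 5 were computed, and the figure list also mentions CTMC patterns, which this sketch does not cover. `ROW_302` copies the probabilities quoted above for state 302; the label sequence is made up.

```python
from collections import Counter, defaultdict

def transition_matrix(states):
    """Estimate P(next state | current state) from a time-ordered label sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }

# Outgoing probabilities of state 302 as quoted in the text; each row must sum to 1.
ROW_302 = {305: 0.0113, 303: 0.0057, 301: 0.1089, 302: 0.8741}
assert abs(sum(ROW_302.values()) - 1.0) < 1e-9

# Hypothetical observation sequence of cluster states.
observed = [302, 302, 302, 301, 302]
print(transition_matrix(observed))
```

  • Fixes that fall outside every cluster would need their own treatment (for example, a separate "moving" state or removal before counting); which option fits best depends on the service being built.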
  • A preprocessing step and a postprocessing step may be added before and after the clustering step described above.
  • The collected location data may contain errors.
  • In the preprocessing step, location data with errors can be filtered out. A one-class support vector machine (OCSVM) can be used for this filtering.
  • The clustered results are presented as <latitude, longitude> tuples, which are difficult for humans to understand. A postprocessing step can therefore follow that maps them to data such as area names and building names to make the results user-friendly.
  • FIGS. 6A, 6B, and 6C illustrate a method of generating a location information model of a user and a method of determining a location of a user according to an embodiment of the present invention.
  • The user device 2 possessed by the user and/or the server 1 can be used to generate the user's location information model and determine the user's location.
  • the location information model may be referred to as a 'location pattern'.
  • This location information model generation method can be executed in the server 1.
  • This location information model may include elements that embody the concept of the Markov chain shown in FIG. 5, for example.
  • the method for generating a location information model may include a step S11 of receiving a plurality of user location data collected from a user who is stationary or moving with a user device.
  • the location data may be regarded as an example of big data.
  • the plurality of user location data may be generated by the user device 2 possessed by the user by detecting their own location information.
  • One user may possess a plurality of user devices 2 simultaneously or alternately. In this case, the device identifiers identifying the respective user devices 2 may differ, but each user device 2 may hold one user identifier that identifies the one user possessing it.
  • The user device 2 may directly provide the server 1 with the plurality of user location data.
  • Alternatively, the user device 2 may store the plurality of location data in a separate third device.
  • This third device may then provide the server 1 with the information stored therein.
  • the location of the user can be treated the same as the location of the user device.
  • The method of generating a location information model may include a step S12 of generating information about a plurality of location areas by clustering the plurality of user location data into the plurality of location areas using a probability-based clustering algorithm.
  • the location area may be regarded as an example of an information area formed by clustering a set of big data.
  • the probability-based clustering algorithm may be, for example, the EM algorithm described through Equation 1 to Equation 4, but is not limited thereto.
  • The information about the plurality of location areas may include information about clustered location areas as shown in FIG. 4, and may also include information about the probability of moving from one of these location areas to another.
  • For example, the information about the plurality of location areas may include information about clusters #1 to #6 (301 to 306) as shown in FIG. 5, and may include information about the probability of moving from any one of these clusters to another cluster.
  • Each location region or each cluster above may contain information about the range of geolocations that it represents.
  • the information about the plurality of location areas described above may be, for example, a concept such as the Markov chain shown in FIG. 5 or a concept including or similar to the Markov chain.
  • the generating step S12 may include more subdivided steps. This will be described with reference to FIG. 6B.
  • the method may include a step (S610) of dividing the plurality of user location data into a plurality of initial clusters.
  • the initial cluster may be created according to a predetermined deterministic rule or a random rule.
  • The method may include a step (S620) of refining the initial clusters into a plurality of second clusters whose boundary regions differ from those of the initial clusters, using, for example, the EM algorithm-based clustering described through Equations 1 to 4.
  • In this case, a formula representing a Levy walk may be used as the probability density function for the EM algorithm-based clustering. The Levy walk is a well-known concept discussed by Marta C. Gonzalez, Cesar A. Hidalgo, and Albert-Laszlo Barabasi in their 2008 Nature paper, "Understanding individual human mobility patterns."
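  • The refinement of initial clusters in step S620 can be illustrated with a small expectation-maximization loop. The following is only a sketch under stated assumptions: it swaps in a one-dimensional, two-component Gaussian mixture in place of the density of Equations 1 to 4 or a Levy walk, and the names `gaussian_pdf` and `em_two_clusters` and the sample values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution (assumed stand-in for the patent's pdf)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_clusters(xs, initial_means, iterations=50):
    """Refine two initial cluster means with EM on a 1-D Gaussian mixture."""
    means = list(initial_means)
    weights = [0.5, 0.5]
    variances = [1.0, 1.0]
    resp = [[0.5, 0.5] for _ in xs]
    for _ in range(iterations):
        # E-step: responsibility of each cluster for each point.
        resp = []
        for x in xs:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and variance of each cluster.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # empty cluster: keep its previous parameters
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a zero variance
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, labels


# Hypothetical 1-D positions forming two groups, with rough initial centers.
xs = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
means, labels = em_two_clusters(xs, [1.0, 4.0])
```

  • The initial centers (1.0 and 4.0) play the role of the initial clusters of step S610; the loop moves them toward the centers of the two groups, which corresponds to the second clusters having different boundaries.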
  • the method may include extracting location data statistics from each group of the second cluster (S630).
  • 'location data' may be regarded as an example of 'big data'.
  • The location data statistics may include, for each group, information such as the center position, the standard deviation of positions, the mean radius, the stay time, and the number of user location data (number of GPS data).
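  • The per-group statistics of step S630 might be computed as follows. This is a minimal sketch, assuming each sample is a `(timestamp_s, latitude, longitude)` tuple, that distances are taken in degrees with a planar approximation, that the standard deviation is the root-mean-square distance from the center, and that the stay time is the span between the first and last sample. The function name `cluster_statistics` is hypothetical.

```python
import math
from statistics import mean


def cluster_statistics(points):
    """Summarise one cluster of (timestamp_s, latitude, longitude) samples."""
    times = [t for t, _, _ in points]
    lats = [lat for _, lat, _ in points]
    lons = [lon for _, _, lon in points]
    center = (mean(lats), mean(lons))
    # Distance of every sample from the center, in degrees (planar approximation).
    radii = [math.hypot(lat - center[0], lon - center[1]) for lat, lon in zip(lats, lons)]
    return {
        "center": center,
        "std_dev": math.sqrt(mean(r * r for r in radii)),  # RMS distance (assumed)
        "mean_radius": mean(radii),
        "stay_time_s": max(times) - min(times),  # first-to-last span (assumed)
        "count": len(points),
    }


# Hypothetical samples at the four corners of a 2-degree square.
stats = cluster_statistics([(0, 0.0, 0.0), (60, 0.0, 2.0), (120, 2.0, 0.0), (180, 2.0, 2.0)])
```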
  • the method may include the step S640 of building a Markov chain using the location data statistics.
  • a state in a Markov chain may mean each location region, and a transition probability may indicate a movement probability (relative probability) between the location regions.
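  • One way to obtain such transition probabilities is to count how often consecutive location samples fall in each pair of location regions. The following is a sketch under that assumption; the description itself only states that the chain is built from the location data statistics, and the name `build_markov_chain` and the visit sequence are hypothetical.

```python
from collections import Counter, defaultdict


def build_markov_chain(visits):
    """Estimate transition probabilities from a time-ordered list of cluster ids."""
    counts = defaultdict(Counter)
    for src, dst in zip(visits, visits[1:]):
        counts[src][dst] += 1
    chain = {}
    for src, row in counts.items():
        total = sum(row.values())
        # Normalise each row so that its probabilities sum to 1.
        chain[src] = {dst: n / total for dst, n in row.items()}
    return chain


# Hypothetical sequence of location regions visited by one user.
visits = [1, 1, 1, 2, 2, 1, 3, 1, 1, 2]
chain = build_markov_chain(visits)
```

  • Each key of `chain` is a state (location region), and each inner value is the relative probability of moving to the next region, including the probability of staying in place.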
  • One location information model may be generated.
  • One location information model may represent a movement pattern of one user.
  • The method may further include a step S13 of receiving one or more new user location data other than the plurality of user location data described above. Thereafter, the method may further include a step (S14) of refining the information about the plurality of location areas by using the one or more new user location data. That is, the location information model, which is the information about the plurality of location areas generated by steps S11 and S12, can be updated by receiving and using new user location data.
  • When steps S13 and S14 are executed in addition to steps S11 and S12, steps S13 and S14 may be performed before performing step S110 shown in FIG. 6A.
  • All user location data described above may include the latitude and longitude at which the user was located when the data was obtained, together with the acquisition time.
  • In the above description, steps S12 to S14 of the method of generating the location information model are performed in the server 1, but they can also be performed directly in a user device 2 that has sufficient computing power.
  • In that case, step S11, in which the server 1 acquires the user location data, may be omitted.
  • This user positioning method can be executed in the server 1.
  • the user location determining method may include receiving a current location of the user (S110).
  • the current location of the user may be generated by the user device 2 possessed by the user by detecting his or her location information.
  • a user may carry two or more user devices at the same time or carry them alternately. Therefore, the user device that has sent the user location data in step S11 and the user device that has sent the user location data in step S110 may be the same or different.
  • the user location determination method may include the step (S120) of determining a first location area that includes the current location of the user of the plurality of predetermined location areas.
  • The predetermined plurality of location areas may be generated by the above-described step S12. That is, they may be generated by processing the plurality of user location data collected in step S11 with the probability-based clustering algorithm.
  • According to this embodiment, the user's location is not determined simply as a single latitude and longitude value taken from geolocation information. Instead, it is mapped to one of a plurality of location areas created by applying a probability-based clustering algorithm to information about the user's past locations.
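  • The mapping of step S120 can be sketched as a search over the stored location areas. The following assumes, purely for illustration, that each location area is stored as a circle `(center_lat, center_lon, radius)` in degrees; the description does not fix a representation, and the name `find_location_area` and the coordinates are hypothetical.

```python
import math


def find_location_area(current, areas):
    """Return the id of the nearest area whose circle contains `current`, else None.

    `current` is (latitude, longitude); `areas` maps an area id to
    (center_lat, center_lon, radius), all in degrees.
    """
    best_id, best_dist = None, float("inf")
    for area_id, (lat, lon, radius) in areas.items():
        dist = math.hypot(current[0] - lat, current[1] - lon)
        if dist <= radius and dist < best_dist:
            best_id, best_dist = area_id, dist
    return best_id


# Hypothetical location areas of one user.
areas = {"home": (37.50, 127.00, 0.01), "work": (37.52, 126.92, 0.01)}
inside = find_location_area((37.501, 127.002), areas)
outside = find_location_area((37.60, 127.10), areas)
```

  • A result of `None` corresponds to a location that belongs to none of the clusters, for example a sample recorded while the user is moving between them.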
  • each step of this embodiment is shown to be executed in the server 1, but in the modified embodiment, it may be executed in the user device 2.
  • the user device 2 may need to be able to execute step S12. Otherwise, the user device 2 may need to receive and store information about the plurality of location areas described in the first embodiment from the server 1 that has performed the steps S11 and S12.
  • This user positioning method can be executed in the server 1.
  • This embodiment assumes that the current location of the user is determined according to step S120 described in the second embodiment. In the following Embodiment 3, it is assumed that the current location of the user determined in step S120 exists in the first location area for convenience of description.
  • The user location information providing method may include a step (S130) of extracting movement probability information indicating the probability that the user will move from the first location area, where the user is currently located, to a second location area.
  • the first location area and the second location area may be, for example, cluster # 1 301 and cluster # 2 302 of FIG. 5, wherein the above probability information is 0.0986 or a value processed therefrom.
  • For example, the processed value may be 1 when the probability of movement is greater than 0.5, and 0 otherwise.
  • The processing is not limited thereto and may take various forms.
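  • The binary processing described above can be written in one line. This is a minimal sketch; the function name `processed_value` is hypothetical, and 0.0986 is the probability quoted for the move from cluster #1 to cluster #2.

```python
def processed_value(move_probability, threshold=0.5):
    """Return 1 when the movement probability exceeds the threshold, else 0."""
    return 1 if move_probability > threshold else 0


cluster1_to_cluster2 = processed_value(0.0986)
```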
  • each of the clusters shown in FIG. 5 corresponds to some of the plurality of location areas obtained from the user's location data. That is, there may be location areas not represented by each cluster.
  • Each of the clusters shown in FIG. 5 represents a space in which the user mainly resides, and location data generated when moving between the spaces may not be included in each cluster. Therefore, it can be easily understood that the above-described first position region is not necessarily limited to any one of the clusters shown in FIG. 5.
  • This user location information providing method can then provide the above moving probability information to the subscriber device 3 (S140).
  • Alternatively, based on the probability of movement, information indicating that the user will reach a nearby location (user proximity information) may be provided (S140).
  • the provision may be via a wired or wireless network.
  • the subscriber device may be different from or the same as the user device.
  • the subscriber device may be, for example, a device used by an operator who regards the above user as a customer or a potential customer, and may be a mobile device or a fixed device.
  • The server 1 may provide the subscriber device 3 with movement probability information for the user to move from the first location area to the second location area, which can meet the business needs of the operator.
  • In some cases, however, the operator is unlikely to have much interest in the probability of the user moving to the second location area, because the operator's place of business is not located in the second location area.
  • When there are a plurality of operators and each operator uses a different subscriber device 3, the server 1 does not have to provide every subscriber device 3 with all of the movement probabilities between the user's location areas; it may or may not provide particular movement probability information according to the needs of each operator. For example, a subscriber device used by an operator having a place of business in the nth location area may be provided with only the probability that the user moves to the nth location area.
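  • The selective provision described above might look as follows. This is a sketch only, assuming the server holds one row of movement probabilities per current location area and a table of the location area in which each operator has its place of business; all names and values are hypothetical.

```python
def movement_info_for_subscribers(chain, current_area, subscriber_areas):
    """Return, per subscriber, only P(current_area -> that subscriber's area)."""
    row = chain.get(current_area, {})
    return {
        subscriber: row.get(area, 0.0)
        for subscriber, area in subscriber_areas.items()
    }


# Hypothetical movement probabilities of one user and two operators' areas.
chain = {1: {1: 0.7, 2: 0.1, 3: 0.2}}
subscriber_areas = {"operator_a": 2, "operator_b": 3}
info = movement_info_for_subscribers(chain, 1, subscriber_areas)
```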
  • the server 1 may transmit information (provider proximity information) that the user device 2 will approach the provider after a predetermined time, to the user device 2.
  • Together with or separately from the steps S140 and S150, the server 1 may perform a step S160 of providing a means by which the provider using the subscriber device 3 can promote to and communicate with the user of the user device 2.
  • the subscriber device 3 is not necessarily limited thereto.
  • the method for providing location information according to the present embodiment is illustrated as being performed in the server 1, but in the modified embodiment, the providing method may be performed in the user device 2.
  • the user device 2 needs to have the ability to perform the steps S120, S130 and S140 instead of the server 1.
  • FIG. 7 is for explaining a user information processing method according to another embodiment of the present invention. This method can be executed on the subscriber device 3.
  • The user information processing method may include a step (S210) of requesting, from the server, identification information about a user whose probability of moving to a specific location area, which includes a specific location designated by the user of the subscriber device 3, satisfies a predetermined rule.
  • the specific location designated by the user of the subscriber device 3 may be, for example, a location where a company using the subscriber device 3 actually sells goods or services.
  • the specific location area may be any one of a plurality of predetermined location areas clustered as shown in FIG. 4. In other words, the plurality of location areas may be, for example, clusters # 1 to # 6 (301 to 306) as shown in FIG.
  • the probability of moving to the specific location area may refer to, for example, the probability of the user moving to cluster # 1 301 in FIG. 5.
  • The specific location is designated by the user of the subscriber device 3, but it is the server 1 that can determine which location area includes the specific location and where that area lies.
  • This user information processing method may include a step (S220) of receiving, from the server 1, information (e.g., a user ID or a user device ID) about a user whose probability of moving to the above specific location area satisfies a predetermined rule.
  • The predetermined rule may be, for example, a rule for determining whether the probability of movement is greater or smaller than a specific value, but is not limited thereto.
  • the selected user may mean a user adjacent to a business place of a provider of the subscriber device 3.
  • the server 1 may perform one or more of the following steps shown in FIG.
  • The server 1 may receive a plurality of first user location data collected from a first user who is stationary or moving with a user device (S11), and a plurality of second user location data collected from a second user (S21).
  • The server 1 may cluster the plurality of first user location data into a plurality of first location areas by using the probability-based clustering algorithm described in Embodiment 1 to generate information about the plurality of first location areas, and may cluster the plurality of second user location data into a plurality of second location areas to generate information about the plurality of second location areas (S33).
  • the information about the plurality of first location areas and the information about the plurality of second location areas may be the same concept as the Markov chain shown in FIG.
  • In addition to step S33, the above-described steps S13 and S14 may be further performed.
  • the server 1 may perform the step S111 of receiving the current location of the first user and the step S112 of receiving the current location of the second user.
  • the user device sending the current location of the first user in step S111 and the user device sending the plurality of first user location data in step S11 may be the same or different.
  • The user device which sent the current location of the second user in step S112 and the user device which sent the plurality of second user location data in step S21 may be the same or different. This is as described in Embodiment 2.
  • The server 1 may perform a step (S121) of determining location area 1, which includes the current location of the first user, among the plurality of first location areas, and determining location area 2, which includes the current location of the second user, among the plurality of second location areas.
  • the predetermined plurality of location areas may be generated by the above-described step (S33).
  • the server 1 may perform a step (S131) of extracting a first probability for the first user and a second probability for the second user.
  • The first probability is the probability that the first user moves from location area 1, where the first user is currently staying, to the specific location area [1] that includes the specific location specified by the user of the subscriber device 3.
  • The specific location area [1] is a location area included in the plurality of first location areas.
  • The second probability is the probability that the second user moves from location area 2, where the second user is currently staying, to the specific location area [2] that includes the specific location specified by the user of the subscriber device 3.
  • The specific location area [2] is a location area included in the plurality of second location areas.
  • the specific location area [1] and the specific location area [2] above correspond to a specific location area including the specific location specified by the subscriber device 3.
  • the server 1 may determine whether the first probability and the second probability satisfy a predetermined rule, and select the information about the satisfied user (S141).
  • the predetermined rule may be, for example, a rule for selecting only information of a user whose movement probability is 0.5 or more, but is not limited thereto.
  • The server 1 may transmit, to the user devices 21 and 22, information (provider proximity information) indicating that the user device 2 will approach the provider after a predetermined time. Further, together with or separately from the steps S220 and S230, the server 1 may perform a step S240 of providing a means by which the provider using the subscriber device 3 can promote to and communicate with the users of the user devices 21 and 22.
  • the steps executed in the server 1 may be executed in the user device 2 in some cases.
  • the steps executed by the server 1 may be executed in the user devices 21 and 22.
  • FIGS. 8A and 8B illustrate the steps S11, S21, S33, S111, S112, S121, S131, S141, S210, and S220 described with reference to FIG. 7.
  • FIG. 8A illustrates information about a plurality of first location areas obtained by using a probability-based clustering algorithm with a plurality of first user location data obtained from a first user.
  • three cluster-regions in which the user mainly resides are given, and there may be one or more location regions different from this cluster-region.
  • FIG. 8B illustrates information about a plurality of second location areas obtained by using a probability-based clustering algorithm with a plurality of second user location data obtained from a second user.
  • three cluster-regions in which the user mainly resides are given, and there may be one or more location regions different from this cluster-region.
  • Assume that the specific location designated by the subscriber device 3 is 1 Seocho-dong.
  • Assume also that the current location of the first user is location area #11 (Yeouido) and the current location of the second user is location area #21 (Jongno).
  • The specific location area including 1 Seocho-dong, the specific location designated by the subscriber device 3, is then location area #12 (1 Seocho-dong and 2 Seocho-dong) for the first user, and location area #22 (1 Seocho-dong and 3 Seocho-dong) for the second user.
  • That is, the specific location area [1] described above is location area #12, and the specific location area [2] is location area #22.
  • Assume that the probability that the first user moves from location area #11 to location area #12 is 0.6,
  • and that the probability that the second user moves from location area #21 to location area #22 is 0.4.
  • If the above-described predetermined rule is a rule for selecting only information of a user whose movement probability is 0.5 or more, then in the above case only the information of the first user is selected and transmitted to the subscriber device 3.
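  • The selection in this example can be reproduced with a few lines of code. The following is a sketch that uses the probabilities 0.6 and 0.4 and the area numbers given above; the data layout and the name `select_users` are hypothetical.

```python
def select_users(users, minimum=0.5):
    """Return the users whose probability of moving to their target area is >= minimum."""
    selected = []
    for name, info in users.items():
        row = info["chain"].get(info["current"], {})
        if row.get(info["target"], 0.0) >= minimum:
            selected.append(name)
    return selected


# Each user has an own model, so the area that contains 1 Seocho-dong
# differs per user (#12 for the first user, #22 for the second user).
users = {
    "first user": {"current": 11, "target": 12, "chain": {11: {12: 0.6}}},
    "second user": {"current": 21, "target": 22, "chain": {21: {22: 0.4}}},
}
selected = select_users(users)
```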
  • If the user of the subscriber device 3 is, for example, a business operator that provides goods or services and operates a store located in 1 Seocho-dong, the operator can deliver marketing information individually to the first user, who is geographically likely to visit the store.
  • The specific location is designated by the user of the subscriber device 3, but it is the server 1 that can determine which location area includes the specific location and where that area lies.
  • The specific location area may be different for the first user and the second user. That is, for the first user, the specific location area including the above specific location is location area #12, whereas for the second user it is location area #22.
  • It may therefore be difficult for the subscriber device 3 and/or its user to designate or determine the specific location area.
  • FIG. 9 is for explaining a user information processing method according to another embodiment of the present invention. This method may be performed at the subscriber device 3.
  • the fifth embodiment is a modified example of the fourth embodiment, and concepts not specifically described herein may be the same as those described in the fourth embodiment.
  • The user information processing method may include a step (S310) in which the subscriber device 3 requests, from the server 1, information about the probability that the user moves to a specific location area including the specific location.
  • The subscriber device 3 or the user of the subscriber device 3 may know information about the user in advance.
  • the specific location may be specified by the user using the subscriber device 3, or may be a fixed location or a location that varies with time.
  • the specific location may be a location where an operator using the subscriber device 3 actually sells goods or services.
  • the server 1 can determine which area the specific location area including the specific location means.
  • the subscriber device 3 can receive information from the server 1 about the probability of moving to the above specific location area.
  • the information about this probability may be the probability itself or may be processed information obtained by processing the value of the probability itself.
  • the predetermined rule may be a rule for determining whether the probability of moving is greater or less than a specific value.
  • the server 1 may perform the following steps.
  • the server 1 may perform a step S11 of receiving a plurality of user location data collected from a user who is stationary or moving with a user device.
  • The server 1 may perform a step (S12) of clustering the plurality of user location data into a plurality of location areas by using the probability-based clustering algorithm described in Embodiment 1, thereby generating information about the plurality of location areas.
  • the information about the plurality of location areas may be the same concept as the Markov chain shown in FIG. 5 or a concept including the Markov chain. If information on the plurality of location areas is provided in advance, the performance of steps S11 and S12 may be omitted.
  • the server 1 may perform step S110 of receiving the current location of the user.
  • The user device which sent the current location of the user in step S110 and the user device which sent the plurality of user location data in step S11 may be the same or different. This is as described in Embodiment 2.
  • The server 1 may perform a step (S120) of determining, among the plurality of location areas, a first location area to which the current location of the user is mapped.
  • the predetermined plurality of location areas may be generated by the above-described step S12.
  • the server 1 may perform a step S130 of extracting probability for the user.
  • the above probability is a probability that the user moves from the first location area currently staying to the specific location area including the specific location designated by the subscriber device 3.
  • the specific location area is a location area included in the plurality of location areas.
  • the server 1 transmits the information about the extracted probability to the subscriber device 3.
  • the user proximity information indicating whether the user of the user device 2 is close to the provider of the subscriber device 3 may be transmitted based on the extracted probability.
  • the server 1 may transmit information (provider proximity information) that the user device 2 will approach the provider after a predetermined time, to the user device 2.
  • Together with or separately from the steps S320 and S330, the server 1 may perform a step S340 of providing a means by which the provider using the subscriber device 3 can promote to and communicate with the user of the user device 2.
  • the steps executed in the server 1 may be executed in the user device 2 in some cases.
  • the steps executed by the server 1 may be executed in the user device 2.
  • FIG. 10 shows a structure of a server that can be used for one embodiment of the present invention.
  • the server 1 may include a communicator 100 and a processor 200 that exchanges data and / or commands with the communicator 100.
  • the processor 200 may include a calculator 210 and a storage 220.
  • The processor 200 may calculate the statistics for each group shown as an example in FIG. 3 and generate a Markov chain as shown in FIG. 5, and may execute steps S12, S120, and S130 of FIG. 6A, steps S33, S121, S131, and S141 of FIG. 7, and steps S12, S120, and S130 of FIG. 9.
  • The communicator 100 may execute steps S11, S110, and S140 of FIG. 6A, steps S11, S21, S112, S121, and S220 of FIG. 7, and steps S11, S110, and S320 of FIG. 9.
  • Markov chains, as used in an embodiment of the present invention, are divided into continuous-time Markov chains and discrete-time Markov chains. Either may be used, but according to an embodiment it may be preferable to use a continuous-time Markov chain.
  • 'Location data' described in the above embodiments may be regarded as an example of 'big data'.
  • An embodiment of the present invention may include a model generated by steps in which part or all of a set of big data is clustered into a plurality of cluster-regions, each cluster-region is represented as a mathematical state, and the relationship between the cluster-regions is expressed as a probabilistic function. In this case, when the plurality of cluster-regions are generated from some data of one set of big data, the remaining data may be mapped to an information region different from the plurality of cluster-regions.
  • the Markov Chain model shown in FIG. 5 can be regarded as a form of this model.
  • the location information model described in Embodiment 1 may be regarded as an example of the 'data information model' generated from big data according to an embodiment of the present invention to be described below.
  • the 'data information model' may be referred to as a 'data pattern'.
  • the “location area” described in Embodiment 1 may be regarded as an example of an “information area” formed by grouping a set of big data.
  • The method according to an aspect of the present invention for detecting a movement pattern from a person's movement data, as described above, may be extended and applied to general 'big data'. This is because both the person's 'movement data' and 'big data' include one or more attributes, and those attributes include time information in common.
  • 1) The concept of 'user location data', shown in superscript in FIGS. 6A, 6B, 6C, 7, 8A, 8B, and 9, may be extended to 'data of interest'.
  • 2) The concept of 'location area', shown in superscript, may be extended to 'information area'.
  • a method of generating a data-pattern of big data is provided.
  • The method includes a step (S610) of receiving a plurality of data (big data) related to the object of interest, and a step (S620) of generating a data pattern relating to a plurality of 'information areas' by processing the plurality of data with a probability-based clustering algorithm.
  • For example, the object of interest may be a person, and the data may be location data.
  • As another example, the object of interest may be weather change, and the data may be weather data.
  • The weather data may include information such as time, wind speed, and wind direction.
  • The location data and the weather data presented above as examples have in common that both include information about time.
  • The above information areas may be regarded as the union of 'cluster-regions', which correspond to an 'attribute of interest', and regions that do not belong to any cluster-region.
  • For example, suppose the weather data includes information on barometric pressure, rainfall, latitude, and longitude.
  • The attribute of interest may then be defined as "latitude-longitude under rain and low pressure". Data having that attribute is mapped to a cluster-region, and the other data can be mapped to a filter-set region.
  • A total of N collected data can be clustered to generate M information areas. In this case, each of the N data may be mapped to one of the M information areas.
  • the generating step (S620) may include the step (S621) of filtering the error data of the data based on the attributes of the data.
  • For example, the location data includes time, latitude, and longitude information, from which the speed and acceleration of the person of interest can be calculated.
  • The acceleration of the person can be regarded as an 'attribute of the data' in the above sense: if the data implies an implausibly large acceleration, it can be determined that the data contains an error. In this manner, error data may be filtered out of the data based on the attributes of the data.
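  • The acceleration-based filtering of step S621 might be sketched as follows. The sketch assumes time-ordered `(t_seconds, latitude, longitude)` samples, uses the haversine formula for the distance between samples, starts from a speed of zero, and takes 10 m/s² as the limit; the threshold, the function names, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def filter_by_acceleration(samples, max_accel=10.0):
    """Drop samples whose implied acceleration (m/s^2) exceeds `max_accel`."""
    kept = []
    prev_speed = 0.0  # assumed initial speed
    for t, lat, lon in samples:
        if not kept:
            kept.append((t, lat, lon))
            continue
        t0, lat0, lon0 = kept[-1]
        dt = t - t0
        if dt <= 0:
            continue  # out-of-order or duplicate timestamp
        speed = haversine_m(lat0, lon0, lat, lon) / dt
        if abs(speed - prev_speed) / dt > max_accel:
            continue  # implausible jump: treat as error data
        kept.append((t, lat, lon))
        prev_speed = speed
    return kept


# Hypothetical track: the third sample jumps one degree of latitude in 10 s.
track = [(0, 37.5, 127.0), (10, 37.5001, 127.0), (20, 38.5, 127.0), (30, 37.5002, 127.0)]
clean = filter_by_acceleration(track)
```

  • Each sample is compared with the last accepted sample, so a single outlier does not cause the samples that follow it to be rejected as well.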
  • The generating step (S620) may further include a step (S622) of dividing the plurality of data into a plurality of initial information areas, a step (S623) of refining the initial information areas into a plurality of second information areas using probability-based clustering, a step (S624) of extracting statistics of the data belonging to each of the plurality of second information areas, and a step (S625) of expressing the cluster-regions as mathematical states and expressing the relationship between the cluster-regions as a probabilistic function using the statistics.
  • the mathematical state may mean each cluster-region, and the probabilistic function may express 'relation-probability' between the cluster-regions.
  • a Markov chain as shown in FIG. 5 may be generated by step S625 above.
  • the relation probability may be defined as a probability of transitioning from the first state to the second state of the Markov chain.
  • An example of the above probability-based clustering algorithm is the expectation-maximization clustering technique described above.
  • As the probability function in this clustering, a function predetermined according to the properties of the data may be used.
  • For example, the modified exponential distribution of Equation 4 may be used as the probability function.
  • After the generating step (S620; steps S621 to S625), a step (S630) of receiving one or more new data regarding the object of interest and a step (S640) of refining the information about the information areas using the one or more new data may be further performed.
  • new data may be mapped to any one of the plurality of information areas that are already generated.
  • the method according to the sixth embodiment can be performed in a device such as a terminal or a server having a processing unit and a communication unit.
  • the communication unit may serve to acquire new data from the outside of the apparatus, and the processing unit may perform the various steps described above by using the data obtained from the communication unit.
  • the relational probability of another state associated with a particular state may be a fixed value, but may also be a function of time.
  • the relationship between a person's height and weight may change over a long period of time, but for a short time it is a fixed value.
  • the location of a particular person will change over time in most cases. In other words, the transition from the 'home' state to the 'work' state is concentrated at a certain time of day, namely when the person goes to work. In this case, the relation probability between home and work must be expressed as a function of time.
  • the relation probability that the atmospheric pressure of Cheongju will affect the atmospheric pressure of Daejeon within a few hours is a function of time.
  • the relation probability that the dark clouds of Daejeon will affect the rainfall probability of Cheongju is also a function of time.
  • regarding the relation probability between specific states as a function of time therefore gives a more accurate way of describing the relation probability.
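  • A minimal sketch of a relation probability expressed as a function of time: transitions are grouped by hour of day, and the probability of moving from one state to another is estimated separately for each hour. The function name `hourly_relation_probability`, the record layout, and the sample observations are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

def hourly_relation_probability(observations, src, dst):
    """Estimate P(next state == dst | current state == src) for each hour of day.

    observations: iterable of (hour, current_state, next_state) tuples.
    """
    from_src = Counter()  # observations that start in src, per hour
    to_dst = Counter()    # of those, how many end in dst, per hour
    for hour, current, nxt in observations:
        if current == src:
            from_src[hour] += 1
            if nxt == dst:
                to_dst[hour] += 1
    return {hour: to_dst[hour] / n for hour, n in from_src.items()}

# Hypothetical observations: (hour of day, current state, next state).
obs = [
    (8, "home", "work"), (8, "home", "work"), (8, "home", "home"), (8, "home", "work"),
    (14, "home", "home"), (14, "home", "home"), (14, "home", "work"), (14, "home", "home"),
]
p_home_to_work = hourly_relation_probability(obs, "home", "work")
```

  • With this sample data the estimated home-to-work probability is higher in the morning hour than in the afternoon hour, which is the kind of time dependence the text describes.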
  • the method includes a step S710 of requesting information from a server about an object of interest.
  • the request is transmitted to the server only when the relation probability that the attribute of interest of the data of interest will be included in the category of the attribute of interest of a specific information area, among the plurality of information areas, satisfies a predetermined rule.
  • because any one information area is a set including one or more data, the attribute of interest of the data contained in that information area may have not a single value but a predetermined category (range) of values.
  • the interest attribute of the weather data may be defined as 'atmospheric pressure'.
  • the plurality of information areas include the information area A and the information area B.
  • the information area A may be an area including data whose attribute of interest, the air pressure, has a value of 760 mmHg or less
  • the information area B may be an area containing data whose air pressure has a value greater than 760 mmHg.
  • step S710 can be embodied as follows.
  • the request is sent to the server only when the relation probability p satisfies a predetermined rule (ex: p > 0.5).
  • the request is sent to the server only if the relation probability p that the latitude will fall within the range of 89° to 90° satisfies a predetermined rule (ex: p > 0.5).
  • whether the relation probability p satisfies the predetermined rule may be determined by a process in which the server receives information on the current data of interest (S711), determines a first information area to which the current data is mapped among the plurality of information areas (S712), and determines whether the relation probability that the attribute of interest of the data of interest, currently included in the category of the attribute of interest of the first information area, will come to be included in the category of the attribute of interest of the specific information area satisfies the predetermined rule (S713).
  • in step S712, it may be determined that the first information area to which the current data is mapped among the plurality of information areas is the information area B (ex: barometric pressure > 760 mmHg).
  • in step S713, it may be determined whether the relation probability p that the attribute of interest (ex: barometric pressure) of the weather data of interest will belong to the category of the attribute of interest (ex: barometric pressure ≤ 760 mmHg) of a specific information area (ex: information area A) satisfies the predetermined rule (ex: p > 0.5).
  • in step S713, it may be determined whether the relation probability p that the attribute of interest (ex: latitude) of the location data of interest will belong to the category of the attribute of interest (ex: latitude between 89° and 90°) of the specific information area (ex: information area C) satisfies the predetermined rule (ex: p > 0.5).
  • step S720 of receiving the information about the object of interest from the server may be performed.
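  • The decision logic of steps S712 and S713 can be sketched as below, using the barometric-pressure example. The source assigns this determination to the server; for brevity the sketch collapses it into local functions. The relation probabilities in `RELATION_P` are invented placeholder values, and the names `map_to_area` and `should_request` are hypothetical.

```python
# Information areas from the example: A is "760 mmHg or less", B is "above 760 mmHg".
AREAS = {
    "A": lambda mmhg: mmhg <= 760,
    "B": lambda mmhg: mmhg > 760,
}

# Hypothetical relation probabilities p(first area -> target area); not from the source.
RELATION_P = {
    ("A", "A"): 0.7, ("A", "B"): 0.3,
    ("B", "A"): 0.6, ("B", "B"): 0.4,
}

def map_to_area(pressure_mmhg):
    """S712: find the information area whose category contains the current value."""
    for name, contains in AREAS.items():
        if contains(pressure_mmhg):
            return name
    raise ValueError("value does not fall in any information area")

def should_request(current_pressure_mmhg, target_area, threshold=0.5):
    """S713: apply the predetermined rule (here p > threshold) to the relation probability."""
    first_area = map_to_area(current_pressure_mmhg)
    p = RELATION_P[(first_area, target_area)]
    return p > threshold

send_when_high = should_request(765.0, "A")  # current data maps to area B
send_when_low = should_request(750.0, "B")   # current data maps to area A
```

  • Only when `should_request` returns `True` would the request of step S710 be sent, so that the information of step S720 is fetched only when the transition of interest is sufficiently likely.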
  • the method may include requesting the server for information about a relation probability that an interest attribute of data about an object of interest belongs to a category of an interest attribute of a specific information region belonging to a plurality of predetermined information regions (S810) and the relationship.
  • the relation probability may be calculated by a process in which the server determines a first information area to which the current data of interest is mapped among the plurality of information areas (S811), and extracts information about the relation probability that the attribute of interest of the data of interest, currently included in the category of the attribute of interest of the first information area, will be included in the category of the attribute of interest of the specific information area (S812).
  • in step S810, the attribute of interest (ex: latitude) of the data about the object of interest is considered with respect to a plurality of predetermined information areas (ex: information area C and information area D (ex: latitude between 84° and 89°)).
  • the relation probability concerns whether the attribute of interest (ex: latitude) of the data of interest will fall in the category of the attribute of interest (ex: latitude between 89° and 90°) of the specific information area (ex: information area C).
  • the method may include receiving current data of interest (S910), determining a first information area to which the current data is mapped among a plurality of predetermined information areas (S920), and providing information about the relation probability that the attribute of interest of the data of interest, currently included in the category of the attribute of interest of the first information area, will be included in the category of the attribute of interest of a second information area belonging to the plurality of information areas (S930).
  • in step S920, it can be determined that the first information area to which the current data is mapped among the plurality of predetermined information areas (ex: information area C and information area D) is the information area D (ex: latitude between 84° and 89°).
  • the prediction method includes collecting a plurality of user location data about a user of the user device 2 (S11), and generating information on a plurality of location areas by processing the plurality of user location data with a probability-based clustering algorithm (S12, S33). Next, it includes receiving the current location data of the user of the user device 2 (S110, S111), and determining a first location area, among the plurality of location areas, to which the current location data is mapped (S120, S121). Next, the method may include generating information on the probability that the user of the user device 2 will move from the first location area to a second location area among the plurality of location areas (S130, S131, and S141).
  • the prediction method may be performed in a device such as a server or a user device including a communication unit and a processing unit, and a computer readable medium having a program for executing the method may be provided. Providing an apparatus and a medium related to this prediction method can be easily accomplished.
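  • The later steps of the prediction method (mapping the current location to a first location area, then reporting the probabilities of moving to other areas) might look like the sketch below. It assumes the clustering has already produced area centers and movement probabilities; the coordinates, the probabilities, and the nearest-center mapping rule are all illustrative assumptions, the last being a simplification of mapping through the fitted probability functions.

```python
import math

# Hypothetical cluster-region centers as (latitude, longitude); values are illustrative.
CENTERS = {
    1: (37.50, 127.00),
    2: (37.56, 126.80),
    3: (37.48, 126.78),
}

# Hypothetical probabilities of moving from one location area to another.
MOVE_P = {
    1: {2: 0.6, 3: 0.4},
    2: {1: 0.7, 3: 0.3},
    3: {1: 0.5, 2: 0.5},
}

def nearest_area(lat, lon):
    """Map a location to the area with the closest center.

    Plain Euclidean distance in degrees is used as a simplification.
    """
    return min(
        CENTERS,
        key=lambda area: math.hypot(lat - CENTERS[area][0], lon - CENTERS[area][1]),
    )

def predict_next_areas(lat, lon):
    """Return the first location area and candidate second areas, most probable first."""
    first_area = nearest_area(lat, lon)
    ranked = sorted(MOVE_P[first_area].items(), key=lambda item: item[1], reverse=True)
    return first_area, ranked

first_area, ranked = predict_next_areas(37.51, 126.99)
```

  • The ranked list is the kind of "information related to the probability of moving" that a server could return to the user device, or that the device could compute locally.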
  • FIG. 11 is an example showing on a map the big data (location data of the first experimenter in Seoul) collected according to an embodiment of the present invention.
  • FIG. 12 illustrates on a map another example of big data collected in accordance with an embodiment of the present invention (location data of a second experimenter in Seoul and Jeju Island).
  • FIG. 13 illustrates an example of a result of clustering the big data shown in FIG. 11 on a map.
  • cluster-region # 5 is formed near Gimpo
  • cluster-region # 3 is formed near Bucheon
  • cluster-region # 2 (Cluster 2) is formed therebetween.
  • FIG. 14 is another example in which all of the results of clustering the big data shown in FIG. 11 are shown on a map. It can be seen that cluster-regions # 1 to # 6 are formed.
  • FIG. 15 shows on a map an example of a result of clustering, in Seoul, the big data shown in FIG. 12. It can be seen that cluster-regions # 1 to # 6 and # 13 are formed.
  • FIG. 16 shows on a map an example of a result of clustering the big data shown in FIG. 12 in the center of Jeju city on Jeju Island. It can be seen that cluster-regions # 7 to # 9 are formed.
  • FIG. 17 shows on a map an example of a result of clustering the big data shown in FIG. 12 across Jeju Island. It can be seen that cluster-regions # 7 to # 12 are formed.
  • FIG. 18 is an example of a probability density function used to cluster big data obtained from the first experimenter of FIG. 11.
  • the x axis of the graph shown in FIG. 18 represents the distance from the center of the cluster, and the y axis represents the probability density.
  • the solid line, long dotted line, and short dotted line in this graph represent the probability density functions in the cluster-regions # 2, # 5, and # 3, respectively, shown in FIG.
  • the probability density function in each cluster-region is the modified exponential distribution, and its parameters are calculated by probability-based clustering.
  • FIG. 19 is an example of a probability density function used to cluster big data obtained from the second experimenter of FIG. 12.
  • the solid line, long dotted line, and short dotted line in this graph represent the probability density functions in the cluster-regions # 7, # 8, and # 9, respectively, shown in FIG.
  • FIG. 20 illustrates a pattern expressed by CTMC extracted from big data obtained from the first experimenter of FIG. 11.
  • the six cluster-regions shown in FIG. 14 are shown together with the probability of transition between each cluster-region.
  • CTMC stands for continuous-time Markov chain, a technique that describes specific states and the probabilities of transition between those states in continuous time.
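  • To make the CTMC description concrete, the sketch below estimates transition rates from a timestamped visit log using the standard maximum-likelihood estimate (number of jumps from state i to state j divided by the total time spent in state i). The source does not state how its CTMC parameters were obtained, so this estimator, the function name `estimate_ctmc_rates`, and the sample log are assumptions for illustration.

```python
from collections import defaultdict

def estimate_ctmc_rates(visits, end_time):
    """Estimate CTMC transition rates q[(i, j)] = jumps(i -> j) / time spent in i.

    visits: list of (entry_time, state) sorted by time, with consecutive states differing.
    end_time: time at which observation stopped (closes the last dwell period).
    """
    dwell = defaultdict(float)  # total time spent in each state
    jumps = defaultdict(int)    # number of observed jumps for each (i, j) pair
    for (t0, s0), (t1, s1) in zip(visits, visits[1:]):
        dwell[s0] += t1 - t0
        jumps[(s0, s1)] += 1
    last_time, last_state = visits[-1]
    dwell[last_state] += end_time - last_time
    return {(i, j): n / dwell[i] for (i, j), n in jumps.items()}

# Hypothetical log: (entry time in hours, cluster-region entered).
log = [(0.0, "home"), (8.0, "work"), (18.0, "home"), (32.0, "work")]
rates = estimate_ctmc_rates(log, end_time=42.0)
```

  • For a short interval dt, the probability of a jump from i to j is approximately the rate multiplied by dt, which links these rates to the transition probabilities shown between cluster-regions.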
  • FIGS. 21A and 21B illustrate a pattern expressed by CTMC extracted from big data obtained from the second experimenter of FIG. 12.
  • FIG. 21A shows the results in Seoul
  • FIG. 21B shows the results in Jeju Island.
  • the relationship probability (transition probability) between the cluster-region # 13 of FIG. 21A and the cluster-region # 9 of FIG. 21B is also displayed.
  • FIG. 22 shows detailed information about 13 clusters generated from big data obtained from the second experimenter of FIG. 12.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/KR2012/004997 2011-12-29 2012-06-25 Data processing method, data processing device, data collecting method and information providing method Ceased WO2013100287A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/369,585 US9846736B2 (en) 2011-12-29 2012-06-25 Data processing method, data processing device, data collecting method and information providing method

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2011-0146367 2011-12-29
KR20110146367 2011-12-29
KR10-2012-0060839 2012-06-07
KR1020120060839A KR101365993B1 (ko) 2011-12-29 2012-06-07 Data processing method, data processing device, data collecting method, and information providing method

Publications (1)

Publication Number Publication Date
WO2013100287A1 true WO2013100287A1 (fr) 2013-07-04

Family

ID=48697725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2012/004997 Ceased WO2013100287A1 (fr) 2011-12-29 2012-06-25 Data processing method, data processing device, data collecting method and information providing method

Country Status (1)

Country Link
WO (1) WO2013100287A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100127595A (ko) * 2009-05-26 2010-12-06 국방과학연구소 System and method for predicting the location of a moving object
JP2011118776A (ja) * 2009-12-04 2011-06-16 Sony Corp Data processing device, data processing method, and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HYUN UK, KIM ET AL.: "A Use of Expectation Maximization Clustering for Characterizing the Human Mobility Pattern", JOURNAL OF CONFERENCE, vol. 38, no. 2, November 2011 (2011-11-01), pages 261 - 264 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016032172A1 (fr) * 2014-08-29 2016-03-03 삼성전자 주식회사 System for determining the location of entrances and areas of interest
US9541404B2 (en) 2014-08-29 2017-01-10 Samsung Electronics Co., Ltd. System for determining the location of entrances and areas of interest
CN105188035A (zh) * 2015-08-11 2015-12-23 重庆邮电大学 Indoor WLAN augmented manifold alignment localization method based on transition-probability hotspot mapping
CN105188035B (zh) * 2015-08-11 2018-06-15 重庆邮电大学 Indoor WLAN augmented manifold alignment localization method based on transition-probability hotspot mapping
US10415978B2 (en) 2015-11-20 2019-09-17 Samsung Electronics Co., Ltd. Landmark location determination

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12861111

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 14369585

Country of ref document: US

122 Ep: pct application non-entry in european phase

Ref document number: 12861111

Country of ref document: EP

Kind code of ref document: A1