An Approach to Construct Semantic Networks with Confidence Scores Based on Data Analysis
– Case Study in Osaka Wholesale Market –

Takahiro Kawamura and Akihiko Ohsuga
Graduate School of Information Systems, University of Electro-Communications, Japan

Abstract—In recent years, several large-scale knowledge bases (KBs) have been constructed, such as YAGO, DBpedia, and Google Knowledge Graph. Although automatic extraction techniques that extract facts and rules from the Web are necessary for constructing such large-scale KBs, the incorporation of noisy, unreliable knowledge is unavoidable. Thus, Google Knowledge Vault assigns confidence scores to extracted knowledge based on consistency with existing KBs. In this paper, we propose a new approach for associating confidence scores with knowledge based on a large amount of raw data, for domains where no existing KB is available. We first construct knowledge in a specific domain as a semantic network, and then design a probabilistic network that corresponds to the semantic network. To associate confidence scores with the semantic network, we train the probabilistic network with a large amount of open data provided by the Osaka central wholesale market in Japan. We also confirm the validity of the confidence scores by the accuracy of reasoning on the probabilistic network. A semantic network associated with confidence scores, that is, a weighted labeled graph, is advantageous not only for removing noisy, unreliable knowledge with low confidence, but also for ranking retrieval results on the KB. In the future, probabilistic reasoning on semantic networks may also become possible.

Keywords—Knowledge Representation; Semantic Network; Probabilistic Network; RDF;

I. INTRODUCTION

Recently, there have been several activities for constructing large-scale knowledge bases (KBs), including YAGO [1], NELL [2], DBpedia [3], Microsoft Satori1, Google Knowledge Graph2, and Facebook3. These KBs mainly store millions of facts and rules about the world, such as information about people, places, and things, either in a relational form predicate(subject, object) or in a graph form subject --property--> object of the Resource Description Framework (RDF). However, for databases built from the contributions of human volunteers, such as Wikipedia, it has been observed that growth has its limits [4]. Thus, there are also many activities for automatic KB construction, which extract facts and rules from the Web to augment the knowledge collected from human volunteers and structured data sources.

1 blogs.bing.com/search/2013/03/21/understand-your-world-with-bing
2 googleblog.blogspot.jp/2012/05/introducing-knowledge-graph-things-not.html
3 www.adweek.com/socialtimes/facebook-builds-knowledge-graph-with-info-modules-on-community-pages

In automatic KB construction, however, the incorporation of noisy, unreliable knowledge is unavoidable. Thus, there are approaches that associate a confidence score with each fact and rule, representing the probability that the KB believes it to be correct. Knowledge Vault [5], Google's project to build a Web-scale probabilistic KB, compares extracted knowledge with existing KBs, such as Freebase [6] and Knowledge Graph. The extractor in Knowledge Vault assigns a confidence score to an extracted triple by computing the probability of the triple being true, based on the consistency between the triple and prior knowledge derived from the existing KBs. This approach is powerful and reasonable, but it is not applicable to domains for which no KB exists so far. Therefore, in this paper we propose a new approach for associating confidence scores with knowledge based on a large amount of raw data, such as sensor data and transaction data. In other words, we compare knowledge extracted or constructed as semantic networks with big data, and then calculate the probability that the actual data support the facts and rules in the networks. In addition to reasoning and search on the knowledge with high confidence, this is useful for removing the noisy, unreliable knowledge with low confidence. RDF provides a high degree of design flexibility, so schemas occasionally become complicated and the data size tends to grow. By keeping only the knowledge with high confidence scores, we can remove redundant triples and suppress the size of the KB. Thus, the contribution of this paper is a method for associating semantic networks with confidence scores based on data analysis. We then confirm the validity of the confidence scores by the accuracy of reasoning on the probabilistic networks. Specifically, assuming that a probabilistic network of entities in a domain matches a semantic network in the same domain, we first construct knowledge in a specific domain as a semantic network, and then design a template of a probabilistic network that corresponds to the semantic network. To associate the confidence scores with the semantic network, we train the probabilistic network with a large amount of data.

We use open data concerning the handling amounts and prices of fresh foods by prefecture for 10 years, provided by the Osaka central wholesale market in Japan. We also use meteorological data, such as temperature, solar radiation, and the amount of rainfall by prefecture, provided by the Japan Meteorological Agency (JMA). Then, we associate the probabilities of the trained network with the corresponding properties in the semantic network, on the assumption that the probability P(A|B) approximately corresponds to the confidence degree of the property between A and B. We also try to estimate the future amount and price of a vegetable in the wholesale market, to confirm the validity of the confidence scores by the estimation accuracy, although improving the accuracy itself is out of the scope of this paper.

The rest of this paper is organized as follows. Section II describes the construction of a semantic network for the wholesale market. In Section III, we design a template of a probabilistic network based on the semantic network. In Section IV, we train the probabilistic network to obtain the confidence scores, and estimate the future handling amount and price of tomatoes in a prefecture to confirm the validity of the confidence scores. Section V discusses related work, and finally, Section VI refers to future work and concludes this paper.

II. CONSTRUCTION OF SEMANTIC NETWORK

In this section, we first introduce the data of the wholesale market and JMA, and then describe the construction of a semantic network.

A. Data sources

1) Osaka central wholesale market data: Transaction data of the Osaka central wholesale market are publicly available on the market website4. The data include the following information on a daily basis according to the market calendar5, in a CSV format: {product type, product item, product name, origin, producer} × {highest price, intermediate price, lowest price} × quantity × day-month-year. The product types include vegetables, fruits, fish, and so on. For example, a product item in vegetables is asparagus, and the product names in asparagus are green asparagus and white asparagus. There are over 500 product items, and the day-month-year starts from 01-01-2005. Example data for one product on one day, without a producer name, can be seen in a table6 (in Japanese).

4 www.shijou.city.osaka.jp/sikyomap/sikyo
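As an illustration of this data structure, the following is a minimal sketch of reading one daily CSV file; the column order, file name, and character encoding are assumptions for illustration, not the market's actual layout.

import csv

# Hypothetical column layout based on the fields described above;
# the actual column order of the market CSV files may differ.
COLUMNS = ["product_type", "product_item", "product_name", "origin",
           "producer", "highest_price", "intermediate_price",
           "lowest_price", "quantity", "date"]

def read_daily_rows(path):
    # Yield one dict per transaction row of a daily CSV file.
    with open(path, encoding="shift_jis") as f:  # encoding is an assumption
        for row in csv.reader(f):
            record = dict(zip(COLUMNS, row))
            record["quantity"] = float(record["quantity"])  # assumed numeric string
            yield record

# Usage with a hypothetical file name:
# for r in read_daily_rows("20150126-tomato.csv"):
#     print(r["origin"], r["intermediate_price"], r["quantity"])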

The data size is about 100 MB per year. Figure 1 illustrates changes in the handling amount of tomatoes in three prefectures, which are part of the data used in the following experiment.

2) Meteorological data: The meteorological data include the average temperature, the highest temperature, the lowest temperature, the amount of rainfall, humidity, wind speed, sunlight hours, solar radiation, etc. on a daily basis at 850 locations in Japan from 1980 to 2015, provided by JMA7. These are also open data.

B. Semantic network construction

As the domain knowledge for constructing a semantic network, we referred to "Growing conditions and price prospects of vegetables"8, the monthly reports of the Ministry of Agriculture, Forestry and Fisheries in Japan. We manually extracted several classes that represent the domain, mainly based on their appearance frequencies in the reports, and first aligned the classes with is-a (subClassOf) and part-of properties. We then placed other properties as the estimated relations between two classes, and finally added instances under the classes. As a policy for constructing the semantic network, we connected the classes with properties as much as possible, since properties with low confidence can be identified and removed in the following process.

An overview of the semantic network is shown in Fig. 2. In addition to the handling amount (HandlingAmount) and the price (UnitPrice), ProductionRegion and WeatherData are represented as classes in the figure. ProductionRegion has a hasPrice relation (property) with UnitPrice. Also, ProductionRegion has a hasAmount relation with HandlingAmount, and hasTemp, hasHumid, hasWind relations, etc. with the Temp, Humid, Wind classes, etc. in WeatherData. The figure also represents the effect (hasGrowth) of past climate conditions on the handling amount, and the effect (buyingBehavior) of the present climate conditions on the price of the day. There is also the correlation (hasDependency) between the handling amount and the price at a prefecture, and there are hasBalance relations among prefectures due to the taste and the brand image of each prefecture. Furthermore, the handling amount and price are compared with those of the same month of the previous year, the previous month, and the previous day. Finally, HandlingAmount, UnitPrice, and each class in WeatherData have time-series values represented with rdf:Seq (a small set of illustrative triples is sketched after Fig. 2 below).

III. CONSTRUCTION OF PROBABILISTIC NETWORK

Next, we designed a template of a probabilistic network that corresponds to the above semantic network.

5 www.city.osaka.lg.jp/shijo/page/0000006696.html
6 www.shijou.city.osaka.jp/data/webroot/nippou/hinmoku/201501/20150126-11-1-31900000.csv

7 agrienv.dc.affrc.go.jp
8 www.maff.go.jp/j/press/index.html

Figure 1. Changes of handling amount of tomatoes at 3 prefectures for 3 years.

Figure 2. Semantic network in wholesale market.
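To make Fig. 2 concrete, the following is a minimal sketch of a few triples from this semantic network written with rdflib; the namespace, the instance names, and the ex:hasValue linking property are hypothetical, while the class and property names (ProductionRegion, HandlingAmount, UnitPrice, hasAmount, hasPrice, hasDependency, rdf:Seq) are taken from the figure.

from rdflib import Graph, Namespace, Literal, BNode, URIRef
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/market#")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# Classes and properties taken from Fig. 2; Kagawa is used as an example instance
g.add((EX.ProductionRegion, RDF.type, RDFS.Class))
g.add((EX.Kagawa, RDF.type, EX.ProductionRegion))
g.add((EX.Kagawa, EX.hasAmount, EX.KagawaTomatoAmount))
g.add((EX.KagawaTomatoAmount, RDF.type, EX.HandlingAmount))
g.add((EX.Kagawa, EX.hasPrice, EX.KagawaTomatoPrice))
g.add((EX.KagawaTomatoPrice, RDF.type, EX.UnitPrice))
g.add((EX.KagawaTomatoAmount, EX.hasDependency, EX.KagawaTomatoPrice))

# Time-series values attached to the HandlingAmount instance with rdf:Seq
seq = BNode()
g.add((EX.KagawaTomatoAmount, EX.hasValue, seq))  # linking property name is an assumption
g.add((seq, RDF.type, RDF.Seq))
for i, amount in enumerate([251, 176, 263], start=1):  # example daily amounts
    g.add((seq, URIRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#_%d" % i), Literal(amount)))

print(g.serialize(format="turtle"))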

A. Introduction of Markov Logic Networks

Probabilistic network approaches, such as the Hidden Markov Model (HMM), Bayesian networks, and Conditional Random Fields (CRF), have been actively studied to model complicated and uncertain phenomena. However, the phenomena to be modeled occasionally include hard rules, such as societal policies and economic constraints, which such models cannot represent naturally. Thus, a Markov Logic Network (MLN) [7] is considered to be suitable for such modeling, since an MLN can describe the hard rules as first-order logic in a probabilistic network. For this reason, in [8] we introduced MLN to estimate the causal relationship between multiple time-series data and observed events, such as daily climate conditions and the harvesting period of a vegetable.

From a logic perspective, on the other hand, an MLN can be regarded as a set of first-order logic formulas with weights. Logical expressions are described with four kinds of symbols: constants, variables, functions, and predicates. Constants represent individuals or instances, variables range over the individuals or instances in a domain, functions represent mappings from tuples of objects to objects, and predicates represent relations among objects or attributes of objects.

Figure 3. Example logic formulas and a corresponding network.

A set of first-order formulas can be seen as a set of hard constraints on the set of possible worlds; the solution in the real world, however, often lies outside the worlds that satisfy all of them. Thus, an MLN associates a weight with each formula, which reflects how strong the constraint is, and then enables the calculation of conditional probabilities, mainly by a sampling technique over the grounded network. Fig. 3 shows an example of a Markov network. In more detail, a Markov network, also known as a Markov Random Field (MRF), is a model for the joint distribution of a set of variables, and an MLN is composed of a set of pairs (F_i, w_i), where F_i is a formula in first-order logic and w_i is a real number. The probability distribution over a possible world x specified by the grounded MLN is given by:

P(X = x) = \frac{1}{Z} \exp\left( \sum_i w_i n_i(x) \right)    (1)

where n_i(x) is the number of true groundings of F_i in x, and Z is a normalization term. The exponent is the sum of the weights of the satisfied ground clauses, and thus P(X = x) can be optimized by maximizing this sum. It is generally laborious to construct a Markov network. However, constructing a Markov network from a Markov 'Logic' Network is straightforward, since the first-order predicate logic serves as an initial template of the Markov network. Given a set of formulas, their weights can be learned by either maximizing the joint likelihood of all predicates or maximizing the conditional likelihood of the query predicates [9], [10]. Inference in Markov networks involves computing probabilities using Markov chain Monte Carlo (MCMC) or Belief Propagation (BP) [11], [12], and Maximum a posteriori (MAP) inference [13]; MAP inference can be carried out efficiently using a weighted satisfiability solver such as MaxWalkSAT [14]. See [7] for more details.
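To make Eq. (1) concrete, the following is a minimal sketch (not the actual Tuffy implementation) that computes P(X = x) for a toy grounded MLN by enumerating all possible worlds; the two formulas and their weights are invented for illustration.

import itertools
import math

# Toy grounded MLN over two ground atoms A and B.
# n_i(x) counts how many groundings of formula i are true in world x;
# here each formula has a single grounding, so n_i(x) is 0 or 1.
formulas = [
    lambda w: (not w["A"]) or w["B"],  # A => B, weight 1.5 (invented)
    lambda w: w["A"],                  # A,      weight 0.8 (invented)
]
weights = [1.5, 0.8]

def n(world):
    return [1 if f(world) else 0 for f in formulas]

# Enumerate all possible worlds to obtain the normalization term Z.
worlds = [dict(zip(["A", "B"], vals))
          for vals in itertools.product([False, True], repeat=2)]
score = {tuple(w.values()): math.exp(sum(wi * ni for wi, ni in zip(weights, n(w))))
         for w in worlds}
Z = sum(score.values())

for w in worlds:
    print(w, score[tuple(w.values())] / Z)  # P(X = x) from Eq. (1)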

Figure 4. Template of probabilistic network in wholesale market.

B. Construction of a template of probabilistic networks

In an MLN, the nodes are the (grounded) predicates and the edges represent the probabilities of implications. Therefore, we represented properties of the semantic network as facts (= predicates, nodes in the MLN) or rules (= implications, edges in the MLN), depending on whether a property has a probability. For example, each location surely has its own handling amount and price, and thus hasAmount and hasPrice are facts without probabilities. However, the correlation between a handling amount and a price is not a fact, but a presumption with a probability; thus, hasDependency is represented by a rule. Figure 4 illustrates the probabilistic network template designed based on the semantic network. As described above, an MLN can have hard rules, but we did not manually add hard rules to the network template, since we evaluate the probabilities on the network by the estimation accuracy. Listing 1 shows the logical expressions corresponding to Fig. 4.

// Predicate Declarations
// Handling amount and price
hasAmount(loc, year, month, day, amt)
hasPrice(loc, year, month, day, prc)
// Weather data (per day)
hasAvgTemp(loc, year, month, day, atmp)
hasHighTemp(loc, year, month, day, htmp)
hasLowTemp(loc, year, month, day, ltmp)
hasHumid(loc, year, month, day, hmd)
hasWind(loc, year, month, day, wd)
hasSunTime(loc, year, month, day, stime)
hasSunAmt(loc, year, month, day, samt)
// Weather data (integration per year)
hasSumOfTemp(loc, year, month, day, satmp)
hasSumOfSunTime(loc, year, month, day, sstime)
hasSumOfSunAmt(loc, year, month, day, ssamt)

// Rules (the antecedent must be a conjunction)
// hasGrowth (pm: previous month, pd: previous day)
hasSumOfTemp(l, y, pm, d, sta), hasSumOfTemp(l, y, m, d, stb),
  pm = m-1, m = 7, stb-sta > 800 => hasAmount(l, y, m, d, a)
...
hasSumOfSunTime(l, y, pm, d, ssta), hasSumOfSunTime(l, y, m, d, sstb),
  pm = m-1, m = 7, sstb-ssta > 150 => hasAmount(l, y, m, d, a)
...
hasSumOfSunAmt(l, y, 6, pm, ssaa), hasSumOfSunAmt(l, y, 7, m, ssab),
  pm = m-1, m = 7, ssab-ssaa > 6000 => hasAmount(l, y, m, d, a)
...
// buyingBehaviour
hasAvgTemp(al, y, m, d, at), hasHighTemp(al, y, m, d, ht),
  hasLowTemp(al, y, m, d, lt), hasHumid(al, y, m, d, h),
  hasWind(al, y, m, d, w), m = 7 => hasPrice(l, y, m, d, p)
// previousYear, previousMonth, previousDay (py: previous year)
hasPrice(l, py, m, d, pp), py = y-1, m = 7 => hasPrice(l, y, m, d, p)
hasAmount(l, py, m, d, pa), py = y-1, m = 7 => hasAmount(l, y, m, d, a)
...
// hasDependency
hasAmount(l, y, m, d, a), m = 7 => hasPrice(l, y, m, d, p)
// hasBalance (al: another location)
hasPrice(al, y, m, d, ap), m = 7 => hasPrice(l, y, m, d, p)
hasAmount(al, y, m, d, aa), m = 7 => hasAmount(l, y, m, d, a)

// Facts in Kagawa, Tokushima, Ehime prefectures
// HandlingAmount
hasAmount(Kagawa, 2012, 4, 4, 251)
hasAmount(Tokushima, 2012, 4, 5, 176)
hasAmount(Ehime, 2012, 4, 6, 176)
...
// Price
hasPrice(Kagawa, 2014, 4, 23, 610)
hasPrice(Tokushima, 2014, 4, 24, 610)
hasPrice(Ehime, 2014, 4, 25, 600)
...
// WeatherData
hasAvgTemp(Kagawa, 2012, 4, 1, 12)
hasHighTemp(Kagawa, 2012, 4, 1, 17)
hasLowTemp(Kagawa, 2012, 4, 1, 8)
hasHumid(Kagawa, 2012, 4, 1, 59)
hasWind(Kagawa, 2012, 4, 1, 1)
hasSunTime(Kagawa, 2012, 4, 1, 5)
hasSunAmt(Kagawa, 2012, 4, 1, 108)
...

Listing 1. Part of logic descriptions.

Comments that precede rules and facts represent properties in the semantic network. For example, the hasGrowth rule in the listing means that the handling amount in July at a prefecture is affected by the integrated temperature (hasSumOfTemp), sunlight hours (hasSumOfSunTime), and solar radiation (hasSumOfSunAmt) of the previous month; a small sketch of evaluating one such antecedent on raw data follows.
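Purely as an illustration (this is not how Tuffy grounds the rule), the sketch below checks the hasGrowth antecedent on cumulative temperature, i.e. whether the difference of the integrated temperatures between July and the previous month exceeds the threshold of 800 used in Listing 1; the daily temperature values are invented.

# Invented daily mean temperatures (deg C); the real values come from the JMA data.
daily_temp = {6: [22.0] * 30, 7: [27.5] * 31}

# hasSumOfTemp(l, y, m, d, st): temperature integrated up to month m.
cumulative, running = {}, 0.0
for month in sorted(daily_temp):
    running += sum(daily_temp[month])
    cumulative[month] = running

# Antecedent of the hasGrowth rule for m = 7, pm = 6: stb - sta > 800
sta, stb = cumulative[6], cumulative[7]
print("hasGrowth antecedent satisfied:", stb - sta > 800)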

The hasBalance rule means that the price at Kagawa prefecture is influenced by the price of another prefecture because of the taste and the brand image of each prefecture. As the facts, we prepared the handling amounts, prices, and weather data at Kagawa, Ehime, and Tokushima prefectures.

IV. CONSTRUCTION OF SEMANTIC NETWORK WITH CONFIDENCE SCORES

In this section, we try to associate the semantic network with confidence scores, which are calculated on the probabilistic network, and then confirm the validity of the confidence scores.

A. Association of confidence scores

We trained the above probabilistic network with the handling amounts and prices of tomatoes ranging from 2010 to 2014 at the three prefectures, and with the weather data for the same period at the same locations. As an implementation of MLN, we used Tuffy9 ver. 0.4. The learning method is discriminative learning, and the inference methods provide MAP estimates and marginal probabilities. Tuffy has advantages such as a fast grounding method using databases (DBs) and multi-thread computation by partitioning the network. We ran Tuffy on a server with 64 VCPUs, 128 GB of memory, and a PostgreSQL DB. Kagawa, Ehime, and Tokushima are closely related prefectures on the same island in Japan. We manually confirmed that the data on these prefectures are virtually complete and without abnormal values. In order to limit the scope to tomatoes grown outdoors, the data for the summer, that is, from April to July, were adopted for training.

After the training, we associated the probabilities with the corresponding properties in the semantic network, by the reverse of the procedure in Section III-B. Figure 5 shows the semantic network with the probabilities attached to its properties. The probability in MLN means the conditional probability of A given B, usually written as P(A|B), based on the Markov property, where the events A and B correspond to the two nodes of a semantic link. Here, the Markov property means that future states depend only upon the present state, not on the sequence of events that preceded it. In the figure, the probabilities are the averages of absolute values, and we believe that the probability P(A|B) approximately corresponds to the confidence degree of the property between A and B supported by the actual data, although a fine-grained value is considered not to be very meaningful. For example, the probability that a vegetable has a price (UnitPrice) under a weather condition (WeatherData) is considered to correspond to the confidence degree of the buyingBehavior property, which represents the correlation between the weather condition and the price. As a result, we successfully constructed a semantic network with confidence scores from a large amount of raw data.

9 i.stanford.edu/hazy/hazy/tuffy
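As a minimal illustration of this association step (not Tuffy's actual output format, and with invented marginal values), the following sketch averages the absolute marginal probabilities of the groundings belonging to each rule and attaches the result to the corresponding property of the semantic network as its confidence score.

# Marginal probabilities of ground query atoms, grouped by the rule
# (i.e., the semantic-network property) they were inferred for.
# All numbers are invented for illustration.
marginals_by_property = {
    "buyingBehavior":      [0.82, 0.76, 0.91],
    "hasDependency":       [0.55, 0.61],
    "previousYear":        [0.88, 0.84],
    "hasBalance (amount)": [0.02, 0.05],
}

# Confidence score of a property = average of absolute marginal values,
# as described in Section IV-A.
confidence = {prop: sum(abs(p) for p in ps) / len(ps)
              for prop, ps in marginals_by_property.items()}

for prop, score in sorted(confidence.items(), key=lambda kv: -kv[1]):
    print("%s: %.2f" % (prop, score))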

Figure 5. Semantic network with probabilities in wholesale market.

In this figure, the properties related to the price, that is, buyingBehavior and hasBalance, and the comparison with the same month of the previous year, that is, previousYear, have the highest confidence. In contrast, the hasBalance property for the handling amount is hardly supported by the actual data. The proposed method thus identifies knowledge that was constructed by human volunteers but is not supported by the actual data, such as the causal relationships or correlations of entities in this experiment. In addition to removing such unreliable knowledge with low confidence, semantic networks associated with confidence scores, that is, weighted labeled graphs, are advantageous for ranking recommendation results in the case of ambiguous search, and for the optimization of triple stores and the acceleration of SPARQL search. In the future, probabilistic reasoning on semantic networks may also become possible. However, this experiment is a case study for one perspective, namely the handling amount and price, on a dataset of the Osaka central wholesale market; we need to conduct more experiments in several different domains and on other datasets.

B. Validity of confidence scores

Finally, we confirmed the validity of the confidence scores by the estimation accuracy of the future handling amount and price. Specifically, we estimated the handling amount and price of July 2014 from the training data ranging from April to July in 2012 and 2013, by calculating marginal probabilities and taking the value with the highest probability for each day. In the same way, we estimated the handling amount and price of July 2013 from the training data ranging from April to July in 2011 and 2012, and the handling amount and price of July 2012 from the training data ranging from April to July in 2010 and 2011, as 3-fold cross-validation. The accuracy of the estimation was calculated by comparison with the actual data of July 2012, 2013, and 2014. As a result, the average difference over the days in July was 97.8 JPY (17.9% of the actual price) for the price, and 369 kg (28.5% of the actual amount) for the handling amount, after excluding abnormal values.
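The following is a minimal sketch of how such average differences can be computed from predicted and actual daily values; the numbers are invented and do not reproduce the figures above.

# Invented predicted and actual prices (JPY) for a few days in July.
predicted_price = [560, 540, 610, 580]
actual_price    = [520, 600, 570, 555]

diffs = [abs(p - a) for p, a in zip(predicted_price, actual_price)]
mean_diff = sum(diffs) / len(diffs)
mean_actual = sum(actual_price) / len(actual_price)

print("average difference: %.1f JPY (%.1f%% of the actual price)"
      % (mean_diff, 100 * mean_diff / mean_actual))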

Although there is room for further improvement, we confirmed that the accuracy was satisfactory. A person in charge of general affairs at the Osaka central wholesale market stated that there are cases in which a rumor about a poor crop of a specific vegetable in a cool summer or a warm winter causes a price surge for that vegetable; this kind of estimation would make it possible to verify the truth of such rumors.

V. RELATED WORK

First, in terms of the analysis of time-series data such as wholesale market data, the survey in [15] shows that there are two approaches to clustering and/or classification based on sequential data patterns: a statistical approach, for example, creating autoregressive models, and an AI approach, for example, measuring distances among patterns using unsupervised learning. However, the purpose of this paper is not the clustering and/or classification of time-series data, but the estimation of the causal relationship between time-series data and the results. Also, in terms of the adoption of MLN, the application of logic programming such as event calculus and answer set programming is problematic for ensuring conformity, because of the uncertainty in the natural world. Furthermore, graphical models, such as a Hidden Markov Model, a Bayesian network, or a neural network, have the problems that known rules cannot be encoded explicitly and that the causal relationship is hidden in a black box (no interpretability or explainability).

Therefore, we applied MLN to time-series data analysis in order to introduce probability into logic programming and rules into the graphical model. Next, in terms of the application of MLN to time-series data analysis, existing approaches include a method for event annotation in movie data [16], where low-level propositions are created by image analysis and the situation in the image is estimated by MLN. In [17], event calculus is applied to MLN, also for event annotation, and clause reduction is proposed to save calculation costs. In [18], facts with temporal information are extracted from unstructured text and sorted in chronological order using MLN. None of these approaches aims to estimate the causal relationship between time-series data and the results. In terms of agricultural data analysis, the National Agriculture and Food Research Organization (NARO) in Japan predicts the growth of outdoor-grown vegetables affected by weather conditions, and then tries to estimate their shipping time and shipping amount [19]. NARO created plant growth models based on temperature, solar radiation, etc., calculates the weight increase of leaves and roots by field, and then sums the results over a region using MS Excel macros. A system similar to that of NARO has been monetized as a big data business by NTT Data [20], an IT company in Japan. In addition, similar approaches have been applied to the control of invasive species or to encourage species in conservation areas [21]. However, none of these approaches uses a machine learning technique or a probabilistic approach. Finally, in terms of the combination of semantic networks and probabilities, there are many studies on probabilistic data integration, namely, record linkage in DB research and entity linking in Linked Open Data (LOD). However, to the best of our knowledge, there is no research on adding probabilities to properties in semantic networks.

VI. CONCLUSION AND FUTURE WORK

In this paper, we proposed a method for associating semantic networks with confidence scores based on a large amount of raw data. Based on transaction data of the Osaka central wholesale market, we constructed a semantic network, and then associated it with confidence scores calculated on the corresponding probabilistic network. Finally, we confirmed the validity of the confidence scores by the estimation accuracy of the future handling amount and price. In future work, we intend to make this technique useful for a Web-scale probabilistic KB, query-answering, and inference system, such as the one also proposed by Google [5]. If we add probabilities to several LOD datasets based on a large amount of social media data, environmental sensor data, and web information, these probabilistic LOD datasets will be beneficial for non-commercial domains.

REFERENCES

[1] F. Suchanek, G. Kasneci, and G. Weikum: "YAGO - A Core of Semantic Knowledge," Proceedings of the 16th International Conference on World Wide Web (WWW 07), pp. 697-706, 2007.
[2] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. H. Jr., and T. Mitchell: "Toward an architecture for never-ending language learning," Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI 2010), 2010.
[3] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives: "DBpedia: A nucleus for a web of open data," Proceedings of the 6th International Semantic Web Conference (ISWC 2007), pp. 722-735, 2007.
[4] B. Suh, G. Convertino, E. H. Chi, and P. Pirolli: "The singularity is not near: slowing growth of Wikipedia," Proceedings of the 5th International Symposium on Wikis and Open Collaboration (WikiSym 09), pp. 8:1-8:10, 2009.
[5] X. Dong, E. Gabrilovich, G. Heitz, W. Horn, N. Lao, K. Murphy, T. Strohmann, S. Sun, and W. Zhang: "Knowledge Vault: A Web-scale Approach to Probabilistic Knowledge Fusion," Proceedings of the 20th ACM International Conference on Knowledge Discovery and Data Mining (KDD), pp. 601-610, 2014.
[6] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor: "Freebase: a collaboratively created graph database for structuring human knowledge," Proceedings of ACM SIGMOD, pp. 1247-1250, 2008.
[7] M. Richardson and P. Domingos: "Markov Logic Networks," Machine Learning, Vol. 62, pp. 107-136, 2006.
[8] T. Kawamura and A. Ohsuga: "Prediction of Plant Growth using Markov Logic Network," Proceedings of the International Symposium on Green Computing and Information Technology (Green IT-14), 2014.
[9] P. Singla and P. Domingos: "Discriminative training of Markov logic networks," Proceedings of AAAI, Vol. 5, pp. 868-873, 2005.
[10] H. Poon and P. Domingos: "Sound and efficient inference with probabilistic and deterministic dependencies," Proceedings of AAAI, Vol. 6, pp. 458-463, 2006.
[11] W. R. Gilks, S. Richardson, and D. J. Spiegelhalter: "Introducing Markov chain Monte Carlo," Markov Chain Monte Carlo in Practice, pp. 1-19, 1996.
[12] D. Roth: "On the hardness of approximate reasoning," Artificial Intelligence, Vol. 82, No. 1, pp. 273-302, 1996.
[13] S. Della Pietra, V. Della Pietra, and J. Lafferty: "Inducing features of random fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 4, pp. 380-393, 1997.

[14] D. Du, J. Gu, P. M. Pardalos et al.: "Satisfiability problem: theory and applications," Proceedings of a DIMACS Workshop, Vol. 35, 1996.
[15] E. Keogh and S. Kasetty: "On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration," Journal of Data Mining and Knowledge Discovery, Vol. 7, No. 4, pp. 349-371, 2003.
[16] V. I. Morariu and L. S. Davis: "Multi-agent event recognition in structured scenarios," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3289-3296, 2011.
[17] A. Skarlatidis, G. Paliouras, G. A. Vouros, and A. Artikis: "Probabilistic Event Calculus based on Markov Logic Networks," Proceedings of RuleML America, pp. 155-170, 2011.
[18] X. Ling and D. S. Weld: "Temporal Information Extraction," Proceedings of the 25th National Conference on Artificial Intelligence (AAAI), pp. 1385-1390, 2010.
[19] K. Sugawara: "Estimation of weekly shipping amount based on growth prediction of outdoor-grown vegetables," www.maff.go.jp/j/shokusan/sanki/pdf/6noke.pdf, 2012.
[20] JSOL and NTT Data: "Crop estimation system for agricultural producers," www.jsol.co.jp/release/2013/131211.html, 2013.
[21] J. M. Bullock, R. F. Pywell, and S. J. Coulson-Phillips: "Managing Plant Population Spread: Prediction and Analysis using a Simple Model," Ecological Applications, Vol. 18, No. 4, pp. 945-953, 2008.
