
Risk Management with Hard-Soft Data Fusion in Maritime Domain Awareness Rafael Falcon, Rami Abielmona, Sean Billings, Alex Plachkov and Hussein Abbass

Abstract—Enhanced situational awareness is integral to risk management and response evaluation. Dynamic systems that incorporate both hard and soft data sources allow for comprehensive situational frameworks which can supplement physical models with conceptual notions of risk. The processing of widely available semi-structured textual data sources can produce soft information that is readily consumable by such a framework. In this paper, we augment the situational awareness capabilities of a recently proposed risk management framework (RMF) with the incorporation of soft data. We illustrate the beneficial role of the hard-soft data fusion in the characterization and evaluation of potential vessels in distress within Maritime Domain Awareness (MDA) scenarios. Risk features pertaining to maritime vessels are defined a priori and then quantified in real time using both hard (e.g., Automatic Identification System, Douglas Sea Scale) as well as soft (e.g., historical records of worldwide maritime incidents) data sources. A risk-aware metric to quantify the effectiveness of the hard-soft fusion process is also proposed. Though illustrated with MDA scenarios, the proposed hard-soft fusion methodology within the RMF can be readily applied to other domains.

R. Falcon and R. Abielmona are with the Research & Engineering Division, Larus Technologies Corporation, Ottawa ON, K1P 5V5 Canada ({rafael.falcon,rami.abielmona}@larus.com). S. Billings and A. Plachkov are with the School of Electrical Engineering & Computer Science, University of Ottawa, Ottawa ON, K1N 6N5 Canada ({sbill042, aplac099}@uottawa.ca). H. A. Abbass is with the University of New South Wales at the Australian Defence Force Academy, Canberra, Australia ACT 2600 ([email protected]).

© 2014 IEEE 978-1-4799-5431-5/14/$31.00

I. INTRODUCTION

Maritime Domain Awareness (MDA) can be understood as the situational knowledge of physical and environmental conditions that exist within or influence a maritime region. The intended scope of this awareness includes all behaviors that could, directly or indirectly, affect the security of the region, its economic activity or the local environment [1]. Accurate situational awareness requires the integration of multiple types of information that are extracted from different data sources. High-Level Information Fusion (HLIF), defined as Level 2 Fusion and above in the Joint Directors of Laboratories (JDL)/Data Fusion Information Group (DFIG) model [2], [3], has proven to be a useful tool for the characterization of vital MDA processes and behaviours such as anomaly detection, trajectory prediction, intent assessment, and threat assessment. HLIF techniques are well positioned to cope with the overwhelming volume of data that flows from a variety of sources to the maritime operations center at increasingly high velocity and with sometimes questionable veracity.

Maritime operators often rely on hard data sources (i.e., structured, quantitative, more objective, usually sensed data) generated by vessel traffic in order to identify suspicious events at sea. However, a wealth of relevant information can be extracted from soft data sources (i.e., unstructured/semi-structured, more subjective, qualitative data, such as textual reports on vessel sightings or marine incidents). As demonstrated in [4], Natural Language Processing (NLP) methods can draw meaningful information that is representative of human intuition, which is often not captured by hard data sources. These pieces of soft information can then supplement the existing hard information in order to provide a more comprehensive situational awareness.

A risk-aware view of the system, its units and the surrounding environment will help identify the existing vulnerabilities in a proactive rather than reactive fashion, which will contribute to the design of effective countermeasures to mitigate the perceived threats and assess their impact on the system. With this goal in mind, the authors in [5] put forth a generic, multimodular risk management framework (RMF) architecture for real-time processing of distributed systems. The RMF is capable of: (1) extracting a parallel risk stream from the regular data flow periodically reported by the system units; (2) dynamically visualizing the risk landscape of each system unit; (3) assessing the local and global risk levels for all units and the system as a whole and (4) producing a limited set of promising candidate responses that can be effectuated on the system/environment.

In this paper, we augment the HLIF capabilities for situational awareness (Level 2 JDL/DFIG) of the RMF in [5] with the incorporation of soft data. We illustrate the beneficial role of the hard-soft data fusion (HSDF) in the characterization and evaluation of potential vessels in distress within MDA scenarios [6].
Risk features pertaining to maritime vessels are defined a priori and then quantified in real time using both hard (e.g., Automatic Identification System, Douglas Sea Scale) as well as soft (e.g., historical reports of worldwide maritime incidents) data sources. A geographic and descriptive representation of the incidents contained within these reports is used to characterize the regional hostility and risk of attack for vessels actively monitored by the RMF, hence enhancing the overall maritime picture presented to the operator. To the best of our knowledge, this is the first time that an RMF featuring an HSDF component is presented and then applied to the maritime world. Additionally, the effectiveness of the fusion process is quantified via a newly proposed metric. Finally, the proposed methodology is easily applicable to other domains provided the necessary customizations are in place.

A. The Role of Computational Intelligence (CI)

CI techniques permeate the entire RMF design: fuzzy logic and granular computing are used for risk feature extraction, clustering techniques for risk visualization, fuzzy inference systems for risk assessment and evolutionary optimization for response selection. The new fusion effectiveness metric proposed in Section V also employs fuzzy reasoning.

B. Paper Outline

The rest of the paper is structured as follows. Section II briefly reviews relevant works. Section III unveils the proposed HSDF scheme for risk management and Section IV illustrates its application within MDA. A risk-aware Measure of Effectiveness (MoE) is put forth in Section V. Section VI presents the empirical evaluation and Section VII concludes the paper.

II. RELATED WORK

This Section briefly reviews several relevant studies concerning maritime risk analysis, HLIF for MDA as well as existing HSDF maritime systems.

A. Maritime Risk Analysis

The ISO 31000 [7] standard for risk management defines risk as the effect of uncertainty on objectives. Two primary objectives within the maritime domain are safety and security, while uncertainty classically reflects a lack of the right information to support the commander's intent. Consequently, an important dimension of risk management in the maritime domain is to provide the set of processes and tools that support and enhance the commander's situational awareness picture (SAP). Improving the SAP can be broken down into a number of sub-problems, from improving the collection and transmission of information through the fusion of data, to the provision of intent, projection and consequence cues to support the commander's intent.

Many tools have been proposed in the literature to provide a system-level risk picture. Hidden Markov Models (HMMs) [8], [9] are common techniques in this domain; they rely on latent states to approximate the dynamics of the system-level risk picture. Technically, HMMs provide an effective method for tactical risk assessment but can be unstable in a dynamic environment when evaluating system-level risk. The interdependency from one model to another, arising from the interdependency of assets, makes the stability of the system vulnerable to changes in the environment. Practically, the cost of sustaining the integrity of large systems composed of many interconnected HMMs is high. The same issue arises when relying on other probabilistic models such as Bayesian networks [10].

Complex systems research generated another line of risk assessment methodologies for the military domain, including computational red teaming [11], [12] and adversarial modelling [13]. Practical risk-aware decision support systems have been developed by industry, such as the ATHENA Integrated Defense System by Raytheon [14], which is designed to search for suspicious behaviors in the search-and-rescue division.

Two US-based projects are also worth mentioning here. The first is Maritime Automated Super Track Enhanced Reporting (MASTER), an integrative reporting project supported by the Joint Capability Technology Demonstration (JCTD) [15] and the Comprehensive Maritime Awareness (CMA) [16] programs. The second is Predictive Analysis for Naval Deployment Activities (PANDA) [17], a case-based reasoning system that models context using ontologies and business rules.

B. HLIF for Maritime Domain Awareness

Recent efforts in performing HLIF for MDA have made use of clustering techniques to process the data sources. Clustering is a pattern classification technique used to group similar points; this method is widely employed due to its unsupervised learning nature. In 2007, Laxhammar [18] tackled anomalous vessel detection via two clustering techniques operating on Automatic Identification System (AIS) data, namely the Mixture of Gaussians (MoG) model and the Fuzzy Adaptive Resonance Theory (FuzzyART) self-organizing neural network. The data generated by a vessel was transformed into a data point belonging to one of two feature spaces: $F_1 = (Vel_x, Vel_y)$ and $F_2 = (Vel_x, Vel_y, Lat, Lon)$. Both MoG and FuzzyART were trained only on data containing normal vessel activities, and later tested on anomalous data. The MoG is more tolerant to noisy (anomalous) training data than FuzzyART. Due to their unsupervised nature (which implies no need for any domain knowledge), the two approaches are flexible enough to be employed in different domains dealing with two-dimensional motion. The limitation of the proposed solutions is that they can detect only elementary types of anomalies (e.g., a sea lane crossed by a vessel). The author discovered that the inclusion of the coordinates in the second feature space did not improve the anomaly detection capabilities.

More recently, Blasch et al. [19] explored the potential of fusing information scattered over multiple data sets containing hard (e.g., imagery) and soft (e.g., textual labels) data. The authors note that the challenges at hand are mainly due to the lack of a standardized ontological representation (i.e., no shared domain vocabulary) used by the different data collection agencies to describe the real-world objects contained within their data sets. They claim that ontological alignment allows for the successful aggregation of the information contained in the dispersed data sets and proceed by providing a maritime example in which the overall situational awareness was augmented by combining vessel traffic pattern information into one fused data set.

Finally, Shao et al. [20] successfully correlated vessel contacts from AIS, Synthetic Aperture Radar (SAR), and Ground Moving Target Indicator (GMTI) data via Fuzzy K-Nearest Neighbor classification; furthermore, the authors were able to associate the information from the three different sources by performing Fuzzy C-Means clustering. They also improved vessel trajectory prediction by enhancing the performance of the Kalman Filter in nonlinear cases with the help of an Echo State neural network.

III. RMF WITH HSDF

This Section sheds light on the extension of the RMF in [5] and [6] with HSDF for JDL/DFIG Level 2 (situational assessment) and Level 3 (impact assessment). Fig. 1 displays an augmented version of the risk feature extraction module that accounts for both hard and soft data sources. Sensor modalities like AIS, radars, video cameras, and weather gauges, as well as other unsensed yet quantitative and structured information such as oil price indicators or dynamic country demographics, are all examples of hard data sources $DS_1, \ldots, DS_k$ that can be ingested by the RMF. Tweets, blogs, webpages, or other forms of textual reports illustrate soft data sources $DS_{k+1}, \ldots, DS_n$.

To be assimilated by the RMF, the original content of these soft data sources has to be converted to a Common Textual Risk Representation (CTRR). This involves typical NLP tasks such as tokenization, named entity recognition, part-of-speech tagging, model representation and term annotation that may span both lexical and syntactic analyses. The most important step in the CTRR transformation is the elicitation of a risk lexicon (possibly encompassing multiple textual sources), i.e., a dictionary of keywords that symbolizes risk factors in the domain of interest. The lexicon-building process is often guided by domain experts. The soft risk feature extractors then use the lexicon generated by the CTRR module to identify risk spans in the soft data, instantiate them and use them as building blocks for the creation of soft risk features, whose outputs must be quantitative in order to blend seamlessly with those of the hard risk features.

Notice that in Fig. 1 the CTRR module feeds two soft risk feature extractors: one in the object (system unit) space and one in the response space. That is, the textual risk mined from the soft data sources could be used to enhance the initial characterization, provided by the hard data sources, of an object (Level 2 JDL/DFIG, situational assessment) or a candidate response (Level 3 JDL/DFIG, impact assessment) needed to mitigate a perceived threat. In both spaces, the hard risk feature extractors could choose to ingest the information granules produced by their soft counterparts. The rest of the RMF modules retain their original functionality as described in [5] and [6]. In the next section, we illustrate how these HSDF concepts (in the object space) are applied to MDA scenarios.

IV. CASE STUDY: MARITIME DOMAIN AWARENESS

This Section illustrates the application of the risk-aware HSDF framework to MDA.

A. Data Sources

The following data sources were used in this study:

• AIS [Hard] - A 5-day ExactEarth (http://www.exactearth.com) feed of 227,299 AIS contacts along the northeastern coast of Canada and USA. AIS reports include crucial information pertaining to the vessel like its unique identifier (MMSI), type, current location, intended course, speed, etc. We use the vessel's identifier, type and location in our risk-aware fusion framework. AIS contacts around the Horn of Africa were synthetically generated as we did not have access to the AIS feed of that region.

• SAR [Hard] - This feed contains 188 contacts from Canada's RADARSAT-2 satellite (http://www.asc-csa.gc.ca/eng/satellites/radarsat2) for the same area (northeast Canada/USA) and period of interest (POI). It is an active tracking modality that can be used to supplement AIS feeds, but it can also be used independently to locate vessels. Given the scarce number of contacts available, a SAR-contact-to-AIS-track association is conducted as shown in [20].

• Sea State Reports [Hard] - These describe the motion of the sea waves produced by the joint effect of wind and swell. This information is used in conjunction with the Douglas Sea Scale (https://en.wikipedia.org/wiki/Douglas_Sea_Scale) in order to compute the Sea State risk feature. The sea data for the East Coast of Canada was obtained from Fisheries and Oceans Canada (http://www.meds-sdmm.dfo-mpo.gc.ca/isdm-gdsi/waves-vagues/search-recherche/index-eng.asp). To the best of our knowledge, there is no available sea state dataset for the Horn of Africa region, so the sea scale values were randomly generated for that scenario.

• NGA WWTTS Maritime Incident Reports [Soft] - The Worldwide Threats to Shipping Reports (WWTTS) (http://msi.nga.mil/NGAPortal/MSI.portal?_nfpb=true&_pageLabel=msi_portal_page_64) is a publicly available, semi-structured textual source maintained by the National Geospatial-Intelligence Agency (NGA). Relevant maritime crime and piracy incidents around the globe are compiled weekly and freely disseminated. The textual description of each incident includes, among other things, the kind of threat, the vessel in question and the location where it occurred. Structured data is extracted from this database through NLP and then used to compute the regional hostility metric and augment the degree of distress of a vessel [6]. Our POI goes from 2011/01 to 2013/01, totalling 107 reports describing 2,200 incidents (732 were duplicates and hence removed).

• GeoNames geographical database [Soft] - This source (http://www.geonames.org/) is used to extract textual repositories of city/location names and their respective geographical coordinates to complement NGA WWTTS incident reports where a specific latitude and longitude is not provided.

B. MDA Scenario

An MDA scenario S consists of a predefined maritime area of interest (AOI) and a group of vessels transiting through those waters which are tracked via either passive (like AIS) or active (e.g., SAR) sensing modalities. As a result, periodical data about each vessel X ∈ S (e.g., position, type, heading or speed) arrives at the maritime control center, where the maritime operators continually monitor their statuses. This is the external environment module in Fig. 1. As mentioned in Section I, a risk stream is extracted from this original data flow through the RMF's risk feature extraction module.

Like in [6], maritime vessels are characterized in terms of four risk features: (i) collision factor indicates the likelihood of the vessel colliding with another object in the sea; (ii) degree of distress subsumes different distress factors such as the danger to human lives aboard the vessel, the environmental impact of a catastrophe caused by the vessel and the risk of running out of fuel and hence being stranded; (iii) sea state models the distress posed by the prevalent sea conditions (calm, moderate, rough, etc.) and is drawn from the Douglas Sea Scale based on the vessel's reported location and (iv) regional hostility captures the danger of the region where the vessel is transiting.

Fig. 1. The RMF's architectural blueprint showcasing hard/soft risk extractors in both the object and response spaces. Gray boxes indicate external RMF elements. Green boxes indicate Level 2 RMF capabilities and yellow boxes indicate Level 3 RMF capabilities.

TABLE I
REPORTED MARITIME INCIDENTS AND THEIR SEVERITY

| Incident Category | Incident Keywords | Severity Value |
|---|---|---|
| Bomb Threat | bombed | 1.0 |
| Terrorism | terrorist, terrorism | 0.9 |
| Hostage Scenario | hijacked, abducted, kidnapped, hostage, kidnapping | 0.8 |
| Crew Damage | fired, tied up with rope | 0.7 |
| Theft | robbed, attacked, robbers, criminals, robbery, theft, stole equipment | 0.6 |
| Invasion | boarded, clashed with, boarded the vessel, knives, invaded, trespasser | 0.5 |
| Near Invasion | attempted boarding, forced, crime, threat, surrender | 0.4 |
| Threatened | chased, threatened, threat, suspect, escape, blocking, risk | 0.3 |
| Approach | suspicious approach, suspiciously approached, approached | 0.2 |
| Crew Error | crashed, negligence | 0.1 |
| Unknown | other risks | 0.05 |
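Table I can be read as a small risk lexicon: each incident category owns a set of keywords and a severity value. The Python sketch below shows one way such a lexicon could be used to tag a free-text incident description. It is only an illustration under stated assumptions: the function name `classify_incident`, the whole-word regular-expression matching, and the policy of returning the most severe matching category are all hypothetical choices, not the NLP pipeline used by the authors.

```python
import re

# Lexicon transcribed from Table I, ordered from most to least severe.
# The tie-breaking policy (most severe category wins) is an assumption.
TABLE_I = [
    ("Bomb Threat", 1.0, ["bombed"]),
    ("Terrorism", 0.9, ["terrorist", "terrorism"]),
    ("Hostage Scenario", 0.8,
     ["hijacked", "abducted", "kidnapped", "hostage", "kidnapping"]),
    ("Crew Damage", 0.7, ["fired", "tied up with rope"]),
    ("Theft", 0.6, ["robbed", "attacked", "robbers", "criminals",
                    "robbery", "theft", "stole equipment"]),
    ("Invasion", 0.5, ["boarded", "clashed with", "boarded the vessel",
                       "knives", "invaded", "trespasser"]),
    ("Near Invasion", 0.4,
     ["attempted boarding", "forced", "crime", "threat", "surrender"]),
    ("Threatened", 0.3, ["chased", "threatened", "threat", "suspect",
                         "escape", "blocking", "risk"]),
    ("Approach", 0.2,
     ["suspicious approach", "suspiciously approached", "approached"]),
    ("Crew Error", 0.1, ["crashed", "negligence"]),
]
UNKNOWN = ("Unknown", 0.05)  # fallback row of Table I ("other risks")


def classify_incident(text):
    """Return (category, severity) for the first category with a keyword hit."""
    lowered = text.lower()
    for category, severity, keywords in TABLE_I:
        for keyword in keywords:
            # Whole-word match so that "threat" does not fire on "threatened".
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                return category, severity
    return UNKNOWN


print(classify_incident("Pirates hijacked the tanker and robbed the crew"))
```

Because the categories are scanned in decreasing severity, a report that mentions both a hijacking and a robbery is tagged with the hostage category; a different policy (for example, keeping every matching category) would be equally easy to express.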

C. Hard-Soft Risk Feature Extraction

In the MDA scenario, a vessel X periodically reports its position (latitude, longitude) and its vessel type as part of the AIS message. The position field is used to calculate the n closest maritime incidents to X's current location while the vessel type is employed to determine the similarity of X to the other vessel types reported in the maritime incidents repository. These two fields contribute to the calculation of the Regional Hostility Metric for a vessel. Furthermore, the vessel type is used to define the Risk of Attack of any vessel, which augments the Degree of Distress risk feature.

1) Regional Hostility Metric: This risk feature, denoted by $\mu(X)$, quantifies the perils associated with the geographical region through which vessel X navigates based on historical data in the form of maritime incident reports. Equation (1) formalizes its modeling.

$$\mu(X) = \alpha \, MIP(X) + \beta \, MIS(X) + \gamma \, VPI(X) \qquad (1)$$

where $\alpha$ weighs the significance of X's proximity to known maritime incidents, $\beta$ weighs the significance of the severity of the incidents reported in the vicinity of X and $\gamma$ denotes the significance of the pertinence of those reports to X. A suggested weighting is $\alpha = 0.4$, $\beta = 0.3$ and $\gamma = 0.3$; however, these weights could vary according to the maritime operator's preferences.

The mean incident proximity $MIP(X)$ quantifies the risk induced by the proximity of the vessel X to reported maritime incidents. It is defined as an average over the risk values associated with the proximity of the n closest maritime incidents:

$$MIP(X) = \frac{1}{n} \sum_{i=1}^{n} \theta(x_i) \qquad (2)$$

where $x_i = dist(X.location, v_i.location)$ is the geospatial distance (in nautical miles, nm) from the vessel X to the i-th closest maritime incident $v_i$ and $\theta(x_i)$ is termed the incident proximity risk and defined as a fuzzy set over the domain of distance values. Depending on the vessel's speed, the trapezoidal membership function that models this fuzzy set will adopt the following parametric configurations:

• A = 0; B = 0; C = 5; D = 10 for slow vessels;
• A = 0; B = 0; C = 21.6; D = 43.2 for medium-speed vessels; and
• A = 0; B = 0; C = 40; D = 80 for fast vessels.

The mean incident severity $MIS(X)$ quantifies the risk induced by the severity of the reported maritime incidents. It is defined as an average over the risk values associated with the severity of the n closest maritime incidents:

$$MIS(X) = \frac{1}{n} \sum_{i=1}^{n} \psi(y_i) \qquad (3)$$

where $y_i = v_i.incidentType$ is the type of incident in the i-th closest report and $\psi(y_i)$ is a mapping, termed the incident severity risk, from an incident type to a numerical value between 0 and 1, as shown in Table I.

The victim pertinence index $VPI(X)$ quantifies how pertinent the n closest maritime incidents are for this vessel. It is defined as the maximum similarity between the vessel's type and those involved in the n closest maritime incidents:

$$VPI(X) = \max_i \{\delta(X.vesselType, v_i.vesselType)\} \qquad (4)$$

where $\delta(X.vesselType, v_i.vesselType)$ is computed as follows:

$$\delta(X.vesselType, v_i.vesselType) = \begin{cases} 1 & \text{if } X.vesselType \text{ is the same as } v_i.vesselType \\ 0.5 & \text{if } X.vesselType \text{ is similar to } v_i.vesselType \\ 0 & \text{if } X.vesselType \text{ is unrelated to } v_i.vesselType \end{cases}$$

Table II defines the similarity matrix for vessel categories while Table III maps a vessel type to its category.

TABLE II
SIMILARITY MATRIX FOR VESSEL CATEGORIES

| | Cargo Transport | Tanker/Industrial | Warship | Small Military Vessel | Small Transport/Utility |
|---|---|---|---|---|---|
| Cargo Transport | 1 | 0.5 | 0 | 0 | 0.5 |
| Tanker/Industrial | 0.5 | 1 | 0 | 0 | 0 |
| Warship | 0 | 0 | 1 | 0.5 | 0 |
| Small Military Vessel | 0 | 0 | 0.5 | 1 | 0.5 |
| Small Transport/Utility | 0.5 | 0 | 0 | 0.5 | 1 |

TABLE III
VESSEL TYPES AND THEIR CORRESPONDING CATEGORY

| Category | Vessel types |
|---|---|
| Cargo Transport | bulk carrier, car carrier, cargo ship, carrier ship, container ship, dry cargo vessel, general cargo ship, livestock carrier, lng carrier, refrigerated cargo ship, merchant ship |
| Tanker/Industrial | chemical tanker, heavy lift vessel, lpg tanker, mother vessel, oil tanker, product tanker, tanker |
| Warship | warship |
| Small Military Vessel | coast guard boat, naval patrol vessel |
| Small Transport/Utility | fishing trawler, japanese harpoonists, militant anti-whaling group, skiff, speed boat, tug boat, dredger, fishing vessel, vessel |
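To make Equations (1) to (4) concrete, the following Python sketch computes the regional hostility of one vessel from a handful of incident records. It is a minimal illustration, not the authors' implementation: the `Incident` record, the `regional_hostility` function and the speed-class labels are hypothetical names, incidents are assumed to arrive with their distance to the vessel already computed and their vessel type already mapped to a Table III category, and averaging over fewer than n incidents when fewer are available is an assumption.

```python
from dataclasses import dataclass

# Trapezoid corners (A, B, C, D) in nautical miles, per vessel speed class.
PROXIMITY = {"slow": (0, 0, 5, 10),
             "medium": (0, 0, 21.6, 43.2),
             "fast": (0, 0, 40, 80)}

# Incident severity risk psi, transcribed from Table I.
SEVERITY = {"Bomb Threat": 1.0, "Terrorism": 0.9, "Hostage Scenario": 0.8,
            "Crew Damage": 0.7, "Theft": 0.6, "Invasion": 0.5,
            "Near Invasion": 0.4, "Threatened": 0.3, "Approach": 0.2,
            "Crew Error": 0.1, "Unknown": 0.05}

# Category similarity delta, transcribed from Table II.
CATEGORIES = ["Cargo Transport", "Tanker/Industrial", "Warship",
              "Small Military Vessel", "Small Transport/Utility"]
SIMILARITY = [[1, 0.5, 0, 0, 0.5],
              [0.5, 1, 0, 0, 0],
              [0, 0, 1, 0.5, 0],
              [0, 0, 0.5, 1, 0.5],
              [0.5, 0, 0, 0.5, 1]]


@dataclass
class Incident:  # hypothetical record for one incident report
    distance_nm: float      # x_i, distance from the vessel
    incident_type: str      # y_i, a Table I category
    vessel_category: str    # category of the victim vessel


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function with corners a <= b <= c <= d."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def regional_hostility(category, speed_class, incidents, n=5,
                       alpha=0.4, beta=0.3, gamma=0.3):
    """Equation (1): weighted sum of MIP, MIS and VPI over the n closest incidents."""
    nearest = sorted(incidents, key=lambda v: v.distance_nm)[:n]
    if not nearest:
        return 0.0  # assumption: no nearby reports means no regional hostility
    corners = PROXIMITY[speed_class]
    mip = sum(trapezoid(v.distance_nm, *corners) for v in nearest) / len(nearest)
    mis = sum(SEVERITY[v.incident_type] for v in nearest) / len(nearest)
    row = SIMILARITY[CATEGORIES.index(category)]
    vpi = max(row[CATEGORIES.index(v.vessel_category)] for v in nearest)
    return alpha * mip + beta * mis + gamma * vpi


reports = [Incident(10.0, "Invasion", "Tanker/Industrial"),
           Incident(30.0, "Theft", "Cargo Transport")]
print(regional_hostility("Tanker/Industrial", "medium", reports))
```

In this example the closer incident lies inside the flat part of the medium-speed trapezoid (full proximity risk), the farther one lies on the descending edge, and the victim pertinence index is 1 because one of the reports involves a vessel of the same category.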

2) Degree of Distress: As mentioned in Section IV-B, the Degree of Distress risk feature quantifies the combined effect of multiple distress factors acting upon the vessel, such as those affecting people aboard, the environment or the vessel itself in case it runs out of fuel. In this paper, we expand the formulation in [6] with another distress factor, Risk of Attack, derived from textual information mined directly from the semi-structured NGA WWTTS maritime incident reports, as displayed in Equation (5):

$$\mu_{DD}(X) = 0.3\,\mu_{RP}(X) + 0.2\,\mu_{RE}(X) + 0.2\,\mu_{RF}(X) + 0.3\,\mu_{RA}(X) \qquad (5)$$

where $\mu_{RP}(X)$, $\mu_{RE}(X)$ and $\mu_{RF}(X)$ are defined as in [6] and $\mu_{RA}(X)$ is the probability that vessel X will be attacked due to the category it belongs to. This value is based on the maritime incident reports compiled in the NGA WWTTS repository. More formally, $\mu_{RA}(X)$ is the square root of the conditional probability that X.category would be the subject of a maritime incident I:

$$\mu_{RA}(X) = \sqrt{P(X.category \mid I)} \qquad (6)$$

where $P(X.category \mid I)$ is the fraction of the total number of reports (over a user-specified time period) where the vessel's category appears involved in the incident categories 1–8 listed in Table I. Taking the square root increases the probability so as to more uniformly distribute the Risk of Attack over [0, 1].

V. A RISK-AWARE FUSION EFFECTIVENESS METRIC

Despite the strong interest and rapid research developments in the HLIF arena, the literature on MoE for HLIF is still in its infancy. Blasch et al. [21], [22] were among the first to propose an indicator that quantifies the degree of effectiveness of any HLIF process. Their metric is based on three components: (1) the information gain, measuring the value added by the content contributed by new sources; (2) the information quality, which reflects several facets of the data at hand, such as its reliability and credibility and (3) the robustness, gauging the consistency over the testing and application domains. The authors claim that the product of these three components is a domain-agnostic, valid strategy for estimating HLIF effectiveness.

We formulate the information fusion effectiveness (IFE), depicted in Equation (7), as a more tailored version of the above general MoE by injecting risk into its components.

$$IFE = \underbrace{IBR \times RAL}_{\text{Information Gain}} \times \underbrace{IC \times OIT}_{\text{Information Quality}} \times \underbrace{R}_{\text{Robustness}} \qquad (7)$$
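Since Equation (7) is a plain product, it can be checked directly against reported component values. The short Python sketch below does so for the two ECC rows of Table IV (IBR, RAL, IC and OIT as reported there, with R = 1); the function name `information_fusion_effectiveness` is a hypothetical label introduced only for this illustration.

```python
def information_fusion_effectiveness(ibr, ral, ic, oit, robustness=1.0):
    """Equation (7): information gain x information quality x robustness."""
    information_gain = ibr * ral
    information_quality = ic * oit
    return information_gain * information_quality * robustness


# Component values taken from the ECC_H and ECC_HS rows of Table IV.
ecc_h = information_fusion_effectiveness(1.0, 0.163, 0.600, 0.833)
ecc_hs = information_fusion_effectiveness(1.0, 0.164, 0.640, 0.833)
print(round(ecc_h, 4), round(ecc_hs, 4))  # 0.0815 0.0874
```

Both products agree with the IFE column of Table IV, which also shows that with IBR and OIT unchanged the gain in the hard-soft row comes almost entirely from the higher IC value.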

In our MDA scenario S, the Information Gain (IG) is a measure of the number of vessels that make use of the available data sources (Information Benefit Ratio, IBR) and the overall risk alertness they provide (Risk Awareness Level, RAL). More formally:

• $IBR_H = \frac{1}{|S|} \, |\{X \in S : X.AIS \neq \emptyset\}|$ for hard sources only;
• $IBR_S = \frac{1}{|S|} \, |\{X \in S : N_n(X.Loc) \neq \emptyset\}|$ for soft sources only; and
• $IBR_{HS} = \begin{cases} IBR_S & \text{if } X.AIS = \emptyset \;\; \forall X \in S \\ IBR_H & \text{if } N_n(X.Loc) = \emptyset \;\; \forall X \in S \\ SR_H \, IBR_H + SR_S \, IBR_S & \text{otherwise} \end{cases}$

where $|S|$ is the number of vessels in S, $X.AIS$ denotes an active AIS transceiver in vessel X, $N_n(X.Loc)$ is the set of the n closest maritime incident reports to X's current location (see Section IV) and $SR_H$, $SR_S$ are the relevance values of the hard and soft data sources, respectively. In this study, we used $SR_H = 0.6$ and $SR_S = 0.4$.

The RAL is defined as the extent to which the vessels in the scenario S are aware of the existing risks. This variable is computed as the average overall risk (i.e., across all risk features) of the vessels in S at any time instant:

$$RAL = \frac{1}{|S|} \sum_{X \in S} OR(X) \qquad (8)$$

where $OR(X)$ denotes the overall risk of vessel X.
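The two information-gain components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each vessel is represented by a hypothetical dictionary with the keys `ais` (whether an AIS transceiver is active), `nearby_incidents` (the list standing in for the set of n closest reports) and `overall_risk` (the overall risk value); none of these names come from the paper.

```python
def ibr_hard(vessels):
    """Fraction of vessels with an active AIS transceiver."""
    return sum(1 for v in vessels if v["ais"]) / len(vessels)


def ibr_soft(vessels):
    """Fraction of vessels with at least one nearby incident report."""
    return sum(1 for v in vessels if v["nearby_incidents"]) / len(vessels)


def ibr_hard_soft(vessels, sr_h=0.6, sr_s=0.4):
    """Piecewise hard-soft IBR: fall back to one source if the other is empty."""
    h, s = ibr_hard(vessels), ibr_soft(vessels)
    if h == 0:          # no vessel reports AIS: soft sources only
        return s
    if s == 0:          # no vessel has nearby reports: hard sources only
        return h
    return sr_h * h + sr_s * s


def risk_awareness_level(vessels):
    """Equation (8): mean overall risk across the vessels in the scenario."""
    return sum(v["overall_risk"] for v in vessels) / len(vessels)


# Hypothetical four-vessel scenario used only to exercise the functions.
scenario = [
    {"ais": True, "nearby_incidents": ["r1", "r2"], "overall_risk": 0.2},
    {"ais": True, "nearby_incidents": ["r3"], "overall_risk": 0.4},
    {"ais": True, "nearby_incidents": ["r4"], "overall_risk": 0.6},
    {"ais": True, "nearby_incidents": [], "overall_risk": 0.8},
]
print(ibr_hard_soft(scenario), risk_awareness_level(scenario))
```

Here every vessel reports AIS and three of the four have nearby incident reports, so the hard-soft ratio is the weighted sum 0.6 × 1 + 0.4 × 0.75. If no vessel had any nearby report, the function would return the hard-only ratio instead.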

The Information Quality (IQ) is defined as the product of the Information Confidence (IC) and the Overall Information Timeliness (OIT). IC is in turn calculated as the mean, over all participating data sources, of the product of the Reliability of Information (ROI) and the Reliability of Source (ROS):

$$IC_H = ROS_H \, ROI_H \qquad (9)$$

$$IC_{HS} = (ROS_H \, ROI_H + ROS_S \, ROI_S)/2 \qquad (10)$$

The reliability values were defined according to the US Military Standard 640 (MIL-STD-640). Both $ROS_H$ and $ROS_S$ have been set to 1.0 given that the two sources (ExactEarth for AIS and NGA for WWTTS reports) are quite reputable private and governmental entities, respectively. $ROI_H$ was fixed to 0.75 (fairly reliable) since AIS can be intentionally spoofed, whereas $ROI_S$ was set to 0.85 (very reliable) because the maritime incident reports are peer reviewed by knowledgeable operators prior to their online disclosure.

The OIT is a function of the Information Timeliness (IT). $IT_H$ can be expressed as the average, across all vessels, of the average delay (in seconds) between consecutive AIS reports for each vessel. On the other hand, $IT_S$ is the average, across all vessels, of the average age (in days) of each vessel's n closest maritime incident reports. The next step is to model the hard information timeliness (HIT) and soft information timeliness (SIT) fuzzy variables, whose respective crisp inputs are $IT_H$ and $IT_S$. These two variables have the following linguistic terms: $HIT_{Recent}$ (trapezoidal, A = 0, B = 0, C = 180, D = 540); $HIT_{Old}$ (trapezoidal, A = 360, B = 540, C = ∞, D = ∞); $SIT_{Recent}$ (trapezoidal, A = 0, B = 0, C = 500, D = 732) and $SIT_{Old}$ (trapezoidal, A = 500, B = 732, C = ∞, D = ∞). $OIT_{HS}$ is then determined as the center of gravity of the fuzzified output of the Mamdani fuzzy inference system (FIS) below:

• IF HIT is $HIT_{Recent}$ AND SIT is $SIT_{Recent}$ THEN $OIT_{HS}$ is $OIT_{Recent}$
• IF HIT is $HIT_{Recent}$ AND SIT is $SIT_{Old}$ THEN $OIT_{HS}$ is $OIT_{Acceptable}$
• IF HIT is $HIT_{Old}$ THEN $OIT_{HS}$ is $OIT_{Old}$

The membership functions associated with the $OIT_{HS}$ fuzzy variable are as follows:

• $OIT_{Old}$: triangular (A = 0, B = 0, C = 0.5)
• $OIT_{Acceptable}$: triangular (A = 0, B = 0.5, C = 1)
• $OIT_{Recent}$: triangular (A = 0.5, B = 1, C = 1)
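The rule base and membership functions above are enough to sketch the hard-soft timeliness computation in plain Python. The sketch assumes the usual Mamdani choices, which the text does not spell out: minimum for AND and for implication, maximum for aggregation, and a centroid approximated by sampling the output domain [0, 1]. The function name `overall_timeliness` and the `steps` parameter are hypothetical.

```python
import math

INF = math.inf


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership; c = d = inf yields an open right shoulder."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def triangle(x, a, b, c):
    """Triangular membership expressed as a degenerate trapezoid."""
    return trapezoid(x, a, b, b, c)


def overall_timeliness(it_h, it_s, steps=10000):
    """Centroid of the aggregated Mamdani output for crisp inputs IT_H, IT_S."""
    # Fuzzify the inputs (IT_H in seconds, IT_S in days).
    hit_recent = trapezoid(it_h, 0, 0, 180, 540)
    hit_old = trapezoid(it_h, 360, 540, INF, INF)
    sit_recent = trapezoid(it_s, 0, 0, 500, 732)
    sit_old = trapezoid(it_s, 500, 732, INF, INF)

    # Firing strength of the three rules (AND modeled as min).
    fire_recent = min(hit_recent, sit_recent)
    fire_acceptable = min(hit_recent, sit_old)
    fire_old = hit_old

    # Clip each output set, aggregate with max, then take the centroid.
    numerator = denominator = 0.0
    for k in range(steps + 1):
        y = k / steps
        mu = max(min(fire_recent, triangle(y, 0.5, 1, 1)),
                 min(fire_acceptable, triangle(y, 0, 0.5, 1)),
                 min(fire_old, triangle(y, 0, 0, 0.5)))
        numerator += y * mu
        denominator += mu
    return numerator / denominator if denominator > 0 else 0.0


print(overall_timeliness(60, 100))   # fresh AIS and fresh incident reports
print(overall_timeliness(600, 100))  # stale AIS reports
```

With fresh data on both inputs only the first rule fires at full strength, so the output is the centroid of the $OIT_{Recent}$ triangle, about 0.833 (5/6). With an AIS delay beyond 540 seconds only the third rule fires and the result drops to roughly 1/6, the centroid of $OIT_{Old}$.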

In the case where only hard sources are available, $OIT_H$ is computed by fuzzifying $IT_H$ via a triangular membership function with parameters A = 0, B = 180 and C = 540. Finally, the robustness R of the system is set to 1 since the system can cope with real-time variations [21], [22].

VI. EXPERIMENTAL EVALUATION

In this Section, we empirically validate the proposed risk-aware HSDF framework described in Sections III and IV. To evaluate and contrast the effectiveness of the RMF with hard-soft HLIF, we relied on two scenarios: the east Canada coast (ECC) outlined in [6] and a new one modeled around the Horn of Africa (HOA).

A. Maritime Incident Reports

Fig. 2 reveals the distribution of worldwide maritime incidents per vessel category in the POI under discussion. Transport/utility and cargo vessels are nearly equally the subject of maritime lawlessness given their strategic importance. As expected, incidents related to military vessels are scant due to their defensive capabilities.

The distribution of worldwide maritime incidents per incident category is reflected in Fig. 3. Vessel boarding and trespassing (like in piracy cases) are responsible for 40% of the criminal activity reported in the POI. The combined effect of theft and crew damage accounts for almost the same degree of unlawfulness. These facts underscore the importance of automated HLIF solutions for a more effective MDA.

Fig. 2. Maritime incident distribution per vessel category (pie chart; categories: Cargo, Transport or Utility, Industrial, Military)

Fig. 3. Maritime incident distribution per incident category (pie chart; categories: Crew Damage, Hostage, Approach, Threatened, Near Invasion, Theft, Invasion)

B. ECC Scenario Analysis

The ECC scenario consists of 14 maritime assets. An extension of this scenario was tested successfully in [6]. The Canadian maritime shoreline is clearly characterized by low regional hostility, as displayed in Fig. 4.

Fig. 4. Overall risk for all vessels in the ECC scenario (2 coast guard vessels, 2 medical vessels, 4 tugs, 1 oil tanker, 4 speed boats and 1 cruise)

TABLE IV
ECC HLIF EFFECTIVENESS

| Scenario | IBR | RAL | IC | OIT | IFE |
|---|---|---|---|---|---|
| ECC_H | 1 | 0.163 | 0.600 | 0.833 | 0.0815 |
| ECC_HS (n = 10) | 1 | 0.164 | 0.640 | 0.833 | 0.0874 |
| ECC_HS (n = 9) | 1 | 0.164 | 0.640 | 0.833 | 0.0874 |
| ECC_HS (n = 8) | 1 | 0.164 | 0.640 | 0.833 | 0.0874 |
| ECC_HS (n = 7) | 1 | 0.164 | 0.640 | 0.833 | 0.0874 |
| ECC_HS (n = 6) | 1 | 0.164 | 0.640 | 0.833 | 0.0874 |
| ECC_HS (n = 5) | 1 | 0.164 | 0.640 | 0.833 | 0.0874 |

The overall effectiveness of the hard fusion scenario (ECC_H), as calculated by the metric in Section V, was 8.15%. For the hard-soft scenario (ECC_HS), and despite the lack of reported maritime incidents along the Canadian coastline, the IFE value is slightly higher, around 8.74%. This is caused by the minor increase in the IC indicator due to the ingestion of a fairly reliable soft data source. Notice that OIT_H = OIT_HS given that no compiled incident reports apply to the ECC dataset. In this case, varying the number of closest reports n did not impact the RAL indicator at all.

C. HOA Scenario Analysis

The coast off the Horn of Africa is notorious for piracy, as confirmed by the data extracted from the NGA WWTTS reports. As seen in Fig. 5, the density of maritime incidents in this region is staggering.

Fig. 5. Horn of Africa maritime incidents

We created a simulated scenario of this volatile trade route in order to test the HSDF capabilities of the RMF. The scenario portrayed in Fig. 6 consists of 43 maritime units.

Fig. 6. Horn of Africa maritime scenario (4 medical vessels, 14 tugs, 11 oil tankers, 12 speed boats and 2 cruises)

In this scenario, the risk evaluation process clearly benefited from the incorporation of soft data. In Fig. 7, we find that the regional hostility metric and risk of attack (as part of degree of distress) soft risk features have a visible effect on the overall risk of vessels. For the HOA_HS scenario, we find that the proposed HSDF system improves on the HOA_H scenario by around 16.89 to 19.17 percentage points of IFE (the absolute difference between the IFE values), as shown in Table V. The IBR value of 0.983 in the hard-soft case indicates that a few vessels were still far away from the nearest reported incidents. Furthermore, the RAL of the HOA_HS scenario experiences a drastic improvement over its hard-only counterpart. This significant increase in risk awareness is both accurate and useful for navigation in such a conflict-laden region. The downside of bringing soft data into the fusion framework is expressed in the OIT indicator, owing to the age of the available textual incidents (≈ 1.3 years preceding the actual scenario date).

TABLE V
HOA HLIF EFFECTIVENESS

Scenario          IBR    RAL    IC     OIT    IFE
HOA_H             1      0.163  0.600  0.833  0.0816
HOA_HS (n = 10)   0.983  0.524  0.640  0.826  0.2723
HOA_HS (n = 9)    0.983  0.526  0.640  0.826  0.2733
HOA_HS (n = 8)    0.983  0.514  0.640  0.826  0.2671
HOA_HS (n = 7)    0.983  0.511  0.640  0.826  0.2655
HOA_HS (n = 6)    0.983  0.485  0.640  0.826  0.2520
HOA_HS (n = 5)    0.983  0.482  0.640  0.826  0.2505

Fig. 7. HOA risk assessment. Left: hard scenario. Right: hard-soft scenario

Overall, the HOA scenario provides useful information that stimulates discussion on the cost/benefit of HSDF systems. In our case study, we realized that the amalgamation of hard and soft data can produce significant leaps in terms of IG. Another lesson learned is that HSDF systems may not exhibit the same IQ as hard fusion systems, as reliability/timeliness may decline when ingesting soft sources.

VII. CONCLUSIONS

In this work, we have augmented the RMF proposed in [5] and [6] with an HSDF module as part of the risk feature extraction and modeling. We tested our scheme with two MDA scenarios where the number of maritime incident reports varies substantially. In either case, our HSDF system has proven effective in accurately translating hard and soft information into a quantitative structure for risk analysis that correlates with and confirms human intuition. Further still, our IFE measure has proven to accurately translate important qualitative factors into relevant quantitative terms. This measure could be applied to any data fusion system that deals with risk. Even risk-agnostic systems could profit from the building blocks behind the metric formalization provided that the necessary customizations to the domain of interest are made.
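The effectiveness gains quoted in Section VI can be re-derived from the IFE columns of Tables IV and V. The sketch below is only an illustration of that arithmetic: the IFE values are copied from the two tables, while the helper name `gain_in_points` and the dictionary layout (keyed by the number of closest reports n) are assumptions made here, not part of the RMF.

```python
# IFE values copied from Table IV (ECC) and Table V (HOA).
ECC_H = 0.0815
ECC_HS = {10: 0.0874, 9: 0.0874, 8: 0.0874, 7: 0.0874, 6: 0.0874, 5: 0.0874}
HOA_H = 0.0816
HOA_HS = {10: 0.2723, 9: 0.2733, 8: 0.2671, 7: 0.2655, 6: 0.2520, 5: 0.2505}


def gain_in_points(hard: float, hard_soft: dict[int, float]) -> dict[int, float]:
    """Absolute IFE gain of hard-soft over hard-only fusion, in percentage points.

    Hypothetical helper: one entry per value of n (number of closest reports).
    """
    return {n: round((ife - hard) * 100, 2) for n, ife in hard_soft.items()}


ecc_gain = gain_in_points(ECC_H, ECC_HS)
hoa_gain = gain_in_points(HOA_H, HOA_HS)

print("ECC gain per n:", ecc_gain)
print("HOA gain per n:", hoa_gain)
print("HOA gain range:", min(hoa_gain.values()), "to", max(hoa_gain.values()))
```

Read this way, the HOA gain spans 16.89 to 19.17 percentage points, matching the range stated in Section VI-C, whereas the ECC gain stays at 0.59 points for every n because no incident reports apply to that region.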

Future work is concerned with HSDF in the response space (Level 3 RMF, see Fig. 1) and the study of the dynamic nature of the reliability of the data sources and their information.

REFERENCES

[1] R. Abielmona, “Tackling big data in maritime domain awareness,” Vanguard, pp. 42–43, August-September 2013.
[2] E. Blasch, I. Kadar, J. Salerno, M. M. Kokar, G. M. Powell, D. D. Corkill, and E. H. Ruspini, “Issues and challenges in situation assessment (level 2 fusion),” Journal of Advances in Information Fusion, vol. 1, pp. 122–139, December 2006.
[3] E. Blasch and S. Plano, “DFIG level 5 (user refinement) issues supporting situational assessment reasoning,” in International Conference on Information Fusion, July 2005.
[4] A. H. Razavi, D. Inkpen, R. Falcon, and R. Abielmona, “Textual risk mining for maritime situational awareness,” in 2014 IEEE International Inter-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), pp. 167–173, March 2014.
[5] R. Falcon, R. Abielmona, and A. Nayak, “An evolving risk management framework for wireless sensor networks,” in Proceedings of the 2011 IEEE Int’l Conference on Computational Intelligence for Measurement Systems and Applications (CIMSA), pp. 1–6, September 2011.
[6] R. Falcon and R. Abielmona, “A response-aware risk management framework for search-and-rescue operations,” in 2012 IEEE Congress on Evolutionary Computation (CEC), pp. 1540–1547, June 2012.
[7] ISO, “Risk management: Principles and guidelines,” International Organization for Standardization, no. 31000, 2009.
[8] X. Tan, Y. Zhang, X. Cui, and H. Xi, “Using hidden markov models to evaluate the real-time risks of network,” in Proceedings of IEEE Intl Symposium on Knowledge Acquisition and Modeling Workshop, pp. 490–493, December 2008.
[9] K. Haslum and A. Arnes, “Real-time risk assessment using continuous-time hidden markov models,” in Proceedings of Intl Conference on Computational Intelligence and Security, pp. 1536–1540, November 2006.
[10] J. Merrick, J. van Dorp, and V. Dinesh, “Assessing uncertainty in simulation based maritime risk assessment,” Risk Analysis, vol. 25, pp. 731–743, July 2005.
[11] H. Abbass, A. Bender, S. Gaidow, and P. Whitbread, “Computational red teaming: Past, present and future,” IEEE Computational Intelligence Magazine, vol. 6, pp. 30–42, February 2011.
[12] A. Yang, H. Abbass, and R. Sarker, “Characterizing warfare in red teaming,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, no. 2, pp. 268–285, 2006.
[13] M. Jakob, O. Vanek, S. Urban, P. Benda, and M. Pechoucek, “Adversarial modeling and reasoning in the maritime domain - first year report,” Agent Technology Center, Department of Cybernetics, FEE Czech Technical University in Prague, 2009.
[14] N. Friedman, The Naval Institute Guide to World Naval Weapon Systems. Naval Institute Press, 2006.
[15] S. Dewey and J. Maynard, “Time, technology and exploitation can future military command-and-control domination be linked to effective knowledge?,” Joint Military Operations Department Naval War College, April 2008.
[16] I. Lim and F. Jau, “Comprehensive maritime domain awareness: an idea whose time has come?,” Defence, Terrorism and Security, Globalisation and International Trade, October 2007.
[17] K. Moore, “Predictive analysis for naval deployment activities,” PANDA BAA, no. 05-44, 2005.
[18] R. Laxhammar, “Artificial intelligence for situation assessment,” Master’s thesis, Royal Institute of Technology, Sweden, 2007.
[19] E. Blasch, E. Dorion, P. Valin, E. Bosse, and J. Roy, “Ontology alignment in geographical hard-soft information fusion systems,” in 13th Conference on Information Fusion (FUSION), pp. 26–29, July 2010.
[20] H. Shao, R. Abielmona, R. Falcon, and N. Japkowicz, “Vessel track correlation and association using fuzzy logic and echo state networks,” in IEEE Congress on Evolutionary Computation (CEC), to appear, 2014.
[21] E. Blasch, P. Valin, and E. Bosse, “Measures of effectiveness for high-level fusion,” in 13th Conference on Information Fusion (FUSION), pp. 1–8, July 2010.
[22] E. Blasch, R. Breton, and P. Valin, “Information fusion measures of effectiveness (MOE) for decision support,” Proc. of SPIE, vol. 8050, pp. 805011–1–805011–12, May 2011.
