Maritime Context-Aware Communication Supporting System for e-Navigation Jin Hyundoo, Seunghee Choi, Jinki Seor, Unkyu Jang, Sungchul Choi TEAMAB at Gachon University, Republic of Korea

Abstract

In this paper, a maritime VHF communication support framework is developed to facilitate e-Navigation digital data exchange in a coherent and predictable manner. The system is based on three days of transcribed VHF communication exchanged ship-to-ship and ship-to-shore in the VTS areas of the ports of Yeosu, Masan, and Ulsan in the Republic of Korea. Repetitive VTS communicative situations were analysed and categorised automatically according to the particular characteristics of each event (e.g. anchoring, berthing, and pilotage) by evaluating and predicting the flow of VHF exchanges with a wide range of machine learning techniques, their ensemble models, and deep learning methods built on representations such as word2vec, doc2vec, bag of words (BOW), and term frequency–inverse document frequency (TF-IDF). This VHF communication support framework is expected to be highly effective and practical when combined with onboard e-Navigation communication systems: it automatically analyses a given VHF situation and then supports mariners' situational awareness in a way that helps avoid human errors caused by miscommunication.

Keywords: Maritime Communication, Artificial Intelligence applied Classification, e-Navigation, VTS Communication


1. Introduction

Recently, automation in the maritime domain has been studied extensively, driven by the expansion of maritime data and the growing ease of data collection. GPS navigation technology, the first step toward automated navigation, is emerging as one of the core technologies of the maritime AI field [1]. The International Maritime Organization (IMO) refers to this family of navigation technologies as e-Navigation¹. Since 2006, the International Association of Marine Aids to Navigation and Lighthouse Authorities (IALA) has also maintained an e-Navigation committee for the continual discussion of key issues and future development. Furthermore, the EU has launched the EfficienSea2 project with total funding of 12.5 million euros². Since November 2013, the Korean government has likewise carried out a large-scale national project, SMART e-Navigation, investing 115 million dollars to advance e-Navigation technologies and design user-friendly future services³.

¹ http://www.imo.org/en/OurWork/Safety/Navigation/Pages/eNavigation.aspx
² https://efficiensea2.org
³ http://www.smart-navigation.org

Preprint submitted to Journal Name, February 8, 2019

When the development of an e-Navigation system is considered, the characteristics of maritime communication, compared with those of traffic communication on land, should be fully reflected from the initial stage of system design, development, and implementation, with the following facts in mind. First, the majority of communication at sea is conducted between two parties, either ship to ship or ship to shore (Vessel Traffic Service center), through VHF radio. Considering that routes at sea are not clearly visible or fully guided by traffic marks and signals as on the road, each ship, a so-called VHF station, must communicate verbally to reach a mutual agreement on safe navigation in a wide range of traffic situations. Second, patternised exchanges of information account for a large share of VHF communication. Stereotypical dialogue patterns appear in most routine communication, such as anchoring, departure, and arrival in coastal waters, as demonstrated in this study, even though each exchange depends largely on how the individual situation unfolds. Third, linguistic variation in pronunciation, intonation, and accent is considerable during the exchange of information, because crew members and VTS operators come from different linguistic backgrounds. Substantial challenges therefore exist in delivering and interpreting VHF voice messages, even when they are conducted in English, the common maritime language. Finally, speakers' and listeners' differing personal characteristics and situational awareness can distort the intended message and cause it to be misinterpreted. In other words, individual responses to a given communicative situation are never identical, even though international common guidelines and procedures for emergency situations already exist. A failure of communication can be directly linked to serious disasters, particularly in the case of merchant and cruise ship accidents.
In order to save lives and protect the environment at sea by ensuring navigational safety, the development of an e-Navigation system should therefore be directed in a way that reduces all interlocutors' communicative burden while increasing their capacity to take in information instantly and accurately. To approach these issues in a more practical manner, an event-based maritime communication support system for e-Navigation has been developed. For this purpose, more than a thousand authentic VHF communication cases exchanged between vessels and VTS centers in Masan, Yeosu, and Ulsan ports in Korea were collected and transcribed. Repetitive communicative situations were then identified by analysing interlocutors' conversational dialogues with classification models. On this basis, a standardised syntax for each event was established, which enables a prototype system to classify events according to their intention in a given navigational situation and, finally, to recommend rule-based standardised conversations that support effective and accurate VHF text communication at sea by providing the most appropriate responses throughout the communication.


2. Related Works



2.1. e-Navigation in the Maritime Industry
e-Navigation is a concept defined by the IMO in 2006 to provide navigational support by means of electronic and digitalised communication systems, enhancing mariners' decision making by preventing possible human errors, which are among the most important factors causing marine accidents [2]. Even though mariners are expected to be familiar with the use and operation of electronic equipment on board in accordance with the IMO Convention on Safety of Life at Sea (SOLAS), there have been strong arguments that the adoption of emerging navigational technologies on board places a great deal of burden on mariners by increasing their workload [3]. This raises the risk of inappropriate decision making, specifically when they are in a physically and mentally demanding navigational situation. In this sense, one of the ultimate goals of e-Navigation is to enhance the safety of navigation by assisting with the gathering, integration, and analysis of information, leading to mariners' and VTS operators' decision making based on improved situational awareness [4].

2.2. Text Classification
A number of attempts have been made to create automated recommendation systems for assisting communications (REFERENCE, REFERENCE, REFERENCE) in different industrial sectors. The massive collection of online communication data has made this possible and has diversified the range of text-driven research methods. Text classification is one of the important research methods utilising text. For this purpose, various machine learning algorithms such as Naive Bayes [5] and SVM [6] have been introduced [7]. Since then, the introduction of various representation learning techniques, which help solve data sparsity problems, has made it possible to apply deep learning models such as RNNs [8], CNNs [9], and RCNNs [10] using large amounts of data [11], [12], [13], [14].


2.3. Communication Support System
This study labels each sequence of VHF communication dialogues and classifies them into the most appropriate target navigational situation by applying a text classification model. The system then automatically recommends the list of information to be exchanged next. At this step, the information the user selects from the recommended phrases can be sent directly to the target receiver in an internationally standardised language format, the IMO Standard Marine Communication Phrases (SMCP). A similar rule-based chatbot system has been proposed in the maritime industry [15], [16] for the purpose of training cadets who major in navigation to become future deck officers. For this reason, that system is oriented more toward training VHF communication between ship and VTS centers than toward deck officers' practical convenience in easing the communication barriers they encounter in a wide range of navigational situations on board.


3. Proposed System



3.1. Data Collection
The VHF communication recommendation system developed in this study identifies 11 different categories of ship-to-ship and ship-to-shore dialogues in real time. The most appropriate standardised phrase is recommended according to the scenario of the given navigational situation whenever the prediction probability exceeds a predetermined threshold value. The data for this study were provided by the Korea Institute of Maritime and Fisheries Technology in Busan, Republic of Korea.


3.2. Model
The model applied in this research comprises both RNN-based deep neural networks and machine learning models. The deep neural network-based models use word and sentence embeddings for instant sentence classification. The machine learning models train on all words occurring in a series of communications, adding the words of each communication step to an "instance bag" and using the accumulated bag for prediction. For example, when a communication begins with "Ulsan VTS. This is Bright Star.", the model uses every word from the beginning to the end of that sentence in the prediction process. When the next phrase, such as "Yes, Bright Star. Go ahead", is spoken, this sentence is again added to the bag and used in the further prediction process for classification.

Figure 1: An example of "Instance bag"
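As a concrete illustration, the incremental instance-bag scheme described above can be sketched as follows. This is a hypothetical minimal example: the toy utterances, labels, and the choice of a multinomial Naive Bayes classifier are ours, not the paper's exact configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data: each sample is an accumulated instance bag at some
# communication step, labelled with its situation category (hypothetical).
train_bags = [
    "ulsan vts this is bright star",
    "ulsan vts this is bright star yes bright star go ahead",
    "request anchoring at anchorage number two",
    "anchor let go anchoring completed",
]
train_labels = ["Passing", "Passing", "Anchoring", "Anchoring"]

vectorizer = CountVectorizer()
clf = MultinomialNB().fit(vectorizer.fit_transform(train_bags), train_labels)

def predict_per_step(utterances):
    """Re-classify after every utterance using the growing instance bag."""
    bag, predictions = [], []
    for utterance in utterances:
        bag.extend(utterance.lower().split())      # add new tokens to the bag
        x = vectorizer.transform([" ".join(bag)])  # BOW over the whole bag
        predictions.append(clf.predict(x)[0])
    return predictions

steps = predict_per_step(["Ulsan VTS. This is Bright Star.",
                          "Yes, Bright Star. Go ahead."])
```

At each time step the bag only grows, so early predictions can be revised as more of the dialogue arrives, which is the behaviour the paper exploits in Section 4.3.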


3.2.1. Deep Neural Network Models
The first method is a 2-layer Long Short-Term Memory (LSTM) model using word embeddings. The word embeddings of a sentence are the input to the first layer. The output of the first layer is used as the sentence representation of each case and then as the input of the second layer. After the second layer, a dense layer is applied, and the final classification is carried out with a softmax function, as described in Figure 2.

Figure 2: 2-layer LSTM Model Structure
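The data flow of this architecture can be sketched with a toy forward pass. This is not the trained model: the weights below are random, and the generic LSTM helper is written only to show the shapes moving through embedding → LSTM(100) → LSTM(64) → dense layers → softmax (layer sizes as reported in Section 3.3).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_layer(x_seq, n_hidden):
    """Minimal unidirectional LSTM layer with random (untrained) weights.
    x_seq: (T, d_in) -> hidden states (T, n_hidden)."""
    d_in = x_seq.shape[1]
    W = rng.normal(scale=0.1, size=(4 * n_hidden, d_in + n_hidden))
    b = np.zeros(4 * n_hidden)
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    states = []
    for x in x_seq:
        gates = W @ np.concatenate([x, h]) + b
        i, f, g, o = np.split(gates, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # cell-state update
        h = sigmoid(o) * np.tanh(c)                   # hidden state
        states.append(h)
    return np.stack(states)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

sentence = rng.normal(size=(6, 300))       # 6 tokens, 300-dim embeddings
h1 = lstm_layer(sentence, 100)             # first LSTM layer -> (6, 100)
h2 = lstm_layer(h1, 64)[-1]                # second layer, last state -> (64,)
d1 = sigmoid(rng.normal(scale=0.1, size=(32, 64)) @ h2)     # dense layer (32)
probs = softmax(rng.normal(scale=0.1, size=(11, 32)) @ d1)  # 11 categories
```

The final vector is a probability distribution over the 11 communication categories; training would fit the random matrices above with the optimiser settings listed in Section 3.3.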


In the second method, two word-embedding manipulations were used. The first weights the word embeddings by TF-IDF; it has been reported that this performs better for classification than merely summing the word embeddings of each sentence [12]. The second is the word2vec mean, in which the average of the word embeddings of the sentence is used [17]. These two manipulated sentence embeddings were each used as the input to a 1-layer LSTM model, and a dense layer was applied exactly as in the first method.

Figure 3: Weighted word embedding by TF-IDF
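The two sentence-representation manipulations can be illustrated in a few lines; the embedding and TF-IDF values below are random stand-ins, not the paper's trained vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = ["ulsan", "vts", "this", "is", "bright", "star"]
# Stand-in 300-dim word2vec embeddings and TF-IDF weights (hypothetical).
vecs = rng.normal(size=(len(tokens), 300))
weights = np.array([2.1, 0.9, 0.2, 0.2, 1.7, 1.5])

# Manipulation 1: TF-IDF-weighted combination of the word embeddings,
# so rare, informative tokens dominate the sentence vector.
tfidf_weighted = (weights[:, None] * vecs).sum(axis=0) / weights.sum()

# Manipulation 2: plain mean of the word embeddings ("word2vec mean").
mean_vec = vecs.mean(axis=0)
```

Either 300-dimensional vector then serves as the per-sentence input of the 1-layer LSTM described above.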


As the third method, a doc2vec model [18] is trained at the sentence level. The doc2vec model learns each word embedding and a representation vector of each sentence at the same time. The representation vector of each sentence is used as the input of a 1-layer LSTM model, and a dense layer is added, as in the previous methods.

3.2.2. Machine Learning-Based Models
This study applied a wide range of machine learning models for classification, including Naive Bayes, Random Forest, XGBoost, LightGBM, and Logistic Regression, using bag-of-words and TF-IDF matrices. The categories of VHF communication were predicted by adding the tokens of each sequence step to a token list, thereby reflecting the time-series information of the communication during the prediction process. It has been reported that the traditional machine learning techniques applied in this study demonstrate a considerably high level of performance in classifying short sentences [19].

Figure 4: An example of token list



Each occurrence of communication in time order is defined as t_m, and s_m refers to the token list accumulated up to t_m. As the time step m proceeds in sequential order, the new words are added to the word list s_m, which serves as the input of the model. This enables the model to learn information about the time step of the communication.

3.3. Experiments
The total dataset comprises 17,827 sentences in 3,300 cases of communication (i.e. 3,300 sequences). All Korean text is decomposed into tokens using KoNLPy's Okt tokenizer⁴. "Vessel to Vessel" means communication exchanged between two vessels during a voyage, and "VTS to Vessel" refers to communication exchanged between a vessel and a VTS center, including navigational warnings or information. The major communicative situations in the research data can be categorised as follows:

1. Anchoring - when the anchor is let go for anchoring

⁴ http://konlpy.org/



2. Heaving up anchor - when anchors are lifted from the bottom of the sea for sailing
3. Pilot - when a pilot embarks on or disembarks from a vessel
4. Shifting - when a vessel moves to another anchorage or berth within the port
5. Arrival - when a ship enters the port
6. Departure - when a ship leaves the port
7. Berthing - when a mooring operation is conducted alongside
8. Passing - when a ship reports to a VTS center, for instance on passing a reporting line or entering the port limit

Category            N. of Sequences   Mean Length   Mean N. of Tokens
Pilot               176               5.2           41.2
Vessel to Vessel    264               6.0           52.8
Heave up anchor     95                5.1           42.7
VTS to Vessel       641               5.7           55.2
Shifting            295               5.2           42.2
Arrival             249               5.4           45.9
Berthing            254               5.2           34.2
Departure           473               5.6           47.4
Passing             491               4.7           38.5
Anchoring           249               5.8           50.9
Crossing            113               5.1           41.2

Table 1: Dataset


As for the deep learning models, the first model used two LSTM layers. The sizes of the hidden states of the first and second LSTM layers are 100 and 64, respectively. Two dense layers follow the second LSTM layer; the sizes of their hidden states are 32 and 11, respectively, and a sigmoid activation function is further applied.

Model                              Hyperparameters
Gaussian NB                        Laplace smoothing: 0.005
Multinomial NB                     Laplace smoothing: 1
XGBoost                            Max depth: 4; Learning rate: 0.1
Random Forest                      Max depth: 15; Criterion: Gini index
Logistic Regression                L2 norm weight: 0.8
Lightgbm                           N. of estimators: 500; Learning rate: 0.1
SVM Classifier                     Kernel: Linear
2-Layer LSTM /                     Total epochs: 256; Learning rate: 0.001;
word2vec * TF-IDF /                Learning rate decay: 0.99 per 20 epochs;
word2vec mean / doc2vec            Dropout: 0.5 for 2 dense layers

Table 2: Hyperparameters of each model


For the second and third models, the size of the hidden state is 64, and two dense layers with a sigmoid activation function were likewise applied. In all three models, the following hyperparameters were applied in common: a batch size of 256, 300 total epochs, a learning rate of 0.001, a learning rate decay of 0.99 per 20 steps, and a dropout of 0.5 for the dense layers. For the word embeddings, lastly, the skip-gram method was used with an embedding vector size of 300 and a window size of 10, targeting the 464 tokens that occurred more than 10 times. For the machine learning models, BOW and TF-IDF, which are conventional text preprocessing techniques, were applied to the same 464 tokens. All tokens used in a VHF communication sequence up to the last time step were used as input features for training. For this, the following models were used: Naive Bayes, XGBoost, Random Forest, Logistic Regression, LightGBM, and an SVM classifier. The major hyperparameters of each model are given in Table 2. The performance of every model was measured by 5-fold cross-validation, with the weighted-average F1 score as the metric.
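The evaluation protocol, 5-fold cross-validation scored with a weighted-average F1, can be sketched with scikit-learn; the toy phrases and labels below are ours, not the study's data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

# Toy corpus: five samples per category so every stratified fold
# contains each class (hypothetical phrases).
docs = ["anchor let go", "request anchoring", "drop anchor now",
        "anchor holding well", "proceed to anchorage",
        "pilot on board", "pilot disembarked", "pilot ladder ready",
        "pilot station eta", "embarking pilot now"]
labels = ["Anchoring"] * 5 + ["Pilot"] * 5

X = CountVectorizer().fit_transform(docs)        # BOW feature matrix
scores = cross_val_score(MultinomialNB(), X, labels,
                         cv=5, scoring="f1_weighted")
mean_f1 = scores.mean()
```

`scoring="f1_weighted"` averages per-class F1 scores weighted by class support, which matches the metric reported in Tables 3 and 5.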


4. Result


4.1. Deep Learning-Based Models


Category                 2-LSTM layers   word2vec mean   word2vec * TF-IDF   doc2vec   Fasttext
Pilot (176)              N               N               N                   N         N
Vessel to Vessel (264)   N               N               0.05                N         N
Heave up anchor (95)     N               N               N                   N         N
VTS to Vessel (641)      0.30            0.28            0.26                0.19      0.33
Shifting (295)           N               0.02            0.09                0.08      N
Arrival (249)            N               N               N                   N         N
Berthing (254)           N               N               N                   N         N
Departure (473)          N               0.22            0.20                0.21      N
Passing (491)            N               0.17            0.20                0.16      N
Anchoring (249)          N               N               N                   N         N
Crossing (113)           N               N               N                   N         N

Table 3: Weighted-F1 score of Deep Learning-Based Models (N: category not predicted)


Both deep learning and machine learning require large amounts of data to train a model properly, but machine learning techniques are usually deemed more suitable when the dataset is relatively small⁵. In this research, it was likewise demonstrated that the categories with fewer than 300 sequences turned out to be unpredictable (N). Even for the categories that were predictable, F1 scores were very low (e.g. Vessel to Vessel). As an additional check, the top 10 most similar words in terms of cosine similarity were inspected on an arbitrary basis to confirm whether the low prediction rates originated in the word2vec representation vectors (Table 4). The results indicate that the word embedding vectors represent the semantic information of the given words well. In addition, to check whether the word vectors used in this research carry sufficient information and remain practical for predicting suitable categories, the Korean pre-trained Fasttext model [20] was used. This model contains 879,129 vocabulary words with 300-dimensional word vectors; 2,606 of these words were used for our research. According to the results based on the mean vector of each sentence, our embedding vectors were similar to those of Fasttext. It was therefore concluded that the data used in this research are not sufficient to adequately train the LSTM models.

⁵ https://towardsdatascience.com/machine-learning-vs-deep-learning-62137a1c9842


weather (날씨): give-way (피항), avoid (피), weather (기상), information (정보), wind (바람), wave (파도), safe navigation (안전항해), outside (외부), engine (엔진), maintain (유지)

port (좌현): starboard (우현), left (좌), stern (선미), side (쪽), port-to (좌현 대), incoming vessel (입항선), altering course (변침), alter course (변침해), a bow of ship (선수), port-to-port (좌현 대 좌현)

speed (속력): increase (올리), lower (낮추), overtake (추월), first (먼저), straight (쭉), starboard (우현), maintain (그대로), difficult (힘들), fast (빨리), down (내리)

Table 4: Top 10 most similar words by cosine similarity for three example words


4.2. Machine Learning-Based Methods
The weighted F1 score was checked after the above-mentioned seven models were trained on bag-of-words and TF-IDF features. To improve the classification performance further, a stacking ensemble was applied on top of the five best models. Using the top five models as base models, the stacking ensemble calculated, for each sequence, the probability of belonging to each category, and these probabilities became the input of a meta-classifier. As meta-classifiers, LightGBM, Multinomial Naive Bayes, and XGBoost were used. The hyperparameters of each model are the same as those used in the previous analysis.

BOW
Category                 BNB    MNB    XGBoost  Random Forest  Logistic Regression  Lightgbm  Linear SVM
Pilot (176)              0.65   0.67   0.69     0.66           0.68                 0.64      0.66
Vessel to Vessel (264)   0.71   0.72   0.79     0.66           0.77                 0.76      0.73
Heaving up Anchor (95)   0.42   0.49   0.65     0.43           0.63                 0.57      0.59
Vessel to VTS (641)      0.70   0.69   0.70     0.67           0.67                 0.69      0.65
Shifting (295)           0.71   0.69   0.76     0.72           0.74                 0.71      0.69
Arrival (249)            0.73   0.74   0.76     0.76           0.75                 0.75      0.74
Berthing (254)           0.65   0.79   0.88     0.88           0.87                 0.86      0.85
Departure (473)          0.88   0.81   0.80     0.78           0.81                 0.77      0.78
Passing (491)            0.76   0.74   0.76     0.78           0.74                 0.74      0.73
Anchoring (249)          0.69   0.71   0.75     0.68           0.72                 0.70      0.69
Crossing (113)           0.69   0.67   0.69     0.66           0.64                 0.64      0.66
Weighted                 0.72   0.72   0.75     0.72           0.74                 0.73      0.71

TF-IDF
Category                 BNB    MNB    XGBoost  Random Forest  Logistic Regression  Lightgbm  Linear SVM
Pilot (176)              0.65   0.55   0.70     0.68           0.68                 0.64      0.73
Vessel to Vessel (264)   0.71   0.51   0.77     0.71           0.69                 0.73      0.74
Heaving up Anchor (95)   0.42   0.04   0.58     0.41           0.57                 0.52      0.64
Vessel to VTS (641)      0.70   0.66   0.69     0.69           0.68                 0.69      0.69
Shifting (295)           0.71   0.65   0.72     0.71           0.74                 0.68      0.74
Arrival (249)            0.69   0.66   0.76     0.75           0.76                 0.75      0.76
Berthing (254)           0.78   0.88   0.87     0.87           0.87                 0.88      0.88
Departure (473)          0.82   0.71   0.78     0.77           0.80                 0.79      0.80
Passing (491)            0.74   0.73   0.74     0.75           0.76                 0.74      0.76
Anchoring (249)          0.69   0.73   0.72     0.68           0.71                 0.71      0.75
Crossing (113)           0.67   0.35   0.62     0.64           0.66                 0.59      0.70
Weighted                 0.72   0.65   0.73     0.72           0.74                 0.72      0.75

Table 5: Weighted-F1 score of Machine Learning-Based Models

The results of the analysis demonstrated that the BOW XGBoost and TF-IDF SVM models showed the highest levels of performance. Among the ensembles, the Stacking Multinomial Naive Bayes model showed the best and most stable performance.

Category                 BOW XGBoost   TF-IDF SVM   Stacking Lightgbm   Stacking MNB   Stacking XGBoost
Pilot (176)              0.69          0.73         0.70                0.72           0.68
Vessel to Vessel (264)   0.79          0.74         0.77                0.81           0.77
Heave up anchor (95)     0.65          0.64         0.65                0.68           0.67
VTS to Vessel (641)      0.70          0.69         0.71                0.70           0.70
Shifting (295)           0.76          0.74         0.72                0.75           0.73
Arrival (249)            0.76          0.76         0.75                0.78           0.74
Berthing (254)           0.88          0.88         0.90                0.90           0.90
Departure (473)          0.80          0.80         0.82                0.81           0.81
Passing (491)            0.76          0.76         0.75                0.76           0.76
Anchoring (249)          0.75          0.75         0.77                0.76           0.79
Crossing (113)           0.69          0.70         0.71                0.68           0.70
Weighted                 0.75          0.75         0.76                0.77           0.76

Table 6: Weighted-F1 score of Ensemble Models
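The stacking scheme above can be sketched with scikit-learn's StackingClassifier. XGBoost and LightGBM are replaced here by dependency-free stand-ins (Random Forest, logistic regression, Gaussian NB) so the sketch stays runnable, and the synthetic data is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Synthetic multi-class data standing in for per-sequence BOW features.
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, n_clusters_per_class=1,
                           random_state=0)

base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("gnb", GaussianNB()),
]
# The base models' predicted class probabilities become the input
# features of the meta-classifier, mirroring the paper's
# Stacking Multinomial Naive Bayes configuration.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=MultinomialNB(),
                           stack_method="predict_proba", cv=5)
stack.fit(X, y)
train_acc = stack.score(X, y)
```

Multinomial NB is a valid meta-classifier here because the stacked features are probabilities and therefore non-negative.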


4.3. With Time Information
To reflect time information, the following prediction method was adopted: the model was first trained with all tokens in each sequence, as described in Section 3.2.2, and then, at prediction time, the new tokens of each sentence were added to the token list per time step. The five models described in Table 6 (BOW XGBoost, TF-IDF SVM, Stacking LightGBM, Stacking Multinomial NB, and Stacking XGBoost) were evaluated in this way. In all five models, the weighted F1 scores are clearly low in the first and second time steps, whereas from the third time step onwards they increase dramatically, as indicated in Figure 5.

Time step   2-LSTM layers   word2vec mean   word2vec * TF-IDF   doc2vec   Fasttext
1           0.280           0.313           0.306               0.353     0.310
2           0.359           0.354           0.354               0.393     0.374
3           0.835           0.795           0.820               0.845     0.808
4           0.886           0.823           0.860               0.893     0.851
5           0.916           0.835           0.884               0.923     0.876
6           0.901           0.792           0.869               0.929     0.859
7           0.886           0.797           0.891               0.950     0.869
8           0.888           0.784           0.887               0.949     0.872
9           0.877           0.762           0.901               0.934     0.853
10          0.904           0.749           0.903               0.965     0.873
11          0.893           0.668           0.908               0.964     0.860
12          0.879           0.744           0.887               0.960     0.870
13          0.876           0.750           0.912               0.982     0.884
14          0.884           0.756           0.914               0.921     0.863
Mean        0.885           0.771           0.887               0.935     0.861
(after step 3)

Table 7: Weighted-F1 score of Deep Learning-Based Models per time step


The top three labels exceeding a given threshold were checked by calculating the classification probability at each time step. If the target label is included in the top three labels, the case is regarded as 'correct'. Under the condition that the threshold is 0, the Stacking Multinomial Naive Bayes model scored the highest; that is, it showed the highest accuracy when the threshold is not considered. The accuracy results are as follows: 0.670 at time step 2, 0.954 at time step 3, 0.975 at time step 4, and 0.996 at time step 5; beyond time step 5, most sequences are matched with the correct categories.

Figure 5: Weighted F-1 scores per time step

Figure 6: Weighted F-1 scores per time step for each threshold
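The 'top three labels' check can be sketched as follows. This is our reading of the procedure: a time step counts as correct when the target category is among the three most probable classes whose probability exceeds the threshold.

```python
import numpy as np

def top3_correct(probs, target, threshold=0.0):
    """probs: per-class probabilities at one time step; target: true index."""
    order = np.argsort(probs)[::-1][:3]                # three most probable
    kept = [i for i in order if probs[i] > threshold]  # apply the threshold
    return target in kept

# Hypothetical 5-class probability vector for one time step.
probs = np.array([0.05, 0.40, 0.25, 0.20, 0.10])
in_top3 = top3_correct(probs, target=2)    # class 2 ranks 2nd -> counted correct
out_top3 = top3_correct(probs, target=0)   # class 0 ranks last -> not correct
```

Raising the threshold shrinks the candidate set, which is why accuracy per time step falls as the threshold grows in Figure 6.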



5. Conclusion
In this research, a maritime VHF communication support framework, which has not been attempted before, was developed. Maritime communication has been hugely influenced by individual speakers' linguistic characteristics, such as their past experience and language peculiarities (e.g. intonation, accent, and pronunciation), even though internationally standardised guidelines and manuals exist. In this sense, the VHF communication support framework developed in this research will be highly effective and practical when converged with real onboard communication systems, considering that it automatically analyses the target VHF communicative situation that deck officers are in and then suggests the most suitable phraseology to them, regardless of their communication proficiency. However, the following factors should be considered further in order to enhance the system's quality and apply it to an actual navigational environment in a more practical manner. First, more advanced functions, such as speech recognition, need to be added to maximise its practicality, given that most communication at sea depends on voice over VHF rather than text. At present, the development of speech recognition algorithms for maritime communication is restricted by the low quality of VHF transmission, but this is expected to improve dramatically when the Korean SMART e-Navigation system is introduced, which enables high-quality voice exchange via an LTE network up to approximately 100 km from shore. Second, more language options, such as English, Japanese, and others, should be added and made mutually interchangeable according to speakers' and listeners' mother tongues and/or preferred languages. It is highly expected that this system will help mariners deal with inadequate communicative behaviour and possible human errors in VHF exchanges in a way that ensures the efficiency and safety of shipping.

References

[1] J. E. Hagen, Implementing e-Navigation, Artech House, 2017.
[2] I. Acejo, H. Sampson, N. Turgo, N. Ellis, L. Tang, The causes of maritime accidents in the period 2002-2016 (2018).
[3] M. Lochner, A. Duenser, M. Lutzhoft, B. Brooks, D. Rozado, Analysis of maritime team workload and communication dynamics in standard and emergency scenarios, Journal of Shipping and Trade 3 (2018) 2.
[4] D. Patraiko, P. Wake, A. Weintrit, e-Navigation and the human element, in: Marine Navigation and Safety of Sea Transportation, CRC Press, 2009, pp. 55-60.
[5] S.-B. Kim, K.-S. Han, H.-C. Rim, S. H. Myaeng, Some effective techniques for naive Bayes text classification, IEEE Transactions on Knowledge and Data Engineering 18 (2006) 1457-1466.
[6] H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, C. Watkins, Text classification using string kernels, Journal of Machine Learning Research 2 (2002) 419-444.
[7] F. Sebastiani, Machine learning in automated text categorization, ACM Computing Surveys (CSUR) 34 (2002) 1-47.
[8] J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling, arXiv preprint arXiv:1412.3555 (2014).
[9] Y. Kim, Convolutional neural networks for sentence classification, arXiv preprint arXiv:1408.5882 (2014).
[10] S. Lai, L. Xu, K. Liu, J. Zhao, Recurrent convolutional neural networks for text classification, in: AAAI, volume 333, pp. 2267-2273.
[11] Y. Bengio, R. Ducharme, P. Vincent, C. Jauvin, A neural probabilistic language model, Journal of Machine Learning Research 3 (2003) 1137-1155.
[12] A. Mnih, G. Hinton, Three new graphical models for statistical language modelling, in: Proceedings of the 24th International Conference on Machine Learning, ACM, pp. 641-648.
[13] T. Mikolov, Statistical language models based on neural networks, Presentation at Google, Mountain View, 2nd April (2012).
[14] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, P. Kuksa, Natural language processing (almost) from scratch, Journal of Machine Learning Research 12 (2011) 2493-2537.
[15] N. Takagi, P. John, A. Noble, B. Brooks, VTS-Bot: using chatbots in SMCP-based maritime communication, in: Japan Institute of Navigation Conference 2016, pp. 1-4.
[16] P. John, J. Appell, J. Wellmann, Practising verbal maritime communication with computer dialogue systems using automatic speech recognition (my practice session).
[17] L. White, R. Togneri, W. Liu, M. Bennamoun, How well sentence embeddings capture meaning, in: Proceedings of the 20th Australasian Document Computing Symposium, ACM, p. 9.
[18] Q. Le, T. Mikolov, Distributed representations of sentences and documents, in: International Conference on Machine Learning, pp. 1188-1196.
[19] S. George, S. Joseph, Text classification by augmenting bag of words (BOW) representation with co-occurrence feature, IOSR Journal of Computer Engineering 16 (2014) 34-38.
[20] A. Joulin, E. Grave, P. Bojanowski, T. Mikolov, Bag of tricks for efficient text classification, arXiv preprint arXiv:1607.01759 (2016).
