Towards Automated Share Investment System Dymitr Ruta1 British Telecom, Reserach & Venturing, Adastral Park, Orion Building, 1st floor pp12, Martlesham Heath, Ipswich IP5 3RE, UK,
[email protected]
ABSTRACT Predictability of financial time series (FTS) is a well-known dilemma. A typical approach to this problem is to apply a regression model, built on the historical data and then further extend it into the future. If however the goal for FTS prediction would be to support or even make investment decisions, prediction generated by regression-based models are inappropriate as on top of being uncertain and unnecessarily complex, they require lots of investor attention and further analysis to make an investment decision. Rather than precise time series prediction, a busy investor may prefer a simple decision on the current day transaction: buy, wait, sell, that would maximise his return from the investment. Based on such assumptions a classification model is proposed that learns the transaction patterns from optimally labelled historical data and accordingly gives the profit-driven decision for the current-day transaction. The model is embedded into an automated client-server platform which automatically handles data collection and maintains client models on the database. The prototype of the system was tested over 20 years of NYSE:CSC share price history showing substantial improvement of the long-term profit compared to a passive long-term investment. KEY WORDS Financial Time Series, Classification, Investment, Decision Support
1 Introduction Prediction of the financial time series represents a very challenging signal processing problem. Many scientist consider FTS as very noisy, non-stationary and non-linear signal but believe that it is at least to a certain degree predictable [3], [6]. Other analysis suggest that a financial market is selfguarded against predictability as whenever it shows some signs of apparent
2
Dymitr Ruta
predictability, investors immediately attempt to exploit the trading opportunities, thereby affecting the series and turning it unpredictable [2]. Stable forecasting of FTS seems therefore unlikely to persist for longer periods of time and will self-destruct when discovered by a large number of investors. The only prediction model that could be successful and sustainable seems to be the one that exploits the supportive evidence either hidden to other investors or the evidence that is available but highly dispersed among many sources and therefore considered irrelevant, too difficult or too costly to be incorporated in the prediction model. Despite this seemingly obvious rationale a number of techniques is being developed in an attempt to predict what seems unpredictable: tomorrow’s share price based on historical data. Starting from simple linear Autoregressive Moving Average models (ARMA) [3] through conditional heteroscedastic models like ARCH or GARCH [3] up to the complex non-linear models [3], [4], the idea is similar: establish the regression-based description of the future samples based on the historical data series. More recently a number of machine learning techniques started to be applied to a financial forecasting and on a number of occasions showed considerable improvement compared to a traditional regression models [5], [6], [7]. Neural networks are shown to be particularly good at capturing complex non-linear characteristics of FTS [5], [6]. Support vector machines represent another powerful regression technique that immediately found applications in financial forecasting [8], [7]. While there is already extensive knowledge available in pattern recognition domain, it has been rarely used for financial time series prediction. The major problem lies in the fact that classification model learns to categorise patterns into crisp classes, rather than numerical values of the series. A temporal classification models would have to provide a specific definition of classes or obtain it from the series by discretisation. Although some work has already been done in this field [10], [9], [11], [14] there is still a lack of pattern recognition based models that would offer immediate investment applications surpassing in functionality and performance the traditional regression based models. While the reason for using prediction models for share market investment is obvious, given the future price prediction, an investor still needs time to analyse this information before making an investment decision. Moreover, it is well known in the share market technical analysis, that humans tend to be inconsistent in sticking to fixed rules due to emotion factors that affect their investment decisions [13]. Addressing these weaknesses, rather than predicting share price prInvestor uses a classification model that learns from expandable historical evidence how to categorise the future series into investment actions: buy, wait or sell. The proposed prInvestor prototype is a step forward towards a fully automated pattern recognition based investment system. It is designed as a fully automated client-server application where server handles all analytical requests and maintains multiple client profiles while client supported with relevant presentation layer only sends the requests and collects responses.
Towards Automated Share Investment System
3
prInvestor is tested on the 20-years of daily share price series and the results are analysed to give recommendations towards prospective fully automated platform development. The remainder of the paper is organised as follows. Next section provides a detailed analysis of the proposed temporal classification with prInvestor, specifying the investment cycle, an algorithm for optimal labelling of training data, feature extraction process and the classification model used. Section 3 presents the results of experiments evaluating the performance of the prInvestor system. The concluding remarks and some suggestions for model refinement are shown in the closing Section 4.
2 Temporal classification with prInvestor Classification represents a supervised learning technique that tries to correctly label the patterns based on multidimensional set of features [1]. The model is fully built in the training process carried out on the labelled dataset with a maximally discriminative set of features. Based on the knowledge gained from the training process, a classifier assigns the label to a new previously unseen pattern. Adapting the pattern recognition methodology, prInvestor would have to generate the action label: buy, sell or wait to the current day feature values based on the knowledge learnt from historical data. For that to be possible the training data have to be optimally labelled such that the investments corresponding to the sequence of labels generate the maximum return possible. Furthermore, to maximise the discrimination among classes, prInvestor should exploit the scalability of pattern recognition models and use as many relevant features as possible, far beyond just the historical share price series. All the properties mentioned above along with some mechanisms controlling model flexibility and adaptability are addressed in the presented prInvestor system. 2.1 Investment cycle In the simplified investment cycle, considered in this paper, the investor is using all assets during each transaction which means he buys shares using all the available cash and sells all shares at once. Assuming this simplification, there are four different states an investor can fall into during the investment cycle. He enters the cycle at the state ”wait with money” (WM), where the investor is assumed to possess financial assets and is holding on in the preparation for a ”good buy”. From there he can either wait at the same state WM or purchase the shares at the actual price of the day thereby entering the ”buy” state (B). From state B investor can progress to two different states. Either he enters ”wait with shares” state (WS) preparing for the good selling moment, or he sells immediately (the day after the purchase) all the shares transferring them back to money at the ”Sell” (S) state. If the investor chooses to wait
4
Dymitr Ruta
with shares (WS), he can stay at this state or may progress only to the state S. The sell state has to be followed by either the starting ”wait with money” state WM or directly buy state (B) that both launch the new investment cycle. The complete investment cycle is summarised by the conceptual diagram and accompanied directed cyclic graph both shown in Figure 1. Immediate consequence of the investment cycle is that the labelling sequence is highly constrained by the allowed transaction paths i.e. obeying the following sequentiality rules: • • • •
WM may follow only with WM or B B may follow only with WS or S WS may follow only with WS or S S may follow only with WM or B
The above rules imply that for the current day sample the system has to pick only one out of two labels depending on the label from the previous sample. That way the 4-class problem is locally simplified to the 2-class classification problem. In a real-time scenario it means that to classify a sample, the model has to be trained on the dynamically filtered training data from only two valid classes at each sample. If the computational complexity is of concern, retraining at each sample can be replaced by 4 fixed models that can be built on the training data subsets corresponding to to 4 combinations of valid pairs of classes as stated in the above sequentiality rules.
Fig. 1. Visualisation of the investment cycle applied in prInvestor. The wait state has been separated into 2 states as they have different preceding and following states.
Towards Automated Share Investment System
5
2.2 Training set labelling Classification, representing supervised learning model, requires labelled training data for model building and classifies incoming data using available class labels. In our share investment model, the training data initially represent unlabelled daily time series that has to be labelled using four available states WM, B, S, WS, subject to sequentiality rules. A sequence of labels generated that way determines the transaction history and allows for a calculation of the key performance measure - the profit. In a realistic scenario each transaction is subjected to a commission charge, typically set as a fixed fraction of the transaction value. Let xt for t ∈ 0, 1, .., N be the original share price series and c stand for the transaction commission rate. Assuming that b and s = b + p, where p ∈ 1, .., N − b, denote the buying and selling indices, the relative capital after a single buy-sell investment cycle is defined by: xs 1 − c Cbs = (1) xb 1 + c Note that the same equation would hold if the relative capital was calculated in numbers of shares resulting from a sell-buy transaction. Assuming that there are T cycles in the series let b(j) and s(j) for j = 1, .., T denote indices of buying and selling at j th cycle such that xb(j) and xs(j) stand for buy and sell prices in j th cycle. Then the relative capital after k cycles (0 < k ≤ T ) can be easily calculated by: s(k)
Cb(1) =
k Y
s(j)
Cb(j)
(2)
j=1
The overall performance measure related to the whole series would be then the closing relative capital, which means the relative capital after T transactions: s(T )
CT = Cb(1)
(3)
Given CT the absolute value of the closing profit can be calculated by: P = C0 (CT − 1)
(4)
where C0 represents the absolute value of the starting capital. Finally, to be consistent with the investment terminology one can devise the return on investment performance measure, which is an annual average profit related to initial investment capital: t
i AN N h P AN N s(T ) s(T )−b(1) −1 = Cb(1) R= C0 where tAN N stands for the average number of samples in 12 months.
(5)
6
Dymitr Ruta
The objective of the model is to deliver an optimal sequence of labels, which means the labels that through the corresponding investment cycles generate the highest possible profit. Such optimal labelling is only possible if at the actual sample to be labelled, the knowledge of the future samples (prices) is available. Although reasoning from the future events is forbidden in the realistic scenario, there is no harm of applying it to the training data series. The classification model would then have to try to learn the optimal labelling from historical data and use this knowledge for classification of new previously unseen samples. An original optimal labelling algorithm is here proposed. The algorithm is scanning the sequence of prices and subsequently finds the best buy and sell indices labelling the corresponding samples with B and S labels respectively. Then all the samples in between of B and S labels are labelled with WS label and all the samples in between of S and B labels are labelled with WM label as required by the sequentiality rules. The following rules are used to determine whether the scanned sample is identified as optimal buy or sell label: Sample xb is classified with the label B (optimal buy) if for all samples xt (b < t < s) between sample xb and the nearest future sample xs b, s ∈ 1, .., N ∩ b < s, at which the shares would be sold with a profit (Cbs > 1), the capital Cts < Cbs . • Sample xs is classified with the label S (optimal sell) if for all samples xt (s < t < b) between sample xs and the nearest future sample xb b, s ∈ 1, .., N ∩s < b, at which the shares would be bought increasing the original number of shares (Cbs > 1), the capital Cbt < Cbs .
•
The Matlab implementation of the above rules together with complete code for optimal labelling algorithm can be found in Appendix A. It has to be made clear at this point that the optimal labelling algorithm is used for only one but key purpose: to label a training dataset. Labelled training dataset defines the whole learning process in which classifier would have to establish the relationship between the action labels and the historical data.
3 Feature extraction The data in its original form represents only a share price time series. Optimal labelling process adds labels on top of prices. Extensive research dedicated to time series prediction [3] proves that building the model solely on the basis of historical data of the FTS is very uncertain as it exhibits considerable proportion of a random crawl. At the same time an attractive property of classification systems is that in most of the cases they are scalable, which means they can process large number of features in a non-conflicting complementary learning process [1]. Making use of these attributes, prInvestor takes the original share price series, the average transaction volume series as well as the label series as the basis for the feature generation process.
Towards Automated Share Investment System
7
Details of the family of features used in prInvestor model are listed in Table 1. Apart from various mixtures of typical moving average and various orders defferencing features, there are new features (plf , ppf ) that exploit the labels of past samples in their definition. The past label series is used as feature in a raw form (plf ) as well as in generation of the prospective transaction profit feature (ppf ). PPF determines the share price difference between current day and the latest buy or sell transaction made by investor. The use of labels as features might draw some controversies as it imposes that current model outputs depend on its previous outputs. This is however truly a reflection of the fact that investment actions strongly depend on the previous actions as for example if a good buy moment was missed the following good sell point could no longer be good or even could fall into WM or B class. Incorporation of the dependency on previous system outputs (labels) injects also a needed element of the flexibility to the model such that after a wrong decision, it could quickly recover rather than make further losses. Another consequence of using past labels as features is the high non-linearity and indeterminism of the model and hence its limited predictability. Various configurations of features from the families of features listed in Figure 1 are evaluated in the experimental section and the optimal subset filtered out as a result of a feature selection procedure details of which are presented in Section 6. It is important to note that the features proposed in the prototype of prInvestor model are just a proposition of simple, cheap and available features which by no means form the optimal set of features. In fact as the series is time related, countless number of features starting from the company’s P/E ratio or economy strength indicators up to type of weather outside or the investment mood could be incorporated. The problem of generation and selection of the most efficient features related to the prInvestor model is by far open and will be considered in more detail in the later version of prInvestor. Name prc vol mvai (x) atdi (x) difi (x) plfi ppf
Description Average daily share price Daily transactions volume Moving average - mean from i last samples of the series x Average difference between the current value of x and mvai (x) Series x differenced at ith order The past label of the sample taken i steps before the current sample. Difference between the current price and the price at B or S labels Table 1. A list of features used in the prInvestor model.
8
Dymitr Ruta
3.1 Feature selection Even in its initial prototype stage, preliminary experiments indicated very high sensitivity of performance to the selection of feature subsets. Moreover future additions of different classification models to the system typically work well with different subsets of features, hence the system requires incorporation of a simple evaluative classifier selection method that optimises the selection to the classifier model it selects features for. One of the powerful algorithms fulfilling these requirements is a probability based incremental learning (PBIL) method [12], [15]. It shares some similarities with evolutionary algorithms, by operating on a population of enumerated solutions - chromosomes, which are vectors of 0-1 incidences, where 1 on the k th position means that k th feature is selected while 0, that is not. The chromosomes are sampled from a special probability vector, which is updated at each step according to the fittest chromosomes (best performing set of features). The update process of the probability vector is performed according to a standard supervised learning method. Given the probability vector p = [p1 , ..., pM ]T , and population of chromosomes P = [v1 , ...vC ], where vj = [ωj1 , ..., ωjM ]T , each probability bit is updated as in the following expression: pnew i
=
pold i
+ ∆pi
∆pi = η
µ PC
j=1
C
ωji
− pi
¶
(6)
where j = 1, ..., C, i = 1, ..., M refers to the C fittest chromosomes found and η controls the magnitude of the update. A number of best chromosomes taken to update the probability vector together with the magnitude factor η control a balance between the speed of reaching convergence and the ability to explore the whole search space. According to the standard algorithm, the only information that remains after each step is the probability vector, from which the chromosomes are generated. Convergence achieved when: ∆pi → 0, implies that pi → ωji for all the chromosomes in the population, which means that probability vector becomes the the final solution of the search process. The complete PBIL algorithm can be described in the following steps: 1. Create a probability vector of the same length as the required chromosome and initialise it with values of 0.5 at each bit. 2. Create a population of chromosomes according to the probability vector. 3. Evaluate fitness of sample by calculating MVE for the combinations defined by the chromosomes. 4. Update the probability vector using (6). 5. If all elements in probability vector are 0 or 1 then finish, else go to step 2.
Towards Automated Share Investment System
9
4 Classification model Given the set of features, labelled training set and the performance measure, the model needs only a relevant classifier that could learn to label the samples based on optimal labels and the corresponding historical features available in the training set. Before the decision on the classifier is made, it is reasonable to consider the complexity and adaptability issues related to prInvestor working in a real-time mode. Depending on the choice of training data there are three different modes prInvestor can operate on. In the most complex complete mode the model is always trained on all available data to date. At each new day the previous day would have to be added to the training set and the model retrained on typically immense dataset covering all available historical evidence. In the fixed mode the model is trained only once and then used in such fixed form day by day without retraining that could incorporate the new data. Finally in the window mode each day the model is retrained on the same number of past samples. Undoubtedly the model is fastest in fixed mode, which could be a good short-term solution particularly if complexity is of concern. Complete mode offers the most comprehensive training, however at huge computational costs and poor adaptability capabilities. The most suitable seems to be the window mode in which the model is fully adaptable and its complexity can be controlled by the window width. Given relatively large datasets and the necessity of retraining, it seems reasonable to select a rather simple easily scalable classifier that would not be severely affected by the increase in sizes of both feature and data sets. Initially prInvestor prototype is featured with the simple quadratic discriminant analysis (QDA), decision tree (DT) and the k-nearest neighbours classifiers details of which can be found in [1]. These classifiers seem to accommodates the simplicity properties mentioned above while being still capable of capturing some non-linearities. As classification is a relatively consistent process, this set of classifier models can be appended by more complex models at latter stage. It is important to note that given a day lag of the series, there is plenty of time for retraining even using large datasets. The model is therefore open for more complex classifiers or even the mixture of classifiers that could potentially improve its performance. The three simple classifiers used in this work, due to its simplicity are particularly useful in this early prototyping stage where the experimentation comprehensiveness is the top priority rather than maximum possible performance. Moreover, simple classifiers are also preferred for shorter-lag series where the retraining might be necessary every hour or minute.
5 Client-server platform implementation From the technical point of view prInvestor’s capability can be efficiently exploited in a relevant client-server architecture. The server traditionally handles
10
Dymitr Ruta
all analytical tasks and the communication with a database where the data are read from and processed results written to. Multiple remote clients can be served simultaneously by means of widespread HTTP protocol inline with the internet communication channel. The schematic architecture of the prInvestor system is shown in Figure 2.
Fig. 2. Client-server architecture of the prInvestor system.
The Data Mining Server (DMS) performs all computational and database interaction realised by the prInvestor system. It maintains client accounts which can have various privilege levels with respect to the access to the shares data and data processing requests. A user logged on to administrator account can manage all the client accounts which includes, adding and removing accounts, resetting their usernames and passwords and setting various privilege levels. DMS has a direct, secure and fast connection with the relational database containing daily shares data updated overnight. If necessary on client request the user can perform various preprocessing procedures like data cleaning and aggregation supported by the relevant GUI. Given complete source data the user has to decide upon the set of features he wants to use or prInvestor picks a default set selected from the list of available features offered by the system. The list of selected features is then fixed to the user untill he decides to change it. prInvestor that are offered by the prInvestor system and this set is then fixed to the user ever since. The user has also the facility to redesign or filter the data to his individual needs by means of relevant SQL queries which are send to and run by the server and then saved either locally or on the database depending on privilege profile of the user. The clients can request analytical task at any time. The server manages them via priority queuing system taking into account estimated task duration, resource usage, user priority level etc. For each user the server creates a table in a database where the history of his transactions is kept. If no transactions is made by the user in particular day, the system automatically updates wait
Towards Automated Share Investment System
11
action states in the user tables. During the periods of less active server usage, the server automatically retrains the models for the users according to their priority and stores the predicted action for the next day to be proposed to users if they log in on that day. Before the model training the server pulls the shares data from the database and joins them with optimal labels returned as a result of the optimal labelling algorithm. To minimise analytical effort, optimal label is appended each day after the database is updated with new data. After retraining the models are stored within particular user schemas and are ready to handle client requests. In case of extremely high server usage during the week, retraining of models can be postponed to the weekend, which would mean that prInvestor model would operate in the fixed mode on the short-term and moving window mode on the long-term. In a response to client action prediction request the server performs the following actions: • • • • •
Retrieves the shares data from the database or receives them in the request from the user Appends the data by the features using user action labels Retrieves the user classification model Applies the model to the joined data point(s) Sends the action label(s) to the client
6 Experiments Extensive experimentation work has been carried out to evaluate prInvestor. Specifically prInvestor was assessed in terms of the relative closing capital compared to the relative closing capital of the passive investment strategy of buying the shares at the beginning and selling at the end of the experimental series. Rather than a large number of various datasets only one dataset has been used but covering almost 20 years of daily average price and volume information. The dataset represents Computer Sciences Corporation average daily share price and volume series since 1964 to 1984, available at the corporation website (www.csc.com). Initially the dataset has been optimally labelled for many different commission rates, just to investigate what is the level of maximum possible oracle-type return from the share market investment. The experimental results shown in Figure 3(b) reveal surprisingly large double-figure return for small commission charges (< 1%) falling sharply to around 0.5 (50%) for high commission charges (> 10%). Relating this information to the plot of the original series shown in Figure 3(a) it is clear that to generate such a huge return the algorithm has to exploit all possible price rises that exceed the profitability threshold determined by the commission rate. It also indicates that the most of the profit is generated on small but frequent price variations (±2%), which
12
Dymitr Ruta
in real-life could be considered as noise and may not be possible to predict from the historical data. The maximum possible annual return or the profile presented in Figure 3(b) can be also considered as a measure describing potential speculative investment attractiveness of the corresponding company. High values of such measure would give a founded hope that even a small fraction of the optimal transactions if learned from the historical evidence would still generate a considerable profit. Shape analysis of the return profile from Figure 3(b) could provide even more detailed information on this issue. Due to many varieties of prInvestor setup and a lack of presentation space in this paper the experiments have been carried out in the moving-window mode as a balanced option featuring reasonable flexibility and adaptability mechanisms. The choice of optimal width of the moving window was aided by a specifically designed experiment. All the features were tested individually by a simulation of the real time investment over 20 years of data for training windows fixed at 6 months, 1, 2 and 4 years. The resulting profit performances are compared with passive investment strategy in Table 2. The expectation of poor performance of the classification system with just a single feature has been confirmed for most of the cases. Nevertheless for a number of features the closing capital was comparable with the passive investor closing capital and for differenced moving average applied to share price dif (mva(p20)) prInvestor doubled the closing capital of the passive investor. Then, for each of the window width fixed at: 6 months, 1, 2 and 3 years, prInvestor has been run 100 times using QDC classifier and a random subset of features. The discriminant function of the QDC classifier is visualised for two sample features in Figure 4. The plot shows data points projected on the superposition of the class discriminant functions and the resulting class boundaries on the bottom plane. Details of the visualisation method used in Figure 4 can be found in [16]. The performances obtained in a form of returns related to the passive investment returns indicated that the width of 2 years (around 500 days) is optimal for the particular pair of dataset-classifier. Having decided on the 2 years moving window mode prInvestor was further tuned by selection of the most relevant features. The PBIL algorithm described in section 3.1 was applied to search for the optimal subset of features listed in Table 2. PBIL has been run with 50-element population of randomly initialised chromosomes and after around 40 generations the algorithm converged resulting in a optimal subset of 21 features which are indicated by the + mark preceding feature names in Table 2. The model was then trained on the first 500 days (2 years window) that have been optimally labelled. In the next step the investment simulation was launched, in which at each next day the model generated the transaction label based on the training on the preceding 500 optimally labelled samples. The resulted sequence of transaction labels represents complete output of the model. Figure 5(a) illustrates the transactions generated by the prInvestor model while Table 3 shows the performance results. Important point is that most of investment cycles generated by prInvestor were profitable. Moreover
Towards Automated Share Investment System
13
the occasional loss cycles occur mostly during bessa and are relatively short in duration. The model is quite eager to invest during hossa, which seemed to be at least partially picked from the historical data. Numerical evaluation of the prInvestor depicted in Figure 5(b), shows more than 5 times higher closing capital than for the case of passive investor who buys at the beginning and sells at the end of 20 years period. Such remarkable results correspond on average to almost 20% annual return from investment and give a good prospect for the development of the complete investment platform with a number of carefully developed features and the data incoming automatically to the system for daily processing resulting in a final investment decision.
7 Conclusions prInvestor is a proposition of the intelligent share investment system. Based on extensive historical evidence it uses pattern recognition mechanisms to generate a transaction decision for the current day: buy, sell or wait. The advantage of this model over the existing techniques is that it is capable of incorporating in a complementary, non-conflicting manner various types of evidence beyond just the historical share price data. The proposed system benefits further from optimal labelling algorithm developed to assign the transaction labels to the training series such that the maximum possible return on investment is achieved. The model features basic flexibility and adaptability mechanisms such that it can quickly recover from bad decisions and adapt to the novel trend behaviour that may suddenly start to appear. The robustness of prInvestor is demonstrated on just a few simple feature types generated upon the share price and volume series. With such a simple setup the model hugely outperformed the passive investment strategy of buying at the beginning and selling at the end of the series, and brings on average 20% annual return from investment. Despite tremendous average results the model is not always consistent and occasionally generates losses. Full understanding of this phenomenon requires deeper analysis of the role of each individual feature in the decision process. In addition there is plenty of unknowns relating to the choice of features, classifiers and the real-time classification mode. Moreover, although a prototype of the client-server implementation of prInvestor has already been proposed in this work, the system is still far from commercial use which would require portfolio management of multiple investments at the same time. The system needs also flexibility to accommodate more action types beyond just simple buy, sell and wait. Prospective real-time application of prInvestor may require drastic decrease of time granularity which in turn could impose rethinking and possibly redesign of the retraining process. All these problems and doubts will be the subject of further investigation towards fully automated, robust and commercially feasible investment platform, which should be flexible enough to work with multiple classification models. For that to happen the system would have to meet very restrictive
14
Dymitr Ruta
Feat. +prc +vol -mva(p1) -mva(v1) +mva(p5) -mva(v5) -mva(p20) +mva(v20) +mva(p50) +mva(v50) +atd(p1) +atd(v1) +atd(p5) +atd(v5) +atd(p20) +atd(v20) -atd(p50) -atd(v50) +dif(p1) -dif(v1) +dif[mva(p20)] +dif[mva(v20)] +dif(p2) +dif(v2) +dif[mva2(p20)] +dif[mva2(v20)] +plf(1) -plf(2) -plf(3) +ppf
ret.5 0.09 0.00 0.07 0.07 0.00 -0.01 0.06 0.00 0.02 0.08 -0.14 0.12 -0.07 -0.05 -0.01 -0.12 0.07 -0.02 -0.14 0.12 0.04 -0.01 0.10 0.07 -0.02 0.04 -0.09 0.02 0.03 -0.15
rcp.5 0.36 0.07 0.24 0.25 0.07 0.06 0.21 0.07 0.11 0.30 0.00 0.70 0.02 0.03 0.06 0.01 0.27 0.05 0.00 0.70 0.14 0.06 0.45 0.29 0.05 0.14 0.01 0.11 0.12 0.00
rc.5 5.16 0.95 3.43 3.48 1.01 0.82 2.94 1.01 1.55 4.28 0.06 9.89 0.26 0.39 0.79 0.09 3.83 0.74 0.06 9.89 2.04 0.85 6.40 4.05 0.72 2.03 0.16 1.60 1.76 0.04
ret1 0.07 -0.01 0.07 0.05 0.05 0.03 -0.01 0.07 0.04 0.01 -0.05 -0.11 -0.11 -0.01 0.07 -0.02 0.02 0.00 -0.05 -0.11 0.08 -0.07 0.00 0.05 0.02 0.06 -0.05 0.00 -0.05 -0.10
rcp1 0.37 0.08 0.34 0.25 0.25 0.17 0.09 0.32 0.19 0.11 0.03 0.01 0.01 0.08 0.33 0.06 0.15 0.10 0.03 0.01 0.39 0.02 0.09 0.25 0.13 0.29 0.04 0.09 0.04 0.01
rc1 3.91 0.81 3.54 2.65 2.68 1.82 0.90 3.38 2.02 1.21 0.35 0.12 0.11 0.89 3.51 0.63 1.55 1.05 0.35 0.12 4.07 0.24 0.99 2.63 1.33 3.08 0.39 1.00 0.37 0.13
ret2 0.07 -0.02 0.06 -0.02 0.06 -0.03 0.05 0.01 -0.02 0.00 -0.06 0.04 -0.05 -0.07 -0.01 -0.01 -0.13 0.01 -0.06 0.04 0.08 -0.06 -0.12 -0.06 -0.02 -0.01 0.00 0.00 0.00 -0.13
rcp2 0.77 0.16 0.68 0.15 0.59 0.12 0.57 0.26 0.15 0.23 0.08 0.43 0.10 0.06 0.20 0.20 0.02 0.26 0.08 0.43 0.85 0.07 0.02 0.07 0.16 0.17 0.22 0.22 0.22 0.02
rc2 3.49 0.71 3.10 0.67 2.67 0.56 2.59 1.17 0.69 1.05 0.36 1.97 0.44 0.27 0.90 0.90 0.08 1.19 0.36 1.97 3.86 0.33 0.10 0.31 0.75 0.79 1.00 1.00 1.00 0.08
ret4 0.00 -0.13 0.00 -0.15 0.00 -0.12 0.00 -0.08 0.00 0.00 -0.14 -0.13 -0.11 -0.13 -0.13 -0.05 -0.04 -0.02 -0.14 -0.13 0.04 -0.12 -0.15 -0.13 -0.04 -0.10 0.00 0.00 0.00 -0.15
rcp4 1.11 0.11 1.11 0.08 1.11 0.14 1.11 0.31 1.11 1.05 0.10 0.13 0.16 0.12 0.13 0.50 0.59 0.75 0.10 0.13 2.05 0.13 0.08 0.12 0.54 0.22 1.11 1.11 1.11 0.08
rc4 1.00 0.10 1.00 0.08 1.00 0.13 1.00 0.28 1.00 0.94 0.09 0.12 0.15 0.11 0.11 0.45 0.54 0.68 0.09 0.12 1.84 0.12 0.07 0.10 0.49 0.19 1.00 1.00 1.00 0.07
Table 2. Average annual returns (ret), closing capitals (rc) and closing capitals related to the passive investment closing capitals obtained for the 20-year investment simulation with individual features only. The experiments show the results for many fixed training window widths: 125, 250, 500 and 1000 days, corresponding roughly to 0.5, 1, 2 and 4 years. The + or - at the front of feature name indicates whether the feature is included or not in the optimal subset of features selected by PBIL algorithm. Investor Annual return [%] Relative closing capital (rcc) [%] Passive investor 8.9 463.8 prInvestor 19.1 2310.7 Table 3. Performance comparison between passive investment strategy and prInvestor obtained over investment simulation on 20 years of CSC share price history.
Towards Automated Share Investment System
15
6
5
CSC share price [$]
4
3
2
1
0
0
500
1000
1500
2000 2500 3000 working days since 1964
3500
4000
4500
5000
(a) CSC daily share price since 1964 22 20 18 16
annual return
14 12 10 8 6 4 2 0
0
2
4
6
8 10 12 transaction commission rate [%]
14
16
(b) Annual return as a function of transaction commission rate Fig. 3. Visualisation of the optimal labelling algorithm capability.
18
20
16
Dymitr Ruta
Fig. 4. Visualisation of the quadratic discriminant classifier generating the boundaries among classes. 4(a) shows the plot of the Gaussian-based discriminative function for the pair of features atd(p50) and ppf. 4(b) discriminative function for features dif[mva(p20)] and plf(1). The resulting class boundaries are visualised by the thick lines at the bottom of the plots
Towards Automated Share Investment System
17
6
5 sell buy CSC share price [$]
4
3
2 loss transaction 1
0
0
500
1000
1500
2000
2500
3000
3500
4000
4500
(a) prInvestor transactions 30
25
Relative capital [C/C0]
20 prInvestor 15
10
5 passive investor 0
0
500
1000
1500
2000
2500
3000
3500
4000
4500
(b) Relative capital - comparison Fig. 5. prInvestor in action: the transactions returned as a result of learning to invest with profit from historical data. 5(a) Visualisation of the transactions generated by prInvestor. 5(b) Comparison of the relative capital evolutions of the prInvestor and passive investor always keeping shares.
18
Dymitr Ruta
reliability requirements confirmed by the extensive testing across different company shares, markets, and time.
8 Appendix A. Implementation of optimal labelling algorithm The following set of Matlab functions realise the complete algorithm for optimal labelling of the training set. Given the price time series x and broker handling fees charge (eg charge=0.01 would mean 1%) the function OptimalLabelling finds the optimal sequence of labels: 0 - wait with money (WM), 1 buy (B), 2 - wait with shares (WS), 3 - sell (S), that generates the highest possible profit as a results of corresponding buy-sell transactions. The function returns the label vector labels. function [labels]=OptimalLabelling(x,charge) new_buy_id=1; n=length(x); labels=zeros(n,1); charge=(1+charge)/(1-charge); while new_buy_id