2012 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2012)

Optimizing Parameters of Algorithm Trading Strategies using MapReduce

Xiongpai Qin 1,2, Huiju Wang 1,2, Furong Li 1,2, Jidong Chen 3, Xuan Zhou 1,2, Xiaoyong Du 1,2, Shan Wang 1,2
1 Ministry of Education Key Lab of Data Engineering and Knowledge Engineering (RUC), Beijing, 100872, P.R. China
2 School of Information, Renmin University of China, Beijing, 100872, P.R. China
3 EMC-Greenplum Research China, 100084, P.R. China
[email protected], [email protected], [email protected], [email protected], [email protected], {duyong, swang}@ruc.edu.cn

Abstract — In algorithm trading, computer algorithms are used to decide the time, quantity, and direction of operations (buy, sell, or hold) automatically. To create a useful algorithm, its parameters should be optimized on historical data. However, parameter optimization is a time-consuming task due to the large search space. We propose to search the parameter combination space using the MapReduce framework, with the expectation that the runtime of optimization can be cut down by leveraging the parallel processing capability of MapReduce. This paper presents the details of our method and some experimental results that demonstrate its efficiency. We also show that a rule-based strategy, after being optimized, performs better in terms of stability than one whose parameters are arbitrarily preset, while making a comparable profit.

Keywords – algorithm trading; trading strategy; parameter optimization; MapReduce

I. INTRODUCTION

The application of telecommunication and computer techniques in finance has created numerous electronic markets, which in turn has stimulated the rise of algorithm trading. Algorithm trading is the use of computer programs for entering trading orders, in which computer algorithms decide on every aspect of the order, such as its timing, price, and quantity. In many cases, the computer can even initiate the execution of the order without human intervention [1]. A proprietary algorithm trading system has several standard modules, including trading strategies, trade execution, cash management, and risk management. Trading strategies are the core of an automated trading system. Complicated algorithms [2] are used to analyze data (price data and news data) to capture anomalies in the market, to identify profitable patterns, or to detect the strategies of rivals and take advantage of the information reaped. Different kinds of algorithms have been proposed for buy/sell decision making, such as rule-based algorithms [3], fuzzy rule-based algorithms [4], artificial neural network based algorithms [5], genetic algorithms [6], and hybrid approaches [7]. Before an algorithm is put into practical use, it must be back-tested on enough historical price data to validate and optimize it in terms of profitability, stability, etc.

In a complex algorithm, there are many parameters that need to be optimized. When the domain of each parameter has a large cardinality (number of distinct values), the number of parameter combinations, also known as the size of the search space for parameter optimization, becomes very large. For instance, in the algorithm presented below, only ten parameters need to be optimized, yet the number of combinations is around 612,360,000. Optimizing the parameters of a complex algorithm on a single scaled-up machine would be time-consuming and expensive. We have to resort to parallel processing techniques to reduce execution time.

MapReduce has drawn attention from both academia and industry since it was introduced by Google in 2004 [8]. MapReduce has proven to be a powerful parallel data processing tool, with the advantages of a simple interface, automatic parallel execution, high fault tolerance, and high scalability. It has been applied to many data-intensive and computation-intensive areas such as OLAP, machine learning and tuning [9], data mining, information retrieval, and multimedia data processing. The parameter optimization of trading strategies is both computation intensive (computation of complicated indicators) and data intensive (large volumes of historical data). Therefore, it is an ideal case for MapReduce. In this paper we investigate how to use MapReduce to speed up its execution.

The rest of the paper is organized as follows: Section 2 presents a simple rule-based trading strategy and describes the parameters to be optimized. Section 3 brings forth the proposed parameter optimization algorithm based on MapReduce. Section 4 shows the experimental results and discusses some lessons we have learned. Section 5 concludes the paper.

II. A RULE BASED TRADING STRATEGY

Designing a complex and profitable trading strategy is outside the scope of this paper; a simple yet effective rule-based trading strategy is presented instead. The philosophy we follow in the design is: (1) no one can forecast what will happen in the future; (2) what we can do is identify the trend and jump on it; (3) we are trying to beat others in the market, not the market itself.

Supported by National Natural Science Foundation under Grant no. 61170013

978-1-4673-0024-7/10/$26.00 ©2012 IEEE

The strategy is built around moving averages of the price over some specific time frame (10 seconds, 1 minute, 3 minutes, and so on). A dozen rule groups are used in decision making. Several indicators calculated from the price data are matched against every rule group one by one. If some rule group is matched, a buy or a sell signal is generated; otherwise, a holding signal (no new buy/no new sell) is generated. A rule group can contain two rules (a basic rule and a confirmation rule) or a single rule (a basic rule). The basic rule is used to generate the operation signal (buy or sell), and the confirmation rule is used to screen out false signals; a signal that can be expected to make a profit is a true signal. After a buy or a sell signal is generated, the quantity of the order is computed using the cash management and risk management policies, and the order is executed immediately. A stop-loss order accompanies the executed order in case the trend has been wrongly identified and the price moves in the opposite direction. Two of the rules are described in detail as follows.

The Moving Average Convergence Divergence (MACD) indicator is the most commonly used indicator in rule-based strategies. The basic idea is that when the MACD is below the signal line and crosses it going up, a buy signal is generated; when the MACD is above the signal line and crosses it going down, a sell signal is generated; otherwise the asset should be held. The computation of the MACD requires three parameters, known as the short period, the long period, and the signal period. The short and long periods are used to compute two Exponential Moving Averages (EMA), whose difference constitutes the MACD. The signal period is used to calculate an EMA of the MACD, which acts as the signal line. All three parameters need to be optimized to achieve maximum profit.
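The MACD crossover rule described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the default period values (12, 26, 9) are conventional choices, not values taken from the paper.

```python
def ema(prices, period):
    """Exponential moving average with the usual smoothing factor 2/(period+1)."""
    alpha = 2.0 / (period + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

def macd_signals(prices, short_period=12, long_period=26, signal_period=9):
    """Return one buy/sell/hold signal per time step from MACD crossovers."""
    short_ema = ema(prices, short_period)
    long_ema = ema(prices, long_period)
    macd = [s - l for s, l in zip(short_ema, long_ema)]
    signal_line = ema(macd, signal_period)
    signals = ["hold"]  # no crossover can be detected at the first step
    for t in range(1, len(prices)):
        crossed_up = macd[t - 1] < signal_line[t - 1] and macd[t] >= signal_line[t]
        crossed_down = macd[t - 1] > signal_line[t - 1] and macd[t] <= signal_line[t]
        if crossed_up:
            signals.append("buy")
        elif crossed_down:
            signals.append("sell")
        else:
            signals.append("hold")
    return signals
```

On a V-shaped price series, the MACD starts below the lagging signal line during the decline and crosses above it once the price turns up, producing a buy signal at the reversal.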

Relative Strength Index (RSI) is an oscillation indicator, showing the potential moving direction of the price of a financial asset. If the RSI is initially greater than an upper threshold (normally 80) and falls below the threshold, a sell signal is generated; if the RSI is initially less than a lower threshold (normally 20) and rises above the threshold, a buy signal is generated. There are two parameters in this rule that should be optimized, namely the upper threshold and the lower threshold. For some financial assets, 80 and 20 are not necessarily good thresholds. In general, the upper threshold is taken from the values in [65, 100] with a step size of 3, and the lower threshold is taken from the values in [0, 25], also with a step size of 3. Historical prices should be used to test all combinations of the two parameters to decide which combination is best. An application of the rule on price data (from July 2011 to November 2011) of the A1201 soybean futures contract of the Dalian Commodity Exchange is depicted in Figure 1. A buy signal and a sell signal are also displayed on the chart (futures contracts can be sold short; a profit can then be made when the position is closed later, given that the price is moving down). Apparently 80 and 20 are not the best values for the upper and lower thresholds.

Figure 1. Generation of Buy and Sell Signals

In practice, basic rules such as MACD and RSI may generate false signals. Therefore another rule is used to confirm them. Confirmation rules can be built around other indicators, such as ADI (Accumulation/Distribution Index), OBV (On Balance Volume), etc. Due to space limitations, the details of these indicators and rules are not presented. The trading strategy contains 7 rules (10 parameters) in its rule base, and the number of parameter combinations is large: 612,360,000. It requires an efficient parallel algorithm for parameter optimization.

III. THE PARAMETER OPTIMIZATION ALGORITHM FOR TRADING STRATEGIES

A. The Parameter Optimization Algorithm

The price dataset is partitioned into two subsets, an optimization dataset and a test dataset. After the trading strategy is optimized using the optimization dataset, the parameter combination that performs best according to our objectives (refer to ranking function (1)) is run against the test dataset, and its performance metrics are compared against those of a baseline strategy (using parameters set by an experienced security analyst) to verify the effectiveness of the optimization (see Figure 2).

Figure 2. Trading Strategy Optimization and Testing

The computation of the first MACD indicator requires that at least a leading number (LN) of prices have been gathered (see the long period parameter in the introduction of the MACD indicator). Only after these prices have been processed are all the indicators needed for rule matching ready. As depicted in Figure 3, the price data is partitioned using an overlapping scheme, such that adjacent price data blocks overlap in at least LN time frames (1 minute, 3 minutes, etc.). A block size of 16MB (a quarter of the system default) is used in data partitioning.

Figure 3. Partition of Price Data with Overlapping
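The overlapping partitioning scheme can be sketched as follows. For simplicity, block sizes are given as a number of price points rather than the 16MB byte blocks used in the actual HDFS modification, so this is only an illustration of the scheme, not of the implementation.

```python
def partition_with_overlap(prices, block_size, leading_number):
    """Split `prices` into blocks of `block_size` points, where each block
    after the first also carries the last `leading_number` points of the
    previous block, so indicators (e.g. the long-period EMA) can warm up
    before the block's own data is evaluated."""
    blocks = []
    start = 0
    while start < len(prices):
        # The first block has no predecessor to borrow a warm-up prefix from.
        begin = max(0, start - leading_number) if blocks else 0
        blocks.append(prices[begin:start + block_size])
        start += block_size
    return blocks
```

Each pair of adjacent blocks then shares exactly `leading_number` points at the boundary, matching the overlap requirement described above.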

The trading strategy is run against the price data for performance measurement. If there are N combinations of the parameter values, N independent optimization runs over the dataset should be performed; MapReduce fits this optimization task well. For every run, some performance metrics are measured (described below). It is worth mentioning that the time interval of the price data should be long enough to cover representative market situations, including bullish, bearish, and sideways markets. To demonstrate the effect of overfitting on later trading performance, we use K (where K can be 10, 20, 30, ..., up to 100 percent of the N parameter combinations) sampled combinations to optimize the trading strategy.

The algorithm is comprised of two MapReduce jobs (Figure 4 and Figure 5). The first job enumerates the parameter value combinations and invokes the second job once for every combination; the second job uses the designated parameter value combination to run the trading strategy and measures the necessary performance metrics.

MapReduce Job (Optimizer)
1. Create the permutations of parameter combinations.
2. Sample K combinations from the permutations.
3. For every combination in the sample, launch the second job using the parameter values and gather the <P, M> pair of every combination.
4. Rank the <P1, M1>, <P2, M2>, ..., <Pn, Mn> list according to equation (1).
5. Select the best parameter combination according to the ranking.
Note: steps 1, 2, 3 are performed by a map function; steps 4, 5 are performed by a reduce function. P: parameter list, M: performance metrics of the parameter list.

Figure 4. The Outside MapReduce Job

MapReduce Job (Strategy Running)
1. Set the parameters of the trading strategy using the list of values.
2. Run the trading strategy against the price data and generate buy or sell signals.
3. Gather the signals, sort them by timestamp, and eliminate duplicate signals.
4. Decide on the trading quantity using the cash management and risk management policies, execute the order without delay, and measure the performance metrics of the trading strategy.
5. Save the parameter value list and the performance metrics.
Note: steps 1, 2 are performed by a map function; steps 3, 4, 5 are performed by a reduce function.

Figure 5. The Inner MapReduce Job
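The interplay between the two jobs can be mimicked in plain Python. Random sampling, a placeholder strategy runner, and an in-process `max` standing in for the reduce-side ranking are all simplifications of the actual Hadoop jobs in Figures 4 and 5.

```python
import itertools
import random

def optimize(param_domains, sample_ratio, run_strategy, rank):
    """Outer job (Figure 4): enumerate, sample, launch the inner job, rank.
    `run_strategy` stands in for the inner job (Figure 5) and should return
    the performance metrics M for one parameter list P."""
    # Map side: enumerate all parameter combinations and sample K of them.
    combos = list(itertools.product(*param_domains))
    k = max(1, int(sample_ratio * len(combos)))
    sample = random.sample(combos, k)
    # Inner job per combination: run the strategy, collecting (P, M) pairs.
    results = [(params, run_strategy(params)) for params in sample]
    # Reduce side: rank the (P, M) pairs and select the best parameter list.
    return max(results, key=lambda pm: rank(pm[1]))
```

With `sample_ratio=1.0` the sample covers the full search space, so the result is deterministic even though the sampling order is random.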

B. The Performance Metrics of Trading Strategies

Investors are concerned about a number of performance metrics, including the number of profitable trades, the number of losing trades, the winning ratio, trading fees, repeated wins, repeated losses, the biggest win, the biggest loss, total wins, total losses, the max drawdown, return volatility, the closed profit, the percentage of gain, etc. In our system, the percentage of gain (PG), the max drawdown (MD), the winning ratio (WR), and the number of profitable trades (NPT) are considered the most important performance metrics.

Drawdown measures the decline from a historical peak to the corresponding trough in cumulated profit. The smaller the max drawdown, the more stable the trading strategy. The winning ratio is computed by dividing the number of profitable trades by the number of losing trades. A higher winning ratio is always favored, though a winning ratio exceeding 55% is good enough in practice. Investors wish the profit to be made from as many trades as possible, because putting all eggs into one basket is dangerous. Furthermore, considering the effect of compound interest, a percentage of gain exceeding 7% will normally make a considerable profit in the long run.

The above metrics are used by our scheme to rank the parameter value combinations. The final ranking function is:

Rank = W_PG · PG + W_MD · MD + W_WR · WR + W_NPT · NPT    (1)

Although the four weights can also be optimized using the algorithm, in our experiments they were set to 0.25, 0.45, 0.1, and 0.2, as experience shows that these values are suitable. This weight setting gives a higher priority to the max drawdown than to the other metrics, because a stable trading strategy that makes a reasonable profit while escaping situations of high volatility is favored. The parameter value combination with the highest rank is selected as the optimal one to be run against the test data.
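Ranking function (1) with the reported weights can be written down directly. Note that in a real system the four metrics would have to be normalized to comparable scales, and max drawdown (where smaller is better) would need a sign flip or inversion; the paper does not spell out these details, so they are glossed over here.

```python
def rank(metrics, w_pg=0.25, w_md=0.45, w_wr=0.1, w_npt=0.2):
    """Weighted rank of one parameter combination's metrics (equation 1).
    `metrics` maps the four metric names from the text to (assumed
    pre-normalized) values."""
    return (w_pg * metrics["PG"] + w_md * metrics["MD"]
            + w_wr * metrics["WR"] + w_npt * metrics["NPT"])
```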

IV. EXPERIMENT RESULTS

Our experiments were conducted on a computer cluster comprising five commodity servers. One server was used as the Name Node of Hadoop, and the other four servers were used as Data Nodes. Every server is equipped with a Core 2 Duo 2.0GHz CPU, 2GB of memory, and a SATA (7200rpm) hard disk of 500GB. The Hadoop version used was 0.20.0. HDFS was extended to accommodate the overlaps between data blocks by making some code modifications [10]. A price dataset of the CU (copper) futures contract of the Shanghai Futures Exchange from 2005 to 2011 was used as the experiment dataset. The volume of data is around 7GB. The data was partitioned into two subsets: the data of years 2005-2009 was used as the optimization set, and the data of years 2010-2011 was used as the test set. The initial balance of the trading account was set to 1,000,000 RMB.

A. Runtime Cut Down

The first experiment was conducted to find out how much the algorithm shortens the runtime of parameter optimization. In four rounds of optimization, 1, 2, 3, and 4 data nodes were used. Figure 6 shows the speedup achieved by the algorithm when using more data nodes. The running times of the optimization are 137, 79, 54, and 38 minutes respectively.
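The speedup implied by the reported runtimes can be computed directly; the node counts and minute figures below are taken from the text.

```python
# Reported optimization runtimes: data nodes -> minutes.
runtimes_min = {1: 137, 2: 79, 3: 54, 4: 38}
# Speedup relative to a single data node.
speedup = {n: runtimes_min[1] / t for n, t in runtimes_min.items()}
# The scaling is sub-linear (roughly 3.6x on 4 nodes), as one would
# expect from job scheduling and data-loading overheads.
```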

Figure 6. Runtime Cut Down

B. Effect of Sample Size (K)

In the second experiment, we varied the sample size (K) from 10 percent, 20 percent, ..., to 100 percent. In our work, a uniform sampling technique is used. Figure 7 shows the result. From the figure we can see that the percentage of gain goes up as more parameter value combinations are used to optimize the trading strategy. When the sample size is small, the max drawdown is high. As the sample size grows, the max drawdown decreases steadily. At a certain point, when the sample ratio reaches 50 or 60 percent, the max drawdown reaches its minimum. After that, it rises again and goes to an even higher level than before. The winning ratio and the number of profitable trades increase as the sample size gets larger, and reach a plateau as the sample size approaches 100 percent. The results show that getting rich overnight is not likely, and trading strategies that bring in a steady increase in wealth always win. We also learn that optimizing a trading strategy too much on a specific dataset can lead to overfitting, so the trading strategy may not be profitable in the future market.

Figure 7. Effect of Various Sample Sizes
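One plausible reading of the uniform sampling used in this experiment is to keep every (1/ratio)-th combination of the enumerated list, so the sample spreads evenly over the search space. The paper does not give the exact scheme, so this sketch is an assumption.

```python
def uniform_sample(combos, ratio):
    """Keep roughly `ratio` of the combinations, spread evenly over the list."""
    if ratio >= 1.0:
        return list(combos)
    step = int(round(1.0 / ratio))
    return combos[::step]
```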

We also tested the trading strategy using a predefined set of parameter values, regarded as optimal by an experienced security analyst. That trading strategy makes a greater profit: 9.7%, compared against 6.79% for the trading strategy optimized by our algorithm. However, it incurs a max drawdown of 15%, which is not preferred.

C. Discussion

Using a cluster to process a dataset of 7GB may be argued to be overkill, given that the memory capacity of a server can easily exceed 8GB and the whole dataset could be put into memory for fast access. This is true when only one futures contract or only one stock is considered, but we also want to test trading strategies against other financial assets and with different time frame sizes. In those situations, our solution is justified. The other reason for us to use a Hadoop cluster is that, in the near future, news data from web sites, blogs, etc. will be used to augment the trading decision after some sentiment analysis has been performed on it. That data will be large in volume.

V. CONCLUSION

MapReduce is a simple yet powerful parallel data processing framework. Optimizing the parameters of trading strategies fits the framework very well. In this paper, we propose an optimization algorithm for this task. By using the parallel processing capability of MapReduce, the runtime of parameter optimization is cut down significantly. We also present a set of interesting results from optimizing a rule-based trading strategy, which show that excessive optimization leads to overfitting, and the resulting trading strategy may be highly risky in terms of max drawdown.

REFERENCES

[1] Algorithmic Trading. http://en.wikipedia.org/wiki/Algorithmic_trading. 2012.
[2] Giuseppe Nuti, Mahnoosh Mirghaemi, Philip Treleaven, Chaiyakorn Yingsaeree. Algorithmic Trading. Computer, 2011, 44(11): 61-69.
[3] W. Wen, R.T. Qin. A Low Risk Stock Trading Decision Support System. IEEE International Summer Conference of Asia Pacific Business Innovation and Technology Management (BITM), 2011, pp. 117-121.
[4] I-Cheng Yeh, Che-hui Lien. Fuzzy Rule-Based Stock Trading System. IEEE International Conference on Fuzzy Systems (FUZZ), 2011, pp. 2066-2072.
[5] Simon Haykin. Neural Networks: A Comprehensive Foundation. Prentice Hall PTR, NJ, 2004.
[6] L.C. Lien, I.C. Yeh. Building a Trading System for the Taiwan Stock Market using Genetic Neural Networks. Journal of Information Management, 2008, 15(1): 29-52.
[7] G. Armano, M. Marchesi, A. Murru. A Hybrid Genetic-Neural Architecture for Stock Indexes Forecasting. Information Sciences, 2005, 170(1): 3-33.
[8] Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, Bongki Moon. Parallel Data Processing with MapReduce: A Survey. SIGMOD Record, 2011, 40(4): 11-20.
[9] Yasser Ganjisaffar, Thomas Debeauvais, Sara Javanmardi, Rich Caruana, Cristina Videira Lopes. Distributed Tuning of Machine Learning Algorithms using MapReduce Clusters. Workshop on Large-scale Data Mining: Theory and Applications (SIGKDD), 2011, Article No. 2.
[10] Pramod Bhatotia, Alexander Wieder, Rodrigo Rodrigues, Umut A. Acar, Rafael Pasquini. Incoop: MapReduce for Incremental Computations. ACM Symposium on Cloud Computing (SOCC), 2011, Article No. 7.
