Application of Genetic Algorithms to Game Theory Model for Efficient Spectrum Allocation
Y. B. Reddy¹, N. Gajendar¹, and S. K. Gupta²
¹ Grambling State University, Grambling, LA 71245, USA
² Indian Institute of Technology, Hauz Khas, New Delhi – 110016, India
Abstract - In this research we introduce a game theory model for efficient spectrum allocation. The proposed model uses a genetic algorithm to analyze a radio frequency and optimize the utility function (each player’s preferences). The simulations were done using the MATLAB ‘gatool’. The results show that genetic algorithms have a positive effect on game theory models in spectrum allocation.
Keywords: Game theory, genetic algorithm, cognitive radio, spectrum allocation, utility function, channels
1 Introduction
Mitola [16, 17] described an adaptive radio that adjusts its operation based on information captured from the environment and measurements of its own performance. This adaptive radio, named cognitive radio (CR), lets multiple users share the spectrum through adaptive mechanisms that distinguish users in terms of time, frequency, code, and other signal characteristics. Currently, CR requires computationally efficient, self-evolving cognitive models whose behavior changes with the changing environment. Current machine learning (ML) models do not meet these requirements for the following reasons:
• Rule-based systems: limited to fixed capabilities because they depend upon external experts.
• Fuzzy logic: permits approximate solutions to be found in the case of uncertain inputs, but has no inherent evolutionary ability that allows the logic to change over time as new capabilities are required and new environments are encountered.
• Neural networks: typically uncontrollable, in that they may or may not operate within a set of operational constraints. They need extensive training and behave unexpectedly when presented with a totally new problem.
Alternatively, we may use other artificial intelligence (AI) techniques such as reinforcement learning, genetic algorithms, or a combination of these methods, because:
• Reinforcement learning: uses Q-learning, learns through experience, and can handle a totally new problem.
• Genetic algorithms: the genes of a chromosome represent the adjustable parameters of a given radio. By genetically manipulating the chromosomes using crossover, mutation, selection, and fitness evaluation, genetic algorithms (GA) can find a set of parameters that optimizes the radio for the user’s current needs.
The biologically inspired models (genetic algorithm based models) address the traditional shortcomings of artificial intelligence systems, which lack the distributed self-evolution and learning capabilities often observed in models of the human cognitive development process.
The cognitive cycle of the cognitive radio defined by Mitola [16] contains states such as observe, learn, plan, decide, and act. The output of the cycle translates to settings for the various ‘knobs’ that control the wireless system’s behavior in a given wireless channel. The system may use simple if-then-else rules, the most commonly used AI technique, but these are typically uncontrollable in that they may not stay within a set of operational constraints. The system may also use neural networks, but neural network models require extensive training to replicate observed behaviors and usually behave in unexpected ways when presented with a totally new problem. In other words, biologically inspired cognitive models address the traditional shortcomings of most AI models. The study of self-evolving cognitive models whose behavior changes with the environment is still at an early stage.
In the cognitive development process, if we assume that a cognitive radio is a complex state machine with the states orient, plan, decide, act, learn, and observe, then we can easily map these states into a game model, where each state is a player and the possible actions form the action set. The players choose different actions in an attempt to maximize their returns.
If every player has chosen a strategy and no player can benefit by changing strategy while the other players keep theirs unchanged, then the game has reached a Nash equilibrium (NE), an important concept in game theory. NE corresponds to the steady state of the game and predicts most of the outcomes of the game. Using game theory, which is a set of mathematical tools and models for analyzing interactive decision processes, we can analyze the rules to predict their impact on the device and system [15]. The game model obtained after analysis will help to create new rules that are efficient, fair, stable, and predictable for all radios in the locality. The decision-making process depends upon past observations and expectations about the behavior of the other channels (radios); the radios then proceed with their decision making for the next iteration of the recursive process.
The role of the cognitive radio is to behave like an intelligent agent and obtain the spectrum using one of the following models:
• Repeated game model: to obtain a spectrum (channel) for its customer, the CR may behave as in a repeated game. In a repeated game, players observe the actions of other players, remember past actions, choose strategies at each stage, and predict the future actions of opponents before taking an action.
• Myopic game model: the CR tries to obtain the spectrum (channel) based on its current state of knowledge.
• Potential game model: the decisions are updated according to a better response, which converges to NE.
The CR chooses a game model depending upon the type of environment in which it is currently located. Game theory is important where nodes enter and leave in a dynamic manner. Spectrum allocation is one example where game theory can be applied, for the dynamic allocation and de-allocation of spectrum. The allocation depends upon the current status (myopic game model), but the selection of a better channel depends upon the previous history of the channel.
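The Nash equilibrium condition described above can be illustrated with a minimal Python sketch. It is not part of the original study: the two-radio channel game, the function names, and the payoff rule (a radio earns 1.0 alone on a channel and 0.5 when sharing) are hypothetical choices made only for illustration.

```python
from itertools import product

def pure_nash_equilibria(action_sets, utility):
    """Return every action profile from which no single player gains by deviating."""
    equilibria = []
    for profile in product(*action_sets):
        stable = True
        for i, actions in enumerate(action_sets):
            current = utility(i, profile)
            for alt in actions:
                # Unilateral deviation: only player i changes its action.
                deviated = profile[:i] + (alt,) + profile[i + 1:]
                if utility(i, deviated) > current:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria

def channel_utility(i, profile):
    """Hypothetical payoff: 1.0 divided by the number of radios on player i's channel."""
    sharing = sum(1 for c in profile if c == profile[i])
    return 1.0 / sharing

# Two radios, each choosing channel 0 or channel 1.
print(pure_nash_equilibria([(0, 1), (0, 1)], channel_utility))
# → [(0, 1), (1, 0)]
```

In this toy game the two equilibria are exactly the profiles in which the radios occupy different channels, which matches the intuition that NE is a steady state no radio wants to leave.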
2 Related Work
Game theory is a versatile tool whose outcomes range from full cooperation to near-disastrous conflict. For successful implementation one can consider many models, including:
• Behavioral approaches
• Computational approaches
• Evolutionary approaches
• Experimental approaches
• Real-world applications with clear predictions, as in the share market
These models may be used to increase the utilization of underutilized spectrum. In 2002 DARPA funded the next generation (xG) program [6] to create adaptive radios that sense and share the spectrum. The xG program specifies policy-based usage of spectrum holes (unused spectrum), but it does not specify cognitive learning. The FCC [7] later stated the aim of the CR dialogue: competitive new wireless services for the secondary or cooperative spectrum markets.
The CRs are autonomous agents that learn from their environment and optimize their performance by modeling transmission parameters. Their interactions can be modeled using a game theory framework. In these models the CRs are the players and their actions are the selection of new transmission parameters. The new transmission frequencies influence both the CR’s own parameters and the neighboring players.
A dynamic spectrum allocation and management policy helps to increase profits and support as many users as possible. During the process the radio access network may lease some spectrum at basic cost to continue providing its service [12, 13]. The genetic algorithm approach provides better optimization of the spectrum bidding. Reddy [14] discussed the problem of generating more revenue by bidding for spectrum in the market and concluded that such bidding uses the spectrum efficiently while generating more revenue. Pan et al. [11] proposed market-competition dynamic spectrum management schemes based on multi-agent models. Pan’s application uses game theory models and Nash equilibrium to reconcile conflicts in spectrum leasing and issues related to interference among radio access networks.
Neel et al. [4] examined the applications of game theory models, the behavior of several game models, and their influence on the structure of cognitive networks. They concluded that game theory will be a valuable tool, but that the analysis of algorithms must consider convergence and steady-state behavior. Neel et al. [1, 2, 3] discussed game models for cognitive radios and their analysis extensively. Their results address five issues: steady state existence, steady state identification, steady state optimality, convergence, and stability.
Using potential and supermodular game models, one can identify when a CR algorithm has reached a steady state, determine the kinds of adaptations that are assured of convergence, and establish the steady regions. Srivastava et al. [5] presented the basics of games, the benefits of applying games to ad hoc networks, and various incentives for applying games in networks. They further discussed modeling the interaction in wireless networks as a game. Wong and Wassell [10] applied game theory to distributed channel allocation in broadband fixed wireless channel (BFWC) allocation, using a pay-off function (the bidding process). Akyildiz et al. [8], in their survey paper, discussed the CR architecture, current issues in applying CR to wireless technology, and future research problems with CRs in wireless communications. The survey did not focus on applications of game theory models to cognitive networks. However, the biologically inspired CR model for wireless communications [9] provides a new direction in using genetic algorithm models. The models developed by Neel [1-4] and Rieser [9] are the inspiration for the current research problem.
3 Game Model
Game theory is an interactive decision-making process in the current environment. The decision maker normally needs some foresight about the effect of their actions in order to win the game at the current state. A game typically has one or more players, an action space, and a utility function over the possible outcomes. Each player should know their available moves and how their actions will affect themselves. Each player’s action returns the current state of high utility. The highest utility return with no change is the complete learning state. A game G is expressed in the normal form as [1, 2, 3]:

$$G = \langle M, A, U \rangle \tag{1}$$

where
• $M$ is the set of players $\{m_1, m_2, \ldots, m_n\}$.
• $A$ is the action space formed by the Cartesian product of the players’ action sets, $A = A_1 \times A_2 \times \cdots \times A_n$, where $A_i$ is the set of actions of player $i$. The action tuple is the vector (profile) of actions (strategies) chosen by the players (one action at a time per player).
• $U$ is the set of utility functions $\{U_i\}$ that describe each player’s preference for certain actions. The players wish to maximize their utility functions (objective functions).

For every player $i$ ($i = m_1$, $m_2$, …, or $m_n$), the objective function $U_i$ is a function of the particular action $a_i$ chosen by player $i$. The actions chosen by all other players in the game are $a_{-i}$. Therefore, the action tuple is written $a = (a_i, a_{-i})$, and correspondingly $A = A_i \times A_{-i}$.

In addition to players, actions, and objective functions, the player has preferences, rules, and outcomes. The outcomes always follow the objective functions. The rules are fixed for each game, and the preferences and outcomes are player dependent.

Now we use the above theory in our cognitive network. We assume M players, where each player is a node in the network, and we further assume that each node is a cognitive radio (CR). The CR learns the needs of the users it serves, evaluates the current state, and chooses an alternative action. Therefore the player, CR, reflects any unilateral change in the utility function $U_i$ of the ith user. The change is in the value assigned to the player, so the current state of the value is very important. Therefore we define a game to be an exact potential game if there exists some function $P$ such that the following equation is satisfied:

$$u_i(a_i, a_{-i}) - u_i(b_i, a_{-i}) = P(a_i, a_{-i}) - P(b_i, a_{-i}), \quad \forall i \in M,\ \forall a \in A,\ \forall b \in A \tag{2}$$

If the game is potential, there is an exact hit: if the CR needs a channel or a set of channels for its customer, the CR knows exactly how it can get them. The following is a necessary and sufficient condition for the game to be an exact potential game [18]:

$$\frac{\partial^2 u_i(a)}{\partial a_i \, \partial a_j} = \frac{\partial^2 u_j(a)}{\partial a_j \, \partial a_i}, \quad \forall i, j \in M,\ a \in A \tag{3}$$

In our problem we assume that there are M players and that each player is a node in the network. At any given time, player $i$ has a number of channels $C$, a set of actions $A_i$, and an objective function $U_i$. The objective function (utility function) for player $i$, the sum of all benefits, is given by [12, 14]

$$U_i(a) = \sum_{c \in a_i} f_c(\sigma_c(a)) \tag{4}$$

where $\sigma_c(a)$ is the number of links simultaneously operating on channel $c$ ($c \in C$) under the corresponding action $a$ ($a \in A$), and $f_c(\sigma_c(a))$ is the benefit. The exact potential game seeks to maximize the benefits of transferring information from source to destination. The maximization of the potential game, $P_{\max}$, is given by:

$$P_{\max} = \sum_{i \in M} U_i(a) \tag{5}$$
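The exact potential condition in equation (2) can be checked by brute force on a small finite game. The Python sketch below is illustrative only and is not taken from the original study: the function names and the two-radio identical-interest game are assumptions. It relies on one well-known fact, namely that when all players share the same utility function, that function is itself an exact potential.

```python
from itertools import product

def is_exact_potential(action_sets, utility, potential, tol=1e-9):
    """Check equation (2): each unilateral change in u_i equals the change in P."""
    for profile in product(*action_sets):
        for i, actions in enumerate(action_sets):
            for b_i in actions:
                # Player i switches from a_i to b_i; the others keep a_{-i}.
                deviated = profile[:i] + (b_i,) + profile[i + 1:]
                du = utility(i, profile) - utility(i, deviated)
                dp = potential(profile) - potential(deviated)
                if abs(du - dp) > tol:
                    return False
    return True

def separated(profile):
    """Hypothetical shared payoff: 1.0 if the two radios use different channels."""
    return 1.0 if profile[0] != profile[1] else 0.0

def common_utility(i, profile):
    """Identical-interest game: every player receives the same payoff."""
    return separated(profile)

actions = [(0, 1), (0, 1)]
print(is_exact_potential(actions, common_utility, separated))
print(is_exact_potential(actions, common_utility, lambda p: 2.0 * separated(p)))
```

The first call accepts the shared payoff as a potential; the second rejects a rescaled candidate, because equation (2) requires the differences to match exactly, not merely to agree in sign.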
The potential game problem in equation (5) is an optimization problem and can be solved using integer programming. However, integer programming can take a long time to converge or to reach a required solution. We therefore need an alternative method that solves the problem in equation (5) within an estimated time and produces a better solution. One such alternative is the genetic algorithm (GA). The genetic algorithm performs a search similar to beam search (a heuristic search algorithm that is an optimization of best-first search; it stores the m most promising nodes at each search step, where m is a fixed number, the "beam width") in the problem domain and produces an approximate solution that meets the requirements.
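To make the size of the optimization problem concrete, the following Python sketch evaluates equations (4) and (5) by exhaustive enumeration on a tiny instance. It is a sketch under stated assumptions, not the authors' procedure: σ_c(a) is modeled as the number of players using channel c, each player picks a fixed number of channels, and the benefit function `shared_benefit` (1 divided by the channel load) is hypothetical.

```python
from itertools import combinations, product

def sigma(channel, profile):
    """sigma_c(a): number of players operating on `channel` under profile a."""
    return sum(1 for chosen in profile if channel in chosen)

def utility(i, profile, benefit):
    """Equation (4): sum of benefits over the channels used by player i."""
    return sum(benefit(c, sigma(c, profile)) for c in profile[i])

def total_utility(profile, benefit):
    """Equation (5): sum of U_i(a) over all players."""
    return sum(utility(i, profile, benefit) for i in range(len(profile)))

def brute_force_best(num_players, channels, per_player, benefit):
    """Enumerate every action profile and keep the one with the largest total."""
    actions = list(combinations(channels, per_player))
    best = max(product(actions, repeat=num_players),
               key=lambda p: total_utility(p, benefit))
    search_size = len(actions) ** num_players
    return best, total_utility(best, benefit), search_size

def shared_benefit(channel, load):
    """Hypothetical benefit: a channel's value is split among its users."""
    return 1.0 / load

best, value, size = brute_force_best(3, [0, 1, 2], 1, shared_benefit)
print(best, value, size)
```

Even here the search covers 27 profiles; the count grows as (number of actions per player) raised to the number of players, which is why an exhaustive or exact method quickly becomes impractical and a heuristic such as a GA is attractive.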
4 Genetic Algorithm Based Simulations
Genetic algorithms are a particular class of evolutionary algorithms used in computing to find exact or approximate solutions. They are inspired by mechanisms from evolutionary biology such as inheritance, mutation, selection, and crossover. In computation, the abstract representations of candidate solutions are chromosomes, and a set of chromosomes forms a population. Traditionally the chromosomes are randomly generated as binary strings of 0s and 1s, but other encodings are also possible. In each generation the fitness of every individual (chromosome) in the population is evaluated, and multiple individuals are stochastically selected from the current population based on their fitness. The new population is formed using the mutation, crossover, and selection operators and the fitness of the individual chromosomes. The algorithm terminates when the maximum number of generations is reached or a satisfactory fitness level has been reached for the population [19].
For example, we can use the simple genetic algorithm (SGA) provided in Goldberg [20], forming the chromosome from the cluster of users, actions, and objective functions, to solve the current optimization problem. Alternatively, we may use the MATLAB ‘gatool’. The ‘gatool’ was selected to solve equation (5) because it generates the graphs needed for conclusions and converges fast. The termination of the run is decided by the user through the selection of the number of generations.
The exact potential game is solved by maximizing the benefits of transferring information from base station to destination. Maximizing the benefits amounts to optimizing the utility function in equation (4). To solve the problem, we assumed 10 players. We then generated a random number of channels for each player, varying from 1 to 99. For each channel we selected a random number of trials to obtain an action value. The maximum of these values was assigned to the channel as its action value and used to calculate the utility function value (equation (4)). Figure 1 shows that the utility function values are stable when the number of players is four or more, which agrees with Nie’s results [19].
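The generation loop described above can be sketched in Python as follows. This is a minimal illustration, not the MATLAB ‘gatool’ and not the authors' code: tournament selection, single-point crossover, bit-flip mutation, elitism, and the probabilities `p_cross` and `p_mut` are all assumptions, and the fitness used in the example (the count of 1 bits) is a toy stand-in for the utility function.

```python
import random

def tournament(pop, fitness, rng):
    """Stochastic selection: pick two individuals at random, keep the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def simple_ga(fitness, n_genes, pop_size=30, generations=500,
              stall_limit=50, p_cross=0.8, p_mut=0.01, seed=0):
    """Maximize `fitness` over binary chromosomes; returns (best, best-fitness history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]
    stall = 0
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            child = p1[:]
            if n_genes > 1 and rng.random() < p_cross:
                cut = rng.randrange(1, n_genes)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation applied independently to every gene.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        champion = max(pop, key=fitness)
        if fitness(champion) > fitness(best):
            best, stall = champion, 0
        else:
            stall += 1
        history.append(fitness(best))
        if stall >= stall_limit:  # stop after too many generations without progress
            break
    return best, history

best, history = simple_ga(sum, n_genes=20, generations=100)
print(sum(best), len(history))
```

The two stopping rules mirror the termination criteria named in the text: a cap on the number of generations and a stall limit on generations without improvement.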
In the current problem, the utility function in equation (4) was calculated by substituting the number of players and the number of simultaneously operating channels. The utility function values were provided as input to the MATLAB ‘gatool’. The ‘gatool’ parameters were set as follows: number of generations 500; stall limit 50 (the stall limit was used to obtain a satisfactory best fitness); population size 30; crossover function scattered (Figure 2) or heuristic (Figure 3). All other parameters were left at their defaults. The best fitness and mean fitness are shown in the top left parts of Figure 2 and Figure 3. The best fitness values are the optimum values for the number of users, the corresponding actions, and the objective functions. The difference between the best fitness and the mean fitness was smaller in Figure 2 than in Figure 3; the difference arises because the crossover function was set to heuristic in Figure 3. With scattered crossover, the average distance between individuals converged at approximately 500 generations (Figure 2, top right), whereas with heuristic crossover it converged quickly (Figure 3, top right).
The bottom right parts of Figure 2 and Figure 3 show the fitness of each individual. The fitness of the individuals was better with heuristic crossover (Figure 3) than with scattered crossover (Figure 2). Further, the bottom left parts of Figure 2 and Figure 3 show that the fitness scaling of the raw scores converges faster with heuristic crossover. Together, Figure 2 and Figure 3 indicate that the best utilization of channels is obtained by setting the crossover function to heuristic.
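The two crossover operators compared above differ in how a child is built from its parents. The Python sketch below is modeled on the commonly documented behavior of scattered and heuristic crossover; it is not MATLAB code, and the function names, the 0.5 mask probability, and the ratio of 1.2 are assumptions made for illustration.

```python
import random

def scattered_crossover(parent1, parent2, rng):
    """Take each gene from parent1 or parent2 according to a random binary mask."""
    return [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(parent1, parent2)]

def heuristic_crossover(better, worse, ratio=1.2):
    """Place the child on the line through both parents, a little beyond the fitter one."""
    return [w + ratio * (b - w) for b, w in zip(better, worse)]

rng = random.Random(0)
mixed = scattered_crossover([1, 1, 1, 1], [0, 0, 0, 0], rng)
pushed = heuristic_crossover([1.0, 2.0], [0.0, 0.0])
print(mixed, pushed)
```

Scattered crossover only recombines existing gene values, while heuristic crossover uses fitness information to move the child away from the worse parent. That directional bias is one plausible reason for the faster convergence reported for the heuristic setting.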
5 Conclusions
Spectrum allocation was modeled using game theory and then optimized using the MATLAB genetic algorithm tool. The utility function was calculated by allocating a random number of channels, from 1 to 99, to each player. Figure 1 indicates that the spectrum is allocated uniformly when there are more than four players, and that the lowest utilization of spectrum occurs only when there are fewer than four players. The potential game model was optimized using the MATLAB ‘gatool’; the system converged after 450 generations (top right of Figure 2), whereas it converged quickly when the crossover function was set to heuristic (top right of Figure 3). At the end of execution the distance between any two individuals was small, which means that the best possible population was created and the maximum of the potential function was obtained. The maximum and mean fitness values in the top left parts of Figure 2 and Figure 3 were not close, which means that the objective function varies considerably due to the larger number of users.
Acknowledgment
The research work was supported by NSF-Multi University Research and Training in Information Assurance and Computer Security to LSU Baton Rouge through award #0621128. The authors wish to express appreciation to Dr. Connie Walton, Dean, College of Arts and Sciences, and Dr. Brett Sims, Head, Department of Math and Computer Science, Grambling State University, for their continuous support.
6 References
[1] James Neel, Jeffrey H. Reed, and Robert P. Gilles, “The Role of Game Theory in the Analysis of Software Radio Networks”, SDR Forum Technical Conference, November 2002.
[2] James Neel, Jeffrey H. Reed, and Robert P. Gilles, “Game Model for Cognitive Radio Algorithm Analysis”, SDR Forum Technical Conference, November 2004.
[3] James Neel, Rekha Menon, Jeffrey H. Reed, and Allen B. MacKenzie, “Using Game Theory to Analyze Physical Layer Cognitive Radio Algorithms”, Conference on Economics, Technology and Policy of Unlicensed Spectrum, Lansing, Michigan, May 16-17, 2005.
[4] James Neel, Jeffrey H. Reed, and Robert P. Gilles, “Convergence of Cognitive Radio Networks”, WCNC, March 2004.
[5] Vivek Srivastava, James Neel, Allen B. MacKenzie, Rekha Menon, Luiz A. DaSilva, James E. Hicks, Jeffrey H. Reed, and Robert P. Gilles, “Using Game Theory to Analyze Wireless Ad Hoc Networks”, IEEE Communications Surveys & Tutorials, Issue 4, pp. 46-56, 2005, ISSN: 1553-877X.
[6] DARPA: http://www.darpa.mil/ato/programs/xg/rfcs.htm
[7] FCC Report, “Facilitating Opportunities for Flexible, Efficient, and Reliable Spectrum Use Employing Cognitive Radio Technologies”, FCC Report and Order, FCC-0557A1, March 11, 2005.
[8] I. F. Akyildiz, W.-Y. Lee, M. C. Vuran, and S. Mohanty, “NeXt generation/dynamic spectrum access/cognitive radio wireless networks: a survey”, Computer Networks, vol. 50, no. 13, pp. 2127-2159.
[9] Christian James Rieser, “Biologically Inspired Cognitive Radio Engine Model Utilizing Distributed Genetic Algorithms for Secure and Robust Wireless Communications and Networking”, Ph.D. thesis, Virginia Polytechnic Institute and State University, 2004.
[10] Shin Horng Wong and Ian J. Wassell, “Application of Game Theory for Distributed Dynamic Channel Allocation”, IEEE Vehicular Technology Conference, Spring 2002, USA.
[11] Miao Pan, Jie Chen, and Yang Ji, “Multi-agent architecture and game theory’s applications on the management of Spectrum in reconfigurable systems”, WWRF16, Shanghai, China, April 2006.
[12] Sorabh G., Chiranjeeb B., Lili C., Haitao Z., and Subhash S., “A General Framework for Wireless Spectrum Auctions”, 2nd IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, DySPAN 2007.
[13] T. Sandholm and S. Suri, “Market Clearability”, Proc. of the International Joint Conference on Artificial Intelligence (IJCAI), 2001.
[14] Y. B. Reddy, “Optimum Spectrum Utilization through Auctions”, IEEE BROADNETS 2007, Sept. 10-14, 2007.
[15] Allen MacKenzie and Luiz DaSilva, “Game Theory for Wireless Engineers”, Synthesis Lectures on Communications, Morgan & Claypool Publishers, 2006.
[16] J. Mitola and G. Maguire, Jr., “Cognitive Radio: Making Software Radio More Personal”, IEEE Personal Communications Magazine, Vol. 6, No. 6, pp. 13-18, August 1999.
[17] J. Mitola, “Cognitive Radio Architecture: The Engineering Foundations of Radio XML”, Wiley Interscience, ISBN: 0-471-74244-9, 2006.
[18] D. Monderer and L. Shapley, “Potential Games”, Games and Economic Behavior, Vol. 14, pp. 124-143, 1996.
[19] Nie Nie and Cristina Comaniciu, “Adaptive Channel allocation spectrum etiquette for cognitive radio networks”, Mobile Networks and Applications, Volume 11, Issue 6, 2006, pp. 779-797.
[20] David E. Goldberg, “Genetic Algorithms in Search, Optimization, and Machine Learning”, Addison-Wesley, 1989.
Figures
[Plot: Calculation of Utility Function. x-axis: Number of Players (1 to 10); y-axis: Utility Function Value (40 to 100).]
Figure 1: Calculation of Utility function for 10 players
Figure 2: The optimization of the utility function - crossover as scattered
Figure 3: The optimization of the utility function – crossover function as heuristic