Evolutionary Signal Processing: A Preliminary Study - CiteSeerX

Evolutionary Signal Processing: A Preliminary Study

Tony Hirst
HCRL, Department of Psychology, Gardiner Bldg, Open University, Walton Hall, Milton Keynes, MK7 6AA
mailto: [email protected], http://socsci.open.ac.uk/~monty

Abstract

The notion of Evolutionary Signal Processing in temporal and ‘spatial’ domains is introduced both theoretically and experimentally. Analytical results from quantitative genetics suggest that in a sinusoidally fluctuating fitness environment, the population mean phenotype tracks the optimum phenotype with a well defined attenuation and phase lag. I show that for the continuous model of (Lande & Shannon, 1996) [1], evolution acts as a low pass analogue filter, and that for the discrete, non-overlapping generational model of (Charlesworth, 1993) [2], evolution acts as a band pass nonrecursive digital filter. Results from a genetic algorithm experiment illustrate that these theoretical biology/signal processing models are applicable in the evolutionary computation domain. In addition to the filtering of continuous signals, evolutionary operators that are capable of transforming evaluation and fitness landscapes are likened to spatial image processing filters.

1 Introduction

Signals are quantities that convey information [3]. Signal processing relates to a set of techniques that allow the manipulation of signals and is frequently encountered in the guise of filters, which are broadly defined as systems whose output signal differs from the input signal. A more useful definition is of a frequency selective system that attenuates certain frequency components whilst passing others unchanged. This applies to both temporal and spatial (typically image processing) domains. In this paper, I will argue that it is possible to treat evolution as a filter of temporal signals. Traditional quantitative genetics discusses evolving systems in terms of optimal phenotypes, population mean phenotypes and measures of selection strength and genetic variance. By treating the optimum phenotype at any given time as the input signal to the evolutionary system, and the population mean phenotype as the system output, it is possible to characterise the transfer function of the evolutionary system. In this paper, I shall draw on such transfer functions derived in quantitative genetics analyses [1][2] and liken them to those of signal processing filters as derived for use by engineers [3]. Preliminary results from an ongoing genetic algorithm study [4][5] demonstrate the transfer properties of the evolutionary filter.

I shall also suggest that landscape transforming operators [6] may be likened to spatial filters acting on the evaluation or selective value landscape, equating the image mask with a suitably defined operator, or search space, neighbourhood [7].

In this initial study, I shall consider only sinusoidally varying environments. It remains for further work to generalise the theoretical approach through the Fourier analysis of rather more complex optimal signals. First, I consider the continuous ‘steady state’ model of [1]; second, the discrete, generational model of [2].

2 Filtering Temporal Signals

The first part of this report covers the filtering of temporally varying signals. If the evolutionary transfer function can be put into the form of a traditional filter transmission function, relating the genetic variance and selection strength parameters to the design parameters of a particular filter, it will be possible to use the knowledge of filter design to tune the evolutionary parameters so as to obtain the required filter characteristics. Since the ‘output’ of the evolutionary filter is given by the mean of the population, this represents population level filtering.

2.1 A Continuous Model Realises an Analogue, Butterworth Filter

Now, from [1], a continuous evolutionary signal:

$$f(t) = A \sin bt \tag{1a}$$

is tracked by the (continuous) mean expressed signal,

$$g(t) \approx \zeta A \sin(bt - \beta) \tag{1b}$$

where

$$\zeta = \frac{\gamma \sigma_g^2}{\sqrt{\gamma^2 \sigma_g^4 + b^2}} \tag{2a}$$

and

$$\beta = \cos^{-1} \zeta \tag{2b}$$

ζ corresponds to the gain of an analogue Butterworth filter, given by [3], equation (5.73) as:

$$|H(j\omega)|^2 = \frac{1}{1 + \omega^{2n}} \tag{3}$$

where n is the degree of the filter. By letting ω = b/(γσ_g²), that is, the environmental frequency b normalised by γσ_g², the amplitude squared function of the evolutionary filter is given by:

$$|H(j\omega)|^2 = \zeta^2 = \frac{1}{1 + \dfrac{b^2}{\gamma^2 \sigma_g^4}} \tag{4}$$

which corresponds to a first order filter, n = 1, with cutoff γσ_g². Note the gain never exceeds 1 (i.e. evolution is acting as a passive filter), and equals 1 for b = 0. The typical phase and gain responses of this system are given in figures 1a and 1b.
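As a rough numerical check on equations (2a), (2b) and (4), the following Python sketch evaluates the gain and lag of the continuous model and compares the squared gain with a first order Butterworth response at the normalised frequency b/(γσ_g²). The function names and the parameter values are illustrative assumptions, not taken from [1] or [3].

```python
import math


def evolutionary_gain(b, gamma, var_g):
    """Gain zeta of equation (2a); var_g stands for sigma_g squared."""
    cutoff = gamma * var_g
    return cutoff / math.sqrt(cutoff ** 2 + b ** 2)


def evolutionary_lag(b, gamma, var_g):
    """Phase lag beta of equation (2b), in radians."""
    return math.acos(evolutionary_gain(b, gamma, var_g))


def butterworth_gain_squared(omega, n=1):
    """Normalised Butterworth amplitude squared function, equation (3)."""
    return 1.0 / (1.0 + omega ** (2 * n))


if __name__ == "__main__":
    gamma, var_g = 0.5, 0.2          # assumed values, for illustration only
    cutoff = gamma * var_g
    for b in (0.0, 0.01, cutoff, 1.0):
        zeta = evolutionary_gain(b, gamma, var_g)
        lag = evolutionary_lag(b, gamma, var_g)
        print(f"b={b:.3f}  gain^2={zeta ** 2:.4f}  lag={lag:.4f} rad")
```

At b = γσ_g² the squared gain is one half and the lag is π/4, as expected of a first order low pass filter at its cut-off.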

Figure 1a: Gain squared versus frequency for equation (1b).

Figure 1b: Phase shift (lag) versus frequency for equation (1b).

In this example, then, evolution is acting as a low pass filter, with a low cut-off frequency arising from the low genotypic variation (i.e. the evolutionary process passes signals with a long period). In addition, the filter is maximally flat in the pass- and stop-bands.

2.2 Discrete, Non-Overlapping Generations

In his treatment of the evolution of recombination, [2] presents a theory one would expect to be rather more in accord with genetic algorithm experiments, specifically the tracking of a sinusoidally varying environment with discrete generations (equations (5) to (8a) below are taken from this source). As in the continuous case, two measures of variation are required: the additive genetic variance, Vg, and a quantity Vs = Ve + 1/S, where Ve is the environmental variance (arbitrarily set to 1) and S is the strength of selection. Vg is itself a function of Vs and the generational variance due to mutation (and hence the mutation rate). Further, let:

$$k = \frac{V_g}{V_g + V_s} \tag{5a}$$

and

$$V = \frac{V_g}{V_s} \tag{5b}$$

so for small V,

$$k \approx V \tag{5c}$$

In a fluctuating environment, with constant equilibrium values for Vs and Vg, and optimal phenotype f(n) in generation n, the mean phenotype in generation n, g(n), is given by:

$$g(n) = (1 - k)^n g(0) + k \sum_{i=1}^{n} (1 - k)^{i-1} f(n - i) \tag{6a}$$

$$\approx k \sum_{i=1}^{n} (1 - k)^{i-1} f(n - i) \tag{6b}$$

approximating for large n. Note that by doing so, the current mean is a function only of the previous optima, and not of the previous mean. In a sinusoidally varying environment:

$$f(n) = A \cos\left(\frac{2 \pi n}{L}\right) = A \cos(Bn) \tag{7}$$

For a low amplitude/period ratio, the solution is approximated by an integral with solution:

$$g(n) \approx \frac{2 \pi A L V \sin\left(2 \pi (n - 1) / L\right)}{V^2 L^2 + 4 \pi^2} \tag{8a}$$

$$\approx \frac{A B V \sin(B(n - 1))}{B^2 + V^2} \tag{8b}$$

As in the previous continuous case, it is now possible to appeal to filter theory to try to understand the mean expression in those terms. The gain squared term and phase shift of the ‘filter’ are given by:

$$|H(j\omega)|^2 \approx \left(\frac{V}{B} + \frac{B}{V}\right)^{-2} \tag{9a}$$

and

$$\mathrm{B}(B) = -B + \frac{\pi}{2} \tag{9b}$$

The phase shift (not shown) is a linear function of the environmental frequency with an additional constant shift of 90 degrees (not 180 degrees as Charlesworth states). This contrasts with the continuous result, for which the phase change was also a function of the selection strength and genetic variance (i.e. V in this case).
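The shape of the gain expression (9a) can be explored numerically. The sketch below, in Python, scans (9a) over a logarithmic frequency grid to locate its maximum, and evaluates it at the frequencies B = V(√2 ± 1), where the squared gain falls to half its peak value. The value of V and the grid are arbitrary choices made for illustration; only the formula itself comes from the text.

```python
import math


def gain_squared(B, V):
    """Squared gain of the discrete-generation 'filter', equation (9a)."""
    return (V / B + B / V) ** -2


def half_power_frequencies(V):
    """Frequencies at which (9a) equals half its peak value of 0.25."""
    return V * (math.sqrt(2.0) - 1.0), V * (math.sqrt(2.0) + 1.0)


def scan_for_peak(V, points=601, low_exp=-3.0, high_exp=0.0):
    """Return (B, gain squared) at the maximum over a log-spaced grid."""
    grid = [10 ** (low_exp + (high_exp - low_exp) * i / (points - 1))
            for i in range(points)]
    best_B = max(grid, key=lambda B: gain_squared(B, V))
    return best_B, gain_squared(best_B, V)


if __name__ == "__main__":
    V = 0.1                                   # assumed value of Vg/Vs
    print("peak:", scan_for_peak(V))
    for B in half_power_frequencies(V):
        print(f"B={B:.4f}  gain^2={gain_squared(B, V):.4f}")
```

The response falls away on both sides of B = V, which is the band pass characteristic discussed in this section.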

Figure 2: Gain squared versus frequency for equation (8).

Plotting the gain term in figure 2 gives a characteristic typical of a band pass filter. The maximum value the gain (squared) term can take is simply 0.5 (0.25) and is located at:

$$\frac{d\,|H(j\omega)|^2}{dB} = \frac{d}{dB}\left(2 + V^2 B^{-2} + B^2 V^{-2}\right)^{-1} = 0 \tag{10a}$$

and so

$$B V^{-2} - B^{-3} V^2 = 0 \tag{10b}$$

giving B = V. Further, taking the 3dB point to be 3dB down on the maximum possible gain squared curve (i.e. for gain squared 0.125 rather than 0.5), the two cut off points are located at the solution of:

$$B^2 - 2\sqrt{2}\,VB + V^2 = 0 \tag{11a}$$

i.e.

$$B = V\left(\sqrt{2} \pm 1\right) \tag{11b}$$

What this means in evolutionary terms is that not only does the population fail to track high frequency signals, but it also drifts away from tracking ‘biologically stationary’ optima where the environmental period is large. Although this may at first seem unreasonable, if one considers a case of high mutation and weak selection then it is likely that the mutation selection balance will allow individual population members to drift quite some distance from the optimum.

2.3 The Evolutionary Implementation of a Digital Filter

In terms of electronic filter design, what can we say about the transmission function? Firstly, I consider the characteristics of the signal being tracked and the signal being passed. In the above treatment, the input to the system is a continuous environmental signal, and the analogue filter analysis seems to hold. In a simple generational analysis, or simulation, populations are evaluated on the basis of an instantaneous sample of the environmental state. The filter is thus acting on a discrete-time or sampled data signal and as such is likely to be classed as a digital filter. The transfer function of a digital filter may be realised either recursively or nonrecursively, as follows. For a system with input sequence {f(n)} and output sequence {g(n)}:

• a recursive filter obtains g(n) as a function of {g(n-1), g(n-2),...; f(n), f(n-1),...};
• a non-recursive filter derives g(n) as a function of {f(n), f(n-1),...}.

Now, the expression for g(n) in equation (6a) is a function of {f(n-i)} but not {g(n-j)}, i, j > 0. In this sense, the filter is a non-recursive filter, although unusual in that g(n) is not a function of f(n)¹. The phase characteristic, equation (9b), is typical of an antimetric finite impulse response (FIR) filter, given by [3], equation (8.108):

$$\psi(\omega) = -\frac{\omega N T}{2} + \frac{\pi}{2} \tag{12}$$

where ω is the environmental frequency, N is the ‘depth’ of the filter (i.e. f(n-0)…f(n-N) samples are used in finding g(n), so strictly we require N = n) and T is the sampling period (in our case, T = 1). If it is possible to equate (9b) and (12), this fixes N = 2 and hence (6b) would be a summation over i = 0 to 2. Given the approximation of (6a) to (6b), it is not unreasonable to expect that for higher values of i, little extra is contributed to the summation of g(n).

Secondly, how is the evolutionary filter being realised? The evaluation-selection process takes two types of argument (as well as the selection type, strength etc. parameters): the population under selection, and the current optimal phenotype, f(t). The output population mean phenotype then corresponds to g(t). The current population, let us call it p(t), is derived through selective transmission of the previous population, p(t-1), that is, as a function of the previous population, p(t) = st(p(t-1)). In this sense, the evolutionary filter may be realising a recursive design, since g(t) (the mean expression of the population) is a function of p(t) and hence of p(t-1) (which supervenes g(t-1)). It remains for further work to clarify the actual architecture of the evolutionary filter, at least as it is modeled by genetic algorithms.
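The two readings of the evolutionary filter can be compared directly. Equation (6a) is the unrolled form of the one-step update g(n) = (1 − k) g(n − 1) + k f(n − 1); the update is not written out in the text, and is stated here as an assumption that follows algebraically from (6a). The Python sketch below computes g(n) both ways; the function names, the value of k and the test signal are hypothetical choices.

```python
import math


def mean_nonrecursive(f, k, g0, n):
    """g(n) from equation (6a): a weighted sum over past optima only."""
    total = (1.0 - k) ** n * g0
    for i in range(1, n + 1):
        total += k * (1.0 - k) ** (i - 1) * f(n - i)
    return total


def mean_recursive(f, k, g0, n):
    """g(n) from the assumed update g(m) = (1 - k) g(m - 1) + k f(m - 1)."""
    g = g0
    for m in range(1, n + 1):
        g = (1.0 - k) * g + k * f(m - 1)
    return g


if __name__ == "__main__":
    def optimum(n):
        # Sinusoidal optimum as in equation (7), with A = 1 and B = 0.2.
        return math.cos(0.2 * n)

    for n in (1, 10, 100):
        a = mean_nonrecursive(optimum, 0.1, 0.5, n)
        b = mean_recursive(optimum, 0.1, 0.5, n)
        print(n, round(a, 10), round(b, 10))
```

Both functions return the same value for every n, so the same input-output behaviour admits either a recursive or a non-recursive realisation, which is the ambiguity raised above.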

¹ If within generation learning is allowed, then g(n) may be a function of the current environmental state, f(n).

2.4 Consequences of Plasticity and the Transmission of Acquired Characteristics

Treatments of learning and culture, in which a distinction is made between genotype and expressed phenotype, may be modeled by changes in selection strength and genetic variation, with the following effect on cut-off frequencies:

• with individual learning, there are essentially two filters in operation, one providing an output phenotypic mean, the other the directly evaluated genotypic mean. The large amounts of phenotypic variation provide a high cut-off frequency for the phenotypic mean and so mean phenotypic tracking of rapidly fluctuating environments is possible; the reduction in selection pressure on the genetic basis of selected phenotypes lowers the cut off frequency of the ‘genetic filter’ and so rapid fluctuations of the optimum are not mirrored by the genotypic mean.
• in a cultural system, the distinction between phenotype and genotype is essentially removed, although there is a single generation lag between the two. The net effect of high phenotypic variance, which under the inheritance of acquired characteristics translates to low genetic variance under strong selection, serves to set the single filter cut off frequency to an intermediate value.

What this means is that the evolutionary system has a low cut off frequency and only passes slowly changing signals. Cultural algorithms have an intermediate cut-off frequency and adaptive plasticity alone has a high cut off frequency, although all are low pass filters. [8] and [9] offer theoretical results demonstrating these properties, and the genetic algorithm model of [4] and [5] provides experimental support. The instability of systems where the environmental period is 1 or 2 generations [9] may be related to the sampling theorem, which states that the sampling frequency should be at least twice the highest frequency present in the signal.

3 Genetic Algorithm Study

In this experiment, I shall utilise an evaluation function originally owing to Cobb [10], that requires the tracking of a sinusoidally varying optimum that changes between generations:

$$f_t(p) = (p - h_t)^2 \tag{13a}$$

$$h_t = 1.0 + \sin(\alpha \times \mathrm{Generation}) \tag{13b}$$

where p is a 32 bit Gray coded phenotypic individual over the range [0.0, 2.0], and h_t gives an environmental state that varies sinusoidally over time, with rate parameter, α. Table 9-1 gives the generational period for the values of α considered herein.

| Rate α | 0.001 | 0.01 | 0.05 | 0.1 | 0.5 |
|---|---|---|---|---|---|
| Period | 6283 | 628 | 126 | 63 | 13 |

Table 9-1: Period of environmental sinusoid in generations for the applied range of rates, α.

Previously reported results using this fitness function have concentrated on the ability of the fittest population member at any one time to track the environmental state. In this section, I shall be rather more interested to see how the population as a whole tracks the environment, specifically by monitoring population mean fitnesses. In the experiment that follows, two phases may be identified in the evolutionary dynamic (no data shown): firstly, a lead in period of search from the initially random population; secondly, the equilibrium dynamic. The initial phase may be likened to the initial ‘settling time’ of the system.

All experiments were carried out using a tweaked Genesis 5.0. Tabulated results represent the mean of 10 runs, except where stated otherwise. Unless otherwise specified: population size was 200; the mutation rate was set at 0.045 per bit; two point crossover was applied at a rate of 0.6 to individuals selected using linear ranking selection with rank minimum 0.5.

Figure 3a: Single typical run of population mean versus optimal signal; α = 0.1, mutation rate fixed at 0.045/bit.
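The evaluation function (13a)-(13b) and the periods listed in Table 9-1 can be sketched in Python as follows. The function names are hypothetical, and the Gray decoding with a linear mapping of the 32 bit integer onto [0.0, 2.0] is an assumption about the encoding, not a description of how Genesis 5.0 implements it.

```python
import math


def optimum(generation, alpha):
    """Environmental state h_t of equation (13b)."""
    return 1.0 + math.sin(alpha * generation)


def evaluate(p, generation, alpha):
    """Squared distance of phenotype p from the optimum, equation (13a)."""
    return (p - optimum(generation, alpha)) ** 2


def period_in_generations(alpha):
    """Period of the environmental sinusoid, 2*pi/alpha generations."""
    return 2.0 * math.pi / alpha


def gray_to_phenotype(gray, bits=32, low=0.0, high=2.0):
    """Decode a Gray coded integer and scale it onto [low, high].

    The linear scaling is an assumed convention for this sketch.
    """
    binary = gray
    shift = 1
    while shift < bits:
        binary ^= binary >> shift   # prefix XOR recovers the binary value
        shift <<= 1
    return low + (high - low) * binary / (2 ** bits - 1)


if __name__ == "__main__":
    for alpha in (0.001, 0.01, 0.05, 0.1, 0.5):
        print(alpha, round(period_in_generations(alpha)))
```

Rounding 2π/α to the nearest generation gives the periods of Table 9-1.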

Figure 3b: Tracking a noisy sinusoid. Low amplitude, high frequency noise is filtered out by the population mean.

Figure 3a demonstrates how the population mean does indeed track the optimum, although the population mean is attenuated compared to the optimal signal and there is a definite lag. In Cobb’s study, high mutation rates were required to optimise the evaluation of the fittest population member. However, a high mutation rate renders the population as a whole unstable and the population mean fitness suffers. Experiments with an evolvable mutation rate (initially reported in [4], with a more detailed study in [5]) suggest that the evolutionary process optimises mean fitness rather than peak fitness (as one would expect, table 1) and a relatively low mutation rate is up to this task for a wide range of α. In figure 3b, noise in the optimum is filtered out by the population. It remains for further work to identify the signal/noise ratios that result in acceptable tracking performance given a noisy signal.


The frequency response of this particular evolutionary system (characterised by the recombination parameters and the selection function) is given in tabular form in table 2. Note the low pass filter response, as one would expect from the previous analysis. Note also how the evaluation measures (online and offline fitness) suffer for increasingly fluctuating environments. In accord with [1], changes in gain and lag result from altering the genetic variance, as parameterised by mutation rate. (It may be possible to utilise the bias metric as a measure of the genetic variance in a more quantitative model of the theory; equally useful would be a relationship between ‘theoretical’ variance and actual (evolved) equilibrium mutation rate.) So for example, in table 3, lag is decreased by a slight increase in the mutation rate, although only by a generation or so, and increases slightly more for a decrease in rate of similar magnitude. For larger changes in the mutation rate, tracking properties of the population are altered and useful direct comparison becomes difficult. The amplitude of the mean expressed phenotype is similarly a decreasing function for increasing mutation rate, although the effect on mean tracking ability is the converse to that of the lag. That is, as mutation rate increases, whilst the lag between mean phenotype and the environmental target is reduced, the attenuation of the mean increases (i.e. there is a worse fit between the time delayed environmental signal and the mean value).

As with mutation rates, so with selection pressure, parameterised by the rank minimum value: by reducing the selection pressure, the effectiveness of a population’s mean phenotypic tracking worsens and phase lag and attenuation both increase, as demonstrated in table 4. It was mentioned that there were two distinct phases in each run of the GA: a lead in, searching phase, and an equilibrium phase. Recalling Charlesworth [2], one of the assumptions he made for equation (6b) was that the number of generations, n, was large. It may be that through experiment, a more accurate estimate of the minimal value of n for which the approximation holds may be achieved.
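The ‘peak to peak’ and ‘mean lag’ measures reported in the tables could be estimated from a recorded run along the following lines. This Python sketch uses cross-covariance to estimate the lag, which is a swapped-in technique chosen for brevity: the table captions suggest that the original measurements were based on crossings of the signal, and that procedure is not reproduced here. All names and the synthetic signals are assumptions.

```python
import math


def peak_to_peak(signal):
    """Difference between the largest and smallest value of a signal."""
    return max(signal) - min(signal)


def estimate_lag(target, mean, max_lag):
    """Lag (in generations) maximising the cross-covariance of the
    optimum sequence `target` and the population mean sequence `mean`."""
    n = len(target)
    t_bar = sum(target) / n
    m_bar = sum(mean) / n
    window = n - max_lag            # same number of terms for every lag
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        score = sum((target[i] - t_bar) * (mean[i + lag] - m_bar)
                    for i in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    omega = 2.0 * math.pi / 60.0    # assumed period of 60 generations
    optimum = [1.0 + math.sin(omega * n) for n in range(600)]
    # Synthetic 'population mean': attenuated to 0.8 and delayed by 6.
    tracked = [1.0 + 0.8 * math.sin(omega * (n - 6)) for n in range(600)]
    print(peak_to_peak(tracked), estimate_lag(optimum, tracked, 60))
```

The ratio of the peak to peak value of the mean to that of the optimum gives an estimate of the gain at the chosen environmental frequency.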

| m-rate/bit | 0.015 | 0.030 | 0.045 | 0.060 | 0.075 |
|---|---|---|---|---|---|
| Online fitness | 0.49 | 0.55 | 0.56 | 0.55 | 0.53 |
| Offline fitness | 2.51 | 3.11 | 3.54 | 3.81 | 4.06 |

Table 1: Online and offline fitness versus fixed mutation rate in the vicinity of the evolved rate for α = 0.1.

| α | 0.001 | 0.01 | 0.05 | 0.1 | 0.5 |
|---|---|---|---|---|---|
| Online fitness | 1.72 | 1.24 | 0.91 | 0.53 | 0.23 |
| Offline fitness | 6.11 | 4.75 | 3.39 | 2.78 | 1.62 |
| peak to peak | 2.0 | 1.99 | 1.82 | 1.70 | 0.28 |

mean lag: ~8, 7.43, 7.01, 3.89
lag sd: N/A, 0.44, 0.29, 0.71
(only four values are given for each of these two rows)

Table 2: Gain and lag for a simple GA with mutation rate 0.02/bit, showing a low pass filter characteristic over α.

| m-rate/bit | 0.015 | 0.030 | 0.045 | 0.060 | 0.075 |
|---|---|---|---|---|---|
| peak to peak | 1.71 | 1.64 | 1.56 | 1.45 | 1.38 |
| mean lag | 7.70 | 6.39 | 5.74 | 5.15 | 4.80 |
| lag sd | 0.44 | 0.38 | 0.32 | 0.27 | 0.36 |

Table 3: regime SGA - selection rank minimum fixed at 0.5; alpha 0.1; discard first cross after origin start; change mutation rate from 0.035 to 0.055 through 0.045.

| rank min. | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 |
|---|---|---|---|---|---|
| peak to peak | 1.75 | 1.68 | 1.56 | 1.29 | 0.59 |
| mean lag | 4.00 | 4.80 | 5.74 | 7.28 | 8.75 |
| lag sd | 0.27 | 0.28 | 0.32 | 0.21 | 0.99 |

Table 4: regime SGA - mutation rate fixed at 0.045; alpha 0.1; discard first cross after origin start; relax selection, parameterised by rank minimum, from 0.1 to 0.9 step size 0.2.

4 Filtering ‘Spatial’ Signals: “Landscape Processing”

In this section, I shall briefly introduce the notion of landscape processing, which applies the metaphor of image processing to evaluation landscapes. In [6] and [5], I suggested that the structure of landscapes and neighbourhoods respectively is induced by the evolutionary and plasticity operators. For example, the exactly one bitflip mutation neighbourhood of an individual of length L bits comprises the L Hamming 1 neighbours of the individual. In this section, I will suggest how plasticity and fault induction may be used to ‘filter’ an evaluation landscape. The argument is informally presented, and relies on a simplistic two dimensional representation of the search space. The evaluation landscape is visualised as a contour map, with thick contours signifying low fitness and fine contours high fitness, as in figure 4a.

Figure 4a: An idealised ‘landscape image’. Thick lines represent contours of low ‘fitness’, fine lines high ‘fitness’, over a search space structured by the genetic operators.

It is now possible to consider filtering the ‘evaluation image’ in a manner akin to image processing techniques. Such methods often make use of a 3 x 3 pixel ‘mask’ (the Moore neighbourhood in cellular automata terms) to generate the next state of the central pixel following filtering. The mask is passed over each image pixel in turn to filter the whole image. A low pass filter corresponds to the mask shown in figure 4b. This is similar to an individual being evaluated according to the mean evaluation over the whole of a suitably defined plasticity neighbourhood. White noise (‘speckles’) is often removed from an image through a median filter, in which the central pixel is given the median value of those taken over the mask. Steepest ascent learning corresponds to setting the focal ‘pixel’ to the highest value in the neighbourhood, thus smearing any noise present, rather than removing it. Fault injection using steepest descent compares directly with using the lowest neighbourhood value (and ‘blurring’ of the image). Filtering through plasticity implements smoothing of the landscape by an individual and is thus individual level filtering.

1 1 1
1 1 1
1 1 1

Figure 4b: An image mask as used in a low pass filter.

Evaluation surfaces may be further transformed by the use of particular selection functions [6]. For example, whatever the population size, if the population is constrained such that no 2 individuals are allowed to be the same, and weak truncation selection is applied (so most individuals get a chance at reproduction), the evaluation surface is significantly smoothed. Since the selection function determines the range of allowable selective values over the population, selection function transformations of the evaluation surface are examples of population level spatial filtering.
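The mask operations described in this section can be sketched on a small two dimensional grid of evaluations. In the Python sketch below, the grid stands in for the ‘landscape image’, and the treatment of edge cells (using whatever part of the 3 x 3 mask falls inside the grid) is an assumption made for brevity. The function names are hypothetical.

```python
from statistics import median


def neighbourhood(grid, r, c):
    """Values under a 3 x 3 mask centred on (r, c), clipped at the edges."""
    rows, cols = len(grid), len(grid[0])
    return [grid[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]


def filter_landscape(grid, reduce):
    """Apply `reduce` to the neighbourhood of every cell of the grid."""
    return [[reduce(neighbourhood(grid, r, c)) for c in range(len(grid[0]))]
            for r in range(len(grid))]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    landscape = [[0, 0, 0],
                 [0, 9, 0],
                 [0, 0, 0]]                       # a single 'speckle'
    print(filter_landscape(landscape, mean))      # low pass (figure 4b mask)
    print(filter_landscape(landscape, median))    # median filter
    print(filter_landscape(landscape, max))       # steepest ascent learning
    print(filter_landscape(landscape, min))       # steepest descent fault injection
```

On this example the median filter removes the isolated peak, whereas the maximum filter spreads it over the whole neighbourhood, in line with the distinction drawn above between removing and smearing noise.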

5 Conclusion and Comments on Future Work

I have discussed treating the evolutionary system and the landscape transformation operators it applies as examples of temporal and spatial filters respectively. This notion of Evolutionary Signal Processing may prove beneficial not only in fostering communication between quantitative genetics and the signal processing community, but also in the way genetic algorithms are applied in fluctuating environments. For example, by suitably defining the evolutionary filter, one may track components below a certain (known) frequency, cutting out all higher frequency noise. It is likely that adaptive filtering will be achieved through allowing the genetic variance (i.e. mutation rate) to evolve, subject to certain constraints on the signal/noise ratio. By using plasticity, the two cut off frequencies (one for the phenotypic mean, the other for the genotypic mean) enable a form of tunable band pass filtering: by compensating the lag and lower frequency gain between the two evolved means, the frequencies between the two cut-offs may be extracted.

Although not considered in this report, it is possible that a steady state GA with a small tournament size will approximate the analogue filter characteristics even more closely. In a rapidly fluctuating environment, signal information is preserved through keeping ‘generational time’ down to a minimum (i.e. just the time it takes to evaluate the tournament participants). Sampling of the individuals should involve some sort of least recently used metric, perhaps competing least recently and most recently generated individuals. Where the tournament size equals the population size, M, the state of the population resembles {f'(n), f'(n-1),…, f'(n-M)} where f'(n) represents the fittest individual of the current population and hence the population’s best estimate of f(n). This is close in spirit to the signal processing algorithms discussed above and deserves further study.

Landscape processing suggests another area of research that is likely to be fruitful in the future, through filtering arbitrarily complex (local) landscapes into ones rather more suited to evolutionary search and optimisation. Work by Happel and Stadler [11] on decomposing landscapes is the first step towards filtering the whole of those landscapes (i.e. removing certain prespecified components).
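One way to picture the tunable band pass idea is to treat the phenotypic and genotypic means as the outputs of two first order low pass filters with different cut-offs, and to subtract one complex response from the other. The Python sketch below does this under stated assumptions: each filter has the gain of equation (2a) and the lag of equation (2b) with cut-off γσ_g², the subtraction is an illustrative reading of ‘compensating the lag and lower frequency gain’, and the cut-off values are arbitrary.

```python
import math


def low_pass(b, cutoff):
    """Complex response of a first order low pass filter at frequency b.

    Its magnitude is cutoff / sqrt(cutoff^2 + b^2) and its phase lag is
    atan(b / cutoff), matching equations (2a) and (2b).
    """
    return cutoff / complex(cutoff, b)


def band_pass_gain(b, low_cutoff, high_cutoff):
    """Gain of the difference between the two low pass responses."""
    return abs(low_pass(b, high_cutoff) - low_pass(b, low_cutoff))


if __name__ == "__main__":
    genetic_cutoff, phenotypic_cutoff = 0.01, 1.0    # assumed values
    for b in (1e-4, 1e-2, 1e-1, 1.0, 1e2):
        gain = band_pass_gain(b, genetic_cutoff, phenotypic_cutoff)
        print(f"b={b:g}  gain={gain:.4f}")
```

With these values the combined response is small well below the lower cut-off and well above the upper one, and largest between the two.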

References

[1] Lande, R, & Shannon, S. (1996) “The Role of Genetic Variation in Adaptation and Population Persistence in a Changing Environment.” Evolution 50(1):434-437.
[2] Charlesworth, B. (1993) “Directional Selection and the Evolution of Sexual Recombination.” Genetical Research 61:205-224.
[3] Baher, H. (1990) Analog & Digital Signal Processing. John Wiley & Sons Ltd.
[4] Hirst, AJ. (1997) “Plasticity and Culture in Cyclically Fluctuating Environments.” In AISB97 Evolutionary Computation Workshop Handbook.
[5] Hirst, AJ. (in prep) The Interaction of Evolution, Plasticity and Inheritance in Genetic Algorithms. PhD Thesis, Dept. of Psychology, Open University.
[6] Hirst, AJ. (1997) “The Structure and Transformation of Landscapes.” In AISB97 Evolutionary Computation Workshop Handbook.
[7] Hirst, AJ. (1996) “Search Space Neighbourhoods as an Illustrative Device.” In Proceedings of WSC1, pp. 49-54.
[8] Boyd, R, & Richerson, PJ. (1988) “An Evolutionary Model of Social Learning: the Effects of Spatial and Temporal Variation.” Social Learning: Psychological and Biological Perspectives. Ed. TR Zentall & BG Galef Jr. Lawrence Erlbaum Associates. pp. 29-48.
[9] Feldman, MW, Aoki, K, & Kumm, J. (1996) Individual versus Social Learning: Evolutionary Analysis in a Fluctuating Environment. Working Paper 96-05-30, Santa Fe Institute.
[10] Cobb, HG. (1990) An Investigation into the Use of Hypermutation as an Adaptive Operator in Genetic Algorithms Having Continuous, Time-Dependent Nonstationary Environments. Navy Center for Applied Research in AI. December 11, 1990.
[11] Happel, R, & Stadler, PF. (1995) Canonical Approximation of Fitness Landscapes. Working Paper 95-07-068, Santa Fe Institute.
