ISAST Transactions on Electronics and Signal Processing

No. 1, Vol. 1, 2007 (ISSN 1797-2329)

Regular Papers

Khoa N. Le and Fan Yi: On marginal and joint Gaussian and hyperbolic angle-of-arrival probability density functions in multi-path mobile environment
Farah Bouakrif, Djamel Boukhetala and Fares Boudjema: Trajectory tracking control for robot manipulators using a simple disturbance observer
Alexander Teggatz, Andreas Jöstingmeier and Abbas S. Omar: Investigation of an Efficient SAR Focusing Technique for Ground Penetrating Radar Data
Gagan Mirchandani and Mohamed Elfataoui: A Uniformly Convergent Approximation for Ideal Complex Half-Band Filters
M. R. Shirazi, S. W. Harun, M. Biglary, K. Thambiratnam and H. Ahmad: Effect of Brillouin Pump Linewidth on the Performance of Brillouin Fiber Lasers
Martin P. Foster, Adam J. Gilbert, Christopher R. Gould, David A. Stone and Christopher M. Bingham: Automated design of LCC resonant converters using a genetic algorithm employing a describing function equivalent circuit converter model
Jonathan M. Blackledge: Diffusion and Fractional Diffusion based Models for Multiple Light Scattering and Image Analysis
Jonathan M. Blackledge: Digital Watermarking and Self-Authentication using Chirp Coding
Jonathan M. Blackledge: Modelling and Computer Simulation of Radar Screening using Plasma Clouds
Julian Meng: Linear Prediction Filtering and Transform-Domain-Based Spread-Spectrum Receivers for Narrowband Interference Suppression
Robert Niese, Ayoub Al-Hamadi and Bernd Michaelis: Nearest Neighbor Classification for Emotion Recognition in Stereo Image Sequences
Magnus Karlsson and Shaofang Gong: A Frequency-Triplexed Inverted-F Antenna System for Ultra-wide Multi-band Systems 3.1-4.8 GHz
Jonathan M. Blackledge: An Approach to Unification using a Linear Systems Model for the Propagation of Broad-Band Signals

Greetings from ISAST

Dear Reader,

You have the first issue of the ISAST Transactions on Electronics and Signal Processing in your hands. It consists of thirteen original contributed scientific articles from various fields of advanced electronics and signal processing technology. Every article has gone through a peer-review process.

ISAST, the International Society for Advanced Science and Technology, was founded in 2006 to promote science and technology, mainly electronics, signal processing, communications, networking, intelligent systems, computer science, scientific computing, and software engineering, as well as related areas, not forgetting emerging technologies and applications.

To show how large the diversity of the computers and software engineering field is today, we briefly summarize the contents of this Transactions Journal.

In the paper of Khoa N. Le and Fan Yi, the mobile environment is discussed, and the Gaussian and hyperbolic distributions are proposed to model the scatterers. Farah Bouakrif, Djamel Boukhetala and Fares Boudjema discuss trajectory tracking control for robot manipulators. Alexander Teggatz, Andreas Jöstingmeier and Abbas S. Omar have studied synthetic aperture radar and ground penetrating radar data. They introduce methods for improving the resolution of radar images. Gagan Mirchandani and Mohamed Elfataoui have a research paper on complex half-band filters, which can perform the same task as the frequency-domain method for generating a discrete-time analytic signal. M. R. Shirazi, S. W. Harun, M. Biglary, K. Thambiratnam and H. Ahmad demonstrate the effect of the pump linewidth on the performance parameters of Brillouin fiber lasers. Martin P. Foster, Adam J. Gilbert, Christopher R. Gould, David A. Stone and Christopher M. Bingham study genetic algorithms applied to the design of LCC resonant converters.

The paper of Jonathan M. Blackledge considers a fractional light diffusion model as an approach to characterizing the case when intermediate scattering processes are present. The same author also studies digital watermarking and self-authentication as well as modeling and computer simulation of radar screening. Julian Meng has a research paper on linear prediction filtering and transform-domain-based spread-spectrum receivers. Robert Niese, Ayoub Al-Hamadi and Bernd Michaelis have written a paper on a user-independent, real-time capable automatic approach for recognition of basic emotion expressions from stereo image sequences. Magnus Karlsson and Shaofang Gong have a paper on a fully integrated triplex antenna system for multiband UWB at 3.1-4.8 GHz. Finally, Jonathan M. Blackledge has his fourth paper, where he reviews the inhomogeneous scalar Helmholtz equation in three dimensions and the scattering of scalar wavefields from a scatterer of compact support.

We are happy to have received so many manuscripts with ambitious and impressive ideas. We hope that you will inform your colleagues all over the academic, engineering, and industrial world of the existence of our Society.

Best Regards,

Professor Timo Hämäläinen, University of Jyväskylä, FINLAND, Editor-in-Chief
Professor Jyrki Joutsensalo, University of Jyväskylä, FINLAND, Vice Editor-in-Chief

Regular Paper Original Contribution 1

ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Khoa N. Le and Fan Yi: On marginal and joint Gaussian and hyperbolic angle-of-arrival probability density functions in multi-path mobile environment

On marginal and joint Gaussian and hyperbolic angle-of-arrival probability density functions in multi-path mobile environment

Khoa N. Le and Fan Yi

Griffith School of Engineering, Griffith University, Gold Coast campus, Queensland, Australia

Abstract: In mobile environment, derivation of the angle-of-arrival (AoA) probability density function (pdf) of signals transmitted from a user equipment's antenna (UE) to a base station (BS) by modeling random scatterers between them using an appropriate distribution is an important task. Gaussian and hyperbolic distributions have been successfully employed to derive the corresponding Gaussian and hyperbolic AoA marginal pdfs. In general, the AoA is assumed to be between -90° and +90°. This paper proposes the Gaussian and hyperbolic distributions to model the scatterers, resulting in joint Gaussian and hyperbolic AoA pdfs. Approximate marginal and joint Gaussian and hyperbolic AoA pdfs when the AoA is small and when it is large are derived. The entropy of these pdfs is then estimated to determine the effectiveness of the Gaussian and hyperbolic distributions in modeling the scatterers. Relative merits of the Gaussian and hyperbolic AoA pdfs are discussed. Comparisons between joint AoA pdfs and marginal AoA pdfs are made. Future work is also outlined.

Index Terms: marginal angle-of-arrival probability density function, joint angle-of-arrival probability density function, Gaussian kernel, hyperbolic kernel, entropy.

I. INTRODUCTION

Consider the BS and the UE as shown in the schematic diagram in Figure 1 (Bevan et al., 2004).

Figure 1: A schematic diagram of angle θ subtended by random scatterers at the BS. The AoA θ seen at the BS is assumed to be small so that sin(θ) ≈ θ. (Diagram labels: axes x and y, point x1, the UE, the base station (BS), angles θ and θeff, coverage distance D, and effective radius Reff.)

The main task is to derive the AoA pdf seen at the BS when signals are transmitted from the UE. In practice, signals are scattered by objects between the BS and the UE. These objects are commonly known as random scatterers, which can be assumed to follow a particular distribution. A Gaussian AoA pdf was derived by using a Gaussian distribution to approximate the scatterer distribution (Bevan et al., 2004). Other related works on deriving the AoA pdf focused on using directional antennas and estimating channel capacity. Janaswamy derived the AoA pdf and time-of-arrival (ToA) pdf for outdoor and indoor mobile environments using a Gaussian scatter density model (Janaswamy, 2002), and the findings were consistent with results obtained in (Bevan et al., 2004). Dubey and Ng (Ng and Dubey, 2003) studied the AoA pdf using directional antennas in mobile OFDM communications systems with a time-varying channel. Loyka and Tsoulos (Loyka and Tsoulos, 2002) studied channel capacity in a multiple-input multiple-output (MIMO) communications system in which the AoA pdf was assumed to be uniform. Lee (Lee, 1973) gave a fundamental framework for AoA pdf derivation using directional antennas.

A hyperbolic distribution has also been used to model the random scatterers in mobile environment (Le, 2007), from which the hyperbolic AoA pdf was derived under the following assumptions: (1) the elevation angle is zero and the hyperbolic channel model (HCM) is two-dimensional with scattered signals received by the BS in the horizontal plane, (2) collisions between signals and scatterers are elastic, i.e. unity reflection coefficient and random phase, (3) infinite attenuation for a path that is a direct line-of-sight between the BS and the UE's antenna, i.e. it is impossible for signals to travel from the UE to the BS without encountering random scatterers, and (4) signals traveling from the UE's antenna to the BS are scattered and reflected by scatterers only once and they are independent of each other, i.e. there is no further coupling among the signals and the scatterers are omnidirectional.

With the random scatterers modeled by a suitable distribution, the system depicted in Figure 1 can be described by the communications model shown schematically in Figure 2.


Figure 2: A communications model between BS and UE with the BS as the receiver and the UE as the transmitter. (Block diagram: the signal passes from the UE through the hyperbolic channel model to the BS over the coverage distance D.)

This work and the work reported in (Bevan et al., 2004) have been motivated by the work of Pedersen, Mogensen and Fleury (Pedersen et al., 2000) in which real data were collected from experiments conducted in Aarhus, Denmark and Stockholm. The main reason that the hyperbolic distribution was used to model the random scatterers is its better fit than the Gaussian distribution to the real data toward the tail region in the time domain, as seen in Figure 3. Thus, it can be expected that the hyperbolic AoA pdf is more uniform and informative than the Gaussian AoA pdf, as will be shown later.

From Figure 1, it should be noted that when the AoA is small, signal transmission is practically under the optimum condition in which the signal path is approximately a direct line-of-sight between the BS and the UE. This further means that only scatterers lying in this path can cause distortion and thus affect the quality of the received signals at the BS. Typical examples when the AoA is small are: (1) the BS is located in an open space area where the direct line-of-sight path with the UE is not significantly disturbed, and (2) locations at which signals prior to entering the BS are trained to follow a predefined route. It should also be noted that the case of small AoA is practically ideal, yielding more satisfactory performance than the general case in which the AoA can range between -90° and +90°, which is random and therefore unpredictable. By studying the hyperbolic and Gaussian AoA pdfs when the AoA is small and when it is large, it is possible to further examine their properties and behaviour under these critical conditions.

II. ENTROPY OF AOA PDFS

The entropy of a pdf of a random variable shows how much "information" is contained in the variable (in this case θ), which means that the larger the entropy, the more informative and uniform the pdf. Mathematically, the entropy of a pdf is given by (Lathi, 1998)

$$H = \sum_i p_i \log\left(\frac{1}{p_i}\right), \tag{1}$$

where p_i is the occurrence probability of an angle θ_i. The entropy of a joint pdf of two random variables x and y is thus given by

$$H(x, y) = \sum_i p_i(x, y) \log\left(\frac{1}{p_i(x, y)}\right). \tag{2}$$

The effectiveness of the Gaussian and hyperbolic distributions in modeling the random scatterers in multipath mobile environment is examined by estimating their entropy as functions of the coverage distance D and the effective radius Reff. It should be noted that even though AoA pdfs have been extensively studied, their effectiveness has commonly been assessed by another method: fitting candidate distributions, such as the Gaussian and hyperbolic distributions, to real data collected from experiments and finding the best fit (Bevan et al., 2004), as seen in Figure 3. Thus, by using both methods, the effectiveness of the Gaussian and hyperbolic distributions in modeling the scatterers can be thoroughly determined.
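As a minimal sketch of how Eqs. (1) and (2) can be evaluated on a discretised pdf, the following Python fragment computes both sums. The natural logarithm, the example probabilities, and the function names `entropy` and `joint_entropy` are assumptions made for illustration; the paper does not state the logarithm base or the binning it used.

```python
import math

def entropy(probs):
    """Eq. (1): H = sum_i p_i * log(1 / p_i).

    Bins with zero probability are skipped because p * log(1/p) -> 0 as p -> 0.
    The natural logarithm is an assumption; the paper does not give the base.
    """
    return sum(p * math.log(1.0 / p) for p in probs if p > 0.0)

def joint_entropy(joint):
    """Eq. (2): the same sum taken over every cell of a 2-D table of joint probabilities."""
    return sum(entropy(row) for row in joint)

# Hypothetical example data: a uniform and a peaked distribution over four angle bins.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.7, 0.1, 0.1, 0.1]

# Joint table for two independent variables: each cell is a product of marginals.
joint = [[px * py for py in peaked] for px in uniform]

print("H(uniform) =", entropy(uniform))
print("H(peaked)  =", entropy(peaked))
print("H(joint)   =", joint_entropy(joint))
```

The uniform distribution gives the larger entropy, which matches the statement that a larger entropy corresponds to a more uniform pdf. For independent variables the joint entropy equals the sum of the two marginal entropies.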

III. THE AOA PDF USING THE HYPERBOLIC DISTRIBUTION TO MODEL THE RANDOM SCATTERERS

The paper is organized as follows. Section II briefly gives background on how to compute the entropy of a pdf. Sections III and IV analyse the hyperbolic and Gaussian marginal AoA pdfs when θ is small and when it is large by estimating their corresponding entropy. Section V derives the joint Gaussian and hyperbolic AoA pdfs under the same conditions. Relative merits of Gaussian and hyperbolic distributions are also discussed. Section VI concludes the main findings of the paper and outlines possible further work. Detailed derivations of the hyperbolic AoA pdf were given in (Le, 2007).

Figure 3: Approximations to the measured data histogram given in (Pedersen et al., 2000) using the hyperbolic and Gaussian distributions with β ≈ 9.5493 and σ ≈ 0.1074 respectively to yield the line of best fit to the data. The hyperbolic distribution clearly gives a better fit to the data than the Gaussian distribution toward the data's tail region. (Horizontal axis: angle-of-arrival θ in radians, from -1 to 1; curves: Gaussian, hyperbolic, real data.)

As explained earlier, when θ is small there are two critical paths over which signals can be transmitted from the UE to the BS. The first path is a direct LoS between the UE and BS, i.e. the angle between them is zero, which corresponds to a small AoA, and the received signals are in phase with the transmitted signals from the UE. Most signals will arrive at the BS via the first path because it is the shortest between the UE and BS. In addition, the more signals arriving at the BS via the first path, the more efficient the transmission. Since the BS is omni-directional, there exists a second critical signal path making an angle of 180° (corresponding to large AoA) between the BS and the UE, yielding out-of-phase arrival signals with respect to the transmitted signals from the UE. It should be noted that arrival signals at the BS via the second path usually consist of noise, reflected and/or refracted signals from the UE, and signals from other sources, and thus they should be filtered out before reaching the BS to avoid undesirable distortion. This scenario can be considered as multiple-input single-output (MISO) because there is only one BS, in contrast to the MIMO scenario studied in (Loyka and Tsoulos, 2002).

A. For small AoA

The AoA pdf using the hyperbolic distribution to model random scatterers is theoretically given in Eq. (3) as

$$I = \frac{1}{\pi} \sum_{m=1,3,\ldots}^{\infty} \left\{ (-1)^{\frac{m-1}{2}} \exp(-m\tau^2) \left[ \frac{1}{m} + \tau \sqrt{\frac{\pi}{m}} \cos(\theta) \exp\left[m\tau^2 \cos^2(\theta)\right] \left(1 - \operatorname{erf}\left[\tau\sqrt{m}\cos(\theta)\right]\right) \right] \right\}. \tag{3}$$

Since θ is small, we can approximate cos θ ≈ 1 and sin θ ≈ θ, and note that

$$\sin(\theta_{\max}) = \frac{R_{\mathrm{eff}}}{D} = \frac{1}{\tau}, \tag{4}$$

which implies

$$\tau \gg 1. \tag{5}$$

After some mathematical manipulations, we obtain the approximate hyperbolic AoA pdf when θ is small as given in Eq. (6)

$$I \approx \frac{1}{\pi} \tan^{-1}\left(e^{-\tau^2}\right), \tag{6}$$

which is plotted in Figure 4, from which it can be noted that the hyperbolic AoA pdf is perfectly symmetrical.

B. For large AoA

Under this assumption, cos(θ) → 0, thus Eq. (3) can be rewritten as

$$I = \frac{1}{\pi} \sum_{m=1,3,\ldots}^{\infty} (-1)^{\frac{m-1}{2}} \left( \frac{\exp(-m\tau^2)}{m} \right), \tag{7}$$

which is plotted in Figure 7.
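The series in Eq. (7) can be evaluated by truncation. The sketch below is an illustration only: the function name `hyperbolic_pdf_large_aoa` and the truncation length `n_terms` are assumptions, not values taken from the paper.

```python
import math

def hyperbolic_pdf_large_aoa(tau, n_terms=200):
    """Truncated Eq. (7): (1/pi) * sum over odd m of (-1)^((m-1)/2) * exp(-m*tau^2) / m.

    n_terms is a hypothetical truncation length. For tau != 0 the terms decay
    geometrically, so a few hundred terms are ample; at tau = 0 the series
    converges only slowly and would need far more terms.
    """
    total = 0.0
    for k in range(n_terms):
        m = 2 * k + 1                      # m = 1, 3, 5, ...
        sign = -1.0 if k % 2 else 1.0      # (-1)^((m-1)/2) equals (-1)^k
        total += sign * math.exp(-m * tau * tau) / m
    return total / math.pi

for tau in (0.5, 1.0, 2.0):
    print(tau, hyperbolic_pdf_large_aoa(tau))
```

Because τ enters only through τ², the function is even in τ, and doubling `n_terms` is a simple way to check that the truncation has converged for a given τ.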

It should be noted that the condition of large θ is critical because arrival signals having a large AoA are usually undesirable. The hyperbolic AoA pdf under this condition still resembles an approximate bell shape as seen in Figure 7 with slight skewness toward its tail region.

IV. THE AOA PDF USING THE GAUSSIAN DISTRIBUTION TO MODEL THE RANDOM SCATTERERS

A. For small AoA

The AoA pdf using the Gaussian distribution to model the random scatterers is theoretically given by

$$p(\theta) = \frac{1}{2\pi} \exp\left(-\frac{D^2}{R_{\mathrm{eff}}^2}\right) \left\{ 1 + \sqrt{\pi}\, \frac{D\cos(\theta)}{R_{\mathrm{eff}}} \exp\left(-\frac{D^2\cos^2(\theta)}{R_{\mathrm{eff}}^2}\right) \left[ 1 + \frac{1}{2\sqrt{\pi}} \int_0^{\tau} e^{-t^2}\, dt \right] \right\}, \tag{8}$$

where τ = D/Reff. For small θ, Eq. (8) can be rewritten as

$$p(\tau) \approx \frac{1}{2\pi} \exp(-\tau^2) \left\{ 1 + \tau\sqrt{\pi} \exp(-\tau^2) \left[ 1 + \frac{\operatorname{erf}(\tau)}{4} \right] \right\}, \tag{9}$$

which is plotted in Figure 4. From Figure 4, it is clear that the Gaussian AoA pdf for small θ is skewed to the right, which means that it is not perfectly symmetrical with respect to τ. In contrast, the hyperbolic AoA pdf for small θ given in Figure 4 is perfectly symmetrical with respect to τ, suggesting that it is more effective than the Gaussian AoA pdf. The advantages of the hyperbolic distribution over the Gaussian distribution can be further examined by estimating their corresponding entropy using Eq. (1) as shown in Figure 5. The entropy difference of the hyperbolic AoA pdf from the Gaussian AoA pdf is plotted in Figure 6 as functions of Reff and the coverage distance D.
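The two small-θ approximations can be compared numerically. The sketch below assumes Eq. (6) reads I ≈ (1/π) arctan(exp(−τ²)) and Eq. (9) reads p(τ) ≈ (1/2π) exp(−τ²){1 + τ√π exp(−τ²)[1 + erf(τ)/4]}; the function names and the τ grid are hypothetical choices made for illustration.

```python
import math

def hyperbolic_pdf_small_aoa(tau):
    """Eq. (6): (1/pi) * arctan(exp(-tau^2)); even in tau."""
    return math.atan(math.exp(-tau * tau)) / math.pi

def gaussian_pdf_small_aoa(tau):
    """Eq. (9): (1/(2*pi)) * exp(-tau^2) * {1 + tau*sqrt(pi)*exp(-tau^2) * [1 + erf(tau)/4]}."""
    g = math.exp(-tau * tau)
    bracket = 1.0 + math.erf(tau) / 4.0
    return g / (2.0 * math.pi) * (1.0 + tau * math.sqrt(math.pi) * g * bracket)

# Hypothetical grid covering the range of tau shown in Figure 4.
taus = [-4.0 + 0.5 * i for i in range(17)]
for tau in taus:
    h = hyperbolic_pdf_small_aoa(tau)
    g = gaussian_pdf_small_aoa(tau)
    print(f"{tau:+.1f}  hyperbolic={h:.4f}  gaussian={g:.4f}")
```

The term proportional to τ in Eq. (9) is odd in τ, which is what makes the Gaussian curve larger for positive τ than for the mirrored negative τ, while Eq. (6) depends on τ only through τ² and is therefore symmetric.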

Figure 4: The hyperbolic and Gaussian AoA pdfs as functions of τ = D/Reff for small θ

Figure 5: Entropy of the hyperbolic and Gaussian AoA pdfs as functions of Reff and D for small θ

Figure 6: Entropy difference of the hyperbolic AoA pdf from the Gaussian AoA pdf as functions of Reff and D for small θ

From Figure 5 and Figure 6, it is clear that the hyperbolic AoA pdf possesses larger entropy than that of the Gaussian AoA pdf, indicating that the former distribution is more effective than the latter to model the random scatterers. The maximum entropy difference is about 0.06 as a function of D. It is also clear that the hyperbolic distribution is more effective than the Gaussian distribution to model the scatterers for D ≤ 200. For D > 200, the entropy difference becomes smaller and reaches a saturated value of about 0.01 when D ≥ 500. In addition, it appears that the entropy of both pdfs is identical for very large values of D.

With respect to Reff, the entropy difference between the hyperbolic and Gaussian AoA pdfs attains its maximum value of about 0.015 at Reff ≈ 250. For Reff ≥ 500, the entropy difference reaches a saturated value of about 0.005. From Figure 6, it is clear that the hyperbolic distribution is robust over the entire ranges of Reff and D, whereas the Gaussian distribution is limitedly effective for large D and Reff. Table 1 summarizes the findings in this section.

Table 1: Usable ranges of the hyperbolic and Gaussian distributions when θ is small

| Distribution | D ≤ 200 | D > 200 | Reff ≤ 250 | Reff > 250 |
| Hyperbolic | suitable | suitable | suitable | suitable |
| Gaussian | unsuitable | suitable | unsuitable | suitable |

B. For large AoA

It should be realized that arrival signals to the BS at a large AoA should be filtered out to avoid distortion. For completeness, the Gaussian AoA pdf is approximately derived under this condition. Similar to the hyperbolic AoA pdf, under this condition, cos(θ) → 0, and from Eq. (8), the Gaussian pdf is approximately given by

$$p(\tau) \approx \frac{1}{2\pi} \exp(-\tau^2), \tag{10}$$

and plotted in Figure 7. The Gaussian AoA pdf for large θ resembles a perfect bell shape which indicates that it is more robust than the hyperbolic AoA pdf when θ is large. This thus can be considered as a trade-off between the two distributions.

Figure 7: The AoA pdfs as functions of τ = D/Reff using the Gaussian and hyperbolic distributions to model the random scatterers for large θ


It is clear that the Gaussian AoA pdf is more uniform than the hyperbolic AoA pdf because of the imperfectness of the latter under this condition. However, since the received signals at the BS are usually undesirable, the advantage of the Gaussian distribution over the hyperbolic distribution to model the scatterers is thus minimal. For comparison purposes, the entropy difference of the Gaussian AoA pdf from that of the hyperbolic AoA pdf is plotted in Figure 8 as functions of D and Reff.

Figure 8: Entropy difference of the Gaussian AoA pdf from the hyperbolic AoA pdf as functions of D and Reff for large θ

From Figure 8, the Gaussian AoA pdf possesses larger entropy than the hyperbolic AoA pdf as explained earlier. The entropy difference linearly decreases as D increases, and increases as Reff increases. The entropy difference between the Gaussian and hyperbolic AoA pdfs reaches a saturated value of about 0.05 at D > 200 and of about 0.25 at Reff ≈ 200, which are consistent with the findings presented in Figure 6. It should be noted that as Reff increases beyond a limit, random scatterers located beyond that limit do not have significant effects on the quality of the received signals at the BS because the transmitted signals from the UE could not reach them. When the coverage distance D increases beyond a limit, the entropy difference under the same condition decreases, which is consistent with the findings reported in Section A. Under the condition of small θ, the entropy decreases but reaches a saturated value with respect to D and Reff as can be seen in Figure 5 and Figure 6. It should be noted that the entropy difference with respect to Reff is about five times smaller than that with respect to D when θ is small. Thus, for small θ, varying D can significantly affect the effectiveness of the Gaussian and hyperbolic distributions to model the scatterers and the robustness of the received signals at the BS. For large θ, the Gaussian distribution outperforms the hyperbolic distribution with their entropy difference ranging from 0.25 to 0.3. The following table summarizes the main findings in this section.

Table 2: Usable ranges of the hyperbolic and Gaussian distributions when θ is large

| Distribution | D ≤ 200 | D > 200 | Reff ≤ 200 | Reff > 200 |
| Hyperbolic | unsuitable | suitable | suitable | suitable |
| Gaussian | suitable | suitable | suitable | suitable |

V. JOINT GAUSSIAN AND HYPERBOLIC AOA PDFS AND THEIR ENTROPY

It is clear that the received signals at the BS usually arrive from different directions because of the random scatterers. As such, random scatterer modeling is a difficult task and in some applications, one distribution may not be sufficient to effectively model these scatterers. As shown in Table 1, for small θ, the Gaussian and hyperbolic distributions can only be effectively used to model the scatterers under specific conditions of D and Reff. Thus, it is possible that each distribution can be individually used to model the scatterers within its effective ranges of D and Reff according to the suggestions drawn from Table 1.

If both Gaussian and hyperbolic distributions are employed to model the random scatterers, it should then be realized that for each distribution, there exists one random AoA variable whose pdf is governed by that distribution. Thus, there are two random AoA variables, and the main difficulty is to derive their joint AoA pdf seen at the BS. It should be noted that joint AoA pdfs have never been used in multipath mobile environment because it is hard to find more than one distribution which can yield the best fit to the same data. As evidenced in Figure 3, the Gaussian and hyperbolic distributions can be used to give the line of best fit to the same set of experimental data (Pedersen et al., 2000) and thus their joint AoA pdfs can be computed as products of their individual marginal AoA pdfs as given by Eq. (11)

$$p_{x,y}(\tau_1, \tau_2) = p_x(\tau_1)\, p_y(\tau_2). \tag{11}$$

It should be noted that the marginal Gaussian and hyperbolic AoA pdfs given in Eqs. (6) and (9) for small θ, and Eqs. (7) and (10) for large θ can be effectively used to derive their joint AoA pdfs at the BS. There are three different joint AoA pdfs: (1) Gaussian-Gaussian (GG), (2) Gaussian-hyperbolic (GH), and (3) hyperbolic-hyperbolic (HH). As a result, Figure 2, which depicts a communications system between the UE and the BS using one distribution, can now be modified to accommodate two distributions to model the scatterers as can be seen in Figure 9.

Figure 9: A schematic diagram of channel modeling of the random scatterers using the Gaussian and hyperbolic distributions. (Block diagram: between the UE and the BS, over the coverage distance D, the signal passes through the Gaussian channel model when D is large and through the hyperbolic channel model when D is small.)

From Figure 9, the hyperbolic and Gaussian distributions are simultaneously used to model the scatterers depending on their distance from the BS. From Table 1, the hyperbolic distribution is shown to be robust for large ranges of D and Reff and thus it can be used to effectively model the scatterers independent of their distance from the BS. Another possibility is to employ two Gaussian distributions to model short- and long-distance scatterers from the BS. From Section IV, the Gaussian AoA pdf is shown to be less effective than the hyperbolic AoA pdf and therefore it is expected that the GG joint AoA pdf may not yield better performance than the HH joint AoA pdf. However, to thoroughly assess the effectiveness of each joint AoA pdf, the entropy should be computed as carried out in Sections III and IV.

It should be further noted that even though the use of joint Gaussian and hyperbolic distributions gives flexibility to the random scatterer modeling, the entropy of the corresponding joint AoA pdfs may not be larger than that of the marginal AoA pdfs, for two main reasons: (1) the entropy of the joint AoA pdfs is computed in terms of τ, and that of the marginal AoA pdfs in terms of D and Reff, thus it is not possible to compare all the entropy in terms of τ or in terms of D and Reff; and (2) since the joint Gaussian and hyperbolic distributions give flexibility to the random scatterer modeling, it is expected that there exists a trade-off between the AoA pdf effectiveness and flexibility in modeling the scatterers. Detailed studies are given in Sections A and B.

A. For small AoA

Using Eqs. (6), (9) and (11), mathematically, the joint GG, GH and HH AoA pdfs are given from Eqs. (12) to (14) respectively

$$p_{GG}(\tau_1, \tau_2) \approx \frac{1}{2\pi} \exp(-\tau_1^2) \left\{ 1 + \tau_1 \sqrt{\pi} \exp(-\tau_1^2) \left[ 1 + \frac{\operatorname{erf}(\tau_1)}{4} \right] \right\} \times \frac{1}{2\pi} \exp(-\tau_2^2) \left\{ 1 + \tau_2 \sqrt{\pi} \exp(-\tau_2^2) \left[ 1 + \frac{\operatorname{erf}(\tau_2)}{4} \right] \right\}, \tag{12}$$

$$p_{GH}(\tau_1, \tau_2) \approx \frac{1}{2\pi} \exp(-\tau_1^2) \left\{ 1 + \tau_1 \sqrt{\pi} \exp(-\tau_1^2) \left[ 1 + \frac{\operatorname{erf}(\tau_1)}{4} \right] \right\} \times \left[ \frac{1}{\pi} \tan^{-1}\left(e^{-\tau_2^2}\right) \right], \tag{13}$$

$$p_{HH}(\tau_1, \tau_2) \approx \left[ \frac{1}{\pi} \tan^{-1}\left(e^{-\tau_1^2}\right) \right] \left[ \frac{1}{\pi} \tan^{-1}\left(e^{-\tau_2^2}\right) \right], \tag{14}$$

which are plotted in Figure 10 to Figure 12 respectively. To further examine the symmetry of the pdfs, their contour plots are shown in Figure 13 from which it is clear that the joint HH AoA pdf is isotropic and rotational-invariant. The joint GG and GH AoA pdfs are not perfectly symmetrical and appear to be directional.

Figure 10: The joint GG AoA pdf as a function of τ1 and τ2 for small θ

Figure 11: The joint GH AoA pdf as a function of τ1 and τ2 for small θ

Figure 12: The joint HH AoA pdf as a function of τ1 and τ2 for small θ
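The product rule of Eq. (11) makes the three joint pdfs straightforward to tabulate on a grid. The sketch below assumes the marginals of Eqs. (6) and (9) in the forms (1/π) arctan(exp(−τ²)) and (1/2π) exp(−τ²){1 + τ√π exp(−τ²)[1 + erf(τ)/4]}; the helper names and the grid spacing are hypothetical.

```python
import math

def hyperbolic_pdf(tau):
    """Marginal of Eq. (6): (1/pi) * arctan(exp(-tau^2))."""
    return math.atan(math.exp(-tau * tau)) / math.pi

def gaussian_pdf(tau):
    """Marginal of Eq. (9) in the form assumed in the lead-in text."""
    g = math.exp(-tau * tau)
    return g / (2.0 * math.pi) * (
        1.0 + tau * math.sqrt(math.pi) * g * (1.0 + math.erf(tau) / 4.0)
    )

def joint_pdf(px, py, taus):
    """Eq. (11): p(tau1, tau2) = px(tau1) * py(tau2), tabulated as table[i][j]."""
    return [[px(t1) * py(t2) for t2 in taus] for t1 in taus]

# Hypothetical grid over the range shown in Figures 10 to 12; taus[16] is exactly 0.
taus = [-4.0 + 0.25 * i for i in range(33)]

joint = {
    "GG": joint_pdf(gaussian_pdf, gaussian_pdf, taus),      # Eq. (12)
    "GH": joint_pdf(gaussian_pdf, hyperbolic_pdf, taus),    # Eq. (13)
    "HH": joint_pdf(hyperbolic_pdf, hyperbolic_pdf, taus),  # Eq. (14)
}

for name, table in joint.items():
    peak = max(max(row) for row in table)
    print(name, "grid peak =", round(peak, 4))
```

Each table could then be normalised over the grid and passed to an entropy sum in the style of Eq. (2); whether and how the paper normalises the gridded values is not stated, so that step is left out here.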


Figure 13: Contour plots of the joint AoA pdfs shown in Figure 10 to Figure 12 as functions of τ = D/Reff for small θ. (Panels: joint Gaussian-Gaussian, joint Gaussian-hyperbolic and joint hyperbolic-hyperbolic AoA pdfs; axes τ1 and τ2.)

Figure 14: Entropy of the joint GH, GG and HH AoA pdfs as functions of τ = D/Reff for small θ

Figure 15: Entropy differences of the joint HH from GG AoA pdfs, HH from GH and GH from GG as functions of τ for small θ

Under the condition of small θ, the entropy of the joint AoA pdfs given in Figure 13 is estimated using Eq. (2) as a function of τ and plotted in Figure 14, from which the HH joint AoA pdf has the largest entropy. The GH joint AoA pdf has smaller entropy than the HH joint AoA pdf and larger entropy than the GG joint AoA pdf. This shows the effectiveness of the hyperbolic distribution over the Gaussian distribution to model the scatterers as predicted earlier. In addition, it is clear that the HH and GH joint AoA pdfs are perfectly symmetrical which implies that omnidirectional antennas should be employed at the UE, whereas directional antennas should be used at the UE when using two Gaussian distributions to model the scatterers.

To examine the effectiveness of the three joint AoA pdfs, their entropy differences are computed and shown in Figure 15, from which it is clear that the joint HH AoA pdf possesses the largest entropy. The entropy difference between the HH and GH AoA pdfs decreases monotonically with τ, with its peak at τ = 0, i.e. D = 0, which means the BS and the UE are located at the same place. This is not the case for the GH and GG AoA pdfs or for the HH and GG AoA pdfs because of the imperfectness of the GG AoA pdf as shown in Figure 4 and Figure 13. From Figure 15, the entropy difference of the GH and GG AoA pdfs is negative for 0 < τ < 1. This means that the GG AoA pdf possesses larger entropy than the GH AoA pdf for D < Reff, which is consistent with results obtained in Table 1. It is also clear that there is a trade-off between the symmetry of AoA pdfs and their entropy difference. From Figure 15, the entropy difference of the HH and GG joint AoA pdfs is largest with the peak value of about 0.045 while that of the GH and GG AoA pdfs is smallest with the peak value of about 0.02, i.e. about half as much compared to the case of unsymmetrical entropy difference. However, for the latter case, the entropy is perfectly symmetrical, whereas it is not for the former case. As shown in Sections II to V, unsymmetrical entropy difference implies that the corresponding AoA pdf is not symmetrical and therefore it is not practically useful. It should be noted that a larger entropy difference between two AoA pdfs means that one pdf is more informative than the other under certain conditions of the coverage distance D and effective radius Reff. Thus, depending on the particular application, the appropriate joint AoA pdf is employed so that its effectiveness and suitability to the application are maximized.
It should be noted that the entropy of the hyperbolic and Gaussian AoA pdfs plotted in Figure 5 and Figure 6 in Section IV.A are in terms of D and Reff, not in terms of τ = D/Reff. Thus, corresponding fine-detailed comparisons are not possible. However, the entropy peaks can be used to give insight to the effectiveness of the hyperbolic and Gaussian distributions to model the scatterers. Table 3 and Table 4 compare the performance of marginal and joint Gaussian and hyperbolic AoA pdfs using their corresponding entropy-peak values and entropy-difference-peak values given in Sections IV and V.

Table 3: Performance comparisons of marginal and joint AoA pdfs using their entropy-peak values for small θ

AoA pdf            Entropy-peak value
Hyperbolic (H)     0.35 w.r.t. D
Gaussian (G)       0.3 w.r.t. D
Hyperbolic (H)     0.32 w.r.t. R
Gaussian (G)       0.31 w.r.t. R
Joint HH           0.065 w.r.t. τ
Joint HG           0.044 w.r.t. τ
Joint GG           0.041 w.r.t. τ

Table 4: Performance comparisons of marginal and joint AoA pdfs using their entropy-difference-peak values for small θ

AoA pdf difference   Entropy-difference-peak value
H – G                0.06 w.r.t. D
H – G                0.014 w.r.t. R
HH – GH              0.021 w.r.t. τ
HH – GG              0.044 w.r.t. τ
GH – GG              0.025 w.r.t. τ

Based on Figure 14 and Figure 15, usable ranges of τ and the corresponding suitable joint Gaussian and hyperbolic distributions to model the scatterers are given in Table 5. Table 5: Usable ranges of τ and the corresponding joint Gaussian and hyperbolic distributions to model the scatterers for small θ Joint 0≤τ≤1 12 Distribution HH suitable suitable suitable GG — — suitable GH — — suitable

B. For large AoA

For completeness, the joint pdfs of the Gaussian and hyperbolic distributions are given in this section. It should be noted that under this condition, the arrival signals are usually undesirable and therefore should be filtered out before reaching the BS to avoid distortion. By using Eqs. (7), (10) and (11), the joint Gaussian and hyperbolic AoA pdfs when θ is large are given in Eqs. (15) to (17) respectively

\[ p_{GG}(\tau_1,\tau_2) \approx \frac{1}{4\pi^2}\exp(-\tau_1^2-\tau_2^2), \tag{15} \]

\[ p_{GH}(\tau_1,\tau_2) \approx \frac{1}{2\pi^2}\exp(-\tau_1^2)\left[\sum_{m=1,3,\ldots}^{+\infty}(-1)^{\frac{m-1}{2}}\left(\frac{\exp(-m\tau_2^2)}{m}\right)\right], \tag{16} \]

\[ p_{HH}(\tau_1,\tau_2) \approx \left[\frac{1}{\pi}\sum_{m=1,3,\ldots}^{+\infty}(-1)^{\frac{m-1}{2}}\left(\frac{\exp(-m\tau_1^2)}{m}\right)\right]\left[\frac{1}{\pi}\sum_{m=1,3,\ldots}^{+\infty}(-1)^{\frac{m-1}{2}}\left(\frac{\exp(-m\tau_2^2)}{m}\right)\right], \tag{17} \]

which are plotted in Figure 16 to Figure 18 respectively.

Figure 16: A mesh plot of the joint HH AoA pdf as a function of τ1 and τ2 for large θ
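The joint pdfs in Eqs. (15) to (17) can be evaluated numerically by truncating the infinite series. The following is a minimal sketch, not the authors' code: the function names and the truncation length `n_terms` are assumptions made for illustration. As a cross-check, the series is the standard arctangent expansion, so it should agree with arctan(exp(−τ²)).

```python
import numpy as np

def hyperbolic_series(tau, n_terms=2000):
    """Truncated sum over odd m of (-1)^((m-1)/2) * exp(-m*tau^2) / m."""
    tau = np.asarray(tau, dtype=float)
    k = np.arange(n_terms)          # m = 2k + 1 runs over 1, 3, 5, ...
    m = 2 * k + 1
    signs = (-1.0) ** k             # (-1)^((m-1)/2) equals (-1)^k
    terms = signs * np.exp(-m * tau[..., None] ** 2) / m
    return terms.sum(axis=-1)

def p_gg(t1, t2):
    """Joint Gaussian-Gaussian AoA pdf for large theta, Eq. (15)."""
    return np.exp(-np.square(t1) - np.square(t2)) / (4 * np.pi**2)

def p_gh(t1, t2, n_terms=2000):
    """Joint Gaussian-hyperbolic AoA pdf for large theta, Eq. (16)."""
    return np.exp(-np.square(t1)) * hyperbolic_series(t2, n_terms) / (2 * np.pi**2)

def p_hh(t1, t2, n_terms=2000):
    """Joint hyperbolic-hyperbolic AoA pdf for large theta, Eq. (17)."""
    return hyperbolic_series(t1, n_terms) * hyperbolic_series(t2, n_terms) / np.pi**2
```

The series converges quickly away from τ = 0 and only slowly (like the Leibniz series) at τ = 0, which is why a fairly large default truncation is used here.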

Figure 17: A mesh plot of the joint GH AoA pdf as a function of τ1 and τ2 for large θ

Figure 19: Contour plots of the joint AoA pdfs shown in Figure 16 to Figure 18 (panels: joint hyperbolic-hyperbolic, joint hyperbolic-Gaussian and joint Gaussian-Gaussian AoA pdfs; axes: τ1 and τ2)

Figure 20: Entropy of the joint AoA pdfs shown in Figure 16 to Figure 18 (entropy of the HG, HH and GG joint AoA pdfs versus τ for large θ)

From Figure 19, it is clear that the GG AoA pdf is isotropic whereas the GH and HH AoA pdfs are directional, which indicates that the GH and HH configurations can be effective for directional antennas mounted at the UE. The entropies of the GG, GH and HH AoA pdfs are plotted in Figure 20, and Figure 21 gives their entropy differences. From Figure 21, the HH AoA pdf possesses the largest entropy, while the GG AoA pdf has the smallest entropy. However, the GG AoA pdf is isotropic, which validates the trade-off between a joint AoA pdf's symmetry and its entropy stated earlier. It should be noted that the condition of large AoA is non-ideal in practice and the arrival signals should be filtered out to avoid distortion and misleading interpretation. As with the findings in Section A, the hyperbolic distribution is more effective than the Gaussian distribution for modeling the scatterers in a multi-path mobile environment between the BS and the UE. Based on Figure 20 and Figure 21, usable ranges of τ are given in Table 6.

Table 6: Usable ranges of τ and the corresponding joint Gaussian and hyperbolic distributions to model the scatterers for large θ Joint 0≤τ≤ 0.2 ≤ τ 12 Distribution 0.2 ≤1 2 HH suitable suitable suitable suitable GG — suitable suitable suitable GH — — — suitable


Based on Figure 13 and Figure 19, the following table gives the appropriate antenna type to be mounted at the UE when specific joint Gaussian and hyperbolic distributions are employed.

Figure 18: A mesh plot of the joint GG AoA pdf as a function of τ1 and τ2 for large θ

Figure 21: Entropy differences of the joint AoA pdfs shown in Figure 16 to Figure 18 (HH - HG, HH - GG and HG - GG versus τ for large θ)

Table 7: Appropriate antenna type mounted at the UE for specific joint Gaussian and hyperbolic distributions

Joint distribution   Antenna type mounted at the UE
                     Small θ              Large θ
HH                   Omni-directional     Directional
GG                   Directional          Omni-directional
GH                   Omni-directional     Directional

VI. CONCLUSION
It has been shown that the Gaussian and hyperbolic distributions can be effectively used to model random scatterers between a base station and a user equipment's antenna in a multi-path mobile environment. The hyperbolic distribution has been shown to be more robust than the Gaussian distribution in modeling the random scatterers between the base station and the user equipment, with a better fit to the real data. Joint angle-of-arrival Gaussian and hyperbolic probability density functions have been derived as products of their corresponding marginal probability density functions. The hyperbolic marginal and joint angle-of-arrival probability density functions have been shown to possess larger entropy than the Gaussian probability density functions. It has also been shown that the hyperbolic distribution can be used to effectively model the scatterers over large ranges of the coverage distance D and effective radius Reff. The Gaussian distribution has been shown to be of limited effectiveness over large ranges of D and Reff, which makes it less practical than the hyperbolic distribution. Further work on using the Gaussian and hyperbolic distributions to model the scatterers can focus on the following:
1. Estimation of the inter-carrier-interference power caused by Doppler spreading;
2. Effects of omni-directional and directional antennas mounted at the user equipment;
3. Estimation of the received power at the BS when two distributions are used to model the scatterers;
4. Derivation of the joint Gaussian and hyperbolic AoA pdfs for the general case and their entropy estimation;
5. Effects of noise and disturbances on the quality of signal transmission between the BS and the UE.

VII. REFERENCES
Bevan, D. D. N., Ermolayev, V. T., Flaksman, A. G. and Averin, I. M., 2004. Gaussian Channel Model for Mobile Multipath Environment. EURASIP Journal on Applied Signal Processing, vol. 2004, pp. 1321-1329.
Janaswamy, R., 2002. Angle and time of arrival statistics for the Gaussian scatter density model. IEEE Transactions on Wireless Communications, vol. 1, pp. 488-497.
Lathi, B. P., 1998. Modern Digital and Analog Communication Systems. New York, Oxford University Press.
Le, K. N., 2007. A new formula for the angle-of-arrival probability density function in mobile environment. Signal Processing, vol. 87, pp. 1314-1325.
Lee, W. C. Y., 1973. Finding the approximate angular probability density function of wave arrival by using a directional antenna. IEEE Transactions on Antennas and Propagation, vol. AP-21, pp. 328-334.
Loyka, S. and Tsoulos, G., 2002. Estimating MIMO system performance using the correlation matrix approach. IEEE Communications Letters, vol. 6, pp. 19-21.
Ng, W. T. and Dubey, V. K., 2003. Effect of employing directional antennas on mobile OFDM system with time-varying channel. IEEE Communications Letters, vol. 7, pp. 165-167.
Pedersen, K. I., Mogensen, P. E. and Fleury, B. H., 2000. A Stochastic Model of the Temporal and Azimuthal Dispersion Seen at the Base Station in Outdoor Propagation Environments. IEEE Transactions on Vehicular Technology, vol. 49, pp. 437-447.

ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Farah Bouakrif et al.: Trajectory tracking control for robot manipulators using a simple disturbance observer


Trajectory tracking control for robot manipulators using a simple disturbance observer

Farah Bouakrif, Djamel Boukhetala and Fares Boudjema
Laboratoire de Commande des Processus, Ecole Nationale Polytechnique
10, Avenue Pasteur, BP 182, El Harrach, Algiers, ALGERIA

Abstract – In this paper, we derive a simple disturbance observer based tracking control for n-link robot manipulators. The whole control law contains a feedback action plus an updated term that represents the estimated disturbance, which is given by a simple disturbance observer. Using the well-known Lyapunov-based theory, the global asymptotic stability of the whole system is guaranteed, and the unmodeled dynamics and external disturbances are compensated. Finally, simulation results on the PUMA 560 robot are provided to illustrate the effectiveness of the proposed controller when Coulomb and viscous friction is considered as an external disturbance.

Keywords – Disturbance observer, Lyapunov theory, robot manipulator, trajectory tracking control.

I. Introduction
The trajectory tracking control problem for rigid robot manipulators has been solved using several efficient classical and robust methods, and it has been shown that each control strategy ensures the stability of the trajectory tracking error in some suitable sense [1,2]. A great variety of these controllers are model-based. Hence, exact knowledge of the robot dynamics is required to implement these control methods. In practice, there is always some parametric uncertainty in the dynamics of a robot. As a natural consequence, the robot control problem in the presence of model uncertainties has been analyzed extensively. There are basically two underlying philosophies for the control of uncertain systems: the adaptive control philosophy and the robust control philosophy. In the adaptive approach, one designs a controller that attempts to "learn" the uncertain parameters of the particular system and, if properly designed, will eventually be a best controller for the system in question; a discussion of adaptive controllers in robotics may be found in [3]. In the robust approach, the controller has a fixed structure that yields "acceptable" performance for a given plant-uncertainty set; a comprehensive survey of robust control theory is available in [4]. Many adaptive robot control schemes assume that the structure of the manipulator dynamics is known and/or that the unknown parameters influence the system dynamics in an affine manner [5]. It has also been demonstrated [6] that these adaptive controllers may lack robustness against unmodeled dynamics, sensor noise, and other disturbances.

Recently, several researchers have applied disturbance observer (DO) theory to the design of tracking control algorithms for robot manipulators. Nakao et al. [7] first proposed the concept of the DO for compensating unknown disturbances. Furthermore, friction is a common phenomenon in mechanical systems. One of the most promising methods for dealing with it is observer-based control, where a variable structure DO has been proposed [8], and a nonlinear observer for a special kind of friction, i.e., Coulomb friction, has been proposed by Friedland and Park [9]. It has been further modified and implemented on robotic manipulators by Tafazoli et al. [10]. In [11], Chen et al. proposed a DO, but the observer was not combined with the controller. A DO-based control approach for nonlinear systems has been proposed in [12], but only a semi-global stability condition for the whole system has been established.

In this paper, we present a simple disturbance observer based tracking control design for n-link robot manipulators with model uncertainty and subject to external disturbances. Using the Lyapunov method, the global asymptotic stability of the whole system is established. Simulation results on the PUMA 560 robot show the convergence of the tracking error to zero when Coulomb and viscous friction is considered as an external disturbance.

II. Dynamic equations of robot manipulators
The nominal dynamic equations of robot manipulator can be described by

\[ \tau = M_n(q)\ddot{q} + C_n(q,\dot{q})\dot{q} + G_n(q) \tag{1} \]

where $q(t), \dot{q}(t), \ddot{q}(t) \in R^n$ denote the link position, velocity, and acceleration vectors, respectively, the subscript $(\cdot)_n$ denotes nominal functions, $M_n(q(t)) \in R^{n\times n}$ represents the link inertia matrix, $C_n(q(t),\dot{q}(t)) \in R^{n\times n}$ represents centripetal-Coriolis matrix, $G_n(q(t)) \in R^{n\times 1}$ represents the gravity effects, and $\tau(t) \in R^{n\times 1}$ represents the torque input vector. The dynamic equation of the true plant is assumed to be

\[ \tau = M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) + w(q,\dot{q},t) \tag{2} \]

where $M(q) = M_n(q) + \Delta M(q)$, $C(q,\dot{q}) = C_n(q,\dot{q}) + \Delta C(q,\dot{q})$, $G(q) = G_n(q) + \Delta G(q)$ are the real system matrices. $w(q,\dot{q},t) \in R^n$ represents the disturbance vector. Therefore, the dynamic equation of the true plant is

\[ \tau = M_n(q)\ddot{q} + C_n(q,\dot{q})\dot{q} + G_n(q) + d(q,\dot{q},\ddot{q},t) \tag{3} \]

where

\[ d(q,\dot{q},\ddot{q},t) = \Delta M(q)\ddot{q} + \Delta C(q,\dot{q})\dot{q} + \Delta G(q) + w(q,\dot{q},t) \tag{4} \]

The dynamic equation of (3) has the following properties [13].
P1. $M_n(q)$ is symmetric and positive definite matrix, for all $q \in R^n$.
P2. The inertia and centripetal-Coriolis matrices satisfy the following skew-symmetric matrix

\[ X^T\left(\dot{M}_n(q,\dot{q}) - 2C_n(q,\dot{q})\right)X = 0 \quad \forall X \in \Re^n. \tag{5} \]

In this paper, the following lemma are used
Lemma 1. [14, p.93] Consider the stable linear system

\[ \dot{e} = Ae + Br. \tag{6} \]

If $r \in L_2$ then $e \in L_2 \cap L_\infty$ and $e \to 0$.
Proof See [14, p.93].
Lemma 2. [15, p.31] Consider the continuous function $f : R_+ \to R^n$, if $f, \dot{f} \in L_\infty^n$ and $f \in L_2^n$, then $\lim_{t\to\infty} f(t) = 0$.
In addition, for symmetric matrix $H \in R^{n\times n}$, $\lambda_{\max}(H)$ and $\lambda_{\min}(H)$ are maximum and minimum eigenvalues of H respectively, and the norm of a vector X is defined as $\|X\| = \sqrt{X^T X}$.

III. Disturbance observer based tracking control
Now, one can state the following result
Theorem Given the robot dynamics (3). Under the following control law

\[ \tau = M_n(q)\left(\ddot{q}_d + K_v\dot{E}(t) + K_pE(t)\right) + C_n(q,\dot{q})\dot{q}_d + G_n(q) + \hat{d}(q,\dot{q},t) \tag{7} \]

with the disturbance estimation $\hat{d}(q,\dot{q},t)$ is obtained from

\[ \dot{z} = \phi\left(\ddot{q}_d + K_pE\right) \tag{8} \]

\[ \hat{d} = z - \phi\dot{q}. \tag{9} \]

If

\[ \psi_m \ge 2\,\frac{\left(\lambda_{\max}(MK_p) + \gamma_M\right)^2}{\lambda_{\min}(K_vM)}, \]

then $\lim_{t\to\infty} E(t) = 0$, and $\lim_{t\to\infty} E_d(t) = 0$.
Where $\phi = \mu^{-1}\psi K_p^{-1}$, $\mu = \mu_1 I_{n\times n}$, $\psi = \psi_1 I_{n\times n}$, $\gamma = \gamma_1 I_{n\times n}$, $K_p = k_p I_{n\times n}$, $K_v = k_v I_{n\times n}$, $(\mu_1, \psi_1, k_p, k_v, \gamma_1)$ are positive constants, with $\gamma_1 > \mu_1$. $q_d(t), \dot{q}_d(t), \ddot{q}_d(t) \in R^n$ denote the desired position, velocity, and acceleration vectors, respectively, $E(t) = q_d(t) - q(t)$, $\dot{E}(t) = \dot{q}_d(t) - \dot{q}(t)$, $I_{n\times n} \in R^{n\times n}$ is an identity matrix, $E_d(t) = \hat{d}(q,\dot{q},t) - d(q,\dot{q},\ddot{q},t)$, $\psi_m = \lambda_{\min}(\psi)$, and $\gamma_M = \lambda_{\max}(\gamma)$.

Proof Substituting (7) to (3), we obtain

\[ M_n\left(\ddot{E}(t) + K_v\dot{E}(t) + K_pE(t)\right) + C_n\dot{E}(t) = -E_d(t). \tag{10} \]

We define $y^T = \left[\dot{E}^T \;\; E^T \;\; (F - E_d)^T\right]$, and choose as a Lyapunov function candidate

\[ V(y) = \tfrac{1}{2}\,y^T P\,y \tag{11} \]

where

\[ P = \begin{bmatrix} M_n & 0 & 0 \\ 0 & \gamma & \mu \\ 0 & \mu & \mu \end{bmatrix}. \tag{12} \]

With $\gamma > \mu$, and the vector F is to be determined. Hence

\[ V = \tfrac{1}{2}\dot{E}^T M_n \dot{E} + \tfrac{1}{2}E^T\gamma E + \tfrac{1}{2}E^T\mu(F - E_d) + \tfrac{1}{2}(F - E_d)^T\mu E + \tfrac{1}{2}(F - E_d)^T\mu(F - E_d). \tag{13} \]
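The role of the condition $\gamma > \mu$ can be made explicit with a short worked step. The following is a sketch under the theorem's assumption that $\gamma = \gamma_1 I_{n\times n}$ and $\mu = \mu_1 I_{n\times n}$ with positive constants; the shorthand $e = F - E_d$ is the one used later in the proof. Completing the square in (13) gives

\[
\begin{aligned}
2V &= \dot{E}^T M_n \dot{E} + \gamma_1\|E\|^2 + 2\mu_1 E^T e + \mu_1\|e\|^2 \\
   &= \dot{E}^T M_n \dot{E} + (\gamma_1 - \mu_1)\|E\|^2 + \mu_1\|E + e\|^2 .
\end{aligned}
\]

Each term is non-negative when $M_n$ is positive definite (property P1) and $\gamma_1 > \mu_1 > 0$, and the sum vanishes only for $\dot{E} = 0$, $E = 0$ and $e = 0$, so V is a positive definite function of y.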

Since, in general, there is no prior information about the derivative of the disturbance d, it is reasonable to suppose that

\[ \dot{d} = 0 \tag{14} \]

which implies that the disturbance varies slowly relative to the observer dynamics. Taking the time derivative of (13) along (10) and (14), and using property P2, we obtain



& & V& = E T  µF& − µdˆ  + F T  µF& − µdˆ + µE&      &ˆ  T & T & & & − Ed  µF − µd + µE + E  − E MK v E&   T T & & − E MK E + E γ E

ψm ≥ 2

(15)

V& ≤ −φm E&

Choosing µF& − µd&ˆ + µE& = −φ −1F  &ˆ  & = −ψE  µF − µd &ˆ  & & &  µF − µd + µE + E = 0

(16)

where µ and ψ are diagonal, positive definite matrices, and φ is symmetric, positive definite matrix to be determined. Hence F =φ −1E&

(17)

(24)

λmin ( MKv )

then, we have

p

2

−

ψm 2

E

2

(25)

From (18), we note that the disturbance estimation is not practical to implement, because, the acceleration && is not available in many robotic manipulators, signal q and it is also difficult to construct the acceleration signal from the velocity signal by differentiation due to measurement noise. For landing this problem, we define an auxiliary variable vector z =dˆ + p(q,q&) (26) where z∈R 2 . The designed function vector p(q,q&) is to be determined.

Analysis of stability

and & dˆ = φ E&& + α E& + µ −1ψE

(

(18)

)

part, we demonstrate that lim Ed (t)→0 . t →∞

Therefore, we have V& = − E& T φ E& − E TψE − E& T MK v E& − E T MK p E& + E T γ E& .

(19) Using theorem of Rayleigh-Ritz [15, p.38], becomes V& = −φm E&

2

−ψ m E

2

− λmin ( MK v ) E&

2

.

+ 2 δ E E&

where δ =

(20)

n

sufficient to show that E∈L2 , to conclude that lim E(t)→0 . n

E∈L2 if there exists some constant, ξ , such that

We note that

2

∞

1/ 2

 2  ≤ 2 E& δ   ψ m 

2δ E& E

ξ≥

1/ 2

ψ m   2   

E

(21)

−

ψm 2

E

2

 δ  & E −  λmin ( MK v ) − 2   ψ m 

dt .

(27)

We note that φm E&

2

> 0.

(28)

From (25) and (28), we can write

Then, we obtain 2

∫ E (t ) 0

2 2 2ψm ≤ E& δ 2 + E . ψm 2

V& ≤ −φm E&

Part 1 From (25), V& is a negative semi-definite function, this result is not sufficient to demonstrate the asymptotic stability, and we can conclude only the stability of the system ( E& , E and ( F − Ed ) are bounded). Therefore, the lemma 2 is required to complete the proof of asymptotic stability, and it is

t →∞

M M K vM + γ M . 2

2

The Analysis of stability is in two parts. In the first part we demonstrate that lim E(t)→0 , and in the second t →∞

where α = d φ −1(t) . dt

2

.(22)

Choosing

d ψ 2 V ( E& (t ), E (t ), e(t )) ≤ − m E (t ) , ∀E(t)∈R n dt 2

(29)

where e(t)= F(t)− Ed (t) . Integrating the two members of (29), we obtain

λmin ( MKv ) ≥ 2

thus

δ2

2

δ ψm

(23)

V ( E& ( ∞ ), E ( ∞ ),e (∞ ))

∫

V ( E& ( 0), E (0),e( 0))

dV ( E& (t ), E (t ), e(t )) ≤ −

ψm 2

∞

∫ 0

2

E (t ) dt (30)



(

hence

)

where Γ = −φ M n−1 , Θ = − φ −1M n−1Cn + K v .

V ( E& (∞), E (∞), e(∞ )) − V ( E& (0), E (0), e(0)) ≤ −

ψm 2

∞

∫ E (t )

2

dt.

0

(31) We note that V(E& (t),E(t),e(t)) is function, then

a positive definite

Since M n is symmetric, positive definite matrix, φ is diagonal, positive definite matrix, it is clear that Γ is a stable matrix. Therefore, using lemma 1, it is sufficient to show that n E&∈L2 , to conclude that lim Ed (t)→0 . t →∞

n

V(E& (∞),E(∞),e(∞))≥0 .

(32)

E&∈L2 if there exists some constant, λ , such that

From (31) and (32), we can conclude that ψ − V ( E& (0), E (0), e(0)) ≤ − m 2

∞

∫ E (t )

2

∞

λ ≥ ∫ E& (t) dt .

(43)

0

2

(33)

dt

We note that

ψm

0

E

2

hence

2

≥0.

(44)

From (16), (20) and (44), we can write ∞

ξ≥

∫ E (t )

2

0

V ( E& (0), E (0), e(0))

ξ=

where

ψm /2

d V ( E& (t ), E (t ), F (t ) − Ed (t )) ≤ −φm E& dt

(34)

dt

2

.

(45)

Using same idea of the part 1, we find ,

which

implies

that

∞

2 λ ≥ ∫ E& (t) dt

n

(46)

0

E∈L2 , then lim E(t)→0 . t →∞

Part 2 Let the function p(q,q&) in (26) be given by the following equation p(q, q& ) = φ q& (35)

hence

where

λ=

V(E& (0), E(0),e(0))

φm

, and φm denotes the

minimum eigenvalue of φ . n Therefore E&∈L2 , and then lim Ed (t)→0 . t →∞

dp (q, q& ) = φ q&& + α q& dt

(36) where α =

This completes the proof.

d (φ ) . dt

IV. Numerical simulation results

Invoking (26) and (36) with (18) yields & dp(q,q&) z& =dˆ + dt

Consider the first three joints (waist q1 , shoulder q 2 , elbow q 3 ) of the PUMA 560 arm. The dynamic

−1

= φ q&&d + α q&d + µ ψE .

(38)

model for PUMA 560 can be written as (3). The elements of M (q) , C (q, q& ) and G (q) are given in appendix. In the following we will compare the simulation results of the two cases (nominal case and true plant). The nominal parameter values are assumed to be

(39)

m1 =0.4[kg], m2 =0.6[kg], l1 =1[m], l2 =1.4[m]

(37)

From (10), we have q&& = q&&d + K v E& + K p E + M n−1Cn E& + M n−1Ed .

From (14), (36), (37) and (38), we obtain & E& d = dˆ

(

) (

)

= − K v + φ M n−1Cn − α E& + µ −1ψ − φ K p E . −φ

M n−1Ed

The true plant parameters are assumed to be m1 = 0.8[kg ], m2 = 1.2[kg ], l1 = 2[ m], l2 = 2.8[m] .

Choosing −1

K −p1

(40)

d (φ ) =0. dt

(41)

φ=µ ψ

thus α=

From (39), (40) and (41), we have E& d =ΓEd +Θ E&

(42)
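To make the structure of the control law (7) and the observer (8)-(9) concrete, the following minimal sketch simulates a single link with $C_n = G_n = 0$, so that the plant (3) reduces to $m\ddot{q} = \tau - d$ with a constant disturbance d, in line with assumption (14). It is an illustration under stated assumptions, not the PUMA 560 simulation of this paper: the mass, gains, trajectory, step size and explicit Euler integration are all hypothetical choices, with $\phi = \mu_1^{-1}\psi_1 k_p^{-1} = 10$ corresponding for example to $\mu_1 = 1$, $\psi_1 = 1000$, $k_p = 100$.

```python
import math

def simulate(m=1.0, kp=100.0, kv=20.0, phi=10.0, d=2.0, dt=2e-4, t_end=15.0):
    """Single-link sketch: m*q'' = tau - d, with d constant and unknown to the controller.

    All numerical values are hypothetical. Returns the final tracking error E
    and the final disturbance estimation error E_d = d_hat - d.
    """
    q, qdot, z = 0.0, 0.0, 0.0                  # plant state and observer state
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        # Desired trajectory q_d(t) = sin(t) and its derivatives.
        qd, qd_dot, qd_ddot = math.sin(t), math.cos(t), -math.sin(t)
        E, E_dot = qd - q, qd_dot - qdot
        d_hat = z - phi * qdot                               # Eq. (9)
        tau = m * (qd_ddot + kv * E_dot + kp * E) + d_hat    # Eq. (7) with C_n = G_n = 0
        q_ddot = (tau - d) / m                               # plant, Eq. (3)
        # Explicit Euler updates (all right-hand sides use the old state).
        z += dt * phi * (qd_ddot + kp * E)                   # Eq. (8)
        q += dt * qdot
        qdot += dt * q_ddot
    t = steps * dt
    return math.sin(t) - q, (z - phi * qdot) - d

if __name__ == "__main__":
    E_final, Ed_final = simulate()
    print(f"tracking error: {E_final:.2e}, estimation error: {Ed_final:.2e}")
```

Note that the observer needs only the measured velocity and the desired trajectory, not the acceleration, which is the reason for introducing the auxiliary variable z in (26).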

The Coulomb and Viscous friction is considered as an external disturbance. a. Friction simulation The external disturbance considered is Coulomb and Viscous friction, given by



\[ d(\dot{q}) = c_1\,\mathrm{sign}(\dot{q}) + c_2\,\dot{q} \tag{47} \]

The parameters for first and second links in the simulation are given by

\[ c_1 = [0.941 \;\; 0.176 \;\; 0.941]^T \;\; \mathrm{N.m} \tag{48} \]

N.m/rad/s. (49)
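As an illustration of the friction model (47), the sketch below evaluates it element-wise for a vector of joint velocities. The Coulomb coefficients are those of Eq. (48); the viscous coefficients `C2` are hypothetical placeholders, since the values of Eq. (49) are not reproduced in the text above. The function name is also an assumption.

```python
import numpy as np

C1 = np.array([0.941, 0.176, 0.941])   # Coulomb coefficients from Eq. (48), N.m
C2 = np.array([0.1, 0.1, 0.1])         # HYPOTHETICAL viscous coefficients, N.m/rad/s

def friction(qdot, c1=C1, c2=C2):
    """Coulomb plus viscous friction of Eq. (47), evaluated per joint."""
    qdot = np.asarray(qdot, dtype=float)
    return c1 * np.sign(qdot) + c2 * qdot

if __name__ == "__main__":
    # The Coulomb term jumps by 2*c1 across zero velocity.
    print(friction([1e-9, 1e-9, 1e-9]) - friction([-1e-9, -1e-9, -1e-9]))
```

The jump across zero velocity printed by the usage example is the discontinuity that makes direct use of (47) in a fixed-step simulation awkward.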

There are some problems in using the friction model (47) directly in simulation. One is that, because the friction characteristic is discontinuous at zero velocity, a very small step size is required to detect zero velocity. The other is that when the velocity is zero, i.e. the system is stationary, the friction is indeterminate and depends on the controlled torque. In the simulation, to improve the numerical efficiency, a revised friction model, which is modified from [16], is adopted. The revised friction model can be described by

\[ d_r = d + \left(T_r - d\right)e^{-(\dot{q}/l)^2} \tag{50} \]

where d is given by (47), d r is the revised friction, l is a small positive scalar, and Tr is given by t >K  K (51) Tr (t)=t − K ≤t ≤ K  − K t 0. Therefore g is an increasing function. The l’hˆ opital theorem implies g → 0 as x → 0+ . Clearly x < π2 implies g(x) < g( π2 ) = π2 . Now (2n+1)π ) ∈ (0, π2 ) which completes this we note that bN n = g( N proof.

4 −1 4 X sin(ω(2n + 1)) . π n=0 π(2n + 1)

It is well known that for N → ∞, ΨN (ω) converges uniformly everywhere except at multiples of π, to a unit valued square wave with period 2π. We claim that ΦN (ω) for N → ∞, converges uniformly everywhere except at multiples of π, to zero. In order to prove this, we consider the interval [0, 2π]. π N N Define bN n = π(2n+1) − cot( N (2n + 1)), for 0 ≤ n ≤ 4 − 1. 2 N N We first need to show that bN 0 ≤ b1 ≤ · · · · · · ≤ b N −1 ≤ π . 4

We acknowledge assistance provided by Professor Michael Wilson of the UVM Department of Mathematics and Statistics, in helping derive the proof for convergence in Theorem 1.

REFERENCES
[1] M. Elfataoui, G. Mirchandani, “A Frequency Domain Method for Generation of Discrete-Time Analytic Signals,” IEEE Trans. Signal Processing, Vol. 54, No. 9, Sept. 2006, pp. 3343-3352.
[2] M. Elfataoui, G. Mirchandani, “A Novel Method for Generating Complex Half-Band Filters,” Proceedings, ICASSP 2005, Philadelphia, PA, pp. IV-381 - IV-384, May 2005.
[3] M. Elfataoui, G. Mirchandani, “Discrete-Time Analytic Signals With Improved Shiftability,” Proceedings, ICASSP 2004, Montreal, CA, pp. II-477 - II-480, May 2004.
[4] Mohamed Elfataoui, “Discrete-time Analytic Signals With Improved Shiftability,” M.Sc. Thesis, The University of Vermont, Feb. 2004. www.cems.uvm.edu/∼mirchand/publications/me.pdf.
[5] Felix C. A. Fernandes, “Directional, Shift-Insensitive, Complex Wavelet Transforms with Controllable Redundancy,” PhD. Thesis, Rice University, Jan. 2002.
[6] S. L. Hahn: Hilbert Transforms in Signal Processing, Artech House, Norwood, MA, Dec. 1996.
[7] N. Kingsbury, “Shift Invariant Properties of the Dual-tree Complex Wavelet Transform,” Proceedings ICASSP 1999, vol. 3, pp. 1221-1224, March 1999.
[8] S. Mallat: A Wavelet Tour of Signal Processing, 2nd edition, Academic Press, London, UK, Sep. 1999.
[9] S. L. Marple, Jr., “Computing the Discrete-time Analytic Signal via the FFT,” IEEE Trans. Signal Processing, vol. 47, No. 9, pp. 2600-2603, Sep. 1999.
[10] S. K. Mitra, Digital Signal Processing - A Computer-Based Approach, McGraw-Hill, New York, N.Y., Third Edition, 2006.
[11] A. Oppenheim, R. Schafer, Discrete-time Signal Processing, Prentice Hall, Second Edition, 1999.
[12] A. Reilly, G. Frazer and B. Boashash, “Analytic Signal Generation - Tips and Traps,” IEEE Trans. Signal Processing, vol. 42, No. 11, pp. 3241-3245, Nov. 1994.

ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Gagan Mirchandani and Mohamed Elfataoui: A Uniformly Convergent Approximation for Ideal Complex Half-Band Filters


[13] I. W. Selesnick, “Hilbert Transform Pairs of Wavelet Bases,” IEEE Signal Processing Letters, vol. 8, No. 6, pp. 170-173, Jun. 2001.
[14] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger, “Shiftable multi-scale transforms,” IEEE Trans. Inform. Theory, 38(2), pp. 587-607, Mar. 1992.
[15] G. Strang, T. Nguyen: Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, 1996.

TABLE IV
RATIO OF TOTAL VARIATION WITH AN IMPULSE FUNCTION.

Ratio of ehilbert and hilbert
L-1 highpass   0.78302
L-3 lowpass    0.77648

Gagan Mirchandani was born in Mussoorie and attended the Doon School, Dehra Dun, both in India. He received his M.S. and Ph. D. degrees in Electrical Engineering from Syracuse and Cornell respectively. He has been a professor of Electrical Engineering at The University of Vermont since 1979, where he also holds a secondary appointment as professor of Computer Science. He has coauthored more than 40 technical papers and was coauthor of the paper that won the Best Paper award in the IEEE Transactions on Education in 1977. His research interests, stemming from his circuit theory background, are in signal processing, multiscale analysis and related theoretical aspects. Current applications include the assessment of atrial electrophysiology, measurement of protein-protein colocalization in live cells and automated means of protein spot detection.

Fig. 1. Magnitude frequency response for the five methods (legend: New method; Hilbert − Equiripple; Hilbert − Fourier series; LP − Equiripple; LP Daubechies. Axes: Magnitude versus Normalized Frequency (×π rad/sample))

TABLE I
RATIO OF TOTAL VARIATION WITH AN IMPULSE FUNCTION.

Ratio of new method and   LP daub    HT windowed   HT equiripple   LP equiripple
Level-1 highpass          0.00017    0.03168       0.09391         Inf
Level-2 highpass          0.73242    0.89389       0.87422         0.71158
Level-3 highpass          0.34227    0.87695       0.84777         0.79030
Level-3 lowpass           0.07561    0.70610       0.65434         0.07561

TABLE II
RATIO OF TOTAL VARIATION WITH A STEP FUNCTION.

Ratio of new method and   LP daub    HT windowed   HT equiripple   LP equiripple
Level-1 highpass          0.99311    0.99796       0.97564         0.99228
Level-2 highpass          0.95226    1.00291       0.99394         0.99975
Level-3 highpass          0.56999    1.02911       1.01776         1.01105
Level-3 lowpass           0.79540    1.05536       1.09391         0.79540

TABLE III
RATIO OF TOTAL VARIATION WITH A FRACTAL SIGNAL.

Ratio of new method and   LP daub    HT windowed   HT equiripple   LP equiripple
Level-1 highpass          0.99682    1.00550       0.98464         1.00345
Level-2 highpass          0.98198    0.98648       0.97623         0.98749
Level-3 highpass          0.96633    0.99835       0.98521         0.99204
Level-3 lowpass           1.16566    1.03330       1.06701         1.16566

Mohamed Elfataoui was born in Demnate, Morocco. He received the B.Sc. degree in Applied Mathematics from the University Cadi Ayyad, Marrakesh,Morocco in 1996. He received his M.Sc. and Ph.D. degrees in Electrical Engineering from The University of Vermont, Burlington, Vermont in 2004 and 2007 respectively. His research interests include theoretical and applied aspects of statistical pattern recognition and multiresolution analysis, with primary applications to image processing. He is currently working with Ascension Technology Corporation in Burlington, VT.


ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 M. R. Shirazi et al.: Effect of Brillouin Pump Linewidth on the Performance of Brillouin Fiber Lasers

Effect of Brillouin Pump Linewidth on the Performance of Brillouin Fiber Lasers

M. R. Shirazi1, S. W. Harun, M. Biglary1, K. Thambiratnam1, and H. Ahmad1

Abstract— The effect of the pump linewidth on the performance parameters of Brillouin fiber lasers (BFLs) such as the linewidth, output power, and threshold, as well as the suitability of the BFL in measuring narrow laser linewidths, is demonstrated. For Brillouin pump (BP) linewidths of 15.2 MHz and 124 MHz, BFL linewidths of 8 Hz and 24 Hz respectively were obtained from heterodyne beat signals. As the BFL linewidth decreases, the corresponding BFL threshold also decreases, resulting in the BFL power for the BFL linewidth of 8 Hz being 9.7 dB higher than the power for the BFL linewidth of 24 Hz at the same BP power of 14.3 dBm in a pure silica single-mode fiber. These results demonstrate ultra-narrow BFL linewidths and the role of a narrow BP linewidth in the development of high performance BFLs.

Index Terms—Brillouin scattering, fiber laser, linewidth, BFL

I. INTRODUCTION

Stimulated Brillouin Scattering (SBS) is a nonlinear effect that results from the interactions between the intense pump light and acoustic waves in a single-mode fiber (SMF), subsequently giving rise to backward propagating frequency-downshifted light [1]. The thermally excited acoustic waves generate an index grating that co-propagates with the pump at the acoustic velocity in the SMF. This moving grating reflects the pump light and causes the backscattered light to experience a downshift in the frequency as a result of the Doppler effect. The shift in frequency with respect to the pump is given by ΔνBS = (2VA / c) νP, where VA is the acoustic velocity in the fiber and νP is the optical frequency of the pump beam. The frequency shift in the 1550 nm region is approximately 10 GHz (0.08 nm). Although SBS generation can be detrimental in coherent optical communication systems [2], it does serve useful purposes, for instance in producing Brillouin fiber lasers (BFL) [3]. BFLs are highly coherent light sources and have generated increasing interest for a number of applications such as gyroscopes and sensors due to their extremely narrow linewidths [4].
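The correspondence between a 10 GHz frequency shift and a wavelength shift of about 0.08 nm near 1550 nm can be checked with the small-shift relation Δλ ≈ λ²Δν/c. The sketch below is only a numerical check; the function name is an assumption made for illustration.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def wavelength_shift_nm(freq_shift_hz, wavelength_nm):
    """Approximate wavelength shift (nm) for a small optical frequency shift."""
    lam = wavelength_nm * 1e-9                 # wavelength in metres
    return lam**2 * freq_shift_hz / C * 1e9    # convert metres back to nm

if __name__ == "__main__":
    print(f"{wavelength_shift_nm(10e9, 1550.0):.4f} nm")
```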

In this paper, the effect of the pump linewidth on the performance of the BFL is demonstrated after measuring the narrow and wide Brillouin pump linewidths. The output power, linewidth, and threshold of the BFL are compared for the two different Brillouin pump linewidths.

II. EXPERIMENTAL SETUP

The experimental setup is shown in Fig. 1 with a coupler and a long single-mode fiber (SMF) acting as a resonator. The BFL is pumped by an external cavity tunable laser source (TLS) which is amplified by an erbium-doped fiber amplifier. The maximum power of the amplified Brillouin pump (BP) is approximately 14.3 dBm. The BP is injected into the resonator from port 1 through port 2 of the optical circulator in a clockwise direction. The generated backward-propagating SBS oscillates inside the resonator in an anti-clockwise direction to generate the backward BFL, which is coupled out via a 3-dB coupler. From port 2, the BFL is then routed into an optical spectrum analyzer (OSA) through port 3 of the optical circulator. The SMF used in the experiment is 25 km in length and has a cut-off wavelength of 1161 nm with a zero dispersion wavelength of 1315 nm and a mode field diameter of 9.36 μm. The output laser is characterized using an OSA with a resolution of 0.015 nm while the linewidth of the laser is measured using an optical spectrum analyzer and the heterodyne beat technique [5].

Fig. 1: Experimental set up (labels: BP, optical circulator with ports 1 to 3, coupler, SMF 25 km, OSA)

Manuscript received May 7, 2007. This work was done in the Department of Physics, Faculty of Science, University of Malaya, Malaysia. The authors are with the Photonics Research Center, Department of Physics, University of Malaya, 50603 Kuala Lumpur, Malaysia (phone and fax: 603-79674290; e-mails: [email protected], [email protected]). S.W. Harun is now with the Department of Electrical Engineering, Faculty of Engineering, University of Malaya, 50603 Kuala Lumpur, Malaysia (e-mail: [email protected]).

III. RESULTS AND DISCUSSION

Fig. 2 and Fig. 3 compare the BFL output spectra at the two BP linewidth settings for BP powers of 4.5 dBm and 14.3 dBm, respectively. Both figures show three simultaneous lines: the anti-Stokes line at approximately 1549.9 nm, the BP reflection at 1550 nm and the BFL line at around 1550.1 nm, so that the Brillouin shift is approximately 0.086 nm. The anti-Stokes signal is observed at a shorter wavelength due to four-wave mixing between the BP and the Stokes line. Although the anti-Stokes power remains almost unchanged, the powers of the BP reflection and the BFL line increase as the BP power increases, due to Brillouin-induced crosstalk between the lines [6]. With the wide BP linewidth setting, the BFL power is lower than with the narrow BP linewidth setting, as depicted in Fig. 2.

BFL are the same, their frequencies are slightly different. The heterodyne beat spectrum between the pump and the BFL is shown in Fig. 4 for the narrow and wide linewidth pump settings. The measured pump linewidths are 15 MHz and 124 MHz.

Fig. 2: The BFL output spectrum (power in dBm against wavelength in nm) for different BP linewidths (15 MHz and 124 MHz) at a BP power of 4.5 dBm.

The maximum BFL power, shown in Fig. 3, is 8.5 dBm. It is obtained using the narrow BP linewidth setting (15 MHz) at a BP power of 14.3 dBm, and is 9.7 dB higher than the maximum BFL power obtained using the wide BP linewidth setting (124 MHz) at the same BP power. Thus the pump conversion efficiency increases from 2.82% to 26.3% when the BP linewidth is changed from wide to narrow.
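The conversion efficiencies quoted above follow directly from the dBm values. The sketch below reproduces the arithmetic, taking the efficiency to be the ratio of BFL output power to BP power in linear units. That definition and the function names are assumptions made for illustration.

```python
# Pump conversion efficiency from powers given in dBm.
# Assumption: efficiency = P_BFL / P_BP with both powers in linear units.


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power in dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


def conversion_efficiency(bfl_dbm: float, pump_dbm: float) -> float:
    """Ratio of BFL output power to pump power (dimensionless)."""
    return dbm_to_mw(bfl_dbm) / dbm_to_mw(pump_dbm)


PUMP_DBM = 14.3                      # BP power from the text
BFL_NARROW_DBM = 8.5                 # BFL power with the 15 MHz pump
BFL_WIDE_DBM = BFL_NARROW_DBM - 9.7  # 9.7 dB lower with the 124 MHz pump

eff_narrow = conversion_efficiency(BFL_NARROW_DBM, PUMP_DBM)
eff_wide = conversion_efficiency(BFL_WIDE_DBM, PUMP_DBM)

if __name__ == "__main__":
    print(f"narrow pump: {100 * eff_narrow:.1f} %")
    print(f"wide pump:   {100 * eff_wide:.2f} %")
```

The two ratios agree with the 26.3% and 2.82% stated in the text.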

Fig. 3: The BFL output spectrum (power in dBm against wavelength in nm) for different BP linewidths (15 MHz and 124 MHz) at a BP power of 14.3 dBm.

Fig. 4: Heterodyne beat spectra between the narrow BFL and the BP at (a) narrow (15 MHz) and (b) wide (124 MHz) BP linewidth settings.

In the BP linewidth measurement, we use a 3-dB coupler to obtain the heterodyne beat frequency between the pump and the narrow BFL generated by the narrow linewidth pump setting. In accordance with the theory [7], this approximation is permissible because the BFL linewidth is negligible in comparison to the BP linewidth. While the powers of the pump and the

To obtain the BFL linewidth, two similar but independent BFL configurations are used to generate the heterodyne beat spectrum in the 3-dB coupler. As before, the two BFLs have the same power but slightly different frequencies. Fig. 5 shows the heterodyne beat spectra of the two BFLs generated with BP linewidths of 15 MHz and 124 MHz, giving measured BFL linewidths of 8 Hz and 24 Hz, respectively.


Fig. 6: BFL peak power (dBm) as a function of BP power (dBm) at different BP linewidths (15 MHz and 124 MHz).

IV. CONCLUSION

The effect of the BP linewidth on the power, linewidth, and threshold of the BFL is demonstrated. At a BP power of 14.3 dBm, a BFL power of 8.5 dBm is obtained at a BP linewidth of 15 MHz, which is 9.7 dB higher than the BFL power at a BP linewidth of 124 MHz. The measured BFL linewidth narrows from 24 Hz to 8 Hz as the BP linewidth is changed from 124 MHz to 15 MHz, and the corresponding BFL threshold decreases from approximately 11 dBm to 2 dBm. Thus, these results show ultra-narrow BFL linewidths and how narrowing the BP linewidth improves the performance of the BFL output.

REFERENCES

Fig. 5: Heterodyne beat spectra between the two similar independent BFLs generated by the BP linewidths (a) 15 MHz and (b) 124 MHz.

Finally, Fig. 6 shows the BFL power against the BP power at BP linewidths of 15 MHz and 124 MHz. As the figure shows, the narrow linewidth contributes to a higher BFL output power and a lower BFL operating threshold. In the low BP power range, the BFL power for both BP linewidths increases linearly as the BP power increases. However, once the BP power exceeds a critical power, the BFL power increases rapidly. The slope of the graph is measured to be about 1.24 and 1.21 before the critical power levels, with approximately the same values after them, for the BP linewidths of 15 MHz and 124 MHz, respectively. The SBS threshold power is defined as the BP power at this critical power level; the SBS thresholds, and hence the BFL thresholds, are therefore observed at approximately 2 dBm and 11 dBm for BP linewidths of 15 MHz and 124 MHz, respectively.

[1] G. P. Agrawal, Nonlinear Fiber Optics, 3rd ed. San Diego, CA: Academic, Ch. 9, 2001.
[2] A. R. Chraplyvy, “Limitations in lightwave communications imposed by optical-fiber nonlinearities,” J. Lightwave Technol., vol. 10, pp. 1548–1557, 1990.
[3] K. O. Hill, B. S. Kawasaki, and D. C. Johnson, “CW Brillouin laser,” Appl. Phys. Lett., vol. 28, pp. 608–609, 1976.
[4] S. P. Smith, F. Zarinetchi, and S. Ezekiel, “Narrow-linewidth stimulated Brillouin fiber laser and applications,” Opt. Lett., vol. 16, pp. 393–395, 1991.
[5] W. V. Sorin, Fiber Optic Test and Measurement, D. Derickson, Ed. Upper Saddle River, NJ: Prentice Hall, 1998, ch. 10.
[6] G. P. Agrawal, Fiber Optic Communication System, 3rd ed. New York: Wiley Interscience, John Wiley & Sons Inc., p. 369, 2001.
[7] A. Debut, S. Randoux and J. Zemmouri, “Linewidth narrowing in Brillouin lasers: Theoretical analysis,” Phys. Rev., vol. A62, no. 2, pp. 023803-1 - 023803-4, 2000.

M.R. Shirazi (Mohammadreza Rezazadeh Shirazi) was born in Shiraz, Iran in 1970. He received the B.Sc. in applied physics and the M.Sc. in physics (atomic and molecular branch) from Shahid Bahonar University of Kerman, Iran in 1993 and 1999, respectively. From 2000 to 2006, he worked as a full-time academic member of the physics department at the Islamic Azad University (Kerman branch). He is now a PhD student at the University of Malaya, Malaysia, working in nonlinear fiber optics with Professor Dr. Harith Ahmad and Dr. Sulaiman Wadi Harun as supervisors and with Mozhgun Biglary and Kelvin Thambiratnam, M.Sc. students in physics.

ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Martin P. Foster et al.: Automated design of LCC resonant converters using a genetic algorithm employing a describing function equivalent circuit converter model


Automated design of LCC resonant converters using a genetic algorithm employing a describing function equivalent circuit converter model

Martin P. Foster, Adam J. Gilbert, Christopher R. Gould, David A. Stone and Christopher M. Bingham

Abstract— This paper describes how genetic algorithms can be applied to design LCC resonant converters. Operation of a binary genetic algorithm is described and an appropriate cost function is developed which is used in conjunction with a describing function based model to rapidly evaluate multiple candidate converters over a wide operating range. Simulation results demonstrate the versatility of the design tool.

Index Terms— Resonant converter, genetic algorithm, computer aided design

PRINCIPAL SYMBOLS

Cp          Parallel capacitor (F)
Cs          Series capacitor (F)
Ls          Series inductor (H)
n           Transformer turns ratio
Vdc         DC input voltage (V)
Vout        DC output voltage (V)
Qs          Series quality factor
ω0s         Series resonant frequency (rad/s)
A           Capacitor ratio
Vout_spec   Specified output voltage (V)
Vout_worst  Worst case output voltage (V)
fworst      Worst case operating frequency (Hz)
Vout_best   Best case output voltage (V)
fbest       Best case operating frequency (Hz)
fB          Frequency bandwidth (Hz)

I. INTRODUCTION

Resonant converters have been growing in popularity over the past decade or so and are now found in all manner of applications, ranging from X-ray machines [1] to consumer electronic products. Indeed, owing to their inherent soft-switching characteristic, resonant converters can often achieve higher operating efficiencies at higher frequencies than hard-switched converters, with the added advantage of increased power density due to the compatibility between the resonant circuit and the required isolation transformer. Although this trend is growing, there are still many power supply designers who are reluctant to adopt them because of their unfamiliar circuit topology and control requirements and the lack of suitable design tools.

Generally, the output voltage of a resonant converter is controlled by varying the frequency at which the input MOSFETs are operated, leading to a control characteristic that resembles the magnitude frequency response of a resonant circuit. If the topology employed utilises a series resonant branch featuring a single capacitor and inductor, then the converter is operated above the system resonant frequency to achieve soft-switching. Of the many possible resonant converter configurations [2,3], it is the LCLC family that has received the greatest interest from industry and academics alike, owing to the presence of this series branch. From this family of converters, it is the LCC voltage-output converter that has received special interest owing to its simple topology and the inherent low-noise soft-switching characteristics of the rectifier, see Fig. 1. Although the circuit contains only 3 resonant components, its control response is highly non-linear and is a function of both the load and the excitation frequency, thus making design optimisation cumbersome. Further to this, accurately predicting the output voltage can be challenging.

Traditionally, designers have employed models based on the fundamental mode approximation (FMA) to obtain a ‘ball-park’ design which is subsequently refined manually using more accurate, but significantly more tedious and time-consuming, transient simulations (e.g. SPICE). In an effort to reduce computation time, some designers have resorted to lookup-table techniques to correct for inaccuracies in the FMA prediction. More recently, a describing function based model for the LCC converter has been presented, offering improved accuracy and thus negating the requirement for complex simulation [4,5]. However, applying this model usefully to design a converter still requires significant knowledge of the converter topology and an in-depth understanding of its characteristics in order to make

Fig. 1. LCC voltage-output resonant converter

Manuscript received May 19, 2007. This work was supported in part by the Engineering and Physical Sciences Research Council under Grant EP/C015924/1 and The Nuffield Foundation Grant. All authors are with the Department of Electronic and Electrical Engineering, The University of Sheffield, Sheffield, UK; e-mail: [email protected].
informed choices given a very large parameter search space. This paper attempts to remove design complexity by describing how a relatively basic genetic algorithm (GA) can be used to choose converter parameters based on a set of simple rules. Genetic algorithms provide a convenient mechanism for searching a wide parameter space with minimal effort and have been used in many applications, from optimising orders for a reheat furnace [6] to the design of permanent magnet 3-phase motors [7]. GAs are typically employed to solve problems where traditional gradient-based methods are difficult to use (i.e. non-linear problems with multiple local minima). Specifically, the problem in this paper is how to choose component values to achieve a specified output voltage range whilst trying to minimise/maximise some other parameter(s). GAs are well suited to this task: they can operate on highly non-linear systems and evaluate a large parameter space very effectively.

The remainder of this paper is devoted to developing a rudimentary LCC converter design tool based on a genetic algorithm. A basic genetic algorithm is introduced in Section II. Section III describes the LCC converter and the model used to evaluate the chosen parameters. Section IV describes how to use a GA to design a converter. Results are presented in Section V.

II. GENETIC ALGORITHMS

The term genetic algorithm refers to techniques based on Darwinian ‘survival of the fittest’ rules that can be employed to search a large parameter space in an effort to optimize a cost function; in general the cost function is minimised. Genetic algorithms take many forms, but the common factors are that GAs use a population of candidate solutions coupled with a searching algorithm to find a minimum cost. The searching algorithm is split into two distinct sections: a breeding process and mutation. The population and searching algorithm are iterated many times until the population converges.
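The loop outlined in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy cost function, the population size, the per-bit mutation probability, the parameter ranges and all function names are hypothetical, and the binary tournament shown is only one common reading of tournament selection.

```python
import random

BITS = 8  # bits per encoded parameter (assumed for this sketch)


def decode(chrom, lo, hi):
    """Map a list of bits to a real value in [lo, hi]."""
    value = int("".join(map(str, chrom)), 2)
    return lo + (hi - lo) * value / (2 ** len(chrom) - 1)


def cost(chrom):
    """Toy cost with its minimum at x = 1, y = 2 (stand-in for a converter model)."""
    x = decode(chrom[:BITS], 0.0, 5.0)
    y = decode(chrom[BITS:], 0.0, 5.0)
    return (x - 1.0) ** 2 + (y - 2.0) ** 2


def tournament(pop, rng):
    """Pick two chromosomes at random and return the one with the lower cost."""
    a, b = rng.sample(pop, 2)
    return a if cost(a) <= cost(b) else b


def crossover(mum, dad, rng):
    """Single-point crossover at a randomly chosen cut position."""
    cut = rng.randrange(1, len(mum))
    return mum[:cut] + dad[cut:]


def mutate(chrom, p_mut, rng):
    """Toggle each bit independently with probability p_mut."""
    return [bit ^ 1 if rng.random() < p_mut else bit for bit in chrom]


def run_ga(pop_size=40, generations=60, p_mut=0.02, seed=1):
    """Return the best chromosome and the best cost seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(2 * BITS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        survivors = pop[: pop_size // 2]  # the better half is kept unchanged
        children = [
            mutate(crossover(tournament(survivors, rng),
                             tournament(survivors, rng), rng), p_mut, rng)
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children  # children replace the poorer half
    pop.sort(key=cost)
    history.append(cost(pop[0]))
    return pop[0], history


if __name__ == "__main__":
    best, history = run_ga()
    print(decode(best[:BITS], 0.0, 5.0), decode(best[BITS:], 0.0, 5.0), history[-1])
```

Because the better half of the population survives unchanged, the best cost recorded in `history` never increases from one generation to the next.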
The population is made from chromosomes, where each chromosome represents the parameters of the function being optimized. In this paper a simple binary representation is used to encode the converter parameters. The breeding process is where the best candidates (i.e. those with the minimum cost) are chosen to be parents from which children are obtained. The justification for this strategy is that the children should inherit some desirable attributes from their parents. The children from this process are used to replace the poorer chromosomes in the population, hence ‘survival of the fittest’. The breeding process can take many forms but the most common, and the one used here, is tournament selection, where two parents are selected at random and their chromosomes are crossed over at some randomly determined point. For the first iteration a randomised population is used. However, subsequent iterations refine the population and, ultimately, it should converge. The effect of breeding on this population is as follows: for the first few runs breeding allows a large parameter space to be searched; as the

Fig. 2. Flow diagram of GA operation (blocks: start; generate initial population; evaluate fitness; converged?; select parents by tournament selection; breed by crossover; explore by mutation; completed design)

number of iterations increases, more of the chromosomes in the population will begin to resemble each other, thus permitting desired characteristics to be retained. Unfortunately, employing breeding on its own may not be enough because it is possible for the GA to become trapped in a local minimum. To help prevent this, and also in an effort to explore more of the search space, the chromosomes are often modified in some way using mutation. In binary-based GAs this is achieved by randomly toggling bits in the population. The random processes involved in the mutation and crossover phases are usually determined by a uniform random number generator and the mutation and crossover rates, pmut and pcross respectively. The whole genetic algorithm optimization process is represented diagrammatically in Fig. 2. The reader is directed to [8] for further information on GAs and the application of their techniques.

III. LCC CONVERTER

A. Description of operation

Referring to Fig. 1, the half-bridge MOSFETs T1 and T2 are operated in anti-phase to produce the square waveform Vi at a frequency of fs, which is used to excite the resonant tank, causing a current iin to flow. Assuming the rectifier is already in conduction, vCp is clamped to Vout and current flows from the resonant circuit into Cf and the load, thus producing the output voltage. If Cf >> Cp then negligible current flows through Cp. When the rectifier’s current falls to zero it turns off, and the resonant circuit current circulates through Cp until |vCp|=Vout. This is known as the rectifier non-conduction period, highlighted as θ1 in Fig. 3, in which vCp takes on a cosinusoidal form.

B. Converter model

Since traditional FMA-type analysis does not produce


accurate predictions, the describing function equivalent circuit model presented in [4,5] is used to evaluate candidate converter designs. Specifically, this model accounts for the non-linear interaction of the rectifier by developing a first-order approximation for the equivalent impedance Zeq presented by the parallel branch comprising Cp, the rectifier and the output filter. Subsequently, Zeq is combined with the series resonant components to obtain the equivalent impedance of the whole circuit, Ztot. Once this is obtained, the output voltage can be evaluated. Only the model and its application are described here; the reader is directed to the references for a more detailed description.

Fig. 3. Resonant circuit voltage and current waveforms (Vi, vCp and iin against time in ms; vertical axis in V and A; the non-conduction interval θ1 is marked)

The AC equivalent impedance, Zeq, of the parallel branch is obtained by determining the fundamental component of vCp, by performing a Fourier transform on a piecewise equation, and dividing the result by an expression for the input current iin. Performing these manipulations leads to

$$Z_{eq} = R_\eta + \frac{1}{j\omega_s C_\eta} \qquad (1)$$

where

$$\gamma = \pi + \frac{2A\omega_s}{Q_s\omega_{0s}}, \qquad R_\eta = \frac{8n^2R_L}{\gamma^2}$$

$$C_\eta = \frac{\pi\gamma^2 C_p}{2\sqrt{\dfrac{2\pi A\omega_s}{Q_s\omega_{0s}}}\,(\gamma - 2\pi) + \gamma^2\left[\pi - \cos^{-1}\left(\dfrac{\gamma - 2\pi}{\gamma}\right)\right]}$$

$$\omega_{0s} = \frac{1}{\sqrt{L_sC_s}}, \qquad Q_s = \frac{1}{\omega_{0s}C_sn^2R_L}, \qquad A = \frac{C_p}{C_s}$$

The total circuit equivalent impedance is

$$Z_{tot} = j\omega_sL_s + \frac{1}{j\omega_sC_s} + Z_{eq} \qquad (2)$$

from which the magnitude of the input current can be obtained,

$$I_{in} = \frac{2V_{dc}}{\pi\,|Z_{tot}|} \qquad (3)$$

Finally, the output voltage is

$$V_{out} = \frac{2nR_L}{\gamma}\,I_{in} \qquad (4)$$

IV. APPLICATION OF GENETIC ALGORITHMS FOR RESONANT CONVERTER DESIGN

A. Design principles

Most power supplies are designed to provide a steady output over a wide range of input voltages and output powers, and the designer assumes the feedback controller will accommodate these variations. For resonant converters the design process is further complicated by the non-linear effects of the load and the requirement for above-resonance operation to ensure soft-switching. Of the many possible approaches, one of the most successful is to use a bandwidth-limited design methodology in which the converter’s performance is evaluated for worst-case and best-case operating conditions, and parameters are chosen that fit these responses within a given control frequency bandwidth [9]. A similar approach is taken here: the user enters their input and output requirements and the GA finds the optimum series quality factor Qs, transformer turns ratio n and capacitor ratio A to achieve the required converter specification over a given frequency bandwidth fB, see Fig. 4. In order to make the search as effective as possible, it was decided to restrict the transformer turns ratio by forcing the converter to work as close as possible to the converter’s series resonant point f0s under the worst-case operating condition.

Fig. 4. Operating characteristics of LCC resonant converter (output voltage in V against frequency in kHz for the best and worst cases, with f0s, fB and Vout=24V marked)

B. Cost function

For the GA to function correctly, the design rules presented above must be translated into an appropriate cost function. For each candidate converter, the worst-case and best-case operating points are evaluated. Measurements obtained from

these evaluations are subsequently combined.

1) Function 1 – Worst-case operating point

For the worst-case operating condition (i.e. maximum load, minimum input voltage) the converter must provide the appropriate level of gain, and therefore correct assessment of this point is crucial if the GA is to operate as effectively as possible. The cost function associated with this point is the squared error between the design specification point and the candidate point,

$$J_1 = \left(\frac{V_{out\_worst} - V_{out\_spec}}{V_{out\_spec}}\right)^2 + \left(\frac{f_{worst} - f_{0s}}{f_{0s}}\right)^2 \qquad (8)$$

where Vout_worst and fworst are the candidate points, and Vout_spec and f0s are the desired points.

2) Function 2 – Best-case operating point

Here, the output voltage Vout_best and operating frequency fbest for minimum load and maximum input voltage are evaluated. Again, squared error distances from the desired specification are employed to give the cost function,

$$J_2 = \left(\frac{V_{out\_best} - V_{out\_spec}}{V_{out\_spec}}\right)^2 + \left(1 - \frac{f_{best} - f_{0s}}{f_B}\right)^2 \qquad (9)$$

The total combined cost function J is obtained from the addition of J1 and J2. However, to ensure convergence a penalty, penworst, is added to the cost function if the worst-case output voltage condition cannot be met:

$$J = J_1 + J_2 + pen_{worst} \qquad (10)$$
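Equations (1)-(4) of the converter model and the cost terms (8)-(10) can be combined into a short evaluation routine. The Python sketch below is illustrative only: the function and argument names are hypothetical, the series resistance rs and rectifier drop Vd are ignored, and the size of the penalty is left to the caller because its value is not specified here.

```python
import math


def lcc_output_voltage(f_s, v_dc, r_load, l_s, c_s, c_p, n):
    """Output voltage from equations (1)-(4) at switching frequency f_s (Hz)."""
    w_s = 2.0 * math.pi * f_s
    w_0s = 1.0 / math.sqrt(l_s * c_s)
    q_s = 1.0 / (w_0s * c_s * n ** 2 * r_load)
    a = c_p / c_s
    k = 2.0 * a * w_s / (q_s * w_0s)  # this term equals gamma - pi
    gamma = math.pi + k
    r_eta = 8.0 * n ** 2 * r_load / gamma ** 2
    theta = math.pi - math.acos((gamma - 2.0 * math.pi) / gamma)
    c_eta = math.pi * gamma ** 2 * c_p / (
        2.0 * math.sqrt(math.pi * k) * (gamma - 2.0 * math.pi)
        + gamma ** 2 * theta
    )
    z_eq = r_eta + 1.0 / (1j * w_s * c_eta)                  # equation (1)
    z_tot = 1j * w_s * l_s + 1.0 / (1j * w_s * c_s) + z_eq   # equation (2)
    i_in = 2.0 * v_dc / (math.pi * abs(z_tot))               # equation (3)
    return 2.0 * n * r_load * i_in / gamma                   # equation (4)


def combined_cost(v_worst, f_worst, v_best, f_best,
                  v_spec, f_0s, f_b, pen_worst=0.0):
    """Total cost J of equation (10); pen_worst is supplied by the caller."""
    j1 = ((v_worst - v_spec) / v_spec) ** 2 + ((f_worst - f_0s) / f_0s) ** 2
    j2 = ((v_best - v_spec) / v_spec) ** 2 + (1.0 - (f_best - f_0s) / f_b) ** 2
    return j1 + j2 + pen_worst


if __name__ == "__main__":
    # Converter 2 of Table I at minimum Vdc; load taken as 15 V / 20 A (assumed).
    l_s, c_s, c_p, n = 42e-6, 15e-9, 2.7e-9, 10
    f_0s = 1.0 / (2.0 * math.pi * math.sqrt(l_s * c_s))
    print(lcc_output_voltage(f_0s, 300.0, 15.0 / 20.0, l_s, c_s, c_p, n))
```

A candidate whose worst-case point sits exactly at (Vout_spec, f0s) and whose best-case point sits at (Vout_spec, f0s + fB) gives J = 0, which is the target the GA searches for.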

C. Implementation details

For simplicity, the converter parameters Qs and A are encoded as 8-bit values. The series quality factor is scanned over the range 0.05≤Qs≤5 and the capacitor ratio over the range 0.1≤A≤10. The resonant capacitor values are further restricted to the E24 preferred value range to ensure a realizable design. The transformer turns ratio was determined by noting that at the series resonant point the converter has a gain

$$\frac{V_{out}}{V_{dc}} = \frac{1}{2n}$$

For simplicity, n was restricted to integer ratios (e.g. 2:1 or 1:2). It is known that the parasitic series resistance of the MOSFETs and inductor, and the voltage drop across the rectifier, can have a detrimental effect on the prediction accuracy of many models. Therefore, (2) was modified to account for a series resistance, rs, and then iterated as described in [5] to include the effects of the rectifier. As was mentioned earlier, it is common to retain some of the better designs and replace the poorer ones with children from the breeding stage. This process is referred to as elitism and is employed to ensure that information from the best candidates is retained during the breeding and mutation processes. For the GA implemented here, an elitism factor (or more correctly a cross-over factor) of pcross=50% and a mutation rate of pmut=30% were used. The mutation rate was decreased by 1% per iteration.

V. VALIDATION

To validate the proposed automated design procedure, five prototype converters, chosen to demonstrate the GA’s performance over a wide range of input and output voltage ratios, were designed and then simulated using PLECS®, the Matlab®/Simulink® power electronics toolbox. All converters featured an input voltage range of 300-420V, a resonant frequency of 200kHz, a control bandwidth of 500kHz, rs=0.3Ω and Vd=0.7V. The converter specifications and designs are shown in Table I. In each case, the filter capacitor was chosen using the technique described in [5] for a ripple voltage of 5%, and the converter was simulated for 20 times the output filter time constant to ensure steady state. Fig. 5 compares the predicted output voltage obtained from PLECS with the desired specification at spot frequencies obtained by varying the load from its maximum to its minimum specified value. The plots clearly show that the designed resonant components provide a converter whose output is in close proximity to the desired output voltage. Indeed, all results are within a 10% error band. However, it must be noted that prediction error is mainly due to the approximate nature of the describing function and not

necessarily the genetic algorithm. Furthermore, for each converter design, 1,600 out of the possible 65,536 candidates were evaluated in a fraction over 12 seconds (based on an Intel Pentium M 1.73GHz processor), thus providing a substantial improvement over manual methods.

TABLE I PROTOTYPE CONVERTER PARAMETERS

Converter  Vout (V)  Min. Iout (A)  Max. Iout (A)  Cp (nF)  Cs (nF)  Ls (µH)  n
1          6.3       1              5              0.15     1.2      520      21
2          15        1              20             2.7      15       42       10
3          24        1              7              0.82     5.6      113      6
4          60        1              7              0.56     8.2      77       2
5          320       0.05           0.3            0.68     4.7      135      0.5

Fig. 5. Output voltage versus frequency for entire load range: (a) converter 1, (b) converter 2, (c) converter 3, (d) converter 4 & (e) converter 5 (each panel plots output voltage in V against frequency in kHz for the desired output, minimum Vdc and maximum Vdc)

VI. CONCLUSIONS

This paper has demonstrated the proposed GA to be a viable method for designing LCC voltage-output resonant converters over a wide parameter space. A bandwidth-limited design methodology has been described and translated into an appropriate cost function for a two-parameter binary-coded GA. Implementation issues regarding parameter encoding, chromosome mutation and crossover methods have been discussed. Simulation results have shown the GA to be capable of designing resonant converters many times faster than previously possible with manual evaluation methods.

ACKNOWLEDGEMENTS

The authors would like to express their thanks to the Engineering & Physical Sciences Research Council and the Nuffield Foundation for supporting this work.

REFERENCES

[1] Hino, H., Hatakeyama, T., Kawase, T., and Nakaoka, M.; ‘High-frequency parallel resonant converter for X-ray generator utilizing parasitic circuit constants of high voltage-transformer and-cables’, INTELEC 89, p.20.5/1 - 20.5/8 vol.2
[2] Severns, R. P.; ‘Topologies for Three-Element Resonant Converters,’ IEEE Trans Power Electronics, vol. 7, no. 1, p.89-98, January 1992.
[3] Batarseh, I.; ‘Resonant Converter Topologies with Three and Four Energy Storage Elements,’ IEEE Trans Power Electronics, vol. 9, no. 1, p.64-73, January 1994.
[4] Sewell, H. I., Foster, M. P., Bingham, C. M., Stone, D. A., Hente, D. and Howe, D.; ‘Analysis of voltage output LCC resonant converters, including boost mode operation’, IEE Proc. Electric Power Applications, Vol. 150, No. 6, p.673-679, November 2003
[5] Foster, M. P., Sewell, H. I., Bingham, C. M. and Stone, D. A.; ‘Methodologies for the design of LCC voltage-output resonant converters’, IEE Proc. Electric Power Applications, Vol. 153, No. 4, p.559-567, July 2006.
[6] Broughton, J., Mahfouf, M., Linkens, D. A.; ‘GA-Based Optimisation of a Continuous Walking Beam Reheating Furnace’, UKCI 2003 The 2003 UK Workshop on Computational Intelligence, CD-ROM proceedings.
[7] Bianchi, N. and Bolognani, S.; ‘Design optimisation of electric motors by genetic algorithms’, IEE Proc. Electric Power Applications, September 1998, Vol. 145, Iss. 5, p. 475-483.
[8] Goldberg, D. E.; ‘Genetic Algorithms in Search, Optimization, and Machine Learning’, Addison-Wesley Professional
[9] Gould, C. R., Foster, M., D. A., and Bingham, C. M.; ‘Bandwidth-controlled Steady-state Design and Analysis of a 3kW CLL Resonant Converter with Rectified Mains Output Characteristics’, ICEMS 2006, DS1E3-01, CD-ROM Proceedings.


ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Jonathan M. Blackledge: Diffusion and Fractional Diffusion based Models for Multiple Light Scattering and Image Analysis

Diffusion and Fractional Diffusion based Models for Multiple Light Scattering and Image Analysis

Jonathan M Blackledge, Fellow, IET, Fellow, IoP, Fellow, RSS

Abstract— This paper considers a fractional light diffusion model as an approach to characterizing the case when intermediate scattering processes are present, i.e. the scattering regime is neither strong nor weak. In order to introduce the basis for this approach, we revisit the elements of formal scattering theory and the classical diffusion problem in terms of solutions to the inhomogeneous wave and diffusion equations respectively. We then address the significance of these equations in terms of a random walk model for multiple scattering. This leads to the proposition of a fractional diffusion equation for modelling intermediate strength scattering that is based on a generalization of the diffusion equation to fractional form. It is shown how, by induction, the fractional diffusion equation can be justified in terms of the generalization of a random walk model to fractional form as characterized by the Hurst exponent. Image processing and analysis methods are proposed that are based on diffusion and fractional diffusion models and some application examples given. Index Terms— Multiple Scattering, Optical Diffusion, Fractional Optical Diffusion, Random Walk Processes, Intermediate Strength Scattering, Image Processing and Analysis

Manuscript received June 1, 2007. This work was supported by Microsharp Corporation Limited. Jonathan Blackledge is Professor of Information and Communications Technology, Applied Signal Processing Research Group, Department of Electronic and Electrical Engineering, Loughborough University, England and Professor of Computer Science, Department of Computer Science, University of the Western Cape, Cape Town, Republic of South Africa (e-mail: [email protected]).

I. INTRODUCTION

The use of formal scattering methods for modelling the interaction of light with an inhomogeneous medium, together with associated inverse scattering models, is well known (e.g. [1]). In applications associated with the processing and analysis of an image, the aim is to develop a model that maps the object plane to the image plane. If the scattering is ‘weak’ (i.e. based on single scattering events) and the scattered wavefield is measured in the far field, then the map is determined by the Fourier transform which, for a clear aperture, yields the fundamental imaging equation [2]

$$I(x, y) = p(x, y) \otimes_2 f(x, y) + n(x, y)$$

for an image I, where p is the point spread function (a characteristic of the imaging system), f is the object function and ⊗2 denotes the two-dimensional convolution operation, i.e.

$$p(x, y) \otimes_2 f(x, y) = \int\!\!\int p(x - x', y - y')\, f(x', y')\, dx'\, dy'$$

The noise n is taken to be a stochastic function which, at best, can be characterized by a probability density function

Pr[n(x, y)] that conforms to a physically significant statistical model. The function n is taken to include a range of perturbations to the scattered field that is recorded in the image plane. Within the context of the weak scattering approximation used to derive the fundamental imaging equation, this includes multiple scattering. The object function f(x, y) is related to a three-dimensional scattering function γ(r), where r is the three-dimensional spatial vector. In the far field, the weak scattered wavefield us is (ignoring scaling factors) given by the Fourier transform of the scattering function [3], $u_s(\mathbf{k}) \sim \hat{F}_3[\gamma(\mathbf{r})]$, where $\hat{F}_3$ denotes the three-dimensional Fourier transform operator and k is the spatial frequency vector. The inverse scattering problem is then compounded in the inversion of this result, i.e. the inverse Fourier transform. This weak scattering result can be interpreted in terms of single scattering events generated by a scattering function consisting of an ensemble of localized point-like scatterers, for example. When multiple scattering is present, this simple result is not sufficient to model the scattered field, which must be modified to take into account double, triple, quadruple etc. scattering events. This yields results that make the objective of ‘engineering’ a practically viable imaging and image processing model for various applications rather intractable. In such cases, it can be of value to develop a stochastic model for the scattered field whereby, instead of relating the scattering function to some object function which is then mapped onto the image plane, we attempt to generate a model for the probability density function of a multiple scattered wavefield in order to account for the statistical distribution of the intensity field obtained in the image plane. This involves an approach in which the resultant scattered wavefield (i.e.
the wave amplitude) is taken to be a consequence of a random walk where each node in the random walk is taken to be a scattering event. There is a fundamental connection between a random walk model for describing Brownian motion, for example, and the process of diffusion as defined by the diffusion equation. This ‘connectivity’ provides an approach for interpreting strong scattering in terms of a diffusive process. But formal scattering methods (including multiple scattering) are based on considering solutions to the (inhomogeneous) wave equation. Now, the essential difference between the wave equation and the diffusion equation is with regard to the order of the time differential. By ‘fractionalizing’ the time differential and



considering a fractional diffusion equation of the type [4]

$$\left(\nabla^2 - \frac{1}{D^q}\frac{\partial^q}{\partial t^q}\right) u(\mathbf{r}, t) = 0$$

where $D$ is the fractional diffusivity, we consider the role that the fractional exponent $q$ plays in terms of characterizing an image from a fully diffusive (strong scattering) model when $q = 1$ to a propagative (weak scattering) model when $q = 2$. In order to introduce this idea, we review, by way of a short tutorial, the principal formal solutions to the forward and inverse scattering problem in terms of solutions to the inhomogeneous wave equation for both deterministic and random media. We then address the properties and solutions to the (inhomogeneous) diffusion equation and discuss the basis for using this equation to model the propagation of light through an optical diffuser. This provides an inverse solution to the optical diffusion problem that can be cast in terms of appropriate finite impulse response filters, the first order solution providing the well known ‘high emphasis filter’ [5], [6]. The principles associated with random phase walk models are addressed and the rationale for generalizing some well known results to fractional form is considered. Forward and inverse solutions to the fractional diffusion equation are derived and, in the latter case, used to propose a new metric for segmenting a digital image under the assumption that it has been formed from a fractional diffusive (intermediate scattering) process.

II. FORMAL SCATTERING METHODS FOR SCALAR WAVEFIELDS

Formal scattering methods for scalar electromagnetic wavefields interacting with (non-conductive) dielectric media are based on the inhomogeneous Helmholtz equation [7] which can be derived quite generally from the (inhomogeneous) time dependent wave equation

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) U(\mathbf{r}, t) = 0$$

by letting

$$\frac{1}{c^2} = \frac{1}{c_0^2}(1 + \gamma)$$

where $\gamma$ is a dimensionless quantity (the scattering function) and $c_0$ is a constant (wave speed). With $U(\mathbf{r}, t) = u(\mathbf{r}, \omega)\exp(i\omega t)$ we have

$$(\nabla^2 + k^2)u(\mathbf{r}, k) = -k^2\gamma(\mathbf{r})u(\mathbf{r}, k)$$

where

$$k = \frac{\omega}{c_0}.$$

In electromagnetism, $u$ is the scalar electric field, $c_0$ is the speed of light and the scattering function, $\gamma = \epsilon_r - 1$ where $\epsilon_r(\mathbf{r})$ is the relative permittivity$^1$, is taken to be of compact support [8], [9], i.e. $\gamma(\mathbf{r})\ \exists\ \forall\ \mathbf{r} \in V$.

$^1$ The relative permeability is assumed to be constant.

The general (Green’s function) solution to this equation at a point $\mathbf{r}_0$ is [10], [11]

$$u(\mathbf{r}_0, k) = k^2\int_V g\gamma u\, d^3\mathbf{r} + \oint_S (g\nabla u - u\nabla g)\cdot\hat{\mathbf{n}}\, d^2\mathbf{r}$$

where $g$ is the ‘outgoing’ free space Green’s function given by

$$g(\mathbf{r} \mid \mathbf{r}_0, k) = \frac{\exp(ik\mid\mathbf{r} - \mathbf{r}_0\mid)}{4\pi\mid\mathbf{r} - \mathbf{r}_0\mid}$$

which is the solution to the equation (where $\delta^3$ is the three-dimensional delta function)

$$(\nabla^2 + k^2)g(\mathbf{r} \mid \mathbf{r}_0, k) = -\delta^3(\mathbf{r} - \mathbf{r}_0)$$

and $\hat{\mathbf{n}}$ is the unit vector perpendicular to the surface element $d^2\mathbf{r}$ of a closed surface $S$. To compute the surface integral, a condition for the behaviour of $u$ on the surface $S$ of $\gamma$ must be chosen. Consider the case where the incident wavefield $u_i$ is a simple plane wave of unit amplitude $\exp(i\mathbf{k}\cdot\mathbf{r})$ satisfying the homogeneous wave equation

$$(\nabla^2 + k^2)u_i(\mathbf{r}, k) = 0.$$

By choosing the condition $u(\mathbf{r}, k) = u_i(\mathbf{r}, k)$ on the surface of $\gamma$, we obtain the result

$$u(\mathbf{r}_0, k) = k^2\int_V g\gamma u\, d^3\mathbf{r} + \oint_S (g\nabla u_i - u_i\nabla g)\cdot\hat{\mathbf{n}}\, d^2\mathbf{r}.$$

Now, using Green’s theorem to convert the surface integral back into a volume integral, we have

$$\oint_S (g\nabla u_i - u_i\nabla g)\cdot\hat{\mathbf{n}}\, d^2\mathbf{r} = \int_V (g\nabla^2 u_i - u_i\nabla^2 g)\, d^3\mathbf{r}.$$

Noting that

$$\nabla^2 u_i = -k^2 u_i$$

and that

$$\nabla^2 g = -\delta^3 - k^2 g$$

we obtain

$$\int_V (g\nabla^2 u_i - u_i\nabla^2 g)\, d^3\mathbf{r} = \int_V \delta^3 u_i\, d^3\mathbf{r} = u_i.$$

Hence, by choosing the field $u$ to be equal to the incident wavefield $u_i$ on the surface of $\gamma$, we obtain a solution of the form

$$u = u_i + u_s$$

where

$$u_s = k^2\int_V g\gamma u\, d^3\mathbf{r}.$$

The function $u_s$ is the scattered wavefield.



A. The Born Approximation

From the last result it is clear that in order to compute the scattered field $u_s$, we must define $u$ inside the volume integral. Unlike the surface integral, a boundary condition will not help here because it is not sufficient to specify the behaviour of $u$ at a boundary. In this case, the behaviour of $u$ throughout $V$ needs to be known. In general, it is not possible to do this (i.e. to compute the scattered wavefield exactly) and we are forced to choose a model for $u$ inside $V$ that is compatible with a particular physical problem, in the same way that an appropriate set of boundary conditions is required to evaluate the surface integral. The simplest model for the internal field is based on assuming that $u$ behaves like $u_i$ for $\mathbf{r} \in V$. The scattered field is then given by

$$u_s(\mathbf{r}_0, k) = k^2\int_V g(\mathbf{r} \mid \mathbf{r}_0, k)\gamma(\mathbf{r})u_i(\mathbf{r}, k)\, d^3\mathbf{r}.$$

This assumption provides an approximate solution for the scattered field and is known as the Born approximation [7] after Max Born, who was amongst the first to introduce the approximation in the study of (non-relativistic) quantum scattering when the basic wave equation is the Schrödinger equation$^2$ [12]

$$(\nabla^2 + k^2)u(\mathbf{r}, k) = \gamma(\mathbf{r})u(\mathbf{r}, k)$$

where $\gamma$ is a scattering potential (not necessarily of compact support). There is another way of deriving this result that is instructive and helps to obtain a criterion for the validity of this approximation, which will be considered shortly. We start with the inhomogeneous Helmholtz equation

$$(\nabla^2 + k^2)u = -k^2\gamma u$$

and write $u = u_i + u_s$ where $(\nabla^2 + k^2)u_i = 0$. If $u_s$ is small compared with $u_i$ inside $V$, then $(\nabla^2 + k^2)u_s \simeq -k^2\gamma u_i$. Solving for $u_s$ using the Green’s function and homogeneous boundary conditions (i.e. $u_s = 0$ on $S$ and $\nabla u_s = 0$ on $S$) we get

$$u_s = \oint_S (g\nabla u_s - u_s\nabla g)\cdot\hat{\mathbf{n}}\, d^2\mathbf{r} + k^2\int_V g\gamma u_i\, d^3\mathbf{r} = k^2\int_V g\gamma u_i\, d^3\mathbf{r}.$$
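As an illustration of the Born approximation, the volume integral can be approximated by a sum over small cells, each carrying a value of the scattering function. The following is a minimal sketch under that assumption; the function names, the cell list and the numerical values are hypothetical and chosen only for demonstration.

```python
import cmath
import math


def green_function(r, r0, k):
    """Outgoing free-space Green's function exp(ik|r - r0|) / (4 pi |r - r0|)."""
    distance = math.dist(r, r0)
    return cmath.exp(1j * k * distance) / (4.0 * math.pi * distance)


def born_scattered_field(r0, cells, k, direction):
    """Born estimate u_s(r0) = k^2 * sum over cells of g * gamma * u_i * dV.

    cells is a list of (position, gamma, volume) tuples (a hypothetical
    discretisation of the scatterer); direction is the unit propagation
    vector of the incident plane wave.
    """
    total = 0j
    for position, gamma, volume in cells:
        phase = k * sum(d * p for d, p in zip(direction, position))
        incident = cmath.exp(1j * phase)  # unit-amplitude plane wave exp(i k . r)
        total += green_function(position, r0, k) * gamma * incident * volume
    return k ** 2 * total


# Hypothetical example: two weak cells illuminated along the z axis.
cells = [((0.0, 0.0, 0.0), 0.01, 1e-3), ((0.5, 0.0, 0.0), 0.02, 1e-3)]
field = born_scattered_field((0.0, 0.0, 100.0), cells,
                             k=2.0 * math.pi, direction=(0.0, 0.0, 1.0))
print(abs(field))
```

For a single cell at the origin the modulus of the result reduces to $k^2\gamma\,\Delta V/(4\pi\mid\mathbf{r}_0\mid)$, which gives a quick check on the implementation.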

1) Validity of the Born Approximation: In general, the Born approximation requires that $u_s$ is ‘small’ or ‘weak’ compared to $u_i$. What do we mean by the term ‘weak’ and how can we quantify it? One way to answer this question is to compute an appropriate measure for both the incident and scattered fields and compare the two results. Consider the case where we compute the root mean square modulus (i.e. the $\ell_2$ norm) of each field. We then require

$$\frac{\left(\int \mid u_s(\mathbf{r}_0, k)\mid^2 d^3\mathbf{r}_0\right)^{1/2}}{\left(\int \mid u_i(\mathbf{r}_0, k)\mid^2 d^3\mathbf{r}_0\right)^{1/2}} \ll 1$$

... giving an approximate solution for the de-diffused image $I_0$. If we include all the terms in this series, then an exact solution for $I_0$ can be

$$\begin{aligned}
\nabla^4 I_{ij} &= I_{(i+2)j} + I_{ij} + I_{(i+1)(j+1)} + I_{(i+1)(j-1)} - 4I_{(i+1)j} \\
&\quad + I_{ij} + I_{(i-2)j} + I_{(i-1)(j+1)} + I_{(i-1)(j-1)} - 4I_{(i-1)j} \\
&\quad + I_{(i+1)(j+1)} + I_{(i-1)(j+1)} + I_{i(j+2)} + I_{ij} - 4I_{i(j+1)} \\
&\quad + I_{(i+1)(j-1)} + I_{(i-1)(j-1)} + I_{ij} + I_{i(j-2)} - 4I_{i(j-1)} \\
&\quad - 4I_{(i+1)j} - 4I_{(i-1)j} - 4I_{i(j+1)} - 4I_{i(j-1)} + 16I_{ij} \\
&= 20I_{ij} + I_{(i+2)j} + 2I_{(i+1)(j+1)} + 2I_{(i+1)(j-1)} - 8I_{(i+1)j} \\
&\quad + I_{(i-2)j} + 2I_{(i-1)(j+1)} + 2I_{(i-1)(j-1)} - 8I_{(i-1)j} + I_{i(j+2)} \\
&\quad - 8I_{i(j+1)} + I_{i(j-2)} - 8I_{i(j-1)}.
\end{aligned}$$

In terms of a convolution kernel, the result above can be written as

$$\begin{pmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 2 & -8 & 2 & 0 \\
1 & -8 & 20 & -8 & 1 \\
0 & 2 & -8 & 2 & 0 \\
0 & 0 & 1 & 0 & 0
\end{pmatrix}.$$

Hence, given the convolution kernel associated with the first order solution $I - \nabla^2 I$, the convolution kernel associated with the second order solution $I - \nabla^2 I + \frac{1}{2}\nabla^4 I$ is given by

$$\begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 \\
0 & -1 & 5 & -1 & 0 \\
0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & 0 & \frac{1}{2} & 0 & 0 \\
0 & 1 & -4 & 1 & 0 \\
\frac{1}{2} & -4 & 10 & -4 & \frac{1}{2} \\
0 & 1 & -4 & 1 & 0 \\
0 & 0 & \frac{1}{2} & 0 & 0
\end{pmatrix}
= \frac{1}{2}
\begin{pmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 2 & -10 & 2 & 0 \\
1 & -10 & 30 & -10 & 1 \\
0 & 2 & -10 & 2 & 0 \\
0 & 0 & 1 & 0 & 0
\end{pmatrix}.$$

To compute the convolution kernel associated with the third order solution $I - \nabla^2 I + \frac{1}{2}\nabla^4 I - \frac{1}{6}\nabla^6 I$, we use the same method as above to evaluate $\nabla^6 I_{ij}$ to obtain

$$\frac{1}{6}
\begin{pmatrix}
0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & -3 & 15 & -3 & 0 & 0 \\
0 & -3 & 30 & -87 & 30 & -3 & 0 \\
-1 & 15 & -87 & 202 & -87 & 15 & -1 \\
0 & -3 & 30 & -87 & 30 & -3 & 0 \\
0 & 0 & -3 & 15 & -3 & 0 & 0 \\
0 & 0 & 0 & -1 & 0 & 0 & 0
\end{pmatrix}.$$

An example of the application of these filters is given in Figure 4, which shows the result of diffusing an image by applying a Gaussian low-pass filter and then restoring the image using the first (high emphasis) and second order FIR filters given above.
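The kernels above can be generated mechanically, since each power of the discrete Laplacian is the convolution of the five-point Laplacian kernel with itself. The sketch below assumes the standard five-point stencil and exact rational arithmetic; the helper names are illustrative only.

```python
from fractions import Fraction

# Five-point discrete Laplacian stencil.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def convolve_full(a, b):
    """Full 2-D discrete convolution of two small kernels (lists of lists)."""
    rows = len(a) + len(b) - 1
    cols = len(a[0]) + len(b[0]) - 1
    out = [[0] * cols for _ in range(rows)]
    for i, row_a in enumerate(a):
        for j, va in enumerate(row_a):
            for k, row_b in enumerate(b):
                for l, vb in enumerate(row_b):
                    out[i + k][j + l] += va * vb
    return out


def embed(kernel, size):
    """Zero-pad an odd-sized square kernel to size x size, keeping it centred."""
    pad = (size - len(kernel)) // 2
    out = [[Fraction(0)] * size for _ in range(size)]
    for i, row in enumerate(kernel):
        for j, v in enumerate(row):
            out[i + pad][j + pad] = Fraction(v)
    return out


def restoration_kernel(order):
    """Kernel of sum_{m=0}^{order} (-1)^m / m! * Laplacian^m (identity for m = 0)."""
    size = 2 * order + 1
    total = embed([[1]], size)
    power = [[1]]
    factorial = 1
    for m in range(1, order + 1):
        power = convolve_full(power, LAPLACIAN)  # Laplacian applied m times
        factorial *= m
        coeff = Fraction((-1) ** m, factorial)
        term = embed(power, size)
        total = [[t + coeff * p for t, p in zip(rt, rp)]
                 for rt, rp in zip(total, term)]
    return total


biharmonic = convolve_full(LAPLACIAN, LAPLACIAN)
second_order_times_2 = [[int(2 * v) for v in row] for row in restoration_kernel(2)]
third_order_times_6 = [[int(6 * v) for v in row] for row in restoration_kernel(3)]

for row in third_order_times_6:
    print(row)
```

Because the Laplacian kernel sums to zero, every restoration kernel produced this way sums to one, which is a convenient consistency check on hand-derived coefficients.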

VI. FRACTIONAL DIFFUSION

A. Random Walk Processes

The purpose of revisiting random walk processes is that they provide a useful conceptual reference for introducing fractional diffusion and an appreciation of the use of the fractional diffusion equation, an equation that arises through the generalisation of coherent and incoherent random walk processes into a single model. In the nineteenth century, the Scottish botanist Robert Brown discovered (observing through a microscope) the motion exhibited by small particles (pollen grains) that are immersed in a liquid. Each particle follows a random walk as a result of the elastic collisions it has with ensembles of liquid molecules which are themselves in a state of random motion. Brownian motion is the basis of modelling all kinds of statistical fluctuations, most prominently in the field of gambling. However, it was many years after Brown’s discovery that work was undertaken to provide a quantitative description associated with this motion. The first work of its type was undertaken by Albert Einstein and published in 1905.

Fig. 4. Original 256×256 image (top-left) - M83 galaxy; result after applying a Gaussian low-pass filter (top-right); output after application of the first order (high emphasis) FIR filter (bottom-left); output after application of the second order FIR filter (bottom-right).

The basic idea is to consider a random walk in which the mean value of each step is $a$ but where there is no correlation in the direction of the walk from one step to the next. That is, the direction taken by the walker from one step to the next can be in any direction described by an angle between 0 and $2\pi$ radians - for a walk in the plane. The angle that is taken at each step is entirely random and all angles are taken to be equally likely. Thus, the PDF of angles between 0 and $2\pi$ is given by

$$\Pr[\theta] = \begin{cases} \dfrac{1}{2\pi}, & 0 \le \theta \le 2\pi; \\ 0, & \text{otherwise.} \end{cases}$$

If we consider the random walk to take place in the complex plane, then after $n$ steps, the position of the walker will be determined by a resultant amplitude $A$ and angle $\Theta$ given by the sum of all the steps taken, i.e.

$$A\exp(i\Theta) = a\exp(i\theta_1) + a\exp(i\theta_2) + \ldots + a\exp(i\theta_n) = a\sum_{m=1}^{n}\exp(i\theta_m).$$

The problem is to obtain a scaling relationship between $A$ and $n$. The trick to finding this relationship is to analyse the result of taking the square modulus of $A\exp(i\Theta)$. This provides an expression for the intensity $I$ given by

$$\begin{aligned}
I &= a^2\left|\sum_{m=1}^{n}\exp(i\theta_m)\right|^2 = a^2\sum_{m=1}^{n}\exp(i\theta_m)\sum_{m=1}^{n}\exp(-i\theta_m) \\
&= a^2\left(n + \sum_{j=1, j\neq k}^{n}\exp(i\theta_j)\sum_{k=1}^{n}\exp(-i\theta_k)\right).
\end{aligned}$$

Now, in a typical term

$$\exp(i\theta_j)\exp(-i\theta_k) = \cos(\theta_j - \theta_k) + i\sin(\theta_j - \theta_k)$$

of the double summation, the functions $\cos(\theta_j - \theta_k)$ and $\sin(\theta_j - \theta_k)$ have random values between $\pm 1$. Consequently, as $n$ becomes larger and larger, the double sum reduces to zero since more and more of these terms cancel each other out. This insight is the basis for stating that for $n \gg 1$

$$I = na^2$$

and the resulting amplitude is therefore given by

$$A = a\sqrt{n}.$$

Thus, $A$ is proportional to the square root of the number of steps taken and if each step is taken over a mean time period, then we obtain the result

$$A(t) = a\sqrt{t}.$$

Clearly, if each step in the walk is in the same direction, then the resulting amplitude after a time $t$ will be $at$. This is a deterministic result. However, with a random walk, the interpretation of the above result is that $a\sqrt{t}$ is the amplitude associated with the most likely position that the random walker will be at after time $t$. If we imagine many random walkers, each starting out on their ‘journey’ from the origin of the (complex) plane at $t = 0$, and record their distances from the origin of this plane after a set period of time $t$, then the PDF of $A$ will have a maximum value - the ‘mode’ of the distribution - that occurs at $a\sqrt{t}$. In the case of a non-random walk, the PDF will consist of a unit spike that occurs at $at$. In the (classical) kinetic theory of matter (including gases, liquids, plasmas and some solids), we consider $a$ to be the average distance a particle travels before it randomly collides and scatters off another particle. The scattering process is taken to be entirely elastic, i.e. the interaction does not affect the particle in any way other than to change the direction in which it travels. Thus, $a$ represents the mean free path of a particle. The mean free path is a measure of how far a particle can travel before scattering with another particle which, in turn, is related to the number of particles per unit volume - the density of a gas, for example.
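The scaling $I = na^2$ can be checked numerically by averaging the intensity over an ensemble of walkers. The following is a small Monte Carlo sketch; the function name, the number of walkers and the seed are arbitrary choices made for illustration.

```python
import cmath
import math
import random


def mean_intensity(n_steps, n_walkers, a=1.0, seed=0):
    """Ensemble average of |A exp(i Theta)|^2 for a planar walk with uniform phases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        amplitude = 0j
        for _ in range(n_steps):
            theta = rng.uniform(0.0, 2.0 * math.pi)  # all angles equally likely
            amplitude += a * cmath.exp(1j * theta)
        total += abs(amplitude) ** 2
    return total / n_walkers


# The ratio <I> / (n a^2) should stay close to 1 as n grows.
for n in (10, 100, 400):
    print(n, round(mean_intensity(n, 2000) / n, 3))
```

The ratio fluctuates around unity with a spread that shrinks as the number of walkers increases, consistent with the cancellation of the cross terms in the double sum.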
If we imagine a particle ‘diffusing’ through an ensemble of particles, then the mean free path is a measure of the ‘diffusivity’ of the medium in which the process of diffusion takes place. This is a feature of all classical diffusion processes which can be formulated in terms of the diffusion equation with diffusivity D. The dimensions of diffusivity are length2 /time and must be interpreted in terms of a characteristic distance of the process which varies with the square root of time. Suppose we now consider the three-dimensional diffusion of light to be based on a three-dimensional random walk.

Each scattering event is taken to be a point of the random walk in which a ray of light changes its direction randomly (any direction within the full solid angle of $4\pi$ steradians). The light field is taken to be composed of a complex of rays, each of which propagates through the diffuser in a way that is incoherent and uncorrelated in time. If this is the case, then the propagation of light can be considered to be analogous to a process of (classical) diffusion and instead of modelling the process in terms of the (inhomogeneous) wave equation

$$\left(\nabla^2 - \frac{1}{c^2(\mathbf{r})}\frac{\partial^2}{\partial t^2}\right)u(\mathbf{r}, t) = 0$$

with intensity given by $I(\mathbf{r}, t) = \mid u(\mathbf{r}, t)\mid^2$ we can consider the intensity to be given by the solution of the homogeneous diffusion equation

$$\left(\nabla^2 - \frac{1}{D}\frac{\partial}{\partial t}\right)I(\mathbf{r}, t) = 0$$

with initial condition $I(\mathbf{r}, t) = I_0(\mathbf{r})$ at $t = 0$. This assumes that the diffusivity $D$ is constant throughout the diffuser, which in turn assumes that $\Pr[c(\mathbf{r})]$ for a random scattering model (based on a solution to the wave equation) is the same throughout the diffuser and thus so is the autocorrelation function $\Gamma(\mathbf{r})$ required to compute the intensity. Although the discussion above has been presented for the case of light, the principle remains the same for the case of any form of electromagnetic wavefield, for example, or indeed for the propagation/diffusion of information in general. Thus, for some random walk process whose macroscopic characteristics are defined by a field $u$, if the process is diffusive, then the field $u$ is characterised by the operator

$$\nabla^2 - \frac{1}{D}\frac{\partial}{\partial t}$$

and, if the process is propagative, then it is characterised by the operator

$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}.$$

In multiple wave scattering theory, we consider a wavefront travelling through space and scattering from a site that changes the direction of propagation. The mean free path is taken to be the average number of wavelengths taken by the wavefront to propagate from one interaction to another as described by the free space Green’s function. After scattering from many sites, the wavefront can be considered to have diffused through the ‘diffuser’. Here, the mean free path is a measure of the density of scattering sites, which in turn is a measure of the diffusivity of the material - an optical diffuser, for example.

B. Hurst Processes

We have considered random processes that characterise fully coherent (propagative) and fully incoherent (diffusive) behaviour and, through the physical interpretation of such processes, we have related them to differential operators associated with the corresponding macroscopic behaviour. For a random walk model in the plane, $A(t) = at$ for a coherent walk and $A(t) = a\sqrt{t}$ for an incoherent walk. What would be the result if the walk is neither coherent nor incoherent but partially coherent/incoherent? In other words, suppose the random walk exhibited a bias with regard to the distribution of angles used to change the direction. What would be the effect on the scaling law $\sqrt{t}$? Intuitively, one expects that as the distribution of angles narrows, the corresponding walk becomes more and more coherent, exhibiting longer and longer time correlations until the process conforms to the scaling law $t$. Conceptually, scaling models associated with the intermediate case(s) should be based on a generalisation of the scaling laws $\sqrt{t}$ and $t$ to the form $t^H$ where $0.5 \le H < 1$. This reasoning is the basis for generalising the random walk processes considered so far, the exponent $H$ being known as the Hurst exponent or ‘dimension’.

H E Hurst (1880-1978) was an English civil engineer who designed dams and worked on the Nile river dam projects in the 1920s and 1930s. He studied the Nile so extensively that some Egyptians reportedly nicknamed him ‘the father of the Nile.’ The Nile river posed an interesting problem for Hurst as a hydrologist. When designing a dam, hydrologists need to estimate the necessary storage capacity of the resulting reservoir. An influx of water occurs through various natural sources (rainfall, river overflows etc.) and a regulated amount needs to be released, primarily for agricultural purposes, the storage capacity of a reservoir being based on the net water flow. Hydrologists usually begin by assuming that the water influx is random, a perfectly reasonable assumption when dealing with a complex ecosystem. Hurst, however, had studied the 847-year record that the Egyptians had kept of the Nile river overflows, from 622 to 1469. He noticed that large overflows tended to be followed by large overflows until, abruptly, the system would change to low overflows, which also tended to be followed by low overflows. There appeared to be cycles, but with no predictable period.
Standard statistical analysis of the day revealed no significant correlations between observations, so Hurst developed his own methodology. Hurst was aware of Einstein’s (1905) work on Brownian motion (the erratic path followed by a particle suspended in a fluid), which showed that the distance $R$ the particle covers increases with the square root of time, i.e.

$$R(t) \propto \sqrt{t}$$

where $R$ is the range (equivalent to the amplitude for a walk in the complex plane) covered in time $t$. This result follows from the fact that the increments are identically and independently distributed random variables. Hurst’s idea was to use this property to test the Nile River’s overflows for randomness. His method was as follows: Begin with a time series $x_i$ (with $i = 1, 2, \ldots, n$) which in Hurst’s case was annual discharges of the Nile River. Next, create the adjusted series, $y_i = x_i - \bar{x}$ (where $\bar{x}$ is the mean of $x_i$). Cumulate this time series to give

$$Y_i = \sum_{j=1}^{i} y_j$$

such that the start and end of the series are both zero and there is some curve in between. (The final value, $Y_n$, has to be zero because the mean of $y_i$ is zero.) Then, define the range to be the maximum minus the minimum value of this time series,

$$R_n = \max(Y_i) - \min(Y_i).$$

This adjusted range, $R_n$, is the distance the system travels for the time index $n$, i.e. the distance covered by a random walker if the data set $y_i$ were the set of steps. Einstein’s equation $R_n = a\sqrt{n}$ will apply provided that the time series $x_i$ is independent for increasing values of $n$. However, Einstein’s equation only applies to series that are in Brownian motion. Hurst’s contribution was to generalize this equation to

$$(R/S)_n = an^H$$

where $S$ is the standard deviation for the same $n$ observations and $a$ is a constant. We define a Hurst process to be a process with a (fairly) constant $H$ value. The quotient $R/S$ is referred to as the ‘rescaled range’ because it has zero mean and is expressed in terms of local standard deviations. In general, the value of $R/S$ increases according to a power law whose exponent $H$ is known as the Hurst exponent. Rescaling the adjusted range was a major innovation. Hurst originally performed this operation to enable him to compare diverse phenomena. Rescaling, fortunately, also allows us to compare time periods many years apart in a range of time series. It is the relative change and not the change itself that is of interest. Rescaled range analysis can also describe time series that have no characteristic scale. By considering the logarithmic version of Hurst’s equation, i.e.

$$\log(R/S)_n = \log a + H\log(n)$$

it is clear that the Hurst exponent can be estimated by plotting $\log(R/S)_n$ against $\log(n)$ and solving for the gradient with a least squares fit, for example. If the system were independently distributed, then $H = 0.5$. Hurst found that the exponent for the Nile River was $H = 0.91$, i.e. the rescaled range increases at a faster rate than the square root of time. This meant that the system was covering more distance than a random process would and therefore the annual discharges of the Nile had to be correlated.
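The rescaled range procedure described above can be sketched in a few lines. This is a minimal illustration only: the choice of non-overlapping windows, the averaging of $R/S$ over windows of equal length, the window sizes and the synthetic test series are all assumptions made here and are not taken from Hurst's analysis.

```python
import math
import random


def rescaled_range(x):
    """R/S for one window: range of the cumulated, mean-adjusted series over its std."""
    n = len(x)
    mean = sum(x) / n
    adjusted = [v - mean for v in x]          # y_i = x_i - mean(x)
    cumulative, running = [], 0.0
    for v in adjusted:
        running += v
        cumulative.append(running)            # Y_i = sum_{j <= i} y_j
    r = max(cumulative) - min(cumulative)     # R_n = max(Y) - min(Y)
    s = math.sqrt(sum(v * v for v in adjusted) / n)
    return r / s


def hurst_exponent(series, window_sizes):
    """Least squares gradient of log(R/S)_n against log(n)."""
    log_n, log_rs = [], []
    for n in window_sizes:
        values = [rescaled_range(series[i:i + n])
                  for i in range(0, len(series) - n + 1, n)]
        log_n.append(math.log(n))
        log_rs.append(math.log(sum(values) / len(values)))
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_n)
    return num / den


rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]   # independent increments
walk, level = [], 0.0
for v in noise:
    level += v
    walk.append(level)                               # strongly persistent series

sizes = [16, 32, 64, 128, 256, 512, 1024]
print(round(hurst_exponent(noise, sizes), 2), round(hurst_exponent(walk, sizes), 2))
```

For short windows the estimate for independent data tends to sit somewhat above 0.5, a known small-sample bias of the rescaled range statistic, so the two printed values are best read relative to each other.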
It is important to appreciate that this method makes no prior assumptions about any underlying distributions, it simply tells us how the system is scaling with respect to time. So how do we interpret the Hurst exponent? We know that H = 0.5 is consistent with an independently distributed system. The range 0.5 < H ≤ 1, implies a persistent time series, and a persistent time series is characterized by positive correlations. Theoretically, what happens today will ultimately have a lasting effect on the future. The range 0 < H ≤ 0.5 indicates anti-persistence which means that the time series covers less ground than a random process. In other words, there are negative correlations. For a system to cover less distance, it must reverse itself more often than a random process. Hurst analysed all the data he could including rainfall, sunspots, mud sediments, tree rings and others. In all cases, Hurst found H to be greater than 0.5. He was intrigued that H often took a value of about 0.7 and Hurst suspected that some universal phenomenon was taking place. He carried out some experiments using numbered cards. The values of the cards



were chosen to simulate a probability density function with finite moments, i.e. $0, \pm 1, \pm 3, \pm 5, \pm 7$ and $\pm 9$. He first verified that the time series generated by summing the shuffled cards gave $H = 0.5$. To simulate a biased random walk, he carried out the following steps.

1) Shuffle the deck and cut it once, noting the number, say $n$.
2) Replace the card and re-shuffle the deck.
3) Deal out 2 hands of 26 cards, A and B.
4) Replace the lowest $n$ cards of deck B with the highest $n$ cards of deck A, thus biasing deck B to the level $n$.
5) Place a joker in deck B and shuffle.
6) Use deck B as a time series generator until the joker is cut, then create a new biased hand.

Hurst undertook 1000 trials of 100 hands and calculated $H = 0.72$. We can think of the process as follows: we first bias each hand, which is determined by a random cut of the pack; then, we generate the time series itself, which is another series of random cuts; then, the joker appears, which again occurs at random. Despite all of these random events $H = 0.72$ would always appear. This is called the ‘joker effect’. The joker effect, as described above, demonstrates a tendency for data of a certain magnitude to be followed by more data of approximately the same magnitude, but only for a finite and random length of time. A natural example of this phenomenon is in weather systems. Good weather and bad weather tend to come in waves or cycles (as in a heat wave, for example). This does not mean that weather is periodic, which it is clearly not. We use the term ‘non-periodic cycle’ to describe cycles of this kind (with no fixed period). Thus, Hurst processes exhibit trends that persist until the equivalent of the joker comes along to change that bias in magnitude and/or direction. In other words, rescaled range analysis can be used to characterise a time series that contains within it many different short-lived trends or biases (both in size and direction). The process continues in this way giving a constant Hurst exponent, sometimes with flat episodes that correspond to the average periods of the non-periodic cycles, depending on the distribution of actual periods.

The generalisation of Einstein’s equation $A(t) = a\sqrt{t}$ by Hurst to the form $A(t) = at^H$, $0 < H \le 1$ was necessary in order for Hurst to analyse the apparent random behaviour of the annual rise and fall of the Nile river for which Einstein’s model was inadequate. In considering this generalisation, Hurst paved the way for an appreciation that most natural stochastic phenomena which, at first sight, appear random, have certain trends that can be identified over a given period of time. In other words, many natural random patterns have a bias to them that leads to time correlations in their stochastic behaviour, a behaviour that is not an inherent characteristic of a random walk model and fully diffusive processes in general.

C. The Fractional Diffusion Equation

Given that incoherent random walks, where $A(t) = a\sqrt{t}$, describe processes whose macroscopic behaviour is characterised by the diffusion equation, then, by induction, Hurst processes, where $A(t) = at^H$, $H \in (0, 1]$, should be characterised by generalizing the diffusion operator

$$\nabla^2 - \sigma\frac{\partial}{\partial t}$$

to the fractional form

$$\nabla^2 - \sigma^q\frac{\partial^q}{\partial t^q}$$

where $q \in [1, 2]$ and $D^q = 1/\sigma^q$ is the fractional diffusivity. Fractional diffusive processes can therefore be interpreted as intermediate between diffusive processes proper (random phase walks with $H = 0.5$; diffusive processes with $q = 1$) and ‘propagative processes’ (coherent phase walks for $H = 1$; propagative processes with $q = 2$). For non-stationary processes, we consider the operator

$$\nabla^2 - \sigma^{q(t)}\frac{\partial^{q(t)}}{\partial t^{q(t)}}.$$

It should be noted that the fractional diffusion operator given above is the result of a phenomenology. It is no more (and no less) than a generalisation of a well known differential operator to fractional form which follows from a physical analysis of a fully incoherent random process and its generalisation to fractional form in terms of the Hurst exponent. Unlike the diffusion operator (which is based on accepted and experimentally verifiable physical laws - Fourier’s law of thermal conduction, for example) this approach to introducing a fractional differential operator is based on postulation alone. It is therefore similar to certain other operators, a notable example being Schrödinger’s operator in quantum mechanics, i.e.

$$\frac{\hbar^2}{2m}\nabla^2 - i\hbar\frac{\partial}{\partial t}.$$

In order to work with fractional derivatives, it is necessary to briefly review the fractional calculus which, for completeness, is provided in Appendix I.

D. Solution to the Fractional Diffusion Equation

Consider the fractional diffusion equation for the intensity $I$ of a wavefield given by

$$D^q\nabla^2 I(\mathbf{r}, t) = \frac{\partial^q}{\partial t^q}I(\mathbf{r}, t)$$

where $D$ is the fractional diffusivity and $I_0(\mathbf{r}) = I(\mathbf{r}, t = 0)$ (the initial condition). For $q = 1$, the solution to this equation in the infinite domain (see Section III) for dimensions $n = 1, 2$ and $3$ is (with $\sigma = 1/D$)

$$I(\mathbf{r}_0, \tau) = \sigma\int I_0(\mathbf{r})G(\mathbf{r} \mid \mathbf{r}_0, \tau)\, d^n\mathbf{r}$$

where

$$G(R, \tau) = \frac{1}{\sigma}\left(\frac{\sigma}{4\pi\tau}\right)^{n/2}\exp\left[-\frac{\sigma R^2}{4\tau}\right]H(\tau)$$

which is the solution to

$$\left(\nabla^2 - \sigma\frac{\partial}{\partial t}\right)G(R, \tau) = -\delta^n(R)\delta(\tau).$$
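A quick numerical check of the $q = 1$ Green's function is that $\sigma\int G\, d^n\mathbf{r} = 1$, so that a uniform initial intensity is left unchanged by the convolution. The sketch below evaluates the expression above and verifies this in one dimension with a midpoint rule; the function names, the integration range and the sample count are illustrative assumptions, and $H(\tau)$ is taken to be the unit step function.

```python
import math


def green_gaussian(R, tau, sigma, n):
    """G(R, tau) = (1/sigma) (sigma / (4 pi tau))^(n/2) exp(-sigma R^2 / (4 tau)) for tau > 0."""
    if tau <= 0.0:
        return 0.0  # step function H(tau): no response before the source acts
    return (1.0 / sigma) * (sigma / (4.0 * math.pi * tau)) ** (n / 2.0) \
        * math.exp(-sigma * R * R / (4.0 * tau))


def normalisation_1d(tau, sigma, widths=10.0, samples=20000):
    """Midpoint-rule estimate of sigma * integral of G(R, tau) dR for n = 1."""
    spread = math.sqrt(2.0 * tau / sigma)     # standard deviation of the Gaussian
    lower, upper = -widths * spread, widths * spread
    step = (upper - lower) / samples
    total = 0.0
    for i in range(samples):
        R = lower + (i + 0.5) * step
        total += green_gaussian(R, tau, sigma, 1) * step
    return sigma * total


print(normalisation_1d(tau=0.5, sigma=2.0))
```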



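For the fractional diffusion equation, we consider the same basic solution but where the Green’s function is given by the solution of

$$\left(\nabla^2 - \sigma^q\frac{\partial^q}{\partial t^q}\right)G(R, \tau) = -\delta^n(R)\delta(\tau)$$

where $\sigma^q = 1/D^q$. Using the Fourier based operator for a fractional derivative (see Appendix I), we can transform this equation into the form

$$(\nabla^2 + \Omega_q^2)g(\mathbf{r} \mid \mathbf{r}_0, \omega) = -\delta^n(\mathbf{r} - \mathbf{r}_0)$$

where

$$g(\mathbf{r} \mid \mathbf{r}_0, \omega) = \int_{-\infty}^{\infty} G(\mathbf{r} \mid \mathbf{r}_0, \tau)\exp(i\omega\tau)\, d\tau,$$

$$\Omega_q^2 = -(i\omega\sigma)^q, \quad \Omega_q = \pm i(i\omega\sigma)^{q/2}.$$

Note that for $q = 2$, this equation becomes

$$(\nabla^2 + k^2)g(\mathbf{r} \mid \mathbf{r}_0, \omega) = -\delta^n(\mathbf{r} - \mathbf{r}_0)$$

where $k = \pm\omega\sigma$. This equation defines the Green’s function for the time independent wave operator in $n$ dimensions, the ‘outgoing’ Green’s functions being given by [19], [20]

$n = 1$:

$$g(\mathbf{r} \mid \mathbf{r}_0, k) = \frac{i}{2k}\exp(ik\mid\mathbf{r} - \mathbf{r}_0\mid);$$

$n = 2$:

$$g(\mathbf{r} \mid \mathbf{r}_0, k) = \frac{i}{4}H_0(k\mid\mathbf{r} - \mathbf{r}_0\mid) \simeq \frac{1}{\sqrt{8\pi}}\exp(i\pi/4)\frac{\exp(ik\mid\mathbf{r} - \mathbf{r}_0\mid)}{\sqrt{k\mid\mathbf{r} - \mathbf{r}_0\mid}}, \quad k\mid\mathbf{r} - \mathbf{r}_0\mid \gg 1$$

where $H_0$ is the Hankel function, and

$n = 3$:

$$g(\mathbf{r} \mid \mathbf{r}_0, k) = \frac{1}{4\pi\mid\mathbf{r} - \mathbf{r}_0\mid}\exp(ik\mid\mathbf{r} - \mathbf{r}_0\mid).$$

Generalizing these results, for $q \in [1, 2]$, by writing the exponential function in its series form, with $R = \mid\mathbf{r} - \mathbf{r}_0\mid$ we have, for $\Omega_q = i(i\omega\sigma)^{q/2}$,

$n = 1$:

$$\begin{aligned}
G(R, \tau) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{i}{2\Omega_q}\exp(i\Omega_q R)\exp(i\omega\tau)\, d\omega \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, \frac{\exp(i\omega\tau)}{2(i\omega\sigma)^{q/2}}\left[1 - R(i\omega\sigma)^{q/2} + \frac{R^2}{2!}(i\omega\sigma)^{q} - \ldots\right] \\
&= \frac{1}{2\sigma^{q/2}}\frac{1}{\tau^{1-(q/2)}} - \frac{1}{2}R\delta(\tau) + \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{2(n+1)!}R^{n+1}\sigma^{nq/2}\delta^{qn/2}(\tau);
\end{aligned}$$

$n = 2$:

$$\begin{aligned}
G(R, \tau) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, \exp(i\omega\tau)\frac{\exp(i\pi/4)}{\sqrt{8\pi}}\frac{\exp[-(i\omega\sigma)^{q/2}R]}{\sqrt{i}\sqrt{R}(i\omega\sigma)^{q/4}} \\
&= \frac{1}{\sqrt{8\pi R}}\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, \exp(i\omega\tau)\left[\frac{1}{(i\omega\sigma)^{q/4}} - (i\omega\sigma)^{q/4}R + \frac{1}{2!}(i\omega\sigma)^{3q/4}R^2 - \ldots\right] \\
&= \frac{1}{\sqrt{8\pi R}}\frac{1}{\sigma^{q/4}\tau^{1-q/4}} - \sqrt{\frac{R}{8\pi}}\,\sigma^{q/4}\delta^{q/4}(\tau) + \frac{1}{\sqrt{8\pi}}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{(n+1)!}R^{(2n+1)/2}\sigma^{3nq/4}\delta^{3nq/4}(\tau);
\end{aligned}$$

$n = 3$:

$$\begin{aligned}
G(R, \tau) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, \exp(i\omega\tau)\frac{\exp[-(i\omega\sigma)^{q/2}R]}{4\pi R} \\
&= \frac{1}{4\pi R}\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, \exp(i\omega\tau)\left[1 - (i\omega\sigma)^{q/2}R + \frac{1}{2!}(i\omega\sigma)^{q}R^2 - \ldots\right] \\
&= \frac{\delta(\tau)}{4\pi R} - \frac{\sigma^{q/2}}{4\pi}\delta^{q/2}(\tau) + \frac{1}{4\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{(n+1)!}R^{n}\sigma^{(n+1)q/2}\delta^{(n+1)q/2}(\tau).
\end{aligned}$$

These are the Green’s functions for the fractional diffusion equation in one, two and three dimensions. Simplification of these infinite sums can be addressed by considering suitable asymptotics, the most significant of which (for arbitrary values of $R$) is the case when the (fractional) diffusivity $D$ is large. In particular, we note that as $\sigma \to 0$,

$$G(R, \tau) = \frac{1}{2\sigma^{q/2}}\frac{1}{\tau^{1-(q/2)}} - \frac{1}{2}R\delta(\tau), \quad n = 1;$$

$$G(R, \tau) = \frac{1}{\sqrt{8\pi R}\,\sigma^{q/4}\tau^{1-(q/4)}}, \quad n = 2;$$

$$G(R, \tau) = \frac{\delta(\tau)}{4\pi R}, \quad n = 3.$$

Thus, in two dimensions, we can consider a solution to the fractional diffusion equation

$$\left(D^q\nabla^2 - \frac{\partial^q}{\partial t^q}\right)I(\mathbf{r}, t) = 0, \quad I(\mathbf{r}, t = 0) = I_0(\mathbf{r})$$

of the form (for $t_0 = 0$ and at time $t = T$)

$$I(x, y) = \frac{1}{2\sqrt{2\pi}(DT)^{1-q/4}}\frac{1}{(x^2 + y^2)^{1/4}} \otimes_2 I_0(x, y), \quad D \to \infty$$

which should be compared to the solution to the two-dimensional diffusion equation, i.e.

$$I(x, y) = \frac{1}{4\pi DT}\exp\left[-\frac{x^2 + y^2}{4DT}\right] \otimes_2 I_0(x, y).$$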

Observe that when the diffusivity is large and the diffusion time $t = T$ is small such that $DT = 1$, the difference between an image obtained by a full two-dimensional diffuser and a fractional diffuser is compounded in the difference between the convolution of the initial image with (ignoring scaling) the functions $\exp(-R^2/4)$ and $1/\sqrt{R}$. Compared with the Gaussian, the function $R^{-1/2}$ decays more rapidly and hence will have broader spectral characteristics, leading to an output that is less blurred than that produced by the convolution of the input with a Gaussian which, in the context of the fractional diffusion model introduced, is to be expected.

E. Optical Fractional Diffusers

Optical diffusers are used in a range of applications including the de-pixelation of Liquid Crystal Displays (LCDs), which becomes especially important when the LCD is composed of relatively few elements and is viewed at close range, e.g. LCD goggles. A common technique is to produce a thin film that is composed of a randomly distributed complex of scatterers (micro-spheroids whose relative permittivity is a weak perturbation of the body of the film) that is overlaid onto the LCD. The goal is to produce a diffuser that ‘manages’ the light in such a way that it de-pixelates the LCD while minimizing the angular distribution of light. This requires the manufacture of a fractional optical diffuser, an example of which is given in Figure 5 which shows the effect of a ‘light management film’ manufactured by Microsharp Corporation Limited (http://www.microsharp.co.uk).

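Now, since
$$\frac{\partial u}{\partial t} = \frac{\partial^{1-q}}{\partial t^{1-q}}\frac{\partial^{q}}{\partial t^{q}}u,$$
then from the fractional diffusion equation
$$\frac{\partial u}{\partial t} = D^q\frac{\partial^{1-q}}{\partial t^{1-q}}\nabla^2 u$$
and
$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial t}\left(\frac{\partial u}{\partial t}\right) = \frac{\partial}{\partial t}\left(D^q\frac{\partial^{1-q}}{\partial t^{1-q}}\nabla^2 u\right) = D^q\frac{\partial^{1-q}}{\partial t^{1-q}}\nabla^2\frac{\partial u}{\partial t}$$
$$= D^q\frac{\partial^{1-q}}{\partial t^{1-q}}\nabla^2\left(D^q\frac{\partial^{1-q}}{\partial t^{1-q}}\nabla^2 u\right) = D^{2q}\frac{\partial^{1-q}}{\partial t^{1-q}}\frac{\partial^{1-q}}{\partial t^{1-q}}\nabla^4 u$$
so that in general,
$$\frac{\partial^n u}{\partial t^n} = D^{nq}\frac{\partial^{n(1-q)}}{\partial t^{n(1-q)}}\nabla^{2n}u.$$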

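Now, since (see Appendix I)
$$\frac{\partial^{-q}}{\partial t^{-q}}I(\mathbf{r},t) = \frac{1}{\Gamma(q)t^{1-q}}\otimes I(\mathbf{r},t)$$
we can write the Taylor series for the field at $t = 0$ in terms of the field at $t = T$, i.e.
$$I(\mathbf{r},0) = I(\mathbf{r},T) + T\left[\frac{\partial}{\partial t}I(\mathbf{r},t)\right]_{t=T} - \frac{T^2}{2!}\left[\frac{\partial^2}{\partial t^2}I(\mathbf{r},t)\right]_{t=T} + \dots,$$
as
$$I(\mathbf{r},0) = I(\mathbf{r},T) + \frac{TD^q}{\Gamma(q)}\left[\frac{\partial}{\partial t}\frac{1}{t^{1-q}}\otimes\nabla^2 I(\mathbf{r},t)\right]_{t=T} - \frac{T^2D^{2q}}{2!\,\Gamma(2q)}\left[\frac{\partial^2}{\partial t^2}\frac{1}{t^{1-2q}}\otimes\nabla^4 I(\mathbf{r},t)\right]_{t=T}$$
$$+ \frac{T^3D^{3q}}{3!\,\Gamma(3q)}\left[\frac{\partial^3}{\partial t^3}\frac{1}{t^{1-3q}}\otimes\nabla^6 I(\mathbf{r},t)\right]_{t=T} - \dots$$
For the case when $T \ll 1$,
$$I(\mathbf{r},0) = I(\mathbf{r},T) + \frac{TD^q}{\Gamma(q)}\nabla^2 I(\mathbf{r},T).$$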

Thus, for an image $I(x,y)$ recorded in the image plane at $z = 0$ say, after the image $I_0$ has been fractionally diffused over a period of time $T$, we have
$$I_0(x,y) = I(x,y) + \frac{TD^q}{\Gamma(q)}\nabla^2 I(x,y).$$
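The first-order result above can be sketched as a small filter. This is a minimal illustration and not the author's implementation: the five-point discrete Laplacian, the edge-clamped border handling and the function names `laplacian` and `fractional_high_emphasis` are all assumptions made here, and the sign of the correction term follows the equation exactly as printed above.

```python
import math


def laplacian(img):
    """Five-point discrete Laplacian with edge-clamped borders (an assumed discretisation)."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, rows - 1)][x]
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, cols - 1)]
            out[y][x] = up + down + left + right - 4.0 * img[y][x]
    return out


def fractional_high_emphasis(img, T, D, q):
    """Return I + (T * D**q / Gamma(q)) * Laplacian(I), pixel by pixel."""
    scale = T * D ** q / math.gamma(q)
    lap = laplacian(img)
    return [[img[y][x] + scale * lap[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]
```

With `q = 1` the scale factor reduces to `T * D`, the fully diffusive case; a constant image is returned unchanged because its Laplacian vanishes.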



VIII. IMAGE SEGMENTATION METRIC

The result above provides us with an approach to estimating $q$ given $I$ and $I_0$ as follows. Let
$$P(x,y) = |I_0(x,y) - I(x,y)|$$
and
$$Q(x,y) = |\nabla^2 I(x,y)|;$$
then, with $R(x,y) = P(x,y)/Q(x,y)$,
$$\langle R(x,y)\rangle = \frac{TD^q}{\Gamma(q)}$$
where
$$\langle R(x,y)\rangle = \frac{\iint R(x,y)\,dx\,dy}{\iint dx\,dy}.$$
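The averaged ratio $\langle R\rangle$ and its logarithm can be estimated from two sampled images as in the following sketch. The five-point Laplacian, the edge-clamped borders, the decision to skip pixels where $Q = 0$ and the name `diffusion_metric` are assumptions introduced for illustration; the text does not specify a discretisation.

```python
import math


def laplacian(img):
    """Five-point discrete Laplacian with edge-clamped borders (an assumed discretisation)."""
    rows, cols = len(img), len(img[0])

    def at(y, x):
        return img[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    return [[at(y - 1, x) + at(y + 1, x) + at(y, x - 1) + at(y, x + 1) - 4.0 * at(y, x)
             for x in range(cols)] for y in range(rows)]


def diffusion_metric(original, diffused):
    """Return (<R>, ln<R>) with R = |I0 - I| / |Laplacian(I)|.

    Pixels where the Laplacian is exactly zero are skipped (assumption).
    """
    lap = laplacian(diffused)
    ratios = []
    for y, row in enumerate(diffused):
        for x, value in enumerate(row):
            q_val = abs(lap[y][x])
            if q_val > 0.0:
                ratios.append(abs(original[y][x] - value) / q_val)
    mean_ratio = sum(ratios) / len(ratios)
    return mean_ratio, math.log(mean_ratio)
```

If `original` is built from `diffused` by adding a fixed multiple `c` of its discrete Laplacian, every ratio equals `c`, so the function returns `c` and `ln c`.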

Hence,
$$\ln T - \ln\Gamma(q) + q\ln D = M$$
where $M$ is the metric (i.e. a measure of $q$) given by
$$M = \ln\langle R\rangle \le \ln\frac{\langle P\rangle}{\langle Q\rangle}.$$
This metric can be used effectively as a quality control measure for the manufacture of fractional optical diffusers (see Figure 5). For an image $I$ which has been formed by the fractional diffusion of a uniform light source in which $I_0$ is a constant,
$$I - I_0 = \frac{TD^q}{\Gamma(q)}\nabla^2(I - I_0)$$
and with $J = I - I_0$, $M = \ln$

IX. CONCLUSIONS

The use of a fully diffusive process for modelling strong (multiple) scattering has been considered and then extended to model intermediate scattering by generalizing the diffusion equation to fractional order $q \in (1, 2)$. The rationale for this approach follows that of a random walk model in which diffusive processes characterized by a $t^{1/2}$ scaling law and propagative processes characterized by a $t^1$ scaling law are generalized to a scaling law of the form $t^H$ where $\frac{1}{2} < H < 1$ is the Hurst exponent. The homogeneous diffusion equation provides a series solution to the inverse problem in which a Gaussian blurred image can be restored using appropriate FIR filters that depend on the order of the solution that is considered (i.e. the number of terms in the Taylor series). This approach has been extended to include fractional diffusion as defined by the equation (for an image $I$)
$$D^q\nabla^2 I(x,y,t) = \frac{\partial^q}{\partial t^q}I(x,y,t)$$
where $D$ is the fractional diffusivity and $I_0(x,y) = I(x,y,t=0)$. By computing the appropriate Green's function for this equation, we have shown that the point spread function of the image $I$ is determined by $R^{-1/2}$, $D \gg 1$. An FIR filter (a fractional high emphasis filter) has been designed which scales as $TD^q/\Gamma(q)$ compared with $TD$ for the fully diffusive case when $T \ll 1$.

APPENDIX I

The Riemann-Liouville transform is given by
$$\hat{I}^q f(t) = \frac{1}{\Gamma(q)}\int_{-\infty}^{t}\frac{f(\tau)}{(t-\tau)^{1-q}}\,d\tau, \quad q > 0$$
and the Weyl transform by
$$\hat{I}^q f(t) = \frac{1}{\Gamma(q)}\int_{t}^{\infty}\frac{f(\tau)}{(\tau-t)^{1-q}}\,d\tau, \quad q > 0$$
where
$$\Gamma(q) = \int_0^\infty t^{q-1}\exp(-t)\,dt.$$
For integer values of $q$ (i.e. when $q = n$ where $n$ is a non-negative integer), the Riemann-Liouville transform reduces to the standard Riemann integral. This transform is just a (causal) convolution of the function $f(t)$ with $t^{q-1}/\Gamma(q)$. For fractional differentiation, we can perform a fractional integration of appropriate order and then differentiate to an appropriate integer order. The reason for this is that direct fractional differentiation can lead to divergent integrals. Thus, the fractional differential operator $\hat{D}^q$ for $q > 0$ is given by
$$\hat{D}^q f(t) \equiv \frac{d^q}{dt^q}f(t) = \frac{d^n}{dt^n}[\hat{I}^{n-q}f(t)].$$
Another (conventional) approach to defining a fractional differential operator is based on using the formula for nth order


differentiation obtained by considering the definitions for the first, second, third etc. differentials using backward differencing and then generalising the formula by replacing $n$ with $q$. This approach provides us with the result [21]
$$\hat{D}^q f(t) = \lim_{N\to\infty}\left[\frac{(t/N)^{-q}}{\Gamma(-q)}\sum_{j=0}^{N-1}\frac{\Gamma(j-q)}{\Gamma(j+1)}f\left(t - j\frac{t}{N}\right)\right].$$
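The limit above can be explored numerically by fixing a large but finite $N$. The sketch below is illustrative only: the function name `gl_fractional_derivative`, the default `n_terms` and the use of a recurrence for the ratio $\Gamma(j-q)/[\Gamma(-q)\Gamma(j+1)]$ (chosen to avoid evaluating large Gamma functions) are assumptions made here.

```python
import math


def gl_fractional_derivative(f, t, q, n_terms=2000):
    """Approximate D^q f(t) by evaluating the sum at a finite N = n_terms."""
    h = t / n_terms
    weight = 1.0          # Gamma(j - q) / (Gamma(-q) * Gamma(j + 1)) at j = 0
    total = f(t)
    for j in range(1, n_terms):
        weight *= (j - 1 - q) / j   # ratio of successive weights
        total += weight * f(t - j * h)
    return total / h ** q


# Half-derivative of f(t) = t at t = 1; the closed form is 2 / sqrt(pi).
approx = gl_fractional_derivative(lambda x: x, 1.0, 0.5)
exact = 2.0 / math.sqrt(math.pi)
```

For `q = 1` the weights are 1, -1, 0, 0, ..., so only the two most recent samples contribute; for non-integer `q` every weight is non-zero, so every earlier sample contributes to the sum.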

A review of this result shows that for $q = 1$, this is a point process but for other values it is not, i.e. the evaluation of a fractional differential operator depends on the history of the function in question. Thus, unlike an integer differential operator, a fractional differential operator has 'memory'. Although the memory of this process fades, it does not do so quickly enough to allow truncation of the series in order to retain acceptable accuracy. The concept of memory association can also be seen from the result
$$\hat{D}^q f(t) = \frac{d^n}{dt^n}[\hat{I}^{n-q}f(t)]$$
where
$$\hat{I}^{n-q}f(t) = \frac{1}{\Gamma(n-q)}\int_{-\infty}^{t}\frac{f(\tau)}{(t-\tau)^{1+q-n}}\,d\tau, \quad n - q > 0$$
in which the value of $\hat{I}^{n-q}f(t)$ at a point $t$ depends on the behaviour of $f(t)$ from $-\infty$ to $t$ via a convolution with the kernel $t^{n-q-1}/\Gamma(n-q)$. The convolution process is of course dependent on the history of the function $f(t)$ for a given kernel and thus, in this context, we can consider a fractional derivative defined via the result above to have memory.

A. The Laplace Transform and the Half Integrator

It is informative at this point to consider the application of the Laplace transform to identify an ideal integrator and then a half integrator. The Laplace transform is given by
$$\hat{L}[f(t)] \equiv F(p) = \int_0^\infty f(t)\exp(-pt)\,dt$$
and from this result we can derive the transform of a derivative given by
$$\hat{L}[f'(t)] = pF(p) - f(0)$$
and the transform of an integral given by
$$\hat{L}\left[\int_0^t f(\tau)\,d\tau\right] = \frac{1}{p}F(p).$$
Now, suppose we have a standard time invariant linear system whose input is $f(t)$ and whose output is given by
$$s(t) = f(t)\otimes g(t)$$
where the convolution is causal, i.e.
$$s(t) = \int_0^t f(\tau)g(t-\tau)\,d\tau.$$
Suppose we let
$$g(t) = H(t) = \begin{cases}1, & t > 0;\\ 0, & t < 0.\end{cases}$$
Then, $G(p) = 1/p$ and the system becomes an ideal integrator:
$$s(t) = f(t)\otimes H(t) = \int_0^t f(t-\tau)\,d\tau = \int_0^t f(\tau)\,d\tau.$$
Now, consider the case when we have a time invariant linear system with an impulse response function given by
$$g(t) = \frac{H(t)}{\sqrt{t}} = \begin{cases}|t|^{-1/2}, & t > 0;\\ 0, & t < 0.\end{cases}$$
The output of this system is $f\otimes g$ and the output of such a system with input $f\otimes g$ is $f\otimes g\otimes g$. Now
$$g(t)\otimes g(t) = \int_0^t\frac{d\tau}{\sqrt{\tau}\sqrt{t-\tau}} = \int_0^{\sqrt{t}}\frac{2x\,dx}{x\sqrt{t-x^2}} = 2\left[\sin^{-1}\left(\frac{x}{\sqrt{t}}\right)\right]_0^{\sqrt{t}} = \pi.$$
Hence,
$$\frac{H(t)}{\sqrt{\pi t}}\otimes\frac{H(t)}{\sqrt{\pi t}} = H(t)$$
and the system defined by the impulse response function $H(t)/\sqrt{\pi t}$ represents a 'half-integrator' with a Laplace transform given by
$$\hat{L}\left[\frac{H(t)}{\sqrt{\pi t}}\right] = \frac{1}{\sqrt{p}}.$$
This result provides an approach to working with fractional integrators and/or differentiators using the Laplace transform. Fractional differential and integral operators can be defined and used in a similar manner to those associated with conventional or integer order calculus and we now provide an overview of such operators.

B. Operators of Integer Order

The following operators are all well-defined, at least with respect to all test functions $u(t)$ say which are (i) infinitely differentiable and (ii) of compact support (i.e. vanish outside some finite interval).

Integral Operator:
$$\hat{I}u(t) \equiv \hat{I}^1u(t) = \int_{-\infty}^t u(\tau)\,d\tau.$$
Differential Operator:
$$\hat{D}u(t) \equiv \hat{D}^1u(t) = u'(t).$$
Identity Operator:
$$\hat{I}^0u(t) = u(t) = \hat{D}^0u(t).$$



Now,
$$\hat{I}[\hat{D}u](t) = \int_{-\infty}^t u'(\tau)\,d\tau = u(t)$$
and
$$\hat{D}[\hat{I}u](t) = \frac{d}{dt}\int_{-\infty}^t u(\tau)\,d\tau = u(t)$$
so that
$$\hat{I}^1\hat{D}^1 = \hat{D}^1\hat{I}^1 = \hat{I}^0.$$
For $n$ (integer) order:
$$\hat{I}^nu(t) = \int_{-\infty}^{t}d\tau_{n-1}\dots\int_{-\infty}^{\tau_2}d\tau_1\int_{-\infty}^{\tau_1}u(\tau)\,d\tau,$$
$$\hat{D}^nu(t) = u^{(n)}(t)$$
and
$$\hat{I}^n[\hat{D}^nu](t) = u(t) = \hat{D}^n[\hat{I}^nu](t).$$

C. Convolution Representation

Consider the function
$$t_+^{q-1}(t) \equiv |t|^{q-1}H(t) = \begin{cases}|t|^{q-1}, & t > 0;\\ 0, & t < 0\end{cases}$$
which, for any $q > 0$ defines a function that is locally integrable. We can then define an integral of order $n$ in terms of a convolution as
$$\hat{I}^nu(t) = \left(u\otimes\frac{1}{(n-1)!}t_+^{n-1}\right)(t) = \frac{1}{(n-1)!}\int_{-\infty}^t(t-\tau)^{n-1}u(\tau)\,d\tau = \frac{1}{(n-1)!}\int_0^\infty\tau^{n-1}u(t-\tau)\,d\tau.$$
In particular,
$$\hat{I}^1u(t) = (u\otimes H)(t) = \int_{-\infty}^t u(\tau)\,d\tau.$$
These are classical (absolutely convergent) integrals and the identity operator admits a formal convolution representation, using the delta function, i.e.
$$\hat{I}^0u(t) = \int_{-\infty}^\infty\delta(\tau)u(t-\tau)\,d\tau$$
where
$$\delta(t) = \hat{D}H(t).$$
Similarly,
$$\hat{D}^nu(t) \equiv \hat{I}^{-n}u(t) = \int_{-\infty}^\infty\delta^{(n)}(\tau)u(t-\tau)\,d\tau = u^{(n)}(t).$$
On the basis of the material discussed above, we can now formally extend the integral operator to fractional order and consider the operator
$$\hat{I}^qu(t) = \frac{1}{\Gamma(q)}\int_{-\infty}^\infty u(\tau)t_+^{q-1}(t-\tau)\,d\tau = \frac{1}{\Gamma(q)}\int_{-\infty}^t u(\tau)t_+^{q-1}(t-\tau)\,d\tau$$
where
$$\Gamma(q) = \int_0^\infty t^{q-1}\exp(-t)\,dt, \quad q > 0$$
with the fundamental property that
$$\Gamma(q+1) = q\Gamma(q).$$
Here, $I^q$ is an operator representing a time invariant linear system with impulse response function $t_+^{q-1}(t)/\Gamma(q)$ and transfer function $1/p^q$. For the cascade connection of $I^{q_1}$ and $I^{q_2}$ we have
$$\hat{I}^{q_1}[\hat{I}^{q_2}u(t)] = \hat{I}^{q_1+q_2}u(t).$$
This classical convolution integral representation holds for all real $q > 0$ (and formally for $q = 0$, with the delta function playing the role of an impulse function and with a transfer function equal to the constant 1).

D. Fractional Differentiation

For $0 < q < 1$, if we define the (Riemann-Liouville) derivative of order $q$ as
$$\hat{D}^qu(t) \equiv \frac{d}{dt}[\hat{I}^{1-q}u](t) = \frac{1}{\Gamma(1-q)}\frac{d}{dt}\int_{-\infty}^t(t-\tau)^{-q}u(\tau)\,d\tau,$$
then,
$$\hat{D}^qu(t) = \frac{1}{\Gamma(1-q)}\int_{-\infty}^t(t-\tau)^{-q}u'(\tau)\,d\tau \equiv \hat{I}^{1-q}u'(t).$$
Hence,
$$\hat{I}^q[\hat{D}^qu] = \hat{I}^q[\hat{I}^{1-q}u'] = \hat{I}^1u' = u$$
and $\hat{D}^q$ is the formal inverse of the operator $\hat{I}^q$. Given any $q > 0$, we can always write $\lambda = n - 1 + q$ and then define
$$\hat{D}^\lambda u(t) = \frac{1}{\Gamma(1-q)}\frac{d^n}{dt^n}\int_{-\infty}^t u(\tau)(t-\tau)^{-q}\,d\tau.$$
$D^q$ is an operator representing a time invariant linear system consisting of a cascade combination of an ideal differentiator and a fractional integrator of order $1 - q$. For $D^\lambda$ we replace the single ideal differentiator by $n$ of them, such that
$$\hat{D}^0u(t) = \frac{1}{\Gamma(1)}\frac{d}{dt}\int_{-\infty}^t u(\tau)\,d\tau = u(t) \equiv \int_{-\infty}^\infty u(\tau)\delta(t-\tau)\,d\tau$$
and
$$\hat{D}^nu(t) = \frac{1}{\Gamma(1)}\frac{d^{n+1}}{dt^{n+1}}\int_{-\infty}^t u(\tau)\,d\tau = u^{(n)}(t) \equiv \int_{-\infty}^\infty u(\tau)\delta^{(n)}(t-\tau)\,d\tau.$$
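The half-integrator identity of Section A and the cascade property $\hat{I}^{q_1}\hat{I}^{q_2} = \hat{I}^{q_1+q_2}$ of Section C can both be checked with a few lines of code. The sketch below is an illustration under stated assumptions: `self_convolution` uses a plain midpoint rule (which converges slowly because of the endpoint singularities), and `rl_integral_of_power` applies the standard power rule $\hat{I}^q t^m = \Gamma(m+1)t^{m+q}/\Gamma(m+1+q)$ for causal inputs $t^mH(t)$; both function names are introduced here.

```python
import math


def self_convolution(t, n=200_000):
    """Midpoint-rule estimate of the integral of 1 / (sqrt(tau) * sqrt(t - tau)) over (0, t)."""
    h = t / n
    total = 0.0
    for j in range(n):
        tau = (j + 0.5) * h
        total += 1.0 / math.sqrt(tau * (t - tau))
    return total * h


def rl_integral_of_power(coeff, power, q):
    """Fractional integral of order q applied to coeff * t**power * H(t).

    Returns the (coefficient, exponent) pair of the resulting power law.
    """
    factor = math.gamma(power + 1.0) / math.gamma(power + 1.0 + q)
    return coeff * factor, power + q


half_once = rl_integral_of_power(1.0, 0.0, 0.5)      # half-integral of H(t)
half_twice = rl_integral_of_power(*half_once, 0.5)   # applied a second time
```

`self_convolution(t)` stays close to $\pi$ whatever the value of `t`, and `half_twice` has coefficient 1 and exponent 1 up to rounding, i.e. two half-integrations of a unit step give the ramp $t$ that one full integration would produce.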

In addition to the conventional and classical definitions of fractional derivatives and integrals, more general definitions are available including the Erdélyi-Kober fractional integral [25]
$$\frac{t^{-p-q+1}}{\Gamma(q)}\int_0^t\frac{\tau^{p-1}f(\tau)}{(t-\tau)^{1-q}}\,d\tau, \quad q > 0,\ p > 0$$
which is a generalisation of the Riemann-Liouville fractional integral and the integral
$$\frac{t^{p}}{\Gamma(q)}\int_t^\infty\frac{\tau^{-q-p}f(\tau)}{(\tau-t)^{1-q}}\,d\tau, \quad q > 0,\ p > 0$$
which is a generalization of the Weyl integral. Further definitions exist based on the application of hypergeometric functions and operators involving other special functions such as the Meijer G-function and the Fox H-function [26]. Moreover, all such operators leading to a fractional integral of the Riemann-Liouville type and the Weyl type have the general forms (through induction)
$$\hat{I}^qf(t) = t^{q-1}\int_{-\infty}^t\Phi\left(\frac{\tau}{t}\right)\tau^{-q}f(\tau)\,d\tau$$
and
$$\hat{I}^qf(t) = t^{-q}\int_t^\infty\Phi\left(\frac{t}{\tau}\right)\tau^{q-1}f(\tau)\,d\tau$$
respectively, where the kernel $\Phi$ is an arbitrary continuous function so that the integrals above make sense in sufficiently large functional spaces. Although there are a number of approaches that can be used to define a fractional differential/integral, there is one particular definition, which in terms of its 'ease of use' and wide ranging applications, is of significant value and is based on the Fourier transform, i.e.
$$\frac{d^q}{dt^q}f(t) = \frac{1}{2\pi}\int_{-\infty}^\infty(i\omega)^qF(\omega)\exp(i\omega t)\,d\omega$$
where $F(\omega)$ is the Fourier transform of $f(t)$. When $q = 1, 2, 3, \dots$, this definition reduces to a well known result that is trivial to derive in which, for example, the 'filter' $i\omega$ (for the case when $q = 1$) is referred to as a 'differentiator'. When $q < 0$, we have a definition for the fractional integral where, in the case of $q = -1$, for example, the filter $(i\omega)^{-1}$ is an 'integrator'. When $q = 0$ we just have $f(t)$ expressed in terms of its Fourier transform $F(\omega)$. This Fourier based definition of a fractional derivative can be extended further to include a definition for a 'fractional Laplacian' $\nabla^q$ where for $n$ dimensions
$$\nabla^q \equiv -\frac{1}{(2\pi)^n}\int d^n\mathbf{k}\,k^q\exp(i\mathbf{k}\cdot\mathbf{r}), \quad k = |\mathbf{k}|$$
and $\mathbf{r}$ is an $n$-dimensional vector. This is the fractional Riesz operator. It is designed to provide a result that is compatible with the case of $q = 2$ for $n > 1$, i.e. $\nabla^2 \Longleftrightarrow -k^2$ (which is the reason for introducing the negative sign). Another equally valid generalization is
$$\nabla^q \equiv \frac{1}{(2\pi)^n}\int d^n\mathbf{k}\,(ik)^q\exp(i\mathbf{k}\cdot\mathbf{r}), \quad k = |\mathbf{k}|$$
which introduces a $q$ dependent phase factor of $\pi q/2$ into the operator.

E. Fractional Dynamics

Mathematical modelling using (time dependent) fractional Partial Differential Equations (PDEs) is generally known as fractional dynamics [27], [28]. A number of works have shown a close relationship between fractional diffusion equations of the type (where $p$ is the space-time dependent PDF and $\sigma$ is the generalized coefficient of diffusion)
$$\nabla^2p - \sigma\frac{\partial^q}{\partial t^q}p = 0, \quad 0 < q \le 1$$
and
$$\nabla^qp - \sigma\frac{\partial}{\partial t}p = 0, \quad 0 < q \le 2$$
and continuous time random walks with either temporal or spatial scale invariance (fractal walks). Fractional diffusion equations of this type have been shown to produce a framework for the description of anomalous diffusion phenomena and Lévy-type behaviour. In addition, certain classes of fractional differential equations are known to yield Lévy-type distributions. For example, the normalized one-sided Lévy-type PDF
$$p(x) = \frac{a^q}{\Gamma(q)}\frac{\exp(-a/x)}{x^{1+q}}, \quad a > 0,\ x > 0$$
is a solution of the fractional integral equation
$$x^{2q}p(x) = a^q\hat{I}^{-q}p(x)$$
where
$$\hat{I}^{-q}p(x) = \frac{1}{\Gamma(q)}\int_0^x\frac{p(y)}{(x-y)^{1-q}}\,dy, \quad q > 0.$$
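The Fourier based definition of the fractional derivative given earlier in this appendix lends itself to a direct numerical sketch: transform, multiply by $(i\omega)^q$, and transform back. The code below is a minimal illustration for one period of a uniformly sampled periodic signal with $q > 0$; the direct $O(n^2)$ transform, the principal branch used for $(i\omega)^q$, the zero gain at $\omega = 0$ and the name `fourier_fractional_derivative` are assumptions made here.

```python
import cmath
import math


def fourier_fractional_derivative(samples, period, q):
    """Apply the filter (i*omega)**q to one period of a uniformly sampled signal (q > 0)."""
    n = len(samples)
    # Forward discrete Fourier transform (direct sum, adequate for short signals).
    spectrum = [sum(samples[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
                for k in range(n)]
    filtered = []
    for k, value in enumerate(spectrum):
        signed_k = k if k <= n // 2 else k - n       # map bin index to signed frequency
        omega = 2.0 * math.pi * signed_k / period
        gain = 0.0 if omega == 0.0 else (1j * omega) ** q
        filtered.append(gain * value)
    # Inverse discrete Fourier transform.
    return [sum(filtered[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]
```

Applied to samples of $\sin t$ with `q = 0.5`, the result follows $\sin(t + \pi/4)$; applying the same half-derivative a second time gives $\cos t$, the ordinary first derivative.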

Another example involves the solution to the anomalous diffusion equation
$$\nabla^qp - \tau\frac{\partial}{\partial t}p = 0, \quad 0 < q \le 2.$$
Fourier transforming this equation and using the fractional Riesz operator defined previously, we have
$$\frac{\partial}{\partial t}P(k,t) = -\frac{1}{\tau}k^qP(k,t)$$
which has the general solution
$$P(k,t) = \exp(-t\,|k|^q/\tau), \quad t > 0,$$
which is the characteristic function of a Lévy distribution. This analysis can be extended further by considering a fractal based generalization of the Fokker-Planck-Kolmogorov (FPK) equation [29]
$$\frac{\partial^q}{\partial t^q}p(x,t) = \frac{\partial^\beta}{\partial x^\beta}[s(x)p(x,t)]$$
where $s$ is an arbitrary function and $0 < q \le 1$, $0 < \beta \le 2$. This equation is referred to as the fractal FPK equation; the standard FPK equation is of course recovered for $q = 1$ and $\beta = 2$. The characteristic function associated with $p(x,t)$ is given by
$$P(k,t) = \exp(-ak^\beta t^q)$$
where $a$ is a constant which, again, is a characteristic of a Lévy distribution. Finally, $d$-dimensional fractional master equations of the type [30], [31]
$$\frac{\partial^q}{\partial t^q}p(\mathbf{r},t) = \sum_{\mathbf{s}}w(\mathbf{r}-\mathbf{s})p(\mathbf{s},t), \quad 0 < q \le 1$$
can be used to model non-equilibrium phase transitions where $p$ denotes the probability of finding the diffusing entity at a position $\mathbf{r} \in \mathbb{R}^d$ at time $t$ (assuming that it was at the origin $\mathbf{r} = 0$ at time $t = 0$) and $w$ are the fractional transition rates which measure the propensity for a displacement $\mathbf{r}$ in units of $1/(\text{time})^q$. These equations conform to the general theory of continuous time random walks and provide models for random walks of fractal time.

ACKNOWLEDGMENT

The author would like to thank Professor Michael Rycroft and Professor Roy Hoskins for reading the original manuscript and the suggestions they made in its preparation.

REFERENCES

[1] M. Bertero and B. Boccacci, Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, 1998.
[2] J. M. Blackledge, Quantitative Coherent Imaging, Academic Press, 1989.
[3] J. M. Blackledge, Digital Image Processing, Horwood Scientific Publishing, 2005.
[4] R. Hilfer, Foundations of Fractional Dynamics, Fractals 3(3), 549-556, 1995.
[5] R. H. T. Bates and M. J. McDonnal, Image Restoration and Reconstruction, Oxford Science Publications, 1986.
[6] R. C. Gonzalez and P. Wintz, Digital Image Processing, Addison-Wesley, 1987.
[7] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw-Hill, 1953.
[8] J. A. Stratton, Electromagnetic Theory, McGraw-Hill, 1941.
[9] R. H. Atkin, Theoretical Electromagnetism, Heinemann, 1962.
[10] G. F. Roach, Green's Functions (Introductory Theory with Applications), Van Nostrand Reinhold, 1970.
[11] E. Butkov, Mathematical Physics, Addison-Wesley, 1973.
[12] R. G. Newton, Inverse Schrödinger Scattering in Three Dimensions, Texts and Monographs in Physics, Springer-Verlag, 1989.
[13] G. A. Evans, J. M. Blackledge and P. Yardley, Analytical Solutions to Partial Differential Equations, Springer-Verlag, 2000.
[14] R. Jost and W. Kohn, Construction of a Potential from a Phase Shift, Phys. Rev. 37, 977-922, 1952.
[15] M. J. Turner, J. M. Blackledge and P. Andrews, Fractal Geometry in Digital Imaging, Academic Press, 1997.
[16] J. M. Blackledge, Digital Signal Processing, Horwood Scientific Publishing, 2003.
[17] M. Buchanan, One Law to Rule Them All, New Scientist, November, 30-35, 1997.
[18] R. Hilfer, Scaling Theory and the Classification of Phase Transitions, Modern Physics Letters B 6(13), 773-784, 1992.
[19] G. F. Roach, Green's Functions (Introductory Theory with Applications), Van Nostrand Reinhold, 1970.
[20] E. N. Economou, Green's Functions in Quantum Physics, Springer-Verlag, 1979.
[21] K. B. Oldham and J. Spanier, The Fractional Calculus, Academic Press, 1974.
[22] A. Dold and B. Eckmann (Eds.), Fractional Calculus and its Applications, Springer, 1975.
[23] K. S. Miller and B. Ross, An Introduction to the Fractional Calculus and Fractional Differential Equations, Wiley, 1993.
[24] S. G. Samko, A. Kilbas and O. I. Marichev, Fractional Integrals and Derivatives: Theory and Applications, Gordon and Breach, 1993.
[25] I. N. Sneddon, The use in Mathematical Physics of Erdélyi-Kober operators and of some of their Generalizations, Lecture Notes in Mathematics (Eds. A. Dold and B. Eckmann), Springer, 37-79, 1975.
[26] V. Kiryakova, Generalized Fractional Calculus and Applications, Longman, 1994.
[27] R. Hilfer, Foundations of Fractional Dynamics, Fractals 3(3), 549-556, 1995.
[28] A. Compte, Stochastic Foundations of Fractional Dynamics, Phys. Rev. E, 53(4), 4191-4193, 1996.
[29] S. Jespersen, R. Metzler and H. C. Fogedby, Lévy Flights in External Force Fields: Langevin and Fractional Fokker-Planck Equations and Their Solutions, Phys. Rev. E, 59(3), 2736-2745, 1995.
[30] R. Hilfer, Exact Solutions for a Class of Fractal Time Random Walks, Fractals, 3(1), 211-216, 1995.
[31] R. Hilfer and L. Anton, Fractional Master Equations and Fractal Time Random Walks, Phys. Rev. E, 51(2), R848-R851, 1995.

Jonathan Blackledge received a BSc in Physics from Imperial College, London University in 1980, a Diploma of Imperial College in Plasma Physics in 1981 and a PhD in Theoretical Physics from Kings College, London University in 1983. As a Research Fellow of Physics at Kings College (London University) from 1984 to 1988, he specialized in information systems engineering undertaking work primarily for the defence industry. This was followed by academic appointments at the Universities of Cranfield (Senior Lecturer in Applied Mathematics) and De Montfort (Professor in Applied Mathematics and Computing) where he established new post-graduate MSc/PhD programmes and research groups in computer aided engineering and informatics. In 1994, he co-founded Management and Personnel Services Limited (http://www.mapstraining.co.uk) where he is currently Executive Director for training and education. His work for Microsharp (Director of R & D, 1998-2002) included the development of manufacturing processes now being used worldwide for digital information display units. In 2002, he founded a group of companies specialising in information security and cryptology for the defence and intelligence communities, actively creating partnerships between industry and academia. He currently holds academic posts in the United Kingdom and South Africa, and in 2007 was awarded a Fellowship of the City and Guilds London Institute for his role in the development of the Higher Level Qualification programmes in engineering and computing, most recently for the nuclear industry.


ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Jonathan M. Blackledge: Digital Watermarking and Self-Authentication using Chirp Coding

Digital Watermarking and Self-Authentication using Chirp Coding Jonathan M Blackledge, Fellow, IET

Abstract— This paper discusses a new approach to ‘watermarking’ digital signals using linear frequency modulated or ‘chirp’ coding. The principles underlying this approach are based on the use of a matched filter to provide a reconstruction of a chirped code that is uniquely robust, i.e. in the case of very low signal-to-noise ratios. Chirp coding for authenticating data is generic in the sense that it can be used for a range of data types and applications (the authentication of speech and audio signals, for example). The theoretical and computational aspects of the matched filter and the properties of a chirp are revisited to provide the essential background to the method. Signal code generating schemes are then addressed and details of the coding and decoding techniques considered. Index Terms— Digital Watermarking, Chirp Coding, Data Authentication, Self-Authentication

Manuscript received June 1, 2007. Jonathan Blackledge is Professor of Information and Communications Technology, Applied Signal Processing Research Group, Department of Electronic and Electrical Engineering, Loughborough University, England and Professor of Computer Science, Department of Computer Science, University of the Western Cape, Cape Town, Republic of South Africa (e-mail: [email protected]).

I. INTRODUCTION

Digital watermarking has been researched for many years in order to achieve methods which provide both anti-counterfeiting and authentication facilities [1]. One of the equations that underpins this technology is based on the model for the signal given by (e.g. [2], [3] and [4])

$$s = \hat{P}f + n \qquad (1)$$

where $f$ is the information content for the signal, $\hat{P}$ is a linear operator, $n$ is noise and $s$ is the output signal. This equation is usually taken to describe a stationary process which includes the characterisation of $n$ (i.e. the probability density function of $n$ is assumed to be invariant in time). In the field of cryptography, the operation $\hat{P}f$ is referred to as the process of 'diffusion' and the process of adding noise (i.e. $\hat{P}f + n$) is referred to as the process of 'confusion'. The principal 'art' is to develop methods in which the processes of diffusion and confusion are maximized; one important criterion being that the output $s$ should be dominated by the noise $n$ which in turn should be characterized by maximum Entropy (i.e. a uniform statistical distribution) [6]. Instead of $n$ being taken to be noise, suppose that $n$ is a known signal and that $\|n\| \gg \|\hat{P}f\|$. In this case it may be possible to embed or 'hide' the information contained in $f$ in the signal $n$ without significantly perturbing it. The process of hiding secret information in signals or images is known as Steganography [5] and being able to recover $f$ from $s$ in equation (1) can provide a way of authenticating the signal $n$. If, in addition, it is possible to determine that a copy of $s$ has been made leading to some form of data degradation and/or corruption that can be conveyed through an appropriate analysis of $f$, then a scheme can be developed that provides a check on: (i) the authenticity of the data $n$; (ii) its fidelity [7], [8]. In this case, the signal $f$ is an example of a watermark.

Formally, the recovery of $f$ from $s$ is based on the inverse process
$$f = \hat{P}^{-1}(s - n)$$
where $\hat{P}^{-1}$ is the inverse operator. Clearly, this requires the signal $n$ to be known a priori and that the inverse process $\hat{P}^{-1}$ is well defined and computationally stable. Since the host signal $n$ must be known in order to recover the watermark $f$, this approach leads to a private watermarking scheme in which the field $n$ represents a key. In addition, the operator $\hat{P}$ (and its inverse $\hat{P}^{-1}$) can be key dependent. The value of this operator key dependency relies on the nature and properties of the operator that is used and whether it is compounded in an algorithm that is required to be in the public domain, for example.

Another approach is to consider the case in which the signal $n$ is unknown and to consider the problem of extracting the watermark $f$ in the absence of knowledge of this signal. In this case, the reconstruction is based on the result
$$f = \hat{P}^{-1}s + m$$
where
$$m = -\hat{P}^{-1}n.$$
If a process $\hat{P}$ is available in which $\|\hat{P}^{-1}s\| \gg \|m\|$, then an approximate reconstruction of $f$ may be obtained in which $m$ is determined by the original signal-to-noise ratio of the data $s$ and hence, the level of covertness of the information $\hat{P}f$ (the diffused watermark). In this case, it may be possible to post-process the reconstruction and recover a relatively high-fidelity version of the watermark, i.e.
$$f \sim \hat{P}^{-1}s.$$
This approach (if available) does not rely on a private key (assuming $\hat{P}$ is not key dependent).
The ability to recover the watermark only requires knowledge of the operator $\hat{P}$ (and its inverse) and post-processing options as required. The problem is to find an operator that is able to diffuse and recover the watermark $f$ effectively in the presence of the signal $n$ when $\|\hat{P}f\|$ … 0 and $0 < b < 1$ determine the amplitude and the SNR of $s$ respectively where $a = \|n(t)\|_\infty$. The coefficient $a$ is required to provide a watermarked signal whose amplitude is compatible with the original signal $n$. The value of $b$ is adjusted to provide an output that is acceptable in the application to be considered and to provide a robust reconstruction of the binary sequence by correlating $s(t)$ with $\text{chirp}(t)$, $t \in [0, T)$.

IV. CODE GENERATION

In the previous section, the method of chirp coding a binary sequence and watermarking the signal $n(t)$ has been discussed where it is assumed that the sequence is generated from this same signal. In this section, the details of this method are presented. There is a wide variety of coding methods that can be applied [15]. The problem is to convert the salient characteristics of the signal $n(t)$ into a sequence of bits that is relatively short and conveys information on the

signal that is unique to its overall properties. In principle, there are a number of ways of undertaking this. For example, in practice the digital signal $n_i$, which is composed of an array of floating point numbers, could be expressed in binary form and each element concatenated to form a contiguous bit stream. However, the length of the code (i.e. the total number of bits in the stream) will tend to be large, leading to high computational costs in terms of the application of chirp coding/decoding. What is required is a process that yields a relatively short binary sequence (when compared with the original signal) that reflects the important properties of the signal in its entirety. Two approaches are considered here: (i) Power Spectral Density decomposition and (ii) Wavelet decomposition [16].

A. Power Spectral Density Decomposition

Let $N(\omega)$ be the Fourier transform of $n(t)$ and define the Power Spectrum $P(\omega)$ as
$$P(\omega) = |N(\omega)|^2.$$
An important property of the binary sequence is that it should describe the spectral characteristics of the signal in its entirety. Thus, if, for example, the binary sequence is based on just the low frequency components of the signal, then any distortion of the high frequencies of the watermarked signal will not affect the recovered watermark and the signal will still be authenticated. Hence, we consider the case where the power spectrum is segmented into $N$ components, i.e.
$$P_1(\omega) = P(\omega), \quad \omega \in [0, \Omega_1)$$
$$P_2(\omega) = P(\omega), \quad \omega \in [\Omega_1, \Omega_2)$$
$$\vdots$$
$$P_N(\omega) = P(\omega), \quad \omega \in [\Omega_{N-1}, \Omega_N)$$
Note that it is assumed that the signal $n(t)$ is band-limited with a bandwidth of $\Omega_N$. The set of functions $P_1, P_2, \dots, P_N$ now represents the complete spectral characteristics of the signal $n(t)$. Since each of these functions represents a unique part of the spectrum, we can consider a single measure as an identifier or tag. A natural measure to consider is the energy, which is given by the integral of the functions over their frequency range. In particular, we consider the energy values in terms of their contribution to the spectrum as a percentage, i.e.
$$E_1 = \frac{100}{E}\int_0^{\Omega_1}P_1(\omega)\,d\omega$$
$$E_2 = \frac{100}{E}\int_{\Omega_1}^{\Omega_2}P_2(\omega)\,d\omega$$
$$\vdots$$
$$E_N = \frac{100}{E}\int_{\Omega_{N-1}}^{\Omega_N}P_N(\omega)\,d\omega$$
where
$$E = \int_0^{\Omega_N}P(\omega)\,d\omega.$$
Code generation is then based on the following steps:
1) Rounding to the nearest integer the (floating point) values of $E_i$ to decimal integer form: $e_i = \text{round}(E_i), \ \forall i$
2) Decimal integer to binary string conversion: $b_i = \text{binary}(e_i)$
3) Concatenation of the binary string array $b_i$ to a binary sequence: $f_j = \text{cat}(b_i)$
The watermark $f_j$ is then chirp coded as discussed in Section V.

B. Wavelet decomposition

Wavelet signal analysis is based on convolution type operations which include a scaling property in terms of the amplitude and temporal extent of the convolution kernel (e.g. [3], [17], [18] and [19]). There is a close synergy between the wavelet transform and imaging science. For example, in Fresnel optics, the two-dimensional (coherent) optical wavefield $u$ generated by an object function $f$ (in the object plane at a distance $z$) is given by (e.g. [4] and [20])
$$u(x,y,L) = p(x,y,L)\otimes\otimes f(x,y)$$
where
$$p(x,y,L) = \frac{1}{iL}\exp\left(i\frac{2\pi z}{\lambda}\right)\exp\left[\frac{i\pi}{L}(x^2+y^2)\right]$$
and $L = \lambda z$ for wavelength $\lambda$. An important feature of this result is that the amplitude of the kernel $p$ and its scale length is determined by the reciprocal of the wavelength $\lambda$. Physically, this implies that as the wavelength decreases, the 'resolving power' of an image given by $I(x,y,L) = |u(x,y,L)|^2$ increases, the bandwidth of $u$ being proportional to $\lambda^{-1}$. Thus, by considering a hypothetical Fresnel imaging system, in which the wavelength can be varied by the user, we can consider the imaging system to have multi-resolution properties. The Fresnel transform is essentially a wavelet transform with a wavelet determined by a (two-dimensional) chirp function. The multi-resolution properties of the wavelet transform have been crucial to their development and success in the analysis and processing of signals. Wavelet transformations play a central role in the study of self-similar or fractal signals. The transform constitutes as natural a tool for the manipulation of self-similar or scale invariant signals as the Fourier transform does for translation invariant signals such as stationary and periodic signals. In general, the wavelet transformation of a signal $f(t)$ say
$$f(t) \leftrightarrow F_L(t)$$


is defined in terms of projections of $f(t)$ onto a family of functions that are all normalized dilations and translations of a prototype 'wavelet' function $w$, i.e.
$$\hat{W}[f(t)] = F_L(\tau) = \int f(t)w_L(t,\tau)\,dt$$
where
$$w_L(t,\tau) = \frac{1}{\sqrt{|L|}}w\left(\frac{\tau - t}{L}\right).$$
The parameters $L$ and $\tau$ are continuous dilation and translation parameters respectively, and take on values in the range $-\infty < L, \tau < \infty$, $L \ne 0$. Note that the wavelet transformation is essentially a convolution transform in which $w(t)$ is the convolution kernel but with a factor $L$ introduced. The introduction of this factor provides dilation and translation properties into the convolution integral that gives it the ability to analyse signals in a multi-resolution role (the convolution integral is now a function of $L$). A multi-resolution signal analysis is a framework for analysing signals based on isolating variations that occur on different temporal or spatial scales. The basic analysis involves approximating the signal at successively coarser scales through repeated application of a smoothing (convolution) operator. A necessary and sufficient condition for a wavelet transformation to be invertible is that $w(t)$ satisfy the admissibility condition
$$\int|W(\omega)|^2\,|\omega|^{-1}\,d\omega = C_w < \infty$$
where $W$ is the wavelet's Fourier transform, i.e.
$$W(\omega) = \int w_L(t)\exp(-i\omega t)\,dt.$$
For any admissible $w(t)$, the wavelet transform has an inverse given by [3]
$$f(t) = \hat{W}^{-1}[F_L(\tau)] = \frac{1}{C_w}\iint F_L(\tau)w_L(t,\tau)L^{-2}\,dL\,d\tau.$$
There are a wide variety of wavelets available [i.e. functional forms for $w_L(t)$] which are useful for processing digital signals in 'wavelet space' when applied in discrete form. The properties of the wavelets vary from one application to another but in each case, the digital signal $f_i$ is decomposed into a matrix (a set of vectors) $F_{ij}$ where $j$ is the 'level' of the decomposition.

The wavelet transform can be used to generate a suitable code by computing the energies of the wavelet transformation over $N$ levels. Thus, the signal $f(t)$ is decomposed into wavelet space to yield the following set of functions:
$$F_{L_1}(\tau),\ F_{L_2}(\tau),\ \dots,\ F_{L_N}(\tau)$$
The (percentage) energies of these functions are then computed, i.e.
$$E_1 = \frac{100}{E}\int|F_{L_1}(\tau)|^2\,d\tau$$
$$E_2 = \frac{100}{E}\int|F_{L_2}(\tau)|^2\,d\tau$$
$$\vdots$$
$$E_N = \frac{100}{E}\int|F_{L_N}(\tau)|^2\,d\tau$$
where
$$E = \sum_{i=1}^{N}E_i.$$
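The percentage energies over the levels of a discrete wavelet decomposition can be sketched as follows. The Haar wavelet is used here purely as a stand-in because it can be written in a few lines; it is an assumption of this example and not the wavelet used in the paper, and the function name `haar_level_energies` is likewise introduced here. The signal length is assumed to be divisible by $2^{\text{levels}}$.

```python
import math


def haar_level_energies(signal, levels):
    """Percentage energies of the detail coefficients at each level,
    followed by that of the final approximation coefficients."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2.0) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies)   # equals the signal energy for this orthonormal transform
    return [100.0 * e / total for e in energies]
```

The function returns `levels + 1` values that sum to 100: one per detail level plus one for the remaining approximation.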

The method of computing the binary sequence for chirp coding from these energy values follows that described in the method of power spectral segmentation given in the previous section.

V. MATLAB APPLICATION PROGRAMS

Two MATLAB programs have been developed to implement the watermarking method discussed in this paper. The coding program reads in a named file, applies the watermark to the data using wavelet decomposition and writes out a new file using the same file format. The decoding program reads a named file (assumed to contain the watermark or otherwise), recovers the code from the watermarked data and then recovers the (same or otherwise) code from the watermark. The coding program displays the decimal integer and binary codes for analysis. The decoding program displays the decimal integer streams generated by the wavelet analysis of the input signal and the stream obtained by processing the signal to extract the watermark code or otherwise. This program also provides an error measure based on the result

$$e = \frac{\sum_i |x_i - y_i|}{\sum_i |x_i + y_i|}$$

where $x_i$ and $y_i$ are the decimal integer arrays obtained from the input signal and the watermark (or otherwise). In the application considered here, the watermarking method has been applied to audio (.wav) files in order to test the method on data which requires that the watermark does not affect the fidelity of the output (i.e. audio quality). Only a specified segment of the data is extracted for watermarking. The segment can be user defined and, if required, form the basis for a (private) key system. In this application, the watermarked segment has been ‘hard-wired’ and represents a public key.

A. Coding process

The coding process comprises the following basic steps:
1) Read a .wav file.
2) Extract a section of a single vector of the data (note that a .wav file contains stereo data, i.e. two vector arrays).
3) Apply wavelet decomposition using Daubechies wavelets with 7 levels. Note that in addition to wavelet decomposition, the approximation coefficients for the input signal are computed to provide a measure of the global effect of introducing the watermark into the signal. Thus, 8 decomposition vectors in total are generated.
4) Compute the (percentage) ‘energy values’.
5) Round to the nearest integer and convert to binary form.
6) Concatenate both the decimal and binary integer arrays.
7) Chirp code the binary sequence.
8) Scale the output and add to the original input signal.
9) Re-scale the watermarked signal.
10) Write to a file.

B. Decoding process

The decoding process is as follows:
1) Steps 1-6 of the coding process are repeated.
2) Correlate the data with a chirp identical to that used for chirp coding.
3) Extract the binary sequence.
4) Convert from binary to decimal.
5) Display the original and reconstructed decimal sequences.
6) Display the error.

Note that in a practical application of this method for authenticating audio files, for example, a threshold can be applied to the error value. If and only if the error lies below this threshold is the data taken to be authentic.

The prototype MATLAB programs for implementing this scheme are given in Appendix II (Coding) and Appendix III (Decoding). They have been developed to explore the applications of the method for different audio (.wav) signals but can be tailored for different signals and file formats. Note that in the decoding program, the correlation process is carried out using a spatial cross-correlation scheme (using the MATLAB function xcorr), i.e. the watermark is recovered by correlating chirp(t) with s(t) directly instead of using the Fourier equivalent $\mathrm{CHIRP}^*(\omega)S(\omega)$, where CHIRP and S are the Fourier transforms of chirp and s respectively. This is because the ‘length’ of the chirp function is significantly less than that of the signal. Application of a spatial correlator therefore provides greater computational efficiency.
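The chirp coding and decoding steps listed above, together with the error measure $e$, can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype of the appendices: the host is a synthetic two-tone signal, the sampling rate, chirp length and scaling factor are arbitrary, and the function names (`log_chirp`, `chirp_code`, `chirp_decode`, `error_measure`) are hypothetical.

```python
import math

def log_chirp(n, fs, f0, f1):
    """Logarithmic sweep from f0 to f1 Hz over n samples at sampling rate fs."""
    T = n / fs
    ratio = f1 / f0
    k = 2.0 * math.pi * f0 * T / math.log(ratio)
    return [math.cos(k * (ratio ** ((i / fs) / T) - 1.0)) for i in range(n)]

def chirp_code(bits, chirp, scale):
    """+chirp for a 1 and -chirp for a 0, each divided by the scaling factor."""
    stream = []
    for b in bits:
        sign = 1.0 if b == "1" else -1.0
        stream.extend(sign * c / scale for c in chirp)
    return stream

def chirp_decode(signal, chirp, n_bits):
    """Correlate each segment with the chirp at zero lag and keep the sign."""
    n = len(chirp)
    out = []
    for j in range(n_bits):
        segment = signal[j * n:(j + 1) * n]
        corr = sum(s * c for s, c in zip(segment, chirp))
        out.append("1" if corr >= 0 else "0")
    return "".join(out)

def error_measure(x, y):
    """e = sum|x_i - y_i| / sum|x_i + y_i| for two decimal integer arrays."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(abs(a + b) for a, b in zip(x, y))

fs = 8000.0                                # assumed sampling rate (Hz)
bits = "1011001110001011"                  # arbitrary example code
chirp = log_chirp(4000, fs, 1.0, 100.0)    # 1-100 Hz band, as in the text
mark = chirp_code(bits, chirp, scale=10.0)
# Stand-in host signal: two audible tones well above the chirp band.
host = [math.sin(2 * math.pi * 440 * i / fs) + 0.5 * math.sin(2 * math.pi * 1000 * i / fs)
        for i in range(len(mark))]
watermarked = [h + m for h, m in zip(host, mark)]
recovered = chirp_decode(watermarked, chirp, len(bits))
print(recovered == bits)
```

The host tones lie far outside the 1-100 Hz band of the chirp, so their zero-lag correlation with the chirp is small compared with that of the embedded chirps; this is the property the decoding process relies on.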

VI. DISCUSSION

The method of digital watermarking discussed here makes specific use of the chirp function. This function is unique in terms of its properties for reconstructing information (via application of the Matched Filter). The watermark f extracted from the host signal n is, in theory, an exact band-limited version of the original watermark. The approach considered in this paper allows a code to be generated directly from the host signal and that same code used to watermark the signal. The code is therefore self-generating and its reconstruction only requires a correlation process with the watermarked signal to be undertaken. This means that the signal can be authenticated without reference to a known data base. In other words, the method can be seen as a way of authenticating data by extracting a code (the watermark) within a ‘code’ (the host signal) and is consistent with approaches that attempt to reconstruct information without knowledge of the host data [21].

Audio data watermarking schemes rely on the imperfections of the human auditory system. They exploit the fact that the human auditory system is insensitive to small amplitude changes, either in the time or frequency domains, as well as to the insertion of low amplitude time domain echoes. Spread spectrum techniques augment the signal with a low amplitude spreading sequence, which can be detected via correlation techniques. Usually, embedding is performed in high amplitude portions of the signal, either in the time or frequency domains. A common pitfall for both types of watermarking systems is their intolerance to detector de-synchronization and the lack of adequate methods to address this problem during the decoding process.

Although other applications are possible, chirp coding provides a novel technique for fragile audio watermarking. In this case, the watermark does not change the perceptual quality of the signal. In order to make the watermark inaudible, the chirp generated is of very low frequency and amplitude. Using audio files with sampling frequencies of over 1000Hz, a logarithmic chirp can be generated in the frequency band of 1-100Hz. Since the human ear has low sensitivity in this band, the embedded watermark will not be perceptible. Depending upon the band and amplitude of the chirp, the signal-to-watermark (chirp stream) ratio can be in excess of 40dB.

Fig. 2. Original signal (above) and chirp based watermarked signal (below).

Figure 2 is an example of an original and a watermarked audio signal which show no perceptual difference during a listening test. Various forms of attack can be applied which change the distribution of the percentage sub-band energies originally present in the signal, including filtering (both low pass and high pass), cropping and lossy compression (MP3 compression) with both constant and variable bit rates. In each case, the signal and/or the watermark is distorted enough to register the fact that the data has been tampered with. An example of this is given in Figure 3 which shows the power spectral density of an original, a watermarked and a (band-pass filtered) tampered audio signal. The filtering is such that there is negligible change in the power spectral density. However, the tampering was easily detected by the proposed technique. Finally, chirp coded watermarks are difficult to remove from the host signal since the initial and final frequencies are at the discretion of the user and the position of the watermark in the data stream can be varied through application of an offset, all such parameters being combined to form a private key.



Fig. 3. Difference in the power spectral density of the original, watermarked and tampered signal. The tampering has been undertaken using a band pass filter with a normalised lower cut-off frequency of 0.01 and higher cut-off frequency 0.99.

Chirp coding is generic in the sense that it can be used to watermark any (user defined) bit stream in a signal. For watermarking with plaintexts, the bit stream can be generated using a standard ASCII (7-bit) code. Thus, the use of this method for self-authenticating signals, as discussed in this paper, is just one approach, albeit a useful one. However, in terms of sending and receiving data through some communications channel, the most important feature of chirp coding is the facility it provides for transmitting information through environments with significant amounts of noise, recovery of this information being based on knowledge of the exact chirp function used to ‘chirp code’. The radio frequency spectrum of the universe is relatively quiet when compared to other parts of the electromagnetic spectrum such as the microwave spectrum. Nevertheless, radio wave emissions will acquire a significant amount of noise if transmitted over distances of many light years. Chirp coding may provide a way of preserving such information when it is known that the final SNR is likely to be very small. In the search for extraterrestrial intelligence, the radio spectrum is considered to be the most likely frequency range in which ‘intelligent signals’ might exist. In light of the above, it may be of value to analyse such radio signals by correlating them with a range of different chirp functions, focusing on those outputs (i.e. the correlation functions) that provide some minimum Entropy measure.

APPENDIX I
DERIVATION OF THE MATCHED FILTER

Given equation (2), the matched filter is essentially a ‘by-product’ of the ‘Schwarz inequality’, i.e. the result

$$\left|\int Q(\omega)P(\omega)\,d\omega\right|^2 \leq \int |Q(\omega)|^2\,d\omega \int |P(\omega)|^2\,d\omega.$$

The principal trick is to write

$$Q(\omega)P(\omega) = |N(\omega)|\,Q(\omega) \times \frac{P(\omega)}{|N(\omega)|}$$

so that the above inequality becomes

$$\left|\int Q(\omega)P(\omega)\,d\omega\right|^2 = \left|\int |N(\omega)|\,Q(\omega)\,\frac{P(\omega)}{|N(\omega)|}\,d\omega\right|^2 \leq \int |N(\omega)|^2\,|Q(\omega)|^2\,d\omega \int \frac{|P(\omega)|^2}{|N(\omega)|^2}\,d\omega.$$

From this result, using the definition of $r$ given in equation (2), we see that

$$r \leq \int \frac{|P(\omega)|^2}{|N(\omega)|^2}\,d\omega.$$

Now, if $r$ is to be a maximum, then we require that

$$r = \int \frac{|P(\omega)|^2}{|N(\omega)|^2}\,d\omega$$

or

$$\left|\int |N(\omega)|\,Q(\omega)\,\frac{P(\omega)}{|N(\omega)|}\,d\omega\right|^2 = \int |N(\omega)|^2\,|Q(\omega)|^2\,d\omega \int \frac{|P(\omega)|^2}{|N(\omega)|^2}\,d\omega.$$

But this is only true if

$$|N(\omega)|\,Q(\omega) = \frac{P^*(\omega)}{|N(\omega)|}.$$

Hence, $r$ is a maximum when

$$Q(\omega) = \frac{P^*(\omega)}{|N(\omega)|^2}.$$

Noise is usually characterised by: (i) the Probability Density Function (PDF) or the Characteristic Function (i.e. the Fourier transform of the PDF); (ii) the Power Spectral Density Function (PSDF). To apply the Matched Filter, the function $|N(\omega)|^2$ (i.e. the power spectrum of the noise), in addition to $P(\omega)$, is required to be known a priori. In some practical systems, this is possible if the Impulse Response Function is zero so that the output of the system is ‘noise driven’. In general however, it is often necessary to develop a suitable model for the PSDF. Such models may include uniform, Gaussian, Poisson or random fractal noise, for example, which may be suitable in many cases [3]. However, if we consider the case when the PSDF is uniform or ‘white’ and of unit amplitude then we can write

$$|N(\omega)|^2 = 1 \quad \forall\,\omega$$

so that the Matched Filter reduces to the simple result

$$Q(\omega) = P^*(\omega).$$
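The bound derived above can be checked numerically on a discrete frequency grid. The sketch below assumes that the ratio $r$ of equation (2), which is not reproduced in this part of the paper, has the form $|\sum QP|^2 / \sum |N|^2|Q|^2$ implied by the derivation; the spectra are random stand-ins and the names `snr`, `Q_matched` and `Q_other` are hypothetical.

```python
import random

def snr(Q, P, N2):
    """Assumed discrete form of r: |sum(Q*P)|^2 / sum(|N|^2 * |Q|^2)."""
    num = abs(sum(q * p for q, p in zip(Q, P))) ** 2
    den = sum(n2 * abs(q) ** 2 for q, n2 in zip(Q, N2))
    return num / den

random.seed(1)
M = 64
# Random complex 'signal' spectrum P and a positive noise power spectrum |N|^2.
P = [complex(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(M)]
N2 = [0.5 + random.random() for _ in range(M)]

# Upper bound from the Schwarz inequality: sum(|P|^2 / |N|^2).
bound = sum(abs(p) ** 2 / n2 for p, n2 in zip(P, N2))

# Matched filter Q = P* / |N|^2 attains the bound.
Q_matched = [p.conjugate() / n2 for p, n2 in zip(P, N2)]
r_matched = snr(Q_matched, P, N2)

# Ignoring the noise spectrum (Q = P*) cannot do better when the noise is not white.
Q_other = [p.conjugate() for p in P]
r_other = snr(Q_other, P, N2)

print(bound, r_matched, r_other)
```

When `N2` is constant the two filters coincide up to a scale factor, which is the white-noise case $Q(\omega) = P^*(\omega)$ stated above.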



APPENDIX II
PROTOTYPE MATLAB CODING ALGORITHM

% Read a (.wav) audio file
[au2,fs,nbit]=wavread('file');
% Clear the screen
clc
% Compute the size of the data
size(au2);
% Extract a single set of data composed of
% 1500150 (arbitrary) elements
au1=au2(1:1500150,1);
% Set the watermarking scaling factor
% (user defined)
div_fac=270;
% Extract data segment from 300031 to
% 1500150 (arbitrary) and compute the
% maximum value
au=au1(300031:1500150,1);
au_max1=max(au1(300031:1500150,1));
% Apply wavelet decomposition using
% Daubechies wavelets with 7 levels
[ca cl]=wavedec(au(:,1),7,'db4');
% Compute the approximation
% coefficients at level 7
appco=appcoef(ca,cl,'db4',7);
% Extract the 'detail coefficients'
% at each level
detco7=detcoef(ca,cl,7);
detco6=detcoef(ca,cl,6);
detco5=detcoef(ca,cl,5);
detco4=detcoef(ca,cl,4);
detco3=detcoef(ca,cl,3);
detco2=detcoef(ca,cl,2);
detco1=detcoef(ca,cl,1);
% Compute the energy for each set
% of coefficients
ene_appco=sum(appco.^2);
ene_detco7=sum(detco7.^2);
ene_detco6=sum(detco6.^2);
ene_detco5=sum(detco5.^2);
ene_detco4=sum(detco4.^2);
ene_detco3=sum(detco3.^2);
ene_detco2=sum(detco2.^2);
ene_detco1=sum(detco1.^2);
% Compute the total energy of all
% the detail coefficients
tot_ene=round(ene_detco7+ene_detco6+ene_detco5+...
ene_detco4+ene_detco3+ene_detco2+ene_detco1);
% Round towards the nearest
% integer the percentage energy of
% each set
pene_hp7=round(ene_detco7*100/tot_ene);
pene_hp6=round(ene_detco6*100/tot_ene);
pene_hp5=round(ene_detco5*100/tot_ene);
pene_hp4=round(ene_detco4*100/tot_ene);
pene_hp3=round(ene_detco3*100/tot_ene);
pene_hp2=round(ene_detco2*100/tot_ene);
pene_hp1=round(ene_detco1*100/tot_ene);
% Do decimal integer to binary conversion
% with at least 17 bits
% (31 bits for the total energy)
tot_ene_bin=dec2bin(tot_ene,31);
f7=dec2bin(pene_hp7,17);
f6=dec2bin(pene_hp6,17);
f5=dec2bin(pene_hp5,17);
f4=dec2bin(pene_hp4,17);
f3=dec2bin(pene_hp3,17);
f2=dec2bin(pene_hp2,17);
f1=dec2bin(pene_hp1,17);
% Concatenate the arrays f1,f2,...
% along dimension 2 to produce a binary
% sequence (watermark code) of
% 31+7*17=150 bits
wmark=cat(2,tot_ene_bin,f7,f6,f5,f4,...
f3,f2,f1);
% Concatenate decimal integer array
per_ce=cat(2,tot_ene,pene_hp7,...
pene_hp6,pene_hp5,pene_hp4,...
pene_hp3,pene_hp2,pene_hp1);
% Write out decimal integer and binary
% codes for analysis
d_string=per_ce
b_string=wmark
% Assign -1 to 0 and +1 to 1
for j=1:150
if str2num(wmark(j))==0
x(j)=-1;
else
x(j)=1;
end
end
% Initialise and compute the chirp
% function using a log sweep
% (t contains 10001 samples)
t=0:1/44100:10000/44100;
y=chirp(t,00,10000/44100,100,'log');
% Compute +chirp for 1 and -chirp for 0,
% scale by div_fac and concatenate.



znew=0;
for j=1:150
z=x(j)*y/div_fac;
znew=cat(2,znew,z);
end
% Remove the initial zero from znew and
% add the watermark to the signal
znew=znew(2:length(znew));
wmark_sig=znew'+au1;
% Compute power of watermark and
% power of signal
w_mark_pow=(sum(znew.^2));
sig_pow=(sum(au1.^2));
% Rescale watermarked signal
wmark_sig1=wmark_sig*au_max1/max(wmark_sig);
% Concatenate with the second channel
% and write to file
wmark_sig=cat(2,wmark_sig1,au2(1:1500150,2));
wavwrite(wmark_sig,fs,nbit,'file');

APPENDIX III
PROTOTYPE MATLAB DECODING ALGORITHM

% Clear variables and functions
% from memory
clear
% Read watermarked file and
% clear screen
[au,fs,nbit]=wavread('file');
clc
% Extract data
au1=au(300031:1500150,1);
% Do wavelet decomposition
[ca cl]=wavedec(au1,7,'db4');
% Extract wavelet coefficients
appco=appcoef(ca,cl,'db4',7);
detco7=detcoef(ca,cl,7);
detco6=detcoef(ca,cl,6);
detco5=detcoef(ca,cl,5);
detco4=detcoef(ca,cl,4);
detco3=detcoef(ca,cl,3);
detco2=detcoef(ca,cl,2);
detco1=detcoef(ca,cl,1);
% Compute energy of
% wavelet coefficients
ene_appco=sum(appco.^2);
ene_detco7=sum(detco7.^2);
ene_detco6=sum(detco6.^2);

ene_detco5=sum(detco5.^2);
ene_detco4=sum(detco4.^2);
ene_detco3=sum(detco3.^2);
ene_detco2=sum(detco2.^2);
ene_detco1=sum(detco1.^2);
% Compute total energy factor
tot_ene=round(ene_detco7...
+ene_detco6+ene_detco5...
+ene_detco4+ene_detco3...
+ene_detco2+ene_detco1);
% Express energy values as a
% percentage of the total energy
% and round to nearest integer
pene_hp7=round(ene_detco7*100/tot_ene);
pene_hp6=round(ene_detco6*100/tot_ene);
pene_hp5=round(ene_detco5*100/tot_ene);
pene_hp4=round(ene_detco4*100/tot_ene);
pene_hp3=round(ene_detco3*100/tot_ene);
pene_hp2=round(ene_detco2*100/tot_ene);
pene_hp1=round(ene_detco1*100/tot_ene);
per_ene=cat(2,tot_ene,pene_hp7,...
pene_hp6,pene_hp5,pene_hp4,...
pene_hp3,pene_hp2,pene_hp1);
% Output original decimal integer code
% obtained from signal via wavelet
% decomposition
original_d_string=per_ene;
original_d_string
orig=original_d_string;
% Compute chirp function
% (t contains 10001 samples)
t=0:1/44100:10000/44100;
y=chirp(t,00,10000/44100,100,'log');
% Correlate input signal with chirp
% and recover sign. Each coded bit
% occupies 10001 samples (the length
% of y), so the segments are stepped
% by 10001 to stay aligned with the
% coding program.
for i=1:150
yzcorr=xcorr(au(10001*(i-1)...
+1:10001*i,1),y,0);
r(i)=sign(yzcorr);
end
% Recover bit stream
for i=1:150
if r(i)==-1
recov(i)=0;
else
recov(i)=1;
end
end
% Convert from number to string
recov=(num2str(recov,-8));



% Convert from binary to decimal
% and concatenate
rec_ene_dist=cat(2,bin2dec(recov(1:31)),...
bin2dec(recov(32:48)),...
bin2dec(recov(49:65)),...
bin2dec(recov(66:82)),...
bin2dec(recov(83:99)),...
bin2dec(recov(100:116)),...
bin2dec(recov(117:133)),...
bin2dec(recov(134:150)));
% Write out reconstructed decimal
% integer stream recovered from
% watermark
reconstructed_d_string=rec_ene_dist;
reconstructed_d_string
rec=reconstructed_d_string;
% Write out error between
% reconstructed and original
% watermark (decimal integer) codes.
error=sum(abs(rec-orig))/sum(abs(rec+orig))

ACKNOWLEDGMENT

The author is grateful for the advice and help of Dr S Datta and Dr O Farooq.

REFERENCES

[1] I. J. Cox, M. L. Miller and J. A. Bloom, Digital Watermarking, Morgan Kaufmann Publishers, Academic Press, 2002.
[2] J. S. Lim, Two-Dimensional Signal and Image Processing, Prentice-Hall, 1990.
[3] J. M. Blackledge, Digital Signal Processing, 2nd Edition, Horwood Publishing, 2006.
[4] J. M. Blackledge, Digital Image Processing, Horwood Publishing, 2005.
[5] S. Katzenbeisser and F. A. P. Petitcolas, Information Hiding Techniques for Steganography and Digital Watermarking, Artech, 2000.
[6] B. Buck and V. A. Macaulay (Eds.), Maximum Entropy in Action, Oxford Science Publications, 1991.
[7] R. J. Anderson and F. A. P. Petitcolas, On the Limits of Steganography, IEEE Journal on Selected Areas in Communications (Special issue on Copyright and Privacy Protection), 16(4), 474-481, 1998.
[8] F. A. P. Petitcolas, R. J. Anderson and M. G. Kuhn, Information Hiding - A Survey, Proc. IEEE, 87(7), 1062-1078, 1999.
[9] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, 1970.
[10] A. Papoulis, Signal Analysis, McGraw-Hill, 1977.
[11] A. Bateman and W. Yates, Digital Signal Processing Design, Pitman, 1988.
[12] A. W. Rihaczek, Principles of High Resolution Radar, McGraw-Hill, 1969.
[13] R. L. Mitchell, Radar Signal Simulation, Mark Resources Incorporated, 1985.
[14] J. J. Kovaly, Synthetic Aperture Radar, Artech, 1976.
[15] M. Darnell (Ed.), Cryptography and Coding, Lecture Notes in Computer Science (1355), Springer, 1997.
[16] D. Kundur and D. Hatzinakos, Digital Watermarking using Multiresolution Wavelet Decomposition, Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP ’98), IEEE, 2969-2972, 1998.
[17] A. Lumini and D. Maio, A Wavelet-based Image Watermarking Scheme, Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC ’00), IEEE, 122-127, 2000.
[18] D. Kundur and D. Hatzinakos, A Robust Digital Image Watermarking Method using Wavelet based Fusion, Proceedings of the International Conference on Image Processing (ICIP ’97), IEEE, 544-547, 1997.
[19] H. Tassignon, Wavelets in Image Processing, Image Processing II: Mathematical Methods, Algorithms and Applications (Eds. J M Blackledge and M J Turner), Horwood Publishing, 2000.
[20] M. V. Klein and T. E. Furtak, Optics, Wiley, 1986.
[21] J. J. Chae and B. Manjunath, A Technique for Image Data Hiding and Reconstruction without a Host Image, Security and Watermarking of Multimedia Contents I, (Eds. P W Wong and E J Delp), SPIE 3657, 386-396, 1999.



ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Jonathan M. Blackledge: Modelling and Computer Simulation of Radar Screening using Plasma Clouds

Modelling and Computer Simulation of Radar Screening using Plasma Clouds Jonathan M Blackledge, Fellow, IET, Fellow, IoP

Abstract: Following a brief introduction on the principles of screening an aerospace vehicle using a plasma, we develop models for the Impulse Response Functions (IRFs) associated with microwave (Radar) back-scattering from a strongly and a weakly ionized plasma screen. In the latter case, it is shown that the strength of the return signal is determined by an IRF that is characterised by the simple negative exponential $\exp(-\sigma_0 t/\epsilon_0)$ where $\sigma_0$ is the average conductivity of the plasma, $\epsilon_0$ is the permittivity of free space and $t$ is the two-way travel time. For a weakly ionized plasma, the conductivity is determined by the number density of electrons. We develop a model for an electron beam induced plasma that includes the effect of cascade ionization and losses due to diffusion and recombination. Qualitative results are then derived for the number density of a plasma screen over a sub-sonic aerospace vehicle and a numerical simulation considered that is based on an iterative approach using a Green’s function solution for a stationary and a moving vehicle. An example is provided for an idealised case relating to a subsonic missile such as a ‘cruise missile’.

Index Terms: Stealth Technology, Microwave Scattering, Radar, Weak Plasmas, Plasma Density Simulation

Manuscript received June 1, 2007. This work was supported by Matra BAE Dynamics, Bristol, England. Jonathan Blackledge is Professor of Information and Communications Technology, Applied Signal Processing Research Group, Department of Electronic and Electrical Engineering, Loughborough University, England and Professor of Computer Science, Department of Computer Science, University of the Western Cape, Cape Town, Republic of South Africa (e-mail: [email protected]).

I. INTRODUCTION

Since its original development in the late 1930s by Britain and Germany, Radio Detection and Ranging or Radar has been used for many years to detect airborne objects using ground and/or airborne platforms. The use of stealth technology for suppressing the detection of aerospace vehicles by Radar has been the subject of intensive research since the early 1970s following the development of radar guided surface-to-air missiles in the 1960s. One of the most notable current examples of the results of this research is the Lockheed-Martin F-117 stealth fighter and later the stealth bomber, first tested successfully under combat conditions in the Gulf war of 1991. Based on ideas first introduced by Denys Overholser in 1974 at Lockheed’s advanced engineering laboratories, the technology is based on two principal aspects: (i) design features; (ii) radar absorbing materials and coatings. The geometry of the design is based on trying to minimize those features of an aerospace vehicle that are responsible for reflecting microwave radiation in such a way that the result can fly. Obvious features include embedding the gas turbine engines deep into the structure of the aircraft and introducing facets - diamond shaped flat surfaces - that reflect the microwave radiation away from the source. However, one of the principal factors for reducing the Radar Cross Section (RCS) is to minimize the profile of the aircraft while maximizing the ‘smoothness’ of the design. This effect was first noticed when a prototype ‘flying wing’ was developed in Germany by two Luftwaffe officers - the Horten brothers - and first tested in late 1944. This unique design was many years ahead of its time and was investigated further in the 1950s by the USA (the Northrop flying wing). However, limitations in the control systems technology available at that time meant that the design was not practically viable because its aerodynamic performance was intrinsically unstable. The flying wing design only became of practical significance after the development of digital control processing (primarily in the 1970s), leading to the realisation of ‘fly by wire’.

The problem of designing stealthy aerospace vehicles can be posed as follows: given that the aircraft can be assumed to be a Born scatterer and that [2], [3]

$$(\nabla^2 + k^2)\mathbf{E}_s(\mathbf{r},\omega) = -k^2\gamma(\mathbf{r})\mathbf{E}_i(\mathbf{r},\omega) + ikz_0\sigma\mathbf{E}_i(\mathbf{r},\omega) - \nabla[\mathbf{E}_i(\mathbf{r},\omega)\cdot\nabla\ln\epsilon_r(\mathbf{r})] \qquad (1)$$

where $\gamma(\mathbf{r}) = \epsilon_r(\mathbf{r}) - 1$, find ‘flying functions’ $\gamma$ and $\sigma$ which are of compact support such that $\mathbf{E}_s = 0$. Here, $\mathbf{E}_s$ is the Fourier transform of the time-dependent scattered electric field vector $\mathbf{e}_s$ given by

$$\mathbf{E}_s(\mathbf{r},\omega) = \int_{-\infty}^{\infty}\mathbf{e}_s(\mathbf{r},t)\exp(i\omega t)\,dt,$$

$\mathbf{E}_i$ is the Fourier transform of the time-dependent incident electric field vector, i.e.

$$\mathbf{E}_i(\mathbf{r},\omega) = \int_{-\infty}^{\infty}\mathbf{e}_i(\mathbf{r},t)\exp(i\omega t)\,dt,$$

$\epsilon_r$ is the relative permittivity, $\sigma$ is the conductivity, $z_0$ is the impedance of free space, $\mathbf{r}$ is the three-dimensional spatial vector and $k = \omega/c_0$ is the wavenumber where $\omega$ is the angular frequency and $c_0$ is the speed of light (in a vacuum).

In addition to investigating the RCS for different designs and materials, there is another approach to producing stealthy flying objects using a plasma. The reduction of the RCS of an aerospace vehicle through the generation of a plasma is an effect that has been known about for many years. The phenomenon has an obvious connection with the ‘radio silence’ phenomenon that occurs during re-entry of a spacecraft. This



occurs when a plasma is formed around the spacecraft due to the ‘friction’ of the Earth’s atmosphere. A fundamental parameter of any plasma is the ‘plasma (angular) frequency’ $\omega_p$ given by

$$\omega_p = \left(\frac{4\pi n e^2}{m}\right)^{\frac{1}{2}}$$

where $e$ is the charge of an electron ($1.6 \times 10^{-19}$ C), $m$ is the mass of an electron ($0.91 \times 10^{-30}$ kg) and $n$ is the number density of electrons in m$^{-3}$. For a plane (transverse) electromagnetic wave incident on a plasma [4]

$$k = \frac{1}{c_0}\sqrt{\omega^2 - \omega_p^2}.$$

A cut-off occurs when $\omega = \omega_p$, i.e. when there is a critical number density

$$n_c = \frac{m\omega^2}{4\pi e^2}.$$

Radio waves can only propagate through a plasma when $\omega > \omega_p$. For a typical laboratory plasma with $n = 10^{12}$ cm$^{-3}$, a cut-off occurs when

$$f_p = \frac{\omega_p}{2\pi} \sim 10^4\sqrt{n} = 10\ \mathrm{GHz}$$

which is in the microwave range. This effect is used as a method of measuring the density of laboratory plasmas. The idea of screening an aerospace vehicle in a self-induced plasma with an appropriate critical number density is not a practical proposition. However, partial plasma screening of specific features which are good radar point-scatterers is possible, one example being the ‘point’ on the ‘nose-cone’ of a missile.

In this paper, we derive a model for radar signals generated by a conductor that is screened by a plasma. We develop an electromagnetic scattering model to investigate the effect that a plasma has on a conventional radar system. Expressions for the Impulse Response Function [5] generated by a scatterer with and without a plasma screen are studied. For a weakly ionized plasma, we derive a result that shows that the screening of the scatterer by the plasma is characterized by a simple negative exponential whose decay rate is determined by the conductivity which, in turn, is proportional to the electron number density. A model for the distribution of the electron number density is then considered.

II. MICROWAVE SCATTERING MODEL

Our aim is to develop a suitable model for the plasma screening effect by developing some relatively simple analytical results that explain why, under certain conditions, it provides a near-zero RCS. The basic reason for this effect is assumed to be due to the following: (i) a plasma is a (good) conductor and will therefore absorb (and disperse) electromagnetic (microwave) radiation before it is reflected by a scatterer; (ii) the air/plasma boundary is continuous (on the scale of the wavelength) and will therefore not generate a strong reflection compared with that generated by the surface of the scatterer which represents a sharp discontinuity on the scale of a wavelength (of a microwave field).
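As a numerical check on the cut-off frequency quoted in the Introduction for a laboratory plasma with $n = 10^{12}$ cm$^{-3}$, the plasma frequency can be evaluated directly. The sketch below is an illustration under one stated assumption: it uses the SI form $\omega_p = (ne^2/\epsilon_0 m)^{1/2}$ rather than the Gaussian-unit expression given in the text, with the electron charge and mass quoted there and the standard value of $\epsilon_0$. The function names are hypothetical.

```python
import math

E_CHARGE = 1.6e-19      # electron charge in C (value quoted in the text)
E_MASS = 0.91e-30       # electron mass in kg (value quoted in the text)
EPS0 = 8.854e-12        # permittivity of free space in F/m (standard value, assumed here)

def plasma_frequency_hz(n_per_m3):
    """Plasma frequency f_p = omega_p / (2*pi), SI form of omega_p (assumption)."""
    omega_p = math.sqrt(n_per_m3 * E_CHARGE ** 2 / (EPS0 * E_MASS))
    return omega_p / (2.0 * math.pi)

def critical_density_si(f_hz):
    """Critical electron number density (m^-3) at which f_hz equals the plasma frequency."""
    omega = 2.0 * math.pi * f_hz
    return EPS0 * E_MASS * omega ** 2 / E_CHARGE ** 2

n = 1.0e12 * 1.0e6                  # 10^12 cm^-3 expressed in m^-3
f_p = plasma_frequency_hz(n)
print(f_p / 1e9, "GHz")             # of the order of 10 GHz
print(critical_density_si(10e9))    # density needed for a 10 GHz cut-off
```

The result is close to 9 GHz, consistent with the order-of-magnitude estimate $f_p \sim 10^4\sqrt{n}$ Hz (with $n$ in cm$^{-3}$) used above.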

Let us model the problem using the scalar wave equation (under the Born approximation)

$$(\nabla^2 + k^2)E_s = -k^2\gamma(\mathbf{r})E_i + ikz_0\sigma(\mathbf{r})E_i, \quad \mathbf{r} \in V$$

where $V$ is the volume of the scatterer. In order to obtain this equation, we are required to ignore the cross-polarisation term $\nabla[\mathbf{E}_i\cdot\nabla\ln\epsilon_r]$ in equation (1). A general solution to this equation can now be obtained using the Green’s function method which, for homogeneous boundary conditions, gives

$$E_s = \int g\,(k^2\gamma - ikz_0\sigma)E_i\,d^3\mathbf{r}$$

where $g$ is the ‘out-going’ Green’s function given by [6]

$$g(\mathbf{r} \mid \mathbf{r}_0) = \frac{\exp(ik\,|\mathbf{r} - \mathbf{r}_0|)}{4\pi\,|\mathbf{r} - \mathbf{r}_0|}$$

and the integral is taken over the volume $V$ of the scatterer. Here, $\mathbf{r}$ and $\mathbf{r}_0$ are the spatial coordinates of the scatterer and the position at which the scattered field is measured, respectively. The characteristics of the back-scattered field are dependent on $\epsilon_r$, $\sigma$ and their geometry (i.e. the shape of the scatterer over volume $V$). Note that, if $\epsilon_r = 1$ and $\sigma = 0$, then the scattered field is zero.

Let us assume that the scatterer is a good conductor, and that $\epsilon_r = 1$ so that $\gamma = 0$. This assumption is consistent with the application of a scalar wave equation since $\nabla(\mathbf{E}_i\cdot\nabla\ln\epsilon_r) = 0$ with $\epsilon_r = 1$. The scattered field is now determined by the conductivity alone. Let us also assume that the incident field is described by the Green’s function $g$ instead of a plane wave (the more usual case). This assumption helps to simplify slightly the analysis required in generating a model for the back-scattered field. If the incident field propagates through a medium whose conductivity is effectively zero (i.e. air) then the solution for the back-scattered field will be given by

$$E_s = ikz_0\int \sigma g^2\,d^3\mathbf{r}.$$

The volume over which scattering is effective will be determined by the skin depth

$$\delta = \left(\frac{2}{kz_0\sigma}\right)^{\frac{1}{2}}$$

which, although very small for a good conductor, will be considered to be finite. This allows us to adopt a volume scattering approach instead of one based on surface scattering. The reason for this is that we can then consider the volume scattering effects introduced by a plasma screen. Note that the homogeneous boundary conditions used to produce this Green’s function solution yield a surface integral that is zero (i.e. $E_s$ and $\nabla E_s$ are considered to be zero on the surface of $V$).

The solution for Es in the far field (i.e. when r/r0 > z0 σ0 that has been applied to achieve this simplification reduces to σ0 > νee , νei ; νia >> νii , νie . A highly ionized plasma is described by the reverse of these conditions.
The conductivity of a weakly ionized plasma is given by [4]

$$\sigma = \frac{2ne^2}{m_e\nu_{ea}} + \frac{ne^2}{m_i\nu_{ia}}$$

where $m_e$ and $m_i$ are the masses of an electron and ion, respectively. This expression for the conductivity is dominated by the first term which describes the conductivity for the electron component of the plasma. The reason for this is that $m_i \gg m_e$. Clearly, in this case, the conductivity is proportional to the electron number density $n$ and the conductivity of a weakly ionized plasma can be approximated by

$$\sigma = \frac{ne^2}{m_e\nu_{ea}} \sim 10^{-9}\,\frac{n}{\nu_{ea}}$$

where $\nu_{ea}$ is the frequency of collisions between electrons and atoms. The ratio $n/\nu_{ea}$ will vary considerably from one regime (i.e. altitude and speed of flight) to another, although the values of $n$ and $\nu_{ea}$ may tend to off-set each other. Assuming that the plasma is generated by e-beam breakdown of the atmosphere, at ambient atmospheric pressures, $n$ will be large as will $\nu_{ea}$. At higher altitudes, $n$ will be less but so will $\nu_{ea}$. Finally, above the atmosphere there will be relatively few atoms to break down and the collision frequency will be relatively small. However, if, for example, hydrogen gas could be generated prior to ionization, then it would be possible to generate large electron densities with low collision frequencies leading to high and sustainable plasma conductivities and, therefore, more effective plasma screening systems.

Since the conductivity of the plasma screen is linearly proportional to the electron number density, a principal problem is to determine the number density distribution for a given configuration (of source and aerospace vehicle). Thus, we are required to obtain a model that predicts the generation and transport of electrons subject to a variety of processes such as ionization, recombination, diffusion, radiative losses, air flow, etc. This can be accomplished by considering the macroscopic properties of the plasma which are governed by the dynamics of the growth process, a process that involves avalanche electron multiplication (an exponential process), i.e. the ionization rate per initial electron. A limiting mechanism for the growth of the cascade is taken to be due to the (ambipolar) diffusion of electrons out of the volume of the e-beam. Away from the plasma source, the electron number density is taken to be determined primarily by the recombination rate, radiative losses or bremsstrahlung radiation and flow regime.
The ionization mechanism is taken to include inverse bremsstrahlung processes [4].

A. Ionization

The ionization of a neutral gas by an electron beam, for example, is determined by a cascade process that produces an exponential growth in the electron density. In the absence of diffusion processes, this electron density is determined by the equation

$$\frac{dn}{dt} = In$$

where $I$ is the ionization rate per initial electron and is assumed to be a constant. The solution is trivial, represents exponential growth and is given by

$$n = n_0\exp(It)$$

where $n_0$ is the initial electron density. Suppose that, for a given volume, we require the e-beam to produce $10^{13}$ electrons, say, and that this number should be produced from an initial value of 10 electrons that have been ionized by electrons from the e-beam alone; then

$$\ln\frac{n}{n_0} = \int I\,dt \sim 40.$$

In other words, the cascade process requires 40 generations to produce $10^{13}$ electrons from just 10 of them. This number is not strongly dependent on the assumed value of $n_0$ within reasonable bounds. The electron density becomes large only near the end of the cascade process; 99% of the ionization is produced from the last 7 generations. Therefore, quantities such as the growth and losses from the cascade and the time to breakdown are determined by the conditions at times when the electron density is small.

The ionization rate will be determined by two principal processes: (i) the ionization rate $I_b$ due to collisions of neutral atoms or molecules with electrons that have absorbed energy in the inverse bremsstrahlung process; (ii) the loss of potential ionizing electrons due to electron attachment with an ion, which we denote by a rate coefficient $I_a$. Thus, in general,

$$I = I_b - I_a.$$

The process of inverse bremsstrahlung involves raising a free electron to a higher energy state in the continuum of states available to it. The energy is a result of the absorption of a photon due to bremsstrahlung radiation, which is itself produced by the acceleration of charged particles involved in elastic collisions. This absorption must occur with a simultaneous interaction with a heavy particle (atom, molecule or ion) in order that momentum is conserved.

B. Diffusion

The diffusion of electrons in a plasma is determined by the diffusion equation

$$\frac{\partial n}{\partial t} = D\nabla^2 n$$

where $D$ is the (ambipolar) diffusion coefficient. In this equation, $n$ represents the electron density of the plasma. With regard to ionization, the term $In$ can be added to the diffusion equation to produce the inhomogeneous equation

$$\frac{\partial n}{\partial t} = D\nabla^2 n + In.$$

Note that, in general, $I$ and $D$ may be functions of both space and time. Another source term that is required is the multi-electron ionization rate due to the e-beam alone, which is responsible for the production of the initial electron density from which the cascade process develops. This ionization will also depend on both space and time and, in particular, on the distance of the beam away from the source. Thus, if we denote the e-beam ionization rate by $B$ (for beam), then the diffusion equation becomes

$$\frac{\partial n}{\partial t} = D\nabla^2 n + In + B.$$
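The generation counts quoted for the cascade can be checked with a few lines of arithmetic. The sketch below assumes that a generation means one doubling of the electron population, which the text does not state explicitly; the function name `cascade_summary` is illustrative only. Under that reading, growing 10 electrons into $10^{13}$ takes about 40 doublings and the last 7 doublings account for just over 99% of the final population, whereas the natural logarithm $\ln(n/n_0)$ itself is closer to 28, so the figure of 40 is perhaps best read as an order-of-magnitude estimate.

```python
import math


def cascade_summary(n_initial: float, n_final: float, last: int = 7):
    """Count doublings and e-foldings needed to grow n_initial into n_final."""
    growth = n_final / n_initial
    doublings = math.log2(growth)   # generations, ASSUMING each generation doubles n
    e_foldings = math.log(growth)   # ln(n/n0), i.e. the time integral of I
    # Share of the final population created during the last `last` doublings.
    late_fraction = 1.0 - 2.0 ** (-last)
    return doublings, e_foldings, late_fraction


doublings, e_foldings, late_fraction = cascade_summary(10.0, 1.0e13)
print(f"doublings needed          : {doublings:.1f}")
print(f"ln(n/n0)                  : {e_foldings:.1f}")
print(f"fraction from last 7 steps: {late_fraction:.4f}")
```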

C. Recombination

Electron-ion collisions may lead to recombination, i.e. the production of a neutral atom as a result of the capture of an electron by an ion. The efficiency of the processes responsible for recombination is considerable at low electron energies, at which the electron-ion interaction time is sufficiently large. Accordingly, at low electron temperatures (i.e. much less than the ionization energy) these processes strongly affect the balance of the charged plasma particles. The rate of charged particle removal due to recombination in a volume is determined by the total recombination cross section and depends on the number densities of both ions $n_i$ and electrons $n_e$. Thus, the rate equation is given by

$$\frac{\partial n}{\partial t} = -Rn_i n_e = -Rn^2$$

where $R$ is the recombination coefficient. The minus sign is introduced here because the process is lossy. This nonlinear equation has a simple analytical solution which can be obtained by inspection and is given by

$$\frac{1}{n} = \frac{1}{n_0} + Rt$$

where $n_0$ is the initial number density. After the density has fallen far below its initial value, it decays reciprocally with time, i.e.

$$n \propto \frac{1}{Rt}.$$

This is a fundamentally different behaviour from the exponential decay associated with diffusive processes and the exponential growth associated with ionization processes. Since the recombination rate is proportional to $n^2$, for high values of $n$ it can be expected to be the dominant process. With regard to the diffusion equation, $-Rn^2$ is a (negative) source term and, thus, the diffusion equation must be modified again, this time to the nonlinear inhomogeneous form

$$\frac{\partial n}{\partial t} = D\nabla^2 n + In + B - Rn^2.$$

Note that, in general, it is expected that, like $I$, $D$ and $B$, the recombination coefficient $R$ may be a function of both space and time. The rate equation above has two source terms and two loss terms. The source terms are $B$ and $In$, which describe the initial population density of electrons produced by the e-beam alone and the population density generated by the cascade process. The loss terms $D\nabla^2 n$ and $Rn^2$ describe losses due to the processes of diffusion and recombination, respectively.

Another effect that can be considered is loss through radiative processes. However, for weakly ionized plasmas, it is reasonable to assume that this effect is relatively small compared to diffusion and recombination. These losses will also be proportional to $n^2$ since the total power $P$ radiated per unit volume by a plasma is given by [4]

$$P \sim 1.5\times 10^{-38}\,Z^2 n_e n_i T_e^{1/2} \quad (\text{W/m}^3)$$

where $n$ is in m$^{-3}$ and $T_e$ is in eV. Because the radiated power is proportional to the square of the atomic number $Z$, a low $Z$ plasma (e.g. a hydrogen plasma) will last longer.

IV. RATE EQUATION ANALYSIS

Analytical solutions to the rate equation

$$\frac{\partial n}{\partial t} = D\nabla^2 n + In + B - Rn^2$$

can be considered for different conditions compounded in the inclusion, or otherwise, of different terms. In some practical cases, the diffusion loss will dominate over losses from recombination after initiation (when $B$ can be ignored), and we can consider the electron density to be determined by the solution of

$$\frac{\partial n}{\partial t} = D\nabla^2 n + In.$$

For the characteristic diffusion length $\Lambda$ of the breakdown, we may replace $\nabla^2$ by $-1/\Lambda^2$ to obtain a solution of the form

$$n = n_0\exp[(I - D/\Lambda^2)t].$$

This solution illustrates exponential growth of electrons, subject to exponential damping due to diffusion. Clearly, for a given coefficient of diffusion, the characteristic diffusion length should be large in order to achieve a high concentration of electrons. Under conditions where, along with diffusion, the quadratic recombination term substantially affects the plasma decay, the rate equation takes the form

$$\frac{\partial n}{\partial t} = D\nabla^2 n + In - Rn^2$$

or, in terms of the characteristic length of diffusion,

$$\frac{dn}{dt} = -\left(\frac{D}{\Lambda^2} - I\right)n - Rn^2.$$

The solution to this equation is [4]

$$n(t) = \frac{\left(\dfrac{D}{\Lambda^2} - I\right) n_0 \exp\left(It - \dfrac{D}{\Lambda^2}t\right)}{\left(\dfrac{D}{\Lambda^2} - I\right) + Rn_0\left[1 - \exp\left(It - \dfrac{D}{\Lambda^2}t\right)\right]}.$$

Note that, when $D/\Lambda^2 - I \gg Rn$, this solution changes into an exponential form that is characteristic of ionization growth and diffusion decay. Alternatively, when $Rn \gg D/\Lambda^2 - I$, the electron density is determined by the equation

$$\frac{1}{n} = \frac{1}{n_0} + Rt.$$

V. STEADY STATE SOLUTIONS

For steady state conditions

$$\frac{\partial n}{\partial t} = 0$$

and our rate equation reduces to

$$D\nabla^2 n + In + B - Rn^2 = 0.$$

Let us now consider some of the solutions available under different conditions.

A. Steady State Solution without Flow

If we consider the e-beam to produce ionization along the axis alone then the plasma source can be assumed to be axially symmetric. The electron density is then a function of the radius $r$ and, using cylindrical coordinates, we have

$$\nabla^2 n = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial n}{\partial r}\right).$$
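The closed-form solution of the rate equation with diffusion and quadratic recombination, quoted in Section IV, can be cross-checked by direct numerical integration. The following is a minimal sketch rather than part of the original analysis: the function names and the parameter values (chosen arbitrarily so that $D/\Lambda^2 > I$) are assumptions made for illustration, and a classical fourth-order Runge-Kutta scheme stands in for any particular solver.

```python
import math


def closed_form(t, n0, d_over_l2, ionization, recomb):
    """Closed-form n(t) for dn/dt = -(D/Lambda^2 - I) n - R n^2."""
    a = d_over_l2 - ionization
    decay = math.exp(-a * t)  # equals exp(I t - D t / Lambda^2)
    return a * n0 * decay / (a + recomb * n0 * (1.0 - decay))


def rk4(t_end, n0, d_over_l2, ionization, recomb, steps=2000):
    """Integrate the same rate equation with classical Runge-Kutta (RK4)."""
    a = d_over_l2 - ionization

    def rate(n):
        return -a * n - recomb * n * n

    h = t_end / steps
    n = n0
    for _ in range(steps):
        k1 = rate(n)
        k2 = rate(n + 0.5 * h * k1)
        k3 = rate(n + 0.5 * h * k2)
        k4 = rate(n + h * k3)
        n += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n


# Hypothetical, dimensionless illustration values (not taken from the paper).
params = dict(n0=10.0, d_over_l2=3.0, ionization=1.0, recomb=0.5)
analytic = closed_form(2.0, **params)
numeric = rk4(2.0, **params)
print(f"closed form: {analytic:.6f}   RK4: {numeric:.6f}")
```

Agreement between the two values suggests that the reconstructed expression does satisfy the differential equation for this parameter set; it does not, of course, validate the physical coefficients.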



The simplest solution available to us in this case is obtained under the assumption that $B$, $I$ and $R$ are all zero. The plasma is therefore assumed to be a cylindrical plasma with losses due to diffusion alone. Except at $r = 0$, the density must satisfy

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial n}{\partial r}\right) = 0$$

which has the solution

$$n(r) = -n_0\ln r + c.$$

With the boundary condition $n(a) = 0$ (i.e. the electron density is zero some distance away from the source) we have $c = n_0\ln a$ and therefore

$$n(r) = n_0\ln\frac{a}{r}$$

which is the fundamental solution of the 2D Laplace equation. Let us now consider the solution to the equation

$$D\nabla^2 n + In = 0$$

in cylindrical coordinates. This requires that we solve the equation

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial n}{\partial r}\right) = -\frac{In}{D}$$

or

$$\frac{d^2 n}{dr^2} + \frac{1}{r}\frac{dn}{dr} + \frac{I}{D}n = 0$$

which is Bessel's equation of order zero. This has the solution

$$n(r) = n_0 J_0\left(r\sqrt{\frac{I}{D}}\right)$$

where $J_0$ is the Bessel function of order zero. The boundary condition that must be applied is that $n = 0$ at $r = a$. The Bessel function is zero for multiple values of $x = r\sqrt{I/D}$. However, the first zero occurs when $x \simeq 2.4$ or when

$$r = a = 2.4\sqrt{\frac{D}{I}}.$$

This solution describes the lowest diffusion mode, in which $a$ can be taken to define the boundary between the plasma and air. Although it is possible for higher diffusion modes to occur, they tend to decay rapidly in most plasmas and may therefore be ignored. Note that the radial extent of the electron density is proportional to the square root of the coefficient of diffusion.

B. Steady State Equation with Flow

Suppose we consider the case when the plasma source is in a steady state condition (i.e. the e-beam is operating in the continuous mode) and that the radial distribution of the electron density is described by $J_0$. For the case when the plasma source is moving through the atmosphere, it will be expected that the plasma streams away from the source (downwind), producing a decay of the electron density due to: (i) air flow effects, e.g. boundary layer thickening; (ii) recombination. Let us assume that the plasma forms a boundary layer with thickness

$$\Delta \sim \frac{L}{\sqrt{Re}}$$

where $Re$ is the Reynolds number given by

$$Re = \frac{Lv}{\eta},$$

$L$ is the characteristic length scale of the flow, $v$ is the velocity of the flow and $\eta$ is the kinematic viscosity of air. For a 10 m long aerospace vehicle travelling at 100 m/s, say, and with $\eta \sim 10^{-5}$ m$^2$/s for air,

$$\Delta = 1\ \text{mm}.$$

For a 1 mm thick plasma screen of 1 siemens/metre and considering the two-way travel path, the absorption of microwave radiation with a wavelength of 1 cm (due to the skin depth effect) is 87%. Thus, relatively large absorption can occur over small boundary layers composed of low conductivity plasmas (i.e. plasmas with low electron number densities). As the plasma streams away from the source, the electron density will decrease due to an increase in the extent of the boundary layer (ignoring recombination). Since the initial radial extent of the plasma at source is given by $a$, we can expect the screen thickness to be of the order of $a + \Delta$. The decay of the electron density as a function of $r$ and $L$ can therefore be estimated by

$$n(r, L) = \frac{a}{a+\Delta}\,n_0 J_0\left(r\sqrt{\frac{I}{D}}\right) = \frac{n_0 J_0\left(r\sqrt{\dfrac{I}{D}}\right)}{1 + 0.4167\sqrt{\dfrac{L\eta I}{vD}}}.$$
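The numbers used in the boundary-layer estimate can be reproduced with a short calculation. The sketch below is illustrative only: it locates the first zero of $J_0$ by bisection on the power series, evaluates $\Delta = L/\sqrt{Re}$ taking $\eta \approx 10^{-5}$ m$^2$/s (the order of magnitude that reproduces $\Delta = 1$ mm for the quoted length and speed), and compares $a/(a+\Delta)$ with the quoted factor $1/(1 + 0.4167\sqrt{L\eta I/(vD)})$. The values of `diffusion` and `ionization` are hypothetical placeholders, since operating values of $D$ and $I$ are not given.

```python
import math


def bessel_j0(x: float, terms: int = 40) -> float:
    """Power series J0(x) = sum_m (-1)^m (x/2)^(2m) / (m!)^2 (fine for small x)."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total


def first_zero_j0(lo: float = 2.0, hi: float = 3.0) -> float:
    """Bisection for the first zero of J0, which lies between 2 and 3."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j0(lo) * bessel_j0(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


length, speed, viscosity = 10.0, 100.0, 1.0e-5  # L [m], v [m/s], eta [m^2/s]
diffusion, ionization = 1.0e-4, 1.0e2           # HYPOTHETICAL D [m^2/s] and I [1/s]

reynolds = length * speed / viscosity
delta = length / math.sqrt(reynolds)            # boundary layer thickness [m]
x0 = first_zero_j0()                            # approximately 2.4
a = x0 * math.sqrt(diffusion / ionization)      # radial extent of lowest mode [m]

ratio_direct = a / (a + delta)
ratio_quoted = 1.0 / (1.0 + 0.4167 * math.sqrt(
    length * viscosity * ionization / (speed * diffusion)))

print(f"first zero of J0 = {x0:.4f}")
print(f"Re = {reynolds:.3g}, delta = {delta * 1e3:.2f} mm")
print(f"a/(a+delta) = {ratio_direct:.4f}, quoted form = {ratio_quoted:.4f}")
```

The small difference between the two ratios comes from rounding the first zero of $J_0$ to 2.4, which is where the coefficient $1/2.4 \approx 0.4167$ originates.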

This steady state estimate neglects the effects of recombination but provides a qualitative estimate of the electron density profile produced by a continuous on-axis e-beam.

C. Numerical Simulation

The rate equation for the electron density is given by

$$\frac{\partial n}{\partial t} = D\nabla^2 n + B + In - Rn^2.$$

If the plasma is generated in a flow of air then, to a good approximation, we can consider the electrons to flow with the air and thus conform to the conservation equation

$$\frac{\partial n}{\partial t} = -\nabla\cdot(n\mathbf{v})$$

where $\mathbf{v}$ is the velocity of the flow. Hence, in the steady state, we are required to solve the equation

$$D\nabla^2 n + B + In - Rn^2 - \nabla\cdot(n\nabla u) = 0$$

where $u$ is the velocity potential, $\mathbf{v} = \nabla u$. Our problem is to find $n$ given $u$, which requires the velocity potential to be computed a priori. Suppose we compute the velocity potential for air (in the absence of a plasma). We can then consider a model in which the electron density is a characteristic of this potential. In other words, we consider the plasma to flow away from the source in a manner that is determined by the streamlines associated with the flow of air over the aerospace vehicle. For constant (air) density, the velocity potential is obtained by solving Laplace's equation

$$\nabla^2 u = 0$$



subject to appropriate boundary conditions. Noting that

$$\nabla u\cdot\nabla n = \nabla\cdot(u\nabla n) - u\nabla^2 n$$

we can write

$$(D + u)\nabla^2 n + B + In - Rn^2 - \nabla\cdot(u\nabla n) = 0.$$

This is the steady state equation for the electron density $n$ subject to a flow regime characterized by velocity potential $u$. The 3D Green's function solution to this equation is [6]

$$n = \frac{1}{4\pi r}\otimes_3\left[\frac{B}{u+D} + \frac{In}{u+D} - \frac{Rn^2}{u+D} - \nabla\cdot(u\nabla n)\right]$$

where $\otimes_3$ denotes the three-dimensional convolution integral. The order of iteration required to compute $n$ can follow the order in which the physical mechanisms described by each of the terms occur. Thus:

Electron generation

$$n_1 = \frac{1}{4\pi r}\otimes\frac{B}{u+D}$$

Ionization

$$n_2 = n_1 + \frac{1}{4\pi r}\otimes\frac{In_1}{u+D}$$

Recombination

$$n_3 = n_1 + n_2 - \frac{1}{4\pi r}\otimes\frac{Rn_2^2}{u+D}$$

Flow

$$n_4 = n_1 + n_2 - n_3 - \frac{1}{4\pi r}\otimes\nabla\cdot(u\nabla n_3)$$

Fig. 1. Plasma density profile generated by an electron beam without airflow (above) and with an airflow (below) from right to left over a 'smoothed cone'. The beam is taken to be of uniform intensity and emitted from the 'point' of the cone 'travelling' to the right.

Figure 1 shows the effect of a plasma (specifically, the electron number density $n_3$) generated without ($u = 0$) and with ($\nabla^2 u = 0$) an air flow (from right to left) over a cone with a smooth point. Here, we assume that the screen is axially symmetric and undertake the computations in the plane $(x, y, 0)$. This is achieved by implementing the equations above on a two-dimensional uniform grid of size 700×300, applying the convolution theorem and using the result (where $\Longleftrightarrow$ denotes transformation from real space to Fourier space)

$$\frac{1}{\sqrt{x^2 + y^2}} \Longleftrightarrow \frac{1}{\sqrt{k_x^2 + k_y^2}}$$

with the boundary condition $n = 0$ (applied over the boundary and over the extent of the cone) and where $k_x$, $k_y$ are the spatial frequency components in the $x$- and $y$-directions respectively. The e-beam is taken to be a 'pencil line beam' (one pixel wide) emitted from the point of the cone with uniform intensity along its extent. The coefficients $B$, $I$, $R$ and $D$ are assumed constant with values: $B = 4\pi$, $I = 4\pi$, $R = 4\pi$ and $D = 1$. The velocity potential $u$ is computed using the Successive Over-Relaxation method [9] compounded in the following result (where $\omega = 1.1$ is the relaxation parameter)

$$u^{k+1}_{i,j} = u^{k}_{i,j} + \frac{\omega}{4}\left(u^{k}_{i+1,j} + u^{k+1}_{i-1,j} + u^{k}_{i,j+1} + u^{k+1}_{i,j-1} - 4u^{k}_{i,j}\right)$$

for $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, M$ with conditions $u_{ij} = 0$ on the boundary and over the extent of the cone, $u_{1,j} = u_0\ \forall j$, $u_{N,j} = u_0\ \forall j$, $u_{i,M} = u_0\ \forall i$, $u_{i,1} = u_0\ \forall i \notin C$, where $C$ is the extent of the cone at the extreme left-hand edge of the grid (with $u_0 = 1$). The extent of the plasma screen that forms over the boundary of the cone to provide a radar screen is quite noticeable when air flow is present, an extent that is strongly determined by the magnitude of the recombination coefficient and air flow for a given beam energy and coefficient of ionization. Actual values for $R$ along with $I$, $D$ and the beam profile $B$ (which will not be uniform as in the idealized simulation presented here) and the flow rate will depend on the operating conditions that apply. These include the vehicle velocity, the plasma medium, additives (readily ionizable or reactive species), the electron beam energy, its diameter and profile, details that remain classified. However, typical parameters include an electron beam energy of 100 keV, a (Gaussian) beam diameter of less than 5 mm with a loss of 1 keV per cm for an aerospace vehicle travelling at up to 100 m s$^{-1}$ operating in a plasma medium of air (over a range of atmospheric pressures) and with additives such as water vapour. Applications include the plasma screening of incoming missiles, for example, against close proximity anti-missile systems that use radar for targeting and control.

VI. SUMMARY AND CONCLUSIONS

The idea of using a weakly ionized plasma to screen an aerospace vehicle is not new but interest in this effect and appraisal of the applications to which it can be practically applied are likely to grow. This paper has developed a model for the radar signal generated with and without a plasma screen and illustrates that, for a weakly ionized plasma, the effect of such a screen is compounded in the function $\exp(-\sigma_0 t/\epsilon_0)$ where $t$ is the two-way travel time, $\sigma_0$ is the average conductivity of the plasma and $\epsilon_0$ is the permittivity



of free space. For a weakly ionized plasma, the conductivity is determined by the number density of electrons and qualitative results have been developed to estimate the number density of a plasma screen enveloping a moving vehicle. A numerical procedure to simulate the number density of a plasma has been developed and an example provided for the case when an e-beam induced plasma is generated from the front of a (sub-sonic) missile. This simulation is based on assuming cascade ionization with loss mechanisms due to diffusion and recombination. The simulator is not suitable for the supersonic case, when the airflow cannot be determined by the solution to Laplace's equation for the velocity potential. In this case, it may be expected that the plasma is partially distributed along the shock wave that is formed and thus, depending on the exact configuration of the aerospace vehicle, could provide a more extensive plasma screen. This will be the subject of a future publication.

ACKNOWLEDGMENT

The author would like to thank Professor Michael Rycroft and Professor Roy Hoskins for reading the original manuscript and for the suggestions they made in its preparation.

REFERENCES

[1] M. Bertero and P. Boccacci, Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, 1998.
[2] J. R. Wait, Electromagnetic Wave Theory, Wiley, 1987.
[3] J. M. Blackledge, Quantitative Coherent Imaging, Academic Press, 1989.
[4] V. E. Golant, A. P. Zhilinsky and I. E. Sakharov, Fundamentals of Plasma Physics, Wiley series in plasma physics, 1977.
[5] J. M. Blackledge, Digital Signal Processing (Second Edition), Horwood Scientific Publishing, 2006.
[6] G. A. Evans, J. M. Blackledge and P. Yardley, Analytical Methods for Partial Differential Equations, Springer, 2000.
[7] A. W. Rihaczek, Principles of High Resolution Radar, McGraw-Hill, 1969.
[8] R. L. Mitchell, Radar Signal Simulation, MARK Resources, 1985.
[9] G. A. Evans, J. M. Blackledge and P. Yardley, Numerical Methods for Partial Differential Equations, Springer, 2000.
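As a supplementary note to Section V.C, the Successive Over-Relaxation update used for the velocity potential can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the simulator used for Fig. 1: the grid is small, the `fixed` mask and the function name `sor_laplace` are hypothetical, and the boundary values are chosen as a linear ramp (whose discrete Laplacian is exactly zero) so that the converged result can be checked against a known answer. The sweep order means that the neighbours at $(i-1, j)$ and $(i, j-1)$ already hold their updated values, as in the formula quoted in the text, with $\omega = 1.1$.

```python
def sor_laplace(u, fixed, omega=1.1, tol=1e-10, max_sweeps=10000):
    """Relax u in place towards a solution of the discrete Laplace equation.

    fixed[i][j] is True wherever u is prescribed (outer boundary, obstacle).
    Returns the number of sweeps performed.
    """
    rows, cols = len(u), len(u[0])
    for sweep in range(1, max_sweeps + 1):
        largest = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if fixed[i][j]:
                    continue
                # Five-point residual; (i-1, j) and (i, j-1) are already updated.
                residual = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1]
                            + u[i][j - 1] - 4.0 * u[i][j])
                change = 0.25 * omega * residual
                u[i][j] += change
                largest = max(largest, abs(change))
        if largest < tol:
            return sweep
    return max_sweeps


# Test problem (an assumption for illustration): u equals a linear ramp on the
# boundary, so the exact discrete solution is the same ramp in the interior.
rows, cols = 20, 15
u = [[0.0] * cols for _ in range(rows)]
fixed = [[False] * cols for _ in range(rows)]
for i in range(rows):
    for j in range(cols):
        if i in (0, rows - 1) or j in (0, cols - 1):
            fixed[i][j] = True
            u[i][j] = i / (rows - 1)

sweeps = sor_laplace(u, fixed)
max_error = max(abs(u[i][j] - i / (rows - 1))
                for i in range(rows) for j in range(cols))
print(f"sweeps = {sweeps}, max error = {max_error:.2e}")
```

Marking additional cells as `fixed` would be one way to represent an obstacle such as the cone, though how the original simulation handled this is not specified beyond the boundary conditions given.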



ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Julian Meng: Linear Prediction Filtering and Transform-Domain-Based Spread-Spectrum Receivers for Narrowband Interference Suppression

Linear Prediction Filtering and Transform-Domain-Based Spread-Spectrum Receivers for Narrowband Interference Suppression

Julian Meng, Member, IEEE

Abstract - Previous work has shown that the performance of a direct-sequence spread-spectrum (DS/SS) receiver can be improved by suppression of narrowband interference co-located in the frequency band. Transform-domain-based DS/SS receivers are of particular interest because of implementation simplicity and a closed form solution for an optimal detector when using time-weighting. However, the success of this technique requires a reasonable estimation of the frequency, power and spectral distribution of the narrowband interferer. Improper estimation of these quantities can lead to an unnecessary excision of spectral components resulting in performance degradation. Alternatively, a linear prediction filter can be used to suppress narrowband interference prior to detection and avoids the spectral estimation requirement of the transform-based receiver. Pre-processing using a linear prediction filter results in a basic DS/SS receiver that is optimal when considering a Gaussian noise case only. This paper presents a detailed analysis of this approach in comparison with the transform-based excision process. Both theoretical and Monte-Carlo simulation results are provided.

Index Terms - Digital Communications, Narrowband Interference Suppression, Spread Spectrum.

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). The author is with the University of New Brunswick, Fredericton, N.B., Canada, E3B 5A3 (phone: 506-458-7453; fax: 506-453-3589; e-mail: [email protected]).

I. INTRODUCTION

With direct sequence spread spectrum (DS/SS) modulation, the "spreading" of the spectral bandwidth of the transmitted signal occurs when the original data is multiplied by a pseudo-noise (PN) sequence whose chip rate is a multiple of the original symbol rate. The PN sequence can be de-spread at the receiver to enable the original user data to be recovered. Some well-known DS/SS advantages include resistance to multipath effects, low power output and secure communications. The DS/SS system can be extended for several users communicating over the same channel using orthogonal codes such as the Walsh code, i.e. code-division multiple-access (CDMA). These unique strengths have led to the development of numerous CDMA-based commercial wireless systems, including those used in mobile cellular and for the various IEEE 802.11-based wireless LANs. Given the crowded nature of today's RF spectrum, there is considerable

interest in co-locating narrowband users within the CDMA channel bandwidth. Although the DS/SS system has some inherent narrowband interference rejection capability, there has been considerable effort in the development of interference rejection techniques to improve system performance [1]-[9]. In this paper we present a study of a previously proposed transform-domain excision approach using various interference scenarios [1]-[4]. Transform-domain processors can utilize DFT-block processing or a surface acoustic wave (SAW) device to generate the required Fourier Transform of the received DS/SS signal. Spectral leakage inherent to DFT-block processing can be managed somewhat by using a time weighting function that is non-rectangular prior to transforming the received signal. Reducing the effects of spectral leakage is beneficial to the overall processing gain of the DS/SS system. Fundamentally, the transform-domain technique requires prior knowledge of interference parameters, such as frequency, power and spectral distribution, to eliminate or excise corrupted spectral components prior to detection. Clearly, the success of this approach to narrowband interference rejection is highly dependent on an accurate estimation of the interference signal. This is effectively a spectral estimation problem. As an alternative, we consider a pre-processing step utilizing linear prediction filtering to suppress narrowband interference prior to the transform-domain detection process. The linear prediction filter has well understood properties with regard to suppressing narrowband signals in wideband systems [5]-[8]. This type of filter can be formulated in an adaptive configuration to operate in a non-stationary environment and the narrowband signal can be modeled as a stochastic autoregressive (AR) process. If this assumption holds, the linear prediction filter can be used to predict and remove future values of the narrowband signal, thereby leaving the desired DS/SS signal.
Linear prediction or whitening filters can be implemented using recursive-least-squares (RLS) methods, which are similar in execution to the popular Kalman filter often used in state estimation and tracking [11]. RLS techniques have been found to have better convergence properties and improved steady-state tracking over the popular least-mean-squares (LMS) methods, but at the cost of higher computational complexity [11], [12]. Linear prediction has a distinct advantage over the transform-based method since it does not require knowledge of interference parameters such as frequency, power and spectral dispersion. Previous work with



linear prediction has shown success even with narrowband interferers whose frequencies change with time [13]-[16]. For transform-based detectors, the conventional approach is to match the excised result directly with the DS/SS spreading sequence, regardless of any time weighting or prefiltering processes. In this case, this solution is only optimal (in a minimum Probability of Error sense) in the presence of additive white Gaussian noise (AWGN). However, Sandberg [1] has previously introduced an optimal solution when time weighting is used.¹ This work has shown that the optimal solution with time weighting can improve receiver sensitivity by 3 dB. In contrast, the stochastic properties of the output from a linear prediction filter can be assumed to be Gaussian and white in nature, enabling a conventional solution of a matched decorrelator to be used without any loss in optimality.

The paper is organized as follows. Section II presents the detector models used in this study and Section III gives the probability of error analysis of the various approaches to DS/SS detectors assuming the presence of narrowband interference. Section IV presents the Monte Carlo simulation results and finally conclusions are given in Section V.

II. DETECTOR MODELS

For DS/SS modulation, a PN sequence is used to spread the user's signal spectrum over a wider bandwidth to allow for the advantages described in the previous section to be realized. The PN sequence for a given user is

$$m_i(t) = \sum_{l=0}^{N_c-1} c_{il}\,w(t - lT_c) \qquad (1)$$

where $N_c$ is the number of chips per message bit, $T_c$ is the chip duration, $c_{il}\ (\in \{1,-1\})$ is the $l$th chip of the PN sequence, and $w(t)$ is a unity rectangular pulse with a duration equal to the input symbol duration. $N_c$ is also referred to as the processing gain of the spread-spectrum system. The DS/SS signal is given by

$$s(t) = \sum_{i=-\infty}^{\infty} b_i\,m_i(t - iT_s) \qquad (2)$$

where $b_i$ is the information symbol to be transmitted by a particular user and $T_s\ (= N_cT_c)$ is the symbol duration. In practice, $s(t)$ is pulse shaped and processed by an M-ary modulator prior to RF transmission. For CDMA, the PN sequence is specific to each user to facilitate the multiple-access requirement. If the DS/SS signal is corrupted by AWGN, $n(t)$, and by narrowband interference, $q(t)$, the equivalent low pass receive signal is simply

$$r(t) = s(t) + n(t) + q(t). \qquad (3)$$

The narrowband interference can be modeled as

$$q(t) = \sqrt{2P_q}\cos(2\pi f_m t + \Theta), \qquad (4)$$

where $P_q$, $f_m$ and $\Theta$ are the interference power, frequency, and phase offset, respectively. This model can be extended for multiple tonal interference signals representing multiple narrowband users in the DS/SS channel. Multiple interference signals represent a more difficult interference problem for the transform excision process due to the higher number of contaminated spectral components. In matrix form, (3) can be represented as

$$[r]_i = [s]_i + [n]_i + [q]_i, \qquad (5)$$

where $[r]_i$, $[n]_i$ and $[q]_i$ represent the received DS/SS sequence for the particular bit $b_i$, the noise sequence and the interference sequence, respectively. The signal term in (5) is represented by $[s]_i = b_i[c]$ where $[c]$ is the vector of length $N_c$ representing the PN sequence. Assuming quadrature modulation, the noise vector, $[n]_i$, is a zero-mean complex Gaussian process with real and imaginary components that are identically distributed. The variance of the real and imaginary terms is $N_0/2$, the AWGN noise spectral density. The interference term is represented as $[q]_i = \sqrt{2P_q}\,e^{j\Theta}[\exp\{j2\pi f_m n/N_c\}]$ where $0 \le n \le N_c - 1$. For convenience, assume transform processing of length $N_c$, so that the Nyquist theorem limits the interference frequency to $-N_c/2 \le f_m \le N_c/2$. The phase term, $\Theta$, is a random variable uniformly distributed over $(0, 2\pi)$.

A. Transform-Based DS/SS Detector

The transform domain receiver processes the input on a block-by-block basis as indicated in Figure 1.² Each block sequence will contain $N_c$ chips representing one bit of information. As shown, the DS/SS signal, $[r]_i$, can be weighted by an optional weighting matrix $[D]$ prior to detection. This matrix is formulated using an $N_c \times N_c$ identity matrix with the diagonal components equal to the selected time-weighting function.³ A DFT operation is performed by the multiplication of $[D][r]_i$ with the DFT matrix, $[W] = [\exp(-j2\pi kn/N_c)]$ where $0 \le k \le N_c - 1$. The time-weighted received transform signal is then $[R]_i = [W][D][r]_i$, which, expanded, gives

$$[R]_i = [W][D][s]_i + [W][D][n]_i + [W][D][q]_i. \qquad (6)$$

¹ The application of time-weighting to AWGN renders the Gaussian and independent assumption false.
² We are concerned with the demodulation process only and assume perfect symbol synchronization.
³ A Blackman-Harris time-weighting function was used for this study.
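The role of the weighting matrix $[D]$ ahead of the DFT matrix $[W]$ can be illustrated by counting how many transform bins a single interferer contaminates. The sketch below is an illustration under assumptions rather than the study's implementation: it uses the common four-term Blackman-Harris coefficients (the exact variant used in the study is not stated), a complex tone placed halfway between bins ($f_m = 10.5$), a direct $O(N^2)$ DFT, and an arbitrary 40 dB threshold below the spectral peak. The helper names are hypothetical.

```python
import cmath
import math


def blackman_harris(n_c):
    """Four-term Blackman-Harris weights (assumed variant), symmetric form."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return [a0
            - a1 * math.cos(2.0 * math.pi * n / (n_c - 1))
            + a2 * math.cos(4.0 * math.pi * n / (n_c - 1))
            - a3 * math.cos(6.0 * math.pi * n / (n_c - 1))
            for n in range(n_c)]


def dft(x):
    """Direct DFT, i.e. multiplication by [W] = [exp(-j 2 pi k n / N_c)]."""
    n_c = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_c) for n in range(n_c))
            for k in range(n_c)]


def bins_within(x, weights, floor_db):
    """Number of DFT bins of [W][D]x lying within floor_db of the peak bin."""
    magnitudes = [abs(v) for v in dft([w * s for w, s in zip(weights, x)])]
    threshold = max(magnitudes) * 10.0 ** (-floor_db / 20.0)
    return sum(1 for m in magnitudes if m >= threshold)


n_c, f_m = 63, 10.5  # block length from Section IV; tone frequency is illustrative
tone = [cmath.exp(2j * math.pi * f_m * n / n_c) for n in range(n_c)]

rect_count = bins_within(tone, [1.0] * n_c, 40.0)
bh_count = bins_within(tone, blackman_harris(n_c), 40.0)
print(f"bins within 40 dB of peak: rectangular = {rect_count}, "
      f"Blackman-Harris = {bh_count}")
```

With rectangular weighting the off-bin tone leaks into essentially every bin at this threshold, whereas the weighted block confines it to a handful of bins around $f_m$, which is what makes a narrow excision band workable.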



Assuming the band of frequencies corrupted by the narrowband user is known, i.e. $\Delta \in \{f \mid f \in (f_m - n_w, f_m + n_w)\}$ where $n_w$ defines the upper and lower band limits, a variant of $[R]_i$ can be defined as $[\tilde{R}]_i = [X][R]_i$. The matrix $[X]$ is an identity matrix less the rows corresponding to the desired excision frequencies.⁴ For transform-based excision, $[\tilde{R}]_i$ is processed by the demodulation stage of the receiver. The transform-based detector formulates a hypothesis of the transmitted bit, $\hat{b} \in \{-1,+1\}$, that is,

$$\mathrm{Re}\left([G]^*[\tilde{R}]_i\right)\ \begin{cases} > 0 \Rightarrow \hat{b} = +1 \\ < 0 \Rightarrow \hat{b} = -1 \end{cases}, \qquad (7)$$

where $[G]$ is the demodulator.⁵ From [1], the optimal detector with time-weighting is

$$[G_0] = [A][W][D][c], \qquad (8)$$

where

$$[A] = \left(([X][W][D])([X][W][D])^*\right)^{-1}. \qquad (9)$$

The reduction in the computational burden of $[A]$ is discussed in [1]. For a conventional demodulator optimized for the AWGN case only, the demodulator is

$$[G_1] = [X][W][c]. \qquad (10)$$

The $[G_0]$ and $[G_1]$ demodulators represent two of the three DS/SS detectors studied in this research.

B. Linear Prediction Filtering-based DS/SS Detector

If a linear prediction suppression filter is first applied to the received signal prior to decorrelation, the input to the PN correlator is given as

$$\hat{r}(t) = \int_{-\infty}^{+\infty} h_n(\alpha)\,r(t - \alpha)\,d\alpha, \qquad (11)$$

where $h_n(\alpha)$ represents the impulse response of the linear prediction filter. The function $\hat{r}(t)$ is the error term of the linear prediction process and should, in theory, contain the desired DS/SS and AWGN signals less the interference signal. In matrix form, this becomes

$$[\hat{r}]_i = [H][r]_i, \qquad (12)$$

where the convolution matrix, $[H]$, is formulated using the resulting filter tap coefficients of the linear prediction process, $[h_n]$. These coefficients can also be used to provide the spectral characteristics of the narrowband interference signal. With linear prediction, the weighting term in (6) is dropped since the interference is suppressed prior to detection. In the time domain, the correlation receiver is

$$\mathrm{Re}\left([g]^T[\hat{r}]_i\right)\ \begin{cases} > 0 \Rightarrow \hat{b} = +1 \\ < 0 \Rightarrow \hat{b} = -1 \end{cases}, \qquad (13)$$

where the demodulator is simply $g = c$. The decision rule given in (13) is identical to (7) using the $[G_1]$ demodulator without the use of the time-weighting, $[D]$, and excision matrix, $[X]$. This detector topology is shown in Figure 1.

Figure 1. Transform-Domain and Linear Prediction Receivers for Narrowband Interference

This variation of a DS/SS detector has certain advantages over the transform-domain detector. The utilization of the linear prediction filter eliminates the need to assess the frequency localization, power level and spectral width of the narrowband interference signal. Also, the simple demodulation process in (13) is optimal since the output of the prediction filter is assumed to have Gaussian-like statistics.⁶

III. PROBABILITY ERROR ANALYSIS

For Gaussian noise-based communication systems, a well-known relationship for the theoretical Probability of Error of digital modulation is

$$P_e \propto Q\left(\sqrt{SNR}\right). \qquad (14)$$

Although the Gaussian assumption is not quite accurate for the time-weighted transform domain detectors, it can be used as an approximation for relative performance between detectors.

⁴ The dimension of $[X]$ is $(N_c - \mathrm{count}(\Delta)) \times N_c$.
⁵ $[\ ]^*$ denotes conjugation.
⁶ This is a first order approximation. The combination of the noise term, $n(t)$, and the signal term, $s(t)$, as an input to the linear prediction filter is not purely Gaussian.
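The excision decision rule in (7) with the conventional demodulator $[G_1] = [X][W][c]$ can be sketched for an idealized case. The example below makes several simplifying assumptions that are not part of the study: no AWGN, rectangular weighting ($[D]$ equal to the identity), an interferer whose frequency falls exactly on a DFT bin so that it does not leak, a closed excision band of $f_m \pm n_w$ bins, and a pseudo-random $\pm 1$ code drawn from Python's `random` module rather than a true PN generator. The function names are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Direct DFT, i.e. multiplication by [W] = [exp(-j 2 pi k n / N_c)]."""
    n_c = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_c) for n in range(n_c))
            for k in range(n_c)]


def excision_detect(r, code, excised):
    """Decide the bit from Re(G1^* R~), skipping the bins removed by [X]."""
    spectrum_r = dft(r)
    spectrum_c = dft(code)
    statistic = sum((spectrum_c[k].conjugate() * spectrum_r[k]).real
                    for k in range(len(r)) if k not in excised)
    return 1 if statistic > 0 else -1


def received_block(bit, code, f_m, p_q, theta):
    """[r]_i = b_i [c] + sqrt(2 P_q) e^{j theta} [exp(j 2 pi f_m n / N_c)], no noise."""
    n_c = len(code)
    amplitude = math.sqrt(2.0 * p_q)
    return [bit * code[n]
            + amplitude * cmath.exp(1j * (theta + 2.0 * math.pi * f_m * n / n_c))
            for n in range(n_c)]


rng = random.Random(7)                        # fixed seed for repeatability
code = [rng.choice((-1, 1)) for _ in range(63)]
f_m, n_w = 10, 1                              # interferer bin and half-width (illustrative)
excised = {k % 63 for k in range(f_m - n_w, f_m + n_w + 1)}

# P_q = 1e4 against unit chip power corresponds to an ISR of 40 dB.
decisions = [excision_detect(received_block(b, code, f_m, 1.0e4, 0.3), code, excised)
             for b in (+1, -1)]
print("decisions for transmitted bits (+1, -1):", decisions)
```

In this idealized setting the excised statistic reduces to $b_i$ times the code energy in the retained bins, so the bit is recovered regardless of the interferer power; with noise, weighting and off-bin interferers the behaviour is governed by the SNR expressions of Section III.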



The SNRs of the [G0] and [G1] demodulators are derived as (see footnote 7)

\[ SNR_{G_0} = \frac{2 \hat{E}_b R_b}{N_0 B_w} \tag{15} \]

and

\[ SNR_{G_1} = \left[ \frac{N_c^{-1}\,\mathrm{Tr}(D)^2}{\mathrm{Tr}(D^2)} \right] \left[ \frac{2 \hat{E}_b R_b}{N_0 B_w} \right], \tag{16} \]

where \( \hat{E}_b = \frac{N_c - \mathrm{count}(\Delta)}{N_c} E_c \) is the corrected energy per bit as a result of the transform-domain excision process and E_c is the energy per chip of the PN sequence. The bandwidth of the DS/SS channel (Hz) and the data rate (bits/second) are B_w and R_b, respectively. Theoretically, the result in (15) indicates an approximate 3 dB improvement in detector performance for [G0] over [G1]. For either detector, there is a trade-off between removing corrupted spectral components and the resulting reduction in effective processing gain. We will see that this becomes more problematic in the multiple-interference test case. With respect to the linear prediction detector, the output can be given as

\[ \mathrm{Re}\left([g]^T [\hat{r}]_i\right) = [Y]_i + [N]_i + [Q]_i, \tag{17} \]

where the DS/SS signal component is

\[ [Y]_i = \mathrm{Re}\left([g]^T [H][s]_i\right). \tag{18} \]

For the other components given in (5), the correlation outputs for the AWGN and interference components are given as

\[ [N]_i = \mathrm{Re}\left([g]^T [H][n]_i\right) \tag{19} \]

and

\[ [Q]_i = \mathrm{Re}\left([g]^T [H][q]_i\right). \tag{20} \]

From the results provided in the Appendix, the detector SNR utilizing the linear prediction filtering is SNRLP =

(N K 0

=

2 K hn β Eb2 Rb hn

n

)

β Eb + β Eb N 0 Bw

K hn β 2 Eb Rb

(K h

(21)

)

+ 1 N 0 Bw

,

where K_hn, β, E_b and N_0 represent the scalar term [h_n]^T [h_n]^*, the spreading waveform coefficient [10], the signal energy per bit and the power spectral density of the AWGN, respectively. Thus, with interference suppression, the resulting detector SNR for this case is independent of the interference power, and the overall processing gain of the system is not affected since no transform excision is required, i.e. E_b = N_c E_c. For values of β < 1 [10] for all possible PN waveforms, (21) yields the bound

\[ SNR_{LP} \le \frac{2 E_b R_b}{N_0 B_w}, \tag{22} \]

which indicates the upper performance limit of a detector operating in the presence of AWGN only. Comparing (22) with the results given in (15) and (16) clearly indicates the major advantage of maintaining the original DS/SS system processing gain. The following section illustrates the performance of the DS/SS system using the various detector types in simulation.

IV. RESULTS

In order to investigate the performance of each detector, the respective P_e was approximated using the bit error rate (BER) performance in a simulated DS/SS system. A Monte Carlo simulation with 10^6 bits per run was performed for each test case. Three detectors were assessed, [G0], [G1] and the linear prediction filter detector, using various interference scenarios under various operating conditions (see footnote 8). This includes multiple interference signals present at the detector simultaneously. Other simulation specifics include a PN sequence length of 63 and a 10th-order RLS prediction filter. For the study cases, two Eb/N0 values of 3 and 6 dB were used in the assessment procedure and up to three interference signals were present at the receiver. For succinctness, only the 6 dB results are shown. Frequency localizations were varied from 2 to 26.5 Hz and various interference powers were used.

Previous literature [1] has shown the transform-detector performance to be sensitive to the spectral location of the interference signal. This is due to the time-weighting DFT in the detection process, and thus excising different spectral locations will yield different BER results. Figure 2 confirms this result, which is not indicated in the theoretical results given in (15) and (16). For this test, the interference-to-signal ratio (ISR) was set to 40 dB. Also note the oscillatory nature of the [G0] response, which is similar to the previous results in [1].
The non-integer values of frequency present a particularly difficult excision problem for the transform-domain detectors due to the greater spreading of the interference energy. It should be noted that both transform-detectors were given the exact knowledge of the spectral location of the interference.
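The Monte Carlo procedure can be sketched for the interference-free reference case. The block below is an assumption-laden illustration rather than the paper's simulator: it uses a random ±1 code of length 63 as a stand-in for the PN sequence, unit chip energy, BPSK bits, AWGN only, a plain correlator, and 5000 bits instead of 10^6 so that it runs quickly. It compares the estimated BER with Q(√(2Eb/N0)), the standard AWGN result for antipodal signaling.

```python
import math
import random


def q_function(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def simulate_ber(ebn0_db, n_bits=5000, n_chips=63, seed=1):
    """Estimate the BER of a chip-level DS/SS correlator in AWGN only."""
    rng = random.Random(seed)
    pn = [rng.choice((-1.0, 1.0)) for _ in range(n_chips)]  # stand-in PN code
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    n0 = n_chips / ebn0          # chip energy Ec = 1, so Eb = n_chips * Ec
    sigma = math.sqrt(n0 / 2.0)  # per-chip noise standard deviation
    errors = 0
    for _ in range(n_bits):
        bit = rng.choice((-1.0, 1.0))
        # Spread the bit, add noise chip by chip, then despread by correlation.
        decision = sum(c * (bit * c + rng.gauss(0.0, sigma)) for c in pn)
        if (decision >= 0.0) != (bit > 0.0):
            errors += 1
    return errors / n_bits


ber = simulate_ber(3.0)
theory = q_function(math.sqrt(2.0 * 10.0 ** (3.0 / 10.0)))
print(f"simulated BER {ber:.4f}, AWGN-only theory {theory:.4f}")
```

Adding interference and an excision or prediction stage before the correlator would turn this into a comparison of the kind reported in the figures; those stages are deliberately left out here.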

7 The demodulators are modified to include the effective processing gain after the excision process [1]. Also, Tr[ ] denotes the trace of a matrix.

8 For a more detailed analysis of transform-based detectors, please see [1].



Figure 2 Detector Performance as a function of Interference Frequency, Eb/N0=6dB

The spectral width of excision, ∆, was varied among 4, 8 and 16 (see footnote 9). Figure 2 also illustrates that at lower frequencies, the [G0] detector is a poorer performer than the [G1] detector but improves at higher frequencies, indicating greater sensitivity to the spectral location of the interference. With the optimal detector, [G0], increasing the spectral width of excision beyond ∆ = 8 worsened the detector performance. Reducing the spectral width to ∆ = 4 proved catastrophic to both the [G0] and [G1] detectors. In this case, the residual interference energy remaining after excision results in poor detection performance. This is a good example of the practical implementation difficulties of the transform-based detector. As anticipated from (22), the linear prediction filter detector offers superior BER performance for this test case since the SNR is not dependent on the interference frequency and the original processing gain is maintained. Another advantage of this detector is that no prior knowledge of the interference signal is required. The same test case was run with Eb/N0 set at 3 dB and, as expected, the respective BER performance for all detectors worsened in response to the increased levels of AWGN. However, the relative trends in performance remain the same, with the linear prediction filter detector having better BER performance than the transform detectors.

To investigate DS/SS detector performance as a function of interference power, three test cases were assessed. The test cases are defined by the number of interference signals present at the receiver. This research extends previous work in [1]-[4], which extensively analyzed the impact of a single narrowband interference signal. Multiple narrowband users sharing the same channel is a distinct possibility given current trends of increased user density over wireless channels. In this case, care must be taken since the number of excised spectral components is limited by the transform dimension. Figure 3 illustrates the BER performance as a function of the ISR for a single tone located at 2.5 Hz. A spectral width of ∆ = 8 was utilized for all of the ISR test cases.

Figure 3 Detector Performance as a function of Interference Power, Single Tone f_m = 2.5 Hz, Eb/N0=6dB

For the single tone case, except for higher ISR values, the optimal [G0] demodulator outperforms the [G1] demodulator. It should be noted that for our implementation of the transform-domain detectors, excision is performed irrespective of the interference power. Ideally, the excision process should be bypassed at lower interference power levels to improve detector performance. This is illustrated with the ∆ = 0 result for the [G0] and [G1] detectors. In comparison, the linear prediction filter detector yields an improvement in BER performance at all ISR levels, and this is more evident at higher ISR values. Figure 4 and Figure 5 illustrate the BER results for the two- and three-interference cases, where additional interference signals are added at 12.5 and 22.0 Hz, respectively.

9 Optimal spectral widths are analyzed in [1]. It is not the focus to repeat this work but show some trends in altering this value.

Figure 4 Detector Performance as a function of Interference Power, Two Tone f_m ∈ {2.5, 12.5} Hz, Eb/N0=6dB

As expected, these test cases present increasingly more difficult interference scenarios for the transform-based detectors due to the increased number of excised spectral components. For the linear prediction filter detector, the additional interference signals also degrade system performance since multiple notches result in additional distortion of the desired DS/SS signal. Overall, however, the linear prediction filter detector can offer better BER performance than transform-domain detectors with no spectral estimation requirements. In very low ISR regimes, the performance of the [G0] demodulator with ∆ = 0 converges to the linear prediction detector result. This is expected given a zero spectral width excision result for (15) compared to the linear prediction result in (22). All multiple interference scenarios were also tested using Eb/N0 = 3 dB. Similar trends, albeit with higher BERs, were found.

VI. APPENDIX

Assuming [Y]_i, [N]_i and [Q]_i are zero-mean and independent complex processes, the output SNR can be stated as

\[ SNR = \frac{E\{[Y]_i^2\}}{E\{[Q]_i^2\} + E\{[N]_i^2\}}. \tag{23} \]

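Using the demodulator g = c and (18) gives the correlation of the signal component of the detector output with linear prediction filtering as

\[ \begin{aligned} E\{[Y]_i^2\} &= E\left\{ \mathrm{Re}\left([c]^T[H][c]_i\right) \mathrm{Re}\left([c]^T[H][c]_i\right) \right\} \\ &= E\left\{ \mathrm{Re}\left( \left([c]^T[H][c]_i\right) \left([c]^T[H][c]_i\right)^* \right) \right\} \\ &= E\left\{ \mathrm{Re}\left( \left([c]^T[H][c]_i\right) \left([c]^{*T}[H]^*[c]_i^*\right) \right) \right\}. \end{aligned} \tag{24} \]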

Figure 5 Detector Performance as a function of Interference Power, Three Tone f_m ∈ {2.5, 12.5, 22.0} Hz, Eb/N0=6dB

Assuming an uncorrelated chip sequence and prediction filter tap coefficients, a simple exercise shows that (24) simplifies to

\[ E\{[Y]_i^2\} = K_{h_n} \beta E_b^2, \tag{25} \]

where β E_b = [c_n]^T [c_n]^* and n indicates the chip index. The term K_hn is equivalent to the summation of the modulus squared of the filter tap coefficients, i.e. [h_n]^T [h_n]^*. Similarly, for the noise component, [N]_i,

\[ \begin{aligned} E\{[N]_i^2\} &= E\left\{ \mathrm{Re}\left([c]^T[H][n]_i\right) \mathrm{Re}\left([c]^T[H][n]_i\right) \right\} \\ &= E\left\{ \mathrm{Re}\left( \left([c]^T[H][n]_i\right) \left([c]^T[H][n]_i\right)^* \right) \right\} \\ &= \frac{N_0}{2} \beta E_b K_{h_n}. \end{aligned} \tag{26} \]

V. CONCLUSIONS

This research focuses on the assessment of transform-domain excision and linear prediction-based techniques for the rejection of narrowband interference in DS/SS systems. This is an important aspect of modern communication systems, where spread spectrum and narrowband users are often co-located in the same communication channel. Previously proposed transform-domain-based DS/SS receivers have shown good capabilities at suppressing the effects of narrowband interference. However, a drawback of this technique is the need to assess the interference spectral content and power level. Unnecessary excision results in a reduction in processing gain. Also, excision of spectral components has a greater impact on detection performance for the case of multiple interference signals. Alternatively, a linear prediction filter can be used to suppress the narrowband interference prior to detection, which eliminates the need for spectral estimation of the interference. The linear prediction filter DS/SS receiver eliminates the major weaknesses of the transform-domain-only detector. This is especially the case when interference powers are significantly higher than that of the DS/SS signal.

For the interference component, [Q]_i, consider

\[ \begin{aligned} E\{[Q]_i^2\} &= E\left\{ \mathrm{Re}\left([c]^T[H][q]_i\right) \mathrm{Re}\left([c]^T[H][q]_i\right) \right\} \\ &= E\left\{ \mathrm{Re}\left( \left([c]^T[H][q]_i\right) \left([c]^T[H][q]_i\right)^* \right) \right\} \\ &= P_i \beta E_b [h_n]^T [h_n]^*. \end{aligned} \tag{27} \]

Utilizing the linear prediction filter in transfer function form, the power spectral density of the received signal, r(t), after filtering can be written as [1]


\[ PS_r(f) \approx \frac{N_0}{2 \left| \int_0^{+\infty} h_n(t) \exp(-j 2\pi f t)\, dt \right|^2}. \tag{28} \]

Letting f = f_m and using the modulation and Parseval's theorems yields

\[ PS_r(f_m) \approx \frac{N_0}{2 [h_n]^T [h_n]^*}. \tag{29} \]

Assuming a pure tone, P_i ≈ PS_r(f_m), and using (29), (27) can be simplified to

\[ E\{[Q]_i^2\} = \frac{\beta E_b N_0}{2}. \tag{30} \]

VII. ACKNOWLEDGMENT

The author wishes to acknowledge the National Sciences and Engineering Research Council of Canada and the University of New Brunswick for their support of this research.

VIII. REFERENCES

[1] S.D. Sandberg, “Adapted Demodulation for Spread-Spectrum Receivers which Employ Transform-Domain Interference Excision,” IEEE Transactions on Communications, vol. 43, no. 9, pp. 2502-2510, September 1995.
[2] R. Rifkin, “Comments on ‘Narrowband Interference Rejection Using Real-Time Fourier Transforms’,” IEEE Transactions on Communications, vol. 39, no. 9, pp. 1292-1294.
[3] R.C. DiPietro, “An FFT based technique for suppressing narrow-band interference in PN spread spectrum communication systems,” Proc. IEEE ICASSP’89, 1989, pp. 1360-1363.
[4] S.D. Sandberg, S. Del Marco, K. Jagler, and M.A. Tzannes, “Some Alternatives in Transform-Domain Suppression of Narrow-Band Interference for Signal Detection and Demodulation,” IEEE Transactions on Communications, vol. 43, no. 12, pp. 3025-3036, December 1995.
[5] J.W. Ketchum and J.G. Proakis, “Adaptive algorithms for estimating and suppressing narrow-band interference in PN spread-spectrum systems,” IEEE Transactions on Communications, vol. COM-30, no. 5, pp. 913-924, May 1982.
[6] L. Milstein, “Interference rejection techniques in spread spectrum communications,” Proc. of the IEEE, vol. 76, no. 8, pp. 657-671, June 1988.
[7] R.A. Iltis and L.B. Milstein, “Performance analysis of narrow-band interference rejection techniques in DS spread spectrum systems,” IEEE Transactions on Communications, vol. COM-32, no. 11, pp. 1169-1177, May 1984.
[8] G.J. Saulnier, P.K. Das, and L.B. Milstein, “An adaptive digital suppression filter for direct-sequence spread-spectrum communications,” IEEE Journal on Selected Areas in Communications, vol. SAC-3, no. 5, pp. 676-686, Sept. 1985.
[9] L. Li and L.B. Milstein, “Rejection of Narrow-Band Interference in PN Spread Spectrum Systems Using Transversal Filters,” IEEE Transactions on Communications, vol. COM-30, no. 5, pp. 925-928, May 1982.
[10] M.Z. Win, G. Chrisikos and N.R. Sollenberger, “Performance of Rake Reception in Dense Multipath Channels: Implications of Spreading Bandwidth and Selection Diversity Order,” IEEE Journal on Selected Areas in Communications, vol. 18, no. 8, pp. 1516-1525, August 2000.
[11] S. Haykin, Adaptive Filter Theory, 2nd Edition, Englewood Cliffs, NJ: Prentice Hall, 1991.
[12] B. Farhang-Boroujeny, Adaptive Filters, New York: Wiley, 1999.
[13] J. Meng and X. Ding, “Narrowband jamming suppression capabilities of a RAKE receiver operating in a multipath environment,” SPIE Defense and Security Symposium, Orlando, March 2005.
[14] J. Meng, “A Fast Converging Minimum Frequency Error RLS Lattice Filter for Narrowband Interference with Discrete Frequency Steps,” IEE Vision, Image and Signal Processing, vol. 152, no. 1, pp. 36-44, 2005.
[15] J. Meng, “A Modified Recursive Least Squares Adaptive Lattice Filter for Narrowband Interference Rejection,” Canadian Conference on Electrical and Computer Engineering, Montreal, Canada, May 2003.
[16] J. Meng and X. Ding, “Improved CDMA Performance in the Presence of Frequency Hopping Spread Spectrum Interference Utilizing Recursive Least-Squares Prediction Filtering,” submitted to IET Communications, April 2007.

IX. BIOGRAPHIES

Julian Meng (M’02) received a Ph.D. degree in Electrical Engineering from Queen’s University, Kingston, ON, Canada in 1993. Currently he is an Associate Professor in the Department of Electrical and Computer Engineering, University of New Brunswick, Fredericton, NB, Canada. Prior to this position, he worked in various industrial positions in research areas such as LMDS wireless communications and image processing with a focus on data fusion applications. Some of his research interests include adaptive signal estimation and non-linear signal processing.


ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Robert Niese et al.: Nearest Neighbor Classification for Emotion Recognition in Stereo Image Sequences

Nearest Neighbor Classification for Emotion Recognition in Stereo Image Sequences

Robert Niese, Ayoub Al-Hamadi and Bernd Michaelis

Institute for Electronics, Signal Processing and Communications (IESK), Otto-von-Guericke-University Magdeburg, D-39016 Magdeburg, P.O. Box 4210, Germany
{Robert.Niese, Ayoub.Al-Hamadi}@ovgu.de

Abstract: In this paper, we present a user-independent, real-time capable, automatic approach for recognition of basic emotion expressions from stereo image sequences. The approach automatically detects faces in unconstrained pose based on depth and color information. In order to overcome difficulties caused by increasing change in pose, lighting transitions, or complicated background, we introduce a face normalization algorithm based on an Iterative Closest Point (ICP) algorithm. In normalized face images we define a set of physiologically motivated face regions related to a subset of facial muscles that are suitable for automatically detecting the six well-known basic emotions. Visual emotion analysis takes place by an optical flow based feature extraction and a nearest neighbor classifier, which uses a distance measure, i.e. the current flow vector pattern is matched against empirically determined ground truth data. The presented approach has the advantage that it does not require prior knowledge about the appearance of the face, which makes it broadly applicable.

Key Words: Human Computer Interaction, Emotion Recognition, Image processing, Application

I. INTRODUCTION

In recent years there has been a growing interest in improving all aspects of human computer interaction (HCI). This emerging field has been a research interest for scientists from a wide spectrum of disciplines, e.g. computer vision, engineering, psychology, and neuroscience. It is claimed that to truly achieve effective human machine interfaces, a natural way of interaction is necessary. One core task in HCI is intention recognition. This requires effective visual emotion recognition as a first step, which is addressed in this paper. Further, the possibility to automatically detect and classify emotional facial signals opens a field of applications from behavioral science and medicine to robotics, multimedia and companion systems. In these applications, the user should of course not be constrained by the system in order for it to work, for instance in terms of a strict body and head posture during interaction. This has resulted in a need for better face detection, facial feature extraction and classification of expressions. Even though big advances have been made in recent years, these requirements are still challenges to conventional methods under real-world conditions in real time. First, an automatic detection of faces and facial features must provide reliability across changes in pose, illumination and expressions (PIE). Further, robust classification must be assured. Considering the multitude of face appearances, emotion detection purely based on static images usually requires some prior knowledge about the face observed and can be difficult even for humans.

*Correspondence to: Robert Niese, Institute for Electronics, Signal Processing and Communications, University of Magdeburg, Germany. E-mail address: [email protected]
This work was supported by Bernstein-Group (BMBF: 01GQ0702) and NIMITEK grants (LSA: XN3621E/1005M).
Received for publication 4 July 2007; Accepted 30 July 2007

II. RELATED WORK

The study of faces has long been of interest to humans. We have the natural ability to recognize emotions, which are most expressively displayed by facial expressions. Since the 1970s psychologist Paul Ekman and his colleagues have performed extensive studies of human facial expressions, where they found strong evidence of universality of facial expressions and introduced the Facial Action Coding System (FACS) in order to describe all possible expressions in static images [10]. Inspired by the work of Ekman, many approaches have been developed to automatically analyze facial expressions based on evaluation of still images and video sequences. An in-depth review of much of the research done in automatic facial expression analysis can be found in recent surveys [2, 3, 16].

Image sequences carry much more information for classifying facial expressions than single images, because static images do not clearly reveal subtle changes in faces. Commonly, facial expressions are categorized from video by tracking facial features and measuring the amount of facial movement. One of the first works to automatically quantify facial expression from image motion was presented by Black and Yacoob [7], who used parameterized models of image motion to recover non-rigid motion. Applying a rule-based classifier, six basic facial expressions (happiness, sadness, anger, fear, surprise and disgust) were recognized from the model parameters. Essa [4] proposed the FACS+ system, which is used to probabilistically describe facial motion and muscle actuation. This method uses geometric and motion-based dynamic models that are fed with optical flow data. In [5] the optical flow is computed for a set of regions on the face, and expression classification is done with a radial basis function network. Analysis based on probabilistic models such as Hidden Markov Models has been proposed in several works [6, 8]. The concept in [14] uses Gabor wavelets, detects subtle changes in facial expression by recognizing facial muscle action units (AUs), and analyzes their temporal behavior. Bartlett et al. [17] use Support Vector Machines and AdaBoost classifiers in order to determine action units.

Basically, the common scheme of all methods is that they first extract a number of features from the images and then feed these features into a classification system. The outcome is one of a set of predefined emotion categories. Here, most of the methods attempt to directly map facial expression into one of the six basic emotion classes introduced by Ekman. The main difference between the facial expression analysis methods is the selection of features and the classifier used to distinguish between emotions. State-of-the-art methods work well in frontal face analysis but often have difficulties with increasing change in PIE, or complicated background. Challenges arise from the fact that the users observed should not be constrained in the interaction. In this paper, we present a user-independent, real-time capable, automatic approach for recognition of basic emotion expressions from stereo image sequences. The approach automatically detects faces in unconstrained pose based on depth and color information.
In order to overcome difficulties caused by PIE, we introduce a face normalization algorithm, and based on that a set of physiologically motivated face regions related to a subset of facial muscles, which are appropriate to automatically detect the six well-known basic emotions. The visual emotion analysis takes place by using our optical flow-based nearest neighbor classifier, which applies a distance measure between an empirically determined ground truth and the current measurement. In this way we fulfill the above-mentioned demands on HCI systems. This concept reflects the common scheme of facial expression analysis methods, yet, the combination of stereo and color information in the image sequence represents a new and powerful method.

III. SUGGESTED APPROACH

The presented approach for emotion classification is based on motion analysis in sequences of normalized face images. This approach has several advantages. First, the stereo vision based normalization of the face solves the pose problem, which causes a potential problem for many algorithms. In a normalized face, neither head rotation nor changing size due to back and forth movement interferes with the image analysis. Further, the incorporation of spatio-temporal information enables a classification of facial expressions without prior knowledge about the face’s texture and shape. Hence, the facial motion analysis has the benefit of universality across different people with a multitude of face appearances, which usually constrain approaches that do not consider the temporal context. However, in order to capture subtle facial movements, our approach needs at least 25 color images per second plus the additional stereo data. With the upcoming generation of affordable real-time range sensors this challenge becomes feasible.

In the first step of our approach (Fig. 1), we automatically create a person-specific surface model. This model is required to estimate the face pose and subsequently create a normalized image of the face. Feature extraction and analysis is based on texture analysis of the normalized face image; therefore it is not directly performed in the 3D domain. The normalized image presents the basis for optical flow based feature extraction. Here we analyze physiologically motivated regions, which are automatically determined from 2D and 3D information. Finally, we use a nearest neighbor classifier, which is based on a distance measure, i.e. the current flow vector pattern is matched against ground truth data.

[Figure: processing pipeline. Stereo camera system in color and real-time range sensor; sequence capturing of video and range data; A. Surface Model Generation; B. Model Matching and Face Pose Estimation; C. 3D Based Face Normalization; D. Feature Extraction; E. Emotion Classification]

Fig. 1. The suggested approach for motion based emotion classification
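The final classification step described above can be sketched as a plain nearest neighbor search. The feature layout (one flattened flow vector pattern per sample), the Euclidean distance, the labels and the reference values below are illustrative assumptions; the text only states that a distance measure against empirically determined ground truth is used.

```python
import math


def distance(a, b):
    """Euclidean distance between two flattened flow vector patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(pattern, ground_truth):
    """Return (label, distance) of the reference pattern closest to `pattern`.

    ground_truth maps an emotion label to a list of reference patterns.
    """
    best_label, best_dist = None, float("inf")
    for label, references in ground_truth.items():
        for ref in references:
            d = distance(pattern, ref)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Hypothetical reference patterns: (vx, vy) for two flow regions, flattened.
reference = {
    "happiness": [[1.0, 2.0, -1.0, 0.0]],
    "surprise": [[0.0, -3.0, 0.0, -3.0]],
}
label, dist = classify([0.9, 2.1, -0.8, 0.1], reference)
print(label, round(dist, 3))
```

A threshold on the returned distance could be used to reject patterns that are far from every reference, for example a neutral face; whether the authors do so is not stated in this section.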

In our implementation we capture depth information of the scene as well as color images from a stereo camera pair. In particular we consider the set of points W as the 3D scene representation (Eq. 1). This represents the input data for surface model generation and ongoing analysis. Further, we use standard stereo-photogrammetric means in order to perform transformations between the image space and 3D space of the scene.

\[ W = (p_1, \ldots, p_n), \quad p \in 3D. \tag{1} \]

A. Surface Model Generation

In the presented approach we use a polygonal mesh surface model for determining the current face pose and creating the face normalization. The model is created automatically in an initial step. There are several possibilities for creating a surface description of the face, e.g. accurate striped lighting methods or morphable models [1]. These approaches have the burden of disturbing light projection or a high amount of manual interaction. In contrast to previous work [12], the presented concept requires only a rough description of the face shape, which can therefore be obtained from a frontal capture with the mouth closed, using the passive range sensor. We apply a stereo based face localization technique that uses color information and a 3D clustering algorithm with a subsequent mesh reconstruction [12]. This reconstruction is referred to as the personalized surface model M (Eq. 2). The polygon mesh structure is defined by a set of vertices a_j and associated normal vectors b_j.

\[ M = (a_1, b_1, \ldots, a_n, b_n), \quad a_i \in 3D, \; b_i \in V. \tag{2} \]

Additionally, we assign a skeleton to the model, which is attached at four significant points that are well detectable in the initial frontal image, i.e. left and right pupil plus left and right corner of the mouth. We determine these points from color and associated 3D information on the basis of so-called horizontal and vertical projections (HP and VP) [13] (Fig. 2). This search starts at the nose, which is obtained from the 3D data; the mouth and eyes are localized by performing HP and VP in feature-optimized, synthetic color spaces as well as in the gradient image [13]. The skeleton points are assigned to the surface model M and are used as a basis to perform facial feature extraction.

Fig. 2. a) Automatic initialization of skeleton using the principle of horizontal and vertical projection, b) Surface Model M with skeleton projection that is required for feature extraction

B. Model Matching and Face Pose Estimation

The majority of work on face pose estimation is based on the determination of rigid body motion in six degrees of freedom. These are translation and rotation. Analogously, we infer face pose from geometric alignment of the person-specific surface model M (Eq. 2) and point set W (Eq. 1) from stereo measurement. We use a variant of the Iterative Closest Point (ICP) algorithm [11] including a normal constraint. In the ICP algorithm, correspondence between the closest points of the two sets of 3D data structures, i.e. point cloud and geometrical model, is established while the distance error between them is minimized. In the ICP procedure we determine pose vector T (Eq. 3), which contains the optimal translation and rotation alignment parameters for model M.

\[ T = (t_x, t_y, t_z, r_x, r_y, r_z), \quad T_{4 \times 4} \text{ - transformation matrix w.r.t. } T. \tag{3} \]

In the first ICP step we determine the face position (translation t_x, t_y, t_z) inside point cloud W using a 3D based face localization algorithm [12]. In this procedure point cloud W is divided into a set of clusters. The separation is based on a Euclidean distance measure. The cluster representing the face is automatically determined from size, position and color (Fig. 3). The cluster center represents the ICP starting value for translation. As this is performed at each time step, the system easily recovers if the head is rotated too much or moves out of the scene.

Fig. 3. Example for 3D based face detection based on clustering of point cloud W, clusters are represented by different colors

The ICP principle applied is as follows:

Initialize translation parameters of pose vector T from 3D face detection.
Let W = {p_1, …, p_n} (Eq. 1) be a set of points p_i and M (Eq. 2) the surface model with vertices a_j and associated normals b_j.
Let CP(p_i, a_j) be the closest vertex a_j in M to a point p_i.
1. Let T^[0] be an initial transformation estimate (Eq. 3).
2. Repeat for k = 1...k_max or until convergence:
   Compute the set of corresponding pairs S:
   \[ S = \bigcup_{i=1}^{m} \left\{ \left( p_i, \; CP\left( T^{[k-1]}(p_i), a_j \right) \right) \right\}. \]
   Compute the new transformation T^[k] that minimizes error metric E (Eq. 4) with respect to all pairs S.

Correspondences between vertices a_j ∈ M and points p_i ∈ W are based on closest point search using a kd-tree, which enables a fast search for closest points [11]. We apply least squares error metric E as the criterion to minimize the distance d_j from each vertex a_j to the plane containing the point p_i and oriented perpendicular to the vertex normal b_j (Eq. 4, Fig. 4a).

\[ E = \sum_{j=1}^{m} (d_j)^2, \qquad d_j = \left( T_{4 \times 4} \cdot a_j - p_i \right) \cdot b_j. \tag{4} \]

Least-squares equations are solved by linearizing the rotations of the model transformation. The search for model alignment is conducted by iterative generation of corresponding point sets using the current transformation parameters and finding new parameters that minimize error metric E until convergence (Fig. 4b). After this alignment the pose of the model corresponds to the position and orientation of the real face.

Fig. 4. ICP, a) Orthogonal distance minimization, b) Surface model fit, c) Pose

Fig. 5. 3D to 2D Processing, a) Rasterization, b) Normalized face, c) Skeleton
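The iterate-correspond-minimize loop of ICP can be sketched in a heavily simplified form. The block below is not the variant described in the text: it estimates translation only, uses a point-to-point rather than a point-to-plane (normal-constrained) error, and replaces the kd-tree with a brute-force closest-vertex search. The function names and the synthetic data are illustrative assumptions.

```python
def closest_vertex(p, vertices):
    """Brute-force closest vertex to point p (a kd-tree would be used in practice)."""
    return min(vertices, key=lambda a: sum((pc - ac) ** 2 for pc, ac in zip(p, a)))


def icp_translation(points, model, t0=(0.0, 0.0, 0.0), k_max=20, tol=1e-9):
    """Translation-only, point-to-point ICP aligning `model` to `points`."""
    t = list(t0)
    for _ in range(k_max):
        # Apply the current transformation estimate to the model vertices.
        moved = [tuple(ac + tc for ac, tc in zip(a, t)) for a in model]
        # Correspondence step: pair every measured point with its closest vertex.
        pairs = [(p, closest_vertex(p, moved)) for p in points]
        # Minimization step: for pure translation, the least-squares update
        # is the mean residual over all pairs.
        delta = [sum(p[c] - a[c] for p, a in pairs) / len(pairs) for c in range(3)]
        t = [tc + dc for tc, dc in zip(t, delta)]
        if max(abs(dc) for dc in delta) < tol:
            break
    return tuple(t)


# Synthetic check: a 3x3x3 grid as the "model", shifted by a known offset.
model = [(float(x), float(y), float(z))
         for x in range(3) for y in range(3) for z in range(3)]
offset = (0.2, -0.1, 0.3)
points = [tuple(ac + oc for ac, oc in zip(a, offset)) for a in model]

t_est = icp_translation(points, model)
print(t_est)
```

With rotations included, the minimization step becomes the linearized least-squares problem mentioned in the text, and the residual would be projected onto the vertex normal as in Eq. 4.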

C. 3D-Based Face Normalization

With the position and orientation of the face known, it is possible to synthesize a standardized, frontal view of the individual face. This rendering is based on rasterization, in which the surface model is converted from a mesh representation to a pixel representation according to an image raster. There are various techniques known from computer graphics for rasterizing 3D objects, e.g. raycasting techniques. We use hardware based OpenGL rasterization [15], which is a quick solution that can be realized with standard graphics cards. In a pre-processing step, mesh model M is sampled in the frontal viewing direction (Fig. 5a). Then the color is back-projected onto the surface from the stereo images and used to re-render the face in a frontal pose (Fig. 5b). Additionally, self-occlusion of the surface model is detected in a second rasterization step with emulated real camera parameters. Small occlusions do not disturb the subsequent motion detection. However, large occlusions must be removed with data from additional cameras, which can be integrated in our framework. With the normalized image of the face, illumination correction can also be applied quickly [18], since the face is already segmented. Then, the only variance in the image is due to changes of facial expression and no longer due to changing pose or illumination. Feature localization and tracking are greatly simplified due to the fact that the face has a standardized size and orientation. In particular, we project the skeleton associated with model M onto the normalized face, which presents the basis for the defined set of physiologically motivated face regions. These in turn are the basis for subsequent facial motion analysis and classification (Fig. 5c).

D. Feature Extraction - Facial Motion Detection

Facial motion is caused by muscle contractions. A large number of facial muscles contribute to facial expression. Ekman [10] proposed the facial action coding system, which was developed to taxonomize every conceivable human facial expression. It is the most popular standard currently used to systematically categorize the physical expression of emotions. In frontal normalized face images we found a set of n=12 physiologically motivated regions, so-called Flow Regions (FR), related to a subset of about twelve facial muscles, to be appropriate for classifying the six basic emotions from optical flow analysis in the normalized face (Fig. 6a). These simplified, rectangular regions are determined with the help of the skeleton that is associated with the normalized face image (Fig. 6b). Given the sufficient image resolution and the face normalization, we use this simplification of the underlying muscles, which fulfills the requirements for adequate emotion determination. Muscle motion is determined for each flow region using a version of the well-established two-frame differential method by Lucas and Kanade [9], which is commonly referred to as optical flow estimation. Based on local Taylor series approximations, this method calculates the motion between two consecutive images. To reduce the amount of highly similar information and to decrease computational cost, we always compute the optical flow at the corners of a grid with a raster width of wG=4 pixels (Fig. 6c). This leads to a set S of flow vectors for each region at any frame t (Eq. 5) and to a significant reduction of the processing time needed to calculate the displacement vector field over all defined regions.

$$ S = \{ v_0(t), \ldots, v_{n_j}(t) \}, \quad v \in \mathbb{R}^{2}. \tag{5} $$

To achieve better homogeneity of the flow vectors and to remove outliers caused by small jitter that may occur in the normalized face, we average each 2D vector v at time t over the nacc most recent frames (including the current one), leading to the vector vacc (Eq. 6). The number of accumulations depends on the frame rate of the capturing system (we use nacc=5).

$$ v_{acc}(t) = \frac{1}{n_{acc}} \sum_{i=1}^{n_{acc}} v(t - i + 1), \quad v \in \mathbb{R}^{2}, \; t > n_{acc}, \tag{6} $$

where nacc is the number of accumulations.
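The temporal accumulation of Eq. 6 can be illustrated with a short NumPy sketch. This is only a minimal example under stated assumptions: the function name `accumulate_flow` and the array layout (frames, vectors, 2) are hypothetical choices made here, and the synthetic jitter data is not taken from the paper.

```python
import numpy as np


def accumulate_flow(flow_history, n_acc=5):
    """Average each flow vector over the n_acc most recent frames (Eq. 6).

    flow_history: array-like of shape (T, N, 2), i.e. N two-dimensional flow
    vectors for each of T consecutive frames, newest frame last. This layout
    is an assumption made for the sketch, not something the paper specifies.
    """
    history = np.asarray(flow_history, dtype=float)
    if history.ndim != 3 or history.shape[2] != 2:
        raise ValueError("expected an array of shape (T, N, 2)")
    if history.shape[0] < n_acc:
        raise ValueError("need at least n_acc frames")
    # v_acc(t) = 1/n_acc * sum_{i=1..n_acc} v(t - i + 1)
    return history[-n_acc:].mean(axis=0)


# Synthetic example: constant motion (1, 0) disturbed by zero-mean jitter.
rng = np.random.default_rng(0)
frames = np.tile([1.0, 0.0], (5, 4, 1)) + rng.normal(0.0, 0.1, (5, 4, 2))
v_acc = accumulate_flow(frames, n_acc=5)
print(v_acc)
```

Averaging over a fixed window trades a small delay for smoother vectors, which is why the window length would be chosen with the frame rate in mind.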



The set of all accumulated flow vectors of each region reflects the current image motion induced by the underlying facial muscles. To further reduce outliers, we condense the nj flow vectors of each flow region into one average motion vector vmean (Eq. 7).

$$ v_{mean}(t) = \frac{1}{n_j} \sum_{i=0}^{n_j} v_{acc,i}(t), \quad v \in \mathbb{R}^{2}, \; t > n_{acc}. \tag{7} $$

Fig. 6. Features, a) Facial muscles and their projection to facial regions, b) Regions along the skeleton, c) Flow grid and accumulated optical flow

E. Emotion Classification

Facial expressions that are associated with emotional states are similar across different humans and also across different cultures. In the 1970s the psychologist Ekman [10] found evidence to support universality in facial expressions, namely those representing happiness, sadness, anger, fear, surprise and disgust. Our approach uses this fact and creates a physiologically motivated ground truth of facial motion that is related to expressions of emotion (Fig. 7). With the presented motion estimation approach we found significant similarities for the same expression but clear variations across different expressions. Thus, the similarity between the current motion and the ground truth is used to draw conclusions about the facial expression. In particular, we used a database of strictly frontal face videos, which contains presentations of the six basic emotions shown by 20 different persons. Analogously to the facial motion estimation procedure, we determined the skeleton and the average motion vector u for each region and expression across different persons (analogously to Eq. 7). This results in a characteristic pattern of motion vectors Uk for each emotion expression k (Eq. 8, Fig. 7). We will refer to this as the motion pattern.

$$ U_k = \{ u_{1,k}, \ldots, u_{n,k} \}, \quad u_{j,k} \in \mathbb{R}^{2}. \tag{8} $$

We consider uj,k as the ground truth motion vector for flow region j and motion pattern k, and vj as the average motion vector of region j during measurement. Each motion pattern has a characteristic distribution of vectors across the set of flow regions. To maximize the information gained from the evaluated flow regions (FR), we introduce a table of weights ωj,k for all regions j and corresponding motion patterns k (Eq. 9, Table 1). Thus, we rate those regions higher that contain a more distinct ground truth. For that purpose we analyze the deviation between the ground truth motion vector angles, where m is the number of motion patterns.

$$ \omega_{j,k} = \frac{1}{2\pi (m - 1)} \sum_{l=1,\, l \neq k}^{m} \arccos\!\left( \frac{u_{j,k} \cdot u_{j,l}}{|u_{j,k}|\,|u_{j,l}|} \right). \tag{9} $$

The current measurement v needs to have a minimal motion activity M > Mmin. Here M is the sum of the vector lengths across all flow regions (Eq. 10) and Mmin is the activation threshold. If M is less than Mmin, the facial motion activity is too small for classification (Fig. 8).

$$ M = \sum_{j=0}^{n} |v_j|, \quad v_j \in \mathbb{R}^{2}. \tag{10} $$

Here we use a nearest neighbor classification based on a distance measure f, which evaluates the match between ground truth and measurement (Eq. 12). In particular, the match corresponds to the motion vector direction. For this purpose we determine the angle φ between ground truth and measurement (Eq. 11).

$$ \cos\varphi_{j,k} = (u_{j,k} \cdot v_j)\,(|u_{j,k}|\,|v_j|)^{-1}. \tag{11} $$

$$ f_{j,k} = 1 - \frac{\varphi_{j,k}}{\varphi_{max}}, \quad f_{j,k} \in [0, 1], \tag{12} $$

where j denotes the region, k the motion pattern and φmax the maximum angle. For each motion pattern k the distance measure f is weighted and accumulated across all regions, which gives a corresponding matching value E (Eq. 13). Thus, we get a result for the matching against all six basic emotions.

$$ E(k) = \left( n \sum_{j=0}^{n} \omega_{j,k} \right)^{-1} \cdot \sum_{j=0}^{n} \omega_{j,k}\, f_{j,k}. \tag{13} $$

Even though different emotions can cause similar facial movements, they are clearly distinguishable in the overall combination. The motion pattern kc with the highest matching value E represents the classification result (Eq. 14). If the matching value is below the threshold Emin, no characteristic expression could be identified.

$$ E(k_c) > E(j), \quad \forall j, \; j \in (1, \ldots, m) \wedge j \neq k_c. \tag{14} $$
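The weighting and matching steps (Eqs. 9 to 14) can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' implementation: the function names, the array shapes, the choice φmax = π and the threshold values are assumptions made here. The sketch also normalizes the matching value by the weight sum only, so that it stays in [0, 1]; any constant factor in the normalization of Eq. 13 scales all E(k) equally and does not change which pattern wins.

```python
import numpy as np

PHI_MAX = np.pi  # assumed maximum angle; the text does not give its value


def angles(u, v):
    """Angle in [0, pi] between corresponding 2D vectors (Eq. 11)."""
    dot = np.sum(u * v, axis=-1)
    norm = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    return np.arccos(np.clip(dot / np.maximum(norm, 1e-12), -1.0, 1.0))


def region_weights(U):
    """Weights from the angular spread of the ground truth (Eq. 9).

    U has shape (m, n, 2): m motion patterns, n flow regions.
    Returns an array of shape (m, n) indexed as [k, j].
    """
    m = U.shape[0]
    w = np.zeros(U.shape[:2])
    for k in range(m):
        for l in range(m):
            if l != k:
                w[k] += angles(U[k], U[l])
    return w / (2.0 * np.pi * (m - 1))


def classify(U, W, v, m_min, e_min):
    """Return (k_c, E); k_c is None if activity or match is too low."""
    activity = np.linalg.norm(v, axis=1).sum()          # Eq. 10
    if activity <= m_min:
        return None, None
    phi = angles(U, v[None, :, :])                      # shape (m, n)
    f = np.clip(1.0 - phi / PHI_MAX, 0.0, 1.0)          # Eq. 12
    E = (W * f).sum(axis=1) / W.sum(axis=1)             # weighted match
    k_c = int(np.argmax(E))                             # Eq. 14
    if E[k_c] < e_min:
        return None, E
    return k_c, E


# Toy ground truth: three patterns, two regions (purely illustrative values).
U = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[-1.0, 0.0], [0.0, -1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
W = region_weights(U)
k_c, E = classify(U, W, 2.0 * U[2], m_min=1.0, e_min=0.5)
print(k_c, E)
```

Because only vector directions enter the match, scaling the measurement changes the activity test but not the matching values.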

Fig. 7. The six basic emotions, optical flow field for the defined regions and average motion patterns (ground truth)

Table 1. Ground truth motion vector angles φj,k (rad) and normalized weights ωj,k, which are an empirical result from the evaluation of an emotion labeled database.

Motion pattern k | j=1: φ1,k | j=1: ω1,k | j=2: φ2,k | j=2: ω2,k | … | j=12: φ12,k | j=12: ω12,k
Anger (k=1) | 1.9 | 0.11 | 1.35 | 0.13 | … | 4.56 | 0.08
Disgust (k=2) | 1.7 | 0.13 | 1.57 | 0.15 | … | 4.51 | 0.090
Fear (k=3) | 4.6 | 0.08 | 4.61 | 0.08 | … | 2.37 | 0.11
Sadness (k=4) | 4.2 | 0.09 | 5.31 | 0.08 | … | 5.84 | 0.123
Joy (k=5) | 3.1 | 0.10 | 5.48 | 0.09 | … | 3.96 | 0.11
Surprise (k=6) | 4.8 | 0.08 | 4.71 | 0.07 | … | 0.09 | 0.11

IV. RESULTS

This section discusses preliminary results of the proposed approach as applied to natural faces (Fig. 8, 9). The input is a stereo color image sequence and the output is the normalized face and the recognized emotion. Face detection, pose estimation, normalization and feature extraction, and the interplay of these components, are for the most part new and allow the emotion recognition step to be performed reliably under changing PIE conditions. Experimental results on image sequences are presented, which demonstrate this for different persons and emotions. The emotion recognition is based on optical flow information. To obtain more homogeneous and reliable flow vectors we accumulate each vector (Fig. 8). The resulting motion vector field is smooth and enables the subsequent emotion classification (Fig. 8b), which is carried out by a motion-based nearest neighbor classification. This in turn leads to a fast and robust recognition of emotion expressions using the defined classification criterion (Eq. 13).

Fig. 8. Example sequence, a) Motion analysis, b) Classification (Panels: a) Muscle Activation / Flow Regions over frame f, with the activation threshold Mmin and the phases Stimulation, Application and Relaxation; b) Classification Fitness Function over frame f for Surprise, Fear, Sadness, Joy, Disgust and Anger; c) Flow Vectors Examples at f = 3 and f = 13.)

This classification criterion is calculated for the defined regions mentioned above and weighted by the reliability according to Eq. 9, which leads to a clear improvement of the matching quality. The matching takes place against all six basic expressions (Fig. 8). The motion pattern with the highest matching value E represents the classification result according to Eq. 14. If the current measurement has a low motion activity, as at the beginning of the muscle activation (the so-called Stimulation Phase), no classification is performed (Fig. 8a). Further, if the matching value is below a minimal threshold, no characteristic expression could be identified. An emotion classification takes place once the activation threshold is exceeded (the so-called Application Phase). The Application Phase is followed by the Relaxation Phase, in which emotion classification is not assured, since the motion vector field is not significant here. To improve the analysis in this phase, with periods of static facial expression, it may be helpful to complement the dynamic expression classification with information about static facial feature properties, in particular the opening degree of the eyes and mouth (Fig. 8). This can easily be determined in the normalized face, for instance through a polynomial fit along the vertical profile, as shown in the example figures. This expresses the absent movement and leads to the decision about the last muscle activation and emotion recognition.

Fig. 9. a-f) Examples for different persons and facial expressions, Normalization, Flow Regions and average motion vectors, Motion patterns and classification results with the highest and clearly distinct matching value. (Classification results shown in the panels: kc = 2 (Disgust), E(kc) = 0.83; kc = 6 (Surprise), E(kc) = 0.95; kc = 2 (Disgust), E(kc) = 0.85; kc = 5 (Joy), E(kc) = 0.90; kc = 5 (Joy), E(kc) = 0.91; kc = 1 (Anger), E(kc) = 0.89.)

V. CONCLUSION AND VIEW

An automatic approach for the recognition of basic emotion expressions has been presented. Incorporating stereo and color information, our approach automatically detects the face of the user in unconstrained pose and creates a normalized face image. This is done on the basis of a user-specific surface model, which is automatically created in an initial step. The model is used in an Iterative Closest Point algorithm to estimate the face pose. Further, the model is rasterized, and color is back-projected onto the surface from the stereo images and used to re-render the face in a frontal pose. In normalized face images we found a set of physiologically motivated face regions, so-called flow regions, related to a subset of facial muscles, to be suitable for automatically detecting the six well-known basic emotions. This is derived from facial motion detection realized by optical flow calculation with respect to the face regions. Classification of emotion expressions is based on a weighted fitness criterion between empirically determined ground truth motion patterns and the currently detected facial motion. Facial motion based emotion classification has the benefit of universality across different people with a multitude of face appearances, which constrain static approaches that do not consider the temporal context. In contrast to other optical flow based works, our approach has the benefit of face normalization, which compensates for head motions and facilitates the correction of changes in illumination that would otherwise disturb the optical flow calculation and therefore constrain the applicability.

ACKNOWLEDGMENT

Image material was kindly provided by the group of Prof. Traue from the University of Ulm.

REFERENCES

[1] Li, S.Z., Jain, A.K.: Handbook of Face Recognition, ISBN: 0-387-40595-X, 2005.
[2] Fasel, B., Luettin, J.: Automatic facial expression analysis: a survey. Pattern Recog. 36, 259–275, 2003.
[3] Pantic, M., Rothkrantz, L.J.M.: Automatic analysis of facial expressions: the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1424–1445, 2000.
[4] Essa, I.A., Pentland, A.P.: Coding, analysis, interpretation, and recognition of facial expressions. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 757–763, 1997.
[5] Rosenblum, M., Yacoob, Y., Davis, L.S.: Human expression recognition from motion using a radial basis function network architecture. IEEE Trans. Neural Netw. 7(5), 1121–1138, 1996.
[6] Cohen, I., Sebe, N., Garg, A., Chen, L., Huang, T.: Facial expression recognition from video sequences: Temporal and static modeling. CVIU 91(1-2), 160–187, 2003.
[7] Black, M.J., Yacoob, Y.: Tracking and recognizing rigid and nonrigid facial motions using local parametric models of image motion. In: Proceedings of the International Conference on Computer Vision, pp. 374–381, 1995.
[8] Oliver, N., Pentland, A., Berard, F.: LAFTER: a real-time face lips tracker with facial expression recognition. Pattern Recog. 33, 1369–1382, 2000.
[9] Lucas, B., Kanade, T.: An Iterative Image Registration Technique with an Application to Stereo Vision. Proc. of 7th International Joint Conference on Artificial Intelligence (IJCAI), pp. 674–679, 1981.
[10] Ekman, P.: Strong evidence for universals in facial expressions: a reply to Russell’s mistaken critique. Psychol. Bull. 115(2), 268–287, 1994.
[11] Rusinkiewicz, S., Levoy, M.: Efficient variants of the ICP algorithm. Proc. of the 3rd Int. Conf. on 3D Digital Imaging & Modeling, pp. 145–152, 2001.
[12] Niese, R., Al-Hamadi, A., Michaelis, B.: A Stereo and Color-based Method for Face Pose Estimation and Facial Feature Extraction. ICPR, pp. 299–302, 2006.
[13] Al-Hamadi, A., Panning, A., Niese, R., Michaelis, B.: A Model-based Image Analysis Method for Extraction and Tracking of Facial Features in Video Sequences. CSIT, Amman, pp. 502–512, 2006.
[14] Valstar, M.F., Pantic, M.: Fully automatic facial action unit detection and temporal analysis. Proceedings of IEEE Int’l Conf. Computer Vision and Pattern Recognition (CVPR’06), vol. 3, p. 149, New York, USA, June 2006.
[15] Woo, M., Neider, J., Davis, T.: OpenGL Programming Guide, 2nd Edition, 1997.
[16] Tian, Y.L., Kanade, T., Cohn, J.F.: Facial Expression Analysis. In: Handbook of Face Recognition, Li, S.Z. and Jain, A.K., Eds. Springer, New York, 2005.
[17] Bartlett, M.S., Littlewort, G., Frank, M.G., Lainscsek, C., Fasel, I., Movellan, J.: Fully automatic facial action recognition in spontaneous behavior. In: Proc. Conf. Automatic Face & Gesture Recognition, pp. 223–230, 2006.
[18] Ebner, M.: Color Constancy, Wiley & Sons, ISBN-13: 9780470058299, 2007.

ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Magnus Karlsson and Shaofang Gong: A Frequency-Triplexed Inverted-F Antenna System for Ultra-wide Multi-band Systems 3.1-4.8 GHz


A Frequency-Triplexed Inverted-F Antenna System for Ultra-wide Multi-band Systems 3.1-4.8 GHz

Magnus Karlsson, and Shaofang Gong, Member, IEEE

Abstract—A fully integrated triplex antenna system for multi-band UWB 3.1-4.8 GHz is presented. The system utilizes a microstrip network and three combined broadside- and edge-coupled filters to connect three inverted-F antennas in parallel. The triplexed antenna system is fully integrated in a printed circuit board with low requirements on the printed circuit board process tolerance. The group delay variation within the triplexer was measured to be less than 1 ns. Furthermore, a good agreement between simulation and measurement results was observed.

Index Terms—Bandpass filter, Broadside coupled, Edge coupled, Frequency multiplexing, Inverted-F antenna, Triplexer, UWB

I. INTRODUCTION

Ultra-wideband radio (UWB) has gained popularity in recent years [1]-[8]. The interest in high-speed, short-range wireless communication is one of the major driving forces behind the UWB development [2]-[5]. Ever since the effort to achieve one sole UWB standard halted in early 2006, two dominating UWB specifications have remained as top competitors [8]. One is based on the direct sequence spread spectrum technique, supported by the UWB Forum [4], [7]-[8]. The other is based on the multi-band orthogonal frequency division multiplexing technique (also known as “WiMedia UWB”, supported by the WiMedia Alliance) [5]-[6], [9]. The multi-band specification divides the frequency spectrum into 500 MHz sub-bands (528 MHz including guard carriers and 480 MHz without guard carriers, i.e., 100 data carriers and 10 guard carriers). Three sub-bands are mandatory, centered at 3.432, 3.960, and 4.488 GHz, respectively [4]-[8]. The total bandwidth of this so-called Mode 1 band group is from 3.1 to 4.8 GHz. Orthogonal frequency division multiplexing has the advantage of inherent robustness against gain, phase, and group delay variations [9]-[10]. Multiplexing techniques in transceiver structures with various purposes have previously been presented [11]-[16]. For instance, duplex solutions for simultaneously transmitting and receiving are well known [12], [14]. The use of frequency multiplexing techniques to enhance beam scanning and target detection has also been presented before [11], [15]. Furthermore, frequency multiplexing techniques are used in multiple input multiple output (MIMO) systems [16]. If a sufficient bandwidth cannot be reached with a specific type of antenna geometry, resistive loading has commonly been used to extend the impedance bandwidth, but it reduces the antenna efficiency [17]. This paper instead presents a study of a triplexed antenna system for UWB systems, using our integrated triplexer for Mode 1 multi-band UWB. A brief theoretical study of how to extend the frequency bandwidth with the multiplexing technology has been presented in [18]-[19]. The design is demonstrated using a conventional printed circuit board technology.

Manuscript received July 5, 2007. Ericsson AB in Sweden is acknowledged for financial support of this work. Magnus Karlsson; email: [email protected], and Shaofang Gong are with Linköping University, Sweden.

II. OVERVIEW OF THE SYSTEM

All prototypes were manufactured using a four metal layer printed circuit board. Two dual-layer RO4350B boards were processed together with a RO4450B prepreg, as shown in Fig. 1. The RO4450B prepreg is made of a sheet material (e.g., glass fabric) impregnated with a resin cured to an intermediate stage, ready for multi-layer printed circuit board bonding.

Table 1. Printed circuit board parameters

Parameter (Rogers 4350B) | Dimension
Dielectric height | 0.254 mm
Dielectric constant | 3.48±0.05
Dissipation factor | 0.004

Parameter (Rogers 4450B) | Dimension
Dielectric height | 0.200 mm
Dielectric constant | 3.54±0.05
Dissipation factor | 0.004

Parameter (Metal, common) | Dimension
Metal thickness, layer 1, 4 | 0.035 mm
Metal thickness, layer 2, 3 | 0.025 mm
Metal conductivity | 5.8x10^7 S/m (Copper)
Surface roughness | 0.001 mm

Table 1 lists the printed circuit board parameters, and Fig. 1 illustrates the stack of the printed circuit board layers. Metal layers 1 and 4 are thicker than metal layers 2 and 3 because the surface layers are plated twice (the embedded metal layers 2 and 3 are plated once).



Fig. 1. Printed circuit board structure. (Layer stack: Metal 1, antennas and filter beginning and end; RO4350B; Metal 2, filter; RO4450B; Metal 3, ground; RO4350B; Metal 4, free for further expansion.)

A. Antenna system

As shown in Fig. 2a, the designed antenna system consists of three printed inverted-F antennas and a triplexer network. Fig. 2b shows a photograph of the implementation. The triplexer structure is explained in section II-C. The SMA connector (Port 1) soldered from the backside of the printed circuit board is partially seen in the photograph. The prototype has a size of 90 x 58 mm, but note that the actual design only partially fills the printed circuit board.

Fig. 2. Triplex antenna system: (a) block diagram, and (b) photo of the triplexer antenna implementation. (Blocks in (a): UWB frontend, triplexer, and the antennas for Sub-band #1, Sub-band #2 and Sub-band #3; Port 1 is marked in (b).)

B. Principle of the printed inverted-F antenna

Fig. 3 shows one of the modified inverted-F antennas. Feeding is done using a microstrip line with a characteristic impedance of 50 Ω. A via is used to connect the antenna to the ground plane. The length of the antenna is one quarter-wavelength at the center frequency of a sub-channel [20]-[21]. Since the open end has high impedance and the shorted end has zero impedance, a desired feed point impedance (50 Ω in this implementation) can be found somewhere on the line [20]-[23].

Fig. 3. Modified inverted-F antenna, antenna in x-y plane, φ=0°. (Labels: λ/4, 4 mm, via, ground, feed line, feed point; axes X, Y, Z with angles theta (θ) and phi (φ).)

C. Triplex network

Fig. 4 shows the schematic of the proposed triplexer network. The network is realized with the microstrip technology. The triplexer consists of three series quarter-wavelength transmission lines, three bandpass filters, and three transmission lines for filter tuning. The series transmission lines provide a high impedance at the respective frequency band. The filter tuning lines optimize the stop band impedance of each filter, i.e., they provide a high stop band impedance in the neighboring bands. The network is optimized together with the filters to achieve uniform passband performance within the sub-bands. Furthermore, since the sub-bands are so close to each other, a sharp bandpass function was prioritized over low reflection.

Fig. 4. Principle of the triplexer. (Port 1: array input. Port 2: sub-band #1, λ/4 @ 3.432 GHz. Port 3: sub-band #2, λ/4 @ 3.960 GHz. Port 4: sub-band #3, λ/4 @ 4.488 GHz. Elements: bandpass filter, transmission line (T-line), T-line for stopband tuning.)

D. The triplexer filter structure

Figs. 5a and 5b show the principle of the broadside- and edge-coupling techniques, respectively. All implementations are made using the microstrip technology. Fig. 5c shows the filter structure used in the triplex antenna system. The start and the stop segments are placed on metal layer 1, while the rest of the filter is placed on metal layer 2, resulting in a fifth-order bandpass filter, two orders from broadside-coupling and three orders from edge-coupling.
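The quarter-wavelength lengths mentioned for the antennas and the series transmission lines can be estimated with a short calculation. The sketch below is only an illustration: the function name is hypothetical, and the effective permittivity `EPS_EFF_GUESS` is a placeholder value chosen here, since the actual effective permittivity depends on the line geometry and is not stated in the text.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def quarter_wavelength_mm(freq_hz, eps_eff=1.0):
    """Quarter wavelength in millimetres for a wave slowed by sqrt(eps_eff)."""
    if freq_hz <= 0 or eps_eff <= 0:
        raise ValueError("frequency and permittivity must be positive")
    return C0 / (4.0 * freq_hz * eps_eff ** 0.5) * 1e3


# Center frequencies of the three mandatory sub-bands (from the text).
SUB_BAND_CENTERS_HZ = (3.432e9, 3.960e9, 4.488e9)

# Hypothetical effective permittivity, for illustration only.
EPS_EFF_GUESS = 2.7

for f in SUB_BAND_CENTERS_HZ:
    free_space = quarter_wavelength_mm(f)
    guided = quarter_wavelength_mm(f, EPS_EFF_GUESS)
    print(f"{f / 1e9:.3f} GHz: {free_space:.2f} mm in free space, "
          f"{guided:.2f} mm for eps_eff = {EPS_EFF_GUESS}")
```

The length scales with the inverse of the frequency, so the element for the highest sub-band is the shortest of the three.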

Fig. 5. Filter structures: (a) broadside-coupling, (b) edge-coupling, and (c) combined broadside- and edge-coupled filter. (Layers shown: Metal 1, Metal 2, Ground; microstrip line.)

Fig. 6. Inverted-F antenna: (a) VSWR simulation, and (b) VSWR measurement. (Curves #1, #2 and #3 for the separate antennas; axes: VSWR versus frequency, 2 to 6 GHz.)

(a) VSWR simulation of the system. (Curves #1, #2 and #3; axes: VSWR versus frequency, 2 to 6 GHz.)

A. Antenna system and antennas

Fig. 6a shows the VSWR simulation of the three inverted-F antennas. Fig. 6b shows the corresponding measurement. It is seen that all three antennas with a good margin provide VSWR [...]

[...] E, the scattering function is negative and the gravitational field defined by U0s will yield a repulsive force.

γ > 0 ⟹

XIV. PRINCIPLE OF EIGENFIELD TENDENCY: QUANTUM MECHANICS REVISITED

Given the approach considered in this paper, an eigenfield tendency principle is required in order to explain the properties of matter as described by Schrödinger’s equation (in the non-relativistic case) as originally conceived by Schrödinger [6]. For different potential energy functions Ep(r), it is well known that this equation describes eigenfield systems that can be used to model the properties of matter through the principles of quantum mechanics (in the full context of the subject). The original reason for deriving the Schrödinger scattering function was so that the asymptotic behaviour of a scattered Helmholtz wavefield (i.e. when ω → 0) could be examined. However, the consequence of this is that the Helmholtz equation is the governing wave equation only over a limited frequency band and that as the frequency of a wavefield increases (i.e. as ω → ω1) the Helmholtz equation reduces to the Schrödinger equation. If we consider the Schrödinger equation to represent eigenfields (at least in terms of its description of matter waves), then we can argue that at the higher end of our universal spectrum, wavefields tend to behave more and more like eigenfields. Matter is thus taken to be composed of eigenfield systems at higher and higher frequencies; first the atom, then the nucleus, then the constituents of the nucleus (the quarks) and so on. Equations such as Schrödinger’s equation and Dirac’s equation are both descriptions of eigenfield systems at different energies (non-relativistic and relativistic energies respectively). In the context of matter being an eigenfield described by solutions to Schrödinger’s equation, consider the case of a free electron and a free proton and the formation of hydrogen gas. In conventional (particle) terms, an electron and a proton have charges of equal magnitude but opposite polarity.
This attracts the particles to form a neutral hydrogen atom, an effect which requires the introduction of a field, namely, an electric field. In terms of a wavefield theory, both the electron and proton are waves. In an ionised state, the electron is a free wave and the proton (relative to the electron) is a potential which is itself an eigenfield system (consisting of a higher frequency spectrum - the ‘nuclear spectrum’). The free wavefield requires greater energy to exist in a free state and hence, based on the principle of least energy, will ‘attempt to exist’ as an eigenfield. This ‘eigenfield’ may have a number of eigenstates, each with a specific energy level. The difference in energy between the free energy state and the available eigenstate(s) provides a residual energy, i.e. a free energy wavefield with frequency E/ℏ. Once formed, the eigenfield will not share its eigenstate(s) as this will require greater energy and hence, if another electron comes into the vicinity of the neutral hydrogen atom, it will appear to undergo a repulsive force. On the other hand, since the combined eigenfields associated with two hydrogen atoms require lower energy than two separate eigenfields (i.e. two hydrogen atoms), the result is the diatomic hydrogen molecule H2 - the result of a covalent bond. In this sense, an electric field is not the product of a charge, rather it is that entity associated with the propensity for a free wavefield to become an eigen wavefield. A magnetic field is then a measure of the rate of change over which this propensity is satisfied, i.e. if U(r, t) exists such that

$$ \int\!\!\int |U(\mathbf{r}, t)|^{2} \, d^{3}\mathbf{r} \, dt $$

is a minimum, then

$$ \text{Free Wavefield} \;\longrightarrow\; \text{Eigen Wavefield}, \qquad \text{Electric field } E \;\rightarrow\; \text{Magnetic field } \frac{\partial E}{\partial t}. $$

Note that the transition described by Free Wavefield → Eigen Wavefield may have both magnitude and direction, since a free wavefield will attempt to find the shortest possible path in a three-dimensional space in order to become an eigen wavefield. An electric field will therefore appear to be a vector field. Further, if the transition has no directional preference, then an electric field will appear to have a Coulomb field strength characterised by an inverse square law. The principle of eigenfield tendency is just the principle of least energy as applied to a universal wavefield model of the type attempted in this paper. It is, however, a principle which allows us to explain an electric field without having to refer to the concept of a field being ‘radiated’ by a charge! For example, ‘electron cloud’ repulsion theory (Valence Shell Electron Pair Repulsion) is used to predict shapes and bond angles of simple molecules in which the ‘electron cloud’ may be a single, double or triple bond, or a lone pair of electrons - a non-bonding pair of electrons. The ‘electron clouds’ are taken to be negatively charged since the electrons are negatively charged, so electron clouds repel one another and try to get as far away from each other as possible. Instead of considering the electron cloud to consist of negatively charged electrons, we consider the cloud to be an eigenfield which arranges itself in such a way that it can exist in a minimum energy state, a state that affects the geometry of the molecule. In a simple hydrogen atom, for example, the eigenfield will be distributed symmetrically because, in a three-dimensional space, spherical symmetry represents the most energy-efficient configuration, which is equivalent to the electron wavefield ‘experiencing’ a Coulomb potential. The eigenfunctions that are the solutions to the Schrödinger equation for different materials will not necessarily be complete eigenfunctions. In some cases, solutions only allow for

ISAST Transactions on Electronics and Signal Processing, No. 1, Vol. 1, 2007 Jonathan M. Blackledge: An Approach to Unification using a Linear Systems Model for the Propagation of Broad-Band Signals


the existence of quasi-eigenfunctions. In conventional atomic physics, quasi-eigenfunctions are incomplete standing waves more commonly referred to as delocalised electrons. These are electrons that exist in the ‘lattice’ of a material but are free to move and provide a material with the property we refer to as conductivity. This includes materials such as various metals and chemicals (e.g. Benzene, which is composed of a ring of delocalised electrons). The principal difference between an eigenfield and a quasi-eigenfield is that a quasi-eigenfield has an energy spectrum, albeit a narrow one. The Schrödinger scattering function for matter waves is

$$ \gamma = \frac{2 m c_0^{2}}{E^{2}} (E - E_p) - 1. $$

In a macroscopic sense, Ep is the total potential energy associated with all the nuclei from which a material of compact support is composed and E is the total energy associated with the electrons. In the case of elements such as gold, the arrangement of electrons around the nucleus is such that a single electron occupies the outermost shell and is an example of a quasi-eigenfield, i.e. a relatively free wavefield (a free electron) that is only loosely bound to the host atom. Successive energy levels are contained in a small energy range dE and are so close that, in effect, a continuous energy spectrum is formed. Each energy level in this spectrum can accommodate a left-travelling and a right-travelling wave (‘spin-up’ and ‘spin-down’ electrons - Pauli’s principle) and these free electrons will distribute themselves throughout the energy band from 0 to some value E. Irrespective of any particular system, the number of possible modes of oscillation per unit volume dn in a frequency range ν to ν + dν for waves with a propagation velocity of c is given by

$$ dn = \frac{4 \pi \nu^{2} \, d\nu}{c^{3}}. $$

With E = p²/(2m) = ℏω and p = ℏω/c = E/c, then

$$ dp = \frac{\hbar \, d\omega}{c} \quad \text{and} \quad dE = \frac{p}{m} \, dp = \hbar \, d\omega. $$

The number of states per unit volume in the energy interval dE is therefore

$$ dn(E) = \frac{(2 m^{3})^{1/2} E^{1/2}}{2 \pi^{2} \hbar^{3}} \, dE $$

and thus, the total number of electrons per unit volume in the energy spectrum (0, E) is⁶

$$ n(E) = 2 \, \frac{(2 m^{3})^{1/2}}{2 \pi^{2} \hbar^{3}} \int_{0}^{E} E^{1/2} \, dE = 2 \, \frac{(2 m^{3})^{1/2}}{3 \pi^{2} \hbar^{3}} \, E^{3/2}. $$
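The expression for n(E) can be inverted to give the energy up to which states are filled for a given electron density, E = (ℏ²/2m)(3π²n)^(2/3). The following sketch evaluates both directions numerically. The function names are hypothetical, and the electron density used for gold (about 5.9×10^28 per cubic metre, a commonly tabulated free-electron value) is an assumption introduced here, not a figure from the paper.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def electron_density(energy_j):
    """n(E) = 2 (2 m^3)^(1/2) E^(3/2) / (3 pi^2 hbar^3), per cubic metre."""
    prefactor = 2.0 * math.sqrt(2.0 * M_E ** 3) / (3.0 * math.pi ** 2 * HBAR ** 3)
    return prefactor * energy_j ** 1.5


def filled_energy(n_per_m3):
    """Inverse of n(E): energy (in joules) up to which states are filled."""
    return HBAR ** 2 / (2.0 * M_E) * (3.0 * math.pi ** 2 * n_per_m3) ** (2.0 / 3.0)


# Assumed free-electron density for gold (illustrative, not from the paper).
N_GOLD = 5.9e28

e_max = filled_energy(N_GOLD)
print(f"E_max = {e_max / EV:.2f} eV")
print(f"round trip n = {electron_density(e_max):.3e} per m^3")
```

Since the energy scales as n^(2/3), reducing the number of electrons per unit volume lowers this energy, which is the dependence the subsequent discussion of the Fermi energy relies on.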

Here $m$ is taken to be the mass of an electron. Note that if the material is in a ‘ground state’ then the available electrons will occupy the lowest possible energy levels. Further, if the total number of electrons per unit volume is less than the total number of energy levels available in a band (the bandwidth of the material), then the electrons can occupy all energy states up to a maximum energy $E_{\max}$ - the Fermi energy. In this sense, the Fermi energy defines the (energy) bandwidth of a (conductive) material composed of a quasi-eigenfield.

⁶ The factor of 2 is because of Pauli’s principle.

With an atomic number of 79, gold is the heaviest of the most conductive elements in the periodic table, i.e. the product of the conductivity with the atomic number ($\sim 3.57 \times 10^7\ (\mathrm{cm}\,\Omega)^{-1}$) for gold is larger than that for any other element. If it were possible to reduce the total energy associated with the total quasi-eigenfield of gold such that $E < E_p$, then the result would be a scattering function that is negative. This requires the Fermi energy of gold to be reduced, the most influential factors being temperature and volume. Clearly, if the number of electrons per unit volume $n$ is reduced then so is the Fermi energy. In terms of a physical material, $n$ is determined by the number of atoms defining the physical extent of the material. This suggests an experimental investigation of the cryogenic properties of M-state (mono-atomic) gold. M-state gold is a white powder and is an example of a nano-material in which each of the nanometre-size grains is a cluster of a few hundred atoms. Like other M-state materials, the surface area is huge compared to the metallic (macro-crystalline) form. Thus, with the volume of each grain being small enough and the temperature of the material being low enough, it may be possible to reduce the Fermi energy to an extent where $E < E_p$ for the material as a whole.

XV. DISCUSSION

The results developed in this paper encapsulate a phenomenology where the Helmholtz equation is, in effect, being used in an attempt to develop a unified scalar wavefield theory where the wavefield $u(\mathbf{r}, \omega)$ is taken to exist over a broad range of frequencies limited only by the Planck frequency. At very high frequencies, $u$ is taken to describe matter waves which are characterised by relativistic (Klein-Gordon and Dirac equations) and non-relativistic energies (Schrödinger equation) associated with nuclear and atomic physics respectively.
At intermediate frequencies, u is taken to describe waves in the ‘electromagnetic spectrum’ and at low frequencies, u is taken to describe waves in the ‘gravity wave spectrum’. The structure of matter, the characteristics of light and other electromagnetic radiation and the properties of gravity become phenomenologically related via Helmholtz scattering over different frequency bands. Low frequency waves (gravity generating waves) are scattered by high frequency waves (matter waves) to produce a gravitational field; intermediate frequency waves (electromagnetic spectrum) are scattered by high frequency waves (e.g. a lens) but can also be scattered by the field generated from the scattering of low frequency waves to produce gravitational diffraction. In this sense, ‘physics’ becomes the study of waves interacting with waves at vastly different frequencies, the breadth of the spectrum ‘reflecting’ the instantaneous birth of the universe - the ‘big-bang’ - since it requires (noting that the Fourier transform of a δ-function is a constant over all frequency space) a short impulse to generate a broad frequency spectrum. However, in attempting to derive a ‘wavefield theory of everything’ we must reinterpret the nature of an electric field using the principle

of eigenfield tendency. Thus, instead of contemplating an electron in terms of a particle with a negative charge that ‘radiates’ an electric field and is attracted to particles with a positive charge (which also ‘radiate’ an electric field), we can visualise an electron in terms of a wave which is ‘attracted’ by the ‘requirement’ (through the minimum energy principle) of becoming an eigenfunction (a standing wave with lower energy than a free wave) whose properties are determined by the potential energy associated with the atomic nucleus which is, itself, a higher (nuclear) frequency eigenfield system (quarks). The form of the wave equation

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)U(\mathbf{r}, t) = 0$$

dictates that $c$ must be of finite value. If a wavefield (whatever the wavefield may be) were to convey information from one point in space to another instantaneously, then the second term of the above equation would be zero; the ‘wave equation’ would be reduced to ‘Laplace’s equation’ $\nabla^2 u = 0$. Einstein’s principal postulate is that the upper limit at which any wavefield can propagate is the speed of light $c_0$ in a perfect vacuum and thus $c \leq c_0$. In a more general perspective, the rationale associated with the fact that $c$ must have a finite upper bound is that the influence of any physical wavefield on any measurable entity can only occur in a finite period of time and that there can be no such thing as instantaneous ‘action at a distance’, i.e. as Isaac Newton put it:

‘That one body may act upon another at a distance through a vacuum, without the mediation of anything else, by and through which their action and force may be conveyed from one to the other, is to me so great an absurdity, that I believe no man who has in philosophical matters a competent faculty of thinking, can ever fall into it.’

Taking Newton’s own term, mediation requires the propagation (of information), but propagation at infinite speeds is not propagation and thus, we postulate that instantaneous fields are not possible, i.e.
the speed at which a wavefield propagates must be finite for a wavefield to exist. In this context, the results developed in this paper highlight the idea that the ‘physics’ of a wavefield is more fundamental than the ‘physics’ of a field. This principle should be considered in light of the fact that the one property common to the principal field equations of physics (e.g. Einstein’s equations, Maxwell’s equations, Proca’s equations) is that they all describe wave phenomena - at least in an ‘indirect’ sense. In the case of Proca’s equations, the field equations are derived with the singular aim of ensuring that they can be decoupled to yield the inhomogeneous Klein-Gordon (wave) equation. The underlying philosophy associated with the approach considered is based on a ‘waves within waves’ model, i.e. to quote an old Chinese proverb ‘In every way, one can see the shape of the sea’. This is a universal self-affine or fractal model in which the ‘fractal field’ is a scalar wavefield, a symbolic representation of the idea being given in Figure 3. As the frequency increases, a wavefield tends to become an eigenfield. This principle is required to explain the structure of matter and much of the discussion given in Section XIII is quantum mechanics revisited without the need to define an

Fig. 3. Example of fractal waves by the Japanese artist K Hokusai from the 1800s illustrating waves of different scale in both amplitude and wavelength.

electric field in terms of a charge. If we consider the structure of matter at the atomic, nuclear and sub-nuclear scales (indeed at all scales down to the scale of the Planck length) to be determined by eigenfields, then the question remains as to why eigenfield systems should ‘kick-in’ at the atomic scale. If the principle of eigenfield tendency applies at all frequencies then why do we not observe equivalent naturally occurring eigenfield systems in the electromagnetic spectrum? Perhaps we do under special circumstances, e.g. ball-lightning.

The approach to unification considered in this paper has yielded a number of questionable and speculative results. The only experimental evidence offered in support of our model for a gravitational field is a possible explanation as to why the Einstein rings associated with near field galaxies observed by the Hubble Space Telescope are blue. However, it should be noted that this ‘evidence’ is most typical of Karl Popper’s principle that all observation statements are ‘theory laden’ and that other explanations may be possible that are more appropriate in terms of established physical models. In general relativity, the curvature of space-time bends light by the same amount irrespective of the frequency - there is no dispersion relation. The $\lambda^{-6}$ scaling law associated with gravitational diffraction may be validated (or otherwise) from appropriate simultaneous observations of the same Einstein ring (complete or otherwise) at different wavelengths. Other consequences, such as a gravitational field generating a repulsive force that is proportional to the mass squared in the relativistic case, remain of theoretical consequence only. However, it is noted that inflation theory (the expansion of the early universe) requires gravity to be a repulsive force.
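To give a feel for what the $\lambda^{-6}$ scaling law quoted above would imply for a two-wavelength observation of the same ring, the following sketch evaluates the predicted ratio. It assumes the law applies directly to the observed intensity. The two wavelengths are illustrative values for ‘blue’ and ‘red’ light that do not come from the paper, and the function name is hypothetical.

```python
def diffraction_intensity_ratio(wavelength_a_nm, wavelength_b_nm, exponent=6):
    """Ratio I(a)/I(b) under an assumed power law I proportional to wavelength**(-exponent)."""
    return (wavelength_b_nm / wavelength_a_nm) ** exponent


# Illustrative wavelengths only (assumptions, not values from the paper).
BLUE_NM = 450.0
RED_NM = 700.0

ratio = diffraction_intensity_ratio(BLUE_NM, RED_NM)
print(f"Predicted I(blue)/I(red) for a lambda^-6 law: {ratio:.1f}")
```

A ratio of this size (more than an order of magnitude across the visible band) is what distinguishes the proposed law from the non-dispersive bending of general relativity, for which the corresponding ratio would be 1.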
The model considered in this paper leads to the proposition that a gravity field is regenerative and exists through the continuous scattering of existing low frequency Helmholtz wavefields. This proposition may provide an answer to the following question: If nothing can escape the event horizon of a black hole because nothing can propagate faster than light then how does gravity get out of a black hole? The conventional answer to this question is that the field around a black hole is ‘frozen’ into the surrounding space-time prior

to the collapse of the parent star behind the event horizon and remains in that state ever after. This implies that there is no need for continual regeneration of the external field by causal agents. In other words, the explanation defies causality. In the model presented here, the gravitational field generated by a black hole or any other body is the result of a causal effect - the scattering of low frequency scalar waves. In this sense, a black hole is just a stronger scatterer than other cosmological bodies and a gravitational field ‘gets out of a black hole’ because it was never ‘in the black hole’ to start with.

Propagative or wave theories of gravity have been proposed for many years. In 1805, Laplace proposed that gravity is a propagative effect and considered a correction to Newton’s law to take into account the observation that gravity has no detectable aberration or propagation delay for its action. Laplace’s ideas were advanced further by Weber, Riemann, Gauss and Maxwell in the nineteenth century using a variety of ‘corrective terms’. In 1898, Gerber developed a propagative theory that took into account the perihelion advance of Mercury and in 1906 Poincaré showed that the Lorentz transform cancels out gravitational aberration. After the success of general relativity (1916) in explaining gravity in terms of a geometric effect, propagation theories were discarded. However, more recently, attempts at explaining gravity in terms of causal effects through a ‘propagative’ force have been revisited [29] as debate over the basic Einsteinian postulates⁷ has intensified. Moreover, from Laplace to the present, propagation theories of gravity consider an object to be ‘radiating’ a field (in a passive sense).
If general relativity considers gravity to be the result of an object warping space-time, then the proposition reported in this paper is that gravity is the result of an object scattering (long wavelength) waves that already exist as part of the low frequency component of a universal spectrum which is, itself, the by-product of the ‘big-bang’. The compatibility of this approach with general relativity might be realised if the wavefield is taken to warp space-time so that space-time is the medium of propagation.

Any propagation theory of gravity must address some basic known observations: (i) gravity has no detectable aberration or propagation delay for its action, leading to effects predicted by general relativity such as gravitomagnetism; (ii) the finite propagation of light causes radiation pressure for which gravity has no counterpart pressure. These results represent the most vital evidence with regard to gravity being a geometric and not a propagative effect. For example, in an eclipse of the Sun, the gravitational pull on the Earth by this 3-body (Sun-Moon-Earth) configuration increases. By comparing the delay in time it takes to observe the visible maximum eclipse on Earth (which can be calculated from knowledge of the distance of the Moon from the Earth) with the equivalent gravitational maximum, then if gravity is a propagating force, it appears to propagate at least 20 times faster than light! [30] Irrespective of whether this value is valid or not, a fundamental issue remains, which is compounded in the question: what is the speed of gravity? If we consider gravity to be a propagation and/or a low frequency scattering effect, then in order to account for the lack of propagation delay, it must be assumed that the speed of gravity is greater than the speed of light. This is contrary to the Einsteinian postulates if these postulates are taken to apply to all wavefields irrespective of their wavelength. The model presented here assumes that the speed of gravity is the same as the speed of light $c_0$. However, the asymptotic result $k \to 0$ used to define a gravitational field yields what will appear to be an instantaneous effect from a wavefield that is taken to propagate at the speed of light. The wavelength is so long compared to the distances associated with a Sun-Moon-Earth system, for example, that the speed of gravity will appear to be significantly faster than the speed of light (i.e. U0s is observed to be an instantaneous field).

⁷ The invariance of the propagation of light in a vacuum for any observer which amounts to a presumed absence of any preferred reference frame.

XVI. FINAL COMMENTS

In terms of the fractal wavefield model considered here, the gravitational force is a consequence of very long wavelength waves and is therefore a long range force. Electromagnetism is a consequence of intermediate wavelength waves which exist as both free wavefields and eigen wavefields at the atomic scale, the transition from one to the other creating an ‘electric field’. The strong force is a consequence of a nuclear eigen wavefield where the values of $E = \hbar\omega$ and $p = \hbar k$ are in the relativistic energy limit. The weak force (associated with radioactive decay, for example) is explained in terms of the transformation of a nuclear eigen wavefield to a more stable form allowing for the emission of a free wavefield (quantum ‘tunneling effect’ when the potential barrier is low). For example, Rutherford scattering (the scattering of alpha particles from gold nuclei which historically provided the basic model for the atom) is an example of a free (nuclear) wavefield interacting with a stable eigenfield system which consequently appears to exert a repulsive Coulomb force.
At this frequency range the governing equation is Schrödinger’s equation which has a far field scattering amplitude determined by the three-dimensional Fourier transform of a Coulomb potential. Thus, as a function of the scattering angle $\theta$

$$A(\theta) = \frac{2\pi}{k\sin(\theta/2)}\int_0^\infty \sin\left[2kr\sin\left(\frac{\theta}{2}\right)\right]\gamma(r)\,r\,dr$$

and for the screened Coulomb potential⁸

$$\gamma(r) = \frac{\exp(-ar)}{r}, \quad a > 0$$

we obtain (for $a \to 0$)

$$A(\theta) = \frac{\pi}{k^2\sin^2(\theta/2)}\left[1 + \frac{a^2}{\left(2k\sin(\theta/2)\right)^2}\right]^{-1} = \frac{\pi}{k^2\sin^2(\theta/2)}.$$

The intensity (scattering cross-section) is therefore inversely proportional to $\sin^4(\theta/2)$ which is the basic ‘signature’ of Rutherford scattering. In terms of neutron scattering, a neutron is a free nuclear wavefield which, during its life time, is unable to combine with an existing nuclear eigen wavefield until it does, in some cases producing unstable nuclear eigen wavefield systems which transform into new stable systems involving the emission of free wavefields, i.e. nuclear fission.

⁸ Required in order to evaluate the integral over $r$.

Note that the principle of eigenfield tendency, in which free wavefields tend to become eigen wavefields in order to achieve a minimum energy, is equivalent to the least action principle. In field theory - in this case, the wavefield $U(\mathbf{r}, t)$ - the Lagrangian density $\mathcal{L}$ is a functional that is integrated over all space-time, i.e.

$$S[U] = \int\!\!\int \mathcal{L}[U, \partial_\mu U]\,d^3\mathbf{r}\,dt$$

where, using ‘relativistic notation’,

$$\partial_0 = \frac{1}{c}\frac{\partial}{\partial t}, \quad \partial_\mu = (\partial_0; \nabla), \quad \partial^\mu = (\partial_0; -\nabla) \quad\text{and}\quad \partial_\mu\partial^\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2.$$

The Lagrangian is the spatial integral of the density and application of the least action principle yields the Euler-Lagrange equations

$$\frac{\delta S}{\delta U} = -\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu U)}\right) + \frac{\partial\mathcal{L}}{\partial U} = 0$$

which are then solved for U . The wavefield approach adopted in this paper is consistent with the basic concepts associated with the Grand Unified Theories of C H Tejman [31] and in one sense, we have attempted to explain the example images given in Figure 2 using a single phenomenological model. Just as Poisson used a wave model to explain the Poisson spot without reference to light being an electromagnetic wave (Maxwell’s equations for an electric and magnetic field which Poisson did not know of at the time), so we have attempted to explain both a Poisson spot and an Einstein ring without reference to general relativity (Einstein’s equation for a gravitational field). The problem then remains of how to formally ‘recover’ Maxwell’s equations and Einstein’s equations from a single wave theoretic model.
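The closed-form Rutherford amplitude quoted in this section can be checked against a direct numerical evaluation of the radial integral for the screened Coulomb potential. The sketch below is only an illustration under stated assumptions. The values chosen for $k$, $\theta$ and $a$, the truncation of the integral at $r = 40/a$, and the function names are not from the paper.

```python
import math


def amplitude_closed_form(k, theta, a):
    """A(theta) = pi / (k sin(theta/2))^2 * [1 + a^2 / (2 k sin(theta/2))^2]^(-1)."""
    s = math.sin(theta / 2.0)
    return math.pi / (k * s) ** 2 / (1.0 + (a / (2.0 * k * s)) ** 2)


def amplitude_numeric(k, theta, a, steps=100_000):
    """Simpson's rule for (2 pi / (k sin(theta/2))) * int_0^R sin(q r) exp(-a r) dr.

    Here gamma(r) r = exp(-a r) and q = 2 k sin(theta/2). The upper limit
    R = 40 / a is a numerical truncation (an assumption of this sketch).
    """
    s = math.sin(theta / 2.0)
    q = 2.0 * k * s
    upper = 40.0 / a
    h = upper / steps  # steps must be even for Simpson's rule

    def integrand(r):
        return math.sin(q * r) * math.exp(-a * r)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 2.0 * math.pi / (k * s) * total * h / 3.0


def intensity(k, theta, a):
    """Scattering intensity taken as the squared amplitude."""
    return amplitude_closed_form(k, theta, a) ** 2


# Arbitrary illustration values (assumptions, not from the paper).
K, THETA, A_SCREEN = 1.0, math.pi / 3.0, 0.5

closed = amplitude_closed_form(K, THETA, A_SCREEN)
numeric = amplitude_numeric(K, THETA, A_SCREEN)
print(f"closed form: {closed:.6f}, numerical: {numeric:.6f}")
```

With a very small screening parameter, the ratio `intensity(K, t1, a) / intensity(K, t2, a)` approaches $\sin^4(t_2/2)/\sin^4(t_1/2)$, which is the $\sin^{-4}(\theta/2)$ signature described above.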

ACKNOWLEDGMENTS

The author is grateful for the support, through numerous discussions, of Prof Nicholas Phillips, Mr Bruce Murray and Dr Dmitri Dubovitski, to Prof Roy Hoskins and Prof Michael Rycroft for their critical analysis of the manuscript and to Loughborough University, England, and the University of the Western Cape, Republic of South Africa.

REFERENCES

[1] L. Smolin, The Trouble with Physics: The Rise of String Theory, the Fall of a Science, and What Comes Next, Houghton-Mifflin, 2006, ISBN-10: 0-618-55105-0 (http://www.thetroublewithphysics.com/).
[2] P. Woit, Not Even Wrong: The Failure of String Theory and the Search for Unity in Physical Law, Basic Books, 2006, ISBN: 0-465-09275-6.
[3] B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1983.
[4] J. C. Maxwell, A Dynamical Theory of the Electromagnetic Field, Philosophical Transactions of the Royal Society of London 155, 459-512, 1865.
[5] A. Einstein, The Foundation of the General Theory of Relativity, Annalen der Physik, IV, Folge 49, 770-822, 1916 (http://www.alberteinstein.info/gallery/gtext3.html).
[6] E. Schrödinger, Quantization as an Eigenvalue Problem, Annalen der Physik, 489, 79, 1926.
[7] P. A. M. Dirac, The quantum theory of the electron, Proc. R. Soc. (London) A, 117, 610-612, 1928.
[8] P. A. M. Dirac, The quantum theory of the electron: Part II, Proc. R. Soc. (London) A, 118, 351-361, 1928.
[9] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw-Hill, 1953.
[10] J. J. Sakurai, Advanced Quantum Mechanics, Addison Wesley, 1967, ISBN: 0-201-06710-2.
[11] A. S. Davydov, Quantum Mechanics (2nd Edition), Pergamon, 1976, ISBN: 0-08-020437-6.
[12] W. Rarita and J. Schwinger, On a Theory of Particles with Half-Integral Spin, Phys. Rev. 60, 61-61, 1941.
[13] A. Proca, Fundamental Equations of Elementary Particles, C. R. Acad. Sci. Paris, 202, 1490, 1936.
[14] W. Greiner, Relativistic Quantum Mechanics (3rd Edition), Springer, 2000.
[15] J. A. Stratton, Electromagnetic Theory, McGraw-Hill, 1941.
[16] R. H. Atkin, Theoretical Electromagnetism, Heinemann, 1962.
[17] D. J. Griffiths, Introduction to Quantum Mechanics (2nd Edition), Prentice Hall, 2004.
[18] R. Tomaschitz, Einstein Coefficients and Equilibrium Formalism for Tachyon Radiation, Physica A, 293, 247-272, 2001.
[19] R. Tomaschitz, Tachyon Synchrotron Radiation, Physica A, 335, 577-610, 2004.
[20] R. Tomaschitz, Quantum Tachyons, Eur. Phys. J. D32, 241-255, 2005.
[21] X. Bei, C. Shi and Z. Liu, Proca Effect in Kerr-Newman Metric, Int. J. of Theoretical Physics, 43, 1555-1560, 2004.
[22] R. Scipioni, Isomorphism Between Non-Riemannian Gravity and Einstein-Proca-Weyl Theories Extended to a Class of Scalar Gravity Theories, Class. Quantum Gravity, 16, 2471-2478, 1999.
[23] J. M. Blackledge, Digital Image Processing, Horwood Publishing, 2006, ISBN: 1-898563-49-7.
[24] G. F. Roach, Green’s Functions (Introductory Theory with Applications), Van Nostrand Reinhold, 1970.
[25] http://en.wikipedia.org/wiki/Arago_spot
[26] http://en.wikipedia.org/wiki/Einstein_ring
[27] http://www.universetoday.com/2005/04/29/near-perfect-einstein-ringdiscovered/
[28] V. Belokurov et al., The Cosmic Horseshoe: Discovery of an Einstein Ring Around a Giant Luminous Red Galaxy, Astrophysical Journal (Submitted), 2007 (http://www.arxiv.org/abs/0706.2326).
[29] T. C. Van Flandern, Dark Matter, Missing Planets and New Comets: Paradoxes Resolved, Origins Illuminated, North Atlantic Books, Berkeley, 1993.
[30] T. C. Van Flandern, The Speed of Gravity: What the Experiments Say, Physics Letters A, 250, 1-11, 1998.
[31] C. H. Tejman, http://www.grandunifiedtheory.org.il/

Jonathan Blackledge received a BSc in Physics from Imperial College, London University in 1980, a Diploma of Imperial College in Plasma Physics in 1981 and a PhD in Theoretical Physics from Kings College, London University in 1983. As a Research Fellow of Physics at Kings College (London University) from 1984 to 1988, he specialized in information systems engineering undertaking work primarily for the defence industry. This was followed by academic appointments at the Universities of Cranfield (Senior Lecturer in Applied Mathematics) and De Montfort (Professor in Applied Mathematics and Computing) where he established new post-graduate MSc/PhD programmes and research groups in computer aided engineering and informatics. In 1994, he co-founded Management and Personnel Services Limited where he is currently Executive Director. His work for Microsharp (Director of R & D, 1998-2002) included the development of manufacturing processes now being used worldwide for digital information display units. In 2002, he co-founded a group of companies specialising in information security and cryptology for the defence and intelligence communities, actively creating partnerships between industry and academia. He currently holds academic posts in the United Kingdom and South Africa, and in 2007 was awarded a Fellowship of the City and Guilds London Institute and Freedom of the City of London for his role in the development of the Higher Level Qualifications programme and CPD in engineering and computing, most recently for the nuclear industry.
