Maximum-likelihood Modulation Classification with ... - ITA @ UCSD

Maximum-likelihood Modulation Classification with Incomplete Channel Information

William C. Headley, V. Gautham Chavali, and Claudio R. C. M. da Silva
Bradley Department of Electrical and Computer Engineering
Virginia Polytechnic Institute and State University
Blacksburg, VA USA 24061
Email: {cheadley, chavali, cdasilva}@vt.edu

Abstract: This paper presents a discussion of the classification of digital communication signals given incomplete knowledge of the channel. Through a maximum-likelihood framework, modulation classifiers are presented which assume no or limited a priori knowledge of the fading experienced by the signal (including time offset, phase shift, and amplitude) and/or the distribution of the noise added in the channel. A recently published asynchronous classifier for digitally modulated signals, which uses a new channel estimator that is blind to the modulation scheme of the received signal, is introduced and analyzed. In addition, results are presented of our recent work on the classification of digitally modulated signals in flat fading non-Gaussian channels.

I. INTRODUCTION

Modulation classification can be defined as the process of determining the modulation format of a noisy signal $r(t)$ from a given set of $c$ possible formats, $\{H_1, H_2, \ldots, H_c\}$. This process is of importance in multi-mode software-defined radios [1] and electronic warfare systems [2], among others.

There are two main approaches to modulation classification, namely feature-based and likelihood-based (LB) [3]. Feature-based classifiers exploit modulation-dependent features of the signal, such as cyclostationary signatures [3], [4]. While feature-based approaches are generally easier to implement, they are sub-optimal. LB classifiers are optimal in the Bayesian sense, as they minimize the probability of classification error [5]. LB approaches are composite hypothesis testing problems in which classification is performed by searching for the maximum a posteriori probability $P(H_i \mid r(t))$, or equivalently the maximum of $p(r(t) \mid H_i)$ given equally likely modulation schemes; that is,

$$\hat{H} = \arg\max_{H_i} \log p\left(r(t) \mid H_i\right). \tag{1}$$

The signal $r(t)$ is a function of the transmitted data, signal parameters (such as the symbol interval), the impulse response of the channel, and the noise distribution. The difficulty in performing modulation classification is primarily due to the fact that classifiers operate with incomplete signal and channel information. This is because, in general, a radio has to first classify the signal before it can successfully estimate signal parameters, the channel realization, and the noise distribution [6], [7]. As a result, for example, the impractical assumption that $r(t)$ is perfectly equalized by the radio front-end before classification is often made in the design of modulation classification algorithms, as discussed in [3] and [8].

There are different approaches for LB classifiers to handle unknown variables in $p(r(t) \mid H_i)$, including the Average Likelihood Ratio Test (ALRT), Generalized LRT, Hybrid LRT (HLRT), and quasi-HLRT (qHLRT) [3]. In these approaches, through a standard hypothesis testing procedure, the unknowns are either estimated, averaged out of (1) using their probability density functions, or handled by a combination of both.

In this paper, using the HLRT or qHLRT approaches, in which channel parameters are estimated prior to classification, we present classifiers that assume no or limited a priori knowledge of the fading experienced by the signal and/or the noise distribution. Specifically, in Section II, the asynchronous classifier proposed in [6], which uses a new channel estimator that is blind to the modulation scheme of $r(t)$, is introduced and analyzed. In Section III, results are presented of our recent work [7] on signal classification in flat fading non-Gaussian channels.

This work was supported in part by the Bradley Fellowship program of Virginia Tech.

II. ASYNCHRONOUS MODULATION CLASSIFICATION IN FLAT FADING CHANNELS

A. System model

Assuming that the transmitted signal passes through a flat fading channel, the signal $r(t)$ can be written as

$$r(t) = \Re\left\{ \alpha \sum_{k=-\infty}^{\infty} s_k\, g(t - kT - t_c)\, e^{j(2\pi f_c t + \theta)} \right\} + n(t), \tag{2}$$

where $g(\cdot)$ is the (real-valued) pulse shape, assumed to satisfy the Nyquist intersymbol interference criterion, $T$ is the symbol interval, and $f_c$ is the carrier frequency. We assume that the data conveyed in the observed signal is mapped onto a digital amplitude-phase constellation $S_i$ (unknown to the classifier). The symbol $s_k \in S_i$ represents the $k$th received data symbol and is assumed to be independent and uniformly distributed among the $L_i$ constellation points that define $S_i$. In (2), $\alpha$, $\theta$, and $t_c$ are the gain, phase shift, and time delay introduced by the channel, respectively. It is assumed that the channel remains constant during the observation interval. Also in (2), $n(t)$ is a zero-mean Gaussian process with two-sided power spectral density $N_0/2$. As previously noted, the channel state (including the thermal noise level) is typically unknown to the classifier. For this reason, $\alpha$, $\theta$, $t_c$, and $N_0$ are modeled as deterministic unknown variables.

For ease of analysis, we write the time delay in terms of symbol intervals as $t_c = (\lambda + \nu)T$, where $\lambda$ represents the integer number of symbols delayed and $\nu$ represents the remaining fraction of a symbol delay ($0 \le \nu < 1$). The proposed classifier is developed based upon the output of a receiver that consists of a frequency conversion stage (from RF to baseband) and a matched filter. From (2), the output of the receiver at $t = (l + v)T$, where $l$ is an integer and $v$ is a real number in the range $[0, 1)$, is [6]

$$r_{l,v} = \frac{\alpha e^{j\theta}}{2} \sum_{k=-\infty}^{\infty} s_k\, R\big((l - \lambda + v - \nu - k)T\big) + n_{l,v}, \tag{3}$$

where $R(t) = \int_{-\infty}^{\infty} g(\tau)\, g(t - \tau)\, d\tau$ and $n_{l,v}$ is a zero-mean complex Gaussian random variable with variance $N_0/2$.
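To make the structure of (3) concrete, the following Python sketch evaluates its noise-free part for a finite block of symbols. It is only an illustration under stated assumptions: the pulse $g$ is taken to be a unit-energy square root-raised cosine pulse, so that $R(t)$ is the raised-cosine pulse; the infinite sum is truncated to the symbols supplied; and the function names `raised_cosine` and `receiver_output` are hypothetical, not taken from [6].

```python
import numpy as np

def raised_cosine(t, beta):
    """Raised-cosine pulse R(t), with t in units of T (assumes an SRRC pulse g, beta in [0, 1])."""
    t = np.asarray(t, dtype=float)
    denom = 1.0 - (2.0 * beta * t) ** 2
    singular = np.isclose(denom, 0.0)
    # Value at the removable singularity |t| = 1/(2*beta).
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * beta)) if beta > 0 else 0.0
    value = np.sinc(t) * np.cos(np.pi * beta * t) / np.where(singular, 1.0, denom)
    return np.where(singular, limit, value)

def receiver_output(symbols, l, v, alpha, theta, lam, nu, beta):
    """Noise-free part of (3) at t = (l + v)T; the sum over k is truncated to k = 0..len(symbols)-1."""
    k = np.arange(len(symbols))
    taps = raised_cosine(l - lam + v - nu - k, beta)
    return 0.5 * alpha * np.exp(1j * theta) * np.sum(np.asarray(symbols) * taps)
```

When $v = \nu$, the Nyquist property of $R(t)$ removes the intersymbol interference and only the term $k = l - \lambda$ survives, which is the simplification used in the design of the classifier.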

B. Modulation classifier

In the design of the classifier, it is first assumed that the receiver has a perfect estimate of the fractional time delay $\nu$; that is, the receiver perfectly estimates when symbol transitions occur. In this case, the vector $\mathbf{r}_\nu = [r_{1,\nu}, r_{2,\nu}, \ldots, r_{N_c,\nu}]$ is a set of sufficient statistics for the detection of the symbols $s_{1-\lambda}, s_{2-\lambda}, \ldots, s_{N_c-\lambda}$. Given that the $r_{l,\nu}$ values are independent, the Total Probability Theorem can be used to show that

$$p(\mathbf{r}_\nu \mid H_i) = \prod_{l=1}^{N_c} \sum_{m=1}^{L_i} p(r_{l,\nu} \mid S_{m,i}, H_i)\, P(S_{m,i} \mid H_i),$$

where $S_{m,i}$ is one of the $L_i$ complex constellation values of the $i$th modulation scheme. Taking the logarithm of $p(\mathbf{r}_\nu \mid H_i)$, and using the fact that (3) reduces to $r_{l,\nu} = \frac{\alpha e^{j\theta}}{2} s_{l-\lambda} + n_{l,\nu}$ when $v = \nu$, the classifier can be written as [6], [9]

$$\hat{H} = \arg\max_{H_i} \sum_{l=1}^{N_c} \ln\left\{ \frac{1}{L_i} \sum_{m=1}^{L_i} \exp\left( -\frac{2}{N_0} \left| r_{l,\nu} - \frac{\alpha e^{j\theta}}{2} S_{m,i} \right|^2 \right) \right\}. \tag{4}$$
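The decision rule (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of [6]: it assumes that $\alpha$, $\theta$, and $N_0$ are supplied by the caller, that the samples are already taken at the correct symbol instants, and that every constellation is normalized to unit average energy (a normalization chosen here, not stated in the text). The names `CONSTELLATIONS`, `log_likelihood`, and `classify` are hypothetical.

```python
import numpy as np

def psk(order):
    """Unit-energy PSK constellation."""
    return np.exp(2j * np.pi * np.arange(order) / order)

def qam16():
    """16-QAM constellation scaled to unit average energy (assumed normalization)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

CONSTELLATIONS = {"BPSK": psk(2), "QPSK": psk(4), "8-PSK": psk(8), "16-QAM": qam16()}

def log_likelihood(r, points, alpha, theta, n0):
    """Sum over l of ln{(1/L_i) sum_m exp(-(2/N0)|r_l - (alpha e^{j theta}/2) S_m|^2)}, as in (4)."""
    gain = 0.5 * alpha * np.exp(1j * theta)
    z = -2.0 * np.abs(r[:, None] - gain * points[None, :]) ** 2 / n0
    z_max = z.max(axis=1, keepdims=True)  # log-sum-exp shift for numerical stability
    return float(np.sum(z_max[:, 0] + np.log(np.mean(np.exp(z - z_max), axis=1))))

def classify(r, alpha, theta, n0, constellations=CONSTELLATIONS):
    """Return the hypothesis with the largest log-likelihood, together with all scores."""
    r = np.asarray(r)
    scores = {name: log_likelihood(r, pts, alpha, theta, n0)
              for name, pts in constellations.items()}
    return max(scores, key=scores.get), scores
```

The $1/L_i$ factor matters when one constellation is contained in another: for QPSK data, the 8-PSK hypothesis matches every sample equally well but is penalized by $\ln 2$ per sample.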

Note that the classifier given by (4) is a function of $\alpha$, $\theta$, $\nu$, and $N_0$. In order to handle these unknowns, we use a qHLRT-based approach. In this approach, the channel parameters are estimated through the use of low-complexity estimators that are blind to the modulation scheme of the received signal. This approach is used for two reasons. First, it does not require knowledge of the statistics of the channel parameters. Instead, these parameters are modeled as deterministic unknown variables. Second, it does not require multi-dimensional maximum-likelihood estimation of the parameters, leading to a lower complexity classifier. Given that $\alpha$, $\theta$, $\nu$, and $N_0$ are unknown at the classifier, these values are replaced by their estimates (denoted by a hat). This leads to the final form of [6]

$$\hat{H} = \arg\max_{H_i} \sum_{l=1}^{N_c} \ln\left\{ \frac{1}{L_i} \sum_{m=1}^{L_i} \exp\left( -\frac{2}{\hat{N}_0} \left| r_{l,\hat{\nu}} - \frac{\hat{\alpha} e^{j\hat{\theta}}}{2} S_{m,i} \right|^2 \right) \right\} \tag{5}$$

for the proposed asynchronous modulation classifier, where $r_{l,\hat{\nu}}$ is the output of the receiver at $t = (l + \hat{\nu})T$.

C. Estimation of the unknown channel parameters

In order to estimate the unknown amplitude, time offset, and noise power of the received signal for use in the classifier, we developed in [6] a new estimator that is blind to the signal's modulation format (PSK, QAM, and PAM) and order. The estimators are based on a low-complexity estimation approach known as the Method-of-Moments (MoM). This is a suboptimal approach in which parameters are estimated through the solution of a system of statistical moment equations [10].

The first of the moments used to estimate $\alpha$, $\nu$, and $N_0$ is $E[|r_{l,v_1}|^2]$, where $r_{l,v_1}$ is the output of the receiver at $t = (l + v_1)T$, $l$ is an arbitrarily chosen integer, and $v_1$ is an arbitrarily chosen real number in the range $[0, 1)$. The values chosen for $l$ and $v_1$ have no relation to $\lambda$ or $\nu$ (which are unknown to the receiver). Using (3) with $v = v_1$, the moment can be written as [6]

$$M_{v_1} = E\left[|r_{l,v_1}|^2\right] = \frac{\alpha^2}{4} \psi_{v_1,\nu} + \frac{N_0}{2}. \tag{6}$$

It is important to note that (6) is a function of the unknowns $\alpha$, $\nu$, and $N_0$, while not being a function of the unknown data, $\lambda$, or $\theta$. In (6), the function $\psi_{v_1,\nu}$ is defined as $\psi_{v_1,\nu} = \sum_{k'=-\infty}^{\infty} R\big((v_1 - \nu - k')T\big)^2$. Assuming a square root-raised cosine pulse shape with roll-off factor $\beta$, as shown in [6], $\psi_{v_1,\nu} = 1 + \frac{\beta}{4}\left(\cos(2\pi(v_1 - \nu)) - 1\right)$. Therefore, for this pulse shape, (6) can be rewritten as

$$N_0 = 2M_{v_1} - \frac{\alpha^2}{8}\left\{4 + \beta\left(\cos\left(2\pi[\nu - v_1]\right) - 1\right)\right\}. \tag{7}$$

Next, it is assumed that the receiver has a new sampling instant of $(l + v_2)T$, where $l$ is an arbitrarily chosen integer and $v_2$ is an arbitrarily chosen real number in the range $[0, 1)$, with $v_2 \neq v_1$. (Note that $l$ and $v_2$ have no relation to $\lambda$ or $\nu$.) Given this new sampling instant, a second equation for $N_0$ can be determined from the moment $E[|r_{l,v_2}|^2] = M_{v_2}$. Setting these two $N_0$ equations equal, and performing some algebraic manipulation, gives

$$\alpha^2 = \frac{16\,(M_{v_1} - M_{v_2})}{\beta\left[\cos\left(2\pi(\nu - v_1)\right) - \cos\left(2\pi(\nu - v_2)\right)\right]}, \tag{8}$$

which is only a function of the unknowns $\alpha$ and $\nu$. The final step is to use a third moment equation to remove the dependence on one of the two unknowns of (8). Assuming a third sampling instant $(l + v_3)T$, where again $l$ is an arbitrarily chosen integer and $v_3$ is an arbitrarily chosen real number in the range $[0, 1)$, with $v_1$, $v_2$, and $v_3$ mutually distinct, the moment $E[|r_{l,v_3}|^2] = M_{v_3}$ can be determined. (Note that $l$ and $v_3$ have no relation to $\lambda$ or $\nu$.) Using this third moment with either of the previous two moments, a second equation for $\alpha^2$ can be determined. Setting the two $\alpha^2$ equations equal and performing some manipulation leads to

$$\tan(2\pi\nu) = Z = \frac{(M_{v_3} - M_{v_2})X_1 + (M_{v_1} - M_{v_3})X_2 + (M_{v_2} - M_{v_1})X_3}{(M_{v_2} - M_{v_3})Y_1 + (M_{v_3} - M_{v_1})Y_2 + (M_{v_1} - M_{v_2})Y_3}, \tag{9}$$

where $X_i = \cos(2\pi v_i)$ and $Y_i = \sin(2\pi v_i)$. Inverting (9), and assuming that $\operatorname{atan}(Z)$ is in the range $[-\pi/2, \pi/2)$, $\nu$ may assume one of three possible values,

$$\nu \in \left\{ \frac{\operatorname{atan}(Z)}{2\pi},\ \frac{\operatorname{atan}(Z)}{2\pi} + \frac{1}{2},\ \frac{\operatorname{atan}(Z)}{2\pi} + 1 \right\}. \tag{10}$$
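For readers who want the intermediate algebra behind (8) and (9), one possible route is sketched below. The shorthand $c_i$ is introduced here only for brevity, and the pairing of $v_1$ with $v_3$ for the second $\alpha^2$ equation is one of the two choices the text allows; the derivation in [6] may be organized differently.

```latex
% Shorthand (introduced for this sketch):
%   c_i = \cos(2\pi(\nu - v_i)) = X_i \cos(2\pi\nu) + Y_i \sin(2\pi\nu)
\begin{align*}
% Equate (7) written for v_1 and for v_2:
2M_{v_1} - \frac{\alpha^2}{8}\{4 + \beta(c_1 - 1)\}
  &= 2M_{v_2} - \frac{\alpha^2}{8}\{4 + \beta(c_2 - 1)\} \\
2(M_{v_1} - M_{v_2}) &= \frac{\alpha^2 \beta}{8}\,(c_1 - c_2) \\
\alpha^2 &= \frac{16\,(M_{v_1} - M_{v_2})}{\beta\,(c_1 - c_2)}
  && \text{which is (8)} \\[6pt]
% Equate (8) for the pairs (v_1, v_2) and (v_1, v_3):
(M_{v_1} - M_{v_2})(c_1 - c_3) &= (M_{v_1} - M_{v_3})(c_1 - c_2) \\
% Expand each c_i and collect the cos(2 pi nu) and sin(2 pi nu) terms:
\cos(2\pi\nu)\big[(M_{v_3} - M_{v_2})X_1 + (M_{v_1} - M_{v_3})X_2 + (M_{v_2} - M_{v_1})X_3\big]
  &= \sin(2\pi\nu)\big[(M_{v_2} - M_{v_3})Y_1 + (M_{v_3} - M_{v_1})Y_2 + (M_{v_1} - M_{v_2})Y_3\big]
\end{align*}
% Dividing both sides by cos(2 pi nu) times the bracket on the right gives tan(2 pi nu) = Z, i.e. (9).
```

Because $\tan(2\pi\nu)$ has period $1/2$ in $\nu$, inverting it leaves the ambiguity expressed by the three candidates in (10).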

[Figure 1: three panels (MSE for $\alpha$, MSE for $N_0$, MSE for $\nu$), each showing the average MSE versus SNR (dB) for $\beta = 0.35$ and $\beta = 0.75$.]

Fig. 1. Average MSE obtained with the blind channel estimator proposed in [6] (solid: $N_{est} = 1000$, dashed: $N_{est} = 10000$).

[Figure 2: average probability of correct classification versus average SNR (dB). Curves: optimal (perfect estimates) [9]; $N_{est} = 10000$, $\beta = 0.75$; $N_{est} = 10000$, $\beta = 0.35$; $N_{est} = 1000$, $\beta = 0.75$; $N_{est} = 1000$, $\beta = 0.35$.]

Fig. 2. Average probability of correct classification obtained with the qHLRT-based classifier proposed in [6] ($N_c = 500$).

Therefore, the unknowns $\alpha$, $\nu$, and $N_0$ can be determined from the moments $M_{v_1}$, $M_{v_2}$, and $M_{v_3}$. This is done by first using the moments to determine the three possible values for $\nu$ given by (10). One of the values can be immediately discarded for falling outside of the range $0 \le \nu < 1$. Given the remaining two possible $\nu$ values, two possible values for $\alpha$ are determined through (8). As shown in [6], one of the $\alpha$ values will be invalid, leaving just one valid solution for both $\alpha$ and $\nu$. Finally, the values for $\alpha$ and $\nu$ are used to determine $N_0$ through (7). Further details of the proposed estimator can be found in [6].

In practice, $M_{v_1}$, $M_{v_2}$, and $M_{v_3}$ are unknown and must be estimated from $r(t)$. This estimation can be done using a sample average estimator

$$\hat{M}_{v_i} = \frac{1}{N_{est}} \sum_{l=l_i}^{l_i + N_{est}} |r_{l,v_i}|^2, \quad i = 1, 2, 3,$$

where $r_{l,v_i}$ is the receiver output sampled at $(l + v_i)T$, $N_{est}$ is the number of samples observed, and $l_i$ is an arbitrary integer. Therefore, given that the moments themselves are estimated, the solutions to (7), (8), and (10) are estimates for the parameters $N_0$, $\alpha$, and $\nu$, respectively. In order to estimate $\theta$, a MoM-based algorithm known as the $M$-power phase synchronizer is used, which is also blind to the received signal's modulation scheme [11].
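The estimation steps described above can be sketched as follows. This is an illustrative reading of (7) to (10), not the code of [6]. Two points are assumptions of this sketch: a candidate is treated as invalid when (8) yields $\alpha^2 \le 0$ (shifting $\nu$ by $1/2$ flips the sign of the denominator of (8), so exactly one of the two remaining candidates gives a positive value), and the degenerate case in which the denominator of (8) vanishes is not handled. The function names are hypothetical.

```python
import numpy as np

def sample_moment(r):
    """Sample-average estimate of E[|r|^2] from receiver outputs taken at one offset v_i."""
    return float(np.mean(np.abs(np.asarray(r)) ** 2))

def mom_estimate(moments, offsets, beta):
    """Estimate (alpha, nu, N0) from the moments (M_v1, M_v2, M_v3) for an SRRC pulse."""
    m1, m2, m3 = moments
    v = np.asarray(offsets, dtype=float)
    x, y = np.cos(2 * np.pi * v), np.sin(2 * np.pi * v)
    # Z from (9).
    num = (m3 - m2) * x[0] + (m1 - m3) * x[1] + (m2 - m1) * x[2]
    den = (m2 - m3) * y[0] + (m3 - m1) * y[1] + (m1 - m2) * y[2]
    base = np.arctan(num / den) / (2 * np.pi)
    # Candidates from (10); keep those in [0, 1) and test each with (8).
    for nu in (base, base + 0.5, base + 1.0):
        if not 0.0 <= nu < 1.0:
            continue
        c1 = np.cos(2 * np.pi * (nu - v[0]))
        c2 = np.cos(2 * np.pi * (nu - v[1]))
        alpha_sq = 16.0 * (m1 - m2) / (beta * (c1 - c2))
        if alpha_sq > 0.0:  # assumed validity test: alpha^2 must be positive
            n0 = 2.0 * m1 - alpha_sq / 8.0 * (4.0 + beta * (c1 - 1.0))  # (7)
            return float(np.sqrt(alpha_sq)), float(nu), float(n0)
    raise ValueError("no valid (alpha, nu) pair; moments may be too noisy or degenerate")
```

In use, each moment would come from `sample_moment` applied to the receiver outputs at one of the three offsets, for example $v_1 = 0$, $v_2 = 1/3$, and $v_3 = 2/3$ as in the performance analysis.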

D. Performance analysis

In the performance results presented in this subsection, the modulation schemes considered are BPSK, QPSK, 8-PSK, 16-QAM, and 64-QAM. The amplitude $\alpha$ is assumed to be Rayleigh distributed with $E[\alpha^2] = 1$, and $v_1$, $v_2$, and $v_3$ are set equal to 0, 1/3, and 2/3, respectively.

Fig. 1 presents the performance of the proposed MoM estimators for a square root-raised cosine pulse with roll-off factors of $\beta = 0.35$ and $\beta = 0.75$. As expected, as the SNR and/or $N_{est}$ is increased, the average mean square error (MSE) of the estimates decreases. This is because an increase in either of these parameters reduces the estimation error of $M_{v_1}$, $M_{v_2}$, and $M_{v_3}$. Additionally, it is seen that the average MSE of the estimates increases for lower roll-off factors. This is due to the fact that timing offsets result in larger intersymbol interference for lower roll-off factors, leading to an overall reduction in the performance of the estimators.

Fig. 2 presents the performance of the proposed classifier. The effect of the reliability of the estimates $\hat{\alpha}$, $\hat{\theta}$, $\hat{\nu}$, and $\hat{N}_0$ on the performance of the classifier can be observed in this figure. As the reliability of the estimates improves, be it through higher values of $N_{est}$, $\beta$, and/or SNR, the performance of the classifier improves, as expected. It can be seen that the classifier's performance closely approaches that of an optimal classifier with perfect channel knowledge (developed in [9]) for an adequate estimation interval.

III. MODULATION CLASSIFICATION IN FLAT FADING NON-GAUSSIAN CHANNELS

The significant majority of modulation classification algorithms in the literature were developed for the case in which the additive noise is assumed to be Gaussian distributed [3]. However, experimental studies have shown that most radio channels experience both man-made and natural noise, and that the combined noise is a highly non-Gaussian process. See, for example, [12]-[15] and references therein. It has also been shown that non-Gaussian noise dominates over Gaussian noise in some frequency ranges. For example, it was recently reported that “man-made noise levels can be up to 30 dB above the thermal noise floor” in TV-band frequencies [16].

For these reasons, we proposed in [7] an algorithm for the classification of digital amplitude-phase modulated signals in flat fading channels with non-Gaussian noise. In [7], the additive noise is modeled by an $N$-term Gaussian mixture distribution,

$$p(w) = \sum_{n=1}^{N} \frac{\lambda_n}{2\pi\sigma_n^2} \exp\left( -\frac{|w|^2}{2\sigma_n^2} \right), \tag{11}$$

which is a well-known model of man-made and natural noise that appears in most radio channels. In the development of the classifier, in addition to $\alpha$ and $\theta$, the noise distribution parameters $\{\lambda_n\}_{n=1}^{N}$ and $\{\sigma_n\}_{n=1}^{N}$ are assumed to be unknown and are modeled as deterministic unknown variables.
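As a small illustration of the noise model (11), the sketch below evaluates the mixture density and draws complex noise samples from it. It reads (11) as a mixture of zero-mean circular complex Gaussians in which $\sigma_n^2$ is the variance per real dimension, so that term $n$ contributes an average power of $2\sigma_n^2$. The function names and the use of NumPy's random generator are choices made here, not part of [7].

```python
import numpy as np

def mixture_pdf(w, lam, sigma2):
    """Evaluate p(w) in (11) for scalar or array-valued complex w."""
    lam = np.asarray(lam, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    power = np.expand_dims(np.abs(w) ** 2, -1)  # |w|^2 with a trailing axis for the N terms
    return np.sum(lam / (2 * np.pi * sigma2) * np.exp(-power / (2 * sigma2)), axis=-1)

def sample_mixture(n, lam, sigma2, rng):
    """Draw n complex samples: pick a term with probability lam_n, then draw circular Gaussian noise."""
    term = rng.choice(len(lam), size=n, p=np.asarray(lam, dtype=float))
    std = np.sqrt(np.asarray(sigma2, dtype=float))[term]
    return std * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
```

For example, with $\lambda = (0.9, 0.1)$ and $\sigma^2 = (1, 100)$, about one sample in ten is drawn from the high-variance term, and the average power is $2(0.9 \cdot 1 + 0.1 \cdot 100) = 21.8$.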

Fig. 3. Probability of correct classification of the classifier proposed in [7] for $\lambda_2 = 0.1$ and $\sigma_2^2/\sigma_1^2 = 100$. Number of symbols used for parameter estimation: 100 (diamond), 200 (square), and 500 (circle).

Fig. 4. Probability of correct classification of the classifier proposed in [7] for $\lambda_2 = 0.3$ and $\sigma_2^2/\sigma_1^2 = 250$. Number of symbols used for parameter estimation: 100 (diamond), 200 (square), and 500 (circle).

In order to estimate the unknown channel and noise parameters, the proposed classifier utilizes a blind (no training signals are used) estimator that is based on a variant of the expectation-maximization algorithm. With these estimates, the signal is then classified using an HLRT-based algorithm.

The performance of the classifier proposed in [7] is shown in Figs. 3 and 4. In these results, the modulation schemes considered are BPSK, QPSK, 8-PSK, and 16-QAM. The amplitude $\alpha$ is assumed to be Rayleigh distributed with $E[|\alpha|^2] = 2$, and $\theta$ is uniformly distributed in $(0, 2\pi]$. The number of terms in the mixture is taken to be $N = 2$. This case is widely used in the literature as a model for impulsive noise. In this case, the first and second terms of the mixture represent the thermal noise (with variance $\sigma_1^2$ and proportion $\lambda_1$) and the impulsive noise (with variance $\sigma_2^2 \gg \sigma_1^2$ and proportion $\lambda_2 = 1 - \lambda_1$) components, respectively. For reference, the performance of the classifier with perfect knowledge of the channel state and noise statistics is also plotted (star markers). Similarly, we also present in these figures the performance of the LB classifier formulated for Gaussian channels when used in a Gaussian mixture environment (plus markers).

It is seen in Figs. 3 and 4 that the performance of the proposed classifier approaches that of the ideal classifier designed for Gaussian mixture noise in all cases. It is also seen that the gap in performance between the proposed classifier and the classifier designed for Gaussian channels is always significant, and that this gap increases as the noise becomes “more impulsive.” It can also be seen that the performance improves for increasing number of symbols used in the estimation process, as the estimation accuracy increases.
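The estimator of [7] is not detailed here, so the following sketch shows only the textbook expectation-maximization iteration for the mixture weights and variances in (11). It is deliberately simplified and is not the variant used in [7]: it assumes that zero-mean noise samples are observed directly (no unknown signal, gain, or phase), and the initialization, iteration count, and function name are choices made for this illustration.

```python
import numpy as np

def em_zero_mean_mixture(w, n_terms=2, n_iter=200):
    """Textbook EM for the weights lam_n and per-dimension variances sigma2_n of (11)."""
    power = np.abs(np.asarray(w)) ** 2               # |w_k|^2 for every sample
    base = power.mean() / 2.0                        # average per-dimension variance
    sigma2 = base * np.logspace(-1, 1, n_terms)      # spread the initial variances
    lam = np.full(n_terms, 1.0 / n_terms)
    for _ in range(n_iter):
        # E-step: posterior probability that sample k came from term n (computed in the log domain).
        log_post = np.log(lam) - np.log(2 * np.pi * sigma2) - power[:, None] / (2 * sigma2)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and variances from the posteriors.
        weight = post.sum(axis=0)
        lam = weight / len(power)
        sigma2 = (post * power[:, None]).sum(axis=0) / (2.0 * weight)
    order = np.argsort(sigma2)                       # report the low-variance (thermal) term first
    return lam[order], sigma2[order]
```

With well-separated variances, as in the $\sigma_2^2 \gg \sigma_1^2$ case considered in the results, the posteriors are close to hard assignments and the iteration settles quickly; the accuracy still depends on how many samples fall in the impulsive term.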

IV. CONCLUSIONS

Modulation classification in realistic channels is a challenging problem, as channel information is required for optimal performance. This paper presented a discussion of two classifiers that assume no or limited a priori knowledge of the fading experienced by the signal and/or the distribution of the noise added in the channel. The reader is referred to [6] and [7] for further information on these classifiers.

REFERENCES

[1] J. Hamkins and M. K. Simon, Eds., Autonomous Software-Defined Radio Receivers for Deep Space Applications (JPL Deep-Space Communications and Navigation Series). Wiley-Interscience, 2006.
[2] R. A. Poisel, Introduction to Communication Electronic Warfare Systems. Artech House, 2008.
[3] O. A. Dobre, A. Abdi, Y. Bar-Ness, and W. Su, “Survey of automatic modulation classification techniques: Classical approaches and new trends,” IET Commun., vol. 1, no. 2, pp. 137-156, April 2007.
[4] W. C. Headley, J. D. Reed, and C. R. C. M. da Silva, “Distributed cyclic spectrum feature-based modulation classification,” in Proc. IEEE Wireless Comm. Netw. Conf., Las Vegas, NV, 2008, pp. 1200-1204.
[5] S. M. Kay, Fundamentals of Statistical Signal Processing. Prentice-Hall, 1993, vol. II.
[6] W. C. Headley and C. R. C. M. da Silva, “Asynchronous classification of digital amplitude-phase modulated signals in flat-fading channels,” IEEE Trans. Commun., vol. 59, no. 1, pp. 7-12, Jan. 2011.
[7] V. G. Chavali and C. R. C. M. da Silva, “Maximum-likelihood classification of digital amplitude-phase modulated signals in flat fading non-Gaussian channels,” submitted for publication.
[8] E. E. Azzouz and A. K. Nandi, Automatic Modulation Recognition of Communication Systems. Kluwer, 1996.
[9] W. Wei and J. M. Mendel, “Maximum-likelihood classification for digital amplitude-phase modulations,” IEEE Trans. Commun., vol. 48, no. 2, pp. 189-193, Feb. 2000.
[10] S. M. Kay, Fundamentals of Statistical Signal Processing. Prentice Hall, 1998, vol. I.
[11] U. Mengali and A. N. D’Andrea, Synchronization Techniques for Digital Receivers. Plenum Press, 1997.
[12] D. Middleton, “Statistical-physical models of electromagnetic interference,” IEEE Trans. Electromagn. Compat., vol. 19, no. 3, pp. 106-127, Aug. 1977.
[13] X. Wang and H. V. Poor, “Robust multiuser detection in non-Gaussian channels,” IEEE Trans. Signal Process., vol. 47, no. 2, pp. 289-305, Feb. 1999.
[14] R. S. Blum, R. J. Kozick, and B. M. Sadler, “An adaptive spatial diversity receiver for non-Gaussian interference and noise,” IEEE Trans. Signal Process., vol. 47, no. 8, pp. 2100-2111, Aug. 1999.
[15] R. J. Kozick and B. M. Sadler, “Maximum-likelihood array processing in non-Gaussian noise with Gaussian mixtures,” IEEE Trans. Signal Process., vol. 48, no. 12, pp. 3520-3535, Dec. 2000.
[16] T. Erpek, M. A. McHenry, and A. Stirling, “DSA operational parameters with wireless microphones,” in Proc. IEEE Symp. New Frontiers Dynamic Spectrum Access Netw., Singapore, 2010, pp. 1-11.
