Pattern Classification Techniques for Cooperative Spectrum Sensing in Cognitive Radio Networks: SVM and W-KNN Approaches

Karaputugala Madushan Thilina, Kae Won Choi, Nazmus Saquib, and Ekram Hossain

Abstract: We consider novel cooperative spectrum sensing (CSS) algorithms based on pattern classification techniques for cognitive radio (CR) networks. In this regard, support vector machine (SVM) and weighted K-nearest-neighbor (KNN) classification techniques are implemented for CSS. The received signal strengths at the CR users are treated as features and fed into the classifier to detect the availability of the primary user (PU). Each instance of PU activity (i.e., availability or unavailability) is assigned to the positive or negative class, respectively. In the case of SVM, to minimize classification errors, the support vectors are obtained by maximizing the margin between the separating hyperplane and the data. To this end, we quantify the effect of different kernels in terms of detection probability using receiver operating characteristic (ROC) curves. Furthermore, a weighted KNN classification technique is proposed for CSS, and the corresponding weights are calculated by evaluating the area under the ROC curve of each feature. Our comparative results clearly reveal that the proposed SVM and weighted KNN algorithms outperform the existing state-of-the-art pattern classification-based CSS techniques.

Index Terms: Cognitive radio, cooperative spectrum sensing, support vector machine, K-nearest-neighbor classification technique, primary user classification
I. INTRODUCTION

The concept of “cognitive radio” (CR) for designing wireless communications systems has emerged over the last decade to mitigate the scarcity of radio spectrum by improving spectrum utilization [1], [2]. Cognitive radio refers to an intelligent wireless communications device that senses its operational electromagnetic environment and can dynamically and autonomously adjust its radio operating parameters [3]. In this context, opportunistic spectrum access (OSA) is a key concept, which allows a CR device to opportunistically access the frequency band allocated to a primary user (PU) when the primary user transmission is detected to be inactive [4]–[7]. For OSA, the CR devices have to sense the radio spectrum licensed to the PUs by using their limited resources (e.g., energy and computational power), and subsequently utilize the available spectrum opportunities to maximize their performance objectives. Therefore, efficient spectrum sensing is crucial for OSA.

Efficient spectrum sensing can be accomplished by adopting techniques that can classify the PU signals properly. The classification algorithms based on machine learning techniques such as perceptron, Fisher linear discriminant, K-nearest-neighbor (KNN), and support vector machine (SVM)
are well suited for PU detection due to their pattern recognition capabilities [8], [9]. Further, cooperative spectrum sensing (CSS) can be used when CRs are distributed in different locations. By cooperating, the CRs can achieve higher sensing reliability than individual sensing does [10], [11], since cooperation yields a better solution to the hidden PU problem that arises because of shadowing [12] and channel fading [13].

In cooperative sensing, CRs exchange sensing results with the fusion center for decision making [10], [11]. With hard fusion algorithms, each CR user sends only one bit of information to the fusion center, which indicates whether the received energy is above a particular threshold. For example, the OR-rule [14], AND-rule, counting rule [15], and linear quadratic combining rule [16] can be used for cooperative spectrum sensing. In [17], a softened hard fusion scheme with two-bit overhead for each CR user is considered. In soft decision algorithms [17], [18], the exact received energies of the CRs are transmitted to the fusion center to make a better decision. In [19], the authors propose an optimal linear fusion algorithm for spectrum sensing. Relay-based cooperative spectrum sensing schemes are studied in [20], [21]. The authors in [22] suggest a linear fusion rule for cooperative spectrum sensing, in which the linear coefficient weights are obtained by using Fisher linear discriminant analysis.

Motivation: To the best of our knowledge, other than the Fisher linear discriminant classification in [22], pattern recognition techniques have not been utilized for spectrum sensing in the previous literature. However, SVM and KNN classifications have more predictive power than the Fisher linear discriminant classification. Further, the ability of SVM and KNN to extract linear combinations of features in a higher dimension reduces the complexity of classification.
Moreover, most existing works consider only one PU in the detection area, which is not a practical assumption. Hence, it is important to investigate scenarios with multiple PUs in the detection area while addressing the random PU behavior.

Contribution: We model the PU network as a random geometric network [22], [23] to capture the randomness of PU locations. By considering the received signal strengths at the CRs as features, we investigate the usefulness of pattern recognition techniques for PU detection. More specifically, the SVM and weighted KNN classification techniques are considered for cooperative spectrum sensing. The SVM scheme takes the following three steps: i) gathering training samples, ii)
obtaining support vectors from the training samples, and iii) putting a test sample into the SVM for classification. Note that since the primary users may not explicitly report their locations or interference levels, the training samples may have to be obtained by offline simulations. We treat each instance of PU availability and unavailability as the positive and negative class, respectively. Thus, for SVM, the support vectors for minimizing the classification errors are obtained by maximizing the margin between the separating hyperplane and the data. Then, the effect of different kernels for CSS is quantified in terms of the detection probability and receiver operating characteristic (ROC) curves. In addition, the weighted KNN classification technique is investigated for cooperative sensing with different distance measures. The weight of each feature in the KNN classification is calculated from the area under its ROC curve, and performance is evaluated in terms of the detection probability. The effect of the number of CR devices on the performance of cooperative spectrum sensing is also quantified.

The rest of the paper is organized as follows. In Section II, we present the system model and assumptions. Then, we describe the cooperative sensing algorithms (i.e., SVM and W-KNN) in Section III. Section IV presents the performance evaluation results for the proposed CSS algorithms. Lastly, Section V concludes the paper.

II. SYSTEM MODEL AND ASSUMPTIONS

A. Network Model

We consider a CR network of $N$ secondary users (SUs), where each SU is indexed by $i = 1, \ldots, N$. In the SU network, the detection area ($A \subset \mathbb{R}^2$) of a particular SU refers to the circular area with radius $R$ centred at that SU.
The other SUs (with coordinates $c_i$) located within the detection area of the SU under consideration participate in cooperative spectrum sensing to decide whether any PU-Rx lies in the detection area. The SUs in the detection area transmit their sensing results to the SU at the origin via a common control channel to assist CSS. For efficient fusion, we assume that the relative locations of all SUs are available to each SU in the CR network (for example, by using the global positioning system [GPS]).

The PU network is considered as a peer-to-peer random network. A PU transmitter (PU-Tx) and a PU receiver (PU-Rx) communicate as a pair in a peer-to-peer manner and operate in full-duplex mode through frequency-division duplexing. To capture the randomness of the PU network, we model the locations of PUs as a homogeneous Poisson point process (PPP) with density λ. In general, for a given PPP, the number of PUs within any bounded area of the network is a Poisson random variable, and the numbers of PUs in disjoint areas are independent [26]. An SU can avoid causing interference to a PU-Rx by not transmitting if the SU senses that there is at least one PU-Rx in the detection area. The availability of PU-Tx and PU-Rx in the detection area under the random geometric method is studied in [22], [26], [27], and the same approach is adopted in this network model. The SU and PU network models for
Fig. 1. The PU and SU network model for PU-absent and present cases. (a) PU-absent case (Y = 0). (b) PU-present case (Y = 1).
the PU-present and the PU-absent cases are illustrated in Fig. 1. It is important to note that our PU detection approaches are not limited to any specific system model.

B. Channel Model

We assume that the channel gain from a PU-Tx to an SU undergoes path loss, shadowing, and Rayleigh fading. The channel gain from a PU-Tx situated at $s$ to the $i$-th SU is denoted by $h_i(s)$ and can be expressed as

$$h_i(s) = \rho(|s - c_i|)\,\psi_i(s)\,\omega_i(s) \tag{1}$$

where $\rho(d) = (1 + d^{\alpha})^{-1}$ is the path-loss component for relative distance $d$ and $\alpha$ is the path-loss exponent [23]. $\psi_i(s)$ is the lognormal shadow fading component with zero mean and standard deviation $\sigma$, and $\omega_i(s)$ is the Rayleigh fading component.

C. Energy Detection Model

To sense PU availability, the SUs perform energy detection over a time period $T$ and take $WT$ samples during $T$, where $W$ is the bandwidth of the spectrum. Hence, the $k$-th received signal sample at the $i$-th SU, $z_{i,k}$, can be written as

$$z_{i,k} = \sum_{s \in \Pi_{\mathrm{Tx}}} x_{i,k}(s) + n_{i,k} \tag{2}$$

where $x_{i,k}(s)$ is the $k$-th sample component from the PU-Tx situated at $s$ to the $i$-th SU, $n_{i,k}$ denotes the noise at the $i$-th SU, and $\Pi_{\mathrm{Tx}}$ denotes the set of PU-Tx locations. Thus, the output of the energy detector corresponding to the normalized energy of the $i$-th SU can be written as

$$\gamma_i = \frac{1}{N_0} \sum_{k=1}^{WT} |z_{i,k}|^2 \tag{3}$$

where $N_0 = E[|n_{i,k}|^2]$ is the power spectral density of noise.

D. Primary User Detection

The main goal of the SU network is to identify whether there is a PU-Rx within the detection area $A$ to avoid harmful interference at the PU-Rx. The PU-absent and PU-present cases are illustrated in Fig. 1(a) and 1(b), respectively. Let y denote
a random variable that indicates PU-Rx availability in the detection area $A$, which is

$$y = \begin{cases} 1, & \text{if at least one PU-Rx is in } A \\ -1, & \text{otherwise.} \end{cases} \tag{4}$$

Note that $y$ is determined by using the proposed algorithms (i.e., the SVM and weighted KNN classification algorithms), which will be described in Section III. The sensing result of each user is treated as a feature in the proposed algorithms and utilized for PU-Rx classification. The vector of sensing results that is used for identifying PU-Rx availability in the detection area $A$ is $\Gamma = (\gamma_1, \gamma_2, \ldots, \gamma_N)^T$.

III. COOPERATIVE SENSING ALGORITHMS
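Both classifiers below consume labelled pairs $(\Gamma, y)$ built from the system model of Section II. The following is a minimal simulation sketch of one such pair, not the authors' simulator. Several choices are assumptions made only for illustration: the gain in (1) is treated as a power gain, Rayleigh fading is modelled as a unit-mean exponential power term, shadowing is drawn in dB, the PU signal samples are taken to be complex Gaussian, each PU-Rx is placed exactly ξ m from its PU-Tx in a uniformly random direction, the PPP is sampled on a square window, and the noise power value (about −104 dBm) is hypothetical. All function and parameter names are illustrative.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for moderate means)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_sample(rng, su_coords, density=1e-5, half_side=1000.0,
                    radius=300.0, xi=100.0, alpha=3.0, sigma_db=8.0,
                    tx_power=0.2, noise_power=4e-14, num_samples=100):
    """Return (gamma, y): normalized energies at the SUs and the PU-Rx label."""
    # PU-Tx locations: homogeneous PPP on a square window (assumed window shape).
    n_tx = poisson(rng, density * (2.0 * half_side) ** 2)
    tx = [(rng.uniform(-half_side, half_side), rng.uniform(-half_side, half_side))
          for _ in range(n_tx)]

    # Label as in (4): +1 if at least one PU-Rx falls inside the detection disk.
    y = -1
    for sx, sy in tx:
        theta = rng.uniform(0.0, 2.0 * math.pi)
        rx_x, rx_y = sx + xi * math.cos(theta), sy + xi * math.sin(theta)
        if math.hypot(rx_x, rx_y) <= radius:
            y = 1

    gamma = []
    for cx, cy in su_coords:
        # Received signal variance: sum over PU-Tx of power times the gain in (1).
        signal_var = 0.0
        for sx, sy in tx:
            d = math.hypot(sx - cx, sy - cy)
            path_loss = 1.0 / (1.0 + d ** alpha)
            shadowing = 10.0 ** (rng.gauss(0.0, sigma_db) / 10.0)
            fading = rng.expovariate(1.0)
            signal_var += tx_power * path_loss * shadowing * fading
        # Each sample z in (2) is complex Gaussian given the gains (assumption).
        std = math.sqrt((signal_var + noise_power) / 2.0)
        energy = sum(rng.gauss(0.0, std) ** 2 + rng.gauss(0.0, std) ** 2
                     for _ in range(num_samples))
        gamma.append(energy / noise_power)  # normalized energy as in (3)
    return gamma, y
```

For example, a 7-by-7 grid with a hypothetical 100 m spacing can be built as `[(x, y) for x in range(-300, 301, 100) for y in range(-300, 301, 100)]` and passed as `su_coords`; repeating the call M times yields a training set S.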
A. Support Vector Machine

The SU network gathers $M$ samples of the vector of sensing results tagged with the PU-Rx availability. Let $\Gamma_k$ and $y_k$ denote the $k$-th sample of the vector of sensing results and the PU-Rx availability. Let $S = \{(\Gamma_1, y_1), (\Gamma_2, y_2), \ldots, (\Gamma_M, y_M)\}$ denote a set of training samples, where $\Gamma_k \in$ [...]

$$\min \; \frac{1}{2}\|w\|^2 + C \sum_{k=1}^{M} \mathbb{1}(\delta_k > 1) \quad \text{subject to} \quad y_k[w\phi(\Gamma_k) + w_0] \ge 1 - \delta_k, \;\; \delta_k \ge 0, \;\; k = 1, 2, \ldots, M \tag{7}$$

where $\|w\|^2 = w^T w$ and $C$ is a soft margin constant [8]. The optimization problem defined in (7) is non-convex due to the term $\mathbb{1}(\delta_k > 1)$ in the objective function. Since $\delta_k > 1$ for misclassification, $\sum_k \delta_k$ gives a bound on the number of misclassified training samples (i.e., $\mathbb{1}(\delta_k > 1) \le \delta_k$, $\forall k$). The expression $\sum_k \delta_k$ measures the number of points that are misclassified by $w\phi(\Gamma_k) + w_0 = 0$ as well as the number of points that are correctly classified but lie in the slab $-1 < w\phi(\Gamma_k) + w_0 < 1$. Hence, we can write the optimization problem as a convex optimization as follows:

$$\min \; \frac{1}{2}\|w\|^2 + C \sum_{k=1}^{M} \delta_k \quad \text{subject to} \quad y_k[w\phi(\Gamma_k) + w_0] \ge 1 - \delta_k, \;\; \delta_k \ge 0, \;\; k = 1, 2, \ldots, M. \tag{8}$$

Hence, the Lagrangian of (8) can be written as

$$L(w, w_0, \delta; \alpha, \beta) = \frac{1}{2}\|w\|^2 + C \sum_{k=1}^{M} \delta_k - \sum_{k=1}^{M} \beta_k \delta_k - \sum_{k=1}^{M} \alpha_k \big\{ y_k[w\phi(\Gamma_k) + w_0] - 1 + \delta_k \big\} \tag{9}$$
where $\alpha_k$ and $\beta_k$ are Lagrangian multipliers. By applying the Karush-Kuhn-Tucker (KKT) conditions, we can obtain

$$w = \sum_{k=1}^{M} \alpha_k y_k \phi(\Gamma_k) \tag{10}$$

$$\sum_{k=1}^{M} \alpha_k y_k = 0 \tag{11}$$

$$\alpha_k = C - \beta_k. \tag{12}$$
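For readers who want the intermediate step, conditions (10)-(12) are the stationarity conditions of the Lagrangian (9) with respect to $w$, $w_0$, and $\delta_k$. The sketch below reads $w\phi(\Gamma_k)$ as the inner product $\langle w, \phi(\Gamma_k) \rangle$, which is an interpretation of the notation rather than something stated explicitly in the text.

```latex
\begin{align*}
\frac{\partial L}{\partial w} &= w - \sum_{k=1}^{M} \alpha_k y_k \phi(\Gamma_k) = 0
  &&\Rightarrow\quad w = \sum_{k=1}^{M} \alpha_k y_k \phi(\Gamma_k), \\
\frac{\partial L}{\partial w_0} &= -\sum_{k=1}^{M} \alpha_k y_k = 0
  &&\Rightarrow\quad \sum_{k=1}^{M} \alpha_k y_k = 0, \\
\frac{\partial L}{\partial \delta_k} &= C - \beta_k - \alpha_k = 0
  &&\Rightarrow\quad \alpha_k = C - \beta_k .
\end{align*}
% Substituting these back into (9) eliminates w, w_0 and delta:
\begin{align*}
L &= \frac{1}{2}\|w\|^2 - \|w\|^2 + \sum_{k=1}^{M} \alpha_k
   = \sum_{k=1}^{M} \alpha_k
     - \frac{1}{2} \sum_{k=1}^{M} \sum_{j=1}^{M}
       \alpha_k \alpha_j y_k y_j \langle \phi(\Gamma_k), \phi(\Gamma_j) \rangle .
\end{align*}
```

The last expression depends on the training data only through inner products of the mapped samples, which is what makes the kernel substitution possible.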
It is noticeable that $0 \le \alpha_k \le C$ since $\beta_k \ge 0$. The values of $\alpha_k$ are obtained by solving a quadratic programming problem, and the training samples with $\alpha_k > 0$ are known as support vectors. Hence, we can obtain the dual problem in terms of $\alpha_k$ as follows [8]:

$$\max \; \sum_{k=1}^{M} \alpha_k - \frac{1}{2} \sum_{k=1}^{M} \sum_{j=1}^{M} \alpha_k \alpha_j y_k y_j \langle \phi(\Gamma_k), \phi(\Gamma_j) \rangle \quad \text{subject to} \quad \sum_{k=1}^{M} \alpha_k y_k = 0, \;\; 0 \le \alpha_k \le C, \;\; k = 1, 2, \ldots, M \tag{13}$$

where the notation $\langle \cdot, \cdot \rangle$ denotes the inner product. The KKT conditions characterize the solution of the primal problem through (10), (11), and (12), and the solution of the dual problem through the two complementary slackness conditions $\alpha_k \{ y_k[\langle w, \phi(\Gamma_k) \rangle + w_0] - 1 + \delta_k \} = 0$ and $\beta_k \delta_k = (C - \alpha_k)\delta_k = 0$. Hence, by solving the optimization problem defined in (13), the nonlinear decision function can be obtained as

$$y(\Gamma) = \operatorname{sgn}\left( \sum_{k=1}^{M} \alpha_k y_k K(\Gamma, \Gamma_k) + w_0 \right). \tag{14}$$
However, we still do not know the exact form of $\phi(\Gamma_k)$ needed to solve the quadratic programming problem. This issue is resolved by means of a kernel function, defined as $K(\Gamma_k, \Gamma_j) = \langle \phi(\Gamma_k), \phi(\Gamma_j) \rangle$, so that only inner products in the feature space are required. Some of the commonly used kernel functions are the linear, polynomial, and Gaussian radial basis function (RBF) kernels [8].
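To make the role of the kernel concrete, the sketch below evaluates the decision function (14) for the three kernel families named above. It assumes the multipliers $\alpha_k$ and the offset $w_0$ have already been obtained from a quadratic programming solver, which is not shown. The kernel parameterizations (a unit offset in the polynomial kernel, a `width` parameter in the RBF kernel) and the convention of mapping a zero score to +1 are assumptions, since the text does not specify them.

```python
import math


def linear_kernel(a, b):
    """K(a, b) = <a, b>."""
    return sum(x * y for x, y in zip(a, b))


def polynomial_kernel(a, b, degree=2, offset=1.0):
    """K(a, b) = (<a, b> + offset)^degree; the offset is an assumed convention."""
    return (linear_kernel(a, b) + offset) ** degree


def rbf_kernel(a, b, width=1.0):
    """K(a, b) = exp(-|a - b|^2 / (2 width^2)); width is a free parameter."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def svm_decide(gamma, support, alphas, labels, w0, kernel):
    """Evaluate (14): sign of sum_k alpha_k y_k K(gamma, Gamma_k) + w0."""
    score = sum(a * y * kernel(gamma, g)
                for a, y, g in zip(alphas, labels, support)) + w0
    return 1 if score >= 0 else -1  # a zero score is mapped to +1 (assumption)


# Hand-picked toy values: two support vectors on the margins of a linear SVM.
support = [[1.0, 1.0], [3.0, 3.0]]
labels = [-1, 1]
alphas = [0.25, 0.25]
w0 = -2.0
decision = svm_decide([3.5, 3.0], support, alphas, labels, w0, linear_kernel)
```

With these toy values the linear score reduces to 0.5(γ1 + γ2) − 2, so the two support vectors score exactly −1 and +1. Swapping `linear_kernel` for `rbf_kernel` changes the decision boundary without changing `svm_decide`, which is the practical point of the kernel substitution.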
B. Weighted K-Nearest-Neighbor Classification Technique

The weighted K-nearest-neighbor technique classifies a test point by majority voting among its neighbors. The neighbors are training points previously loaded into the cognitive classifier. The set of the K nearest training points must first be determined by calculating the weighted distances between the test point and each training point. The weight of each feature (i.e., sensing result) is determined by calculating the area under the receiver operating characteristic curve of the corresponding feature [25]. The per-feature distance between a training point and the test point is then multiplied by the corresponding weight to obtain the relative distance between the two points. In this study, we implement a number of distance measures to calculate the distance, $\Delta(\Gamma_k, \Gamma, A_w)$, between the test sample $\Gamma$ and the training sample $\Gamma_k$, where $A_w = \{A_{w_1}, A_{w_2}, \ldots, A_{w_N}\}$ and $A_{w_i}$, $1 \le i \le N$, is the area under the ROC curve (AuROC) of the $i$-th feature. The squared Euclidean distance is the summation of the squared weighted distances over all features, $\sum_{i=1}^{N} |A_{w_i}(\Gamma_k(\gamma_i) - \Gamma(\gamma_i))|^2$. The Cityblock (Manhattan) distance is based on the absolute differences of the dimensions. The weighted-KNN cooperative spectrum sensing algorithm is presented in Algorithm 1.

Algorithm 1 Weighted-KNN Algorithm for CSS
Input:
  S: Set of training samples with corresponding labels
  Γ: Test sample
  K: Number of neighbors
  Aw: {Aw1, Aw2, · · · , AwN} // AuROC of all features
1: for k = 1 to M do
2:   Calculate distance ∆(Γk, Γ, Aw)
3: end for
4: Sort ∆(Γk, Γ, Aw) // Ascending order
5: Select the first K entries
6: Count the number of labels in K entries
7: Label ← Most frequent class label
Output: Class label of the test sample
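Algorithm 1 can be sketched in a few lines. The version below is an illustrative reading of the description above, not the authors' implementation: the AuROC of each feature is computed with the rank (Mann-Whitney) statistic, the raw AuROC values are used directly as weights without normalization (an assumption), and the Cityblock distance applies the same weights to the absolute differences (also an assumption, since the text gives the weighted formula only for the squared Euclidean case). Function names are hypothetical.

```python
from collections import Counter


def auroc(scores, labels):
    """AuROC of one feature: fraction of (positive, negative) pairs ranked
    correctly, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_distance(a, b, weights, measure="sqeuclidean"):
    """Weighted distance between two feature vectors."""
    if measure == "sqeuclidean":
        return sum((w * (x - y)) ** 2 for w, x, y in zip(weights, a, b))
    if measure == "cityblock":
        return sum(abs(w * (x - y)) for w, x, y in zip(weights, a, b))
    raise ValueError(f"unknown measure: {measure}")


def weighted_knn(train, labels, test, k, weights, measure="sqeuclidean"):
    """Algorithm 1: sort training samples by weighted distance to the test
    sample and return the most frequent label among the first k."""
    ranked = sorted((weighted_distance(g, test, weights, measure), y)
                    for g, y in zip(train, labels))
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 separates the classes, feature 1 is mostly noise.
train = [[1.0, 5.0], [1.2, 9.0], [0.9, 1.0], [3.0, 4.0], [3.2, 8.0], [2.9, 2.0]]
labels = [-1, -1, -1, 1, 1, 1]
weights = [auroc([g[i] for g in train], labels) for i in range(2)]
predicted = weighted_knn(train, labels, [2.8, 6.0], k=3, weights=weights)
```

In the toy data the informative feature receives weight 1 and the noisy one a weight near 0.44, so the noisy feature contributes less to the distance. An odd K avoids ties in the vote for the two-class problem.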
IV. NUMERICAL RESULTS AND DISCUSSIONS

In the numerical study, the density of PU-Tx is $10^{-5}$ m$^{-2}$ and each PU-Rx is uniformly located ξ m away from its PU-Tx. The PU-Rx location lies within a disk area centred at its PU-Tx. The SUs participating in cooperative sensing are located in a 7-by-7 (49 SUs) grid topology unless otherwise specified. The important simulation parameters are as follows: the channel bandwidth $W$ is 10 MHz, the sensing duration $T$ is 100 µs, the noise spectral density $N_0$ is −174 dBm/Hz, the path-loss exponent is 3, and the standard deviation of shadowing is 8 dB. The transmit power of each PU-Tx is taken as 200 mW. The proposed algorithms are implemented in MATLAB 7.9.1 (R2009b) on a 64-bit computer with a Core i3 processor, a clock speed of 2.4 GHz, and 4 GB of RAM.
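A few quantities implied by these parameters are worth working out, since they set the scale of the experiment. The short calculation below is only a back-of-the-envelope sketch. It uses the 300 m detection radius quoted in the figure captions, and it assumes the noise figure is a spectral density per hertz, so that the in-band noise power is the density plus 10 log10(W).

```python
import math

bandwidth_hz = 10e6          # W
sensing_time_s = 100e-6      # T
pu_density = 1e-5            # PU-Tx per square metre
detection_radius_m = 300.0   # from the figure captions
noise_psd_dbm_hz = -174.0    # assumed to be per hertz

# Number of samples per sensing window, W*T in (3).
samples_per_window = round(bandwidth_hz * sensing_time_s)

# Mean number of PU-Tx inside a disk of the detection radius (PPP property).
expected_pu_tx = pu_density * math.pi * detection_radius_m ** 2

# Probability that such a disk contains no PU-Tx at all.
prob_no_pu_tx = math.exp(-expected_pu_tx)

# In-band noise power in dBm over the 10 MHz channel.
noise_power_dbm = noise_psd_dbm_hz + 10.0 * math.log10(bandwidth_hz)

print(samples_per_window, round(expected_pu_tx, 2),
      round(prob_no_pu_tx, 3), round(noise_power_dbm, 1))
```

So each energy measurement averages 1000 samples, a detection-sized disk holds about 2.8 PU-Tx on average, and under the per-hertz assumption the noise floor is about −104 dBm. Note that the class label depends on PU-Rx positions rather than PU-Tx positions, so the empty-disk probability is indicative only.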
TABLE I
AVERAGE SVM TRAINING DURATION (SECONDS)

Training samples   Linear     Polynomial-2   RBF
10                 0.0039     0.0031         0.0031
100                0.1047     0.1182         0.0094
200                0.7306     0.7057         0.0160
300                1.7275     1.3756         0.0309
400                4.5504     3.3646         0.2614
500                11.7802    9.3676         1.1606
Table I lists the average training duration of the SVM classifier for different kernel functions. The SVM classifier with the radial basis function (RBF) kernel takes a considerably shorter training duration than the linear and second-order polynomial kernels. Specifically, RBF takes 1.1606 seconds to train its classifier with 500 samples, whereas the Linear and Polynomial-2 kernels take 11.7802 and 9.3676 seconds, respectively. For the KNN classifier, the average training time is measured as the time needed to upload the training samples to the classifier, which is approximately 50 µs for 1000 samples. Hence, the KNN classifier can change its training samples more promptly than the SVM.

TABLE II
AVERAGE SVM CLASSIFICATION DURATION (SECONDS)

Training samples   Linear         Polynomial-2   RBF
10                 9.03 × 10^-7   1.08 × 10^-6   2.26 × 10^-6
100                1.52 × 10^-6   1.68 × 10^-6   7.50 × 10^-6
500                7.28 × 10^-6   1.31 × 10^-5   4.01 × 10^-5
1000               9.94 × 10^-6   1.61 × 10^-5   7.33 × 10^-5
TABLE III
AVERAGE KNN CLASSIFICATION DURATION (SECONDS)

Training samples   Sq. Euclidean   Cityblock
10                 5.76 × 10^-4    5.07 × 10^-4
100                5.87 × 10^-3    5.17 × 10^-3
500                2.51 × 10^-2    2.50 × 10^-2
1000               5.92 × 10^-2    5.21 × 10^-2
The time taken to classify PU availability/unavailability within the considered detection area by the SVM and KNN classifiers is shown in Table II and Table III, respectively. It is important to notice that the SVM has more potential for practical implementation, since its classification time for a single sample is less than 1 × 10^-4 seconds. However, although the RBF classifier takes a comparatively shorter training duration, its PU detection time is higher than that of the Linear and Polynomial-2 kernels. Table IV shows the correct PU detection capability of the SVM classifier for different kernel functions, namely the Linear, Polynomial-2, and RBF kernels. It is clearly noticeable that the SVM with the RBF kernel performs well in all cases for PU detection. The higher classification complexity of RBF is thus compensated by its higher correct PU detection probability. The correct PU detection percentage of KNN classifier for
TABLE IV
CORRECT PU DETECTION PERCENTAGE OF SVM FOR DIFFERENT KERNEL METHODS (7 × 7)

Training samples   Linear           Polynomial-2     RBF
10                 81.43 ± 4.21%    61.01 ± 7.93%    91.21 ± 2.61%
100                84.85 ± 2.14%    63.03 ± 6.25%    92.06 ± 1.84%
500                88.01 ± 1.62%    67.21 ± 4.54%    93.15 ± 1.03%
1000               91.35 ± 1.21%    71.40 ± 3.12%    94.01 ± 0.79%

TABLE V
CORRECT PU DETECTION PERCENTAGE OF KNN FOR DIFFERENT DISTANCE MEASURES (7 × 7)

Training samples   Sq. Euclidean    Cityblock
10                 83.92 ± 3.49%    81.18 ± 4.91%
100                86.90 ± 2.55%    85.58 ± 4.71%
500                89.09 ± 1.44%    86.55 ± 3.86%
1000               91.75 ± 1.11%    87.03 ± 3.02%
5000               91.40 ± 0.83%    87.94 ± 1.37%
10000              91.92 ± 0.36%    88.55 ± 0.89%

Fig. 2. ROC curves for different cooperative sensing schemes when ξ = 100 m, detection radius is 300 m, and 1000 training samples are used. (Axes: false alarm probability vs. detection probability. Curves: KNN - City Block, KNN - Sq. Euclidean, Fisher Linear, AND, OR, SVM - Linear, SVM - RBF, SVM - Polynomial 2.)

Fig. 3. SVM classifiers comparison with different number of SU cooperation when ξ = 100 m, detection radius is 300 m, and 1000 training samples are used. (Axes: false alarm probability vs. detection probability. Curves: SVM - RBF, SVM - Linear, SVM - Polynomial-2, for 3×3, 5×5, and 7×7 SUs in cooperation.)
different distance measures and numbers of training samples is shown in Table V. The results clearly reveal that the squared Euclidean distance measure performs better than Cityblock in all cases. Further, increasing the number of training samples generally yields higher PU detection capability with a smaller deviation. Fig. 2 compares the performance of the cooperative spectrum sensing schemes in terms of receiver operating characteristic curves. The figure shows that cooperative sensing based on the SVM with Linear and RBF kernels, and on the KNN with squared Euclidean and Cityblock distance measures, outperforms the existing state-of-the-art cooperative sensing techniques such as the Fisher linear discriminant, AND-rule, and OR-rule. Specifically, the simple KNN classifier performs better than the SVM-RBF, SVM-Linear, and SVM-Polynomial-2 (degree-2 polynomial kernel) classifiers in the false alarm regions of 0−0.1, 0−0.3, and 0−0.4, respectively. Further, the SVM-Polynomial-2 classifier also performs comparatively better than the Fisher linear
Fig. 4. KNN classifiers comparison with different number of SU cooperation when ξ = 100 m, detection radius is 300 m, and 1000 training samples are used. (Axes: false alarm probability vs. detection probability. Curves: Sq. Euclidean, Cityblock.)
discriminant in the false alarm region of 0.4−1, approximately. The KNN achieves higher performance than the existing cooperative spectrum sensing algorithms by exploiting localized information, whereas the SVM classifier achieves higher detection probability by mapping the feature space to a higher dimension with the help of a kernel function.

Fig. 3 quantifies the effect of the number of SUs participating in cooperative spectrum sensing on SVM classification. It shows that the detection probability of a particular classifier at a particular false alarm probability decreases as the number of SUs decreases. However, the higher detection probability obtained with more SUs comes at the cost of increased signaling overhead. Further, the SVM classifier based on the RBF kernel achieves higher PU detection probability than the SVM-Linear and Polynomial-2 classifiers.

The detrimental effect on KNN classification of decreasing the number of SUs in cooperation is quantified in Fig. 4. It is clear that the ROC performance degrades as the number of SUs participating in cooperation decreases, and that the KNN based on the Euclidean distance outperforms the Cityblock in all cases.

V. CONCLUSION

In this paper, we model the PU network as a random geometric network to capture the randomness of the primary user locations. In this regard, we consider a cooperative spectrum sensing approach for PU detection in CR networks based on pattern recognition techniques, namely, the support vector machine and the weighted K-nearest-neighbor classification techniques. The received signal strengths at the SUs in the detection area are used as the feature set for PU detection. We quantify the classifier performance in terms of training time, classification time, and receiver operating characteristic curves. The proposed SVM classifier achieves higher detection probability than the existing algorithms by mapping the feature space to a higher-dimensional space with the help of kernel functions: the Linear kernel, the RBF kernel, and the second-order polynomial kernel. The simple weighted KNN classifier outperforms the proposed SVM classifiers in the low false alarm region by exploiting the localized information from the training set. In practical implementation, it is crucial to obtain an accurate training set, since inaccurate training of the classifier leads to unreliable decisions. Note that since the proposed method calculates the support vectors from training samples obtained by Monte Carlo simulations, we do not have to confine the PU network model to a simple PPP-based peer-to-peer network model. The proposed approaches can be used for other random network models, each of which describes a representative wireless system, for example, the random connection model in [23] for ad hoc networks, the Matérn hardcore point process for CSMA/CA-based networks [26], [27], and the Voronoi tessellation model for cellular networks.
ACKNOWLEDGMENT

This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

REFERENCES

[1] J. Mitola, “Cognitive radio: An integrated agent architecture for software defined radio,” Ph.D. dissertation, KTH, 2000.
[2] E. Hossain and V. K. Bhargava, Cognitive Wireless Communication Networks. Springer, 2007.
[3] S. Haykin, “Cognitive radio: Brain-empowered wireless communications,” IEEE Journal on Selected Areas in Communications, vol. 23, no. 2, pp. 201–220, February 2005.
[4] E. Hossain, D. Niyato, and Z. Han, Dynamic Spectrum Access and Management in Cognitive Radio Networks. Cambridge University Press, 2009.
[5] H. Huang, Z. Zhang, P. Cheng, and P. Qiu, “Opportunistic spectrum access in cognitive radio system employing cooperative spectrum sensing,” in Proc. of IEEE Vehicular Technology Conference (VTC’09-Spring 2009), pp. 1–5, 26-29 April 2009.
[6] T. Yucek and H. Arslan, “A survey of spectrum sensing algorithms for cognitive radio applications,” IEEE Communications Surveys and Tutorials, vol. 11, no. 1, pp. 116–130, First Quarter 2009.
[7] C. Li and C. Li, “Opportunistic spectrum access in cognitive radio networks,” IEEE International Joint Conference on Neural Networks (IJCNN’08), pp. 3412–3415, 1-8 June 2008.
[8] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning J., vol. 20, no. 3, pp. 273–297, 1995.
[9] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. 2nd Edition, Wiley, New York, 2001.
[10] Y. Zeng, Y.-C. Liang, A. T. Hoang, and R. Zhang, “A review on spectrum sensing for cognitive radio: Challenges and solutions,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 381465, 15 pages, 2010.
[11] I. F. Akyildiz, B. F. Lo, and R. Balakrishnan, “Cooperative spectrum sensing in cognitive radio networks: A survey,” Physical Communication, vol. 4, no. 1, pp. 40–62, March 2011.
[12] D. Cabric, S. M. Mishra, and R. Brodersen, “Implementation issues in spectrum sensing for cognitive radios,” in Proc. of 38th Asilomar Conf. on Signals, Systems, and Computers, Pacific Grove, CA, November 2004.
[13] A. Sahai, R. Tandra, S. M. Mishra, and N. Hoven, “Fundamental design tradeoffs in cognitive radio systems,” in Proc. of TAPAS’06, Boston, MA, August 2006.
[14] A. Ghasemi and E. S. Sousa, “Spectrum sensing in cognitive radio networks: The cooperation-processing tradeoff,” Wireless Commun. and Mobile Comput., vol. 7, no. 9, pp. 1049–1060, November 2007.
[15] E. C. Peh, Y.-C. Liang, Y. L. Guan, and Y. Zeng, “Optimization of cooperative sensing in cognitive radio networks: A sensing-throughput tradeoff view,” IEEE Trans. Veh. Technol., vol. 58, no. 9, pp. 5294–5299, November 2009.
[16] J. Unnikrishnan and V. V. Veeravalli, “Cooperative sensing for primary detection in cognitive radio,” IEEE J. Sel. Topics in Signal Process., vol. 2, no. 1, pp. 18–27, February 2008.
[17] J. Ma, G. Zhao, and Y. Li, “Soft combination and detection for cooperative spectrum sensing in cognitive radio networks,” IEEE Trans. Wireless Commun., vol. 7, no. 11, pp. 4502–4507, December 2008.
[18] G. Ding, Q. Wu, J. Wang, and X. Zhang, “Joint cooperative spectrum sensing and resource scheduling for cognitive radio networks with soft sensing information,” in Proc. of IEEE Youth Conference on Information Computing and Telecommunications (YC-ICT), pp. 291–294, 28-30 November 2010.
[19] Z. Quan, W. Ma, S. Cui, and A. H. Sayed, “Optimal linear fusion for distributed detection via semidefinite programming,” IEEE Trans. Signal Process., vol. 58, no. 4, pp. 2431–2436, April 2010.
[20] G. Ganesan and Y. G. Li, “Cooperative spectrum sensing in cognitive radio - part I: Two user networks,” IEEE Transactions on Wireless Communications, vol. 6, no. 6, pp. 2204–2213, June 2007.
[21] W. Zhang and K. B. Letaief, “Cooperative spectrum sensing with transmit and relay diversity in cognitive radio networks,” IEEE Transactions on Wireless Communications, vol. 7, no. 12, pp. 4761–4766, December 2008.
[22] K. W. Choi, E. Hossain, and D. I. Kim, “Cooperative spectrum sensing under a random geometric primary user network model,” IEEE Transactions on Wireless Communications, vol. 10, no. 6, pp. 1932–1944, June 2011.
[23] M. Haenggi, J. Andrews, F. Baccelli, O. Dousse, and M. Franceschetti, “Stochastic geometry and random graphs for the analysis and design of wireless networks,” IEEE Journal on Selected Areas in Communications, vol. 27, no. 7, pp. 1029–1046, September 2009.
[24] J. S. Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge, UK, 2004.
[25] T. Fawcett, “ROC Graphs: Notes and Practical Considerations for Data Mining Researchers,” Technical Report HPL-2003-4, HP Labs, 2003.
[26] H. ElSawy, E. Hossain, and S. Camorlinga, “Characterizing random CSMA wireless networks: A stochastic geometry approach,” in Proc. of IEEE Int. Conf. on Communications (ICC’12), Ottawa, Canada, 10-15 June 2012.
[27] H. ElSawy and E. Hossain, “Modeling random CSMA wireless networks in general fading environments,” in Proc. of IEEE Int. Conf. on Communications (ICC’12), Ottawa, Canada, 10-15 June 2012.