
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSAC.2017.2680918, IEEE Journal on Selected Areas in Communications IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. 8, AUGUST 2015


From Prediction to Action: Improving User Experience with Data-Driven Resource Allocation

Yanan Bao, Student Member, IEEE, Huasen Wu, Member, IEEE, Xin Liu, Member, IEEE

Abstract—Driven by the desire for better user experience and enabled by improved data storage and processing, much recent work has studied user experience prediction in cellular networks. In this paper, moving beyond the prediction-only approach, we propose a data-driven resource allocation framework that uses data-generated prediction models to explicitly guide resource allocation for user experience improvement. In a closed-loop fashion, it further leverages and verifies the causal relation that often exists between certain feature values (e.g., bandwidth) and user experience in computer networks. As a case study, we consider how to reduce the number of user complaints in cellular networks. Our approach consists of three components: we train a logistic regression classifier to predict user experience, utilize the trained likelihood as the objective function to allocate network resource, and then evaluate user experience with allocated resource to (in)validate and adjust the original model. We design a DualHet algorithm to tackle the problem of multi-dimensional resource optimization with heterogeneous users. Numerical simulations based on both synthetic and real network datasets demonstrate the effectiveness of the proposed algorithms. In particular, the simulations based on real data demonstrate up to 2x performance improvement compared with the baseline algorithm. Index Terms—Data-driven networking, machine learning, resource allocation, non-convex optimization.

I. INTRODUCTION
With the explosive growth of wireless data traffic and various mobile applications, both industry and academia are increasingly focusing on user experience. In general, for the service provided, user experience determines user engagement, and therefore affects the revenue and long-term development of a company. With the increase of storage and computation capacity, the analysis and prediction of user experience becomes more feasible. A large body of literature discusses how user experience is learned and predicted using machine learning techniques [1], [2], [3], [4], [5]. In many cases, however, user experience prediction itself is not the ultimate goal. Normally, we hope to proactively identify users with poor experience and take proper actions to improve it. For instance, cellular operators receive complaints about the data services from their customers. Based on real-time network performance indicators, the complaints can be predicted to a certain degree. Then, if network operators can allocate more wireless resources to the users with poor experience, it is possible that the complaints can be avoided

The authors are with the Department of Computer Science, University of California, Davis, CA 95616, USA e-mail: ({ynbao, hswu, xinliu}@ucdavis.edu). Manuscript received September 21, 2016; revised January 13, 2017. This work was supported in part by the National Science Foundation (NSF) under Grant CNS-1547461, CNS-1457060 and CCF- 1423542.

Fig. 1. A closed-loop framework in data-driven resource allocation. (Block diagram; labels: Classifier Construction, Initial Training Data, Labeled Data, Trained Classifier, Users w/o Label, Available Resource, Resource Allocation, Users w/ Updated Features, Evaluation & Data Sampling.)

proactively. We are facing a natural problem: given limited resources, how to allocate them to multiple users to optimize the overall experience? To answer this question, we advocate a closed-loop approach that uses data-generated prediction models to explicitly guide resource allocation for user experience optimization. This approach is illustrated in Fig. 1. First, based on a historical dataset with labeled user experience, we construct an appropriate user experience prediction model to reflect the correlation between feature values and user experience. Then, we feed the model into the resource allocation component as the objective function to optimize resource allocation for incoming users. The output is an appropriate resource allocation and users with improved feature values. Last, the evaluation and data sampling component samples data after resource allocation, validates or invalidates the model, and adjusts the constructed prediction model as needed. The details are given in Sec. III-A. In this framework, we leverage existing machine learning methods for user experience prediction. Specifically, in this paper, we use the logistic regression model. We focus on the resource allocation algorithms for the trained model and discuss how to adjust the classification model based on evaluating resource allocation results to further improve performance. The proposed framework has two benefits. First, the constructed classifier illustrates a quantitative relationship between the feature values and the user experience. Using such a quantitative relationship and domain knowledge, we are able to allocate network resources more precisely to reduce the expected number of users with poor experience, in contrast to the typical approach of using abstract utility functions for resource allocation. 
Second, the framework includes an evaluation component, where users are sampled after resource allocation to validate or invalidate the causal relationship hypothesis between the feature values and the user experience.

0733-8716 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


This step also provides further opportunities to adjust the constructed prediction model.

The proposed framework has several challenges. First, it is typically more challenging to optimize resource allocation based on prediction models derived from real data than to use utility functions with nice properties such as convexity [6], [7]. In our data-driven resource allocation problem, the likelihood function of user complaint based on logistic regression is non-convex. Therefore, gradient-type methods are unlikely to achieve the global optimum for these problems. Furthermore, there are typically many users in the network and different users have different sensitivities to the network parameters. This heterogeneity makes it infeasible to find the optimal solution by exhaustive search when the network scale is large. Moreover, when there are multiple types of resources, the problem becomes more difficult as resources are coupled.

In this paper, we present a holistic solution using the closed-loop framework in data-driven resource allocation. Specifically, we make the following contributions:
• We propose the closed-loop framework in data-driven resource allocation. Two novel aspects in this framework are 1) the resource allocation component, where we explicitly use the constructed prediction model as the objective function to optimize resource allocation [8]; and 2) the evaluation and data sampling component, which validates or invalidates the model, and adjusts the constructed prediction model as needed [9].
• We propose a DualHet algorithm to obtain near optimal solutions for the logistic-regression-based optimization problem (Sec. V). Different from [10], the algorithm can obtain near optimal solutions in scenarios with multiple types of resources, where each type further has heterogeneous effects on users. Moreover, we propose a perturbed version of DualHet, referred to as ϵ-Perturbed DualHet, that leverages the closed-loop feedback to continuously adjust the classifier and improve the QoE performance.
• Finally, we evaluate the algorithms based on simulations with both synthetic and real-world cellular datasets (Sec. VI). Results indicate that our designed algorithms can reduce the expected number of user complaints by up to 2x compared to an optimized baseline. We compare our solution with the upper bounds obtained from the dual problem, and the results show that the gap is less than 3%.

II. RELATED WORK
User experience in cellular networks has been studied extensively in recent years [1], [2], [11], [3], [12], [13], [14], [15], [16]. In [1], the authors use a month-long anonymous dataset collected from a cellular network provider to study Quality of Experience (QoE) metrics including session length, abandonment rate, and partial download ratio. The relation between mobile video streaming performance and user engagement from the perspective of network operators is discussed in [2]. Using 27 TB of video streaming traffic from more than 37 million flows, the authors observe strong correlations between many network features and the abandonment or skip rates. [11] uses controlled experiments and supervised learning (particularly, decision tree) to predict QoE for Skype calls based


on network-level measurements. The authors conclude that measuring delay, bandwidth, and loss rate can achieve 83% accuracy in predicting QoE. In [3], the authors study QoE prediction for mobile video services using a temporal quality metric and linear learning models.

A large body of literature considers learning-based cost-efficient decision making. Theoretically, our problem could be formulated as a reinforcement learning problem [17]. However, we would face the curse of dimensionality if we directly solved it with the standard reinforcement learning approach, due to the large state and action space (as we have many users with possibly different states in the networks). There is another line of study on the combination of statistical learning and decision. In [18], the authors discuss the pipeline of data collection, predictive model, and decision analysis. The problem of patient readmission in hospitals with congestive heart failure is considered by [19]. The authors construct a classifier to predict readmissions and propose to use patient-specific interventions to reduce the cost. The combination of prediction and allocating interventions shows a reduction of both rehospitalization rate and cost. Considering the resource allocation as the decision, [20] uses a prediction engine to estimate the performance of a given resource allocation and a genetic algorithm to find an optimized solution for an Infrastructure-as-a-Service (IaaS)-based cloud system. [21] uses the collected information from system behaviors to predict power consumption levels, CPU loads and SLA (Service Level Agreement) timings to improve scheduling decisions in data centers. [22] proposes a learning-based approach of power control for the uplink interference management in 4G cellular networks. However, none of the existing work, to the best of our knowledge, has considered the data-driven network resource allocation problem for improving user QoE. Moreover, the methods they considered do not apply to resource allocation in our setting, due to the complex user experience model and the large number of users.

Network Utility Maximization (NUM) has been extensively studied, e.g., in [10], [23], [7]. The difference between our work and their work lies in two aspects: 1) Our utility function is learned from real datasets, and thus is more complicated; 2) Our problem includes multiple types of resources, which makes the problem more challenging.

Optimizing the sum of separable utility functions is discussed in [24], [25], [26]. Our work focuses on the case where users are heterogeneous and parameters are independently randomly distributed real numbers. Compared with results in [25], [26], our algorithm has lower complexity while achieving the same theoretical performance.

The framework of data-guided resource allocation is studied in our previous work [8] and [9]. [8] applies the classifier learned on labeled data to guide resource allocation on unlabeled data. However, [8] considers a simple scenario where there is only one type of resource, with homogeneous users. In contrast, this work considers more general and complex scenarios with multiple resources and heterogeneous effects of resources on different users. [9] considers the same framework but focuses on the neural-network prediction model, which is hard to solve efficiently and analyze, and only heuristic



algorithms are proposed.

III. PROBLEM STATEMENT
In this paper, we study the problem of reducing the number of customer complaints at mobile operators. In particular, we consider a tier-1 operator in a city in southern China. The city has a population of about 6 million and the operator has a user penetration of 2/3. The operator receives an average of 600 complaints a day from customers related to the network service quality. The goal of the operator is to proactively reduce the average number of user complaints by taking appropriate actions. We describe our framework for achieving this goal as follows.


TABLE I
FEATURES OF THE MOBILE USER COMPLAINT DATASET.

No.  Feature Name             Sample Value  Weights of LR
0    Intercept                1             2.58
1    PDP Succ(%)              0∼100         -0.0224
2    ATT Succ(%)              0∼100         -0.0210
3    RAU Succ(%)              0∼100         -0.0113
4    Sess Succ/Sess Req       0.4           -1.33
5    Session Req              15            -1.31e-5
6    Connection attempts      10            -7.27e-5
7    Unexpected line drops    0             0.0251
8    Core-network Fail(%)     0∼100         0.0112
9    Radio-network Fail(%)    0∼100         0.0349
10   Transmission Succ(%)     0∼100         -0.0270
11   ThroughputD/TrafficD     2             -3.31
12   Traffic Downlink(KB)     7.45          9.51e-6
13   Traffic Uplink(KB)       3.69          1.65e-4
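To make the weights in Table I concrete, the following minimal sketch evaluates the logistic regression output 1/(1 + exp(−w·x)) for one feature vector. The weights are copied from Table I. The sample values are taken from Table I where a single value is listed; the percentage features (1-3 and 8-10) are listed only as ranges, so the values chosen for them below are hypothetical, as are the function names.

```python
import math

# Logistic regression weights from Table I (index = feature number; 0 is the intercept).
WEIGHTS = [2.58, -0.0224, -0.0210, -0.0113, -1.33, -1.31e-5, -7.27e-5,
           0.0251, 0.0112, 0.0349, -0.0270, -3.31, 9.51e-6, 1.65e-4]

# Sample values from Table I where one value is listed. Features 1-3 and 8-10
# are given only as ranges (0~100); the values used for them are hypothetical.
SAMPLE = [1, 100, 100, 100, 0.4, 15, 10, 0, 0, 0, 100, 2, 7.45, 3.69]


def complaint_score(features, weights=WEIGHTS):
    """Return 1 / (1 + exp(-w . x)), the logistic regression output."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def with_feature(features, index, value):
    """Copy a feature vector with one entry replaced."""
    updated = list(features)
    updated[index] = value
    return updated


if __name__ == "__main__":
    base = complaint_score(SAMPLE)
    # Feature 11 has a negative weight, so raising it lowers the score;
    # Feature 9 has a positive weight, so raising it increases the score.
    print(base)
    print(complaint_score(with_feature(SAMPLE, 11, 3)))
    print(complaint_score(with_feature(SAMPLE, 9, 50)))
```

The two perturbed calls only illustrate the direction in which each controllable feature moves the score; they are not results from the paper.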

A. Data-Driven Resource Allocation Framework
To achieve this goal, we advocate a data-driven resource allocation framework, first proposed in [8] and then enhanced in [9]. As shown in Fig. 1, the framework has three components: classifier construction, resource allocation, and evaluation & data (re)sampling.

First, we start with a historical dataset with labeled user experience and the corresponding feature values (including network performance metrics). Based on the data, we construct a user experience prediction model to reflect the correlation between feature values and user experience. In this step, standard machine learning techniques, such as logistic regression, Support Vector Machines (SVM), random forests, and neural networks, can be applied. In different application scenarios, they may differ in prediction performance, and they also result in different complexity in the resource allocation component. For example, in this work, we use logistic regression as it provides the best prediction result on our dataset. Furthermore, it is relatively simple, which allows efficient resource allocation as shown in Sec. V. In [9], a neural network model is applied because of its generality, and the corresponding resource allocation is much more computationally expensive.

After constructing the prediction model based on labeled data, we feed the model into the resource allocation component as the objective function for optimal resource allocation. The key intuition here is that the constructed model provides the best available indication of the quantitative relationship between the user features and experience. Therefore, such information helps guide resource allocation in a more quantitative manner. The output here is an appropriate resource allocation result and users with improved feature values.

Last, the evaluation and data sampling component samples data after resource allocation, validates or invalidates the model, and adjusts the constructed prediction model as needed. This component is crucial in real-world implementations. The prediction model constructed in the first step shows correlation, not necessarily causation, between the feature values and user experience. In addition to domain knowledge and field experience, this step allows us to validate or invalidate the causal relationship. Furthermore, because resource allocation may change the distribution of feature values, the original prediction model may need to be adjusted.
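The data flow among the three components can be sketched as a simple loop. This is only a skeleton under stated assumptions: the callables `train`, `allocate`, and `observe` are hypothetical placeholders for classifier construction, resource allocation, and evaluation & data sampling, and retraining on all accumulated data after every batch is just one possible way of adjusting the model.

```python
from typing import Callable, List, Sequence, Tuple

Features = List[float]
Labeled = Tuple[Features, int]  # (feature vector, label: 1 = complaint, 0 = normal)


def closed_loop(
    train: Callable[[Sequence[Labeled]], object],
    allocate: Callable[[object, Sequence[Features]], List[Features]],
    observe: Callable[[Sequence[Features]], List[int]],
    labeled: List[Labeled],
    batches: Sequence[Sequence[Features]],
) -> object:
    """Run the three components once per batch of unlabeled users."""
    model = train(labeled)                    # 1) classifier construction
    for users in batches:
        updated = allocate(model, users)      # 2) resource allocation
        labels = observe(updated)             # 3) evaluation & data sampling
        labeled = labeled + list(zip(updated, labels))
        model = train(labeled)                # adjust the model with the new samples
    return model


if __name__ == "__main__":
    # Toy stand-ins, used only to show the data flow.
    def toy_train(data):
        return {"n_samples": len(data)}

    def toy_allocate(model, users):
        return [[x + 1.0 for x in u] for u in users]

    def toy_observe(users):
        return [0 for _ in users]

    print(closed_loop(toy_train, toy_allocate, toy_observe,
                      [([0.0], 1)], [[[1.0], [2.0]], [[3.0]]]))
```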

B. Applying to the Cellular Network Data
The dataset we obtained from the operator contains 569170 normal users and 1275 complaining users, which illustrates the highly imbalanced nature of user complaints. The dataset is obtained from the network monitoring system that records 13 network performance indicators for each user. The recorded numbers are one-hour averages for each user. The dataset has been pre-screened to contain complaints only about data services. Complaints for other reasons, e.g., billing issues, have already been filtered out from the dataset.

Table I shows the features and their sample values. Feature 0 is a constant, decided by the logistic regression model described in Sec. IV-B. Features 1-4, 7-8, and 10 are Packet Data Protocol (PDP) success percentage, Attachment (ATT) success percentage, Routing Area Update (RAU) success percentage, session success ratio, unexpected line drops, core network failure percentage, and transmission success ratio, respectively. These features are network features mostly related to core networks or existing radio front end characteristics. We consider these features “uncontrollable” in this study, i.e., we cannot change the feature values by allocating resources. Furthermore, Features 5 (requested sessions), 6 (attempted connections) and 12-13 are traffic characteristics of users, again considered “uncontrollable”. The “controllable” features include Features 9 and 11, which are the radio network failure percentage and the ratio of downlink throughput to the downlink traffic volume (Feature 12), respectively.

In summary, for each user in the dataset, we have its hourly network measurements, as well as its label: a user is labeled as a complaining user (i.e., a positive, following the convention in the machine learning community) if the user called the customer service at least once during the hour, and normal (i.e., negative) otherwise.
Based on this dataset, one can build a prediction model that correlates network performance metrics and the likelihoods of user complaints, which is discussed in more detail in Sec. IV-B. Based on the prediction model, the next step is to proactively allocate resources to reduce the expected number of user complaints. In this work, we only consider the resources that a base station (BS) can allocate to users in its coverage. In particular, a BS can allocate two types of resources to the users: bandwidth and local proactive reconnection. Allocating more bandwidth increases the throughput of the recipient, and thus increases the value of Feature 11, and local proactive reconnection helps the user connect to the BS when a connection failure happens, which improves the value of Feature 9 (by reducing its value).

We note that in general the prediction model demonstrates correlation instead of causation. In this case, domain knowledge plays an important role in deciding the causation, which is similar to the traditional utility-based resource allocation, where a causal relationship between the resource allocation and overall utility is assumed. The benefit of the data-driven approach here is that the prediction model captures a quantitative relationship between the network metrics and the user experience. By leveraging this quantitative relationship, the prediction model allows us to improve the QoE of users more explicitly, compared to using a traditional abstract utility function.

IV. MATHEMATICAL FORMULATION
We formulate the above-stated problem mathematically in this section.

A. Features and Resources
Consider users in a $D$-dimensional feature space, i.e., for user $i$, we have $\mathbf{x}_i = [x_{i,1}, x_{i,2}, \cdots, x_{i,D}]^T$, where $x_{i,d}$ is the value of feature $d$ ($d = 1, 2, \cdots, D$). Each user is associated with a label, which can be a complaining user (label 1) or a user in a normal state (label 0). Following the tradition, we also denote label 0 (users in normal state) as negative and label 1 (complaining user) as positive. There are $K$ types of resources, and the resources allocated to user $i$ are denoted by $\mathbf{r}_i = [r_{i,1}, r_{i,2}, \cdots, r_{i,K}]^T$. We assume that there is a linear relation between the allocated resource and the change of feature values.¹ Given $\mathbf{r}_i$ amount of resources, user $i$ has its feature values updated to

$$g(\mathbf{x}_i, \mathbf{r}_i) = \mathbf{x}_i + \mathbf{Q}_i \mathbf{r}_i, \tag{1}$$

where $\mathbf{Q}_i$ is a $D$ by $K$ matrix, denoting the effects of the $K$ resources on $D$ features. In the following, we first discuss how to build the classifier and then how to use it to allocate network resources to users more effectively.

¹ Linear relation is considered in our first step study of the data-driven resource allocation problem. This simple relation holds for the features we consider here such as throughput. More complex and general functions will be studied in future work.

Fig. 2. The RoC curves of the tested machine learning methods. (Axes: false positive rate vs. true positive rate, both from 0 to 1; curves: Neural Net, Logistic Regression, k-nearest Neighbors, SVM, Decision Tree, Random Forest.)

B. Learning
The first step is to construct classifiers for cellular user experience based on the dataset discussed in Sec. III. To handle highly imbalanced data, we randomly undersample the negatives with rate 1/50, and use Receiver Operating Characteristic (RoC) curves as the performance metric [27]. Given a classifier, by changing the decision threshold, multiple

pairs of false positive rate and true positive rate can be obtained, and these pairs define a RoC curve. Specifically, based on true label and prediction label, any user can be classified into exactly one of the four categories: true positive, false positive, false negative, and true negative. For example, true positive means both true label and prediction label are positive. False positive means prediction label is positive, but the true label is negative. False positive rate is the proportion of “false positives” in “all the users whose true labels are negative” (including “false positives” and “true negatives”). True positive rate is the proportion of “true positives” in “all users whose true labels are positive” (including “true positives” and “false negatives”).

To choose a proper classifier, we test a variety of widely used classification algorithms, including logistic regression, neural network, support vector machine, decision tree, random forest and k-nearest neighbors. A randomly sampled 50% of the data is used to train the classifier, and the rest is used as test data. AUC (Area Under the Curve) [28], a commonly used metric for positive and negative classification, is utilized to judge prediction performance. It has the advantage of being insensitive to unbalanced datasets, compared with other evaluation methods, such as the simpler misclassification error. As shown in Fig. 2, logistic regression achieves the best classification performance, i.e., the highest AUC score. Therefore, we use logistic regression (with its trained weights) to guide resource allocation in the next step.

Specifically, the logistic regression learning step is modeled as the following optimization problem

$$\max_{\mathbf{w}, w_0} \; \sum_{i=1}^{N} \left[ y_i \log\left(\frac{1}{1+\exp(-w_0 - \mathbf{w}^T \mathbf{x}_i)}\right) + (1 - y_i) \log\left(\frac{1}{1+\exp(w_0 + \mathbf{w}^T \mathbf{x}_i)}\right) \right] - |\mathbf{w}|, \tag{2}$$

where $N$ is the number of users with labels, $y_i \in \{0, 1\}$ ($i = 1, 2, \cdots, N$) are the labels, and $\mathbf{w}^T \mathbf{x}_i$ is the inner product between $\mathbf{w}$ and $\mathbf{x}_i$.
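As a rough illustration of the learning objective in (2), the sketch below evaluates the log-likelihood of the labels with an L1 penalty for given parameters. It is not the training procedure used in the paper: the function names and the penalty strength `lam` are assumptions (no coefficient is stated in the text), and the penalty is subtracted, as is standard when the objective is maximized.

```python
import math
from typing import Sequence


def log_sigmoid(z: float) -> float:
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def penalized_log_likelihood(
    w: Sequence[float],
    w0: float,
    xs: Sequence[Sequence[float]],
    ys: Sequence[int],
    lam: float = 1.0,  # assumed penalty strength; not specified in the text
) -> float:
    """Log-likelihood of the labels minus an L1 penalty on w."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = w0 + sum(wd * xd for wd, xd in zip(w, x))
        # y = 1 contributes log(1 / (1 + exp(-z)));
        # y = 0 contributes log(1 / (1 + exp(z))).
        total += y * log_sigmoid(z) + (1 - y) * log_sigmoid(-z)
    return total - lam * sum(abs(wd) for wd in w)
```

A training routine would search over `w` and `w0` to maximize this value; any standard solver for L1-regularized logistic regression could play that role.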
The classifier needs to learn parameters $\mathbf{w} = [w_1, w_2, \cdots, w_D]^T$ and $w_0$, where $\mathbf{w}$ contains the weights associated with features, and $w_0$ is the intercept (corresponding to Feature 0 in Table I). To alleviate overfitting, L1 regularization (the term $|\mathbf{w}|$) is applied [29]. We obtain the prediction model based on the training data, and the weights for the trained logistic regression are shown in Table I.

C. Resource Optimization
For a user with feature $\mathbf{x}$, based on the classifier learned from labeled data, its probability to be positive is $1/(1+\exp(-\mathbf{w}^T \mathbf{x} - w_0))$. Therefore, given $\mathbf{r}_i$ amount of resources, the probability for user $i$ to be positive is

$$p_i = \frac{1}{1+\exp(-\mathbf{w}^T(\mathbf{x}_i + \mathbf{Q}_i \mathbf{r}_i) - w_0)}. \tag{3}$$

For ease of notation, define $C_i = -\mathbf{w}^T \mathbf{x}_i - w_0$, which is the initial intercept for user $i$, $\mathbf{A}_i = -\mathbf{Q}_i^T \mathbf{w}$, and

$$\eta(x) = 1/(1 + \exp(x)), \tag{4}$$

so that $p_i = \eta(C_i + \mathbf{A}_i^T \mathbf{r}_i)$. Let $\eta'(x)$ denote the derivative of $\eta(x)$.

Note that although we assume a common $\mathbf{w}$ for all users, unlike [8], we consider much more general scenarios where users are under different network conditions and may have different $\mathbf{Q}_i$'s. Due to this heterogeneity, the multi-resource allocation problem in our paper cannot be reduced to a single-resource allocation problem, and the algorithms in [8] do not apply here.

The target of resource allocation is to reduce the expected number of positives. This objective is motivated by the need to improve KPIs (Key Performance Indicators), in this case the number of complaints. By utilizing the classifier learned from labeled data as the objective function, given $\mathbf{R} = [R_1, R_2, \cdots, R_K]^T$ amount of resources, the resource allocation for $M$ users without labels can be formulated as the following optimization problem:

$$\text{(P-0)} \quad \min_{\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_M} \; \sum_{i=1}^{M} \eta(C_i + \mathbf{A}_i^T \mathbf{r}_i); \tag{5}$$

$$\text{s.t.} \quad \sum_{i=1}^{M} \mathbf{r}_i \le \mathbf{R}; \tag{6}$$

$$\mathbf{0} \le \mathbf{r}_i \le \mathbf{B}_i, \quad i = 1, 2, \cdots, M; \tag{7}$$
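The quantities entering (P-0) can be computed directly from the definitions. The sketch below is a plain-Python illustration with hypothetical function names: it forms $C_i$ and $\mathbf{A}_i$ from $\mathbf{w}$, $w_0$, $\mathbf{x}_i$ and $\mathbf{Q}_i$, evaluates the objective (5), and checks constraints (6) and (7) component-wise.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]  # D rows (features), K columns (resources)


def eta(x: float) -> float:
    """eta(x) = 1 / (1 + exp(x)), as in (4)."""
    return 1.0 / (1.0 + math.exp(x))


def intercept(w: Vector, w0: float, x: Vector) -> float:
    """C_i = -w^T x_i - w0."""
    return -sum(wd * xd for wd, xd in zip(w, x)) - w0


def efficiency(w: Vector, q: Matrix) -> List[float]:
    """A_i = -Q_i^T w, one entry per resource type."""
    k = len(q[0])
    return [-sum(q[d][j] * w[d] for d in range(len(w))) for j in range(k)]


def complaint_prob(w: Vector, w0: float, x: Vector, q: Matrix, r: Vector) -> float:
    """p_i = eta(C_i + A_i^T r_i), which equals expression (3)."""
    a = efficiency(w, q)
    return eta(intercept(w, w0, x) + sum(aj * rj for aj, rj in zip(a, r)))


def objective(w, w0, xs, qs, rs) -> float:
    """Expected number of positives, i.e., the objective (5)."""
    return sum(complaint_prob(w, w0, x, q, r) for x, q, r in zip(xs, qs, rs))


def is_feasible(rs, bounds, total) -> bool:
    """Check the budget constraint (6) and the box constraint (7)."""
    k = len(total)
    in_box = all(0 <= r[j] <= b[j] for r, b in zip(rs, bounds) for j in range(k))
    in_budget = all(sum(r[j] for r in rs) <= total[j] for j in range(k))
    return in_box and in_budget
```

For instance, with toy values `w = [-2.0, 0.5]`, `w0 = 1.0`, `x = [1.0, 4.0]`, `q = [[1.0, 0.0], [0.0, -1.0]]` and `r = [0.5, 2.0]`, the updated features are `[1.5, 2.0]`, and both the direct logistic expression (3) and `complaint_prob` give 1/(1 + e).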

Here, $\mathbf{A}_i = -\mathbf{Q}_i^T \mathbf{w} = [A_{i,1}, A_{i,2}, \cdots, A_{i,K}]^T$, and $A_{i,k}$ is the aggregated resource efficiency of resource $k$ on user $i$. If $A_{i,k} \le 0$, i.e., allocating resource $k$ to user $i$ does not help (or impairs) its performance, we will set $B_{i,k} = 0$. Therefore, without loss of generality, we assume $A_{i,k} > 0$. Constraint (7) is the user-side resource upper bound, i.e., the resource allocated to user $i$ is bounded by $\mathbf{B}_i = [B_{i,1}, B_{i,2}, \cdots, B_{i,K}]^T$, which can be caused by two reasons: the user-side system configuration, and marginal benefit (when more than $\mathbf{B}_i$ resource is allocated, it is no longer effective in improving user experience).

This optimization problem is non-convex, so gradient-based numerical methods are not guaranteed to find its global optimum. In [10], the authors consider a single-dimensional case, where the optimal solution can be obtained. However, in our case, the resources are a vector, which makes the problem much trickier, and the algorithm in [10] cannot be directly applied.

We consider the following approach to obtain a near optimal solution. First, we study the dual problem of the proposed optimization problem. The dual problem gives us a set of dual Lagrangian multipliers, which can be interpreted as the prices of the resources. The challenge is that the solutions obtained by the dual problem may not be feasible in the primal problem because of the non-convexity. We prove that when users' features are independently distributed, the feasible solution can be obtained readily.

The problem formulation and the designed algorithms are not limited to the user complaint reduction problem mentioned in Sec. III. They apply to cases where logistic regression is used as the binary classifier, and resource allocation has linear effects on the change of feature values.

D. Closed-Loop Feedback and Classifier Optimization
In this section, we discuss the closed-loop feedback and possible ways of classifier optimization.

First, this component serves the crucial role of validating, partially validating, or invalidating the causal relationship assumed in the resource allocation component. In other words, based on domain knowledge and the learned prediction model, we assume that changing certain feature values improves the user experience in the resource allocation component. The closed-loop feedback allows us to validate or invalidate this assumption by sampling user labels after resource allocation. This is similar to randomized tests typically used in evaluating causal relationships.

Furthermore, we note that the goal of the prediction model here is slightly different from that of the traditional one. Traditionally, the goal of the classifier is to maximize accuracy in predicting user labels. In contrast, the goal of the classifier here is to best guide resource allocation with respect to the ground truth. In particular, it needs to better quantify the relationship in the targeted region: where users are distributed as the result of resource allocation. Specifically, denote the ground truth by a function $G(\mathbf{x})$, representing the positive probability for a given feature vector $\mathbf{x}$. The set of classifiers that can be expressed by a certain machine learning method (e.g., logistic regression, neural networks) is denoted by $\mathcal{F}$. Then, the task is to find the optimal $f^*$ within $\mathcal{F}$ such that

$$\text{(P-1)} \quad f^* = \arg\min_{f \in \mathcal{F}} \sum_{i=1}^{M} G(g(\mathbf{x}_i, \mathbf{r}_i^*(f))), \tag{8}$$

where $[\mathbf{r}_1^*(f), \mathbf{r}_2^*(f), \cdots, \mathbf{r}_M^*(f)]$ is an optimal solution of (P-0) when the classifier $f$ supplies the objective.

If we know the ground truth $G(\mathbf{x})$, we can find the optimal $f^*$ by optimizing (P-1). However, $G(\cdot)$ is unknown and thus we need to approximate it based on the sampled data. We note that this is a fairly complex problem that involves the fundamental tradeoff of exploration vs. exploitation, where we need to balance between optimizing the user experience based on the learned classifier (exploitation) and improving the classifier by sampling more data (exploration). However, the feature space is typically huge and exploring all this space to obtain an accurate classifier will result in a large cost, as noted in [30]. In this paper, we will only study a heuristic policy for this exploration and exploitation tradeoff, as described in Section V-D.

V. NEAR OPTIMAL RESOURCE ALLOCATION
In this section, we propose a DualHet algorithm to obtain the near optimal solution for (P-0). The key idea is dual



decomposition, where we use the Lagrangian multipliers to coordinate the resource allocation among users. The key challenge is that dual decomposition does not necessarily provide an optimal primal solution, due to the nonconvexity of (P-0). In this section, we not only design an algorithm that finds an optimal dual solution and uses it to obtain a feasible primal solution, but also prove the near-optimality of the proposed DualHet algorithm under mild technical conditions.

A. DualHet Algorithm

First, the DualHet algorithm solves the dual problem (D-0), defined next, of the primal problem (P-0). Based on the dual solution, the key step is to obtain a feasible primal solution and demonstrate its near-optimality. Define $\eta_i(r) = \eta(C_i + A_i^T r)$. The Lagrangian of (P-0) is

$$L(r_1, r_2, \cdots, r_M, \lambda) = \sum_{i=1}^{M} \eta_i(r_i) + \lambda^T \Big( \sum_{i=1}^{M} r_i - R \Big), \tag{9}$$

where $\lambda \in \mathbb{R}^{K \times 1}$ is the Lagrangian multiplier (LM). The dual problem of (P-0) is

(D-0)
$$\max_{\lambda} \; D(\lambda) = \sum_{i=1}^{M} u_i(\lambda) - \lambda^T R; \tag{10}$$
$$\text{s.t.} \quad \lambda \ge 0, \tag{11}$$

where

(D-i)
$$u_i(\lambda) = \min_{r_i} \; \eta_i(r_i) + \lambda^T r_i; \tag{12}$$
$$\text{s.t.} \quad 0 \le r_i \le B_i. \tag{13}$$

Typically, $\lambda$ is interpreted as the prices of the K types of resources. Based on it, each user i minimizes its own complaining likelihood plus the cost of resource consumption in a distributed manner (problem (D-i)). Intuitively, when the price of resource k increases, a user tends to reduce the consumption of resource k, which may result in higher consumption of other types of resources. We discuss how to obtain the optimal solution of problem (D-i), denoted by $r_i^*(\lambda)$, in Sec. V-B. Note that (D-i) may have more than one optimal solution. Denote one of the optimal LMs by $\lambda^*$, i.e.,

$$\lambda^* = \arg\max_{\lambda} \; \min_{r_1, r_2, \cdots, r_M} L(r_1, r_2, \cdots, r_M, \lambda). \tag{14}$$

Next, we present the DualHet algorithm, which first finds the optimal LM $\lambda^*$ and then allocates resources to users based on $\lambda^*$, while guaranteeing that the resource constraints are satisfied. Specifically, as shown in Algorithm 1, we use the subgradient method to solve the dual problem. Because $R - \sum_{i=1}^{M} r_i^*(\lambda(t))$ is one of the subgradients of the dual problem, the following iteration converges to the optimal LM $\lambda^*$:

$$\lambda(t+1) = \Big[ \lambda(t) - a(t) \Big( R - \sum_{i=1}^{M} r_i^*(\lambda(t)) \Big) \Big]^+, \tag{15}$$

where the step size $a(t)$ needs to satisfy the following conditions [10]:

$$a(t) \to 0 \ \text{as} \ t \to \infty, \quad \text{and} \quad \sum_{t=1}^{\infty} a(t) = \infty. \tag{16}$$

For instance,

$$a(t) = \beta / t, \tag{17}$$

for some positive constant $\beta$. After achieving the optimal LM, resources are allocated in a sequence where users with a single optimal solution are given higher priority, as shown in Algorithm 1 (Lines 8 to 11).

Algorithm 1: DualHet Algorithm
Input : Complaining likelihood functions $\eta_1(), \eta_2(), \cdots, \eta_M()$, $\lambda(0)$, available resource $R$, and convergence threshold $\Delta$.
Output: $r_1^p, r_2^p, \cdots, r_M^p$
 1 while $\|\lambda(t+1) - \lambda(t)\| \ge \Delta$ do
 2     $r_i^* = \arg\min_{r_i} \eta_i(r_i) + \lambda(t) r_i$ s.t. $0 \le r_i \le B_i$;
 3     $r_{need} = \sum_{i=1}^{M} r_i^*$;
 4     $\lambda(t+1) = [\lambda(t) - a(t)(R - r_{need})]^+$;
 5 end
 6 $\lambda^* = \lambda(t+1)$;
 7 $r_i^* = \arg\min_{r_i} \eta_i(r_i) + \lambda^* r_i$ s.t. $0 \le r_i \le B_i$;
 8 for user i who has a single optimal solution $r_i^*$ do
 9     $r_i^p = r_i^*$;
10     Update available resource: $R = R - r_i^p$;
11 end
12 for user i who has multiple optimal solutions $r_i^*$ do
13     $r_i^p = \arg\min_{r_i} \eta_i(r_i) + \lambda^* r_i$ s.t. $0 \le r_i \le \min(B_i, R)$;
14     Update available resource: $R = R - r_i^p$;
15 end
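As a rough illustration of Algorithm 1, the following Python sketch runs the subgradient iteration of Eq. (15) with step size a(t) = β/t for a single resource type (K = 1). Several things here are assumptions made for the sake of a runnable example rather than details taken from the paper: the likelihood is taken to be a decreasing logistic function eta(x) = 1/(1 + e^x) with nonnegative effect coefficients A, problem (D-i) is solved by a plain grid search instead of the closed-form analysis of Sec. V-B, users with multiple solutions are detected by a numerical tolerance, and the allocation to single-solution users is additionally capped by the remaining resource as a numerical safeguard. The function names, the synthetic data and the iteration limit are all hypothetical.

```python
import numpy as np

def eta(x):
    """Assumed complaint likelihood: a decreasing logistic function."""
    return 1.0 / (1.0 + np.exp(x))

def best_responses(C, A, ub, lam, n=201, tol=1e-9):
    """Grid-search solution of (D-i) for all users at price lam (K = 1).

    Returns each user's minimizer and a flag for users whose near-optimal
    grid points are spread out, i.e. users with multiple solutions.
    """
    grid = ub[:, None] * np.linspace(0.0, 1.0, n)[None, :]   # shape (M, n)
    obj = eta(C[:, None] + A[:, None] * grid) + lam * grid
    r_star = grid[np.arange(len(C)), obj.argmin(axis=1)]
    near = obj <= obj.min(axis=1)[:, None] + tol
    lo = np.where(near, grid, np.inf).min(axis=1)
    hi = np.where(near, grid, -np.inf).max(axis=1)
    multi = (hi - lo) > 1.5 * ub / (n - 1)
    return r_star, multi

def dualhet(C, A, B, R, lam0=0.0, beta=0.01, delta=1e-5, max_iter=5000):
    lam = lam0
    for t in range(1, max_iter + 1):                          # Lines 1-5
        r_star, _ = best_responses(C, A, B, lam)
        r_need = r_star.sum()
        new_lam = max(0.0, lam - (beta / t) * (R - r_need))   # Eq. (15)
        converged = abs(new_lam - lam) < delta
        lam = new_lam
        if converged:
            break
    r_star, multi = best_responses(C, A, B, lam)              # Line 7
    r_p = np.zeros_like(B)
    remaining = R
    for i in np.flatnonzero(~multi):                          # Lines 8-11
        # The cap by `remaining` is a safeguard added in this sketch only.
        r_p[i] = min(r_star[i], remaining)
        remaining -= r_p[i]
    for i in np.flatnonzero(multi):                           # Lines 12-15
        cap = np.array([min(B[i], remaining)])
        r_i, _ = best_responses(C[i:i + 1], A[i:i + 1], cap, lam)
        r_p[i] = r_i[0]
        remaining -= r_p[i]
    return r_p, lam

# Hypothetical synthetic users.
rng = np.random.default_rng(0)
M = 30
C = rng.normal(-1.0, 1.0, M)
A = rng.uniform(0.2, 1.0, M)
B = rng.uniform(2.0, 6.0, M)
R = 20.0
r_p, lam = dualhet(C, A, B, R)
print("price:", lam, "resource used:", r_p.sum())
```

Because the per-user responses are discontinuous in the price, the iteration may stop at the iteration limit rather than at the threshold; the final allocation stays within the budget R in either case because of the cap.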

In Algorithm 1, $r_{need}$ denotes the resources consumed by users based on the current LM $\lambda(t)$, without considering the globally available resources. $r_{need}$ is then compared with $R$, and the subgradient method (Lines 3 to 4 in Algorithm 1) is applied to update the prices. Note that the optimal value of (D-0) is a lower bound of the primal problem, and this lower bound is used in our performance evaluation.

B. Resource Allocation at Individual Users

In this subsection, we consider how to solve the nonconvex optimization problem (D-i) (Lines 2, 7 and 13 in Algorithm 1). Given the prices of resources as $\lambda$, each user needs to decide the amount of resource to consume. This is a challenging problem, even though it is distributed and the number of variables is reduced to K, because the objective function is still nonconvex. The analysis in this subsection also contributes to the near-optimality analysis in Sec. V-C.

The KKT conditions are necessary conditions that optimal solutions have to satisfy. Utilizing the KKT conditions, we find two properties of the optimal solutions. These properties limit the number of candidate solutions to at most K + 1, and the optimum is selected by comparing the K + 1 candidates. The Lagrangian of (D-i) is as follows:

$$L_S(r_i, B_i, \lambda, \tau, v) = \eta(C_i + A_i^T r_i) + \lambda^T r_i - \tau^T (B_i - r_i) - v^T r_i. \tag{18}$$



Any optimal solution of (D-i) satisfies the following KKT conditions:

$$A_{i,k}\, \eta'(C_i + A_i^T r_i) + \lambda_k + \tau_k - v_k = 0; \tag{19}$$
$$v_k r_{i,k} = 0; \tag{20}$$
$$\tau_k (B_{i,k} - r_{i,k}) = 0; \tag{21}$$
$$v_k \ge 0; \tag{22}$$
$$\tau_k \ge 0; \tag{23}$$

for $k = 1, 2, \cdots, K$.

Sort the K types of resources for user i in non-increasing order of $A_{i,k}/\lambda_k$, i.e., the resources are labeled such that

$$\frac{A_{i,1}}{\lambda_1} \ge \frac{A_{i,2}}{\lambda_2} \ge \cdots \ge \frac{A_{i,K}}{\lambda_K}. \tag{24}$$

Case 1: Consider the strictly decreasing case of Eq. (24), i.e., $A_{i,k+1}/\lambda_{k+1} \ne A_{i,k}/\lambda_k$ for any k. By analyzing the KKT conditions, the following property is derived.

Property 1: In the optimal solution of (D-i), for user i, the K types of resources are allocated sequentially, i.e., $r_{i,k} > 0$ only occurs when $r_{i,k'} = B_{i,k'}$ for all $k' < k$. Moreover, if

$$\eta'\Big(C_i + \sum_{k'=1}^{k-1} A_{i,k'} r_{i,k'}\Big) < -\frac{\lambda_k}{A_{i,k}} \tag{25}$$

and $B_{i,k} > 0$, we have $r_{i,k} > 0$.

Proof: For user i, assume that the optimal resource solution $r_i$ satisfies $-\lambda_{k+1}/A_{i,k+1} < \eta'(C_i + A_i^T r_i) \le -\lambda_k/A_{i,k}$. Then for resource $k' = k+1, k+2, \cdots, K$, $r_{i,k'}$ has to be 0, since $v_{k'}$ needs to be positive to satisfy Eqs. (19, 20, 22, 23). Meanwhile, for resource $k' = 1, 2, \cdots, k-1$, $r_{i,k'}$ has to be equal to $B_{i,k'}$, since $\tau_{k'}$ needs to be positive to satisfy Eqs. (19, 21, 22, 23). Moreover, if $\eta'\big(C_i + \sum_{k'=1}^{k-1} A_{i,k'} r_{i,k'}\big) < -\lambda_k/A_{i,k}$ and $B_{i,k} > 0$, $r_{i,k}$ has to be positive, because otherwise $\tau_k = 0$ and Eq. (19) is not satisfied.

Property 1 implies that there is a unique sequence of resource allocation, i.e., resource k + 1 will be allocated only if all resources $k' \le k$ have reached their upper bounds. The following property provides the amount of each type of resource that should be allocated.

Property 2: If resource k is allocated, i.e., $r_{i,k} > 0$, then $r_{i,k}$ satisfies the following condition:

$$r_{i,k} = \min\Big\{ \frac{\eta'^{-1}(-\lambda_k, A_{i,k}) - C_i - \sum_{k'=1}^{k-1} A_{i,k'} B_{i,k'}}{A_{i,k}}, \; B_{i,k} \Big\}, \tag{26}$$

where $\eta'^{-1}(-\lambda, a_i) = \ln\Big( -1 + \frac{a_i - \sqrt{a_i^2 - 4 a_i \lambda}}{2\lambda} \Big)$.

Proof: Based on the KKT conditions, $r_{i,k}$ has to satisfy

$$A_{i,k}\, \eta'\Big(C_i + \sum_{k'=1}^{k-1} A_{i,k'} B_{i,k'} + A_{i,k} r_{i,k}\Big) + \lambda_k = 0, \tag{27}$$

when $0 < r_{i,k} < B_{i,k}$. The equation $a_i \eta'(r) = -\lambda$ has two distinct solutions. However, the smaller one is a local maximum, since its second-order derivative satisfies $a_i^2 \eta''(r) < 0$. Therefore we choose $\eta'^{-1}(-\lambda, a_i)$.

In summary, combining Properties 1 and 2, we have at most K + 1 candidate solutions $[r_i^0, r_i^1, \cdots, r_i^K]$, where $r_i^k$ means the solution in which resources 1 to k are allocated. By selecting the best among the K + 1 candidates, the optimal resource allocation at each user is achieved. Note that there could exist multiple optima among the K + 1 candidates.

Case 2: It may be the case that, for some k, $A_{i,k+1}/\lambda_{k+1} = A_{i,k}/\lambda_k$. If $\eta'\big(C_i + \sum_{k'=1}^{k+1} A_{i,k'} B_{i,k'}\big) \ge -\lambda_k/A_{i,k}$, any combination of $r_{i,k}$ and $r_{i,k+1}$ that satisfies $\eta'\big(C_i + \sum_{k'=1}^{k-1} A_{i,k'} B_{i,k'} + A_{i,k} r_{i,k} + A_{i,k+1} r_{i,k+1}\big) = -\lambda_k/A_{i,k}$ is a candidate solution; otherwise, the resource upper bounds $B_{i,k}$ and $B_{i,k+1}$ are consumed, and the next resource k + 2 likely also needs to be allocated.
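The sequential structure behind the K + 1 candidates can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a decreasing logistic likelihood eta(x) = 1/(1 + e^x) with positive effect coefficients, uses made-up values for C, A, B and λ, and replaces the closed-form amount of Eq. (26) with a one-dimensional grid search along each segment of the allocation sequence. It then compares the best candidate with a brute-force search over the full two-dimensional grid.

```python
import numpy as np

def eta(x):
    """Assumed complaint likelihood: a decreasing logistic function."""
    return 1.0 / (1.0 + np.exp(x))

def user_objective(r, C, A, lam):
    """Objective of (D-i): likelihood plus priced resource consumption."""
    return eta(C + A @ r) + lam @ r

def candidate_solutions(C, A, B, lam, n=101):
    """K + 1 candidates: resources are filled one at a time, in
    non-increasing order of A_k / lam_k (Eq. (24))."""
    order = np.argsort(-A / lam)
    K = len(A)
    candidates = [np.zeros(K)]            # candidate 0: nothing allocated
    filled = np.zeros(K)
    for k in order:
        # Earlier resources sit at their bounds; search resource k on a grid
        # (a numerical stand-in for the closed form in Eq. (26)).
        trial = np.tile(filled, (n, 1))
        trial[:, k] = np.linspace(0.0, B[k], n)
        vals = eta(C + trial @ A) + trial @ lam
        candidates.append(trial[int(np.argmin(vals))].copy())
        filled[k] = B[k]
    return candidates

# Hypothetical single user with K = 2 resource types.
C = -2.0
A = np.array([1.0, 0.5])
B = np.array([3.0, 4.0])
lam = np.array([0.1, 0.08])

cands = candidate_solutions(C, A, B, lam)
best = min(cands, key=lambda r: user_objective(r, C, A, lam))
best_val = user_objective(best, C, A, lam)

# Brute force over the full 2-D grid for comparison.
g1 = np.linspace(0.0, B[0], 101)
g2 = np.linspace(0.0, B[1], 101)
R1, R2 = np.meshgrid(g1, g2, indexing="ij")
brute_val = (eta(C + A[0] * R1 + A[1] * R2) + lam[0] * R1 + lam[1] * R2).min()

print("best candidate:", best, "objective:", best_val, "brute force:", brute_val)
```

In this example the first resource is exhausted before the second one is used, which is the pattern Property 1 describes; the grid search only approximates the stationary point, so the agreement with brute force holds up to the grid resolution.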

C. Near Optimality Analysis

In this section, we discuss the performance of the DualHet algorithm under certain technical conditions. Let $V_{opt}$ denote the optimal value of (P-0). To show the near-optimality of DualHet, we first show that the gap between DualHet and the optimal solution is bounded by the number of users with more than one optimal solution, given the optimal LM $\lambda^*$ (Proposition 1). Then we discuss the possible number of multi-solution users in practice, which is shown to be small in the simulations.

Proposition 1: If there are Q users with multiple solutions given $\lambda^*$, the solution $[r_1^p, r_2^p, \cdots, r_M^p]$ generated by the DualHet algorithm is feasible and satisfies

$$\sum_{i=1}^{M} \eta_i(r_i^p) \le V_{opt} + Q. \tag{28}$$

Proof: (Sketch) We prove this proposition by investigating the subgradient of the dual function $D(\lambda)$, denoted by $\partial D(\lambda)$. Without loss of generality, we assume the first Q users have multiple solutions. Due to the optimality of $\lambda^*$, we have $0 \in \partial D(\lambda^*)$, and we can show that $\sum_{i=Q+1}^{M} r_i^p \le R$. Thus $[r_1^p, r_2^p, \cdots, r_M^p]$ is feasible, because DualHet first allocates resource $r_i^p = r_i^*$ to the single-solution users and then uses the remaining resource for the other users. Moreover, by bounding the complementary term as $(\lambda^*)^T \big[ \sum_{i=1}^{M} r_i^* - R \big] \ge -\sum_{i=1}^{Q} \eta_i(r_i^*)$, we have $\sum_{i=Q+1}^{M} \eta_i(r_i^p) = \sum_{i=Q+1}^{M} \eta_i(r_i^*) \le V_{opt}$, and the conclusion then follows because $\eta_i(r_i^p) \le 1$ for all $1 \le i \le Q$. Please refer to Appendix A for more details.

Discussions: The Number of Multi-solution Users

Next, we discuss the bound of Q. Assume user i has more than one optimal solution. In the following, we investigate the properties of $\lambda^*$ when user i has multiple solutions. Since a user's optimal solutions are a subset of its candidate solutions (Sec. V-B), user i must have at least two candidate solutions achieving the same objective value in (D-i). There are two cases with a tie among candidate solutions.

Case 1: For some k, we have

$$\frac{A_{i,k+1}}{\lambda^*_{k+1}} = \frac{A_{i,k}}{\lambda^*_k}. \tag{29}$$

In this case, user i could have an infinite number of candidate solutions sharing the same objective value.

Case 2: Candidate solutions $j_1$ and $j_2$ have the same objective value, i.e.,

$$\eta_i(r_i^{j_1}) + \lambda^{*T} r_i^{j_1} = \eta_i(r_i^{j_2}) + \lambda^{*T} r_i^{j_2}, \tag{30}$$

where $j_1 < j_2$. According to Properties 1 and 2,

$$r_i^{j_1} = [B_{i,1}, B_{i,2}, \cdots, B_{i,j_1}, 0, 0, \cdots]^T; \tag{31}$$

$$r_i^{j_2} = \Big[B_{i,1}, B_{i,2}, \cdots, B_{i,j_2-1}, \min\Big\{ \frac{\eta'^{-1}(-\lambda^*_{j_2}, A_{i,j_2}) - C_i - \sum_{k=1}^{j_2-1} A_{i,k} B_{i,k}}{A_{i,j_2}}, \; B_{i,j_2} \Big\}, 0, 0, \cdots \Big]^T. \tag{32}$$

Taking these two solutions into Eq. (30), we have

$$\sum_{k=j_1+1}^{j_2-1} \lambda^*_k B_{i,k} + \lambda^*_{j_2} \frac{\eta'^{-1}(-\lambda^*_{j_2}, A_{i,j_2}) - C_i - \sum_{k=1}^{j_2-1} A_{i,k} B_{i,k}}{A_{i,j_2}} + \eta\big(\eta'^{-1}(-\lambda^*_{j_2}, A_{i,j_2})\big) = \eta\Big(C_i + \sum_{k=1}^{j_1} A_{i,k} B_{i,k}\Big) \tag{33}$$

or

$$\sum_{k=j_1+1}^{j_2} \lambda^*_k B_{i,k} = \eta\Big(C_i + \sum_{k=1}^{j_1} A_{i,k} B_{i,k}\Big) - \eta\Big(C_i + \sum_{k=1}^{j_2} A_{i,k} B_{i,k}\Big). \tag{34}$$

Note that user i could satisfy more than one condition in the form of (29), (33) or (34).

Denote the set of users with multiple optimal solutions by S. Since each user i in S satisfies at least one condition in the form of (29), (33) or (34), the degrees of freedom of $\lambda^*$ are reduced by at least 1 by user i. In addition, when the users are heterogeneous, it is likely that the conditions (29), (33) or (34) for a user cannot be expressed by the conditions generated by the other users. Because $\lambda^*$ is a K-dimensional variable, K users with multiple optimal solutions reduce the degrees of freedom of $\lambda^*$ to zero. In other words, if |S| > K, the optimal LM $\lambda^*$ does not exist. Therefore, for heterogeneous users, we will likely have $Q = |S| \le K$.

According to Proposition 1, the performance gap between DualHet and the optimum is then at most K, the number of types of resources, independent of the number of users. Therefore, if the algorithm is deployed on a large number of users, the loss can be relatively small. We note that, due to the complexity of conditions (29), (33), and (34), we are unable to obtain specific expressions of these conditions at this stage. However, as we can see in the linear case, noise can prevent singularity [31]. The only difference in our problem is that we have nonlinear terms in (33). Since the following equation in $\lambda^*_{j_2}$ has only finitely many solutions:

$$\lambda^*_{j_2} \frac{\eta'^{-1}(-\lambda^*_{j_2}, A_{i,j_2}) - \theta_1}{A_{i,j_2}} + \eta\big(\eta'^{-1}(-\lambda^*_{j_2}, A_{i,j_2})\big) = \theta_2, \tag{35}$$

where $\theta_1$ and $\theta_2$ are two real numbers, we expect that if $A_{i,k}$, $B_{i,k}$, $C_i$ are independently distributed real numbers for different users, they are unlikely to have the same solution, and thus the number of users with multiple optimal solutions will be bounded by K. Our simulation results in Sec. VI validate the near-optimality of our algorithm. More specific expressions of these conditions are left for our future work.

D. ϵ-Perturbed DualHet for Classifier Optimization

In this section, we propose an ϵ-perturbed version of DualHet to improve the classifier by randomized exploration. As shown in Algorithm 2, we implement DualHet (exploit) with probability $1 - \epsilon_t$, and explore and update the predictive model with probability $\epsilon_t$ (Lines 4 to 15). Specifically, when deciding to explore new samples, we randomly perturb the resource allocation result obtained by DualHet while satisfying the resource constraint. With this perturbation, we are able to explore the ground truth near the decision boundary. In this paper, we let the perturbed resource follow a uniform distribution. Other distributions, e.g., a truncated normal distribution, could also be applied to generate this perturbation. $\epsilon_t$ is used to control the exploitation-exploration tradeoff. Typically, if the environment does not change quickly, we need fewer explorations as time increases and set $\epsilon_t$ to decrease with t, e.g., $\epsilon_t \propto 1/t$. Rigorous design and analysis of the exploitation-exploration tradeoff is left as part of our future work.

Algorithm 2: ϵ-Perturbed DualHet
Input : Time horizon $T$, exploration probabilities $\epsilon_t$; initial complaining likelihood functions $\eta_i()$, $\lambda(0)$, available resource $R$, and convergence threshold $\Delta$;
 1 for t = 1, 2, . . . , T do
 2     Obtain resource allocation $r_i^p$ by running DualHet (Algorithm 1) with the $\eta_i()$, $\lambda(0)$, $R$, and $\Delta$;
 3     Draw a random number $p \sim U([0, 1])$;
 4     if $p \le \epsilon_t$ then
 5         for i = 1, 2, . . . , M do
 6             if $\sum_{i'=1}^{i-1} r_{i'}^p \ge R$ then
 7                 $r_i^p = 0$;
 8             else
 9                 $r_i^p \sim U([0, 2 r_i^p])$;
10                 $r_i^p = \min(r_i^p, B_i)$;
11             end
12         end
13         Observe the labels of the users and add the samples to the training set;
14         Retrain the model based on the new training set and update the $\eta_i()$;
15     end
16 end

VI. PERFORMANCE EVALUATION

In this section, we conduct experiments on two different datasets to evaluate the performance of the proposed algorithms. The first dataset is a synthetic dataset. It allows performance evaluation in a scenario where the ground truth is known. The second dataset is the real-world mobile user complaint dataset, introduced in Sec. III. It allows us to study the performance of the proposed algorithms in a realistic problem setting.

Our proposed algorithms are compared with an optimized baseline algorithm. In the baseline algorithm, each type of resource is evenly allocated to the predicted positive users.

Denote the set of predicted positive users by $S_p$. When a user has resource upper bounds, the amount of each type of resource received by the user is the minimum of the allocated resource and the upper bound, i.e., $r_i = \min(R/|S_p|, B_i)$ for any $i \in S_p$. Whether a user is predicted to be positive depends on the setting of the cut-off point. After ranking all M users by their predicted complaining likelihoods, there are at most M + 1 possible prediction results, obtained by choosing different cut-off points. Therefore, the optimized baseline tries all the M + 1 possible prediction results, and the one with the best resource allocation performance is chosen. We use the optimal solution of the dual problem in Sec. V-A as the lower bound, which may or may not be achievable.

A. Gaussian Distributed Data in 2D Space

In this experiment, a synthetic dataset generated from a known ground truth distribution is considered. This experiment, conducted in a low dimensional space, also illustrates the intuitions of the algorithms. In this dataset, positive and negative points are assumed to be distributed in a 2D space with means (-10, -10) and (10, 10), respectively, and covariance matrix [8, 0; 0, 8]. A balanced dataset is considered, which has equal numbers of positives and negatives. In this case, the probability for a point at $(u_1, u_2)$ to be positive is $p(u_1, u_2) = d_p(u_1, u_2)/(d_p(u_1, u_2) + d_n(u_1, u_2))$, where $d_p(u_1, u_2)$ and $d_n(u_1, u_2)$ are the Gaussian density functions of positives and negatives, respectively. In Fig. 3(a), the line indicates the logistic regression classifier trained from 2000 data points with labels. The 200 points to which resources will be allocated are also shown in this figure. Two types of resources are allocated to the points; resource k affects only feature k and the linear coefficient is denoted by $q_{i,k}$, for k = 1 and 2. We assume that each point i allocated with resources has resource upper bounds $r_{i,1} \le (20 - x_{i,1})/q_{i,1}$ and $r_{i,2} \le (20 - x_{i,2})/q_{i,2}$, i.e., after resource allocation, the points cannot exceed an area bounded by $u_1 \le 20$ and $u_2 \le 20$. We assume $q_{i,1}$ and $q_{i,2}$ are both chosen independently and uniformly from [0, 1].

Fig. 3. An illustration of resource allocation w/ synthetic data. (a) Before resource allocation. (b) After resource allocation. [Scatter plots over D1 and D2 showing positives, negatives, and the decision boundary of LR.]

Using the DualHet algorithm, after resource allocation, the locations of the points are shown in Fig. 3(b). The original positives are still marked with "+", but a subset of them have been moved across the decision boundary. These moved points probably have their new labels changed to negative. From this figure, we can see that the points are moved to a "virtual" diagonal line in parallel with the decision boundary of the trained classifier. However, because different points have different $q_{i,1}$ and $q_{i,2}$, the destination ("virtual" diagonal) line is not straight. The figure also illustrates the effect of user-side resource upper bounds ($u_1 \le 20$ and $u_2 \le 20$) on the resource allocation.

Fig. 4. Performance evaluation w/ synthetic data. (a) Performance based on the learned logistic regression model. (b) Performance based on the ground truth. [Expected number of positives versus resource amount for the original number of positives, the DualHet algorithm, the baseline, and, in (a), the lower bound.]

Fig. 4(a) shows the performance of the DualHet algorithm, the optimized baseline, and the lower bound. Since we have 2 types of resources, the theoretical gap is 2. However, as shown in Fig. 4(a), the solution found by the DualHet algorithm performs very close to the lower bound, which means its performance is very close to the optimum. Fig. 4(b) plots the performance based on the ground truth. This figure shows that, compared with the algorithms we designed, the baseline usually takes 2-4 times more resource to achieve the same performance. With 1000 units of resource 1 and 1000 units of resource 2, our algorithm reduces the positives by about 52%, while the baseline reduces them by only 16%. When the resource is limited, e.g., less than 10, even though both our algorithm and the baseline reduce only a fraction of the expected complaints, the relative gain is more than 5 times. For example, when 4.0 units of resource are available, our algorithm reduces 0.257 expected complaints, while the baseline reduces 0.050 expected complaints. Fig. 4 also shows that the near-optimal solution we found has similar performance based on both the learned logistic model and the ground truth, since the logistic regression classifier we learned is very close to the ground truth.

B. Cellular Customer Complaint Data

In the problem stated in Sec. III, Features 6, 9, 11, and 12 are related to resource allocation within a cell. We consider two types of resources that are allocated by a BS. The first resource is bandwidth: with more bandwidth allocated to a user, the value of Feature 11 (the ratio of downlink throughput to downlink volume) will increase. Although the value of Feature 12 (downlink volume) cannot be affected by resource allocation, it affects Feature 11, and different users have heterogeneous effects. The other resource used by a BS is active reconnection. In cellular networks, a BS periodically checks the connections between users and itself. When there is a radio connection failure, the BS can set up a proactive reconnection quickly.
The number of proactive reconnections a BS can perform each hour depends on the computing capacity of the BS's server. Allocating this reconnection resource decreases the value of Feature 9 (radio failure percentage). Since Feature 9 is a ratio, Feature 6 (attempted connections) needs to be considered as well, even though it is "uncontrollable". Note that the number of proactive reconnections allocated to a user is bounded by the existing number of radio-network failures of the user. Meanwhile, we assume bandwidth can be allocated to a user without user-side constraints. Note that these two resources can actually be allocated on a time scale smaller than an hour. However, because user complaints are highly related to the aggregated user experience in the past hour, our resource allocation approach serves as a longer-period policy.

Among the 570445 users, 50% are selected randomly for training the classifier. The remaining 50%, or 285222 users, are used for testing. They are grouped into 570 cells, and each cell has about 500 users. Since on average a user has a throughput of 4.79 KB/s and 11.42 radio failures per hour, the total bandwidth is 2.395 MB/s and the total number of radio failures is 5710 in a cell. The bandwidth and active reconnections are both allocated by the BS in a cell to the users in its coverage. We treat the allocated reconnections as a continuous variable for simplicity, instead of dealing with an NP-hard problem otherwise. In reality, when the allocated reconnection is a decimal, e.g., 3.6, one solution is to round it down to the nearest integer. Other heuristics, e.g., randomization, can also be designed to solve it.

With different amounts of available resources allocated by each BS, we run experiments to evaluate the number of complaints reduced. The expected number of complaints is 616.50 without resource allocation. Table II shows the reduced complaints of the DualHet algorithm compared with the optimized baseline. The reduced numbers of complaints of the DualHet algorithm are presented outside the parentheses, while the baseline results are inside the parentheses. When there are 100 KB/s additional bandwidth (4.1% of existing throughput) per cell and 1000 reconnections (18% of radio failures) per cell, the DualHet algorithm reduces 128.67 user complaints (20.87% of total complaints), which is more than 2x the performance of the baseline. The DualHet algorithm achieves greater improvement over the baseline when resource is scarce. This is intuitive: when resource is scarce, judicious allocation is more important. On the other hand, when resource is abundant, each user is likely to receive sufficient resource, and thus the performance of our algorithm and the baseline is relatively close.

TABLE II
COMPLAINTS REDUCED BY DUALHET ALGORITHM VS. BASELINE.

Reconnection \ Bandwidth   10 KB/s           100 KB/s          1 MB/s             10 MB/s
10                         22.03 (13.73)     57.25 (43.56)     154.45 (126.61)    366.93 (306.07)
100                        48.35 (24.24)     82.89 (50.01)     174.19 (132.46)    374.84 (308.32)
1000                       92.46 (49.94)     128.67 (61.97)    218.97 (142.09)    405.01 (321.13)
10000                      110.90 (90.29)    164.73 (103.43)   274.00 (184.21)    471.24 (399.65)

Table III shows the upper bounds of the complaints that can be reduced. This result is derived from the dual problem. The upper bounds are outside the parentheses, while the ratios of the achieved complaint reductions to the upper bounds are inside the parentheses. Within cell c, the DualHet algorithm is suboptimal by at most $2 \max_{i \in I_c} \eta_i(0)$, where $I_c$ denotes all users in cell c. Therefore, considering all 570 cells, the solution found by the DualHet algorithm is suboptimal by at most $\sum_{c=1}^{570} 2 \max_{i \in I_c} \eta_i(0) = 44.89$. However, as shown by Table III, in reality much better performance is achieved. In our experiment, the DualHet algorithm incurs a performance loss of less than 3% compared with the optimum.

TABLE III
UPPER BOUNDS OF THE OPTIMUM AND LOWER BOUNDS OF PERFORMANCE RATIOS (DUALHET ALGORITHM).

Reconnection \ Bandwidth   10 KB/s            100 KB/s           1 MB/s             10 MB/s
10                         22.44 (0.9816)     57.81 (0.9903)     155.33 (0.9943)    373.74 (0.9818)
100                        48.58 (0.9953)     82.91 (0.9998)     174.64 (0.9974)    380.66 (0.9847)
1000                       95.24 (0.9708)     129.23 (0.9957)    219.01 (0.9998)    407.87 (0.9930)
10000                      113.34 (0.9784)    166.23 (0.9910)    274.74 (0.9973)    471.90 (0.9986)

Figs. 5 and 6 illustrate the impact of one-dimensional resource on the expected number of complaints, with the other dimension fixed. In Fig. 5, the additional bandwidth is chosen to be 10 KB/s or 1 MB/s. This figure shows that allocating reconnections can reduce complaints by at most about 20%, which is due to the user-side resource upper bounds on reconnections. In Fig. 6, the number of reconnections is 10 or 1000. The simulation results show high resource usage efficiency in improving user experience in the beginning (when the amount of available resources is small) and diminishing returns in the later stage when the resources are abundant. For example, when there are 10 reconnections available, 10 KB/s additional bandwidth can reduce on average 22.03 complaints, i.e., 0.45 KB/s per complaint; at the same time, 1 MB/s additional bandwidth can reduce 154.45 complaints, i.e., 6.47 KB/s per complaint. The results show that there is low-hanging fruit in terms of improving the overall/aggregated user experience. Therefore, it is possible for an operator to improve user experience with relatively low resource consumption; an operator can decide on a sweet spot for its operation.

Fig. 5. Number of complaints vs. reconnections. [Expected number of complaints versus reconnections for the original number of positives and for the DualHet algorithm and the baseline with 1 MB/s and 10 KB/s bandwidth.]

Fig. 6. Number of complaints vs. bandwidth. [Expected number of complaints versus available bandwidth (KB/s) for the original number of positives and for the DualHet algorithm and the baseline with 1000 and 10 reconnections.]

C. Impact of Closed-Loop Feedback

In this section, we discuss the impact of closed-loop feedback by running simulations for ϵ-Perturbed DualHet. We note that running experiments in practical cellular networks to collect actual QoE is costly. Thus, we run simulations based on the 2D Gaussian mixture data, as described in Section VI-A. We let $M_-$ and $M_+$ be the number of samples generated

from the distribution with mean (10, 10) and (−10, −10), respectively. If we consider the logistic ground truth similar to Section VI-A, the predicted model will converge to the ground truth quickly, because logistic regression is indeed the ground truth model throughout the entire region. In such cases, the closed loop serves only to validate the causal relation, and its impact on model adjustment is negligible. In reality, we usually do not have a perfect model that captures the ground truth for the entire region. Therefore, to illustrate this impact, we consider the following case, where the samples are positive if $x_1 \le 0$ and $x_2 \le 0$, i.e.,

$$G(x) = \begin{cases} 1, & \text{if } x_1 \le 0 \text{ and } x_2 \le 0, \\ 0, & \text{otherwise.} \end{cases}$$

In these simulations, we can only allocate the second dimension of resource, where $R = [0, R_2]$ with $R_2 = 5000$. For ϵ-Perturbed DualHet, we set $\epsilon_t = \epsilon_0 / t$ for t = 1, 2, . . . , T, where $\epsilon_0 = 0.4$ is the exploration probability at t = 1. Figs. 7(a) to 7(c) show the evolution of the decision boundary of the predictive model, which is trained based on the samples obtained in the exploration iterations. As we can see from the figures, although the logistic regression model does not fit the ground truth, the decision boundary gradually converges to the horizontal line. Since we can only allocate the second dimension of the resource, this new model is more suitable for resource allocation (although it may generate less accurate classification results), and thus results in better QoE performance. This is demonstrated in Fig. 8. We can see that, with random exploration, ϵ-Perturbed DualHet can reduce a larger number of user complaints as soon as it finds a more suitable predictive model. Note that we study the above simple example for the logistic regression model. A similar idea applies to more complicated cases, although more expressive machine learning models, such as neural networks, are then needed. Interested readers are referred to our previous work [9].
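The exploration step can be sketched in a few lines of Python. This is a minimal illustration for a single resource type, not the authors' implementation: `epsilon_schedule` encodes ϵ_t = ϵ_0/t with ϵ_0 = 0.4 as in the simulation setup, and `perturb_allocation` follows Lines 5 to 12 of Algorithm 2. The additional cap by the remaining resource is an assumption of this sketch, added so that the total never exceeds R; the function names and the small example allocation are hypothetical, and labeling and retraining are omitted.

```python
import numpy as np

def epsilon_schedule(t, eps0=0.4):
    """Exploration probability eps_t = eps0 / t."""
    return eps0 / t

def perturb_allocation(r_p, B, R, rng):
    """Randomly perturb a DualHet allocation (Lines 5-12 of Algorithm 2)."""
    out = np.zeros_like(r_p)
    used = 0.0
    for i in range(len(r_p)):
        if used >= R:
            out[i] = 0.0                        # budget exhausted by earlier users
        else:
            draw = rng.uniform(0.0, 2.0 * r_p[i])
            # min(draw, B[i]) is Line 10; the cap by R - used is an extra
            # safeguard assumed here to keep the total within the budget.
            out[i] = min(draw, B[i], R - used)
        used += out[i]
    return out

# Hypothetical allocation produced by DualHet for four users.
rng = np.random.default_rng(1)
r_p = np.array([1.0, 2.0, 0.5, 3.0])
B = np.array([1.5, 2.5, 2.0, 4.0])
R = 6.5

for t in range(1, 6):
    if rng.random() <= epsilon_schedule(t):
        r_explore = perturb_allocation(r_p, B, R, rng)
        print(t, "explore with", r_explore)
        # Observing labels and retraining the classifier would follow here.
    else:
        print(t, "exploit with", r_p)

perturbed = perturb_allocation(r_p, B, R, np.random.default_rng(0))
```

Drawing from U([0, 2 r_i^p]) keeps the expected perturbed amount equal to the DualHet amount before clipping, which is one way to read the choice of the interval.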
It is also worth noting that, due to its hidden and complex impact, rigorous design and analysis of the exploration-exploitation tradeoff is still an open problem.

VII. CONCLUSION AND FUTURE WORK

We envision that user-experience-oriented system design and its resource allocation will become increasingly important in the near future. This work studies a data-driven resource allocation problem in cellular networks, where the objective is to minimize the number of user complaints based on trained logistic regression classifiers. We consider a general setting, where the same amount of allocated resources can have heterogeneous effects on different users' features. We design a DualHet algorithm to handle the cases with multiple resources and heterogeneous users. Simulation using a real dataset from cellular networks shows that our algorithm performs very close to the optimum and reduces up to 2x as many complaints as the optimized baseline. There are several limitations of our work. First, we assume the resources affect the values of features independently. While this is a reasonable approximation for a set of application scenarios, we hope to generalize the model to incorporate more


sophisticated scenarios. Second, since our current dataset is collected in the wild, we cannot evaluate the performance of the algorithms based on the feedback from users after resource allocation. Our future work will consider the online learning and resource allocation problem, where the feedback from users will potentially further tune the prediction model.

Fig. 7. Evolution of the decision boundary. The ground truth is G(x) = 1 if x1 ≤ 0 and x2 ≤ 0, and G(x) = 0 otherwise. The predictive model is trained based on the samples in the exploration iterations. (a) t = 0. (b) t = 2. (c) t = 37. [Plots over D1 and D2.]

Fig. 8. Evolution of reduced positives. (a) M− = 200, M+ = 200. (b) M− = 500, M+ = 500. [Number of reduced positives versus time t for DualHet and ϵ-Perturbed DualHet.]

APPENDIX A
PROOF OF PROPOSITION 1

We prove Proposition 1 by investigating the subgradient of the dual function $D(\lambda)$ at $\lambda^*$. Using the fact that the zero vector 0 is a subgradient of $D(\lambda)$ at $\lambda^*$, we show the feasibility and the near-optimality of the solution obtained by DualHet, respectively.

Step 1) Subgradient of the Dual Function

We first investigate the subgradient of the dual function $D(\lambda)$, denoted as $\partial D(\lambda)$. To obtain $\partial D(\lambda)$, we rewrite the dual function using the convex conjugate. Specifically, we introduce the following indicator function to capture the resource constraint:

$$\varphi(r) = \begin{cases} 0, & \text{if } 0 \le r \le R, \\ \infty, & \text{otherwise.} \end{cases}$$

The convex conjugate of $\varphi(r)$ is

$$\varphi^*(\lambda) = \sup_{r \in \mathbb{R}^K} \big[ \lambda^T r - \varphi(r) \big].$$

Then the dual function can be rewritten as

$$D(\lambda) = -\varphi^*(\lambda) + \sum_{i=1}^{M} \big[ \eta_i(r_i) + \lambda^T r_i \big], \tag{36}$$

where each $r_i$ is an optimal solution of (D-i) under $\lambda$. For $\lambda \ge 0$, we can easily verify that the subgradient of $\varphi^*(\lambda)$ is

$$\partial \varphi^*(\lambda) = \big\{ (\tilde{R}_1, \tilde{R}_2, \ldots, \tilde{R}_K) : \tilde{R}_k = R_k \text{ if } \lambda_k > 0, \text{ and } \tilde{R}_k \in [0, R_k] \text{ if } \lambda_k = 0 \big\}. \tag{37}$$

Combining Eqs. (36) and (37) and according to Lemma 3 in [26], we have the following lemma:

Lemma 1: The subgradient of the dual function $D(\lambda)$ is

$$\partial D(\lambda) = \mathrm{conv}\Big\{ \sum_{i=1}^{M} r_i - \tilde{R} : r_i \in \mathcal{R}_i(\lambda), \; \tilde{R} \in \partial \varphi^*(\lambda) \Big\}, \tag{38}$$

where R{ i (λ) } is the optimal individual decision set under λ and conv S is the convex hull of set S. Step 2) Feasibility of the DualHet Solution We now construct a near-optimal feasible solution by leveraging the property of ∂D(λ∗ ). Due to the optimality of λ∗ , we know that the zero vector 0 ∈ ∂D(λ∗ ), i.e., 0 ∈ ∂D(λ∗ ) = conv

M {∑

} ˜ : r∗ ∈ R i , R ˜ ∈ ∂φ∗ (λ∗ ) . r∗i − R i

i=1

Combining with the convexity of ∂φ∗ (λ∗ ), we know that there ˜ ∗ ∈ ∂φ∗ (λ∗ ) such that exists a vector R ˜ ∗ ∈ conv R

M {∑

} r∗i : r∗i ∈ R∗i ,

(39)

i=1



where R∗i is the optimal individual decision set under λ∗ . When there are only Q users having multiple solution under λ∗ , without loss of generality, we assume the first Q users have multiple solution under λ∗ , i.e., |R∗i | > 1 for 1 ≤ i ≤ Q and |R∗i | = 1 for i > Q. Then, (39) indicates that there exists an allocation decision ˜r = [˜r1 , ˜r2 , . . . , ˜rM ] such that { ∈ conv(R∗i ), 1 ≤ i ≤ Q, r˜i = ri∗ , i > Q,

(40)

and M ∑

˜ ∗, ˜ri = R

(41)

i=1

implying that $[\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_M]$ is a feasible solution of the primal problem, and $\sum_{i=Q+1}^{M} r_i^* = \sum_{i=Q+1}^{M} \tilde{r}_i \le R$. In Algorithm 1, DualHet first allocates the resource to the single-solution users with $r_i^p = r_i^*$ and then allocates the remaining resource to the other users. Thus, the solution $[r_1^p, r_2^p, \ldots, r_M^p]$ is feasible.

Step 3) Near-Optimality of the DualHet Solution

Note that $\mathcal{R}_i^*$ is the optimal individual decision set and thus $\eta_i(r_i^*) + (\lambda^*)^T r_i^*$ is a constant value for all $r_i^* \in \mathcal{R}_i^*$. Let $(\tilde{r}_i, \zeta_i) \in \operatorname{conv}\{(r_i^*, \eta_i(r_i^*)) : r_i^* \in \mathcal{R}_i^*\}$. Then,

$$
\zeta_i + (\lambda^*)^T \tilde{r}_i = \eta_i(r_i^*) + (\lambda^*)^T r_i^*, \quad i = 1, 2, \ldots, M.
\tag{42}
$$

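On the other hand, $\tilde{R}^* \in \partial\phi^*(\lambda^*)$ implies that $(\lambda^*)^T R = (\lambda^*)^T \tilde{R}^*$. Thus,

$$
\begin{aligned}
(\lambda^*)^T \Big( \sum_{i=1}^{M} r_i^* - R \Big)
&= (\lambda^*)^T \Big( \sum_{i=1}^{M} r_i^* - \tilde{R}^* \Big) \\
&= (\lambda^*)^T \Big( \sum_{i=1}^{M} r_i^* - \sum_{i=1}^{M} \tilde{r}_i \Big) \\
&= \sum_{i=1}^{Q} \big[ \zeta_i - \eta_i(r_i^*) \big] \\
&\ge - \sum_{i=1}^{Q} \eta_i(r_i^*).
\end{aligned}
\tag{43}
$$

Here the second equality uses (41), and the third uses (42) together with $\tilde{r}_i = r_i^*$ for $i > Q$ from (40).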

Because

$$
\sum_{i=1}^{M} \eta_i(r_i^*) + (\lambda^*)^T \Big( \sum_{i=1}^{M} r_i^* - R \Big) = L(r_1^*, r_2^*, \cdots, r_M^*, \lambda^*) \le V_{\mathrm{opt}},
$$

we have

$$
\sum_{i=Q+1}^{M} \eta_i(r_i^*) \le V_{\mathrm{opt}}.
$$
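For completeness, the step from (43) to the inequality above can be spelled out as follows. This is a sketch that relies on $\zeta_i \ge 0$ in (43), which holds under the assumption that each $\eta_i$ is nonnegative (as for a predicted complaint probability):

$$
\begin{aligned}
\sum_{i=Q+1}^{M} \eta_i(r_i^*)
&= \sum_{i=1}^{M} \eta_i(r_i^*) - \sum_{i=1}^{Q} \eta_i(r_i^*) \\
&\le \sum_{i=1}^{M} \eta_i(r_i^*) + (\lambda^*)^T \Big( \sum_{i=1}^{M} r_i^* - R \Big) \\
&= L(r_1^*, r_2^*, \cdots, r_M^*, \lambda^*) \le V_{\mathrm{opt}}.
\end{aligned}
$$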

The conclusion then follows by using the fact that under DualHet, $r_i = r_i^*$ for $i > Q$ and $\eta_i(r_i) \le 1$ for $1 \le i \le Q$.
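The two-phase construction used in Step 2 can be illustrated with a small sketch. Algorithm 1 is not reproduced in this appendix, so the second phase below is a hypothetical greedy rule (take the largest candidate that still fits, otherwise fall back to a zero allocation) rather than the DualHet procedure itself. The names `two_phase_allocate` and `fits`, the assumption that a zero allocation is always admissible, and the toy decision sets are all introduced here for illustration only.

```python
from typing import List, Sequence, Tuple

Allocation = Tuple[float, ...]  # one entry per resource dimension


def fits(alloc: Sequence[float], remaining: Sequence[float]) -> bool:
    """Return True if `alloc` does not exceed `remaining` in any dimension."""
    return all(a <= r + 1e-12 for a, r in zip(alloc, remaining))


def two_phase_allocate(decision_sets: List[List[Sequence[float]]],
                       budget: Sequence[float]) -> List[Allocation]:
    """Toy version of the two-phase construction from Step 2.

    Phase 1: users whose decision set has a single element keep it.
    Phase 2 (hypothetical greedy rule, NOT taken from the paper): each
    remaining user gets the largest candidate that still fits, else zero.
    """
    zero: Allocation = tuple(0.0 for _ in budget)
    remaining = list(budget)
    result: List[Allocation] = [zero] * len(decision_sets)

    # Phase 1: single-solution users (|R_i*| = 1).
    for i, candidates in enumerate(decision_sets):
        if len(candidates) == 1:
            alloc = tuple(candidates[0])
            if not fits(alloc, remaining):
                raise ValueError("single-solution users exceed the budget")
            result[i] = alloc
            remaining = [r - a for r, a in zip(remaining, alloc)]

    # Phase 2: multi-solution users (|R_i*| > 1), served in index order.
    for i, candidates in enumerate(decision_sets):
        if len(candidates) > 1:
            feasible = [tuple(c) for c in candidates if fits(c, remaining)]
            if feasible:
                alloc = max(feasible, key=sum)
                result[i] = alloc
                remaining = [r - a for r, a in zip(remaining, alloc)]
    return result


if __name__ == "__main__":
    # Two single-solution users and two users with two candidates each.
    sets = [[(1.0, 2.0)], [(2.0, 1.0)],
            [(0.0, 0.0), (3.0, 3.0)], [(0.0, 0.0), (1.0, 1.0)]]
    print(two_phase_allocate(sets, (5.0, 5.0)))
```

In this sketch, users with multiple optimal decisions can always fall back to the zero allocation, so the returned allocation stays within the budget whenever the single-solution users fit. This mirrors the feasibility argument above, where at most $Q$ users deviate from their optimal individual decisions.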




Xin Liu (M’09) received the Ph.D. degree in electrical engineering from Purdue University in 2002. She is currently a professor in the Computer Science Department at the University of California, Davis. Before joining UC Davis, she was a postdoctoral research associate in the Coordinated Science Laboratory at UIUC. From March 2012 to July 2014, she was on leave from UC Davis and with Microsoft Research Asia. Her research interests are in the area of wireless communication networks, with a current focus on data-driven approaches in networking. She received the Best Paper of the Year award of the Computer Networks Journal in 2003 for her work on opportunistic scheduling. She received the NSF CAREER award in 2005 for her research on Smart-Radio-Technology-Enabled Opportunistic Spectrum Utilization. She received the Outstanding Engineering Junior Faculty Award from the College of Engineering, University of California, Davis in 2005. She became a Chancellor’s Fellow in 2011.

Yanan Bao (S’12) received the B.S. and M.S. degrees from Tsinghua University, Beijing, China, in 2010 and 2013, respectively, and the Ph.D. degree from the University of California, Davis, USA, in 2016. He is currently working on Image Search at Google, Inc. His research interests include machine learning, data mining, and green communications.

Huasen Wu (S’12–M’14) received the B.S. and Ph.D. degrees from Beihang University, Beijing, China, in 2007 and 2014, respectively. He is currently a postdoctoral researcher in the Department of Computer Science, University of California, Davis. From December 2010 to January 2012, he was a visiting student at UC Davis, and from October 2012 to January 2014, he worked as a research intern in the Wireless and Networking Group, Microsoft Research Asia. His research interests are in stochastic learning and optimization for wireless networks, crowdsourcing, and recommendation systems.

