
Optimizing Caching Policy at Base Stations by Exploiting User Preference and Spatial Locality

arXiv:1710.09983v1 [cs.IT] 27 Oct 2017

Dong Liu and Chenyang Yang

Abstract

Most prior works on proactive caching at the wireless edge optimize caching policies under the following assumptions: the preference of each user is identical to content popularity, and all users request contents with the same active level and at uniformly distributed locations. In this paper, we investigate what happens without these assumptions. To this end, we establish a framework to optimize the caching policy at base stations by exploiting user preference, active level, and spatial locality. We obtain the optimal caching policy that minimizes the weighted sum of the file download time averaged over all file requests and user locations in the network (reflecting network performance) and the maximal weighted download time averaged over possible file requests and locations of each user (reflecting user fairness). To investigate how user preference similarity and active level skewness affect the optimal caching policy, we then provide a method to synthesize user preference for given content popularity and user active level. The analysis and simulation results show that exploiting user preference can improve both network performance and user fairness remarkably compared with prior works. The gain of exploiting user preference increases with user preference heterogeneity, user spatial locality, and skewness of user active level.

Index Terms

Caching policy, user preference, content popularity, spatial locality, active level.

The authors are with the School of Electronics and Information Engineering, Beihang University, Beijing, China (e-mail: {dliu, cyyang}@buaa.edu.cn).

October 30, 2017 DRAFT

I. INTRODUCTION

Motivated by the 80/20 rule in terms of content popularity, local caching at the wireless edge has attracted considerable attention recently [1, 2]. By caching popular contents at base stations (BSs), network throughput, energy efficiency and user experience can be improved dramatically [3–5]. Given the limited storage size at the wireless edge and the huge number of available contents, optimizing the proactive caching policy by exploiting the skewed statistics of user demands is critical to reaping the benefit of local caching [2].

Most prior works on caching policy optimization are based on content popularity, which can be predicted by various methods [6–8]. Diverse objectives and application scenarios have been considered. The original motivation of local caching is to reduce latency. In [9], the caching policy was optimized to minimize the average download delay, assuming that the location where each user sends its request is known a priori. Considering the uncertainty of user location, a probabilistic caching policy for BSs that maximizes the cache-hit probability in homogeneous networks was proposed in [10] and then extended to heterogeneous networks (HetNets) in [11]. Caching policies were optimized together with multicast and cooperative transmission to maximize the successful transmission probability in [12] and [13], and were optimized to minimize the energy consumption and the average bit error rate of cache-enabled wireless networks in [14] and [15], respectively. Coded caching policies were optimized in [16] to maximize the average fractional offloaded traffic and minimize the average ergodic rate for small cell networks. In [17], joint user association and content placement was optimized to maximize the supported traffic load in HetNets considering a hierarchical caching structure, where different kinds of BSs can share the cached content via backhaul links.

While significant research efforts have been devoted to wireless caching, the following facts that have been observed from real dataset analysis are largely overlooked in the literature:

1) Content popularity can reflect the average interests of multiple users, but cannot reflect the interest of each individual user.
In fact, due to the differences in culture, occupation and age of the users, the global content popularity observed at a large aggregation point (say a content server) is not the same as the local content popularity observed in a small region (say a campus [18] or a cell [19]), not to mention the preference of each user.

2) The active level of users is skewed. As reported in [20], 90% of the daily network traffic is generated by less than 10% of all users. The user preference is also skewed, i.e., a user is usually interested in a small fraction of all contents.

3) The location of mobile users is neither known in advance as assumed in [2, 9, 13, 15], nor completely randomly distributed throughout the network as assumed in [5, 10–12, 14, 16, 17]. The analysis of real data shows that users periodically reappear at the same location with high probability [21]. The measured data in [20] indicate that 60% of the users are active in only one cell each day and over 95% of the users travel across fewer than 10 BSs in a day, which suggests strong spatial locality of users. This suggests that the probability of a user being located in a given cell when sending a file request can be learned from the request history.

As a consequence, most prior works do not differentiate user preference from content popularity, implicitly assuming that the preferences are identical among users in a region [9–17] or in a social group [22] and equal to the content popularity. Besides, almost all existing works assume that all users request contents with the same active level and at uniformly distributed locations. This may degrade the caching gain, since these assumptions do not hold in reality.

By assuming user preferences as Zipf distributions with different ranks and user locations remaining unchanged during the period of content placement and content delivery, the caching policy was optimized to minimize the average download delay in [23]. Yet the user preference model was not validated, and all the users were assumed to have an identical active level. In [24], caching policies were optimized with learned user preference for device-to-device communications to maximize the offloading ratio, but the impact of user active level and spatial locality was not analyzed.

User preference prediction is a key task for recommendation systems, which are widely applied by content providers. User preference can be learned by collaborative filtering (CF) based on users' rating statistics and view or purchase history, exploiting the correlation of contents or preferences among users [25]. In wireless caching, CF methods such as matrix factorization (MF) [26] and probabilistic latent semantic analysis (pLSA) [27] have been adopted to learn user preference in [28] and [24], respectively, based on which caching policies were optimized to increase the offloading ratio. With learned user preferences, a natural way is to aggregate them into the local content popularity of a cell and then employ popularity-based caching policies, e.g., [19, 28].
Alternatively, caching policies can be optimized in a more sophisticated way by directly exploiting individual user preference as in [24]. Intuitively, the caching interests of users may conflict when user preference is heterogeneous (e.g., one user may prefer a BS to cache one file while another user may prefer the BS to cache a different file). Furthermore, since the coverage areas of BSs overlap in dense wireless networks, a user can download files from adjacent BSs. Therefore, the caching policy of each BS should consider not only the preferences of users within its cell but also the preferences of users in its neighboring cells.

In this paper, we go a step further and optimize the caching policy without these assumptions. To investigate when and how the assumptions impact local caching, we establish a framework to optimize caching policies for BSs by exploiting other 80/20 rules in terms of the heterogeneous preference, active level and spatial locality of users. Since local caching can reduce end-to-end delay, we consider the file download time as the metric. Taking the uncertainty of user demands and locations into account, we derive the file download time for each user averaged over all possible requests and locations of the user, namely the user average download time. When the caching interests of multiple users conflict due to different user preferences and locations, the caching policy will affect user fairness, because the file download time of each user depends not only on the channel condition but also on whether the requested file is cached. This suggests that we can optimize the caching policy to improve fairness among users. Different from ensuring user fairness by radio resource allocation (such as user scheduling and power allocation), the caching policy guarantees user fairness at the application level, which takes user preference into consideration.

In particular, we establish an optimization framework aimed at improving both network performance and user fairness. From the perspective of network performance, we minimize the download time averaged over all possible requests and locations of all users in the network, namely the network average download time. From the perspective of user fairness, we minimize the maximal weighted user average download time, where the weight is associated with the user active level. To achieve a tradeoff between the two perspectives, we minimize a weighted sum of the network average download time and the maximal weighted user average download time. To evaluate the entangled impacts of content popularity, user preference and active level on optimal caching policies, we propose a user preference synthesis method for given content popularity and user active level, and validate the method with two real datasets, the Million Songs dataset [29] and the Lastfm dataset [30].

The major contributions of this paper are summarized as follows:

• We propose a caching policy optimization framework, where the spatial locality, heterogeneous preference and active level of users are considered.

• We analyze individual user behavior from two real datasets, and provide a method to synthesize user preference, which can control the user preference similarity, content popularity, and active level skewness separately. The synthesis method is validated by the datasets.

• Simulation results show that exploiting individual user preference can improve both network performance and user fairness remarkably, and that the gain increases with the user preference heterogeneity, user spatial locality, and the skewness of user active level.

The rest of the paper is organized as follows. In Section II, we present the system model and characterize the connection between content popularity, user preference and active level. Section III optimizes the caching policy exploiting individual user preference. Section IV analyzes the statistics of user demands from two real datasets and proposes a user preference synthesis method. The simulation results are provided in Section V, and the conclusions are drawn in Section VI.

II. SYSTEM MODEL AND USER DEMAND STATISTICS

We consider a cache-enabled wireless network where each BS is equipped with $N_t$ antennas and a cache, and is connected to the core network via backhaul. For notational and analytical simplicity, we consider a hexagonal region with $N_b$ BSs, each with cell radius $D$, as shown in Fig. 1. The content library consists of $N_f$ files, each with size $F$, that all the users in the considered region may request. Each user is allowed to associate with one of the three nearest BSs (called the neighboring BS set) to download the requested file in order to increase the cache-hit probability. For example, when a user is located in the shaded area of Fig. 1, it can associate with BS1, BS2 or BS3, where BS1 is called the local BS of the user.¹ To avoid strong inter-cell interference inside the neighboring BS set, especially the interference generated by the local BS when the user downloads a file from other BSs [3], the BSs within the neighboring BS set use different frequency resources.

Denote $N_u$ as the total number of users within the $N_b$ cells. To reflect the spatial locality of users, we denote $\mathbf{A} = [a_{uj}]_{N_u \times N_b}$ as the user location probability matrix, where $a_{uj}$ is the probability that the $u$th user is located in the $j$th cell when it sends a file request. Since the exact location of a user in a cell is hard to predict, we assume that the $u$th user is uniformly distributed in the $j$th cell if it is located in that cell. Note that we only assume that the user is uniformly located within a cell, not that it is uniformly located in the whole region, because the probabilities of the user being located in different cells may differ [20].

¹ The framework can be extended to neighboring BS sets with any number of BSs. We choose three only for illustration.

Fig. 1. Layout of the cache-enabled network. The considered region is surrounded by a solid line. In this example, $N_b = 7$.

A. Caching Policy and File Download Time

To achieve better performance, we employ a coded caching strategy [9, 16] where each file is encoded by rateless maximum distance separable coding, such that a file can be retrieved by a user once $F$ bits of the requested file have been received. Denote $c_{bf}$ ($0 \le c_{bf} \le 1$) as the fraction of the $f$th file cached at the $b$th BS and $b_{lu}(\mathbf{x}_u)$² as the $l$th nearest BS of the $u$th user when the user receives the file at location $\mathbf{x}_u = (x_{u1}, x_{u2})$. When the total fraction of the $f$th file cached at the three nearest BSs of the $u$th user is no less than 1, i.e., $\sum_{l=1}^{3} c_{b_{lu} f} \ge 1$, the $u$th user can retrieve the $f$th file from the local caches of its neighboring BS set. When $\sum_{l=1}^{3} c_{b_{lu} f} < 1$, the user needs to download the remaining fraction $1 - \sum_{l=1}^{3} c_{b_{lu} f}$ of the file via the backhaul link.

² In the following, we use $b_{lu}$ instead of $b_{lu}(\mathbf{x}_u)$ for notational simplicity.

We assume a block-fading channel where the small-scale fading gain is constant in each time slot, and is independent and identically distributed among time slots. When the file size $F$ is large and the duration of each time slot $\Delta t$ is small, so that the number of time slots $N$ required to transmit a file is large,³ the download time per unit bit for the $u$th user located at $\mathbf{x}_u$ when downloading a file from its $l$th nearest BS can be derived as

$$\tau_{u b_{lu}}(\mathbf{x}_u) = \lim_{N \to \infty} \frac{N \Delta t}{\sum_{n=1}^{N} R_{u b_{lu}}(\mathbf{x}_u, n) \Delta t} = \frac{1}{\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} R_{u b_{lu}}(\mathbf{x}_u, n)} = \frac{1}{\bar{R}_{u b_{lu}}(\mathbf{x}_u)} \tag{1}$$

where $\sum_{n=1}^{N} R_{u b_{lu}}(\mathbf{x}_u, n) \Delta t$ is the number of bits downloaded during $N$ time slots, $R_{ub}(\mathbf{x}_u, n)$ is the downlink data rate from the $b$th BS to the $u$th user in the $n$th time slot, and $\bar{R}_{ub}(\mathbf{x}_u)$ is the average rate from the $b$th BS to the $u$th user.

To unify the expression, we denote the download time per unit bit for the $u$th user when downloading from the backhaul as $\tau_{u b_{4u}}(\mathbf{x}_u)$. Since caching is intended for networks with stringent backhaul capacity [1], we assume that the download time is limited by the backhaul bandwidth when the user downloads a file from the backhaul. Then, we have $\tau_{u b_{4u}}(\mathbf{x}_u) = \frac{1}{C_{\mathrm{bh},u}}$, where $C_{\mathrm{bh},u}$ is the backhaul bandwidth for the $u$th user.

³ This is true in practice. For example, when $F = 10$ MB, $\Delta t = 1$ ms, and the transmission rate is 8 Mbps, $N = 10^4$.


Since the caching policy is optimized on a much larger time scale (at least hours) than radio resource allocation such as power and bandwidth allocation (usually milliseconds), we do not jointly optimize cache and transmission resource allocation. To focus on how to optimize the caching policy exploiting user preference, we assume that each BS serves $N_t$ users in the same time-frequency resource by zero-forcing beamforming with equal power allocation. Then, the average achievable rate can be expressed as

$$\bar{R}_{ub}(\mathbf{x}_u) = \mathbb{E}_{h_{ub}, h_{ub'}} \left[ W_u \log_2 \left( 1 + \frac{\frac{P_t}{N_t} h_{ub} r_{ub}^{-\alpha}}{\sum_{b' \in \Phi_b, b' \neq b} P_t h_{ub'} r_{ub'}^{-\alpha} + \sigma^2} \right) \right] \tag{2}$$

where $W_u$ is the transmission bandwidth for the $u$th user, $P_t$ is the transmit power of each BS, $h_{ub}$ is the equivalent channel gain (including channel coefficient and beamforming) from the $b$th BS to the $u$th user, $r_{ub} = \|\mathbf{x}_u - \mathbf{x}_b\|$ is the distance between the $u$th user and the $b$th BS, $\alpha$ is the pathloss exponent, $\Phi_b$ denotes the set of BSs that share the same frequency with the $b$th BS, and $\sigma^2$ is the noise power.

When $\sum_{l=1}^{k-1} c_{b_{lu} f} < 1$ and $\sum_{l=1}^{k} c_{b_{lu} f} \ge 1$, the $u$th user needs to receive the $f$th file from the local caches of the 1st, $\cdots$, $k$th nearest BSs successively to retrieve the complete file. The download time for the user receiving the file from these $k$ BSs is given by

$$t_u^{fk}(\mathbf{x}_u) = F \sum_{l=1}^{k-1} c_{b_{lu} f} \tau_{u b_{lu}}(\mathbf{x}_u) + F \left( 1 - \sum_{l=1}^{k-1} c_{b_{lu} f} \right) \tau_{u b_{ku}}(\mathbf{x}_u) = F \tau_{u b_{ku}}(\mathbf{x}_u) - F \sum_{l=1}^{k-1} c_{b_{lu} f} \left( \tau_{u b_{ku}}(\mathbf{x}_u) - \tau_{u b_{lu}}(\mathbf{x}_u) \right) \tag{3}$$

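Then, for $k = 1, \cdots, 4$, the file download time of the $u$th user located at $\mathbf{x}_u$ and requesting the $f$th file can be expressed as a piecewise function

$$t_u^{f}(\mathbf{x}_u) = \begin{cases} t_u^{f1}(\mathbf{x}_u), & c_{b_{1u} f} \ge 1 \\ t_u^{f2}(\mathbf{x}_u), & \sum_{l=1}^{1} c_{b_{lu} f} < 1 \text{ and } \sum_{l=1}^{2} c_{b_{lu} f} \ge 1 \\ t_u^{f3}(\mathbf{x}_u), & \sum_{l=1}^{2} c_{b_{lu} f} < 1 \text{ and } \sum_{l=1}^{3} c_{b_{lu} f} \ge 1 \\ t_u^{f4}(\mathbf{x}_u), & \sum_{l=1}^{3} c_{b_{lu} f} < 1 \end{cases} \tag{4}$$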
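The piecewise rule in (3) and (4) can be illustrated with a minimal Python sketch. The function names, the list-based inputs and the numbers in the usage note are illustrative assumptions, not part of the paper. The first function fetches coded fractions from the nearest BSs in order and takes any remainder from the backhaul; the second evaluates the expression in (3) for every $k$ and takes the largest value, which gives the same result when the per-bit times are non-decreasing in $k$.

```python
def download_time(c, tau, F=1.0):
    """Download time of one file following (3)-(4).

    c   : cached fractions at the 1st, 2nd, 3rd nearest BSs (hypothetical input)
    tau : per-bit download times, one per BS in c, plus a last entry for the backhaul
    F   : file size
    """
    received = 0.0  # fraction of the file retrieved so far
    t = 0.0
    for frac, tau_l in zip(c, tau):
        take = min(frac, 1.0 - received)  # never fetch more than is still missing
        t += F * take * tau_l
        received += take
        if received >= 1.0:
            return t
    # Whatever is still missing comes from the backhaul (last entry of tau).
    return t + F * (1.0 - received) * tau[-1]


def download_time_max_form(c, tau, F=1.0):
    """Same quantity written as the maximum over k of the expression in (3)."""
    return max(
        F * tau[k] - F * sum(c[l] * (tau[k] - tau[l]) for l in range(k))
        for k in range(len(tau))
    )
```

For instance, with the assumed values `c = [0.5, 0.3, 0.4]` and `tau = [1, 2, 2.5, 3]`, both functions return 1.6: half of the file comes from the nearest BS, 0.3 from the second, and the remaining 0.2 from the third.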

B. Content Popularity and User Preference

The demand statistics for all users and for each individual user are modeled as follows.

Global Content Popularity is the probability distribution of the file requests within the $N_b$ cells. It is the aggregated user demand observed at a higher aggregation node, say the service gateway that covers the region, and reflects the average interests of all the users in the considered region. We denote $\mathbf{p} = [p_1, \cdots, p_{N_f}]$ as the global content popularity, where $p_f$ is the probability that a request sent by the users in the region is for the $f$th file in the library.

Local Content Popularity is the probability distribution of the file requests within a single cell, which reflects the aggregated user demand observed at a cell. We denote $p_{f|j}$ as the probability that a request sent by the users located in the $j$th cell is for the $f$th file.

User Preference is the conditional probability distribution of the requests from a user given that the user sends a file request, which reflects the demand of each individual user. Denote $\mathbf{Q} = [\mathbf{q}_1^T, \cdots, \mathbf{q}_{N_u}^T]^T$ as the user preference matrix, where $\mathbf{q}_u = [q_{1|u}, \cdots, q_{N_f|u}]$ is the preference vector of the $u$th user and $q_{f|u} \in [0, 1]$ is the conditional probability that the $u$th user demands the $f$th file given that it requests a file.

The preferences of two users are not identical when either the shape of the probability distribution $\mathbf{q}_u$ differs from that of $\mathbf{q}_m$, or the rankings of the elements in $\mathbf{q}_u$ and $\mathbf{q}_m$ differ. To reflect the similarity between the preferences of two users, we consider the cosine similarity frequently used in CF [25], which is defined as

$$\cos(\mathbf{q}_u, \mathbf{q}_m) \triangleq \frac{\mathbf{q}_u \mathbf{q}_m^T}{\|\mathbf{q}_u\| \cdot \|\mathbf{q}_m\|} \tag{5}$$

To use one parameter to characterize the heterogeneity of user preferences in a region, we consider the average similarity, which is the cosine similarity averaged over all two-user pairs

$$\cos(\mathbf{Q}) \triangleq \frac{2}{N_u (N_u - 1)} \sum_{u=1}^{N_u - 1} \sum_{m=u+1}^{N_u} \cos(\mathbf{q}_u, \mathbf{q}_m) \tag{6}$$
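As a small illustration of (5) and (6), the following Python sketch computes the pairwise and average cosine similarity of a preference matrix. The helper names are assumptions made for this example, and the two-user matrix is the one listed for an average similarity of 0.2 in Table I.

```python
import math


def cosine(qu, qm):
    """Cosine similarity between two preference vectors, as in (5)."""
    dot = sum(a * b for a, b in zip(qu, qm))
    norm_u = math.sqrt(sum(a * a for a in qu))
    norm_m = math.sqrt(sum(b * b for b in qm))
    return dot / (norm_u * norm_m)


def average_similarity(Q):
    """Cosine similarity averaged over all two-user pairs, as in (6)."""
    n_users = len(Q)
    total = sum(
        cosine(Q[u], Q[m])
        for u in range(n_users)
        for m in range(u + 1, n_users)
    )
    return 2.0 * total / (n_users * (n_users - 1))


# Preference matrix of the two users in the first row of Table I.
Q_example = [[0.75, 0.25, 0.0], [0.02, 0.38, 0.60]]
similarity = average_similarity(Q_example)  # close to 0.2
```

With identical rows the average similarity equals 1, which corresponds to the homogeneous-preference case assumed by popularity-based policies.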

Based on the law of total probability, the relation between global content popularity and user preference can be expressed as

$$p_f = \sum_{u=1}^{N_u} s_u q_{f|u} \triangleq \sum_{u=1}^{N_u} q_{uf} \tag{7}$$

where $s_u$ is the probability that the request is sent from the $u$th user, which reflects the active level of the user, and $q_{uf} = s_u q_{f|u}$ is the joint probability that the requested file is the $f$th file and the request is sent from the $u$th user. We denote $\mathbf{s} = [s_1, \cdots, s_{N_u}]$ as the user active level vector. With user active level $\mathbf{s}$, user preference $\mathbf{Q}$ and user location probability $\mathbf{A}$, the relation between the local content popularity of the $j$th cell and user preference can be expressed as

$$p_{f|j} = \frac{\sum_{u=1}^{N_u} a_{uj} s_u q_{f|u}}{\sum_{f=1}^{N_f} \sum_{u=1}^{N_u} a_{uj} s_u q_{f|u}} = \frac{\sum_{u=1}^{N_u} a_{uj} q_{uf}}{\sum_{u=1}^{N_u} a_{uj} s_u} \tag{8}$$
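Relations (7) and (8) can be checked numerically with a minimal Python sketch. The function names and the two location matrices are assumptions made for illustration; the active levels and preferences are those of the single-cell example in Table I.

```python
def global_popularity(s, Q):
    """Global content popularity from (7): p_f = sum_u s_u * q_{f|u}."""
    n_files = len(Q[0])
    return [sum(s[u] * Q[u][f] for u in range(len(s))) for f in range(n_files)]


def local_popularity(s, Q, A, j):
    """Local content popularity of cell j from (8)."""
    n_files = len(Q[0])
    weighted = [
        sum(A[u][j] * s[u] * Q[u][f] for u in range(len(s)))
        for f in range(n_files)
    ]
    total = sum(weighted)  # equals sum_u a_uj * s_u because each q_u sums to 1
    return [w / total for w in weighted]


s = [0.6, 0.4]
Q = [[0.75, 0.25, 0.0], [0.02, 0.38, 0.60]]
A_local = [[1.0, 0.0], [0.0, 1.0]]    # assumed: each user always stays in its own cell
A_uniform = [[0.5, 0.5], [0.5, 0.5]]  # assumed: each user equally likely in both cells

p = global_popularity(s, Q)                      # about [0.458, 0.302, 0.240]
p_cell0 = local_popularity(s, Q, A_local, 0)     # equals the preference of user 1
p_uniform = local_popularity(s, Q, A_uniform, 0)  # equals the global popularity
```

In this sketch, strong spatial locality makes the local popularity of a cell collapse to the preference of the user residing there, while uniformly distributed users make it coincide with the global popularity.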

From (7) and (8), we can obtain the following observation.


Observation:

1) When user preference is identical (i.e., $q_{f|1} = \cdots = q_{f|N_u}$), we have $p_f = p_{f|j} = q_{f|u}$, i.e., local content popularity and global content popularity are identical to the user preference.

2) When each user is uniformly distributed in the considered region (i.e., $a_{u1} = \cdots = a_{uN_b}$), we have $p_f = p_{f|j}$, i.e., local content popularity is identical to global content popularity regardless of whether user preferences are identical.

3) When both user preference and user location probability are non-identical, global content popularity, local content popularity and user preference are different in general.

In practice, $\mathbf{Q}$ and $\mathbf{s}$ can be learned by CF methods such as pLSA [24, 27] or MF [26, 28, 31] at a service gateway or even a content server. They are assumed to be perfectly known in the following analysis, since our focus is to find when exploiting user preference is beneficial. In Section V, we will evaluate the impact of imperfect user preference learning via simulation.

III. CACHING POLICY OPTIMIZATION WITH USER PREFERENCE

In practice, the exact location where each user sends the file request is unknown a priori when optimizing the caching policy. In this section, we first derive the download time of each user averaged over user location considering the user spatial locality. Then, we establish a framework to optimize the caching policy, aiming at improving the network performance as well as user fairness. Since the optimal policy has no closed-form expression, we demonstrate its behavior analytically in special cases and numerically with toy examples.

To derive the file download time averaged over user location, we divide each cell into 12 sectors as shown in Fig. 1. In this way, the $l$th nearest BS of the $u$th user, i.e., $b_{lu}$, no longer depends on the user location given that the user is located in the $i$th sector of the $j$th cell. Based on the law of total expectation, the average download time of the $u$th user requesting the $f$th file can be expressed as

$$\mathbb{E}_{\mathbf{x}_u}\left[ t_u^f(\mathbf{x}_u) \right] = \mathbb{E}_{ij}\left[ \mathbb{E}_{\mathbf{x}_u}\left[ t_u^f(\mathbf{x}_u) \,|\, ij \right] \right] = \sum_{j=1}^{N_b} \sum_{i=1}^{12} \frac{a_{uj}}{12} \mathbb{E}_{\mathbf{x}_u}\left[ t_u^f(\mathbf{x}_u) \,|\, ij \right] \tag{9}$$

where $\mathbb{E}_{\mathbf{x}_u}[t_u^f(\mathbf{x}_u) \,|\, ij]$ is the average download time of the user conditioned on its being located in the $i$th sector of the $j$th cell, and $\frac{a_{uj}}{12}$ is the probability that the $u$th user is located in the $i$th sector of the $j$th cell. Further considering (4), we can obtain

$$\mathbb{E}_{\mathbf{x}_u}\left[ t_u^f(\mathbf{x}_u) \,|\, ij \right] = \begin{cases} \mathbb{E}_{\mathbf{x}_u}\left[ t_u^{f1}(\mathbf{x}_u) \,|\, ij \right], & c_{b_{1ij} f} \ge 1 \\ \mathbb{E}_{\mathbf{x}_u}\left[ t_u^{f2}(\mathbf{x}_u) \,|\, ij \right], & \sum_{l=1}^{1} c_{b_{lij} f} < 1 \text{ and } \sum_{l=1}^{2} c_{b_{lij} f} \ge 1 \\ \mathbb{E}_{\mathbf{x}_u}\left[ t_u^{f3}(\mathbf{x}_u) \,|\, ij \right], & \sum_{l=1}^{2} c_{b_{lij} f} < 1 \text{ and } \sum_{l=1}^{3} c_{b_{lij} f} \ge 1 \\ \mathbb{E}_{\mathbf{x}_u}\left[ t_u^{f4}(\mathbf{x}_u) \,|\, ij \right], & \sum_{l=1}^{3} c_{b_{lij} f} < 1 \end{cases} \tag{10}$$

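where $b_{lij}$ is the $l$th nearest BS when the user is located in the $i$th sector of the $j$th cell. From (3), we can obtain

$$\mathbb{E}_{\mathbf{x}_u}\left[ t_u^{fk}(\mathbf{x}_u) \,|\, ij \right] = F \bar{\tau}_{u b_{kij}} - F \sum_{l=1}^{k-1} c_{b_{lij} f} \left( \bar{\tau}_{u b_{kij}} - \bar{\tau}_{u b_{lij}} \right) = F \bar{\tau}_{uk} - F \sum_{l=1}^{k-1} c_{b_{lij} f} \left( \bar{\tau}_{uk} - \bar{\tau}_{ul} \right) \tag{11}$$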

where $\bar{\tau}_{u b_{lij}} \triangleq \mathbb{E}_{\mathbf{x}_u}[\tau_{u b_{lij}}(\mathbf{x}_u) \,|\, ij]$ is the download time per bit from the $l$th nearest BS averaged over user location given that the user is located in the $i$th sector of the $j$th cell. Due to the symmetry of the network topology, $\bar{\tau}_{u b_{lij}}$ does not depend on $i$ and $j$ but only on $l$. Therefore, we use $\bar{\tau}_{ul}$ instead of $\bar{\tau}_{u b_{lij}}$ in the following for notational simplicity, whose value can be computed from Proposition 1. Since the average download time per bit increases with the distance between user and BS, we have $\bar{\tau}_{uk} > \bar{\tau}_{u(k-1)}$. Further considering the expressions of (10) and (11), similar to the proof of [9, Lemma 6], we can rewrite (10) as

$$\mathbb{E}_{\mathbf{x}_u}\left[ t_u^f(\mathbf{x}_u) \,|\, ij \right] = \max_{k=1,\cdots,4} \left\{ \mathbb{E}_{\mathbf{x}_u}\left[ t_u^{fk}(\mathbf{x}_u) \,|\, ij \right] \right\} \tag{12}$$

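Proposition 1. The average download time per bit of the $u$th user when downloading a file from its $l$th nearest BS can be obtained as

$$\bar{\tau}_{ul} = \frac{2\sqrt{3}}{W_u D^2} \int_0^{D} \int_0^{\frac{x_{u2}}{\sqrt{3}}} \left( \int_0^{\infty} \cdots \int_0^{\infty} \log_2 \left( 1 + \frac{\frac{P_t}{N_t} h_{ul} r_{ul}^{-\alpha}}{\sum_{b' \in \Phi_l, b' \neq l} P_t h_{ub'} r_{ub'}^{-\alpha} + \sigma^2} \right) f_1(h_{ul}) \prod_{b' \in \Phi_l, b' \neq l} f_2(h_{ub'}) \, dh_{ul} \prod_{b' \in \Phi_l, b' \neq l} dh_{ub'} \right)^{-1} dx_{u1} \, dx_{u2} \tag{13}$$

where $f_1(x) = e^{-x}$ and $f_2(x) = \frac{x^{N_t - 1} e^{-x N_t}}{N_t^{-N_t} (N_t - 1)!}$ for Rayleigh fading.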

Proof: See Appendix A.

Since the interference term in (13) is a sum of Gamma distributed random variables with different values of $r_{ub'}^{-\alpha}$, $\bar{\tau}_{ul}$ has no closed-form expression and its computation requires a $(|\Phi_l| + 2)$-fold numerical integration, which is of high complexity. In the following corollary, we introduce an accurate approximation for the interference term as in [32] and obtain an approximation of $\bar{\tau}_{ul}$ in the high signal-to-noise ratio (SNR) region.

Corollary 1. When $\frac{P_t}{\sigma^2} \to \infty$, $\bar{\tau}_{ul}$ can be approximated as

$$\bar{\tau}_{ul} \approx \frac{2\sqrt{3}}{W_u D^2} \int_0^{D} \int_0^{\frac{x_{u2}}{\sqrt{3}}} \left( \log_2 \frac{k_x}{k_y} + \frac{1}{\ln 2} \psi(\theta_x) - \frac{1}{\ln 2} \psi(\theta_y) \right)^{-1} dx_{u1} \, dx_{u2} \tag{14}$$

where $\psi(\cdot)$ is the digamma function, and $k_x$, $k_y$, $\theta_x$, and $\theta_y$ are given in Appendix B.

Proof: See Appendix B.

We can see from (14) that by employing the Gamma approximation, the computation of $\bar{\tau}_{ul}$ only requires a double numerical integration, which is much easier to compute.⁴ Then, by substituting (13) (or (14)) into (11) and further considering (12) and (9), the download time of the $u$th user averaged over its possible file requests and locations can be obtained as

$$\bar{t}_u = \mathbb{E}_f \left[ \mathbb{E}_{\mathbf{x}_u}\left[ t_u^f(\mathbf{x}_u) \right] \right] = \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} \frac{a_{uj} q_{f|u}}{12} \max_{k=1,\cdots,4} \left\{ F \bar{\tau}_{uk} - F \sum_{l=1}^{k-1} c_{b_{lij} f} \left( \bar{\tau}_{uk} - \bar{\tau}_{ul} \right) \right\} \tag{15}$$

⁴ When analyzing a network without such a symmetric topology, we can still divide each cell into several sectors so that the neighboring BS set is fixed in each sector, and then compute $\bar{\tau}_{u b_{lij}}$ by the Monte Carlo method.

A. Caching Policy Optimization

The network average download time is the file download time averaged over the requests of all the users in the considered region, which can reflect the average user experience. Intuitively, more requests can be served during a given period as the network average download time decreases, which implies a higher network throughput. Hence, it is a performance metric from the network perspective and is widely used in the literature [9, 22, 23]. It can be expressed as

$$T = \sum_{u=1}^{N_u} s_u \bar{t}_u = \frac{F}{12} \sum_{u=1}^{N_u} \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} a_{uj} q_{uf} \max_{k=1,\cdots,4} \left\{ \bar{\tau}_{uk} - \sum_{l=1}^{k-1} c_{b_{lij} f} \left( \bar{\tau}_{uk} - \bar{\tau}_{ul} \right) \right\} \tag{16}$$

To capture user fairness, we consider the maximal weighted user average download time $\max_{u=1,\cdots,N_u} \{ w_u \bar{t}_u \}$. When the weight $w_u$ is set identical for all the users, it reflects the fairness among the users sending file requests. Considering that the users with more file requests will suffer more if they have a longer download time, we can also set $w_u$ as an increasing function of the user active level $s_u$. As an illustration, we set $w_u = N_u s_u$ in the sequel. Then, the weighted average download time can be expressed as

$$w_u \bar{t}_u = N_u s_u \bar{t}_u = \frac{N_u F}{12} \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} a_{uj} q_{uf} \max_{k=1,\cdots,4} \left\{ \bar{\tau}_{uk} - \sum_{l=1}^{k-1} c_{b_{lij} f} \left( \bar{\tau}_{uk} - \bar{\tau}_{ul} \right) \right\} \tag{17}$$

To improve network performance and user fairness at the same time, we formulate the following general optimization framework, which minimizes the weighted sum of these two metrics:

$$\min_{\{c_{bf}\}} \; (1 - \eta) T + \eta \max_{u=1,\cdots,N_u} \{ N_u s_u \bar{t}_u \} \tag{18a}$$

$$\text{s.t.} \; \sum_{f=1}^{N_f} c_{bf} \le N_c, \; \forall b \tag{18b}$$

$$0 \le c_{bf} \le 1, \; \forall f, b \tag{18c}$$

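By changing the value of $\eta$ from 0 to 1, we can move the caching policy from minimizing the network average download time to minimizing the maximal weighted user average download time. By introducing auxiliary variables $\mu_{ij}^{uf}$ and $\nu$, which are upper bounds of $\{ \bar{\tau}_{uk} - \sum_{l=1}^{k-1} c_{b_{lij} f} (\bar{\tau}_{uk} - \bar{\tau}_{ul}) \}_{k=1,\cdots,4}$ and $\{ N_u s_u \bar{t}_u \}_{u=1,\cdots,N_u}$, respectively, we can convert the problem equivalently into

$$\min_{\{c_{bf}\}, \{\mu_{ij}^{uf}\}, \nu} \; (1 - \eta) \frac{F}{12} \sum_{u=1}^{N_u} \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} a_{uj} q_{uf} \mu_{ij}^{uf} + \eta \nu \tag{19a}$$

$$\text{s.t.} \; \bar{\tau}_{uk} - \sum_{l=1}^{k-1} c_{b_{lij} f} \left( \bar{\tau}_{uk} - \bar{\tau}_{ul} \right) \le \mu_{ij}^{uf}, \; \forall i, j, u, f, k \tag{19b}$$

$$\frac{N_u F}{12} \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} a_{uj} q_{uf} \mu_{ij}^{uf} \le \nu, \; \forall u \tag{19c}$$

$$\sum_{f=1}^{N_f} c_{bf} \le N_c, \; \forall b \tag{19d}$$

$$0 \le c_{bf} \le 1, \; \forall f, b \tag{19e}$$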

which is a linear programming problem and can be solved by the interior point method [33]. To reveal the behavior of the optimal caching policy, in what follows we analyze the solutions of the problem when optimizing the performance from the network perspective and from the user perspective, respectively. By setting $\eta = 0$, we obtain the problem minimizing the network average download time, which is referred to as Problem 1. By setting $\eta = 1$, we obtain the problem minimizing the maximal weighted user average download time, which is referred to as Problem 2.


Since transmission and caching resource allocation operate on different time scales, to analyze the impact of exploiting user preference on the optimal caching policy, we assume that the transmission resources are identical for each user (i.e., $\bar{\tau}_{1l} = \cdots = \bar{\tau}_{N_u l} \triangleq \bar{\tau}_l$) in the following.

Proposition 2. When each user is uniformly distributed throughout the network, exploiting user preference cannot improve the network average download time.

Proof: In this case, we have $a_{u1} = \cdots = a_{uN_b} = \frac{1}{N_b}$, $\bar{\tau}_{1l} = \cdots = \bar{\tau}_{N_u l}$, and $\mu_{ij}^{1f} = \cdots = \mu_{ij}^{N_u f} \triangleq \mu_{ij}^{f}$. Then, the first term in (19a) can be rewritten as

$$\frac{F}{12 N_b} \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} \left( \sum_{u=1}^{N_u} q_{uf} \right) \mu_{ij}^{f} = \frac{F}{12 N_b} \sum_{j=1}^{N_b} \sum_{i=1}^{12} \sum_{f=1}^{N_f} p_f \mu_{ij}^{f} \tag{20}$$

where the relation in (7) is used. We can see that the network average download time only depends on the global content popularity $p_f$.

Proposition 2 suggests that when the transmission resources are identical for all users, the gain of exploiting user preference disappears without user spatial locality. In other words, it implies that exploiting user preference can improve the network average download time even without user spatial locality when the transmission resources are not the same.

Proposition 3. When each user can only associate with the local BS (say the $b$th BS), the optimal solution of Problem 1 is $c^*_{b f_1^b} = \cdots = c^*_{b f_{N_c}^b} = 1$, $c^*_{b f_{N_c+1}^b} = \cdots = c^*_{b f_{N_f}^b} = 0$, where $f_i^b$ denotes the file with the $i$th largest value in the local popularity of the $b$th cell $\{p_{1|b}, \cdots, p_{N_f|b}\}$.

Proof: See Appendix C.

Proposition 3 suggests that when the transmission resources are identical for users and the cells do not overlap, the optimal caching policy minimizing the network average download time is simply to let each BS cache the most popular files according to local content popularity, as used in [19, 28]. Otherwise, the caching policies should be designed in a more sophisticated way.

Proposition 4. When the location probabilities and the preferences are identical for all users, the optimal solutions of Problem 1 and Problem 2 are identical.

Proof: In this case, since $\bar{\tau}_{1l} = \cdots = \bar{\tau}_{N_u l}$, $a_{1j} = \cdots = a_{N_u j}$ for all $j$, and $q_{f|1} = \cdots = q_{f|N_u}$ for all $f$, we can see from (15) that the average download time of each user is identical, i.e., $\bar{t}_1 = \cdots = \bar{t}_{N_u} \triangleq \bar{t}$. Then, Problem 1 and Problem 2 are both equivalent to minimizing $\bar{t}$.
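The policy described in Proposition 3 amounts to ranking files by local popularity and caching the top $N_c$ completely. A minimal Python sketch follows; the function name and the tie-breaking behavior (lower file index first) are assumptions made for illustration, and the popularity vector is the one used in the numerical example of this section.

```python
def cache_most_popular(p_local, n_cache):
    """Cache the n_cache files with the largest local popularity (Proposition 3).

    Returns the cached fraction of every file: 1.0 for cached files, 0.0 otherwise.
    Ties are broken in favor of the lower file index, which is an arbitrary choice.
    """
    ranked = sorted(range(len(p_local)), key=lambda f: p_local[f], reverse=True)
    cached = set(ranked[:n_cache])
    return [1.0 if f in cached else 0.0 for f in range(len(p_local))]


# Popularity of the three files in the numerical example, with room for one file.
policy = cache_most_popular([0.46, 0.30, 0.24], 1)
```

Here `policy` is `[1.0, 0.0, 0.0]`, i.e., the BS stores only the most popular file, which matches the Problem 1 column of Table I.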


Proposition 4 suggests that when transmission resources, location distributions and preferences are identical for all users (i.e., the actual identity of each user is irrelevant), minimizing the network download time is equivalent to minimizing the maximal weighted user average download time. Otherwise, the optimal caching policies for Problems 1 and 2 can be quite different as we show later. In the sequel, we refer the optimal caching policies for Problem 1 and Problem 2 as Policy 1 and Policy 2, respectively. B. Numerical Examples To help understand the behavior of Policy 1 and Policy 2 and the impact of heterogeneous user preference in single and multi-cell, we consider two toy examples as shown in Fig. 2.

[Fig. 2 panels: (a) Single cell with two users (BS1, UE1, UE2). (b) Two cells with one user in each cell (BS1, BS2, UE1, UE2).]

Fig. 2. Two simple but typical numerical examples for understanding the caching policies, Nu = 2. The total number of files is Nf = 3 and each BS can cache Nc = 1 file.

Suppose that each user can associate with either BS1 or BS2 to download files, and that the average download times from the nearest BS, the second nearest BS and the backhaul are $F[\bar\tau_{u1}, \bar\tau_{u2}, \bar\tau_{u4}] = [1, 2, 3]$ for both user equipments (UEs). The global popularity is set as $\mathbf{p} = [0.46, 0.30, 0.24]$ (obtained from a Zipf distribution with skewness parameter 0.6) and the active level of users is set as $\mathbf{s} = [0.6, 0.4]$. To show the impact of user preference similarity, we generate user preference matrices $\mathbf{Q}$ that satisfy (7) and achieve average cosine similarities of user preference of 0.2, 0.6 and 1. By solving Problems 1 and 2, we obtain Policy 1 (denoted as $\mathbf{C}^*$) with the minimized network average download time $T^*$, and Policy 2 (denoted as $\mathbf{C}^\dagger$) with the minimized maximal weighted user average download time $\max\{N_u s_1 \bar t^\dagger_1, N_u s_2 \bar t^\dagger_2\}$, respectively.

1) Located in One Cell: In this case, the user location probability matrix is $\mathbf{A} = [1, 1]^T$ and the optimization results are shown in Table I.


TABLE I
NUMERICAL EXAMPLE, $\mathbf{A} = [1, 1]^T$
(Each row of $\mathbf{Q}$ is one user; rows are separated by semicolons.)

Average similarity | $\mathbf{Q}$ | Problem 1: $\mathbf{C}^*$ | Problem 1: $s_1\bar t^*_1 + s_2\bar t^*_2 = T^*$ | Problem 2: $\mathbf{C}^\dagger$ | Problem 2: $\max\{N_u s_1 \bar t^\dagger_1, N_u s_2 \bar t^\dagger_2\}$
0.2 | [0.75 0.25 0; 0.02 0.38 0.6] | [1 0 0] | 0.90 + 1.18 = 2.08 | [0.79 0 0.21] | max{2.17, 2.17} = 2.17
0.6 | [0.58 0.37 0.05; 0.28 0.19 0.53] | [1 0 0] | 1.11 + 0.97 = 2.08 | [1 0 0] | max{2.22, 1.94} = 2.22
1 | [0.46 0.30 0.24; 0.46 0.30 0.24] | [1 0 0] | 1.25 + 0.83 = 2.08 | [1 0 0] | max{2.50, 1.66} = 2.50

We can see from Table I that when user preference similarity is low, i.e., 0.2, the most preferred files of UE1 and UE2 are file 1 ($q_{1|1} = 0.75 > 0.25 > 0$) and file 3 ($q_{3|2} = 0.6 > 0.38 > 0.02$), respectively. Therefore, UE1 prefers BS1 to cache file 1 so that its file download time can be minimized, while UE2 prefers BS1 to cache file 3. Since UE1 is more active than UE2, which results in a higher content popularity of file 1, Policy 1 lets BS1 cache file 1 completely, which agrees with Proposition 3. This, however, sacrifices the average download time of UE2 and makes UE2 have the maximal weighted user average download time (i.e., 1.18 > 0.90). Therefore, Policy 2 lets BS1 cache both files in part, i.e., 0.79 fraction of file 1 and 0.21 fraction of file 3.

When user preference similarity increases to 0.6, for a similar reason, Policy 1 still lets BS1 cache file 1. The average download time of UE1 increases while that of UE2 decreases, resulting in the same network average download time as in the case with similarity 0.2. However, in this case, the weighted average download time of UE1 becomes the maximal weighted user average download time, and hence Policy 2 also lets BS1 cache file 1 completely. Since the skewness of UE1's preference decreases (i.e., the shape of the probability distribution [0.75, 0.25, 0] is more "peaky" than [0.58, 0.37, 0.05]), which means that the file request of UE1 becomes less deterministic, the maximal weighted user average download time increases.

When user preference becomes identical, both users prefer BS1 to cache file 1 and hence both Policies 1 and 2 let BS1 cache file 1, which agrees with Proposition 4. Since the skewness of UE1's preference continues to decrease, the maximal weighted user average download time increases compared to the case with similarity 0.6.
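The single-cell numbers for similarity 0.2 can be reproduced with a short script. This is a sketch under stated assumptions rather than the solver used for Problems 1 and 2: it only evaluates two given cache vectors, and it assumes that a cached fraction of a file is fetched from BS1 at normalized time 1 while the remaining fraction comes over the backhaul at normalized time 3. The function names are hypothetical.

```python
def user_time(q, c, t_bs=1.0, t_backhaul=3.0):
    """Average download time of one user when a fraction c[f] of file f
    is served by BS1 and the remaining fraction by the backhaul."""
    return sum(qf * (cf * t_bs + (1.0 - cf) * t_backhaul) for qf, cf in zip(q, c))

s = [0.6, 0.4]                                 # active levels of UE1 and UE2
Q = [[0.75, 0.25, 0.00], [0.02, 0.38, 0.60]]   # preferences for similarity 0.2

def metrics(c):
    """Return (network average time, maximal weighted user average time)."""
    times = [user_time(q, c) for q in Q]
    network = sum(su * tu for su, tu in zip(s, times))
    worst = max(len(s) * su * tu for su, tu in zip(s, times))
    return network, worst

print(metrics([1.0, 0.0, 0.0]))     # BS1 caches file 1 completely (Policy 1)
print(metrics([0.79, 0.0, 0.21]))   # BS1 splits its cache (Policy 2)
```

Caching file 1 completely gives the lower network average time, whereas the split cache gives the lower maximal weighted time, which is the tradeoff described above. Small differences from the tabulated values are expected because the cached fractions are rounded to two decimals.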


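2) Located in Different Cells: In this case, the user location probability matrix is $\mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and the optimization results are shown in Table II.

TABLE II
NUMERICAL EXAMPLE, $\mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
(Each row of $\mathbf{Q}$ is one user and each row of $\mathbf{C}^*$, $\mathbf{C}^\dagger$ is one BS; rows are separated by semicolons.)

Average similarity | $\mathbf{Q}$ | Problem 1: $\mathbf{C}^*$ | Problem 1: $s_1\bar t^*_1 + s_2\bar t^*_2 = T^*$ | Problem 2: $\mathbf{C}^\dagger$ | Problem 2: $\max\{N_u s_1 \bar t^\dagger_1, N_u s_2 \bar t^\dagger_2\}$
0.2 | [0.75 0.25 0; 0.02 0.38 0.6] | [1 0 0; 0 0 1] | 0.90 + 0.71 = 1.61 | [1 0 0; 0 0.57 0.43] | max{1.63, 1.63} = 1.63
0.6 | [0.58 0.37 0.05; 0.28 0.19 0.53] | [1 0 0; 0 0 1] | 1.08 + 0.66 = 1.74 | [1 0 0; 0 0.88 0.12] | max{1.81, 1.81} = 1.81
1 | [0.46 0.30 0.24; 0.46 0.30 0.24] | [1 0 0; 0 1 0] | 1.07 + 0.77 = 1.84 | [1 0 0; 0 1 0] | max{2.14, 1.55} = 2.14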

We can see from Table II that when user preference similarity is low, UE1 prefers its local BS (i.e., BS1) to cache its most preferable file (i.e., file 1) and its neighboring BS (i.e., BS2) to cache its second preferable file (i.e., file 2), while UE2 prefers BS2 to cache file 3 and BS1 to cache file 2 according to its own preference. As a result, Policy 1 lets each BS cache the most preferable file of its local user, i.e., BS1 caches file 1 and BS2 caches file 3. As the user with the higher active level, UE1 has the maximal weighted user average download time. Hence, Policy 2 is more prone to let the BSs cache the files preferred by UE1, i.e., it lets BS2 cache 0.57 fraction of file 2 and 0.43 fraction of file 3.

When user preference similarity increases to 0.6, for a similar reason, Policy 1 lets each BS cache the most preferable file of its local user, i.e., BS1 caches file 1 and BS2 caches file 3. Again, due to its higher active level, UE1 has the maximal weighted user average download time. Moreover, since the skewness of UE1's preference decreases with user preference similarity for a given content popularity, Policy 2 is more prone to let the BSs cache the files preferred by UE1 than in the case with similarity 0.2, i.e., it lets BS2 cache 0.88 fraction of file 2 and 0.12 fraction of file 3. We can also see that the decrease of the skewness of UE1's preference increases both the network average download time and the maximal weighted user average download time.

When user preference becomes identical, Policy 1 and Policy 2 become the same, which agrees with Proposition 4. Since the skewness of user preference continues to decrease, both the network average download time and the maximal weighted user average download time increase.
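The two-cell case with similarity 0.2 can be checked in the same way. The sketch below rests on an assumption about the delivery model that is not spelled out in this section: a user first takes the fraction cached at its local BS (normalized time 1), then whatever the neighboring BS can add (time 2), and fetches the rest over the backhaul (time 3). The cache matrices are the ones that reproduce the reported values 1.61 and 1.63, and all names are hypothetical.

```python
def user_time(q, c_local, c_neigh, tau=(1.0, 2.0, 3.0)):
    """Average download time of one user: local cache first, then the
    neighboring cache, then the backhaul (assumed delivery order)."""
    total = 0.0
    for qf, cl, cn in zip(q, c_local, c_neigh):
        from_neigh = min(cn, 1.0 - cl)
        from_backhaul = 1.0 - cl - from_neigh
        total += qf * (cl * tau[0] + from_neigh * tau[1] + from_backhaul * tau[2])
    return total

s = [0.6, 0.4]
Q = [[0.75, 0.25, 0.00], [0.02, 0.38, 0.60]]   # similarity 0.2

def weighted_times(C):
    """Weighted average download times N_u * s_u * t_u, with UE1 in cell 1
    (local BS1) and UE2 in cell 2 (local BS2)."""
    t1 = user_time(Q[0], C[0], C[1])
    t2 = user_time(Q[1], C[1], C[0])
    return [len(s) * s[0] * t1, len(s) * s[1] * t2]

policy_1 = [[1.0, 0.0, 0.0], [0.0, 0.00, 1.00]]   # BS2 caches file 3
policy_2 = [[1.0, 0.0, 0.0], [0.0, 0.57, 0.43]]   # BS2 splits files 2 and 3

print(weighted_times(policy_1))
print(weighted_times(policy_2))
```

Under these assumptions, Policy 1 leaves UE1 with the larger weighted time, while the split cache at BS2 nearly equalizes the two users at about 1.63.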


From the analysis of the typical scenarios, we can observe that both network performance and user fairness can be improved by exploiting less-similar user preference. This is because the content popularity in a region is formed by the preferences of the users. To achieve a given popularity, the skewness of user preference increases with the preference heterogeneity. The performance of user preference based caching policies increases with the skewness of user preference, analogous to the widely recognized result that the performance of content popularity based caching policies increases with the skewness of content popularity.

IV. REAL DATASETS ANALYSIS AND USER PREFERENCE SYNTHESIS

The performance of a cache-enabled network highly depends on the user behavior in requesting contents, both collectively and individually. In this section, we first analyze content popularity, user active level, and user preference based on two real datasets. To fairly compare the optimized caching policies with user preference and content popularity, we then propose a method to synthesize user preference with given content popularity and user active level.

A. Real Datasets Analysis

We use the Million Songs Dataset (MSD) [29] and the Lastfm-1K Dataset [30] to analyze the user request behavior for songs. We choose these two music datasets because a song is often requested by a user many times, so that the ground truth of user preference can be obtained from the frequency of each user's requests for each song. MSD records user listening data gathered from undisclosed partners, and contains 1019318 users and 384546 songs. The Lastfm-1K Dataset records user listening data in the form of tuples collected from the Last.fm API, and represents the listening habits for 992 users and 1012 songs. To capture the main trends of user demand statistics, we choose the 100 most active users and the 300 most popular files requested by these users for analysis, which generate more than 75% of the traffic in the data.

In Fig. 3(a), we show the number of requests for each file in descending order. We can see that the content popularity of both datasets can be well fitted by a Zipf distribution. The popularity skewness parameters are $\delta_p = 0.21$ for Lastfm and $\delta_p = 0.74$ for MSD, respectively. In Fig. 3(b), we show the number of requests from each user in descending order. We can see that user active level can also be well fitted by a Zipf distribution for both datasets, whose skewness parameters are $\delta_s = 0.31$ for Lastfm and $\delta_s = 0.44$ for MSD.

[Fig. 3 panels: (a) Content popularity: number of requests versus file rank, for Lastfm (Zipf fit, $\delta_p = 0.21$) and MSD (Zipf fit, $\delta_p = 0.74$). (b) User active level: number of requests versus user rank, for Lastfm (Zipf fit, $\delta_s = 0.31$) and MSD (Zipf fit, $\delta_s = 0.44$).]

Fig. 3. Content popularity and user active level of MSD and Lastfm datasets. To show the absolute number of requests, we do not normalize the content popularity and user active level as probability distributions here.

[Fig. 4 panels: (a) User preference: number of requests versus file rank for UE10 and UE90 of Lastfm and MSD, each with its synthetic counterpart. (b) CDF of user preference similarity for Lastfm and MSD, with synthetic curves for Lastfm ($\theta = 0.98$) and MSD ($\theta = 0.21$).]

Fig. 4. User preference and the CDF of user preference similarity. The dashed lines are plotted from the synthetic user preference in Section IV-B. Again, we do not normalize the user preference.

In Fig. 4(a), we show the number of requests for each file of the 10th (i.e., more active) and 90th (i.e., less active) users in the two datasets. To show the shape of user preference, we re-rank the files by the number of requests for each user. We can see that user preference is also skewed, but the skewness is quite different for each user.$^5$ In Fig. 4(b), we show the cumulative distribution function (CDF) of the cosine similarity between every two-user pair. We can see that the cosine similarity distributions are quite different between Lastfm and MSD. For Lastfm, more than 90% of the user pair similarities are larger than 0.8, while for MSD, about 80% of the similarities are less than 0.2. Furthermore, we compute the average cosine similarity given by (6), which is 0.84 for Lastfm and 0.04 for MSD. The differences in user preference similarity between these two datasets may result from different file catalog sizes (MSD is much larger than Lastfm) and different music recommendation methods (e.g., personalized recommendation vs. popularity-based recommendation). Since the data of MSD is gathered from undisclosed partners, we are not able to further analyze the reasons behind this. Nevertheless, this suggests that the similarity of user preference varies significantly in the real world.

$^5$ In fact, the file rank is also quite different for each user. For example, in MSD, the top 5 preferable files of UE10 and UE90 are files with popularity ranking [52, 36, 41, 20, 31] and files with popularity ranking [186, 50, 24, 97, 158], respectively.

B. User Preference Synthesis

As illustrated by the previous two datasets, in different areas and for different categories of contents in real-world networks, content popularity, user active level, and user preference similarity are quite different, and they jointly affect the caching policies. To analyze the impact of user preference similarity on caching policies and the corresponding performance, user preference matrices $\mathbf{Q}$ with different levels of similarity are needed for evaluation. To differentiate the impact of each single factor, we need to control these factors separately, while the existing user preference synthesis methods such as [19, 24] fail to do so. In what follows, we synthesize user preference with different similarity for a given content popularity $\mathbf{p}$ and user active level $\mathbf{s}$.

From the definition of user preference, with given $\mathbf{p}$ and $\mathbf{s}$, each element in matrix $\mathbf{Q}$ should satisfy the following constraints:

$\sum_{u=1}^{N_u} s_u q_{f|u} = p_f, \ \forall f$ (21a)

$\sum_{f=1}^{N_f} q_{f|u} = 1, \ \forall u$ (21b)

$0 \le q_{f|u} \le 1, \ \forall f, u$ (21c)

where (21a) is the relation between user preference and content popularity, and (21b) and (21c) are the probability constraints. One way to obtain synthetic user preference with a given level of similarity is to directly solve the equations and inequalities in (21a)–(21c) together with an equation that sets $\cos(\mathbf{Q})$ in (6) equal to the given similarity. However, the solution is hard to obtain since the expression of $\cos(\mathbf{Q})$ is complicated, and the solution is not unique. Furthermore, the solution does not tell the shape of user preference, which is also skewed like content popularity as shown in Fig. 4(a). Hence, we provide an alternative way.

The basic idea of our user preference synthesis method is as follows. Obviously, when each user's preference is identical to the content popularity, i.e., $\mathbf{q}_u = \mathbf{p}$ for $u = 1, \cdots, N_u$, the average cosine similarity achieves the maximal value, i.e., $\cos(\mathbf{Q}) = 1$. To obtain heterogeneous (i.e., less similar) user preferences whose shapes are related to the shape of popularity, we make each element of $\mathbf{q}_u$ fluctuate around the content popularity. Then, by introducing a parameter $\theta$ to adjust the fluctuation range, the similarity (and also the shape) of user preference can be controlled.

In Algorithm 1, we generate the preference of each user in a successive manner. We first randomly choose a user and determine its preference. Suppose that the first chosen user is the $u$th user. Considering constraints (21a) and (21c), the preference of the $u$th user for the $f$th file is upper bounded by $\bar q_{f|u} = \min\{p_f / s_u, 1\}$. Then, we randomly choose a file from the file set $\mathcal{F} = \{1, \cdots, N_f\}$ for the $u$th user. By introducing the parameter $\theta$ and selecting $q_{f|u}$ randomly from $[\theta p_f, \theta p_f + (1 - \theta)\bar q_{f|u}]$ (i.e., the fluctuation range), we can adjust $q_{f|u}$ from being identical to the content popularity (i.e., $\theta = 1$) to fluctuating within $[0, \bar q_{f|u}]$ (i.e., $\theta = 0$). To satisfy (21b), $q_{f|u}$ is further adjusted as $\min\{q_{f|u}, 1\}$. Then, we remove the $f$th file from $\mathcal{F}$, randomly choose a file in $\mathcal{F}$ again for the $u$th user, and repeat the same procedure until $\sum_{f=1}^{N_f} q_{f|u} = 1$ or $\mathcal{F}$ is empty. If $\sum_{f=1}^{N_f} q_{f|u} = 1$, we remove the $u$th user from the user set $\mathcal{U}$, update the content popularity for the rest of the users, and generate the preference of the next user similarly. Otherwise, we randomly increase the values of $q_{f|u}$ that satisfy $q_{f|u} < \bar q_{f|u}$ so that (21b) can be satisfied, and then move on to the next user. The detailed procedure is given in Algorithm 1.

By considering the users in a cell or a region, we can synthesize user preference for a given global popularity or local popularity with the algorithm.

In Fig. 4(a), we plot the synthetic user preference of the 10th and 90th most active users based on the popularity and active level of the Lastfm and MSD datasets. The parameter $\theta$ is chosen as $\theta = 0.98$ for Lastfm and $\theta = 0.21$ for MSD, so that the average cosine similarity of user preference is the same for the dataset and the synthetic preference. We can see that the shapes of the synthetic user preference are similar to those of the datasets, which suggests that the synthetic user preference can reflect real user preference. In Fig. 4(b), we further plot the CDF of the similarity $\cos(\mathbf{q}_u, \mathbf{q}_m)$ defined in (5) of the synthetic user preference to compare the distribution of user preference similarity between the synthetic data and the real data. We can see that the synthetic user preference can fit both datasets well.

Algorithm 1. Synthesize user preference for given $\mathbf{p}$ and $\mathbf{s}$
Input: $\mathbf{p}$, $\mathbf{s}$, $\theta$

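Output: $\mathbf{Q}$
1: $\mathbf{Q} \leftarrow \mathbf{0}$, $\tilde{\mathbf{p}} \leftarrow \mathbf{p}$, $\boldsymbol{\rho} \leftarrow \mathbf{p}$, $\mathcal{U} \leftarrow \{1, \cdots, N_u\}$, where $\tilde{\mathbf{p}} \triangleq [\tilde p_1, \cdots, \tilde p_{N_f}]$, $\boldsymbol{\rho} \triangleq [\rho_1, \cdots, \rho_{N_f}]$
2: while $\mathcal{U}$ is not empty do
3:   Randomly choose a user $u$ in $\mathcal{U}$, $\bar{\mathbf{q}}_u \leftarrow \min\{\boldsymbol{\rho}/s_u, 1\}$ $^a$
4:   $\mathcal{F} \leftarrow \{1, \cdots, N_f\}$, $l \leftarrow 1$
5:   while $l > 0$ and $\mathcal{F}$ is not empty do
6:     Randomly choose a file $f$ in $\mathcal{F}$
7:     Set $\tilde q_{f|u} \in [\theta \tilde p_f, \theta \tilde p_f + (1 - \theta)\bar q_{f|u}]$ randomly
8:     $q_{f|u} \leftarrow \min\left\{ \tilde q_{f|u} \left( \frac{l}{\sum_{f' \in \mathcal{F}} \tilde p_{f'}} \right)^{\theta}, l, \bar q_{f|u} \right\}$
9:     $l \leftarrow l - q_{f|u}$, $\tilde q_{f|u} \leftarrow 0$, remove $f$ from $\mathcal{F}$
10:   end while
11:   if $l > 0$ then
12:     $\tilde{\mathcal{F}} \leftarrow \{f \mid q_{f|u} < \bar q_{f|u}\}$
13:     while $l > 0$ do
14:       Randomly choose a file $f'$ in $\tilde{\mathcal{F}}$
15:       $\tilde q_{f'|u} \leftarrow \min\{q_{f'|u} + l, \bar q_{f'|u}\}$
16:       $l \leftarrow l - (\tilde q_{f'|u} - q_{f'|u})$
17:       $q_{f'|u} \leftarrow \tilde q_{f'|u}$
18:       if $q_{f'|u} = \bar q_{f'|u}$ then
19:         Remove $f'$ from $\tilde{\mathcal{F}}$
20:       end if
21:     end while
22:   end if
23:   $\boldsymbol{\rho} \leftarrow \boldsymbol{\rho} - s_u \mathbf{q}_u$
24:   $\tilde{\mathbf{p}} \leftarrow \frac{\boldsymbol{\rho}}{\sum_{f=1}^{N_f} \rho_f}$, remove $u$ from $\mathcal{U}$
25: end while

$^a$ Here, $\bar{\mathbf{q}}_u \triangleq [\bar q_{1|u}, \cdots, \bar q_{N_f|u}]$ and $\min\{\mathbf{x}, z\} \triangleq [\min\{x_1, z\}, \cdots, \min\{x_n, z\}]$.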
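For readers who want to experiment with the synthesis procedure, the following is a minimal sketch of one possible reading of Algorithm 1. It is not the authors' reference implementation: the function name `synthesize_preference`, the seeded `random.Random` generator and the small tolerance `EPS` used in place of the exact test $l > 0$ are assumptions made for illustration.

```python
import random

EPS = 1e-12  # numerical tolerance standing in for the exact test l > 0

def synthesize_preference(p, s, theta, rng=None):
    """Sketch of Algorithm 1: build Q (users x files) for popularity p,
    active level s and fluctuation parameter theta in [0, 1]."""
    rng = rng or random.Random(0)
    n_users, n_files = len(s), len(p)
    Q = [[0.0] * n_files for _ in range(n_users)]
    rho = list(p)        # popularity mass not yet assigned to any user
    p_tilde = list(p)    # normalized remaining popularity
    users = list(range(n_users))
    while users:
        u = users.pop(rng.randrange(len(users)))       # random user
        q_bar = [min(r / s[u], 1.0) for r in rho]      # upper bounds
        files = list(range(n_files))
        left = 1.0                                     # the variable l
        while left > EPS and files:
            f = files.pop(rng.randrange(len(files)))   # random file
            mass = p_tilde[f] + sum(p_tilde[g] for g in files)
            draw = rng.uniform(theta * p_tilde[f],
                               theta * p_tilde[f] + (1.0 - theta) * q_bar[f])
            scale = (left / mass) ** theta if mass > 0 else 1.0
            Q[u][f] = min(draw * scale, left, q_bar[f])
            left -= Q[u][f]
        if left > EPS:
            # Top up randomly chosen files that are still below their bound.
            free = [f for f in range(n_files) if Q[u][f] < q_bar[f]]
            while left > EPS and free:
                f = free.pop(rng.randrange(len(free)))
                raised = min(Q[u][f] + left, q_bar[f])
                left -= raised - Q[u][f]
                Q[u][f] = raised
        rho = [max(r - s[u] * q, 0.0) for r, q in zip(rho, Q[u])]
        total = sum(rho)
        p_tilde = [r / total for r in rho] if total > 0 else rho
    return Q

Q = synthesize_preference([0.4, 0.3, 0.2, 0.1], [0.5, 0.3, 0.2], theta=0.3)
for row in Q:
    print([round(q, 3) for q in row])
```

In this sketch every row of the output sums to one and the weighted column sums reproduce the input popularity, i.e., (21a)–(21c) hold up to rounding, and $\theta = 1$ returns the popularity itself for every user.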

V. SIMULATION RESULTS

In this section, we compare the performance of the proposed caching policies with prior works, and analyze the impact of various factors by simulation.

Consider $N_b = 7$ cells with radius $D = 250$ m as shown in Fig. 1. Each BS has four transmit antennas. The backhaul bandwidth and the downlink transmission bandwidth for each user are set as $C_{bh,u} = 2$ Mbps and $W_u = 5$ MHz, respectively. We consider Rayleigh fading channels and the pathloss is modeled as $35.5 + 37.6 \log_{10}(r_{ub})$. To reduce simulation time, we set the total number of users in the considered region as $N_u = 100$, and set $N_f = 100$ files each with size of $F = 30$ MB. We assume that each BS can cache 10% of the total files, i.e., $N_c = 10$. The probability distribution for the users located in different cells is modeled as a Zipf distribution, $a_{u j_u} = \frac{j^{-\delta_a}}{\sum_{j'=1}^{N_b} j'^{-\delta_a}}$, where $j_u$ is the $j$th most probable cell that the $u$th user may be located in and the skewness parameter is $\delta_a = 1$ based on the measured data in [20]. A larger value of $\delta_a$ for a user indicates that the user is located in a few cells with high probability. To analyze the impacts of user preference and active level and to fairly compare with existing caching policies, we use the synthetic user preference for simulation. The global content popularity is modeled as a Zipf distribution, $p_f = \frac{f^{-\delta_p}}{\sum_{n=1}^{N_f} n^{-\delta_p}}$, where the skewness parameter is $\delta_p = 0.6$. The active level of each user is also modeled as a Zipf distribution, $s_u = \frac{u^{-\delta_s}}{\sum_{u'=1}^{N_u} u'^{-\delta_s}}$, where the skewness parameter is $\delta_s = 0.4$ according to the dataset analysis in the previous section. Unless otherwise specified, the above setting is used throughout the simulation.

To obtain synthetic user preference with different levels of average cosine similarity, we adjust the value of $\theta$ in Algorithm 1 from 0 to 1. The following caching policies are compared:

1) “Global Pop”: Each BS caches the $N_c$ most popular files according to the global content popularity $p_f$.
2) “Local Pop”: Each BS caches the $N_c$ most popular files according to the local content popularity within its cell $p_{f|j}$ given by (8). This is the method used in [19, 28].
3) “Femtocaching (Pop)”: This is the caching policy proposed in [9] minimizing the network average download time, which is based on global content popularity assuming that user location is fixed. To show the impact of uncertain user location, we obtain the caching policy based on one realization of user locations and then fix the caching policy for the rest of the realizations of user locations.
4) “Femtocaching (Pref)”: We modify the caching policy in [9] to exploit user preference by simply replacing the global content popularity $p_f$ by the user preference $q_{f|u}$. This method still assumes that the location of each user is known a priori when optimizing the caching policy.
5) “Min average DL time (Pop)”: The optimal solution of Problem 1, where $q_{uf}$ is replaced by $p_f$. This is the optimal caching policy minimizing the network average download time without the knowledge of user preference and active level, i.e., it regards user preference as identical to content popularity and regards the active levels of different users as identical.
6) “Min average DL time (Pref)”: The optimal solution of Problem 1, i.e., Policy 1.
7) “Min max DL time (Pop)”: The optimal solution of Problem 2, where $q_{uf}$ is replaced by $p_f$. This is the optimal caching policy minimizing the maximal user weighted average download time without the knowledge of user preference and active level.
8) “Min max DL time (Pref)”: The optimal solution of Problem 2, i.e., Policy 2.

[Fig. 5 panels: (a) Network average download time and (b) maximal weighted user average download time, each versus the average cosine similarity of user preference, for the compared caching policies.]

Fig. 5. The impact of user preference similarity, $\delta_a = 1$ and $\delta_s = 0.4$.

In Fig. 5(a), we show the impact of user preference similarity on the network average download time (in seconds). It is shown that “Local Pop” can reduce the network average download time compared with “Global Pop” when user preferences are heterogeneous. “Femtocaching (Pop)” and “min average DL time (Pop)” perform almost the same without exploiting individual user preference. This is because when user preference and user active level are regarded as identical, the actual identity of the user is irrelevant (i.e., there is no difference among users statistically). When a user (say the $u$th user) changes its location, and another user moves to the previous location of the $u$th user, the caching solution will be the same as the previous solution [9]. However, when taking into account the preference of each individual user, the performance is quite different. The network average download time of “Femtocaching (Pref)” is even higher than that of “Femtocaching (Pop)” when user preference is less similar. This is because the “Femtocaching” method does not consider the uncertainty of user location, which has a large impact when user preference is heterogeneous, in which case user identity is no longer irrelevant. As expected, “min average DL time (Pref)” achieves the lowest network average download time. Besides, the network average download time of “min average DL time (Pref)” increases with the preference similarity, which coincides with the results of the numerical example in Section III-B-2).

In Fig. 5(b), we show the impact of user preference similarity on the maximal weighted user average download time (in seconds). We can see that “min max DL time (Pref)” can reduce the maximal weighted user average download time by 60% compared with “Global Pop”. However, with “min max DL time (Pop)”, the maximal weighted user average download time does not decrease much compared with other caching policies. This suggests that the knowledge of user preference is important when improving user fairness by caching. Similar to Fig. 5(a), the maximal weighted download time of “Femtocaching (Pref)” is higher than that of “Femtocaching (Pop)”. The maximal weighted user average download time of “min max DL time (Pref)” increases with the preference similarity, and the explanations are similar to those in Section III-B.

[Fig. 6 panels: (a) Network average download time and (b) maximal weighted user average download time, each versus the user location skewness parameter $\delta_a$, for the compared caching policies.]

Fig. 6. Impact of user location skewness parameter, δa . The average user preference similarity is set as cos(Q) = 0.1.

In Fig. 6, we show the impact of user location skewness on the two performance metrics. As shown in Fig. 6(a), the network average download time of “Local Pop” and the other user preference based caching policies decreases with the user location skewness parameter $\delta_a$. Moreover, when $\delta_a = 0$, i.e., each user is located in each of the $N_b$ cells with equal probability, “min average DL time (Pref)” achieves the same performance as “min average DL time (Pop)”, which verifies Proposition 2. When $\delta_a = 5$, i.e., each user is located in one cell with probability $\frac{1^{-5}}{\sum_{j=1}^{7} j^{-5}} = 0.96$, “min average DL time (Pref)” reduces the network download time by 20% compared with “Global Pop”. This suggests that the benefit of exploiting user preference highly relies on the spatial locality of a user. As shown in Fig. 6(b), the maximal weighted user average download time of “Local Pop” and the other user preference based caching policies also decreases with $\delta_a$. When $\delta_a = 5$, “min max DL time (Pref)” can reduce the maximal weighted user average download time by 55% compared with “Global Pop”. On the contrary, the spatial locality has little impact on both metrics achieved by the global popularity based caching policies, because these policies regard user preference and user active level as identical.
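The probability quoted for $\delta_a = 5$ follows directly from the Zipf location model with $N_b = 7$ cells. A minimal check, with a hypothetical helper name, is:

```python
def location_probabilities(n_cells, delta_a):
    """Zipf probabilities that a user is in its 1st, 2nd, ... most probable cell."""
    weights = [j ** (-delta_a) for j in range(1, n_cells + 1)]
    total = sum(weights)
    return [w / total for w in weights]

for delta_a in (0, 1, 5):
    probs = location_probabilities(7, delta_a)
    print(delta_a, [round(x, 3) for x in probs])
```

For $\delta_a = 0$ all seven cells are equally likely, which is the case without spatial locality, while for $\delta_a = 5$ the most probable cell takes about 0.96 of the probability mass.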


[Fig. 7 panels: (a) Network average download time and (b) maximal weighted user average download time, each versus the user active level skewness parameter $\delta_s$, for the compared caching policies.]

Fig. 7. Impact of user active level, δs . The average user preference similarity is set as cos(Q) = 0.1.

In Fig. 7, we show the impact of user active level skewness on the two metrics. We can see from Fig. 7(a) that the network average download time of all caching policies decreases with $\delta_s$ (though slightly for “Global Pop”). This is because when the skewness of user active level increases, the caching solutions are determined more by the preference of highly active users, and hence the average download time of these users decreases, which reduces the network average download time. As shown in Fig. 7(b), the maximal weighted user average download time of all caching policies increases with $\delta_s$. This is because the weights for highly active users increase with $\delta_s$. As expected, the maximal weighted user average download time of “min max DL time (Pref)” is the lowest, and the performance gain increases with $\delta_s$.

In Fig. 8, we show the tradeoff between network average download time and maximal weighted user average download time by solving problem (18) with different values of $\eta$. It is shown that when $\eta$ is set between 0 and 0.25, we can achieve lower network average download time and better user fairness than the baseline policies at the same time.

[Fig. 8: maximal weighted user average download time versus network average download time for “Global Pop”, “Local Pop”, “Femtocaching (Pop)” and the optimal solution of (18); $\eta = 0$ corresponds to “min average DL time (Pref)”, $\eta = 1$ corresponds to “min max DL time (Pref)”, and $\eta = 0.25$ is marked.]

Fig. 8. Tradeoff between network average download time and maximal weighted user average download time, $\cos(\mathbf{Q}) = 0.1$.

In Fig. 9, we show the impact of imperfect user preference. We compare the following methods to learn user preference or content popularity: (1) “Frequency”: this method uses the frequency-count method in [34] to learn user preference, i.e., $\hat q_{uf} = \frac{n_{uf}}{\sum_{u=1}^{N_u}\sum_{f=1}^{N_f} n_{uf}}$, where $n_{uf}$ is the number of requests of the $u$th user for the $f$th file. (2) “MF”: this is a CF method proposed in [31] that learns user preference by matrix factorization. Since the user preference learned by this method is not in the probability form that we apply for optimization, we obtain the probability that each user requests each file by normalizing the learned preference. (3) “Popularity”: this method learns the global content popularity, also with the frequency-count method, as $\hat p_f = \frac{\sum_{u=1}^{N_u} n_{uf}}{\sum_{f=1}^{N_f}\sum_{u=1}^{N_u} n_{uf}}$ [34], and then optimizes the caching policy based on the learned popularity. To show the learning rate of these methods, we use the total number of requests in the considered region as the x-axis. When the total number of requests exceeds 300 (i.e., more than 3 requests by each user on average), the caching policies based on user preference learned by the “Frequency” method outperform the caching policies based on the learned global content popularity. The caching policies based on user preference learned by the “MF” method outperform both the “Popularity” and “Frequency” methods, and can achieve performance close to the caching policies with perfect user preference when the total number of requests exceeds $10^4$ (i.e., more than 100 requests for each user on average).$^6$

[Fig. 9: download time versus total number of requests, showing the maximal weighted user average DL time of “min max DL time” and the network average DL time of “min average DL time”, each with the Frequency, Popularity, MF and Perfect variants.]

Fig. 9. Impact of imperfect user preference learning. The average user preference similarity is set as $\cos(\mathbf{Q}) = 0.3$.

It is worth mentioning that learning the individual preference of a large number of users can be more computationally complex than learning content popularity (even when both use the simple frequency-count method), and informing the BSs of the predicted user preference may cause overhead. Fortunately, in contrast to content popularity, which has been observed to exhibit spatial locality and hence should be learned at nodes closer to the edge, user preference can be learned at a service gateway or even a content server$^7$ that has abundant computing resources. Moreover, there is no need to learn very frequently (say each day). Nevertheless, to harness the benefit of user preference based caching policy, it is worthwhile to investigate how to reduce the complexity and overhead.

$^6$ If each user sends 25 requests per day on average, then 4 days are required.

$^7$ This is possible since there is a trend of collaboration between mobile operators and content providers. Moreover, content providers have investigated user preference prediction for a long time since it is a key task for recommendation systems [25].

VI. CONCLUSION

In this paper, we proposed a caching policy optimization framework taking into account the spatial locality, heterogeneous preference and active level of users to minimize the network average download time and the maximal weighted user average download time. We showed in which cases the global content popularity, local content popularity and user preference are identical and in which cases they differ. To facilitate the investigation of how user preference similarity and active level skewness affect the optimal caching policies, we provided a method to synthesize user preference with given content popularity and user active level and validated it with two real datasets. Simulation results showed that both network performance and user fairness can be improved remarkably by exploiting individual user preference compared with prior works exploiting content popularity, as expected. For a region with given content popularity, the gain of the proposed policy is large when user preferences are less similar or when user active levels are more skewed, and more importantly, when each user exhibits spatial locality. Since in real-world networks user preferences are indeed heterogeneous and each user indeed sends requests in a limited number of locations, as observed by many recent data analyses, optimizing caching policy with user preference is promising to reap the benefit of caching at the wireless edge.

APPENDIX A
PROOF OF PROPOSITION 1

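From (2), we can obtain

$$\bar{R}_{ub}(\mathbf{x}_u) = W_u \int_0^\infty \cdots \int_0^\infty \log_2\left(1 + \frac{\frac{P_t}{N_t} h_{ub} r_{ub}^{-\alpha}}{\sum_{b' \in \Phi_b, b' \neq b} P_t h_{ub'} r_{ub'}^{-\alpha} + \sigma^2}\right) f_1(h_{ub}) \prod_{b' \in \Phi_b, b' \neq b} f_2(h_{ub'}) \, dh_{ub} \prod_{b' \in \Phi_b, b' \neq b} dh_{ub'} \tag{A.1}$$

where $f_1(x) = e^{-x}$ and $f_2(x) = \frac{x^{N_t-1} e^{-x N_t}}{N_t^{-N_t} (N_t-1)!}$, because $h_{ub}$ and $h_{ub'}$ follow an exponential distribution with unit mean and a Gamma distribution $\mathbb{G}(N_t, \frac{1}{N_t})$, respectively, and $(x_b, y_b)$ is the location of the $b$th BS.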
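As a numerical cross-check of (A.1), the multi-dimensional integral can be estimated by sampling the channel gains instead of integrating over them. The following is a minimal sketch and not part of the original derivation: the function name `avg_rate_mc`, the sample count, the fixed seed, and the example distances are all illustrative assumptions.

```python
import math
import random


def avg_rate_mc(r_serving, r_interf, alpha, n_t, p_t, noise_power,
                bandwidth=1.0, samples=20000, seed=0):
    """Monte Carlo estimate of the average rate in (A.1).

    r_serving: distance to the serving BS; r_interf: distances to interfering BSs.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for repeatable estimates
    total = 0.0
    for _ in range(samples):
        # h_ub ~ exponential with unit mean
        h_serving = rng.expovariate(1.0)
        # h_ub' ~ Gamma(shape N_t, scale 1/N_t), one draw per interfering BS
        interference = sum(
            p_t * rng.gammavariate(n_t, 1.0 / n_t) * r ** -alpha
            for r in r_interf
        )
        signal = (p_t / n_t) * h_serving * r_serving ** -alpha
        total += math.log2(1.0 + signal / (interference + noise_power))
    return bandwidth * total / samples


# Hypothetical example: two interferers at distance 2, serving BS at 1 or 1.5.
rate_near = avg_rate_mc(1.0, [2.0, 2.0], alpha=3.0, n_t=4, p_t=1.0, noise_power=1e-3)
rate_far = avg_rate_mc(1.5, [2.0, 2.0], alpha=3.0, n_t=4, p_t=1.0, noise_power=1e-3)
print(rate_near, rate_far)
```

Because both calls reuse the same seed, they see the same channel draws, so the estimate for the farther serving BS is strictly smaller, as expected from the path-loss term.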


Without loss of generality, we derive the average download time from the three nearest BSs when the $u$th user is located in the 1st sector of the 1st cell, i.e., the shaded area in Fig. 1. From (1), by taking the expectation over $\mathbf{x}_u$ within the shaded area, we have

$$\bar{\tau}_{ul} = \frac{2\sqrt{3}}{W_u D^2} \int_0^D \int_0^{\frac{x_{u2}}{\sqrt{3}}} \frac{1}{\bar{R}_{ul}(\mathbf{x}_u)} \, dx_{u1} \, dx_{u2} \tag{A.2}$$

Further considering (A.1), Proposition 1 can be proved.

APPENDIX B
PROOF OF COROLLARY 1

When $\frac{P_t}{\sigma^2} \to \infty$, we can neglect the impact of $\sigma^2$ and (2) can be derived as

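$$\bar{R}_{ub}(\mathbf{x}_u) = W_u \mathbb{E}_{h_{ub}}[\log_2 X] - W_u \mathbb{E}_{h_{ub'}}[\log_2 Y] \approx W_u \mathbb{E}_{h_{ub}}[\log_2 \hat{X}] - W_u \mathbb{E}_{h_{ub'}}[\log_2 \hat{Y}] \tag{B.1}$$

where $X = h_{ub} r_{ub}^{-\alpha} + N_t \sum_{b' \in \Phi_b, b' \neq b} h_{ub'} r_{ub'}^{-\alpha}$, $Y = N_t \sum_{b' \in \Phi_b, b' \neq b} h_{ub'} r_{ub'}^{-\alpha}$, and we approximate $X$ and $Y$ as Gamma distributed random variables $\hat{X} \sim \mathbb{G}(k_x, \theta_x)$ and $\hat{Y} \sim \mathbb{G}(k_y, \theta_y)$, respectively, as in [32]. By matching the first two moments of $X$ and $\hat{X}$, and of $Y$ and $\hat{Y}$, we can obtain

$$k_x = \frac{\left(r_{ub}^{-\alpha} + N_t \sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-\alpha}\right)^2}{r_{ub}^{-2\alpha} + N_t \sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-2\alpha}}, \quad k_y = \frac{N_t \left(\sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-\alpha}\right)^2}{\sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-2\alpha}},$$

$$\theta_x = \frac{r_{ub}^{-2\alpha} + N_t \sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-2\alpha}}{r_{ub}^{-\alpha} + N_t \sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-\alpha}}, \quad \text{and} \quad \theta_y = \frac{\sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-2\alpha}}{\sum_{b' \in \Phi_b, b' \neq b} r_{ub'}^{-\alpha}}.$$

Considering $\mathbb{E}[\ln \hat{X}] = \psi(k_x) + \ln(\theta_x)$, and hence $\mathbb{E}[\log_2 \hat{X}] = \frac{1}{\ln 2}\psi(k_x) + \log_2 \theta_x$ (likewise for $\hat{Y}$), (B.1) can be derived as

$$\bar{R}_{ub}(\mathbf{x}_u) \approx W_u \left(\log_2 \theta_x - \log_2 \theta_y + \frac{1}{\ln 2}\psi(k_x) - \frac{1}{\ln 2}\psi(k_y)\right) \tag{B.2}$$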
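The moment matching and the resulting closed-form rate can be evaluated in a few lines. The sketch below is illustrative only. It relies on $\mathbb{E}[\ln \hat{X}] = \psi(k_x) + \ln(\theta_x)$ as stated above, so that $\mathbb{E}[\log_2 \hat{X}] = \psi(k_x)/\ln 2 + \log_2 \theta_x$. The helper names `digamma`, `gamma_match`, and `approx_rate` are hypothetical, the digamma function is computed with a standard recurrence plus asymptotic series rather than a library call, and the distances in the example are made up.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:  # psi(x) = psi(x + 1) - 1/x, shift until the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def gamma_match(mean, var):
    """Shape k and scale theta of the Gamma law with the given mean and variance."""
    return mean * mean / var, var / mean


def approx_rate(r_serving, r_interf, alpha, n_t, bandwidth=1.0):
    """High-SNR rate approximation obtained by moment-matched Gamma variables."""
    s1 = sum(r ** -alpha for r in r_interf)        # sum of r_ub'^(-alpha)
    s2 = sum(r ** (-2 * alpha) for r in r_interf)  # sum of r_ub'^(-2 alpha)
    # h_ub has mean 1, variance 1; h_ub' has mean 1, variance 1/N_t
    k_x, theta_x = gamma_match(r_serving ** -alpha + n_t * s1,
                               r_serving ** (-2 * alpha) + n_t * s2)
    k_y, theta_y = gamma_match(n_t * s1, n_t * s2)
    return bandwidth * (math.log2(theta_x) - math.log2(theta_y)
                        + (digamma(k_x) - digamma(k_y)) / math.log(2))


# Hypothetical example: serving BS at distance 1, two interferers at distance 2.
rate = approx_rate(1.0, [2.0, 2.0], alpha=2.0, n_t=4)
print(rate)
```

In this example the matched parameters are $k_x = 6$, $\theta_x = 0.5$, $k_y = 8$, and $\theta_y = 0.25$, which makes the result easy to verify by hand with $\psi(8) - \psi(6) = 1/6 + 1/7$.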

Then, by substituting (B.2) into (A.2), Corollary 1 can be proved.

APPENDIX C
PROOF OF PROPOSITION 3

When each user can only associate with the nearest BS, the average download time of the $u$th user in (15) can be simplified into $\bar{t}_u = F \sum_{j=1}^{N_b} \sum_{f=1}^{N_f} a_{uj} q_{f|u} \left(c_{jf}\bar{\tau}_1 + (1 - c_{jf})\bar{\tau}_4\right)$, where $\bar{\tau}_1$ and $\bar{\tau}_4$ are respectively the average download time per unit bit from the cache at the nearest BS and from the backhaul. Then, by substituting $\bar{t}_u$ and (8) into $T = \sum_{u=1}^{N_u} s_u \bar{t}_u$, the optimization problem of minimizing the network average download time can be expressed as

$$\min_{\{c_{jf}\}} \; F \sum_{j=1}^{N_b} \left(\sum_{u=1}^{N_u} a_{uj} s_u\right) \sum_{f=1}^{N_f} p_{f|j} \left(c_{jf}\bar{\tau}_1 + (1 - c_{jf})\bar{\tau}_4\right) \tag{B.3a}$$

$$\text{s.t.} \quad \sum_{f=1}^{N_f} c_{jf} \leq N_c, \; \forall j \tag{B.3b}$$

$$0 \leq c_{jf} \leq 1, \; \forall f, j \tag{B.3c}$$

which can be solved from the following problem for each cell

$$\min_{\{c_{jf}\}} \; \sum_{f=1}^{N_f} p_{f|j} \left(c_{jf}\bar{\tau}_1 + (1 - c_{jf})\bar{\tau}_4\right) \tag{B.4a}$$

$$\text{s.t.} \quad \sum_{f=1}^{N_f} c_{jf} \leq N_c \tag{B.4b}$$

$$0 \leq c_{jf} \leq 1, \; \forall f \tag{B.4c}$$
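Because the objective of the per-cell problem is linear in $c_{jf}$ and every file consumes one unit of the cache budget $N_c$, the problem can be solved by ranking files by $p_{f|j}$. The following is a minimal sketch of that idea under the assumption $\bar{\tau}_4 > \bar{\tau}_1$. The function names `optimal_cell_cache` and `cell_download_time` and the numbers in the example are hypothetical.

```python
def optimal_cell_cache(local_popularity, n_c):
    """Return c_jf for one cell: 1.0 for the n_c files with the largest p_{f|j}."""
    n_files = len(local_popularity)
    # Rank file indices by local popularity, most popular first.
    ranked = sorted(range(n_files), key=lambda f: local_popularity[f], reverse=True)
    cached = set(ranked[:n_c])
    return [1.0 if f in cached else 0.0 for f in range(n_files)]


def cell_download_time(local_popularity, c, tau_1, tau_4):
    """Per-cell objective: sum_f p_{f|j} (c_jf tau_1 + (1 - c_jf) tau_4).

    The constant factor F and the per-cell weight are omitted, since they do
    not change which caching vector is optimal.
    """
    return sum(p * (x * tau_1 + (1.0 - x) * tau_4)
               for p, x in zip(local_popularity, c))


# Hypothetical cell with four files and room for two of them.
p_cell = [0.1, 0.4, 0.2, 0.3]
c_opt = optimal_cell_cache(p_cell, n_c=2)
t_opt = cell_download_time(p_cell, c_opt, tau_1=1.0, tau_4=5.0)
t_alt = cell_download_time(p_cell, [1.0, 1.0, 0.0, 0.0], tau_1=1.0, tau_4=5.0)
print(c_opt, t_opt, t_alt)
```

Any other feasible choice, such as caching the first two files in the example, yields a larger per-cell objective whenever it leaves out a more popular file.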

We can rewrite the objective function (B.4a) as $\sum_{f=1}^{N_f} p_{f|j} \left(c_{jf}\bar{\tau}_1 + (1 - c_{jf})\bar{\tau}_4\right) = \sum_{f=1}^{N_f} p_{f|j}\bar{\tau}_4 - \sum_{f=1}^{N_f} p_{f|j} c_{jf} (\bar{\tau}_4 - \bar{\tau}_1)$, from which we can see that minimizing (B.4a) is equivalent to maximizing $\sum_{f=1}^{N_f} p_{f|j} c_{jf} (\bar{\tau}_4 - \bar{\tau}_1)$. Since $\bar{\tau}_4 > \bar{\tau}_1$, the optimal caching policy is to cache the $N_c$ complete files with the highest value of $p_{f|j}$ for each cell, say cell $j$. Then, Proposition 3 is proved.

REFERENCES

[1] N. Golrezaei, A. F. Molisch, A. G. Dimakis, and G. Caire, “Femtocaching and device-to-device collaboration: A new architecture for wireless video distribution,” IEEE Commun. Mag., vol. 51, no. 4, pp. 142–149, 2013.
[2] E. Bastug, M. Bennis, and M. Debbah, “Living on the edge: The role of proactive caching in 5G wireless networks,” IEEE Commun. Mag., vol. 52, no. 8, pp. 82–89, Aug. 2014.
[3] D. Liu and C. Yang, “Energy efficiency of downlink networks with caching at base stations,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 907–922, Apr. 2016.
[4] D. Liu, B. Chen, C. Yang, and A. F. Molisch, “Caching at the wireless edge: Design aspects, challenges, and future directions,” IEEE Commun. Mag., vol. 54, no. 9, pp. 22–28, Sept. 2016.
[5] D. Liu and C. Yang, “Caching policy toward maximal success probability and area spectral efficiency of cache-enabled HetNets,” IEEE Trans. Commun., vol. 65, no. 6, pp. 2699–2714, Jun. 2017.
[6] H. Pinto, J. M. Almeida, and M. A. Gonçalves, “Using early view patterns to predict the popularity of YouTube videos,” in Proc. ACM WSDM, 2013.
[7] G. Gürsun, M. Crovella, and I. Matta, “Describing and forecasting video access patterns,” in Proc. IEEE INFOCOM, 2011.
[8] F. Figueiredo, “On the prediction of popularity of trends and hits for user generated videos,” in Proc. ACM WSDM, 2013.
[9] K. Shanmugam, N. Golrezaei, A. G. Dimakis, A. F. Molisch, and G. Caire, “Femtocaching: Wireless content delivery through distributed caching helpers,” IEEE Trans. Inf. Theory, vol. 59, no. 12, pp. 8402–8413, Dec. 2013.
[10] B. Blaszczyszyn and A. Giovanidis, “Optimal geographic caching in cellular networks,” in Proc. IEEE ICC, 2015.
[11] J. Wen, K. Huang, S. Yang, and V. O. K. Li, “Cache-enabled heterogeneous cellular networks: Optimal tier-level content placement,” IEEE Trans. Wireless Commun., vol. 16, no. 9, pp. 5939–5952, Sept. 2017.
[12] Y. Cui, D. Jiang, and Y. Wu, “Analysis and optimization of caching and multicasting in large-scale cache-enabled wireless networks,” IEEE Trans. Wireless Commun., vol. 15, no. 7, pp. 5101–5112, July 2016.
[13] Z. Chen, J. Lee, T. Q. S. Quek, and M. Kountouris, “Cooperative caching and transmission design in cluster-centric small cell networks,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 3401–3415, May 2017.
[14] F. Gabry, V. Bioglio, and I. Land, “On energy-efficient edge caching in heterogeneous networks,” IEEE J. Sel. Areas Commun., vol. 34, no. 12, pp. 3288–3298, Dec. 2016.
[15] J. Song, H. Song, and W. Choi, “Optimal content placement for wireless femto-caching network,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4433–4444, July 2017.
[16] X. Xu and M. Tao, “Modeling, analysis, and optimization of coded caching in small-cell networks,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3415–3428, Aug. 2017.
[17] X. Li, X. Wang, K. Li, Z. Han, and V. C. Leung, “Collaborative multi-tier caching in heterogeneous networks: Modeling, analysis, and design,” IEEE Trans. Wireless Commun., early access, 2017.
[18] M. Zink, K. Suh, Y. Gu, and J. Kurose, “Characteristics of YouTube network traffic at a campus network–measurements, models, and implications,” Computer Networks, vol. 53, no. 4, pp. 501–514, 2009.
[19] H. Ahlehagh and S. Dey, “Video-aware scheduling and caching in the radio access network,” IEEE/ACM Trans. Netw., vol. 22, no. 5, pp. 1444–1462, Oct. 2014.
[20] U. Paul, A. P. Subramanian, M. M. Buddhikot, and S. R. Das, “Understanding traffic dynamics in cellular data networks,” in Proc. IEEE INFOCOM, 2011.
[21] W.-J. Hsu, T. Spyropoulos, K. Psounis, and A. Helmy, “Modeling time-variant user mobility in wireless mobile networks,” in Proc. IEEE INFOCOM, 2007.
[22] Y. Guo, L. Duan, and R. Zhang, “Cooperative local caching under heterogeneous file preferences,” IEEE Trans. Commun., vol. 65, no. 1, pp. 444–457, Jan. 2017.
[23] J. Liu, B. Bai, J. Zhang, and K. B. Letaief, “Cache placement in Fog-RANs: From centralized to distributed algorithms,” IEEE Trans. Wireless Commun., early access, 2017.
[24] B. Chen and C. Yang, “Caching policy optimization for D2D communications by learning user preference,” in Proc. IEEE VTC Spring, 2017.
[25] M. D. Ekstrand, J. T. Riedl, J. A. Konstan et al., “Collaborative filtering recommender systems,” Foundations and Trends® in Human–Computer Interaction, vol. 4, no. 2, pp. 81–173, 2011.
[26] A. Paterek, “Improving regularized singular value decomposition for collaborative filtering,” in Proc. KDD, 2007.
[27] T. Hofmann, “Latent semantic models for collaborative filtering,” ACM Trans. Inf. Syst., vol. 22, no. 1, pp. 89–115, 2004.
[28] E. Zeydan, E. Bastug, M. Bennis, M. A. Kader, I. A. Karatepe, A. S. Er, and M. Debbah, “Big data caching for networking: Moving from cloud to edge,” IEEE Commun. Mag., vol. 54, no. 9, pp. 36–42, Sept. 2016.
[29] T. Bertin-Mahieux, D. P. Ellis, B. Whitman, and P. Lamere, “The million song dataset,” in Proc. ISMIR, 2011. [Online]. Available: https://labrosa.ee.columbia.edu/millionsong/tasteprofile
[30] O. Celma, Music Recommendation and Discovery in the Long Tail. Springer, 2010. [Online]. Available: http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-1K.html
[31] Y. Hu, Y. Koren, and C. Volinsky, “Collaborative filtering for implicit feedback datasets,” in Proc. IEEE ICDM, 2008.
[32] D. Jaramillo-Ramírez, M. Kountouris, and E. Hardouin, “Coordinated multi-point transmission with imperfect CSI and other-cell interference,” IEEE Trans. Wireless Commun., vol. 14, no. 4, pp. 1882–1896, Apr. 2015.
[33] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[34] A. Tatar, M. D. de Amorim, S. Fdida, and P. Antoniadis, “A survey on predicting the popularity of web content,” Journal of Internet Services and Applications, vol. 5, no. 1, p. 8, 2014.
