IEEE SENSORS JOURNAL, VOL. 13, NO. 8, AUGUST 2013
Log-Logistic Modeling of Sensory Flow Delays in Networked Telerobots
Ana Gago-Benítez, Juan-Antonio Fernández-Madrigal and Ana Cruz-Martín
Abstract— This paper deals with the modeling of the delays in the transmission of sensory data coming from a networked telerobot, which would allow us to predict future times of arrival and provide guarantees on the time requirements of these systems. Considering these delay sequences as an uni-dimensional temporal signal, they easily exhibit rich stochastic behavior— abrupt changes of regime and bursts—due to the heterogeneity of the hardware and software components in the data path. There exist approaches for modeling this kind of signals without explicit knowledge of the system components: state-space reconstruction, hidden Markov models, neural networks, etc., but they are mostly focused on the stochasticity of the network only, without taking into account other elements in the sensory flow that also have an important influence in the delays. Previously, we have proposed simpler statistical methods that do not require any component knowledge either and are suitable for more lightweight implementations (e.g., in mobile phone interfaces). In this sense, we report elsewhere a log-normal three-parametrical model that fits reasonably well these delays as long as change detection is completely solved. Now we propose a more flexible solution: the log-logistic distribution, which has been found to fit delays better than the log-normal. In addition, we present two algorithms to model an entire delay signal, including abrupt nonlinearities, based on the log-logistic assumption. Our results show quite good fittings of real datasets gathered from a number of combinations of sensors, networks, and application software, provided that some mild assumptions hold. Index Terms— Sensor systems, statistical analysis, telerobotics.
I. INTRODUCTION
STOCHASTIC time delays appear in several problems, such as multimedia [1], networking [2], distributed control [3], robotics [4], etc. These delays usually come from multicomponent systems whose dynamics are unknown; the network is not the only issue, since general-purpose operating systems, application software and interfaces also contribute. For example, networked telerobots [5] transmit information from sensors to remote controllers, exhibiting a stochastic behavior that includes non-linearities: regime changes (i.e., abrupt changes in the parameters of the underlying stochastic process), bursts (short regimes), non-stationarities—smooth changes
Manuscript received April 5, 2013; revised May 3, 2013; accepted May 9, 2013. Date of publication May 16, 2013; date of current version July 2, 2013. This work was supported in part by the Junta de Andalucía and the European Union through the research Project P08-TIC-04282. The associate editor coordinating the review of this paper and approving it for publication was Prof. Weileun Fang. The authors are with the System Engineering and Automation Department, University of Málaga, Málaga 29071, Spain (e-mail:
[email protected];
[email protected];
[email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSEN.2013.2263381
of parameters—etc. (see the examples in Fig. 2). In spite of this, in robotics it is often mandatory to maintain time delays below given thresholds to perform correctly, therefore we should predict and control them. The modeling of stochastic time delays without explicit information about the behavior of the components of the system may be based on several techniques [6]: state-space reconstruction, that recognizes the geometrical structure underlying the observed behavior in the data sequence; machine learning, represented by Bayesian or neural networks that can adaptively explore a large space of potential models; hidden Markov models, that predict the evolution of the entire probability density, etc. Often, all of these have a high theoretical and computational complexity. In a previous work [7], we studied the problem from a simpler statistical signal processing approach, proposing several simple methods based on parametrical distributions and concluding that one of the most promising models was the three-parametrical lognormal distribution. In general, we obtained suitable results with several long-tailed models (erlang, gamma, etc.), as long as the different regimes in the delay signal were appropriately detected [8]. In this paper we extend this approach of modeling delays under a different hypothesis, which has been found to fit the signal even better than the log-normal: the three-parametrical log-logistic distribution. 
We propose to estimate the three parameters of that distribution through maximum likelihood estimation (MLE) [15]. We then study the limitations of this estimation with small samples; present an adaptation of the Anderson-Darling (AD) bi-parametrical hypothesis test [14] for assessing the goodness of fit of our three-parametrical model to the samples; define an intuitive measure of the compatibility that a real scenario may have with the log-logistic hypothesis; and, finally, describe two simple algorithms based on all of these for on-line modeling of scenarios that exhibit multiple regimes and bursts. Our tests indicate that most real scenarios are compatible with the log-logistic assumption, and that the results obtained are better than those of our previous log-normal proposal. Therefore, the work presented in this paper is a relevant improvement in the on-line modeling of delays in robotic teleoperation applications, concerning both accuracy and simplicity of implementation. The rest of the paper is structured as follows: section II describes the general components that can be found in a typical networked telerobotic system (or, more generally, in any remote sensor application) and the particular software
1530-437X/$31.00 © 2013 IEEE
GAGO-BENÍTEZ et al.: LOG-LOGISTIC MODELING OF SENSORY FLOW DELAYS IN NETWORKED TELEROBOTS
Fig. 1. Left plot shows a general scheme of a sensory loop in a remote sensor system controlled through the Internet (hw and sw); the right top picture corresponds to the Surveyor robot [10], the right bottom one to the SANCHO robot [9], the two platforms used in our experiments.
and hardware that we have combined for setting up our experiments; section III proposes an intuitive measure of the compatibility of real sequences of delays with the AD test for the log-logistic, introduces the three-parametrical log-logistic fitting method and studies its limitations with small samples; section IV introduces a first intuitive modeling algorithm and then an improved version that takes into account past regimes of the signal, and presents the results obtained with both. Finally, some conclusions and future work are outlined.
II. OVERVIEW OF A NETWORKED TELEROBOT SYSTEM
We can consider diverse components that reflect the heterogeneity of hardware and software in a networked telerobot. All of them add delays to the total measured time of the information flow, and all behave stochastically in that respect. In particular, we use a web interface for requesting and displaying remote data on the client side, which adds stochastic effects to the system because it is intrinsically non-real-time software. For our purposes, the most important time delay of a networked telerobot is the one in the loop for the transmission of sensory information, since robot actuators require much less data to process. An ideal regulation of the amount of sensory data would keep that delay under control, providing the operator with an optimal amount of information in a timely fashion (see Fig. 1 left). To perform such regulation it is important to predict future delays, which, in turn, would benefit from an estimation of the underlying stochastic model that produces them. In this paper we will assume that any sensor is requested only after the response to its last request has arrived at the client, as shown in the figure. We are not interested here in the information content of the sensory data. For our experiments, we have particularized the generic scheme of Fig. 1 with the following components, setting up different scenarios (listed in Table I).
A. Remote Sensors
In our experiments with the mobile SANCHO robot [9] we use i) a USB color camera sensor (webcam) with 48-bit color depth and 640 × 480 maximum resolution (it also captures B&W pictures) and ii) a SICK LMS-200 laser scanner providing 360, 180 and 90 range data, connected through RS-422. In the Surveyor [10] microrobot we use a CMOS image sensor with 1.3 megapixels and request 320 × 240 images (5.86% of the full color resolution, as listed in Table I). The Surveyor robot sends compressed images (JPEG), which randomly alters the actual density (see footnote 1) of the transmitted data. In the experiments we have considered only a single fixed density to carry out our analysis consistently.
Footnote 1: Volume of data that the sensor transmits, which we measure in bytes.
B. Servers for the Sensory System
The camera sensor of SANCHO is plugged into a laptop (Intel Pentium IV @ 3.2 GHz with 1 GB RAM). The laser sensor is connected to a second on-board laptop (Intel Pentium M @ 2 GHz with 1 GB RAM). Both laptops are connected to the local network through Ethernet, with only twisted-pair segments at 1 Gbps (the robot is stopped), and run MS Windows XP and also a robotic software mini-architecture built on the CORBA middleware [11]. On the other hand, the Surveyor robot uses proprietary firmware that allows a remote client to issue specific commands for getting images and performing other actions.
C. Web Servers
For SANCHO, an Apache web server is located in the same sub-network as the robot. It runs GNU/Linux (Ubuntu), and serves a PHP [12] page that connects through TCP/IP sockets to the robot middleware. The Surveyor, in contrast, includes
TABLE I
SUMMARY OF REAL SCENARIOS WHERE THE EXPERIMENTS HAVE BEEN CARRIED OUT

Scenario  Robot     Sensor  Resolution        Location       Client Computer  Server Computer  Network
#1        SANCHO    Laser   181 data points   Same building  A                a                1
#2        SANCHO    Webcam  20% full B&W      Same city      B                a                2
#3        SANCHO    Webcam  40% full B&W      Same city      B                a                2
#4        SANCHO    Webcam  30% full B&W      Same city      B                b                3
#5        SANCHO    Webcam  100% full color   Same city      B                b                3
#6        SANCHO    Webcam  10% full B&W      Same building  A                b                4
#7        SANCHO    Webcam  50% full B&W      Same building  A                b                4
#8        SANCHO    Webcam  100% full B&W     Same building  A                b                4
#9        SURVEYOR  Webcam  5.86% full color  Same building  C                c                5
More details on the elements of this table can be found in the main text (section II). The legend is as follows.
Client computer: A: Pentium M @ 1.8 GHz, 1 GB RAM, Linux, Firefox. B: Intel Core Duo T7200 @ 2 GHz, 2 GB RAM, Linux, Firefox. C: Android smartphone Galaxy SII.
Server computer: a: Intel Pentium M @ 2 GHz, 1 GB RAM, Linux. b: Intel Pentium IV @ 3.2 GHz, 1 GB RAM, Windows. c: Surveyor, Android.
Network: 1: WiFi 802.11b/g, local University provider in the same building as the robot. 2: WiFi 802.11b/g, ISP located in Madrid (about 600 km from our lab). 3: 56 Kbps phone-modem line (RTC), ISP located in Madrid (about 600 km from our lab). 4: Twisted-pair cable at 1 Gbps. 5: WiFi ad-hoc.
an onboard ad-hoc web server that processes remote requests directly.
D. Web Clients
The client web browser for SANCHO has been Firefox on Linux, where a piece of AJAX code (Asynchronous JavaScript And XML [12]) continuously requests data from the sensors and measures the time delays. Two different laptops have been tested for running this web client: an Intel Core Duo T7200 @ 2 GHz with 2 GB RAM (running Mandriva Linux) and a Pentium M @ 1.8 GHz with 1 GB RAM (running Ubuntu Linux). Both can use WiFi 802.11b/g connections to access the network: the former through an ISP located in Madrid (about 600 km from our lab); the latter through a local University provider in the same building as the robot. In addition, we have included in some experiments a 56 Kbps narrowband segment through a phone modem. For the Surveyor robot, a specific Android client application connects to the robot's onboard web server, using the robot control protocol and a WiFi ad-hoc (single-hop) connection, and retrieves camera images.
It is important to remark the diversity of configurations that can be explored with combinations of all these elements. Concerning the network, some of our setups include a number of hops, since they have been carried out from geographically distant places, while others have been performed with just one network segment in the system. Regarding the hardware, that of each robot is obviously fixed, but the two robots are very different from each other, particularly in computing power: one relies on a microcontroller and the other on standard laptops. We have used common robotic sensors, such as a laser scanner and cameras. Cameras are of great importance to justify the applicability of our approach to a wide diversity of scenarios, since they allow us to vary the amount of data transmitted with very fine granularity (the main influence of the sensor in our algorithms comes from the amount of data that it can generate). Finally, the software used in all the experiments is for standard desktop machines, except for the Surveyor firmware. We have placed no restriction on running any reasonable number of applications unrelated to our experiments while they were executed, including robotics modules in charge of other tasks in the robot, office applications, Internet navigation in the case of the client controller, and auxiliary software in the case of the web server. As detailed in the previous paragraphs, we have also employed different web browsers and operating systems.
Since the approach presented in this paper is aimed at not using any knowledge about the plant that produces the delays, its applicability cannot be demonstrated analytically for every existing combination of network, computer hardware and software; only statistical results can be assessed. From that perspective, the components enumerated above are representative of practical applications that do not have specific constraints (i.e., when non-real-time, general-purpose systems and transmissions are used); the experiments with the Surveyor robot widen this by including a very computationally limited platform and presumably a more predictable system. Apart from overall statistical parameters such as the average transmission time of sensory data, which are strongly related to the configuration of each particular experiment, the gathered delay signal may vary over time due to different causes that are not so strongly related to that configuration: transmissions from other sensors, network congestion, non-real-time operating systems in the loop, unpredictable behavior of the client and the robot software applications, etc. Nevertheless, it is worth noting that the delay produced when requesting a sensor does not depend on previous delays of the same sensor, provided that it is not requested again until the previous request ends.
Therefore, our first working assumption will be that the dependences appearing in the sequence of delays will be strongly determined
Fig. 2. Real time delays gathered from the sensory loop of the scenarios of Table 1. They present bursts and regime changes that have been visually detected (gray and black vertical lines, respectively). We also include boxplots of the measurements. The bottom x-axes show indexes of samples, while the top x-axes show the absolute time when each sample was gathered. Scenario #9 has been expanded in the figure due to its length. Scenarios with the longest delays have been explored up to a sample size of around 500 delays, due to the longer total time of these experiments, while quickest scenarios have been explored to a greater size of 1000 delays (except for the Surveyor robot, that has extended up to 1600 delays).
only by changes in the underlying stochastic parameters of the system. Fig. 2 shows the total delay that we have measured in the sensory loop for each of the real scenarios listed in Table 1, versus both the number of delay values gathered and the time
when each delay was measured; it is easy to observe abrupt changes of regimes, outliers and bursts. Fig. 2-right shows the boxplots for each scenario; they confirm the wide range of time delays measured and the pronounced skewness of the marginal distributions. We have marked with thick vertical lines those
Fig. 3. Autocorrelograms of original scenarios (thin line) and of the visually purged scenarios divided in regimes (marked lines). Observe that when some lags go above/below the confidence limits, they do not exceed the 5% of the total count of lags.
abrupt changes that can be visually detected (hence separating near-stationary parts of the signal). Our independence assumption allows us to apply the basic tools of the next sections. To confirm it, we have analyzed
autocorrelation coefficients of the signal: the delays would actually be dependent if the autocorrelation function (ACF) went beyond the corresponding confidence limits; otherwise, our independence hypothesis cannot be rejected
[13]. Fig. 3 shows the ACF of the scenarios, which makes clear the strong dependence in most of them when the signal is not processed, and also the ACF of the scenarios when they are visually purged of bursts and their regimes are separated. We can see how the ACFs stay very close to zero in the latter case, and remain within the confidence limits in most cases. This supports our assumption of considering iid (independent and identically distributed) sequences of delays when they are properly separated into regimes and when bursts are also ruled out. This assumption includes that of stationarity, i.e., no smooth variations of the underlying distribution will be considered. Finally, systematic effects in the delays might also appear, leading to multimodal distributions. In this paper we assume that they are negligible in most real situations, as we have observed in our experiments. In summary, the iid and totally stochastic assumptions allow us to provide a simpler and more lightweight solution to the problem. Our results indicate that they are very reasonable in most situations.
III. LOG-LOGISTIC DISTRIBUTION OF THE TIME DELAYS
In this section we justify the log-logistic distribution as a model of the delays and present an intuitive measure that shows the compatibility of this model with real scenarios. In subsection III-B we formalize its practical use.
A. Justification
For studying the compatibility of entire real scenarios with a three-parametrical log-logistic model, we introduce an intuitive measure, called sliding-window, based on the Anderson-Darling (AD) test that is formalized in section III-B. This compatibility measure is calculated as follows. We define a window of fixed length, say w, that slides over the whole scenario, shifting forward one value at a time. 
After each movement, and according to the formalization of these procedures in section III-B, we estimate a three-parameter log-logistic distribution that fits the sample in the window, then apply the AD test to that model, and then mark all the delays in the window as non-rejected if the test passes. At the end, the percentage of non-rejected values of the entire scenario is the result of the sliding-window measure: a greater percentage indicates a potentially greater compatibility of that scenario with the three-parametrical log-logistic model for that given value of w, or, in other words, that the scenario seems to contain a relevant number of regimes of that size that would not reject the log-logistic assumption. The complete curve of the measure for all values of w is consequently an indication of our chances of modeling such a scenario with the log-logistic. Notice that this compatibility measure can also be applied to other probability models. Also notice that it does not provide a modeling algorithm, just an indicative measure of suitability. Fig. 4-top shows the calculation of the sliding-window measure for the real scenarios of Fig. 2 and also for a simulated log-logistic scenario that serves as a reference (it achieves a value of nearly 100% in the measure for all values of w). It can be observed how scenarios #2 and #3 look largely compatible
Fig. 4. (Top) Comparison of the sliding-window measure for the scenarios of Fig. 2 and a theoretical log-logistic reference (marked green). (Bottom) The same comparison when using a log-normal assumption instead of the log-logistic. The thick black lines in both figures are the average of all the colored lines; note how this average line is generally higher in the log-logistic case, illustrating its greater capability for modeling scenarios given a fixed window size parameter.
with the log-logistic assumption for any window size up to approximately 240, i.e., they obtain a measure value greater than 50% for all w < 240; scenarios #1, #4, #5 and #7 achieve a reasonable compatibility for w < 100, which is also a reasonable regime size; finally, scenarios #6, #8 and #9 are not very compatible, showing non-rejected percentages that quickly decrease below 50% for all practical windows, which indicates that in this kind of complex system it is not always guaranteed that a suitable parametrical model can be found. Notice that all the graphs except the reference eventually fall to zero. This is because of their multi-regime and bursty features, which prevent us from modeling them entirely with only one distribution. On average, 67% of our scenarios have a non-negligible compatibility (measure > 50%) with the log-logistic assumption using a reasonable value of w < 80; thus, it seems appropriate to employ our test and fitting methods to model real sequences of delays in many cases. Fig. 4-bottom comes from [8], and shows the same sliding-window compatibility measure when a three-parametrical log-normal assumption is used instead of the log-logistic. We have included that figure in order to illustrate the better suitability of the three-parametrical log-logistic with respect to the log-normal. On average, the latter achieves good compatibility only for w < 60, instead of the w < 80 of the log-logistic.
B. Formalization
We need to formalize two core procedures in order to use the log-logistic model in practice: how to find a suitable
log-logistic distribution for a sample (i.e., fitting) and how to implement an AD hypothesis test of its goodness of fit with the sample. In the following we deal with both issues.
1) Estimating the Three-Parametrical Log-Logistic Distribution of a Sample: A uni-dimensional three-parametrical log-logistic distribution can be defined by the tuple (a, b, c). Its probability density and cumulative distribution functions are, respectively [15]:
\[
f_X(x; a, b, c) = \frac{[(x-a)/b]^{-1/c}}{c\,(x-a)\,\bigl(1 + [(x-a)/b]^{-1/c}\bigr)^{2}}, \qquad
F_X(x; a, b, c) = \frac{1}{1 + [(x-a)/b]^{-1/c}} \tag{1}
\]
where a, b, c ∈ R, x > a ≥ 0, and b, c > 0 (the more classical bi-parametrical log-logistic distribution has a = 0). Parameter a is called the offset or location, since it establishes a minimum bound for x, while parameters b (scale) and c (shape) set the overall aspect of the function. This distribution has some special traits that make it more difficult to deal with than others (e.g., the log-normal), namely that its first moments do not always exist. We will always force c < 1 for its expectation to be defined, which agrees with the experimental shapes observed in our scenarios. For fitting this distribution to a given sample (a sequence of time delays, in our case) we use the MLE approach described in [15], which involves the numerical solution of a set of three equations in three variables, and an additional numerical solution of an equation that provides the initial guess of parameter c. We have used a trust-region numerical optimizer [16] for all of these, although when computational efficiency is an issue, a simpler Levenberg-Marquardt algorithm can be used instead at the expense of losing some precision in the results. It is interesting at this point to compare this computational complexity with that of the log-normal. The log-logistic MLE is less efficient: in the log-normal case the fitting can be rendered as a simple Gaussian fitting, which has a closed analytical expression. Nevertheless, the numerical optimizations of the log-logistic are always O(n), where n is the size of the sample, since we bound the number of steps they take (the algorithm does not need many steps to reach the optimum; on standard PCs computing the fitting takes only one or two milliseconds, which is fast enough for the kind of remote sensory systems described in section II). In summary, in the worst case, both log-normal and log-logistic are of the same complexity O(n), although the latter finds better models at the expense of an increased average cost. 
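As a concrete illustration of (1), the following is a minimal Python sketch, using only the standard library, of the density and cumulative distribution functions, together with the inverse of the cumulative function and the closed-form expectation for c < 1. All function names are illustrative. The inverse and the expectation are standard properties of the log-logistic distribution rather than results stated in this paper, and the sketch does not implement the MLE fitting of [15].

```python
import math
import random


def loglogistic_cdf(x, a, b, c):
    """CDF of the three-parameter log-logistic (offset a, scale b, shape c)."""
    if x <= a:
        return 0.0
    u = ((x - a) / b) ** (-1.0 / c)
    return 1.0 / (1.0 + u)


def loglogistic_pdf(x, a, b, c):
    """Density of the same distribution; zero at or below the offset."""
    if x <= a:
        return 0.0
    u = ((x - a) / b) ** (-1.0 / c)
    return u / (c * (x - a) * (1.0 + u) ** 2)


def loglogistic_quantile(p, a, b, c):
    """Inverse CDF for 0 < p < 1, obtained by solving F(x) = p for x."""
    return a + b * (p / (1.0 - p)) ** c


def loglogistic_mean(a, b, c):
    """Expectation of the distribution, which exists only for 0 < c < 1."""
    if not 0.0 < c < 1.0:
        raise ValueError("the expectation is undefined unless 0 < c < 1")
    return a + b * math.pi * c / math.sin(math.pi * c)


def sample_delays(n, a, b, c, seed=0):
    """Draw n synthetic delays by inverse-transform sampling."""
    rng = random.Random(seed)
    delays = []
    while len(delays) < n:
        p = rng.random()
        if p > 0.0:  # skip p == 0, which would map onto the offset itself
            delays.append(loglogistic_quantile(p, a, b, c))
    return delays
```

Synthetic sequences produced with `sample_delays` can serve as a reference signal of the kind used for comparison in Fig. 4, under the assumption that the chosen (a, b, c) resemble the scenario of interest.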
In the log-logistic MLE fitting, the most sensitive parameter is the offset a, since a slight variation may produce great changes in the estimation of the values of b and c. We have observed that the size of the sample is critical in obtaining a good estimate of that parameter. In order to analyze that thoroughly, we have calculated the mean error between the true offset and the estimated one with different sample sizes, with the samples produced in simulation by a wide variety of theoretical long-tailed distributions: log-normal, log-logistic, exponential, left-truncated Gaussian, generalized Pareto and Weibull. Fig. 5 shows that error (cyan curves) for 1000 offset estimations per distribution and sample size. The red curve in
Fig. 5. Study of the error in the estimation of the parameter a (offset) of the three-parametrical log-logistic for a number of samples generated from diverse long-tailed distributions. In red, the mode of all the curves. From a sample size of 20 on, there is no significant error decrease.
the figure corresponds to the mode of the error (we use the mode instead of the mean because of the asymmetric shape of the plot). Based on the figure, we can deduce that from a sample size of 20 on we achieve the minimum error in the estimation of the offset; therefore we can establish 20 as the minimum number of delay values required for estimating the log-logistic model correctly.
2) Assessing the Goodness of Fit: A hypothesis test is needed to assess whether a sample provides enough evidence against the log-logistic distribution once it has been fitted. We have chosen the Anderson-Darling (AD) test, a powerful one originally defined for bi-parametrical logistic distributions, which have the following density and cumulative functions:
\[
f_X(x; \mu, \sigma) = \frac{\exp(-(x-\mu)/\sigma)}{\sigma\,\bigl(1 + \exp(-(x-\mu)/\sigma)\bigr)^{2}}, \qquad
F_X(x; \mu, \sigma) = \frac{1}{1 + \exp(-(x-\mu)/\sigma)} \tag{2}
\]
Note the similarity between this cumulative distribution and the one of the log-logistic in (1). In order to apply this AD test to our three-parametrical log-logistic sample {x_i} we need to transform the sample and also the (a, b, c) parameters obtained by the MLE fitting. Firstly, it is easy to see that the shifted sample {x_i − a} must be drawn from a bi-parametrical log-logistic. Also, it is well known in basic probability theory that the logarithm of values drawn from a bi-parametrical log-logistic can be considered drawn from a bi-parametrical logistic. Therefore the bi-parametrical logistic equivalent of our three-parametrical log-logistic sample is simply {y_i} = log({x_i} − a). We now need to calculate the two bi-parametrical logistic parameters μ and σ of the fitting that would correspond to the MLE fitting (a, b, c) of the original sample. Both μ and σ can be deduced from the fact that if the distribution function of variable X is a three-parametrical log-logistic (a, b, c), the distribution function of variable Z = log(X − a) must be a bi-parametrical logistic that equals the distribution of X for
Algorithm 1. Pseudocode of the Stateless Modeling Algorithm.
Algorithm 2. Pseudocode of the State-Based Modeling Algorithm.
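every x ∈ X. Consequently,
\[
\begin{aligned}
\frac{1}{1 + \left(\frac{x-a}{b}\right)^{-1/c}} &= \frac{1}{1 + e^{-\frac{z-\mu}{\sigma}}} \\
\left(\frac{x-a}{b}\right)^{-1/c} &= e^{-\frac{\log(x-a)-\mu}{\sigma}} \\
c^{-1}\bigl(\log(x-a) - \log(b)\bigr) &= \sigma^{-1}\bigl(\log(x-a) - \mu\bigr) \\
\mu &= \log(b), \quad \sigma = c.
\end{aligned} \tag{3}
\]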
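The transformation in (3) can be sketched in a few lines of Python. The block below maps a fitted (a, b, c) and its sample onto the logistic equivalent and then evaluates the textbook Anderson-Darling statistic A² against the logistic distribution in (2). The function names are illustrative, and the statistic is the generic A² formula for a fully specified distribution; the critical values and any small-sample corrections of the adapted test in [14] are not reproduced here, so the sketch yields a statistic, not an accept/reject decision.

```python
import math


def to_logistic(sample, a, b, c):
    """Map a three-parameter log-logistic fit onto its logistic equivalent.

    Returns the transformed sample y_i = log(x_i - a) together with
    mu = log(b) and sigma = c, as derived in (3).
    """
    if any(x <= a for x in sample):
        raise ValueError("every delay must exceed the offset a")
    y = [math.log(x - a) for x in sample]
    return y, math.log(b), c


def logistic_cdf(y, mu, sigma):
    """CDF of the two-parameter logistic in (2), written to avoid overflow."""
    z = (y - mu) / sigma
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def anderson_darling_statistic(y, mu, sigma):
    """Generic A^2 statistic of the sample y against logistic(mu, sigma)."""
    n = len(y)
    eps = 1e-12  # clip the CDF away from 0 and 1 so the logarithms stay finite
    f = [min(max(logistic_cdf(v, mu, sigma), eps), 1.0 - eps) for v in sorted(y)]
    total = 0.0
    for i in range(n):
        # Zero-based i: weight 2i+1 pairs the (i+1)-th smallest value
        # with the (i+1)-th largest one.
        total += (2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
    return -n - total / n
```

A small A² indicates that the transformed sample is close to the fitted logistic; turning it into a test requires comparing it with the critical value for the chosen significance level.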
IV. MODELING ENTIRE SEQUENCES OF DELAYS
In this section we describe both a simple intuitive algorithm that detects abrupt changes and produces a different log-logistic model after each one, and an improved version that uses the delay history for constructing better models. Both use the fitting and the AD test procedures described in section III, and are intended as on-line solutions to the modeling problem for practical situations.
A. Stateless Algorithm
Our first algorithm simply uses the AD test as an advisor: it detects whether the current log-logistic model stops being the best one for explaining new incoming delay values (i.e., whether a new model must be set up). In other words, the AD test is used for detecting abrupt changes, separating regimes with different parameters of the distribution. The algorithm works as follows. Firstly it collects a sequence of delays of a minimum size. That size is given by a parameter called s, which must be at least 20. When such a sequence is collected, it estimates the three log-logistic parameters by MLE and uses the AD test for the first time: a rejection slides that sequence one
Fig. 6. Results of our stateless algorithm when modeling scenarios of Fig. 2 with different values of its parameter s.
delay value on, forgetting its oldest value and appending the new one, while a non-rejection sets the tested sample as a new regime of the signal that will be enlarged with the next delay values. This process is repeated until a delay value is added that makes the test reject the regime formed so far; when that occurs, the previously non-rejected regime is recorded as a definitive regime of the signal. For the specifics of the algorithm, see the pseudocode in Algorithm 1. Fig. 6 summarizes the behavior of this modeling algorithm on all the real scenarios of Fig. 2. Please do not confuse the horizontal axis of this figure with the one of Fig. 4: in Fig. 6
Fig. 7. Regimes detected by our state-based (left-top) and stateless (left-bottom) algorithms and log-logistic models found by the former in real scenario #5 (right).
Fig. 8. Regimes detected by our state-based (left-top) and stateless (left-bottom) algorithms and log-logistic models found by the former in real scenario #7 (right).
the abscissa contains the values of the parameter s, not the window size parameter of Fig. 4 (the stateless algorithm actually detects regimes longer than s). The resulting percentages (vertical axis) are nevertheless calculated in the same way in both figures. We can observe in Fig. 6 that the same scenarios that were not compatible with our assumptions (#6, #8 and #9) exhibit here a high proportion of rejections (over 60%). The rest have a very high proportion of non-rejected delays (> 60%) when using s < 60, and therefore have a suitable representation as a sequence of log-logistic models, i.e., as a sequence of stationary regimes.
B. State-Based Algorithm
A simple extension of the algorithm described in section IV-A has also been devised to improve the modeling through the re-use of previously detected regimes (which we call here states of the system). Basically, this second algorithm checks whether the latest delays not yet included in a new regime could be appended to the end of previously defined regimes; in other words, we attempt to detect when the system revisits states that had previously finished. The algorithm works in the same way as the stateless version, except that as long as the sequence of newest delays does not pass the AD test, we check whether their inclusion at the end of a previous regime does pass it (more concretely, we
only use the α latest delay values, where α is a parameter of the algorithm). If these delays are found to extend several previous regimes, the one with the best p-value in the AD test is selected as the revisited state, since the p-value of the test indicates quantitatively the goodness of fit; then we try to enlarge that previous regime with more new delay values, and continue in that fashion until the test does not pass. In that case we record the extended regime as a new version of the old regime. The pseudocode of this version is shown in Algorithm 2.

C. Experimental Results

Owing to space limitations, we have selected two real scenarios (#5 and #7) and illustrated the detailed results of both algorithms using s = 20 and α = 10 (see Figs. 7 and 8, left-top and left-bottom, respectively). The thin grey vertical lines mark the end of regimes (Figs. 7 and 8 left). Please note that in the state-based algorithm we have also uniquely identified the different states of the system. The results clearly show the advantages of such an approach with respect to the stateless algorithm: the number of actually different regimes is significantly reduced (7 vs. 16 in scenario #5 and 6 vs. 12 in scenario #7). As explained in previous sections, the complexity of both algorithms is O(n), so the state-based algorithm gives us more information, in addition to detecting larger regimes, at the same computational complexity. Note that n is the number of delays gathered
so far, so both algorithms need some kind of reset procedure in order to run indefinitely. Nevertheless, practical situations finish after a relatively small number of delays (recall that gathering 1000 delay values can take hours in some experiments), so we do not consider this a relevant limitation.

The log-logistic models found by our state-based algorithm are presented in Figs. 7 and 8 right, highlighting with thicker lines those models that explain more delay values in each scenario.

Since the time delay used in each iteration of the sensory loop can be measured at any point in that loop, both algorithms can be executed in the client interface (e.g., in a web-based client that displays sensory data), in the server to which the sensors are connected (typically a PC), or even in the sensors themselves, if they support real number computations. They can store internally the last regime being considered, so there is no need to transmit delay sequences along the sensory loop. The implementation of the algorithms on platforms with no real number support, or in programming languages not specifically designed for numerical optimization (such as Javascript in the case of a web-based client interface), is out of the scope of this paper. All our experiments have therefore been carried out with the algorithms running in the server machine, the one to which the sensors are connected, and, as discussed at the end of Section III-A, have exhibited computational times that are negligible with respect to the processing times of other parts of the loop (displaying data, transmission, etc.).
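The stateless procedure of Section IV-A can be illustrated with a short Python sketch. It is not the authors' implementation and rests on several assumptions that do not come from the paper: the location offset of the log-logistic model is taken as zero, so that the logarithm of a delay follows a logistic law; the logistic parameters are estimated by the method of moments rather than by numerical optimization; the test uses a fixed 5% critical value (0.660, applied to the statistic multiplied by 1 + 0.25/n) instead of a p-value; and a new window is assumed to start at the delay that caused the rejection. The names `passes_ad_test`, `stateless_regimes` and `sample_log_logistic` are hypothetical.

```python
import math
import random

# Assumed 5% critical value of the Anderson-Darling statistic for a logistic
# law with estimated location and scale (applied to A^2 * (1 + 0.25 / n)).
AD_CRIT_5PCT = 0.660


def passes_ad_test(delays):
    """Return True if log(delays) is not rejected as logistic at about 5%."""
    x = sorted(math.log(d) for d in delays)  # delays must be positive
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / (n - 1)
    if var == 0.0:
        return False  # constant sample: no scale can be estimated
    scale = math.sqrt(3.0 * var) / math.pi  # method-of-moments estimate
    # Logistic CDF written with tanh so that it cannot overflow.
    cdf = [0.5 * (1.0 + math.tanh(0.5 * (v - mu) / scale)) for v in x]
    cdf = [min(max(c, 1e-12), 1.0 - 1e-12) for c in cdf]
    a2 = -n - sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    ) / n
    return a2 * (1.0 + 0.25 / n) <= AD_CRIT_5PCT


def stateless_regimes(delays, s=20):
    """Return (start, end) index pairs, end exclusive, of detected regimes."""
    regimes = []
    start, end = 0, s      # the tested sample is delays[start:end]
    growing = False        # True once the current sample has passed the test
    while end <= len(delays):
        if passes_ad_test(delays[start:end]):
            growing = True
            end += 1       # non-rejection: enlarge the regime by one delay
        elif growing:
            regimes.append((start, end - 1))    # last non-rejected extent
            start, end = end - 1, end - 1 + s   # restart at the rejecting delay
            growing = False
        else:
            start += 1     # rejection: slide the window one delay value on
            end += 1
    if growing:
        regimes.append((start, end - 1))
    return regimes


def sample_log_logistic(rng, scale, shape, size):
    """Draw synthetic log-logistic delays by inverting the CDF."""
    out = []
    for _ in range(size):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(scale * (u / (1.0 - u)) ** (1.0 / shape))
    return out


# Synthetic signal with one abrupt change of regime (illustrative values only).
rng = random.Random(0)
signal = (sample_log_logistic(rng, 10.0, 8.0, 150)
          + sample_log_logistic(rng, 100.0, 8.0, 150))
regimes = stateless_regimes(signal, s=20)
print(regimes)
```

Each recorded regime spans at least s delays, and consecutive regimes never overlap. A state-based variant in the spirit of Section IV-B would, on each rejection, additionally test the α latest delays appended to every stored regime before sliding the window; that step is omitted here because this simplified test does not produce the p-values needed to rank candidate states.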
V. CONCLUSION

In this paper we have presented a novel study of sequences of delays occurring in the transmission of remote sensory data under a minimalist three-parameter log-logistic hypothesis. We have first proposed a measure that reflects the compatibility of many real scenarios with our hypothesis, and then introduced two algorithms suitable for on-line estimation that are capable of dealing with multiple regimes, bursts, etc. Our results indicate that our approach is appropriate for on-line and accurate estimation in many real situations, being a suitable approximation to more complex solutions when the computational cost is limited. In the future we plan to improve those results by including other efficient change detectors (such as CUSUM: CUmulative SUM) and also detrended fluctuation analysis for covering situations of smooth parameter change. The final goal is to use all these results to perform prediction and control, and to guarantee in all possible situations that the user of a networked telerobot can control it as efficiently and safely as possible.

REFERENCES

[1] X. Wang and H. Schulzrinne, “Comparison of adaptive internet multimedia applications,” IEICE Trans. Commun., vol. 82, no. 6, pp. 806–818, 1999.
[2] W. Kim, K. Ji, and A. Ambike, “Networked real-time control strategy dealing with stochastic time delays and packet losses,” J. Dynamic Syst., Meas., Control, vol. 128, no. 3, pp. 681–685, 2006.
[3] O. C. Imer, S. Yüksel, and T. Basar, “Optimal control of LTI systems over unreliable communication links,” Automatica, vol. 42, no. 9, pp. 1429–1439, Sep. 2006.
[4] P. X. Liu, M. Q-H. Meng, J. Gu, S. X. Yang, and C. Hu, “Control and data transmission for internet robotics,” in Proc. IEEE Intl. Conf. Robot. Autom., Sep. 2003, pp. 1659–1664.
[5] B. Siciliano and O. Khatib, Handbook of Robotics, B. Siciliano and O. Khatib, Eds. Berlin, Germany: Springer-Verlag, 2008.
[6] N. A. Gershenfeld and A. S. Weigend, “The future of time series,” in Time Series Prediction: Forecasting the Future and Understanding the Past, A. S. Weigend and N. A. Gershenfeld, Eds. Reading, MA, USA: Addison-Wesley, 1993, pp. 1–70.
[7] A. Gago-Benítez, J. A. Fernández-Madrigal, C. Galindo, and A. Cruz-Martín, “Statistical characterization of the time-delay for webbased networked telerobots,” in Proc. 5th Int. Workshop Appl. Probab., Jul. 2010, pp. 1–6.
[8] A. Gago-Benítez, J. A. Fernández-Madrigal, and A. Cruz-Martín, “A computationally efficient algorithm for modeling multi-regime delays in the sensory flow of networked telerobots,” in Proc. Int. Conf. Control, Robot. Cybern., 2012, pp. 1–13.
[9] J. Gonzalez, C. Galindo, J.-L. Blanco, J.-A. Fernández-Madrigal, V. Arévalo, and F.-A. Moreno, “SANCHO, a fair host robot. A description,” in Proc. IEEE Int. Conf. Mechatron., Apr. 2009, pp. 1–6.
[10] Surveyor Corp. (2010, Apr.). The Surveyor SRV-1 Blackfin Robot. San Luis Obispo, CA, USA [Online]. Available: http://surveyor.com/
[11] J. A. Fernández-Madrigal, C. Galindo, J. Gonzalez, E. Cruz-Martín, and A. Cruz-Martín, “A software engineering approach for the development of heterogeneous robotic applications,” J. Robot. Comput.-Integr. Manuf., vol. 24, no. 1, pp. 150–166, Feb. 2008.
[12] C. Darie, B. Brinzarea, F. Chereches-Tosa, and M. Bucicia, “AJAX and PHP: Building responsive web applications,” Birmingham, AL, USA: Packt, Mar. 2006.
[13] G. Box and G. Jenkins, Time Series Analysis: Forecasting and Control. New York, NY, USA: Wiley, 1976.
[14] R. B. D’Agostino and M. A. Stephens, Goodness-of-Fit Techniques. New York, NY, USA: Marcel Dekker, 1986.
[15] M. J. Ahmad, C. D. Sinclair, and A. Werritty, “Log-logistic flood frequency analysis,” J. Hydrol., vol. 98, nos. 3–4, pp. 205–224, Apr. 1988.
[16] M. A. Branch, T. F. Coleman, and Y. Li, “A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems,” SIAM J. Sci. Comput., vol. 21, no. 1, pp. 1–23, 1999.

Ana Gago-Benítez was born in Ubrique, Spain, in 1981. She received the B.S. and M.S. degrees in electrical engineering from the University of Málaga, Málaga, Spain, in 2008. She is currently pursuing the Ph.D. degree at the Department of System Engineering and Automation, University of Málaga. Her current research interests include statistical modeling of sensory flow and remote control of mobile robots.

Juan-Antonio Fernández-Madrigal was born in Córdoba, Spain, in 1970. He received the Ph.D. degree in computer science from the University of Málaga, Málaga, Spain, in 2000, where he has been a tenured Associate Professor since 2003. He has published three books internationally and nearly 80 scientific papers in journals and conferences. His current research interests include cognitive robotics, probabilistic robotics, educational robotics, and robotic software engineering.

Ana Cruz-Martín was born in La Línea, Spain, in 1972. She received the M.S. degree in computer science from ETSI Informática, University of Málaga, Málaga, Spain, in 1997, and the Ph.D. degree in computer science from the University of Málaga in 2004. Currently, she is with the Systems Engineering and Automation Department, Málaga University, Málaga. Her current research interests include multirobot systems and educational robotics.