    Delay versus Stickiness Violation Trade-offs for Load Balancing in Large-Scale Data Centers

    Most load balancing techniques implemented in current data centers tend to rely on a mapping from packets to server IP addresses through a hash value calculated from the flow five-tuple. The hash calculation allows extremely fast packet forwarding and provides flow 'stickiness', meaning that all packets belonging to the same flow get dispatched to the same server. Unfortunately, such static hashing may not yield an optimal degree of load balancing, e.g., due to variations in server processing speeds or traffic patterns. On the other hand, dynamic schemes, such as the Join-the-Shortest-Queue (JSQ) scheme, provide a natural way to mitigate load imbalances, but at the expense of stickiness violation. In the present paper we examine the fundamental trade-off between stickiness violation and packet-level latency performance in large-scale data centers. We establish that stringent flow stickiness carries a significant performance penalty in terms of packet-level delay. Moreover, relaxing the stickiness requirement by a minuscule amount is highly effective in clipping the tail of the latency distribution. We further propose a bin-based load balancing scheme that achieves a good balance among scalability, stickiness violation and packet-level delay performance. Extensive simulation experiments corroborate the analytical results and validate the effectiveness of the bin-based load balancing scheme.
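
The contrast between static hashing and JSQ described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's bin-based scheme; the function names, the choice of SHA-256 as the hash, and the example flow and queue lengths are all assumptions made for illustration.

```python
import hashlib


def hash_dispatch(five_tuple, num_servers):
    """Static hashing: map a flow five-tuple to a fixed server index.

    The hash function (SHA-256) is an illustrative assumption; any
    deterministic hash gives the same 'stickiness' property.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def jsq_dispatch(queue_lengths):
    """JSQ: pick the server with the shortest queue (lowest index on ties)."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)


# Hypothetical flow: (src IP, src port, dst IP, dst port, protocol).
flow = ("10.0.0.1", 40000, "10.0.0.2", 443, "tcp")
queues = [3, 0, 5, 2]

# Every packet of the same flow hashes to the same server (stickiness).
sticky_choices = {hash_dispatch(flow, len(queues)) for _ in range(5)}

# JSQ ignores the flow identity and follows the current queue state.
jsq_choice = jsq_dispatch(queues)
```

Hashing never consults the queue state, which is why it cannot react to load imbalance; JSQ does react, but two packets of one flow may land on different servers once the queue lengths change.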

    Lingering Issues in Distributed Scheduling

    Recent advances have resulted in queue-based algorithms for medium access control which operate in a distributed fashion, and yet achieve the optimal throughput performance of centralized scheduling algorithms. However, fundamental performance bounds reveal that the "cautious" activation rules involved in establishing throughput optimality tend to produce extremely large delays, typically growing exponentially in 1/(1-r), with r the load of the system, in contrast to the usual linear growth. Motivated by that issue, we explore to what extent more "aggressive" schemes can improve the delay performance. Our main finding is that aggressive activation rules induce a lingering effect, where individual nodes retain possession of a shared resource for excessive lengths of time even while a majority of other nodes idle. Using central limit theorem type arguments, we prove that the idleness induced by the lingering effect may cause the delays to grow with 1/(1-r) at a quadratic rate. To the best of our knowledge, these are the first mathematical results illuminating the lingering effect and quantifying the performance impact. In addition, extensive simulation experiments are conducted to illustrate and validate the various analytical results.

    Queue-Based Random-Access Algorithms: Fluid Limits and Stability Issues

    We use fluid limits to explore the (in)stability properties of wireless networks with queue-based random-access algorithms. Queue-based random-access schemes are simple and inherently distributed in nature, yet provide the capability to match the optimal throughput performance of centralized scheduling mechanisms in a wide range of scenarios. Unfortunately, the type of activation rules for which throughput optimality has been established may result in excessive queue lengths and delays. The use of more aggressive/persistent access schemes can improve the delay performance, but does not offer any universal maximum-stability guarantees. In order to gain qualitative insight and investigate the (in)stability properties of more aggressive/persistent activation rules, we examine fluid limits where the dynamics are scaled in space and time. In some situations, the fluid limits have smooth deterministic features and maximum stability is maintained, while in other scenarios they exhibit random oscillatory characteristics, giving rise to major technical challenges. In the latter regime, more aggressive access schemes continue to provide maximum stability in some networks, but may cause instability in others. Simulation experiments are conducted to illustrate and validate the analytical results.

    Exact asymptotics for fluid queues fed by multiple heavy-tailed on-off flows

    We consider a fluid queue fed by multiple On-Off flows with heavy-tailed (regularly varying) On periods. Under fairly mild assumptions, we prove that the workload distribution is asymptotically equivalent to that in a reduced system. The reduced system consists of a "dominant" subset of the flows, with the original service rate reduced by the mean rate of the other flows. We describe how a dominant set may be determined from a simple knapsack formulation. The dominant set consists of a "minimally critical" set of On-Off flows with regularly varying On periods. In case the dominant set contains just a single On-Off flow, the exact asymptotics for the reduced system follow from known results. For the case of several On-Off flows, we exploit a powerful intuitive argument to obtain the exact asymptotics. Combined with the reduced-load equivalence, the results for the reduced system provide a characterization of the tail of the workload distribution for a wide range of traffic scenarios.

    Delay Performance and Mixing Times in Random-Access Networks

    We explore the achievable delay performance in wireless random-access networks. While relatively simple and inherently distributed in nature, suitably designed queue-based random-access schemes provide the striking capability to match the optimal throughput performance of centralized scheduling mechanisms in a wide range of scenarios. The specific type of activation rules for which throughput optimality has been established may, however, yield excessive queues and delays. Motivated by that issue, we examine whether the poor delay performance is inherent to the basic operation of these schemes, or caused by the specific kind of activation rules. We derive delay lower bounds for queue-based activation rules, which offer fundamental insight into the cause of the excessive delays. For fixed activation rates we obtain lower bounds indicating that delays and mixing times can grow dramatically with the load in certain topologies as well.

    GPS queues with heterogeneous traffic classes

    We consider a queue fed by a mixture of light-tailed and heavy-tailed traffic. The two traffic classes are served in accordance with the generalized processor sharing (GPS) discipline. GPS-based scheduling algorithms, such as weighted fair queueing (WFQ), have emerged as an important mechanism for achieving service differentiation in integrated networks. We derive the asymptotic workload behavior of the light-tailed class for the situation where its GPS weight is larger than its traffic intensity. The GPS mechanism ensures that the workload is bounded above by that in an isolated system with the light-tailed class served in isolation at a constant rate equal to its GPS weight. We show that the workload distribution is in fact asymptotically equivalent to that in the isolated system, multiplied by a certain pre-factor, which accounts for the interaction with the heavy-tailed class. Specifically, the pre-factor represents the probability that the heavy-tailed class is backlogged long enough for the light-tailed class to reach overflow. The results provide crucial qualitative insight into the typical overflow scenario.

    Load Balancing in Large-Scale Systems with Multiple Dispatchers

    Load balancing algorithms play a crucial role in delivering robust application performance in data centers and cloud networks. Recently, strong interest has emerged in Join-the-Idle-Queue (JIQ) algorithms, which rely on tokens issued by idle servers in dispatching tasks and outperform power-of-d policies. Specifically, JIQ strategies involve minimal information exchange, and yet achieve zero blocking and wait in the many-server limit. The latter property prevails in a multiple-dispatcher scenario when the loads are strictly equal among dispatchers. For various reasons it is not uncommon however for skewed load patterns to occur. We leverage product-form representations and fluid limits to establish that the blocking and wait then no longer vanish, even for arbitrarily low overall load. Remarkably, it is the least-loaded dispatcher that throttles tokens and leaves idle servers stranded, thus acting as a bottleneck. Motivated by the above issues, we introduce two enhancements of the ordinary JIQ scheme where tokens are either distributed non-uniformly or occasionally exchanged among the various dispatchers. We prove that these extensions can achieve zero blocking and wait in the many-server limit, for any subcritical overall load and arbitrarily skewed load profiles. Extensive simulation experiments demonstrate that the asymptotic results are highly accurate, even for moderately sized systems.
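
The token mechanism behind JIQ can be sketched in a few lines of Python. This is a single-dispatcher illustration only, not the enhanced multi-dispatcher schemes the paper proposes; the class name, the FIFO token order, and the uniform random fallback when no token is available are assumptions.

```python
import random
from collections import deque


class JIQDispatcher:
    """Minimal Join-the-Idle-Queue dispatcher (illustrative sketch)."""

    def __init__(self, num_servers, seed=0):
        self.num_servers = num_servers
        self.tokens = deque()           # IDs of servers that reported being idle
        self.rng = random.Random(seed)  # seeded for reproducible fallback choices

    def receive_token(self, server_id):
        """An idle server announces itself by sending a token."""
        self.tokens.append(server_id)

    def dispatch(self):
        """Send the task to an idle server if a token is held.

        Otherwise fall back to a uniformly random server (an assumption;
        a blocking variant would reject the task instead).
        """
        if self.tokens:
            return self.tokens.popleft()
        return self.rng.randrange(self.num_servers)


dispatcher = JIQDispatcher(num_servers=4)
dispatcher.receive_token(2)

first = dispatcher.dispatch()   # uses the token from server 2
second = dispatcher.dispatch()  # no tokens left, so a random server is chosen
```

With several dispatchers, each token sits at exactly one of them, which is how a lightly loaded dispatcher can end up holding tokens that a busier dispatcher needs.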

    Hyper-Scalable JSQ with Sparse Feedback

    Load balancing algorithms play a vital role in enhancing performance in data centers and cloud networks. Due to the massive size of these systems, scalability challenges, and especially the communication overhead associated with load balancing mechanisms, have emerged as major concerns. Motivated by these issues, we introduce and analyze a novel class of load balancing schemes where the various servers provide occasional queue updates to guide the load assignment. We show that the proposed schemes strongly outperform JSQ(d) strategies with comparable communication overhead per job, and can achieve a vanishing waiting time in the many-server limit with just one message per job, just like the popular JIQ scheme. The proposed schemes are particularly geared however towards the sparse feedback regime with less than one message per job, where they outperform corresponding sparsified JIQ versions. We investigate fluid limits for synchronous updates as well as asynchronous exponential update intervals. The fixed point of the fluid limit is identified in the latter case, and used to derive the queue length distribution. We also demonstrate that in the ultra-low feedback regime the mean stationary waiting time tends to a constant in the synchronous case, but grows without bound in the asynchronous case.
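
For reference, the JSQ(d) baseline mentioned in this abstract can be sketched as below. This shows only the standard power-of-d sampling rule, not the sparse-feedback schemes the paper introduces; the function name, the sampling without replacement, and the example queue lengths are assumptions.

```python
import random


def jsq_d_dispatch(queue_lengths, d, rng):
    """JSQ(d): probe d distinct servers at random and join the shortest.

    Each job costs d probes, which is the communication overhead the
    abstract compares against.
    """
    sampled = rng.sample(range(len(queue_lengths)), d)
    return min(sampled, key=lambda i: queue_lengths[i])


rng = random.Random(0)
queues = [4, 1, 3, 2]

# With d equal to the number of servers, JSQ(d) reduces to plain JSQ.
full_choice = jsq_d_dispatch(queues, d=len(queues), rng=rng)

# With d = 1 the rule degenerates to uniformly random assignment.
random_choice = jsq_d_dispatch(queues, d=1, rng=rng)
```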

    Achievable Performance in Product-Form Networks

    We characterize the achievable range of performance measures in product-form networks where one or more system parameters can be freely set by a network operator. Given a product-form network and a set of configurable parameters, we identify which performance measures can be controlled and which target values can be attained. We also discuss an online optimization algorithm, which allows a network operator to set the system parameters so as to achieve target performance metrics. In some cases, the algorithm can be implemented in a distributed fashion, of which we give several examples. Finally, we give conditions that guarantee convergence of the algorithm, under the assumption that the target performance metrics are within the achievable range. Comment: 50th Annual Allerton Conference on Communication, Control and Computing - 201