Mechanism Design for Demand Response Programs
Demand Response (DR) programs serve to reduce the consumption of electricity
at times when the supply is scarce and expensive. The utility informs the
aggregator of an anticipated DR event. The aggregator calls on a subset of its
pool of recruited agents to reduce their electricity use. Agents are paid for
reducing their energy consumption from contractually established baselines.
A baseline is a counterfactual estimate of the energy an agent would have
consumed had it not participated in the DR program, and baselines are used to
determine payments to agents. This creates an incentive for agents
to inflate their baselines. We propose a novel self-reported baseline mechanism
(SRBM) where each agent reports its baseline and marginal utility. These
reports are strategic and need not be truthful. Based on the reported
information, the aggregator selects or calls on agents to meet the load
reduction target. Called agents are paid for observed reductions from their
self-reported baselines. Agents who are not called face penalties for
consumption shortfalls below their baselines. The mechanism is specified by the
probability with which agents are called, reward prices for called agents, and
penalty prices for agents who are not called. Under SRBM, we show that truthful
reporting of baseline consumption and marginal utility is a dominant strategy.
Thus, SRBM eliminates the incentive for agents to inflate baselines. SRBM is
guaranteed to meet the load reduction target. SRBM is also nearly efficient,
since it selects the agents with the smallest marginal utilities and each
called agent contributes maximally to the load reduction target. Finally, we
show that SRBM is almost optimal with respect to the average cost of DR
provision faced by the aggregator.
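As a concrete illustration of this kind of settlement rule, the sketch below implements a simplified SRBM round in Python. The function names (`select_agents`, `srbm_payments`) and the greedy selection rule are illustrative simplifications of our own, not the paper's exact specification.

```python
def select_agents(reports, target):
    """Greedily call agents with the smallest reported marginal utility
    until reported reductions cover the load reduction target.

    reports: list of (agent_id, reported_baseline, reported_marginal_utility);
    each called agent is assumed to contribute its full reported baseline.
    """
    called, covered = set(), 0.0
    for agent_id, baseline, _ in sorted(reports, key=lambda r: r[2]):
        if covered >= target:
            break
        called.add(agent_id)
        covered += baseline
    return called

def srbm_payments(baselines, consumption, called, reward_price, penalty_price):
    """Settle payments after the DR event: called agents are paid for
    observed reductions from their self-reported baselines; uncalled
    agents are penalized for consuming below their reported baselines."""
    payments = {}
    for agent, baseline in baselines.items():
        shortfall = max(baseline - consumption[agent], 0.0)
        if agent in called:
            payments[agent] = reward_price * shortfall    # reward for reduction
        else:
            payments[agent] = -penalty_price * shortfall  # penalty for shortfall
    return payments
```

With a penalty price high enough relative to the reward price and the call probability, inflating the report becomes unprofitable in expectation, which is the intuition behind the paper's dominant-strategy result.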
Online Algorithms for Dynamic Matching Markets in Power Distribution Systems
This paper proposes online algorithms for dynamic matching markets in power
distribution systems. At each real-time operation instance, the algorithms
decide whether to match flexible loads with the available renewable generation
or to delay their supply, with the objective of maximizing the social welfare
of the exchange in the system. Specifically, two online matching algorithms are
proposed for the following generation-load scenarios: (i) when the mean of the
renewable generation is greater than the mean of the flexible load, and (ii)
when condition (i) is reversed. With the intuition that the performance of
such algorithms degrades with increasing randomness of the supply and demand,
two properties are proposed for assessing the performance of the algorithms.
The first property is convergence to optimality (CO) as the underlying
randomness of the renewable generation and customer loads goes to zero. The
second property, deviation from optimality, is measured as a function of the
standard deviation of the underlying randomness of the renewable generation and
customer loads. The algorithm proposed for the first scenario is shown to
satisfy CO, with a deviation from optimality that varies linearly with the
standard deviation. However, the same algorithm does not satisfy CO in the
second scenario. We then show that the algorithm proposed for the second
scenario satisfies CO, with a deviation from optimality that varies linearly
with the standard deviation plus an offset.
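A minimal sketch of the kind of matching rule involved, assuming a FIFO queue-based greedy matcher; the function name and queueing discipline are our own illustration, not the paper's algorithms.

```python
from collections import deque

def greedy_online_match(arrivals):
    """Greedy online matching of flexible loads to renewable generation.

    arrivals: iterable of (generation, load) pairs, one per time step.
    Loads that cannot be matched immediately are delayed in a FIFO queue;
    each step's generation serves the oldest waiting load first.
    Returns the total energy matched.
    """
    waiting = deque()
    matched = 0.0
    for generation, load in arrivals:
        if load > 0:
            waiting.append(load)
        while generation > 0 and waiting:
            served = min(generation, waiting[0])
            matched += served
            generation -= served
            remaining = waiting[0] - served
            if remaining <= 1e-12:
                waiting.popleft()   # load fully served
            else:
                waiting[0] = remaining
    return matched
```

In scenario (i), where mean generation exceeds mean load, the queue of delayed loads stays short with high probability, which matches the intuition behind the convergence-to-optimality property.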
A Minimal Incentive-based Demand Response Program With Self Reported Baseline Mechanism
In this paper, we propose a novel incentive-based Demand Response (DR)
program with a self-reported baseline mechanism. The System Operator (SO)
managing the DR program recruits consumers or aggregators of DR resources. The
recruited consumers are required to only report their baseline, which is the
minimal information necessary for any DR program. During a DR event, a set of
consumers is randomly selected from the pool of recruited consumers such that
the required load reduction is delivered. The selected consumers, who reduce
their load, are rewarded for their services, while other recruited consumers,
who deviate from their reported baselines, are penalized. The randomization in
selection and the penalty ensure that baseline inflation is controlled. We also
show that the selection probability can simultaneously be used to control the
SO's cost. This allows the SO to design the mechanism so that its cost is
almost optimal when there are no recruitment costs, and is significantly
reduced otherwise. Finally, we show that the proposed self-reported baseline
method outperforms other baseline estimation methods commonly used in practice.
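To see how randomized selection and penalties jointly deter inflation, consider a stylized expected-payoff calculation. All prices and probabilities below are hypothetical, and the assumption that a called consumer curtails fully is ours, made only for illustration.

```python
def expected_payoff(reported_baseline, true_baseline, p_call,
                    reward_price, penalty_price):
    """Expected payoff of a consumer as a function of its reported baseline.

    If called (probability p_call), the consumer curtails fully and is paid
    for the reduction from its reported baseline; if not called, it consumes
    at its true baseline and is penalized for any shortfall below the report.
    """
    inflation = max(reported_baseline - true_baseline, 0.0)
    paid = reward_price * reported_baseline   # called: reduce to zero
    fined = penalty_price * inflation         # not called: shortfall penalty
    return p_call * paid - (1.0 - p_call) * fined
```

Each inflated unit gains `p_call * reward_price` in expectation but loses `(1 - p_call) * penalty_price`, so inflation is unprofitable whenever the penalty price exceeds `p_call / (1 - p_call)` times the reward price.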
Improved Attention Models for Memory Augmented Neural Network Adaptive Controllers
We introduced a working memory augmented adaptive controller in our
recent work. The controller uses attention to read from and write to the
working memory. Attention allows the controller to read specific information
that is relevant and update its working memory with information based on its
relevance. The retrieved information is used to modify the final control input
computed by the controller. We showed that this modification speeds up
learning. In the above work, we used a soft-attention mechanism for the
adaptive controller. Controllers that use soft attention or hard attention
mechanisms are limited either because they can forget the information or fail
to shift attention when the information they are reading becomes less relevant.
We propose an attention mechanism that comprises (i) a hard attention
mechanism and (ii) an attention reallocation mechanism. The
attention reallocation enables the controller to reallocate attention to a
different location when the relevance of the location it is reading from
diminishes. The reallocation also ensures that the information stored in the
memory before the shift in attention is retained, information that can be lost
in both soft and hard attention mechanisms. Through detailed simulations of
various scenarios for two-link and three-link robot arm systems, we illustrate
the effectiveness of the proposed attention mechanism.
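A minimal sketch of attention reallocation, assuming cosine-similarity relevance and a fixed threshold; both choices, and the function name, are our own illustrative assumptions rather than the paper's design.

```python
import numpy as np

def reallocate_attention(memory, query, current_slot, threshold=0.5):
    """Hard attention with reallocation (illustrative sketch).

    memory: (n_slots, dim) working-memory matrix.
    query: (dim,) query vector from the controller.
    Relevance is the cosine similarity between the query and each slot.
    The controller keeps reading `current_slot` until its relevance drops
    below `threshold`, then shifts attention to the most relevant slot.
    Memory contents are left untouched, so no information is lost on a shift.
    """
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(query) + 1e-12
    relevance = memory @ query / norms
    if relevance[current_slot] < threshold:
        current_slot = int(np.argmax(relevance))  # reallocate attention
    return current_slot, memory[current_slot]
```

Because the shift only changes which slot is read, the write path is unaffected, which is how retention is preserved relative to pure soft or hard attention.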
Online Learning Robust Control of Nonlinear Dynamical Systems
In this work we address the problem of online robust control of nonlinear
dynamical systems perturbed by disturbances. We study the problem of
attenuating the total cost over a duration in response to the disturbances. We
consider the setting where the cost function at any particular time is a
general continuous, adversarially chosen function, and the disturbance is
adversarial and bounded at every point in time. Our goal is to design a
controller that can learn and adapt to achieve a certain level of attenuation.
We analyse two cases: (i) when the system is known, and (ii) when the system is
unknown. We measure the performance of the controller by the deviation of the
controller's cost, over a sequence of cost functions, from a specified
attenuation level. We propose an online controller and present guarantees for
this metric when the maximum possible attenuation is a system constant. We show
that the desired attenuation is achieved when the controller has a preview of
the cost functions and the disturbances for a short duration of time and the
system is known. We then show that, when the system is unknown, the proposed
controller with a preview of the cost functions and the disturbances for a
short horizon achieves an attenuation guarantee that depends on the accuracy of
a given nonlinear estimator and on the duration of the initial estimation
period. We also characterize the lower bound on the prediction horizon required
for these guarantees to hold in terms of the system constants.
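The preview-based control idea can be sketched as a crude search over candidate inputs. Everything here, from the candidate grid to the scalar dynamics used in the test, is an illustrative stand-in for the paper's controller, not its actual algorithm.

```python
import numpy as np

def preview_controller(x, cost_fns, disturbances, f, horizon, candidates):
    """Receding-horizon control with a short preview (illustrative sketch).

    x: current state.
    cost_fns: previewed per-step cost functions, one per horizon step.
    disturbances: previewed disturbances for the same horizon.
    f: known dynamics, x_next = f(x, u) + w.
    Picks the constant input from `candidates` that minimizes the previewed
    total cost, a crude stand-in for a full trajectory optimization.
    """
    best_u, best_cost = None, np.inf
    for u in candidates:
        z, total = x, 0.0
        for t in range(horizon):
            total += cost_fns[t](z, u)          # previewed stage cost
            z = f(z, u) + disturbances[t]       # roll the dynamics forward
        if total < best_cost:
            best_u, best_cost = u, total
    return best_u
```

A real implementation would optimize over input sequences rather than constants, but the sketch shows how preview of both costs and disturbances enters the decision.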
Online Learning for Incentive-Based Demand Response
In this paper, we consider the problem of learning online to manage Demand
Response (DR) resources. A typical DR mechanism requires the DR manager to
assign a baseline to the participating consumer, where the baseline is an
estimate of the counterfactual consumption of the consumer had it not been
called to provide the DR service. A challenge in estimating the baseline is the
incentive the consumer has to inflate the baseline estimate. We consider the
problem of learning online to estimate the baseline and to optimize the
operating costs over a period of time under such incentives. We propose an
online learning scheme that employs least-squares for estimation with a
perturbation to the reward price (for the DR services or load curtailment) that
is designed to balance the exploration and exploitation trade-off that arises
with online learning. We show that our proposed scheme achieves a very low
regret with respect to the optimal operating cost, computed with full knowledge
of the baseline, over the days of the DR program, and that it is individually
rational for the consumers to participate. Our scheme is significantly better
than the averaging-type approach, which fetches a substantially larger regret.
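The estimation component can be sketched with ordinary least squares on a simple linear response model. The model form, the function names, and the day^(-1/4) perturbation decay are all illustrative assumptions of ours, not the paper's scheme.

```python
import numpy as np

def perturbed_price(day, base_price, rng, scale=1.0):
    """Reward price with a decaying exploration perturbation.

    The perturbation keeps the consumer's response informative enough to
    identify its baseline; its decay rate (here ~ day**(-1/4), an
    illustrative choice) trades exploration against exploitation.
    """
    return base_price + scale * day ** -0.25 * rng.standard_normal()

def ls_baseline(responses, prices):
    """Least-squares fit of the linear response model  load = b - s * price.

    responses, prices: observed loads and the (perturbed) reward prices
    offered on past DR days.  Returns the baseline estimate b and the
    price-sensitivity estimate s.
    """
    A = np.column_stack([np.ones(len(prices)), -np.asarray(prices)])
    (b, s), *_ = np.linalg.lstsq(A, np.asarray(responses), rcond=None)
    return b, s
```

Without the perturbation, a strategic consumer could hold its response uninformative; the injected price noise is what makes the least-squares estimate of the baseline identifiable over time.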
Meta-Learning Guarantees for Online Receding Horizon Learning Control
In this paper we provide provable regret guarantees for an online
meta-learning receding horizon control algorithm in an iterative control
setting. We consider the setting where, in each iteration, the system to be
controlled is a different and unknown linear deterministic system, the cost for
the controller in an iteration is a general additive cost function, and there
are affine control input constraints. By analysing conditions under which
sub-linear regret is achievable, we prove that the online receding horizon
controller achieves sub-linear regret for the controller cost and constraint
violation with respect to the best policy that satisfies the control input
constraints, when the preview of the cost functions is limited to an interval
and the interval size is doubled from one interval to the next. We then show
that the average of the regret for the controller cost and constraint violation
with respect to the same policy decays with the number of iterations, under the
same setting.

Comment: arXiv admin note: substantial text overlap with arXiv:2008.13265,
arXiv:2010.0726
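The doubling preview-interval schedule can be sketched as follows; starting the interval size at 1 is our illustrative choice, not a detail taken from the paper.

```python
def preview_intervals(duration):
    """Partition an iteration of length `duration` into preview intervals
    whose size doubles from one interval to the next, truncating the last
    interval at the end of the iteration.
    """
    intervals, start, size = [], 0, 1
    while start < duration:
        end = min(start + size, duration)
        intervals.append((start, end))
        start, size = end, 2 * size   # double the preview window
    return intervals
```

The doubling means only logarithmically many interval boundaries occur per iteration, which is the kind of structure that makes sub-linear regret bounds attainable with limited preview.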
