Abstract
We consider the general problem of online convex optimization with time-varying budget constraints in the presence of predictions for the next cost and constraint functions, that arises in a plethora of network resource management problems. A novel saddle-point algorithm is designed by combining a Follow-The-Regularized-Leader iteration with prediction-adaptive dynamic steps. The algorithm achieves <inline-formula> <tex-math notation="LaTeX">$\c O(T^{(3-\beta)/4})$</tex-math> </inline-formula> regret and <inline-formula> <tex-math notation="LaTeX">$\c O(T^{(1+\beta)/2})$</tex-math> </inline-formula> constraint violation bounds that are tunable via parameter <inline-formula> <tex-math notation="LaTeX">$\beta\!\in\![1/2,1)$</tex-math> </inline-formula> and have constant factors that shrink with the predictions quality, achieving eventually <inline-formula> <tex-math notation="LaTeX">$\c O(1)$</tex-math> </inline-formula> regret for perfect predictions. Our work extends the seminal FTRL framework for this new OCO setting and outperforms the respective state-of-the-art greedy-based solutions which naturally cannot benefit from predictions, without imposing conditions on the (unknown) quality of predictions, the cost functions or the geometry of constraints, beyond convexity.
Original language | English |
---|---|
Pages (from-to) | 1-15 |
Number of pages | 15 |
Journal | IEEE/ACM Transactions on Networking |
DOIs | |
Publication status | E-pub ahead of print - 2023 |
Keywords
- Network control
- network management
- online convex optimization (OCO)
- online learning
- resource allocation