On the Convergence of Stochastic Aggregated Gradient Method

Minimizing the sum of a large number of convex functions arises in many applications. Methods such as the incremental gradient, stochastic gradient, and aggregated gradient methods are popular choices for such problems because they do not require a full gradient evaluation at every iteration. In this paper, we analyze a generalization of the stochastic aggregated gradient method via an alternative technique based on the convergence of iterative linear systems. In the quadratic case, this technique yields a short proof of the $O(\kappa^{-1})$ linear convergence rate. We observe that the technique is rather restrictive in the general case, where it may yield only weaker results.
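As a concrete illustration of the kind of method the abstract refers to, the following is a minimal sketch of a stochastic average gradient (SAG) iteration for minimizing $\frac{1}{n}\sum_i f_i(x)$: a table of per-component gradients is maintained, one randomly chosen entry is refreshed per iteration, and the step uses the average of the stored gradients. The function names, step size, and the least-squares test problem are illustrative choices, not taken from the paper.

```python
import numpy as np

def sag(grad_i, x0, n, step, iters, rng=None):
    """Stochastic average gradient (SAG) sketch for min (1/n) sum_i f_i(x).

    grad_i(x, i) returns the gradient of the i-th component f_i at x.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = x0.astype(float).copy()
    table = np.zeros((n, x.size))      # stored gradient of each f_i
    avg = np.zeros_like(x)             # running average of the table
    for _ in range(iters):
        i = rng.integers(n)            # sample one component uniformly
        g_new = grad_i(x, i)           # only one component gradient per step
        avg += (g_new - table[i]) / n  # update the aggregate in O(d) work
        table[i] = g_new
        x -= step * avg                # step along the aggregated gradient
    return x

# Illustrative quadratic instance: f_i(x) = (a_i^T x - b_i)^2 / 2,
# so the sum is an ordinary least-squares objective.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)
grad = lambda x, i: (A[i] @ x - b[i]) * A[i]
x = sag(grad, np.zeros(3), n=20, step=0.005, iters=30000, rng=rng)
```

On this quadratic instance the iterates approach the least-squares solution, consistent with the linear convergence discussed in the abstract; the small fixed step size is a conservative choice for stability, not a tuned value.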
