ADMM example

ADMM example code

Author

Don Don

Published

May 30, 2021

scaled dual ADMM algorithm

  • Given x, z and , to some initial value.

  • Repeat:

    • x:=argmaxx(f(x)+ρ2Ax+Bzc+μ22)
    • x:=argmaxx(g(z)+ρ2Ax+Bzc+μ22)
    • μ:=μ+(Ax+Bzc)
    • Stopping criterion : quit r2<ϵ and s2<ϵ


Stopping criterion

We can define the primal and dual residuals in ADMM at step k+1.
* Primal residuals : rk+1=Axk+1+Bzk+1c
* Dual residuals : sk+1=ρATB(zk+1zk)

Therefore stopping criterion satisfies that r2 and s2 are smaller than any ϵ



Lasso example

minimizeβi=1n(yiβ0xitβ)2+λj=1p|βj|

minimizeβf(β)+f(z)subject toIβIZ=0


r=IβIZLρ(β,z,v)=f(β)+g(z)+vtr+ρ2||r||22=f(β)+g(z)+ρ2||r+1ρv||22ρ2||v||22=f(β)+g(z)+ρ2||r+μ||22constantv,μ=1ρv

βk+1:=argminβ(f(β)+ρ2||IβIZk+μk||22)=argminβ(yXβ)t(yXβ)+ρ2||IβIZk+μk||22)2Xty+2XtXβ+ρβρZk+ρμk=0(2XtX+ρI)β=2Xty+ρ(Zkμk)βk+1=(2XtX+ρI)1(2Xty+ρ(Zkμk))


βk+1:=argminβ(f(β)+ρ2||IβIZk+μk||22)=argminβ(yXβ)t(yXβ)+ρ2||IβIZk+μk||22)2Xty+2XtXβ+ρβρZk+ρμk=0(2XtX+ρI)β=2Xty+ρ(Zkμk)βk+1=(2XtX+ρI)1(2Xty+ρ(Zkμk))


The prox operatior for g(z)=λ||z||1

proxλ,g(z)=argminv(λ||z||1+12||zv||22)=argminv(||v||1+12λ||zv||22)argminvi(12(vizi)2+λ|vi|)


Zk+1:=argminZ(g(Z)+ρ2||Iβk+1IZ+μk||22)=argminZ(g(Z)+ρ2||βk+1+μkZ)||22=argminZ(g(Z)+121ρ||βk+1+μkZ)||22prox1ρ,g(βk+1+μk)


  • Given β, z, μ, ρ to some initial value

  • Repeat:

    • βk+1=(2XtX+ρI)1(2Xty+ρ(Zkμk))
    • Zk+1=prox1ρ,g(βk+1+μk)
    • μk+1:=μk+(βk+1Zk+1)
    • Stopping criterion : quit r2<ϵ and s2<ϵ


  • prime resdual : rk+1=βk+1zk+1
  • dual resdual : $s^{k+1} = -(z^{k+1} - z^k) $


R code

nll <- function(X, Y, beta) {
                A <- Y - X %*% beta
                loglike <- crossprod(A)
                return(loglike)
}

# Proximal operator
prox.l1 <- function(u, lambda) {
                uhat <- abs(u) - lambda
                prox <- sign(u) * pmax(rep(0, length(u)), uhat)
                return(prox)
}

l2norm <- function(x) sqrt(sum(x^2))


ADMM <- function(X,Y,rho=5,lambda=.1,iter=100, eps = 0.0001){
                
                n <- nrow(X)
                p <- ncol(X)
                
                beta <- matrix(0, nrow=iter, ncol=p) 
                beta[1,] <- rep(0, p)
                
                obj <- rep(0, iter)
                obj[1] <- nll(X, Y, beta[1,]) + lambda * sum(abs(beta[1,]))
                
                z <- matrix(0, nrow=iter, ncol=p)
                v <- rep(0, p)     
                
                invmat <- solve(2*crossprod(X) + diag(rho, p))
                
                s <- 0    
                r <- 0    
                t <- 0
                
                for (t in 2:iter){
                                
                                beta[t,] <- invmat %*% (2*crossprod(X, Y) + rho * (z[t-1,]-v))
                                z[t,] <- prox.l1(beta[t,] + v, lambda/rho)
                                v <- v + beta[t,] - z[t,]
                                obj[t] <- nll(X, Y, beta[t,]) + lambda * sum(abs(beta[t,]))
                                
                                r <- beta[t,] - z[t,]
                                s <- -rho * (z[t,] - z[t-1,])
                                
                                r.norm <- l2norm(r)
                                s.norm <- l2norm(s)
                                
                                if (r.norm < eps & s.norm < eps) {
                                                break
                                }
                }
                beta <- beta[-c(t+1:iter),]
                obj <- obj[-c(t+1:iter)]
                result <- list("beta.hat" = beta[nrow(beta),], "beta"=beta, "objective"=obj, "iter"=t)
                return(result)
}

x <- cbind(1, matrix(rnorm(1000*4), ncol = 4))
beta <- c(1.4, -2, -3, 4, 5)

eps <- rnorm(1000*1) 
y <- x%*%beta + eps 

ADMM(X = x, Y = y)
$beta.hat
[1]  1.379323 -1.990279 -2.998234  4.018089  5.044086

$beta
         [,1]      [,2]      [,3]     [,4]     [,5]
[1,] 0.000000  0.000000  0.000000 0.000000 0.000000
[2,] 1.376414 -1.985059 -2.991330 4.008202 5.031837
[3,] 1.379269 -1.990212 -2.998172 4.018014 5.044009
[4,] 1.379323 -1.990278 -2.998234 4.018089 5.044086
[5,] 1.379323 -1.990279 -2.998234 4.018089 5.044086

$objective
[1] 58940.123  1022.952  1022.613  1022.613  1022.613

$iter
[1] 5

Citation

BibTeX citation:
@online{don2021,
  author = {Don, Don and Don, Don},
  title = {ADMM Example},
  date = {2021-05-30},
  url = {https://dondonkim.netlify.app/posts/2021-06-12-admm/admm.html},
  langid = {en}
}
For attribution, please cite this work as:
Don, Don, and Don Don. 2021. “ADMM Example.” May 30, 2021. https://dondonkim.netlify.app/posts/2021-06-12-admm/admm.html.