AM221: Advanced Optimization

Lecture 3, Jan 29th, 2018

Instructor: Rasmus Kyng

using JuMP
using Clp
using DataFrames
using CSV
using Plots
using Gurobi
plotlyjs()

init_notebook(true)

WARNING: Method definition (::Type{PlotlyJS.SyncPlot{TD} where TD<:PlotlyJS.AbstractPlotlyDisplay})(PlotlyJS.Plot{TT} where TT<:PlotlyJS.AbstractTrace) in module PlotlyJS at /Users/rjkyng/.julia/v0.6/PlotlyJS/src/display.jl:136 overwritten at /Users/rjkyng/.julia/v0.6/PlotlyJS/src/displays/ijulia.jl:187.

Plots.PlotlyJSBackend()

Stock Portfolio¶

n = 4
c = [18; 16; 58; 200]
w = [10; 10; 48; 68.686]
b = 290

prog = Model(solver=ClpSolver())
@variable(prog,x[1:4] >= 0)
@constraint(prog,c'*x <= b)
@constraint(prog,x .<= ones(n))
obj = @objective(prog,:Max,w'*x)
solve(prog)

:Optimal

10/18.0, 16/18.0, 48/58.0, 68.68/200

(0.5555555555555556, 0.8888888888888888, 0.8275862068965517, 0.34340000000000004)

getvalue(obj)

135.99914

getvalue(x)

4-element Array{Float64,1}:
 1.0 
 1.0 
 1.0 
 0.99

Regression: Least Max Error¶

Read in the Galton height dataset as we did in Lecture 1.

ghdf = CSV.read("galton-stata11.tab",rows_for_type_detect=1000,delim='\t');
x,y = [ghdf[:mother] ghdf[:father]], ghdf[:height];
xc = [ones(length(y)) x ];

Let $\mathbf{x}(1)$ be the height of the mother, $\mathbf{x}(2)$ the height of the father, and $\mathbf{y}$ the height of the child.

We want to fit a linear model $\mathbf{y} = a_1 \cdot \mathbf{x}(1) + a_2 \cdot \mathbf{x}(2) + b $.

First, let's again find the parameters that minimize the residual sum of squares error.

galton = Model(solver=GurobiSolver(Presolve=0))
@variable(galton,ab[1:3])
obj = @objective(galton,:Min,sum((y-xc*ab).^2) )
solve(galton)

Academic license - for non-commercial use only
Optimize a model with 0 rows, 3 columns and 0 nonzeros
Model has 6 quadratic objective terms
Coefficient statistics:
  Matrix range     [0e+00, 0e+00]
  Objective range  [1e+05, 8e+06]
  QObjective range [2e+03, 2e+07]
  Bounds range     [0e+00, 0e+00]
  RHS range        [0e+00, 0e+00]
Ordering time: 0.00s

Barrier statistics:
 Free vars  : 5
 AA' NZ     : 1.000e+00
 Factor NZ  : 3.000e+00
 Factor Ops : 5.000e+00 (less than 1 second per iteration)
 Threads    : 1

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0   0.00000000e+00  0.00000000e+00  0.00e+00 9.59e+05  0.00e+00     0s
   1  -3.88549443e+02 -9.42676285e-03  1.73e-08 9.59e+05  0.00e+00     0s
   2  -1.30224087e+03 -1.05904732e-01  3.74e-08 9.59e+05  0.00e+00     0s
   3  -2.42624440e+03 -3.67670532e-01  8.12e-08 9.59e+05  0.00e+00     0s
   4  -5.10860686e+03 -1.63056848e+00  1.79e-07 9.59e+05  0.00e+00     0s
   5  -1.65970825e+04 -1.72358964e+01  3.43e-07 9.57e+05  0.00e+00     0s
   6  -2.68411816e+04 -4.51361363e+01  7.83e-07 9.56e+05  0.00e+00     0s
   7  -1.51345570e+05 -1.45795652e+03  1.50e-06 9.41e+05  0.00e+00     0s
   8  -2.12348181e+05 -2.89282817e+03  3.46e-06 9.33e+05  0.00e+00     0s
   9  -3.11596277e+05 -6.31052834e+03  7.60e-06 9.21e+05  0.00e+00     0s
  10  -5.55548671e+05 -2.07365659e+04  1.67e-05 8.90e+05  0.00e+00     0s
  11  -1.05687903e+06 -8.08191176e+04  3.55e-05 8.23e+05  0.00e+00     0s
  12  -2.14136133e+06 -4.04792351e+05  7.00e-05 6.54e+05  0.00e+00     0s
  13  -3.37521989e+06 -1.45952498e+06  1.25e-04 3.80e+05  0.00e+00     0s
  14  -4.00371038e+06 -4.00362759e+06  8.30e-05 3.80e-01  0.00e+00     0s
  15  -4.00363076e+06 -4.00363076e+06  1.89e-10 3.80e-07  0.00e+00     0s

Barrier solved model in 15 iterations and 0.01 seconds
Optimal objective -4.00363076e+06

:Optimal

# calculate best parameters (not covered in class!)
ab_rss = getvalue(ab)

3-element Array{Float64,1}:
 22.3097  
  0.283215
  0.379897

For reference, let's compare the Gurobi solution to the solution we got in Lecture 1 by solving a linear equation:

ab_lecture1 = (xc'*xc)^-1 * xc'*y

3-element Array{Float64,1}:
 22.3097  
  0.283215
  0.379897

Let's plot the data and the linear function we fit to it.

x1var = linspace(58, 71, 40)
x2var = linspace(62, 79, 40)
f(u,v) = ab_rss[2]*u + ab_rss[3]*v + ab_rss[1]
plot(x1var, x2var, f, st=:surface)
scatter!(x[:,1],x[:,2],y,lab="data", xlabel="mother's height",ylabel="father's height",zlabel="subject's height",ticks=true,leg=false)

Now, let's again find the parameters that minimize the max residual error.

galton = Model(solver=ClpSolver())
@variable(galton,ab[1:3])
@variable(galton,z)
@constraint(galton,y-xc*ab .<= z)
@constraint(galton,xc*ab-y.<= z)
obj = @objective(galton,:Min,z)
solve(galton)

:Optimal

getvalue(obj)

9.841463414634127

ab_max = getvalue(ab)

3-element Array{Float64,1}:
 -9.37805 
  0.341463
  0.804878

x1var = linspace(58, 71, 40)
x2var = linspace(62, 79, 40)
f(u,v) = ab_max[2]*u + ab_max[3]*v + ab_max[1]
plot(x1var, x2var, f, st=:surface)
scatter!(x[:,1],x[:,2],y,lab="data", xlabel="mother's height",ylabel="father's height",zlabel="subject's height",ticks=true,leg=false)

We can see the two different objectives (residual sum of squares vs max error) give very different parameters and models.

Online advertising budget allocation¶

Let's write down the numbers we use for the budget and constraints.

c_conv = [3e-3 ; 2e-3]  # conversion rates from impression to paying subscription
a_cpi = [1e-2; 1e-2]    # cost per impression
b_cpi = 80e3             # budget for impressions
a_ctr = [2e-2; 1e-2]    # click-through rates
b_ctr = 100e3            # budget for clickthroughs (number of free subscriptions)
b_trf = 4e6             # maximum traffic on site A

4.0e6

Try out a pair of values for $x_1$ and $x_2$, the impressions on sites A and B respectively.

We should confirm that the values satisfy the constraints, and check the objective value (# conversions) that we'll get with this choice.

First, we see what happens with we choose to have as many impressions on site A as we can until the traffic constraint is exhausted, and then use the rest of our budget on site B.

(x1I, x2I) = (4e6,2e6)           # pick impressions for site A & B
@show a_cpi'*[x1I; x2I] , b_cpi
@show a_cpi'*[x1I; x2I] <= b_cpi # check the cpi budget constraint
@show a_ctr'*[x1I; x2I] , b_ctr
@show a_ctr'*[x1I; x2I] <= b_ctr # check the clickthroughs (number of free subscriptions) budget constraint
@show x1I, b_trf
@show x1I <= b_trf               # check the traffic constraint
@show c_conv'* [x1I; x2I]        # See the value

(a_cpi' * [x1I; x2I], b_cpi) = (60000.0, 80000.0)
a_cpi' * [x1I; x2I] <= b_cpi = true
(a_ctr' * [x1I; x2I], b_ctr) = (100000.0, 100000.0)
a_ctr' * [x1I; x2I] <= b_ctr = true
(x1I, b_trf) = (4.0e6, 4.0e6)
x1I <= b_trf = true
c_conv' * [x1I; x2I] = 16000.0

16000.0

The choices we used $x_1$ and $x_2$ gave 16K conversions. Is that optimal?

Let's now try spending all our budget on site B.

(x1II, x2II) = (0,8e6)             # pick impressions for site A & B
@show a_cpi'*[x1II; x2II] , b_cpi
@show a_cpi'*[x1II; x2II] <= b_cpi # check the cpi budget constraint
@show a_ctr'*[x1II; x2II] , b_ctr
@show a_ctr'*[x1II; x2II] <= b_ctr # check the clickthroughs (number of free subscriptions) budget constraint
@show x1II, b_trf
@show x1II <= b_trf                # check the traffic constraint
@show c_conv'* [x1II; x2II]        # See the value

(a_cpi' * [x1II; x2II], b_cpi) = (80000.0, 80000.0)
a_cpi' * [x1II; x2II] <= b_cpi = true
(a_ctr' * [x1II; x2II], b_ctr) = (80000.0, 100000.0)
a_ctr' * [x1II; x2II] <= b_ctr = true
(x1II, b_trf) = (0, 4.0e6)
x1II <= b_trf = true
c_conv' * [x1II; x2II] = 16000.0

16000.0

This strategy also gave 16K conversions.

Can we do better? Below we try with another pair of values for the site impression counts.

(x1III, x2III) = (2e6,6e6)           # pick impressions for site A & B
@show a_cpi'*[x1III; x2III] , b_cpi
@show a_cpi'*[x1III; x2III] <= b_cpi # check the cpi budget constraint
@show a_ctr'*[x1III; x2III] , b_ctr
@show a_ctr'*[x1III; x2III] <= b_ctr # check the clickthroughs (number of free subscriptions) budget constraint
@show x1III, b_trf
@show x1III <= b_trf                 # check the traffic constraint
@show c_conv'* [x1III; x2III]        # See the value

(a_cpi' * [x1III; x2III], b_cpi) = (80000.0, 80000.0)
a_cpi' * [x1III; x2III] <= b_cpi = true
(a_ctr' * [x1III; x2III], b_ctr) = (100000.0, 100000.0)
a_ctr' * [x1III; x2III] <= b_ctr = true
(x1III, b_trf) = (2.0e6, 4.0e6)
x1III <= b_trf = true
c_conv' * [x1III; x2III] = 18000.0

18000.0

This strategy gave 18K conversions! So that's better.

Is it optimal?

Let's try running an LP solver and see what it gives.

prog = Model(solver=ClpSolver())
@variable(prog,x1 >= 0) # impressions bought on site A
@variable(prog,x2 >= 0) # impressions bought on site B
@constraint(prog,a_cpi'*[x1; x2] <= b_cpi)
@constraint(prog,a_ctr'*[x1; x2] <= b_ctr)
@constraint(prog,x1 <= b_trf)
obj = @objective(prog,:Max,c_conv'*[x1; x2])

solve(prog)

:Optimal

getvalue(obj)

18000.0

(x1opt, x2opt) = getvalue([x1; x2])

2-element Array{Float64,1}:
 2.0e6
 6.0e6

We see that, according to the LP solver, choosing 2M impressions on site A and 6M impressions on site B is optimal, and gives 18K conversions.

But, how can we know that we have in fact obtained an optimal solution?

In the next lecture, we'll start to explore ways of showing that we have an optimal solution, also known as certificates of optimality.