AM221: Advanced Optimization
Lecture 3, Jan 29th, 2018
Instructor: Rasmus Kyng
using JuMP
using Clp
using DataFrames
using CSV
using Plots
using Gurobi
plotlyjs()
n = 4
c = [18; 16; 58; 200]
w = [10; 10; 48; 68.686]
b = 290
prog = Model(solver=ClpSolver())
@variable(prog,x[1:4] >= 0)
@constraint(prog,c'*x <= b)
@constraint(prog,x .<= ones(n))
obj = @objective(prog,:Max,w'*x)
solve(prog)
10/18.0, 16/18.0, 48/58.0, 68.68/200
getvalue(obj)
getvalue(x)
Read in the Galton height dataset as we did in Lecture 1.
ghdf = CSV.read("galton-stata11.tab",rows_for_type_detect=1000,delim='\t');
x,y = [ghdf[:mother] ghdf[:father]], ghdf[:height];
xc = [ones(length(y)) x ];
Let $\mathbf{x}(1)$ be the height of the mother, $\mathbf{x}(2)$ the height of the father, and $\mathbf{y}$ the height of the child.
We want to fit a linear model $\mathbf{y} = a_1 \cdot \mathbf{x}(1) + a_2 \cdot \mathbf{x}(2) + b $.
First, let's again find the parameters that minimize the residual sum of squares error.
galton = Model(solver=GurobiSolver(Presolve=0))
@variable(galton,ab[1:3])
obj = @objective(galton,:Min,sum((y-xc*ab).^2) )
solve(galton)
# calculate best parameters (not covered in class!)
ab_rss = getvalue(ab)
For reference, let's compare the Gurobi solution to the solution we got in Lecture 1 by solving a linear equation:
ab_lecture1 = (xc'*xc)^-1 * xc'*y
Let's plot the data and the linear function we fit to it.
x1var = linspace(58, 71, 40)
x2var = linspace(62, 79, 40)
f(u,v) = ab_rss[2]*u + ab_rss[3]*v + ab_rss[1]
plot(x1var, x2var, f, st=:surface)
scatter!(x[:,1],x[:,2],y,lab="data", xlabel="mother's height",ylabel="father's height",zlabel="subject's height",ticks=true,leg=false)
Now, let's again find the parameters that minimize the max residual error.
galton = Model(solver=ClpSolver())
@variable(galton,ab[1:3])
@variable(galton,z)
@constraint(galton,y-xc*ab .<= z)
@constraint(galton,xc*ab-y.<= z)
obj = @objective(galton,:Min,z)
solve(galton)
getvalue(obj)
ab_max = getvalue(ab)
x1var = linspace(58, 71, 40)
x2var = linspace(62, 79, 40)
f(u,v) = ab_max[2]*u + ab_max[3]*v + ab_max[1]
plot(x1var, x2var, f, st=:surface)
scatter!(x[:,1],x[:,2],y,lab="data", xlabel="mother's height",ylabel="father's height",zlabel="subject's height",ticks=true,leg=false)
We can see the two different objectives (residual sum of squares vs max error) give very different parameters and models.
Let's write down the numbers we use for the budget and constraints.
c_conv = [3e-3 ; 2e-3] # conversion rates from impression to paying subscription
a_cpi = [1e-2; 1e-2] # cost per impression
b_cpi = 80e3 # budget for impressions
a_ctr = [2e-2; 1e-2] # click-through rates
b_ctr = 100e3 # budget for clickthroughs (number of free subscriptions)
b_trf = 4e6 # maximum traffic on site A
Try out a pair of values for $x_1$ and $x_2$, the impressions on sites A and B respectively.
We should confirm that the values satisfy the constraints, and check the objective value (# conversions) that we'll get with this choice.
First, we see what happens with we choose to have as many impressions on site A as we can until the traffic constraint is exhausted, and then use the rest of our budget on site B.
(x1I, x2I) = (4e6,2e6) # pick impressions for site A & B
@show a_cpi'*[x1I; x2I] , b_cpi
@show a_cpi'*[x1I; x2I] <= b_cpi # check the cpi budget constraint
@show a_ctr'*[x1I; x2I] , b_ctr
@show a_ctr'*[x1I; x2I] <= b_ctr # check the clickthroughs (number of free subscriptions) budget constraint
@show x1I, b_trf
@show x1I <= b_trf # check the traffic constraint
@show c_conv'* [x1I; x2I] # See the value
The choices we used $x_1$ and $x_2$ gave 16K conversions. Is that optimal?
Let's now try spending all our budget on site B.
(x1II, x2II) = (0,8e6) # pick impressions for site A & B
@show a_cpi'*[x1II; x2II] , b_cpi
@show a_cpi'*[x1II; x2II] <= b_cpi # check the cpi budget constraint
@show a_ctr'*[x1II; x2II] , b_ctr
@show a_ctr'*[x1II; x2II] <= b_ctr # check the clickthroughs (number of free subscriptions) budget constraint
@show x1II, b_trf
@show x1II <= b_trf # check the traffic constraint
@show c_conv'* [x1II; x2II] # See the value
This strategy also gave 16K conversions.
Can we do better? Below we try with another pair of values for the site impression counts.
(x1III, x2III) = (2e6,6e6) # pick impressions for site A & B
@show a_cpi'*[x1III; x2III] , b_cpi
@show a_cpi'*[x1III; x2III] <= b_cpi # check the cpi budget constraint
@show a_ctr'*[x1III; x2III] , b_ctr
@show a_ctr'*[x1III; x2III] <= b_ctr # check the clickthroughs (number of free subscriptions) budget constraint
@show x1III, b_trf
@show x1III <= b_trf # check the traffic constraint
@show c_conv'* [x1III; x2III] # See the value
This strategy gave 18K conversions! So that's better.
Is it optimal?
Let's try running an LP solver and see what it gives.
prog = Model(solver=ClpSolver())
@variable(prog,x1 >= 0) # impressions bought on site A
@variable(prog,x2 >= 0) # impressions bought on site B
@constraint(prog,a_cpi'*[x1; x2] <= b_cpi)
@constraint(prog,a_ctr'*[x1; x2] <= b_ctr)
@constraint(prog,x1 <= b_trf)
obj = @objective(prog,:Max,c_conv'*[x1; x2])
solve(prog)
getvalue(obj)
(x1opt, x2opt) = getvalue([x1; x2])
We see that, according to the LP solver, choosing 2M impressions on site A and 6M impressions on site B is optimal, and gives 18K conversions.
But, how can we know that we have in fact obtained an optimal solution?
In the next lecture, we'll start to explore ways of showing that we have an optimal solution, also known as certificates of optimality.