python 2.7 - scipy.minimize 'SLSQP' appears to return sub-optimal weight values
I'm trying to run a minimization over an ensemble of logloss values, but when using scipy.minimize the function appears to return a sub-optimal value.
The data comes in a pandas table:
click, prob1, prob2, prob3
0, 0.0023, 0.0024, 0.012
1, 0.89, 0.672, 0.78
0, 0.43, 0.023, 0.032
from scipy.optimize import minimize
from math import log
import numpy as np
import pandas as pd

def logloss(p, y):
    ## clip the prediction away from 0 and 1 so log() stays finite
    p = max(min(p, 1 - 10e-15), 10e-15)
    return -log(p) if y == 1 else -log(1 - p)

def ensemble_weights(weights, probs, y_true):
    loss = 0
    final_pred = []
    prob_length = len(probs)

    ## weighted sum of the individual model probabilities for each row
    for i in range(prob_length):
        w_sum = 0
        for index, weight in enumerate(weights):
            w_sum += probs[i][index] * weight
        final_pred.append(w_sum)

    ## mean logloss of the blended predictions
    for index, pred in enumerate(final_pred):
        loss += logloss(pred, y_true[index])

    print loss / prob_length, 'weights :=', weights
    return loss / prob_length

## w0 is the initial guess for the minimum of function 'fun'
## the initial guess is that all weights are equal
w0 = [1 / probs.shape[1]] * probs.shape[1]

## sets bounds on the weights, between 0 and 1
bnds = [(0, 1)] * probs.shape[1]

## sets a constraint on the weights: they must sum to 1
## or, in other words, 1 - sum(w) = 0
cons = ({'type': 'eq', 'fun': lambda w: 1 - np.sum(w)})

weights = minimize(
    ensemble_weights,
    w0,
    (probs, y_true),
    method='slsqp',
    bounds=bnds,
    constraints=cons
)

## sanity check, make sure the weights do in fact sum to 1
print("weights sum %0.4f:" % weights['fun'])
print weights['x']
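For completeness, the code above assumes probs and y_true have already been pulled out of the table as numpy arrays. A minimal sketch of that setup (the file name and column selection here are my assumptions, not part of the original script):

import pandas as pd

## hypothetical CSV holding the table shown above
df = pd.read_csv('ensemble_probs.csv')

## labels and per-model probabilities as plain numpy arrays,
## so that probs[i][index] in ensemble_weights picks row i, model index
y_true = df['click'].values
probs = df[['prob1', 'prob2', 'prob3']].values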
To debug, I've used a print statement inside the function, which returns the following:
0.0101326509533 weights := [ 1. 0. 0.]
0.0101326509533 weights := [ 1. 0. 0.]
0.0101326509702 weights := [ 1.00000001 0. 0. ]
0.0101292476389 weights := [ 1.00000000e+00 1.49011612e-08 0.00000000e+00]
0.0101326509678 weights := [ 1.00000000e+00 0.00000000e+00 1.49011612e-08]
0.0102904525781 weights := [ -4.44628778e-10 1.00000000e+00 -4.38298620e-10]
0.00938612854966 weights := [ 5.00000345e-01 4.99999655e-01 -2.19149158e-10]
0.00961930211064 weights := [ 7.49998538e-01 2.50001462e-01 -1.09575296e-10]
0.00979499597866 weights := [ 8.74998145e-01 1.25001855e-01 -5.47881403e-11]
0.00990978430231 weights := [ 9.37498333e-01 6.25016666e-02 -2.73943942e-11]
0.00998305685424 weights := [ 9.68748679e-01 3.12513212e-02 -1.36974109e-11]
0.0100300175342 weights := [ 9.84374012e-01 1.56259881e-02 -6.84884901e-12]
0.0100605546439 weights := [ 9.92186781e-01 7.81321874e-03 -3.42452299e-12]
0.0100807513117 weights := [ 9.96093233e-01 3.90676721e-03 -1.71233067e-12]
0.0100942930446 weights := [ 9.98046503e-01 1.95349723e-03 -8.56215139e-13]
0.0101034594634 weights := [ 9.99023167e-01 9.76832595e-04 -4.28144378e-13]
0.0101034594634 weights := [ 9.99023167e-01 9.76832595e-04 -4.28144378e-13]
0.0101034594804 weights := [ 9.99023182e-01 9.76832595e-04 -4.28144378e-13]
0.0101034593149 weights := [ 9.99023167e-01 9.76847497e-04 -4.28144378e-13]
0.010103459478 weights := [ 9.99023167e-01 9.76832595e-04 1.49007330e-08]
weights sum 0.0101:
[ 9.99023167e-01 9.76832595e-04 -4.28144378e-13]
My expectation is that the optimal weights returned should be:
0.00938612854966 weights := [ 5.00000345e-01 4.99999655e-01 -2.19149158e-10]
Can anyone see a glaring issue?
FYI -> the code is a hack of the Kaggle Otto script https://www.kaggle.com/hsperr/otto-group-product-classification-challenge/finding-ensamble-weights
Solved
Passing
options = {'ftol': 1e-9}
as part of the minimize call fixed it: the default stopping tolerance was loose enough that SLSQP terminated before converging on the better weights.
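Putting it together, a sketch of the adjusted call (reusing ensemble_weights, w0, bnds, cons, probs and y_true from the question) with the tightened tolerance passed through options:

from scipy.optimize import minimize

weights = minimize(
    ensemble_weights,
    w0,
    (probs, y_true),
    method='slsqp',
    bounds=bnds,
    constraints=cons,
    options={'ftol': 1e-9}  ## tighter stopping tolerance than SLSQP's 1e-6 default
)

print weights['x']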