python - Caffe: Extremely high loss while learning simple linear functions
I'm trying to train a neural net to learn the function y = x1 + x2 + x3. The objective is to play around with Caffe in order to learn and understand it better. The data required is synthetically generated in Python and written to an LMDB database file.
The code for data generation:
import numpy as np
import lmdb
import caffe

ntrain = 100
ntest = 20
k = 3
h = 1
w = 1

xtrain = np.random.randint(0, 1000, size=(ntrain, k, h, w))
xtest = np.random.randint(0, 1000, size=(ntest, k, h, w))

ytrain = xtrain[:, 0, 0, 0] + xtrain[:, 1, 0, 0] + xtrain[:, 2, 0, 0]
ytest = xtest[:, 0, 0, 0] + xtest[:, 1, 0, 0] + xtest[:, 2, 0, 0]

env = lmdb.open('expt/expt_train')

for i in range(ntrain):
    datum = caffe.proto.caffe_pb2.Datum()
    datum.channels = xtrain.shape[1]
    datum.height = xtrain.shape[2]
    datum.width = xtrain.shape[3]
    datum.data = xtrain[i].tobytes()
    datum.label = int(ytrain[i])
    str_id = '{:08}'.format(i)

    with env.begin(write=True) as txn:
        txn.put(str_id.encode('ascii'), datum.SerializeToString())

env = lmdb.open('expt/expt_test')

for i in range(ntest):
    datum = caffe.proto.caffe_pb2.Datum()
    datum.channels = xtest.shape[1]
    datum.height = xtest.shape[2]
    datum.width = xtest.shape[3]
    datum.data = xtest[i].tobytes()
    datum.label = int(ytest[i])
    str_id = '{:08}'.format(i)

    with env.begin(write=True) as txn:
        txn.put(str_id.encode('ascii'), datum.SerializeToString())
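As a sanity check, here is a minimal sketch that reads one record back from the training LMDB; the path and key format come from the code above, and the decode dtype is an assumption (numpy's default integer, typically int64 on 64-bit platforms):

import numpy as np
import lmdb
import caffe

env = lmdb.open('expt/expt_train', readonly=True)
with env.begin() as txn:
    raw = txn.get('{:08}'.format(0).encode('ascii'))

datum = caffe.proto.caffe_pb2.Datum()
datum.ParseFromString(raw)

# Decode with the same dtype the arrays were written with (numpy default int,
# assumed int64 here) so the values come back as they went in.
x = np.frombuffer(datum.data, dtype=np.int64).reshape(datum.channels,
                                                      datum.height,
                                                      datum.width)
print(x.ravel(), datum.label)   # the label should equal the sum of the three values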
The solver.prototxt file:
net: "expt/expt.prototxt" display: 1 max_iter: 200 test_iter: 20 test_interval: 100 base_lr: 0.000001 momentum: 0.9 # weight_decay: 0.0005 lr_policy: "inv" # gamma: 0.5 # stepsize: 10 # power: 0.75 snapshot_prefix: "expt/expt" snapshot_diff: true solver_mode: cpu solver_type: sgd debug_info: true
The Caffe model:
name: "expt" layer { name: "expt_data_train" type: "data" top: "data" top: "label" include { phase: train } data_param { source: "expt/expt_train" backend: lmdb batch_size: 1 } } layer { name: "expt_data_validate" type: "data" top: "data" top: "label" include { phase: test } data_param { source: "expt/expt_test" backend: lmdb batch_size: 1 } } layer { name: "ip" type: "innerproduct" bottom: "data" top: "ip" inner_product_param { num_output: 1 weight_filler { type: 'constant' } bias_filler { type: 'constant' } } } layer { name: "loss" type: "euclideanloss" bottom: "ip" bottom: "label" top: "loss" }
The loss on the test data I'm getting is 233,655. The loss is shocking since it is 3 orders of magnitude greater than the numbers in the training and test data sets. Also, the function to be learned is a simple linear function. I can't seem to figure out what is wrong in the code. Any suggestions/inputs are appreciated.
The loss was so large in this case because Caffe accepts data (i.e. datum.data) only in uint8 format and labels (datum.label) in int32 format. However, for the labels, the numpy.int64 format also seems to be working. I think datum.data is accepted only in uint8 format because Caffe was developed primarily for computer vision tasks, where the inputs are images, which have RGB values in the [0,255] range; uint8 can capture this using the least amount of memory. I made the following changes to the data generation code:
xtrain = np.uint8(np.random.randint(0, 256, size=(ntrain, k, h, w)))
xtest = np.uint8(np.random.randint(0, 256, size=(ntest, k, h, w)))

# Cast to int before summing so the labels do not wrap around in uint8 arithmetic.
ytrain = xtrain[:, 0, 0, 0].astype(int) + xtrain[:, 1, 0, 0].astype(int) + xtrain[:, 2, 0, 0].astype(int)
ytest = xtest[:, 0, 0, 0].astype(int) + xtest[:, 1, 0, 0].astype(int) + xtest[:, 2, 0, 0].astype(int)
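A quick round-trip check (a small sketch using the arrays above; the index 0 is arbitrary) confirms that the uint8 bytes now survive serialization into a Datum:

import numpy as np
import caffe

# Build one Datum the same way as in the write loop.
datum = caffe.proto.caffe_pb2.Datum()
datum.channels, datum.height, datum.width = k, h, w
datum.data = xtrain[0].tobytes()            # 3 bytes for a uint8 3x1x1 sample
datum.label = int(ytrain[0])

# Decode the bytes as uint8, which is how Caffe's data layer reads datum.data.
decoded = np.frombuffer(datum.data, dtype=np.uint8).reshape(k, h, w)
assert (decoded == xtrain[0]).all()
assert datum.label == int(decoded.astype(int).sum())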
After playing around with the net parameters (learning rate, number of iterations, etc.) I'm getting an error of the order of 10^(-6), which I think is pretty good!
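To see what was actually learned, the InnerProduct parameters can be read back from a snapshot; this is a rough sketch, and the expt/expt_iter_200.caffemodel filename is an assumption derived from snapshot_prefix and max_iter, so adjust it to the snapshot actually produced:

import caffe

caffe.set_mode_cpu()
# The snapshot filename is a guess based on snapshot_prefix: "expt/expt"
# and max_iter: 200.
net = caffe.Net('expt/expt.prototxt', 'expt/expt_iter_200.caffemodel', caffe.TEST)

# For y = x1 + x2 + x3 the weights should be close to [1, 1, 1] and the bias to 0.
print(net.params['ip'][0].data)   # weight matrix, shape (1, 3)
print(net.params['ip'][1].data)   # bias, shape (1,)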