python - Caffe: Extremely high loss while learning simple linear functions


I'm trying to train a neural net to learn the function y = x1 + x2 + x3. The objective is to play around with Caffe in order to learn and understand it better. The required data is synthetically generated in Python and written to an LMDB database file.

Code for data generation:

import numpy as np
import lmdb
import caffe

ntrain = 100
ntest = 20
k = 3
h = 1
w = 1

# Inputs: k integer values per sample, laid out as (channels, height, width)
xtrain = np.random.randint(0,1000, size = (ntrain,k,h,w))
xtest = np.random.randint(0,1000, size = (ntest,k,h,w))

# Labels: the sum of the three inputs
ytrain = xtrain[:,0,0,0] + xtrain[:,1,0,0] + xtrain[:,2,0,0]
ytest = xtest[:,0,0,0] + xtest[:,1,0,0] + xtest[:,2,0,0]

env = lmdb.open('expt/expt_train')

for i in range(ntrain):
    datum = caffe.proto.caffe_pb2.Datum()
    datum.channels = xtrain.shape[1]
    datum.height = xtrain.shape[2]
    datum.width = xtrain.shape[3]
    datum.data = xtrain[i].tobytes()
    datum.label = int(ytrain[i])
    str_id = '{:08}'.format(i)

    with env.begin(write=True) as txn:
        txn.put(str_id.encode('ascii'), datum.SerializeToString())

env = lmdb.open('expt/expt_test')

for i in range(ntest):
    datum = caffe.proto.caffe_pb2.Datum()
    datum.channels = xtest.shape[1]
    datum.height = xtest.shape[2]
    datum.width = xtest.shape[3]
    datum.data = xtest[i].tobytes()
    datum.label = int(ytest[i])
    str_id = '{:08}'.format(i)

    with env.begin(write=True) as txn:
        txn.put(str_id.encode('ascii'), datum.SerializeToString())

The solver.prototxt file:

net: "expt/expt.prototxt"

display: 1
max_iter: 200
test_iter: 20
test_interval: 100

base_lr: 0.000001
momentum: 0.9
# weight_decay: 0.0005

lr_policy: "inv"
# gamma: 0.5
# stepsize: 10
# power: 0.75

snapshot_prefix: "expt/expt"
snapshot_diff: true

solver_mode: CPU
solver_type: SGD

debug_info: true
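For reference, Caffe describes the "inv" policy as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula in plain Python. The function name inv_lr is made up for illustration, and treating the commented-out gamma and power as 0 is an assumption about how unset fields behave. Under that assumption the rate simply stays at base_lr for the whole run.

```python
def inv_lr(base_lr, gamma, power, iteration):
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter) ^ (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

base_lr = 0.000001

# Assumption: with gamma and power commented out, both are taken as 0.
constant = [inv_lr(base_lr, 0.0, 0.0, it) for it in (0, 100, 200)]

# With the commented-out values enabled, the rate decays over the 200 iterations.
decayed = [inv_lr(base_lr, 0.5, 0.75, it) for it in (0, 100, 200)]

print(constant)
print(decayed)
```

If a decaying schedule is wanted, gamma and power would need to be uncommented; stepsize belongs to the "step" policy and is not used by "inv".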

The Caffe model:

name: "expt"

layer {
    name: "expt_data_train"
    type: "Data"
    top: "data"
    top: "label"

    include {
        phase: TRAIN
    }

    data_param {
        source: "expt/expt_train"
        backend: LMDB
        batch_size: 1
    }
}

layer {
    name: "expt_data_validate"
    type: "Data"
    top: "data"
    top: "label"

    include {
        phase: TEST
    }

    data_param {
        source: "expt/expt_test"
        backend: LMDB
        batch_size: 1
    }
}

layer {
    name: "ip"
    type: "InnerProduct"
    bottom: "data"
    top: "ip"

    inner_product_param {
        num_output: 1

        weight_filler {
            type: 'constant'
        }

        bias_filler {
            type: 'constant'
        }
    }
}

layer {
    name: "loss"
    type: "EuclideanLoss"
    bottom: "ip"
    bottom: "label"
    top: "loss"
}
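The EuclideanLoss layer computes half the mean squared difference between its two bottoms, 1/(2N) * sum((ip - label)^2). A rough NumPy sketch of that formula gives a feel for the scale to expect before any learning has happened. The helper name euclidean_loss and the sample label are illustrative, not part of the original post. A 'constant' filler defaults to 0, so the untrained net predicts 0, and a label near 1500 already yields a loss above one million.

```python
import numpy as np

def euclidean_loss(pred, label):
    """Half the mean squared difference over a batch, as in Caffe's EuclideanLoss."""
    pred = np.asarray(pred, dtype=np.float64)
    label = np.asarray(label, dtype=np.float64)
    n = pred.shape[0]
    return np.sum((pred - label) ** 2) / (2.0 * n)

# Illustrative sample: three inputs drawn from [0, 1000) sum to about 1500 on average.
label = np.array([1500.0])
# Weights and bias start at 0, so the first prediction is 0.
pred = np.array([0.0])

print(euclidean_loss(pred, label))  # 1125000.0
```

Because the error is squared, the loss is not on the same scale as the labels themselves, which is worth keeping in mind when judging whether a reported loss is large.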

The loss on the test data that I'm getting is 233,655. This is shocking, as the loss is three orders of magnitude greater than the numbers in the training and test data sets. Also, the function to be learned is a simple linear function. I can't seem to figure out what is wrong in the code. Any suggestions/inputs are much appreciated.

The loss is so large in this case because Caffe only accepts data (i.e. datum.data) in uint8 format and labels (datum.label) in int32 format. datum.data is a raw byte string, and each byte is read as one input value. However, for the labels, numpy.int64 format also seems to work. I think datum.data is accepted only in uint8 format because Caffe was primarily developed for computer vision tasks, where the inputs are images with RGB values in the [0,255] range. uint8 can capture this using the least amount of memory. I made the following changes to the data generation code:

xtrain = np.uint8(np.random.randint(0,256, size = (ntrain,k,h,w)))
xtest = np.uint8(np.random.randint(0,256, size = (ntest,k,h,w)))

# Cast to int before summing so the uint8 values do not wrap around at 256
ytrain = xtrain[:,0,0,0].astype(int) + xtrain[:,1,0,0].astype(int) + xtrain[:,2,0,0].astype(int)
ytest = xtest[:,0,0,0].astype(int) + xtest[:,1,0,0].astype(int) + xtest[:,2,0,0].astype(int)
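The effect of the dtype on datum.data can be checked without Caffe. The following is a minimal sketch with made-up sample values, using an explicit int64 array to stand in for what np.random.randint returns on many platforms. tobytes() on an int64 array emits 8 bytes per value, while a 3x1x1 Datum describes only 3 one-byte values, so the bytes that get read no longer match the intended inputs. The sketch also shows why the labels are summed in a wider integer type: uint8 sums wrap around at 256.

```python
import numpy as np

# One made-up sample with k = 3 channels and h = w = 1.
sample64 = np.array([700, 150, 42], dtype=np.int64).reshape(3, 1, 1)
sample8 = np.array([200, 150, 42], dtype=np.uint8).reshape(3, 1, 1)

# int64 serializes to 8 bytes per value, uint8 to exactly 1 byte per value.
print(len(sample64.tobytes()))  # 24
print(len(sample8.tobytes()))   # 3

# Reading the int64 buffer back one byte at a time does not recover the inputs.
as_bytes = np.frombuffer(sample64.tobytes(), dtype=np.uint8)
print(as_bytes[:3])

# uint8 array arithmetic wraps around: 200 + 150 does not give 350.
wrapped = int((sample8[0] + sample8[1])[0, 0])
print(wrapped)  # 94

# Casting to a wider integer first gives the intended label.
label = int(sample8[:, 0, 0].astype(int).sum())
print(label)  # 392
```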

After playing around with the net parameters (learning rate, number of iterations, etc.), I'm getting an error of the order of 10^(-6), which I think is pretty good!
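To illustrate why the learning rate matters so much here, the following is a rough NumPy sketch (not Caffe, and not the original poster's code) of per-sample SGD on a single linear unit with the same half-squared-error loss. It leaves out the bias and momentum for simplicity, and the seed, step count, and learning rate are illustrative choices. With inputs as large as 255, the gradients are large, so a step size around 1e-6 keeps the updates stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data mirroring the post: 100 samples, 3 uint8-range inputs, label = their sum.
x = rng.integers(0, 256, size=(100, 3)).astype(np.float64)
y = x.sum(axis=1)

w = np.zeros(3)   # constant filler: weights start at 0
lr = 1e-6         # illustrative; much larger values can diverge on inputs of this size

for step in range(5000):
    i = step % len(x)          # one sample per update, cycling through the data
    err = w @ x[i] - y[i]      # prediction minus label
    w -= lr * err * x[i]       # gradient of 0.5 * err**2 with respect to w

loss = 0.5 * np.mean((x @ w - y) ** 2)
print(w, loss)
```

The weights should end up close to 1, 1, 1, which is the function y = x1 + x2 + x3 being learned.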

