Job fails on loading com.databricks.spark.csv in SparkR shell


When I open the SparkR shell as below, I am able to run jobs successfully:

> bin/sparkR
> rdf = data.frame(name = c("a", "b"), age = c(1, 2))
> df = createDataFrame(sqlContext, rdf)
> df
DataFrame[name:string, age:double]
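For what it's worth, jobs that actually collect results back to R also succeed in this mode; a quick check with the standard SparkR 1.4 API (the printed rows below are what this two-row data frame should produce, not a verbatim capture):

> head(df)
  name age
1    a   1
2    b   2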

Whereas when I include the spark-csv package while launching the SparkR shell, the job fails:

> bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
> rdf = data.frame(name = c("a", "b"), age = c(1, 2))
> df = createDataFrame(sqlContext, rdf)
15/06/25 17:59:50 INFO SparkContext: Starting job: collectPartitions at NativeMethodAccessorImpl.java:-2
15/06/25 17:59:50 INFO DAGScheduler: Got job 0 (collectPartitions at NativeMethodAccessorImpl.java:-2) with 1 output partitions (allowLocal=true)
15/06/25 17:59:50 INFO DAGScheduler: Final stage: ResultStage 0 (collectPartitions at NativeMethodAccessorImpl.java:-2)
15/06/25 17:59:50 INFO DAGScheduler: Parents of final stage: List()
15/06/25 17:59:50 INFO DAGScheduler: Missing parents: List()
15/06/25 17:59:50 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:453), which has no missing parents
15/06/25 17:59:50 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
15/06/25 17:59:50 INFO MemoryStore: ensureFreeSpace(1280) called with curMem=0, maxMem=280248975
15/06/25 17:59:50 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1280.0 B, free 267.3 MB)
15/06/25 17:59:50 INFO MemoryStore: ensureFreeSpace(854) called with curMem=1280, maxMem=280248975
15/06/25 17:59:50 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 854.0 B, free 267.3 MB)
15/06/25 17:59:50 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55886 (size: 854.0 B, free: 267.3 MB)
15/06/25 17:59:50 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:874
15/06/25 17:59:50 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:453)
15/06/25 17:59:50 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
15/06/25 17:59:50 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1632 bytes)
15/06/25 17:59:50 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/06/25 17:59:50 INFO Executor: Fetching http://172.16.104.224:55867/jars/org.apache.commons_commons-csv-1.1.jar with timestamp 1435235242519
15/06/25 17:59:50 INFO Utils: Fetching http://172.16.104.224:55867/jars/org.apache.commons_commons-csv-1.1.jar to c:\users\edwinn\appdata\local\temp\spark-39ef19de-03f7-4b45-b91b-0828912c1789\userfiles-d9b0cd7f-d060-4acc-bd26-46ce34d975b3\fetchfiletemp3674233359629683967.tmp
15/06/25 17:59:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:465)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:398)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:390)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:390)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
15/06/25 17:59:50 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException
        [... same java.lang.NullPointerException stack trace as above ...]
15/06/25 17:59:50 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/06/25 17:59:50 INFO TaskSchedulerImpl: Cancelling stage 0
15/06/25 17:59:50 INFO DAGScheduler: ResultStage 0 (collectPartitions at NativeMethodAccessorImpl.java:-2) failed in 0.156 s
15/06/25 17:59:50 INFO DAGScheduler: Job 0 failed: collectPartitions at NativeMethodAccessorImpl.java:-2, took 0.301876 s
15/06/25 17:59:50 ERROR RBackendHandler: collectPartitions on 3 failed
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException
        [... same java.lang.NullPointerException stack trace as above ...]
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Error: returnStatus == 0 is not TRUE
>

I get the above error. Any suggestions? Thanks.

Job details:

I haven't used a cluster. I've set:

> bin/sparkR --master local --packages com.databricks:spark-csv_2.10:1.0.3
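For context, what I ultimately want the package for is reading CSV files through the spark-csv data source, along these lines (a sketch; "cars.csv" is just a placeholder path):

> df <- read.df(sqlContext, "cars.csv", source = "com.databricks.spark.csv", header = "true")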

My OS is Windows 8 Enterprise; versions are Spark 1.4.1, Scala 2.10.1, and spark-csv 1.0.3 (I tried both the _2.11 and _2.10 builds).
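Reading the trace myself: the NullPointerException is thrown from java.lang.ProcessBuilder.start inside Hadoop's Shell.runCommand, reached through FileUtil.chmod while the executor fetches the commons-csv jar that --packages pulled in. On Windows that chmod shells out to winutils.exe, so I suspect (my assumption, not something the log states) that a missing winutils.exe / unset HADOOP_HOME could produce exactly this NPE. A sketch of the workaround I would try, where C:\hadoop is a hypothetical install location:

> REM hypothetical path; winutils.exe must actually be in %HADOOP_HOME%\bin
> set HADOOP_HOME=C:\hadoop
> set PATH=%HADOOP_HOME%\bin;%PATH%
> bin/sparkR --master local --packages com.databricks:spark-csv_2.10:1.0.3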

