Job fails on loading com.databricks.spark.csv in SparkR shell
When I open the SparkR shell as below, I am able to run jobs successfully:
> bin/sparkR
> rdf = data.frame(name = c("a", "b"), age = c(1, 2))
> df = createDataFrame(sqlContext, rdf)
> df
DataFrame[name:string, age:double]
Whereas when I include the spark-csv package while launching the SparkR shell, the job fails:
> bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
> rdf = data.frame(name = c("a", "b"), age = c(1, 2))
> df = createDataFrame(sqlContext, rdf)
15/06/25 17:59:50 INFO SparkContext: Starting job: collectPartitions at NativeMethodAccessorImpl.java:-2
15/06/25 17:59:50 INFO DAGScheduler: Got job 0 (collectPartitions at NativeMethodAccessorImpl.java:-2) with 1 output partitions (allowLocal=true)
15/06/25 17:59:50 INFO DAGScheduler: Final stage: ResultStage 0 (collectPartitions at NativeMethodAccessorImpl.java:-2)
15/06/25 17:59:50 INFO DAGScheduler: Parents of final stage: List()
15/06/25 17:59:50 INFO DAGScheduler: Missing parents: List()
15/06/25 17:59:50 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:453), which has no missing parents
15/06/25 17:59:50 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
15/06/25 17:59:50 INFO MemoryStore: ensureFreeSpace(1280) called with curMem=0, maxMem=280248975
15/06/25 17:59:50 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1280.0 B, free 267.3 MB)
15/06/25 17:59:50 INFO MemoryStore: ensureFreeSpace(854) called with curMem=1280, maxMem=280248975
15/06/25 17:59:50 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 854.0 B, free 267.3 MB)
15/06/25 17:59:50 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55886 (size: 854.0 B, free: 267.3 MB)
15/06/25 17:59:50 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:874
15/06/25 17:59:50 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:453)
15/06/25 17:59:50 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
15/06/25 17:59:50 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1632 bytes)
15/06/25 17:59:50 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/06/25 17:59:50 INFO Executor: Fetching http://172.16.104.224:55867/jars/org.apache.commons_commons-csv-1.1.jar with timestamp 1435235242519
15/06/25 17:59:50 INFO Utils: Fetching http://172.16.104.224:55867/jars/org.apache.commons_commons-csv-1.1.jar to c:\users\edwinn\appdata\local\temp\spark-39ef19de-03f7-4b45-b91b-0828912c1789\userfiles-d9b0cd7f-d060-4acc-bd26-46ce34d975b3\fetchfiletemp3674233359629683967.tmp
15/06/25 17:59:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:465)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:398)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:390)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:390)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
15/06/25 17:59:50 **WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException**
        [same java.lang.NullPointerException stack trace as above]
15/06/25 17:59:50 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/06/25 17:59:50 INFO TaskSchedulerImpl: Cancelling stage 0
15/06/25 17:59:50 INFO DAGScheduler: ResultStage 0 (collectPartitions at NativeMethodAccessorImpl.java:-2) failed in 0.156 s
15/06/25 17:59:50 INFO DAGScheduler: Job 0 failed: collectPartitions at NativeMethodAccessorImpl.java:-2, took 0.301876 s
15/06/25 17:59:50 **ERROR RBackendHandler: collectPartitions on 3 failed
java.lang.reflect.InvocationTargetException**
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException
        [same java.lang.NullPointerException stack trace as above]
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
**Error: returnStatus == 0 is not TRUE**
>
I get the above error. Any suggestions? Thanks.
I haven't used a cluster; I've set:
> bin/sparkR --master local --packages com.databricks:spark-csv_2.10:1.0.3
My OS version is Windows 8 Enterprise, with Spark 1.4.1, Scala 2.10.1, and spark-csv 2.11:1.0.3 / 2.10:1.0.3.