Posted by geda on Nov 30, 2016; 2:23pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/carbondata-quick-start-error-can-t-select-from-table-tp3440.html
Hi,

I cloned from git (branch master) and built with mvn for Hadoop 2.6.3 and Spark 1.6.1.
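Roughly, the build step looks like the following; this is only a sketch, and the exact profile/property names for 0.3.0-incubating may differ:

# approximate invocation; assumes the pom reads a hadoop.version property
mvn clean package -DskipTests -Dhadoop.version=2.6.3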
Following the quick start, I then ran spark-shell:
$SPARK_HOME/bin/spark-shell --verbose --master local[4] --jars \
/usr/local/spark/lib/carbondata_2.10-0.3.0-incubating-SNAPSHOT-shade-hadoop2.6.3.jar,/usr/local/spark/lib/mysql-connector-java-5.1.38-bin.jar
More detail can be seen at http://pastebin.com/Myp6aubs

Then, in :paste mode:
import java.io._
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.spark.sql.CarbonContext
val storePath = "hdfs://test.namenode02.bi.com:8020/usr/carbondata/store"
val cc = new CarbonContext(sc, storePath)
cc.setConf(HiveConf.ConfVars.HIVECHECKFILEFORMAT.varname, "false")
cc.setConf("carbon.kettle.home","/usr/local/spark/carbondata/carbonplugins")
cc.sql("create table if not exists test_table (id string, name string, city
string, age Int) STORED BY 'carbondata'")
cc.sql(s"load data inpath 'hdfs://test.namenode02.bi.com:8020/tmp/sample.csv'
into table test_table")
cc.sql("select * from test_table").show
1. The load fails, although the table can be created:
Table MetaData Unlocked Successfully after data load
java.lang.RuntimeException: Table is locked for updation. Please try after some time
This is like http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/load-data-fail-td100.html#a164. After chmod 777, the load could run.
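Roughly, the workaround is the following sketch; it assumes the lock file sits under the HDFS store path, which may not match every setup:

# sketch: open up permissions on the store path so the lock can be acquired
hdfs dfs -chmod -R 777 /usr/carbondata/store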
2. But when I then run cc.sql("select * from test_table").show, I get:
INFO 30-11 18:24:01,072 - Parse Completed
INFO 30-11 18:24:01,196 - main Starting to optimize plan
INFO 30-11 18:24:01,347 - main ************************Total Number Rows In BTREE: 1
INFO 30-11 18:24:01,361 - main ************************Total Number Rows In BTREE: 1
INFO 30-11 18:24:01,369 - main ************************Total Number Rows In BTREE: 1
INFO 30-11 18:24:01,376 - main ************************Total Number Rows In BTREE: 1
INFO 30-11 18:24:01,385 - main ************************Total Number Rows In BTREE: 1
INFO 30-11 18:24:01,386 - main Total Time taken to ensure the required executors: 0
INFO 30-11 18:24:01,386 - main Time elapsed to allocate the required executors: 0
INFO 30-11 18:24:01,391 -
Identified no.of.blocks: 5,
no.of.tasks: 4,
no.of.nodes: 1,
parallelism: 4
INFO 30-11 18:24:01,396 - Starting job: show at <console>:37
INFO 30-11 18:24:01,396 - Got job 3 (show at <console>:37) with 1 output partitions
INFO 30-11 18:24:01,396 - Final stage: ResultStage 4 (show at <console>:37)
INFO 30-11 18:24:01,396 - Parents of final stage: List()
INFO 30-11 18:24:01,397 - Missing parents: List()
INFO 30-11 18:24:01,397 - Submitting ResultStage 4 (MapPartitionsRDD[20] at show at <console>:37), which has no missing parents
INFO 30-11 18:24:01,401 - Block broadcast_6 stored as values in memory (estimated size 13.3 KB, free 285.6 KB)
INFO 30-11 18:24:01,403 - Block broadcast_6_piece0 stored as bytes in memory (estimated size 6.7 KB, free 292.2 KB)
INFO 30-11 18:24:01,403 - Added broadcast_6_piece0 in memory on localhost:15792 (size: 6.7 KB, free: 511.1 MB)
INFO 30-11 18:24:01,404 - Created broadcast 6 from broadcast at DAGScheduler.scala:1006
INFO 30-11 18:24:01,404 - Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[20] at show at <console>:37)
INFO 30-11 18:24:01,404 - Adding task set 4.0 with 1 tasks
INFO 30-11 18:24:01,405 - Starting task 0.0 in stage 4.0 (TID 6, localhost, partition 0,PROCESS_LOCAL, 2709 bytes)
INFO 30-11 18:24:01,406 - Running task 0.0 in stage 4.0 (TID 6)
INFO 30-11 18:24:01,436 - [Executor task launch worker-1][partitionID:table;queryID:10219962900397098] Query will be executed on table: test_table
ERROR 30-11 18:24:01,444 - Exception in task 0.0 in stage 4.0 (TID 6)
java.lang.InterruptedException:
    at org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:83)
    at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:171)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)