Hi community,

I am inserting records from a source Hive table (kc22_p1) into a target CarbonData table (kc22_ca). kc22_p1 holds 102,200,946 records (51.5 GB).

Steps:

spark-shell --master yarn-client --driver-memory 20G --executor-cores 1 --num-executors 12 --executor-memory 5G

val cc = new CarbonContext(sc, "hdfs://cluster1/opt/CarbonStore")

cc.sql("create table if not exists kc22_ca (akb020 String,akc190 String,aae072 String,akc220 String,ake005 String,bka135 String,bkc301 String,ake001 String,ake002 String,ake006 String,akc221 String,ake010 String,aka065 String,ake003 String,aka063 String,akc225 double,akc226 double,aae019 double,akc228 double,ake051 double,aka068 double,akc268 double,bkc228 double,bka635 double,aka069 double,bka107 double,bka108 double,bkc127 String,aka064 String,aae100 String,bkc126 String,bkc125 String,bka231 String,bae073 double,bka636 double,bka637 double,bka104 double,bka609 String,aka070 String,aka067 String,aka074 String,bkc378 String,bkc379 String,bkc380 String,bkc381 String,aae011 String,aae036 String,bkc319 double,bkf050 String,akc273 String,aka071 double,aka072 String,aka107 String,bka076 String,akf002 String,bkc241 double,bkc242 String,bkc243 String,bka205 String,bkb401 String,bka650 double,bka651 String,aka130 String,aka120 String,bae075 double,aae017 String,aae032 String,bkc060 double,bkc061 double,bkc062 double,bkc063 double,bkc064 double,bkc065 double,bkc066 String,bkc067 String,bkc068 String,bkc069 String,baz001 double,baz002 double,bze011 String,bze036 String,aaa027 String,aab034 String,aac001 double,bkb070 String,bkb071 String,bkc077 String,bkc078 String,bkc079 String,bkc081 double,bka610 String,bka971 double,bka972 double,bka973 String,bka974 String) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='akb020, aae072, bka135, akc220, ake005, bkc301','DICTIONARY_EXCLUDE'='akc190,ake001,ake002,ake006,akc221,ake010,aka065,ake003,aka063,bkc127,aka064,aae100,bkc126,bkc125,bka231,bka609,aka070,aka067,aka074,bkc378,bkc379,bkc380,bkc381,aae011,aae036,bkf050,akc273,aka072,aka107,bka076,akf002,bkc242,bkc243,bka205,bkb401,bka651,aka130,aka120,aae017,aae032,bkc066,bkc067,bkc068,bkc069,bze011,bze036,aaa027,aab034,bkb070,bkb071,bkc077,bkc078,bkc079,bka610,bka973,bka974')")

Note: the amount of shuffle differs between using DICTIONARY_INCLUDE alone and using DICTIONARY_INCLUDE together with DICTIONARY_EXCLUDE (see the attached log).

Log:

17/09/19 09:29:51 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 1039) in 8523 ms on node2 (3/7)
17/09/19 09:30:13 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1036) in 30754 ms on node2 (4/7)
17/09/19 09:30:18 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 1037) in 35309 ms on node1 (5/7)
17/09/19 09:33:49 WARN HeartbeatReceiver: Removing executor 5 with no recent heartbeats: 135938 ms exceeds timeout 120000 ms
17/09/19 09:33:49 ERROR YarnScheduler: Lost executor 5 on node1: Executor heartbeat timed out after 135938 ms
17/09/19 09:33:49 WARN TaskSetManager: Lost task 6.0 in stage 1.0 (TID 1041, node1): ExecutorLostFailure (executor 5 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 135938 ms
17/09/19 09:33:49 INFO TaskSetManager: Starting task 6.1 in stage 1.0 (TID 1042, node3, partition 6,PROCESS_LOCAL, 1894 bytes)
17/09/19 09:33:49 INFO DAGScheduler: Executor lost: 5 (epoch 1)
17/09/19 09:33:49 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 5
17/09/19 09:33:49 INFO BlockManagerMasterEndpoint: Trying to remove executor 5 from BlockManagerMaster.
17/09/19 09:33:49 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(5, node1, 58006)
17/09/19 09:33:49 INFO BlockManagerMaster: Removed 5 successfully in removeExecutor
17/09/19 09:33:49 INFO ShuffleMapStage: ShuffleMapStage 0 is now unavailable on executor 5 (917/1035, false)
17/09/19 09:33:49 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on node3:57113 (size: 3.7 KB, free: 4.1 GB)
17/09/19 09:33:49 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to node3:33757
17/09/19 09:33:49 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 5830 bytes
17/09/19 09:33:49 WARN TaskSetManager: Lost task 6.1 in stage 1.0 (TID 1042, node3): FetchFailed(null, shuffleId=0, mapId=-1, reduceId=6, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$2.apply(MapOutputTracker.scala:542)
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$2.apply(MapOutputTracker.scala:538)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
    at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:538)
    at org.apache.spark.MapOutputTracker.getMapSizesByExecutorId(MapOutputTracker.scala:155)
    at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:47)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:98)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD$$anon$1.<init>(CarbonGlobalDictionaryRDD.scala:372)
    at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD.compute(CarbonGlobalDictionaryRDD.scala:345)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
)
17/09/19 09:33:49 INFO DAGScheduler: Marking ResultStage 1 (collect at GlobalDictionaryUtil.scala:746) as failed due to a fetch failure from ShuffleMapStage 0 (RDD at CarbonGlobalDictionaryRDD.scala:271)
17/09/19 09:33:49 INFO DAGScheduler: ResultStage 1 (collect at GlobalDictionaryUtil.scala:746) failed in 247.083 s
17/09/19 09:33:49 INFO DAGScheduler: Resubmitting ShuffleMapStage 0 (RDD at CarbonGlobalDictionaryRDD.scala:271) and ResultStage 1 (collect at GlobalDictionaryUtil.scala:746) due to fetch failure
17/09/19 09:33:50 INFO DAGScheduler: Resubmitting failed stages
17/09/19 09:33:50 INFO DAGScheduler: Submitting ShuffleMapStage 0 (CarbonBlockDistinctValuesCombineRDD[11] at RDD at CarbonGlobalDictionaryRDD.scala:271), which has no missing parents
17/09/19 09:33:50 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 15.0 KB, free 1291.9 KB)
17/09/19 09:33:50 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 7.1 KB, free 1299.0 KB)
17/09/19 09:33:50 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 10.9.22.15:58333 (size: 7.1 KB, free: 14.2 GB)
17/09/19 09:33:50 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1006
17/09/19 09:33:50 INFO DAGScheduler: Submitting 118 missing tasks from ShuffleMapStage 0 (CarbonBlockDistinctValuesCombineRDD[11] at RDD at CarbonGlobalDictionaryRDD.scala:271)

-------------------------------------------------------------------------------------------------------
Liu Feng
Technology Development Department (TDD)
Neusoft Corporation
A2-105A, Neusoft Park, No. 2 Xinxiu Street, Hunnan New District, Shenyang
Postcode: 110179  Mobile: 13889865456
---------------------------------------------------------------------------------------------------
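The insert statement that triggers this failing job is not shown in the message above; assuming a straight column mapping from the Hive source table, it would be something along the lines of:

cc.sql("insert into table kc22_ca select * from kc22_p1")

The stack trace points at CarbonGlobalDictionaryGenerateRDD and GlobalDictionaryUtil: the failure occurs in the global-dictionary generation job that CarbonData runs over the source data before writing, which is also the shuffle (shuffle 0) whose map output goes missing once executor 5 is lost.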
Hello,
I can't tell much from the logs, but the error seems related to a memory issue in Spark. From your earlier emails I gather that you are using a 3-node cluster. Do all 3 nodes run both a NodeManager and a DataNode? If so, it is better to use fewer executors and give each one more memory, as below. During data loading it is recommended to use one executor per NodeManager.

spark-shell --master yarn-client --driver-memory 10G --executor-cores 4 --num-executors 3 --executor-memory 25G

If this configuration still gives an error, please provide the executor log.

Thank you,
Ravindra.
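Since the executor was lost on a heartbeat timeout (135938 ms against the default 120000 ms), another knob worth trying alongside the executor sizing above is to raise Spark's network timeout and heartbeat interval, so long GC pauses under memory pressure do not get the executor killed. A sketch; the exact values are illustrative, not taken from this thread:

spark-shell --master yarn-client --driver-memory 10G --executor-cores 4 --num-executors 3 --executor-memory 25G \
  --conf spark.network.timeout=600s \
  --conf spark.executor.heartbeatInterval=60s

spark.network.timeout is the 120000 ms timeout the HeartbeatReceiver warning refers to, and spark.executor.heartbeatInterval should stay well below it.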
Sorry,
We have 4 nodes in total: 3 are DataNodes, and the Secondary NameNode runs on one of the DataNodes.

Versions: CarbonData 1.1.0, Spark 1.6.0, Hadoop 2.7.2

Thank you for your help; I am trying again.
=========================
Liu Feng
Hi Feng,
You can also refer to the links below, where Spark users have tried to resolve this issue by changing the configuration. This might help you.

https://stackoverflow.com/questions/28901123/why-do-spark-jobs-fail-with-org-apache-spark-shuffle-metadatafetchfailedexceptio
https://stackoverflow.com/questions/29850784/what-are-the-likely-causes-of-org-apache-spark-shuffle-metadatafetchfailedexcept

Regards,
Manish Gupta
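For reference, the configuration changes discussed in those threads mostly amount to giving each YARN container more off-heap headroom and making shuffle fetches more patient, since MetadataFetchFailedException usually means the executor holding the map output died (here, killed for missing heartbeats). A sketch for a Spark 1.6 on YARN setup like this one; the values are illustrative, not from this thread:

spark-shell --master yarn-client --driver-memory 10G --executor-cores 4 --num-executors 3 --executor-memory 20G \
  --conf spark.yarn.executor.memoryOverhead=4096 \
  --conf spark.shuffle.io.maxRetries=10 \
  --conf spark.shuffle.io.retryWait=30s

In Spark 1.6, spark.yarn.executor.memoryOverhead is given in MB and defaults to max(384, 10% of executor memory), which is often too little for heavy shuffles.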
Thank you,
I have resolved this issue by changing the Spark configuration and by using only two fields as DICTIONARY_INCLUDE (as sketched below). The 30 GB test data loaded 8 times, each load taking about 1.5 minutes to complete. I am currently testing a larger data set and hope it succeeds. Thank you very much for the help!
=========================
Liu Feng
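For readers following the thread, the slimmed-down DDL Feng describes would look roughly like the sketch below. The thread does not say which two columns were kept, so akb020 and aae072 are placeholders, and allColumns stands for the unchanged column list from the original CREATE TABLE:

// Paste the full column list from the original DDL here; abbreviated for the sketch.
val allColumns = "akb020 String,akc190 String,aae072 String /* ...remaining columns... */"
cc.sql(s"create table if not exists kc22_ca ($allColumns) STORED BY 'carbondata' " +
  "TBLPROPERTIES('DICTIONARY_INCLUDE'='akb020,aae072')")

With only two low-cardinality columns in DICTIONARY_INCLUDE, the global-dictionary shuffle that failed earlier becomes much smaller, which is consistent with the faster loads reported here.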