Hi guys,
I am using CarbonData 1.3 and Spark 2.2.1 in standalone mode, and I start the CarbonThriftServer like this:

  bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer $SPARK_HOME/carbonlib/carbondata_2.11-1.3.0-shade-hadoop2.7.2.jar hdfs://nameservice1/hive/carbon/store

I get this log:

  Downloading hdfs://nameservice1/hive/carbon/store to /tmp/tmp6465512979544197326/hive/carbon/store.

This downloads the whole carbon store to the tmp directory. If my carbon store is very large, this adds a lot of boot time and fills up my temporary directory; each start also creates a new temporary directory. Was it designed this way, or is my configuration wrong?
Hi dylan
I have verified your scenario in my setup, and it works fine without downloading the store to the local /tmp location. The command below was used to start the Thriftserver, and the carbon store is NOT copied to the /tmp location:

  bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /opt/sparkrelease/spark-2.2.1-bin-hadoop2.7/carbonlib/carbondata_2.11-1.3.0-shade-hadoop2.7.2.jar hdfs://master:9000/carbonstore

Can you please provide the details below to analyze the issue further?

1. spark-default.conf under <SPARK_HOME>/conf
2. driver logs (the console log when starting the Thriftserver)

Thanks
Babu
Hello babulal,
Thanks for your reply.

1. My spark-default.conf is:

  spark.executor.extraJavaOptions -Dcarbon.properties.filepath=/home/spark-2.2.1-bin-hadoop2.7/conf/carbon.properties
  spark.driver.extraJavaOptions -Dcarbon.properties.filepath=/home/spark-2.2.1-bin-hadoop2.7/conf/carbon.properties

2. Console log:

  18/03/13 19:12:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  18/03/13 19:12:51 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
  Downloading hdfs://nameservice1/hive/carbon/store to /tmp/tmp3188425816613265318/hive/carbon/store.

The download then runs for a long time, until all the data has been copied to the tmp directory.
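A note on the /tmp/tmp3188425816613265318 path in the log above: Spark creates it with java.nio.file.Files.createTempDirectory (see the downloadFile snippet in the next reply), which returns a fresh, uniquely named directory on every call, so each start lands in a new temporary directory. A minimal standalone Scala sketch of that JDK behavior:

  import java.nio.file.Files

  object TempDirSketch {
    def main(args: Array[String]): Unit = {
      // Each call creates a new, uniquely named directory under
      // java.io.tmpdir (usually /tmp), e.g. /tmp/tmp3188425816613265318.
      println(Files.createTempDirectory("tmp"))
      // A second call yields a different directory, never the same one.
      println(Files.createTempDirectory("tmp"))
    }
  }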
Hi dylan
As per your console log, this error appears when the spark-submit command is wrong in how it provides resources (jars/files). I tried the command below and got the same error as you (I passed the jar with the --jars option and gave the store location at the end, after a space):

  root@master /opt/sparkrelease/spark-2.2.1-bin-hadoop2.7 # bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer --jars $SPARK_HOME/carbonlib/carbondata_2.11-1.3.0-shade-hadoop2.7.2.jar hdfs://master:9000/carbonstore
  log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
  log4j:WARN Please initialize the log4j system properly.
  log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
  Downloading hdfs://master:9000/carbonstore to /tmp/tmp1358150251291982356/carbonstore.
  Exception in thread "main" java.io.IOException: Filesystem closed
          at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
          at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:868)
          at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)

Spark has the code below to do this resource localization, in core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (downloadFile):

  private[deploy] def downloadFile(path: String, hadoopConf: HadoopConfiguration): String = {
    require(path != null, "path cannot be null.")
    val uri = Utils.resolveURI(path)
    uri.getScheme match {
      case "file" | "local" => path
      case _ =>
        val fs = FileSystem.get(uri, hadoopConf)
        val tmpFile = new File(Files.createTempDirectory("tmp").toFile, uri.getPath)
        // scalastyle:off println
        printStream.println(s"Downloading ${uri.toString} to ${tmpFile.getAbsolutePath}.")
        // scalastyle:on println
        fs.copyToLocalFile(new Path(uri), new Path(tmpFile.getAbsolutePath))
        Utils.resolveURI(tmpFile.getAbsolutePath).toString
    }
  }

And this method is called only in the case below (client deploy mode):

  if (deployMode == CLIENT) {
    val hadoopConf = conf.getOrElse(new HadoopConfiguration())
    localPrimaryResource = Option(args.primaryResource).map(downloadFile(_, hadoopConf)).orNull
    localJars = Option(args.jars).map(downloadFileList(_, hadoopConf)).orNull
    localPyFiles = Option(args.pyFiles).map(downloadFileList(_, hadoopConf)).orNull
    localFiles = Option(args.files).map(downloadFileList(_, hadoopConf)).orNull
  }

So please check your command used to start the CarbonThriftServer, or send me the exact command.

Thanks
Babu
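In the wrong form above, --jars consumes the carbon jar, so spark-submit treats the HDFS store path as the primary resource and, in client mode, localizes it through downloadFile. A minimal side-by-side sketch of the two invocations, reusing the jar name and store path from this thread (adjust the paths for your own setup):

  # Wrong: the jar goes to --jars, so hdfs://.../carbonstore becomes the
  # primary resource and is downloaded to a fresh /tmp directory at startup.
  bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
    --jars $SPARK_HOME/carbonlib/carbondata_2.11-1.3.0-shade-hadoop2.7.2.jar \
    hdfs://master:9000/carbonstore

  # Right: the jar itself is the primary resource and the store path is just an
  # application argument to CarbonThriftServer, so nothing is copied to /tmp.
  bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
    $SPARK_HOME/carbonlib/carbondata_2.11-1.3.0-shade-hadoop2.7.2.jar \
    hdfs://master:9000/carbonstore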
Hello babulal,
I found the problem: I used the wrong command to start spark-submit, with --jars. Thank you very much for your answer; it solved my problem. Thanks!