Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/837/
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1470/
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149289746

Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/LoadTableCommand.scala

```diff
@@ -84,6 +84,44 @@ case class LoadTableCommand(
     val carbonProperty: CarbonProperties = CarbonProperties.getInstance()
     carbonProperty.addProperty("zookeeper.enable.lock", "false")
+
+    val numCoresLoading =
+      try {
+        Integer.parseInt(CarbonProperties.getInstance()
+          .getProperty(CarbonCommonConstants.NUM_CORES_LOADING,
+            CarbonCommonConstants.NUM_CORES_MAX_VAL.toString()))
+      } catch {
+        case exc: NumberFormatException =>
+          LOGGER.error("Configured value for property " +
+            CarbonCommonConstants.NUM_CORES_LOADING + " is wrong.")
+          CarbonCommonConstants.NUM_CORES_MAX_VAL
+      }
+
+    val newNumCoresLoading =
+      if (sparkSession.sparkContext.conf.contains("spark.executor.cores")) {
+        // If running on YARN, take the minimum of 'spark.executor.cores'
+        // and NUM_CORES_LOADING:
+        // if the user set NUM_CORES_LOADING, it cannot exceed 'spark.executor.cores';
+        // if the user did not set NUM_CORES_LOADING, 'spark.executor.cores' is used,
+        // capped at NUM_CORES_MAX_VAL (NUM_CORES_LOADING defaults to NUM_CORES_MAX_VAL).
+        Math.min(
+          sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1),
+          numCoresLoading
+        )
+      } else {
+        // If running in local mode, take the minimum of
+        // NUM_CORES_DEFAULT_VAL and NUM_CORES_LOADING.
+        Math.min(
+          Integer.parseInt(CarbonCommonConstants.NUM_CORES_DEFAULT_VAL),
+          numCoresLoading
+        )
+      }
+
+    // update the property with the new value
+    carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING,
```

End diff (comment on the `carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING, ...)` line): Set `spark.task.cpus` also to numCoresLoading, so that Spark and YARN will do correct scheduling.
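For illustration, a minimal sketch of what that suggestion might look like inside LoadTableCommand (not code from the PR, and, as zzcclp points out below, `spark.task.cpus` only takes effect if set before the SparkContext is created):

```scala
// Hypothetical follow-up to the diff above: reserve the same number of
// CPUs per task that the data load will use, so Spark/YARN scheduling
// accounts for them. Assumes newNumCoresLoading from the diff is in scope.
sparkSession.sparkContext.conf
  .set("spark.task.cpus", newNumCoresLoading.toString)
```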
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149315844

(Commenting on the same diff hunk in LoadTableCommand.scala quoted above, at the `carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING, ...)` line.)

But it will affect the task scheduling of other jobs and reduce task parallelism.
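For concreteness (an illustration, not from the thread): Spark schedules roughly `spark.executor.cores / spark.task.cpus` concurrent tasks per executor, so raising `spark.task.cpus` from 1 to 4 on an 8-core executor would cut every job on that context from 8 concurrent tasks per executor down to 2.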
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149575324

(Same diff hunk and line as above.)

I find that the value of 'spark.task.cpus' cannot be modified dynamically; it is read from SparkConf only once, at initialization.
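For context, a small standalone sketch of that behavior (an illustration assuming the standard Spark 2.x API, not code from the PR): the scheduler captures `spark.task.cpus` when the context starts, so only values set before `SparkContext` creation are honored.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Honored: the scheduler reads spark.task.cpus once, when the context starts.
val conf = new SparkConf()
  .setAppName("task-cpus-demo")
  .setMaster("local[8]")
  .set("spark.task.cpus", "4")

val spark = SparkSession.builder().config(conf).getOrCreate()

// Not honored: getConf returns a copy of the configuration, and the running
// scheduler keeps the value it captured at initialization regardless.
spark.sparkContext.getConf.set("spark.task.cpus", "8")
```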
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149577349

(Commenting on the same diff hunk, at the `sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1),` line.)

Rather than taking the minimum of spark.executor.cores and NUM_CORES_LOADING, it is better to give **precedence** to one of them: if NUM_CORES_LOADING is set in carbon properties, take the value from there. Otherwise it might be confusing for the end user.
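A sketch of the precedence-based alternative being proposed (illustrative only, assuming `getProperty` returns null for an unset key and that `sparkSession` is in scope as in the diff):

```scala
// Illustration of the proposal, not code from the PR: an explicitly
// configured NUM_CORES_LOADING wins outright; only when it is absent
// do we fall back to spark.executor.cores.
val userValue = Option(CarbonProperties.getInstance()
  .getProperty(CarbonCommonConstants.NUM_CORES_LOADING))

val newNumCoresLoading = userValue match {
  case Some(v) => Integer.parseInt(v) // user setting takes precedence
  case None => sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1)
}
```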
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/865/
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149583968

(Same diff hunk and line as above.)

The purpose of taking the minimum of spark.executor.cores and NUM_CORES_LOADING is to ensure that NUM_CORES_LOADING does not exceed 'spark.executor.cores'.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1486/
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/1455 Following an offline discussion with @jackylk and @sounakr, the logic for determining the value of 'carbon.number.of.cores.while.loading' has been modified: its default value is now the value of 'spark.executor.cores'. @jackylk @sounakr, please review it, thanks.
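A sketch of what that revised logic could look like (an assumption based on this comment, since the updated diff itself is not quoted in this thread; `sparkSession` is assumed in scope as in the original diff):

```scala
// Sketch: carbon.number.of.cores.while.loading defaults to
// spark.executor.cores when the user has not configured it.
val executorCores = sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1)

val numCoresLoading =
  Option(CarbonProperties.getInstance()
    .getProperty(CarbonCommonConstants.NUM_CORES_LOADING)) match {
    case Some(value) =>
      try {
        Integer.parseInt(value)
      } catch {
        case _: NumberFormatException => executorCores // fall back on bad input
      }
    case None => executorCores // default: follow spark.executor.cores
  }

CarbonProperties.getInstance().addProperty(
  CarbonCommonConstants.NUM_CORES_LOADING, numCoresLoading.toString)
```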
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/889/
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1504/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/901/
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1516/
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1455 LGTM, thanks for working on this
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/1455 thanks @jackylk @sounakr