Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/837/
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1470/
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149289746

Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/LoadTableCommand.scala

```diff
@@ -84,6 +84,44 @@ case class LoadTableCommand(
     val carbonProperty: CarbonProperties = CarbonProperties.getInstance()
     carbonProperty.addProperty("zookeeper.enable.lock", "false")
+
+    val numCoresLoading =
+      try {
+        Integer.parseInt(CarbonProperties.getInstance()
+          .getProperty(CarbonCommonConstants.NUM_CORES_LOADING,
+            CarbonCommonConstants.NUM_CORES_MAX_VAL.toString()))
+      } catch {
+        case exc: NumberFormatException =>
+          LOGGER.error("Configured value for property " +
+            CarbonCommonConstants.NUM_CORES_LOADING + " is wrong.")
+          CarbonCommonConstants.NUM_CORES_MAX_VAL
+      }
+
+    val newNumCoresLoading =
+      if (sparkSession.sparkContext.conf.contains("spark.executor.cores")) {
+        // If running on YARN, take the minimum of 'spark.executor.cores'
+        // and NUM_CORES_LOADING:
+        // if the user set NUM_CORES_LOADING, it cannot exceed 'spark.executor.cores';
+        // if the user did not set NUM_CORES_LOADING, 'spark.executor.cores' is used,
+        // capped at NUM_CORES_MAX_VAL (NUM_CORES_LOADING defaults to NUM_CORES_MAX_VAL).
+        Math.min(
+          sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1),
+          numCoresLoading
+        )
+      } else {
+        // If running in local mode, take the minimum of
+        // NUM_CORES_DEFAULT_VAL and NUM_CORES_LOADING.
+        Math.min(
+          Integer.parseInt(CarbonCommonConstants.NUM_CORES_DEFAULT_VAL),
+          numCoresLoading
+        )
+      }
+
+    // update the property with the new value
+    carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING,
```

End diff (comment on the `carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING, ...)` line): Set `spark.task.cpus` also to numCoresLoading, so that Spark and YARN will do correct scheduling.
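For illustration, a minimal sketch of what that suggestion might look like inside LoadTableCommand (not code from the PR, and, as zzcclp points out below, `spark.task.cpus` only takes effect if set before the SparkContext is created):

```scala
// Hypothetical follow-up to the diff above: reserve the same number of
// CPUs per task that the data load will use, so Spark/YARN scheduling
// accounts for them. Assumes newNumCoresLoading from the diff is in scope.
sparkSession.sparkContext.conf
  .set("spark.task.cpus", newNumCoresLoading.toString)
```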
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149315844

(Commenting on the same diff hunk in LoadTableCommand.scala quoted above, at the `carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING, ...)` line.)

But it will affect the task scheduling of other jobs and reduce task parallelism.
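For concreteness (an illustration, not from the thread): Spark schedules roughly `spark.executor.cores / spark.task.cpus` concurrent tasks per executor, so raising `spark.task.cpus` from 1 to 4 on an 8-core executor would cut every job on that context from 8 concurrent tasks per executor down to 2.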
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149575324

(Same diff hunk and line as above.)

I find that the value of 'spark.task.cpus' cannot be modified dynamically; it is read from SparkConf only once, at initialization.
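For context, a small standalone sketch of that behavior (an illustration assuming the standard Spark 2.x API, not code from the PR): the scheduler captures `spark.task.cpus` when the context starts, so only values set before `SparkContext` creation are honored.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Honored: the scheduler reads spark.task.cpus once, when the context starts.
val conf = new SparkConf()
  .setAppName("task-cpus-demo")
  .setMaster("local[8]")
  .set("spark.task.cpus", "4")

val spark = SparkSession.builder().config(conf).getOrCreate()

// Not honored: getConf returns a copy of the configuration, and the running
// scheduler keeps the value it captured at initialization regardless.
spark.sparkContext.getConf.set("spark.task.cpus", "8")
```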
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149577349

(Commenting on the same diff hunk, at the `sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1),` line.)

Rather than taking the minimum of spark.executor.cores and NUM_CORES_LOADING, it is better to give **precedence** to one of them: if NUM_CORES_LOADING is set in carbon properties, take the value from there. Otherwise it might be confusing for the end user.
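A sketch of the precedence-based alternative being proposed (illustrative only, assuming `getProperty` returns null for an unset key and that `sparkSession` is in scope as in the diff):

```scala
// Illustration of the proposal, not code from the PR: an explicitly
// configured NUM_CORES_LOADING wins outright; only when it is absent
// do we fall back to spark.executor.cores.
val userValue = Option(CarbonProperties.getInstance()
  .getProperty(CarbonCommonConstants.NUM_CORES_LOADING))

val newNumCoresLoading = userValue match {
  case Some(v) => Integer.parseInt(v) // user setting takes precedence
  case None => sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1)
}
```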
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/865/
Github user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1455#discussion_r149583968

(Same diff hunk and line as above.)

The purpose of taking the minimum of spark.executor.cores and NUM_CORES_LOADING is to ensure that NUM_CORES_LOADING does not exceed 'spark.executor.cores'.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1486/
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/1455 Following an offline discussion with @jackylk and @sounakr, the logic for determining the value of 'carbon.number.of.cores.while.loading' has been modified: its default value is now the value of 'spark.executor.cores'. @jackylk @sounakr, please review it, thanks.
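A sketch of what that revised logic could look like (an assumption based on this comment, since the updated diff itself is not quoted in this thread; `sparkSession` is assumed in scope as in the original diff):

```scala
// Sketch: carbon.number.of.cores.while.loading defaults to
// spark.executor.cores when the user has not configured it.
val executorCores = sparkSession.sparkContext.conf.getInt("spark.executor.cores", 1)

val numCoresLoading =
  Option(CarbonProperties.getInstance()
    .getProperty(CarbonCommonConstants.NUM_CORES_LOADING)) match {
    case Some(value) =>
      try {
        Integer.parseInt(value)
      } catch {
        case _: NumberFormatException => executorCores // fall back on bad input
      }
    case None => executorCores // default: follow spark.executor.cores
  }

CarbonProperties.getInstance().addProperty(
  CarbonCommonConstants.NUM_CORES_LOADING, numCoresLoading.toString)
```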
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/889/
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1504/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1455 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/901/
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1455 SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1516/
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1455 LGTM, thanks for working on this
Github user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/1455 thanks @jackylk @sounakr