[GitHub] [carbondata] VenuReddy2103 opened a new pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration


GitBox

VenuReddy2103 opened a new pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744


### Why is this PR needed?
Updated configuration-parameters.md and removed unused configuration.

### What changes were proposed in this PR?
Updated configuration-parameters.md and removed unused configuration.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox

CarbonDataQA1 commented on pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#issuecomment-623643880


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2935/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#issuecomment-623646523


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1217/
   



[GitHub] [carbondata] kunal642 commented on a change in pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#discussion_r419880943



##########
File path: docs/configuration-parameters.md
##########
@@ -31,7 +31,7 @@ This section provides the details of all the configurations required for the Car
 
 | Property | Default Value | Description |
 |----------------------------|-------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| carbon.storelocation | spark.sql.warehouse.dir property value | Location where CarbonData will create the store, and write the data in its custom format. If not specified,the path defaults to spark.sql.warehouse.dir property. **NOTE:** Store location should be in HDFS or S3. |
+| carbon.storelocation | spark.sql.warehouse.dir property value | Location where CarbonData will create the store, and write the data in its custom format. If not specified,the path defaults to spark.sql.warehouse.dir property. **NOTE:** Store location should be in one of the carbon supported filesystems. Like HDFS or S3. |

Review comment:
       Please add a line stating that using this property is not recommended.
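
For readers following the thread, the fallback described in the `carbon.storelocation` row above can be sketched as follows. This is only an illustration under stated assumptions: `resolve_store_location` and the plain dictionary of properties are hypothetical and are not CarbonData APIs; only the two property names come from the diff.

```python
from typing import Mapping, Optional


def resolve_store_location(props: Mapping[str, str]) -> Optional[str]:
    """Return the store path implied by the documented fallback.

    Hypothetical helper, not part of CarbonData: it only mirrors the
    documented rule that carbon.storelocation, when unset, defaults to
    the value of spark.sql.warehouse.dir.
    """
    explicit = props.get("carbon.storelocation")
    if explicit:
        # An explicitly configured store location wins.
        return explicit
    # Otherwise fall back to the Spark warehouse directory (may be None).
    return props.get("spark.sql.warehouse.dir")
```

Because the fallback already yields a usable location, leaving `carbon.storelocation` unset is consistent with the reviewer's request to discourage the property.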





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#discussion_r419885797



##########
File path: docs/configuration-parameters.md
##########
@@ -63,7 +63,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.number.of.cores.while.loading | 2 | Number of cores to be used while loading data. This also determines the number of threads to be used to read the input files (csv) in parallel.**NOTE:** This configured value is used in every data loading step to parallelize the operations. Configuring a higher value can lead to increased early thread pre-emption by OS and there by reduce the overall performance. |
 | enable.unsafe.sort | true | CarbonData supports unsafe operations of Java to avoid GC overhead for certain operations. This configuration enables to use unsafe functions in CarbonData. **NOTE:** For operations like data loading, which generates more short lived Java objects, Java GC can be a bottle neck. Using unsafe can overcome the GC overhead and improve the overall performance. |
 | enable.offheap.sort | true | CarbonData supports storing data in off-heap memory for certain operations during data loading and query. This helps to avoid the Java GC and thereby improve the overall performance. This configuration enables using off-heap memory for sorting of data during data loading.**NOTE:**  ***enable.unsafe.sort*** configuration needs to be configured to true for using off-heap |
-| carbon.load.sort.scope | LOCAL_SORT | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT:All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. GLOBAL SORT:Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE 1:** This property will be taken into account only when SORT COLUMNS are specified explicitly while creating table, otherwise it is always NO SORT |
+| carbon.load.sort.scope | NO_SORT [If sort columns are not specified while creating table] and LOCAL_SORT [If sort columns are specified] | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT:All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. GLOBAL SORT:Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE 1:** This property will be taken into account only when SORT COLUMNS are specified explicitly while creating table, otherwise it is always NO SORT |

Review comment:
```suggestion
| carbon.load.sort.scope | NO_SORT [If sort columns are not specified while creating table] and LOCAL_SORT [If sort columns are specified] | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT: All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. GLOBAL SORT: Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE 1:** This property will be taken into account only when SORT COLUMNS are specified explicitly while creating table, otherwise it is always NO SORT |
```
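
The conditional default in the row above may be easier to follow as a small decision function. The sketch below is an assumption-laden illustration, not CarbonData code: `effective_sort_scope` and its parameters are hypothetical names, and the logic only restates what the table row says (NO_SORT without sort columns, LOCAL_SORT as the default when sort columns exist, and the configured value honoured only in the latter case).

```python
from typing import Optional, Sequence


def effective_sort_scope(
    sort_columns: Sequence[str],
    configured_scope: Optional[str] = None,
) -> str:
    """Return the sort scope implied by the documented rule.

    Hypothetical helper, not a CarbonData API.
    """
    if not sort_columns:
        # Per NOTE 1: without explicit sort columns the property is
        # ignored and the load is always NO_SORT.
        return "NO_SORT"
    if configured_scope is None:
        # Sort columns given, property unset: documented default.
        return "LOCAL_SORT"
    # Sort columns given and property set: the configured value applies.
    return configured_scope
```

For example, a table created with sort columns and `carbon.load.sort.scope` set to `GLOBAL_SORT` would resolve to `GLOBAL_SORT`, while the same setting on a table without sort columns would still resolve to `NO_SORT`.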





[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#discussion_r420177321



##########
File path: docs/configuration-parameters.md
##########
@@ -31,7 +31,7 @@ This section provides the details of all the configurations required for the Car
 
 | Property | Default Value | Description |
 |----------------------------|-------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| carbon.storelocation | spark.sql.warehouse.dir property value | Location where CarbonData will create the store, and write the data in its custom format. If not specified,the path defaults to spark.sql.warehouse.dir property. **NOTE:** Store location should be in HDFS or S3. |
+| carbon.storelocation | spark.sql.warehouse.dir property value | Location where CarbonData will create the store, and write the data in its custom format. If not specified,the path defaults to spark.sql.warehouse.dir property. **NOTE:** Store location should be in one of the carbon supported filesystems. Like HDFS or S3. |

Review comment:
       Added





[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#discussion_r420177467



##########
File path: docs/configuration-parameters.md
##########
@@ -63,7 +63,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.number.of.cores.while.loading | 2 | Number of cores to be used while loading data. This also determines the number of threads to be used to read the input files (csv) in parallel.**NOTE:** This configured value is used in every data loading step to parallelize the operations. Configuring a higher value can lead to increased early thread pre-emption by OS and there by reduce the overall performance. |
 | enable.unsafe.sort | true | CarbonData supports unsafe operations of Java to avoid GC overhead for certain operations. This configuration enables to use unsafe functions in CarbonData. **NOTE:** For operations like data loading, which generates more short lived Java objects, Java GC can be a bottle neck. Using unsafe can overcome the GC overhead and improve the overall performance. |
 | enable.offheap.sort | true | CarbonData supports storing data in off-heap memory for certain operations during data loading and query. This helps to avoid the Java GC and thereby improve the overall performance. This configuration enables using off-heap memory for sorting of data during data loading.**NOTE:**  ***enable.unsafe.sort*** configuration needs to be configured to true for using off-heap |
-| carbon.load.sort.scope | LOCAL_SORT | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT:All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. GLOBAL SORT:Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE 1:** This property will be taken into account only when SORT COLUMNS are specified explicitly while creating table, otherwise it is always NO SORT |
+| carbon.load.sort.scope | NO_SORT [If sort columns are not specified while creating table] and LOCAL_SORT [If sort columns are specified] | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT:All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. GLOBAL SORT:Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE 1:** This property will be taken into account only when SORT COLUMNS are specified explicitly while creating table, otherwise it is always NO SORT |

Review comment:
       Modified





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#issuecomment-624188986


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2943/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#issuecomment-624190222


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1225/
   



[GitHub] [carbondata] Indhumathi27 commented on pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#issuecomment-624494652


   LGTM



[GitHub] [carbondata] kunal642 commented on pull request #3744: [CARBONDATA-3791] Updated configuration-parameters.md and removed unused configuration

GitBox
In reply to this post by GitBox

kunal642 commented on pull request #3744:
URL: https://github.com/apache/carbondata/pull/3744#issuecomment-624595276


   LGTM

