nihal0107 opened a new pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737

### Why is this PR needed?
Correct spelling, queries, and default values in the performance-tuning, prestodb and prestosql documentation.

### What changes were proposed in this PR?
Corrected spelling, queries, and default values in the performance-tuning, prestodb and prestosql documentation.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email]
ajantha-bhat commented on pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#issuecomment-623312095

@nihal0107 : I have pushed the presto changes myself in their PR, rebase and handle other comments
ajantha-bhat edited a comment on pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#issuecomment-623312095

@nihal0107 : I have pushed the presto changes myself in the PR, rebase and handle other comments
CarbonDataQA1 commented on pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#issuecomment-623326751
chetandb commented on a change in pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#discussion_r419339008

##########
File path: docs/performance-tuning.md
##########
@@ -128,6 +128,9 @@
 **NOTE:**
 + BloomFilter can be created to enhance performance for queries with precise equal/in conditions. You can find more information about it in BloomFilter index [document](./index/bloomfilter-index-guide.md).
+ + Lucine index can be created on string columns which has content of more length to enhance the query performance. You can find more information about it in Lucene index [document](./index/lucene-index-guide.md).

Review comment:
Change "Lucine" spelling to "Lucene"

##########
File path: docs/performance-tuning.md
##########
@@ -141,12 +144,12 @@
 | Parameter | Default Value | Description/Tuning |
 |-----------|-------------|--------|
 |carbon.number.of.cores.while.loading|Default: 2. This value should be >= 2|Specifies the number of cores used for data processing during data loading in CarbonData. |
-|carbon.sort.size|Default: 100000. The value should be >= 100.|Threshold to write local file in sort step when loading data|
-|carbon.sort.file.write.buffer.size|Default: 16384.|CarbonData sorts and writes data to intermediate files to limit the memory usage. This configuration determines the buffer size to be used for reading and writing such files. |
+|carbon.sort.size|Default: 100000. The value should be >= 1000.|Threshold to write local file in sort step when loading data|
+|carbon.sort.file.write.buffer.size|Default: 16384. The value should be >= 10240 and <= 10485760.|CarbonData sorts and writes data to intermediate files to limit the memory usage. This configuration determines the buffer size to be used for reading and writing such files. |
 |carbon.merge.sort.reader.thread|Default: 3 |Specifies the number of cores used for temp file merging during data loading in CarbonData.|
 |carbon.merge.sort.prefetch|Default: true | You may want set this value to false if you have not enough memory|
- For example, if there are 10 million records, and i have only 16 cores, 64GB memory, will be loaded to CarbonData table.
+ For example, if there are 10 million records, and I have only 16 cores, 64 GB memory, will be loaded to CarbonData table.

Review comment:
Remove comma in 64 GB memory,. Change to "For example, if there are 10 million records, and I have only 16 cores, 64 GB memory will be loaded to CarbonData table."

##########
File path: docs/prestodb-guide.md
##########
@@ -139,11 +139,14 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.

Review comment:
" As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector." can be changed to " As carbondata connector extends hive connector all the configurations(including S3) is same as hive connector."

##########
File path: docs/prestosql-guide.md
##########
@@ -139,11 +139,15 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.

Review comment:
"As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector." can be changed to "As carbondata connector extends hive connector all the configurations(including S3) is same as hive connector."

##########
File path: docs/prestodb-guide.md
##########
@@ -139,11 +139,14 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.
+Just replace the connector name in hive configuration and make a copy of it to carbondata.properties

Review comment:
"make a copy of it to carbondata.properties" can be changed to "copy same to carbondata.properties"

##########
File path: docs/prestosql-guide.md
##########
@@ -139,11 +139,15 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.
+Just replace the connector name in hive configuration and make a copy of it to carbondata.properties
+`connector.name = carbondata`

Review comment:
"make a copy of it to carbondata.properties" can be changed to "copy same to carbondata.properties"
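The documentation step under review ("replace the connector name in hive configuration and copy same to carbondata.properties") can be illustrated with a small sketch. This is only an illustration under stated assumptions: the helper name `to_carbondata_properties` is hypothetical, the sample Hive catalog contents (including the host name) are made up, and the code works on text in memory rather than on real catalog files.

```python
def to_carbondata_properties(hive_properties_text: str) -> str:
    """Copy Hive catalog properties, switching connector.name to carbondata.

    Hypothetical helper for illustration; not part of CarbonData or Presto.
    """
    out = []
    replaced = False
    for line in hive_properties_text.splitlines():
        stripped = line.strip()
        # Comments and blank lines are copied through unchanged.
        if stripped and not stripped.startswith(("#", "!")):
            key = stripped.split("=", 1)[0].strip()
            if key == "connector.name":
                line = "connector.name=carbondata"
                replaced = True
        out.append(line)
    if not replaced:
        # The source file had no connector.name entry, so add one.
        out.insert(0, "connector.name=carbondata")
    return "\n".join(out) + "\n"


# Made-up Hive catalog contents, for illustration only.
hive_text = (
    "connector.name=hive-hadoop2\n"
    "hive.metastore.uri=thrift://example-host:9083\n"
)
carbon_text = to_carbondata_properties(hive_text)
```

All other entries (for example S3 settings) are carried over untouched, which matches the statement in the proposed text that the configurations are the same as for the hive connector.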
nihal0107 commented on a change in pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#discussion_r419375350

##########
File path: docs/prestosql-guide.md
##########
@@ -139,11 +139,15 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.
+Just replace the connector name in hive configuration and make a copy of it to carbondata.properties
+`connector.name = carbondata`

Review comment:
done
nihal0107 commented on a change in pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#discussion_r419375435

##########
File path: docs/prestosql-guide.md
##########
@@ -139,11 +139,15 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.

Review comment:
done
nihal0107 commented on a change in pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#discussion_r419375561

##########
File path: docs/performance-tuning.md
##########
@@ -128,6 +128,9 @@
 **NOTE:**
 + BloomFilter can be created to enhance performance for queries with precise equal/in conditions. You can find more information about it in BloomFilter index [document](./index/bloomfilter-index-guide.md).
+ + Lucine index can be created on string columns which has content of more length to enhance the query performance. You can find more information about it in Lucene index [document](./index/lucene-index-guide.md).

Review comment:
done
nihal0107 commented on a change in pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#discussion_r419375968

##########
File path: docs/performance-tuning.md
##########
@@ -141,12 +144,12 @@
 | Parameter | Default Value | Description/Tuning |
 |-----------|-------------|--------|
 |carbon.number.of.cores.while.loading|Default: 2. This value should be >= 2|Specifies the number of cores used for data processing during data loading in CarbonData. |
-|carbon.sort.size|Default: 100000. The value should be >= 100.|Threshold to write local file in sort step when loading data|
-|carbon.sort.file.write.buffer.size|Default: 16384.|CarbonData sorts and writes data to intermediate files to limit the memory usage. This configuration determines the buffer size to be used for reading and writing such files. |
+|carbon.sort.size|Default: 100000. The value should be >= 1000.|Threshold to write local file in sort step when loading data|
+|carbon.sort.file.write.buffer.size|Default: 16384. The value should be >= 10240 and <= 10485760.|CarbonData sorts and writes data to intermediate files to limit the memory usage. This configuration determines the buffer size to be used for reading and writing such files. |
 |carbon.merge.sort.reader.thread|Default: 3 |Specifies the number of cores used for temp file merging during data loading in CarbonData.|
 |carbon.merge.sort.prefetch|Default: true | You may want set this value to false if you have not enough memory|
- For example, if there are 10 million records, and i have only 16 cores, 64GB memory, will be loaded to CarbonData table.
+ For example, if there are 10 million records, and I have only 16 cores, 64 GB memory, will be loaded to CarbonData table.

Review comment:
done
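The value ranges stated in the proposed table rows can be expressed as a quick sanity check. The sketch below is an assumption-laden illustration, not how CarbonData itself validates settings: the dictionary `DOCUMENTED_RANGES` and the function `out_of_range` are hypothetical names, and the bounds are copied from the "+" rows quoted above.

```python
# Bounds as stated in the proposed documentation table.
# None means the table gives no bound on that side.
DOCUMENTED_RANGES = {
    "carbon.number.of.cores.while.loading": (2, None),
    "carbon.sort.size": (1000, None),
    "carbon.sort.file.write.buffer.size": (10240, 10485760),
}


def out_of_range(settings):
    """Return the keys whose values fall outside the documented bounds.

    Hypothetical helper; keys without a documented range are ignored.
    """
    problems = []
    for key, value in settings.items():
        if key not in DOCUMENTED_RANGES:
            continue
        low, high = DOCUMENTED_RANGES[key]
        if value < low or (high is not None and value > high):
            problems.append(key)
    return problems


# 100 was the old documented minimum for carbon.sort.size; under the
# corrected table it is below the lower bound of 1000.
flagged = out_of_range({
    "carbon.sort.size": 100,
    "carbon.sort.file.write.buffer.size": 16384,
})
```

With the corrected lower bound, a `carbon.sort.size` of 100 is flagged, while the default buffer size of 16384 stays within the documented range.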
nihal0107 commented on a change in pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#discussion_r419376358

##########
File path: docs/prestodb-guide.md
##########
@@ -139,11 +139,14 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.

Review comment:
done

##########
File path: docs/prestodb-guide.md
##########
@@ -139,11 +139,14 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ##### Configuring Carbondata in Presto
 1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+2. As carbondata connector extends, hive connector. All the configurations(including S3) is same as hive connector.
+Just replace the connector name in hive configuration and make a copy of it to carbondata.properties

Review comment:
done
CarbonDataQA1 commented on pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#issuecomment-623478677

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2932/
CarbonDataQA1 commented on pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#issuecomment-623480668

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1214/
ajantha-bhat commented on pull request #3737:
URL: https://github.com/apache/carbondata/pull/3737#issuecomment-624082714

LGTM