GitHub user geetikagupta16 opened a pull request:
https://github.com/apache/carbondata/pull/2199

[CARBONDATA-2370] Added document for presto multinode setup for carbondata

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/geetikagupta16/incubator-carbondata CARBONDATA-2370

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2199.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2199

----
commit 21160515ccecdf34b00d36182d2594a2e3467c28
Author: Geetika Gupta <geetika.gupta@...>
Date: 2018-04-20T11:17:35Z

    Added document for presto multinode setup for carbondata

----

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183027087

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
--- End diff --

Give a space after #

---
In reply to this post by qiuchenjian-2
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183027976

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
--- End diff --

Leave a space after #

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028111

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
--- End diff --

We can change it to Heading 2 (##) and change the heading to "Installing Presto"

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183027350

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
--- End diff --

We can make it heading 2 (##) and change the heading to "Installing Presto"

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028525

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
--- End diff --

Make it Heading 2 (##) and use title case.

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028214

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
--- End diff --

If these are steps, change the bulleted points to numbered points.

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028486

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
--- End diff --

All 'presto' instances can be changed to title case 'Presto'

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030674 --- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md --- @@ -0,0 +1,135 @@ +#Presto Multinode Cluster setup For Carbondata + +### Install Presto + + * Download the 0.187 version of presto using: + + ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz + `` + * Extract presto tar file + ``tar zxvf presto-server-0.187.tar.gz`` + + * Download the presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ### Create configuration Files + + * Create etc folder in presto-server-0.187 directory. + * Create config.properties, jvm.config, log.properties, and node.properties files. + * Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. 
+ +### Coordinator Configurations + + ##### Contents of your config.properties +``` +coordinator=true +node-scheduler.include-coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery-server.enabled=true +discovery.uri=<coordinator_ip>:8080 +``` +The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers. + +**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`. + +Also relation between below two configuration-properties should be like: +If, `query.max-memory-per-node=30GB` +Then, `query.max-memory=<30GB * number of nodes>` + +### Worker Configurations + +##### Contents of your config.properties + +``` +coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery.uri=<coordinator_ip>:8080 +``` + +**Note**: `jvm.config`, `node.properties` file is same for all the nodes (worker + coordinator). All the nodes should have different `node.id` --- End diff -- `jvm.config` and `node.properties` files are same for all the nodes (worker + coordinator). All the nodes should have different `node.id`. --- |
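The note quoted above says that `node.properties` has the same shape on every node while each node needs its own `node.id`. A minimal sketch of generating that file per node follows. The function name `write_node_properties` and its two arguments are hypothetical, and the fallback order (`uuidgen`, then the `uuid` tool the document installs, then the Linux kernel's random UUID file) is an assumption rather than something the pull request prescribes.

```shell
#!/usr/bin/env bash
# Sketch: write etc/node.properties with a freshly generated node.id.
# Assumption: any of uuidgen, uuid, or /proc/sys/kernel/random/uuid is available.
write_node_properties() {
  local etc_dir="$1"    # for example presto-server-0.187/etc
  local data_dir="$2"   # for example /home/ubuntu/data
  local node_id

  if command -v uuidgen >/dev/null 2>&1; then
    node_id="$(uuidgen)"
  elif command -v uuid >/dev/null 2>&1; then
    node_id="$(uuid)"
  else
    node_id="$(cat /proc/sys/kernel/random/uuid)"
  fi

  mkdir -p "$etc_dir"
  # The three keys are the ones listed in the document under review.
  cat > "$etc_dir/node.properties" <<EOF
node.environment=production
node.id=${node_id}
node.data-dir=${data_dir}
EOF
}
```

Running `write_node_properties presto-server-0.187/etc /home/ubuntu/data` once on each node would give every node a distinct `node.id` while keeping the other two keys identical.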
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183029300

--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
--- End diff --

This is a procedure, so change it to numbered points.

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030457 --- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md --- @@ -0,0 +1,135 @@ +#Presto Multinode Cluster setup For Carbondata + +### Install Presto + + * Download the 0.187 version of presto using: + + ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz + `` + * Extract presto tar file + ``tar zxvf presto-server-0.187.tar.gz`` + + * Download the presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ### Create configuration Files + + * Create etc folder in presto-server-0.187 directory. + * Create config.properties, jvm.config, log.properties, and node.properties files. + * Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. 
+ +### Coordinator Configurations + + ##### Contents of your config.properties +``` +coordinator=true +node-scheduler.include-coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery-server.enabled=true +discovery.uri=<coordinator_ip>:8080 +``` +The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers. + +**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`. + +Also relation between below two configuration-properties should be like: +If, `query.max-memory-per-node=30GB` +Then, `query.max-memory=<30GB * number of nodes>` + +### Worker Configurations --- End diff -- Heading 2 --- |
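The diff quoted above states two sizing rules: `query.max-memory-per-node` at half of the JVM max memory, and `query.max-memory` equal to the per-node value times the number of nodes. The sketch below only works through that arithmetic. The function name `presto_memory_settings` is hypothetical, sizes are assumed to be whole gigabytes, and the sample values in the diff itself (50GB and 2GB with `-Xmx16G`) do not follow the rule, so treat the output as a starting point rather than a recommendation from the pull request.

```shell
#!/usr/bin/env bash
# Sketch: derive the two query memory settings from the JVM heap and node count.
# Assumption: whole-gigabyte sizes; "nodes" is the node count you size the cluster for.
presto_memory_settings() {
  local heap_gb="$1"   # the -Xmx value from jvm.config, in GB
  local nodes="$2"     # number of nodes
  local per_node_gb=$(( heap_gb / 2 ))          # half of the JVM max memory
  local total_gb=$(( per_node_gb * nodes ))     # per-node value * number of nodes

  printf 'query.max-memory=%dGB\n' "$total_gb"
  printf 'query.max-memory-per-node=%dGB\n' "$per_node_gb"
}
```

For a 60 GB heap on four nodes this yields a per-node value of 30GB, matching the `30GB * number of nodes` example in the diff.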
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030962 --- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md --- @@ -0,0 +1,135 @@ +#Presto Multinode Cluster setup For Carbondata + +### Install Presto + + * Download the 0.187 version of presto using: + + ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz + `` + * Extract presto tar file + ``tar zxvf presto-server-0.187.tar.gz`` + + * Download the presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ### Create configuration Files + + * Create etc folder in presto-server-0.187 directory. + * Create config.properties, jvm.config, log.properties, and node.properties files. + * Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. 
+ +### Coordinator Configurations + + ##### Contents of your config.properties +``` +coordinator=true +node-scheduler.include-coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery-server.enabled=true +discovery.uri=<coordinator_ip>:8080 +``` +The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers. + +**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`. + +Also relation between below two configuration-properties should be like: +If, `query.max-memory-per-node=30GB` +Then, `query.max-memory=<30GB * number of nodes>` + +### Worker Configurations + +##### Contents of your config.properties + +``` +coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery.uri=<coordinator_ip>:8080 +``` + +**Note**: `jvm.config`, `node.properties` file is same for all the nodes (worker + coordinator). All the nodes should have different `node.id` + +### Catalog Configurations + +Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator. + +##### Configuring Carbondata in Presto +* Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes. + +### Add Plugins + +* Create a directory named `carbondata` in plugin directory of presto --- End diff -- Procedure so change to numbered step --- |
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030442 --- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md --- @@ -0,0 +1,135 @@ +#Presto Multinode Cluster setup For Carbondata + +### Install Presto + + * Download the 0.187 version of presto using: + + ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz + `` + * Extract presto tar file + ``tar zxvf presto-server-0.187.tar.gz`` + + * Download the presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ### Create configuration Files + + * Create etc folder in presto-server-0.187 directory. + * Create config.properties, jvm.config, log.properties, and node.properties files. + * Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. + +### Coordinator Configurations --- End diff -- Heading 2 --- |
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183031033 --- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md --- @@ -0,0 +1,135 @@ +#Presto Multinode Cluster setup For Carbondata + +### Install Presto + + * Download the 0.187 version of presto using: + + ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz + `` + * Extract presto tar file + ``tar zxvf presto-server-0.187.tar.gz`` + + * Download the presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ### Create configuration Files + + * Create etc folder in presto-server-0.187 directory. + * Create config.properties, jvm.config, log.properties, and node.properties files. + * Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. 
+ +### Coordinator Configurations + + ##### Contents of your config.properties +``` +coordinator=true +node-scheduler.include-coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery-server.enabled=true +discovery.uri=<coordinator_ip>:8080 +``` +The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers. + +**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`. + +Also relation between below two configuration-properties should be like: +If, `query.max-memory-per-node=30GB` +Then, `query.max-memory=<30GB * number of nodes>` + +### Worker Configurations + +##### Contents of your config.properties + +``` +coordinator=false +http-server.http.port=8080 +query.max-memory=50GB +query.max-memory-per-node=2GB +discovery.uri=<coordinator_ip>:8080 +``` + +**Note**: `jvm.config`, `node.properties` file is same for all the nodes (worker + coordinator). All the nodes should have different `node.id` + +### Catalog Configurations + +Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator. + +##### Configuring Carbondata in Presto +* Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes. + +### Add Plugins + +* Create a directory named `carbondata` in plugin directory of presto +* Copy `carbondata` jars to `plugin/carbondata` directory on all nodes + +### Start Presto Server on all nodes + +``` +./presto-server-0.187/bin/launcher start +``` +To run it as a background process. + +``` +./presto-server-0.187/bin/launcher run +``` +To run it in foreground. 
+ +### Start presto CLI +``` +./presto +``` +To connect to carbondata catalog use the following command: --- End diff -- To connect to carbondata catalog, use the following command: --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5238/ --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4059/ --- |
Github user geetikagupta16 commented on the issue:
https://github.com/apache/carbondata/pull/2199 @sgururajshetty I have made the required changes. Please review --- |
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4140/ --- |
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183304497 --- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md --- @@ -0,0 +1,133 @@ +# Presto Multinode Cluster setup For Carbondata + +## Installing Presto + + 1. Download the 0.187 version of Presto using: + `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz` + + 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz` + + 3. Download the Presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ## Create Configuration Files + + 1. Create `etc` folder in presto-server-0.187 directory. + 2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files. + 3. Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. 
+ +## Coordinator Configurations + + ##### Contents of your config.properties + ``` + coordinator=true + node-scheduler.include-coordinator=false + http-server.http.port=8080 + query.max-memory=50GB + query.max-memory-per-node=2GB + discovery-server.enabled=true + discovery.uri=<coordinator_ip>:8080 + ``` +The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers. + +**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`. + +Also relation between below two configuration-properties should be like: +If, `query.max-memory-per-node=30GB` +Then, `query.max-memory=<30GB * number of nodes>` + +## Worker Configurations + +##### Contents of your config.properties + + ``` + coordinator=false + http-server.http.port=8080 + query.max-memory=50GB + query.max-memory-per-node=2GB + discovery.uri=<coordinator_ip>:8080 + ``` + +**Note**: `jvm.config` and `node.properties` files are same for all the nodes (worker + coordinator). All the nodes should have different `node.id` + +## Catalog Configurations + +1. Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator. + +##### Configuring Carbondata in Presto +1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes. + +## Add Plugins + +1. Create a directory named `carbondata` in plugin directory of presto +2. Copy `carbondata` jars to `plugin/carbondata` directory on all nodes + +## Start Presto Server on all nodes + +``` +./presto-server-0.187/bin/launcher start +``` +To run it as a background process. + +``` +./presto-server-0.187/bin/launcher run +``` +To run it in foreground. 
+
+## Start Presto CLI
+```
+./presto
+```
+To connect to carbondata catalog use the following command:
+
+```
+./presto --server <coordinator_ip>:8080 --catalog carbondata --schema <schema_name>
+```
+Execute the following command to ensure the workers are connected
--- End diff --

Add a colon (:) at the end of the sentence.

---
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183304329 --- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md --- @@ -0,0 +1,133 @@ +# Presto Multinode Cluster setup For Carbondata + +## Installing Presto + + 1. Download the 0.187 version of Presto using: + `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz` + + 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz` + + 3. Download the Presto CLI for the coordinator and name it presto. + + ``` + wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar + + mv presto-cli-0.187-executable.jar presto + + chmod +x presto + ``` + + ## Create Configuration Files + + 1. Create `etc` folder in presto-server-0.187 directory. + 2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files. + 3. Install uuid to generate a node.id + + ``` + sudo apt-get install uuid + + uuid + ``` + + +##### Contents of your node.properties file + + ``` + node.environment=production + node.id=<generated uuid> + node.data-dir=/home/ubuntu/data + ``` + +##### Contents of your jvm.config file + + ``` + -server + -Xmx16G + -XX:+UseG1GC + -XX:G1HeapRegionSize=32M + -XX:+UseGCOverheadLimit + -XX:+ExplicitGCInvokesConcurrent + -XX:+HeapDumpOnOutOfMemoryError + -XX:OnOutOfMemoryError=kill -9 %p + ``` + +##### Contents of your log.properties file + ``` + com.facebook.presto=INFO + ``` + + The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`. 
+
+## Coordinator Configurations
+
+ ##### Contents of your config.properties
+ ```
+ coordinator=true
+ node-scheduler.include-coordinator=false
+ http-server.http.port=8080
+ query.max-memory=50GB
+ query.max-memory-per-node=2GB
+ discovery-server.enabled=true
+ discovery.uri=<coordinator_ip>:8080
+ ```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+## Worker Configurations
+
+##### Contents of your config.properties
+
+ ```
+ coordinator=false
+ http-server.http.port=8080
+ query.max-memory=50GB
+ query.max-memory-per-node=2GB
+ discovery.uri=<coordinator_ip>:8080
+ ```
+
+**Note**: `jvm.config` and `node.properties` files are same for all the nodes (worker + coordinator). All the nodes should have different `node.id`
+
+## Catalog Configurations
+
+1. Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator.
+
+##### Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+
+## Add Plugins
+
+1. Create a directory named `carbondata` in plugin directory of presto
--- End diff --

Add a period at the end of the sentence for both points. Check this for all sentences.

---
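The two "Add Plugins" steps in the reviewed document (create `plugin/carbondata`, then copy the CarbonData jars into it on every node) could be scripted roughly as below. This is a sketch only: `install_carbondata_plugin` and the `jar_dir` argument are hypothetical, since the thread does not say where the jars are built or stored.

```shell
#!/usr/bin/env bash
# Sketch: create the carbondata plugin directory and copy the jars into it.
# Assumption: jar_dir is a local directory that already holds the CarbonData jars.
install_carbondata_plugin() {
  local presto_home="$1"   # for example ./presto-server-0.187
  local jar_dir="$2"       # hypothetical location of the CarbonData jars
  local plugin_dir="$presto_home/plugin/carbondata"

  mkdir -p "$plugin_dir"              # step 1: create plugin/carbondata
  cp "$jar_dir"/*.jar "$plugin_dir"/  # step 2: copy the jars
}
```

The function would need to run on each node, for example `install_carbondata_plugin ./presto-server-0.187 /path/to/jars`, before the server is started.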