[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
GitHub user sraghunandan opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/611

    [CARBONDATA-731] Enhance and correct quick start and installation guides

    Corrected the steps in the quick start and installation guides. Synchronized the variable names to make them consistent.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sraghunandan/incubator-carbondata correct_installation_start_guide

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #611
   
----
commit 0e59c3bd1548e7e82dd168fc82e862d2de032258
Author: sraghunandan <[hidden email]>
Date:   2017-02-27T02:31:10Z

    corrected the steps in quick start and installation guides.Synchronized the variable names to make it consistent

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #611: [CARBONDATA-731] Enhance and correct quick ...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/611
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/969/




[GitHub] incubator-carbondata issue #611: [CARBONDATA-731] Enhance and correct quick ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sgururajshetty commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/611
 
    LGTM. The files have also been formatted and reviewed for better readability.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103343819
 
    --- Diff: docs/installation-guide.md ---
    @@ -40,42 +40,46 @@ followed by :
     
     ### Procedure
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar`.
    --- End diff --
   
    Please use "https://github.com/apache/incubator-carbondata/blob/master/build/README.md" to replace "https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration".



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103344126
 
    --- Diff: docs/installation-guide.md ---
    @@ -40,42 +40,46 @@ followed by :
     
     ### Procedure
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar`.
    --- End diff --
   
    Suggest using "./assembly/target/scala-2.xx/carbondata_xxx.jar", because we also need to cover Spark 1.x scenarios.



[GitHub] incubator-carbondata issue #611: [CARBONDATA-731] Enhance and correct quick ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/611
 
    @Hexiaoqiao can you review this PR?



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103390387
 
    --- Diff: docs/installation-guide.md ---
    @@ -40,42 +40,46 @@ followed by :
     
     ### Procedure
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar`.
    --- End diff --
   
    +1, the wiki link is no longer valid.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103394434
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    --- End diff --
   
    1. Please cite `https://github.com/apache/incubator-carbondata/blob/master/build/README.md` since the wiki link is no longer valid.
    2. Suggest changing `./assembly/target/scala-2.1x/carbondata_xxx.jar` to `./assembly/target/scala-2.xx/carbondata_xxx.jar` as @chenliang613 mentioned above.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103394813
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    +
    +    **NOTE**: Create the carbonlib folder if it does not exists inside `<SPARK_HOME>` path.
     
    -      NOTE: Create the carbonlib folder if it does not exists inside ``"<SPARK_HOME>"`` path.
    +2. Copy the `./processing/carbonplugins` folder from CarbonData repository to `<SPARK_HOME>/carbonlib/` folder.
     
    -* Copy "carbonplugins" folder to ``"<SPARK_HOME>/carbonlib"`` folder from "./processing/" folder of CarbonData repository.
    -      carbonplugins will contain .kettle folder.
    +    **NOTE**: carbonplugins will contain .kettle folder.
     
    -* Copy the "carbon.properties.template" to ``"<SPARK_HOME>/conf/carbon.properties"`` folder from conf folder of CarbonData repository.
    -* Modify the parameters in "spark-default.conf" located in the ``"<SPARK_HOME>/conf``"
    +3. Copy the `./conf/carbon.properties.template` file from CarbonData repository to `<SPARK_HOME>/conf/` folder and rename the file to `carbon.properties`.
    +
    +4. Create `tar,gz` file of carbonlib folder and move it inside the carbonlib folder.
    --- End diff --
   
    Why create a `tar.gz`? Is this a necessary step?



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103396441
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    +
    +    **NOTE**: Create the carbonlib folder if it does not exists inside `<SPARK_HOME>` path.
     
    -      NOTE: Create the carbonlib folder if it does not exists inside ``"<SPARK_HOME>"`` path.
    +2. Copy the `./processing/carbonplugins` folder from CarbonData repository to `<SPARK_HOME>/carbonlib/` folder.
     
    -* Copy "carbonplugins" folder to ``"<SPARK_HOME>/carbonlib"`` folder from "./processing/" folder of CarbonData repository.
    -      carbonplugins will contain .kettle folder.
    +    **NOTE**: carbonplugins will contain .kettle folder.
     
    -* Copy the "carbon.properties.template" to ``"<SPARK_HOME>/conf/carbon.properties"`` folder from conf folder of CarbonData repository.
    -* Modify the parameters in "spark-default.conf" located in the ``"<SPARK_HOME>/conf``"
    +3. Copy the `./conf/carbon.properties.template` file from CarbonData repository to `<SPARK_HOME>/conf/` folder and rename the file to `carbon.properties`.
    +
    +4. Create `tar,gz` file of carbonlib folder and move it inside the carbonlib folder.
    +
    +```
    + cd <SPARK_HOME>
    + tar -zcvf carbondata.tar.gz carbonlib/
    + mv carbondata.tar.gz carbonlib/
    +```
    +
    +5. Configure the properties mentioned in the following table in `<SPARK_HOME>/conf/spark-defaults.conf` file.
     
     | Property | Description | Value |
     |---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
    -| spark.master | Set this value to run the Spark in yarn cluster mode. | Set "yarn-client" to run the Spark in yarn cluster mode. |
    -| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbondata_xxx.jar`` |
    -| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  NOTE: You can enter multiple values separated by space. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| carbon.kettle.home | Path that will be used by CarbonData internally to create graph for loading the data. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonplugins`` |
    +| spark.master | Set this value to run the Spark in yarn cluster mode. | Set yarn-client to run the Spark in yarn cluster mode. |
    +| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |"<SPARK_HOME>"/conf/carbon.properties |
    +| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |"<SPARK_HOME>"/carbonlib/carbondata.tar.gz |
    +| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  **NOTE**: You can enter multiple values separated by space. |-Dcarbon.properties.filepath=carbon.properties |
    +| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |carbondata.tar.gz/carbonlib/* |
    +| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |"<SPARK_HOME>"/carbonlib/carbonlib/* |
    +| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |-Dcarbon.properties.filepath="<SPARK_HOME>"/conf/carbon.properties |
    +
     
    -* Add the following properties in ``<SPARK_HOME>/conf/ carbon.properties``:
    +6. Add the following properties in `<SPARK_HOME>/conf/carbon.properties`:
     
     | Property | Required | Description | Example | Default Value |
     |----------------------|----------|----------------------------------------------------------------------------------------|-------------------------------------|---------------|
     | carbon.storelocation | NO | Location where CarbonData will create the store and write the data in its own format. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | Propose to set HDFS directory|
    -| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | $SPARK_HOME/carbonlib/carbonplugins |  |
    +| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | carbondata.tar.gz/carbonlib/carbonplugins |  |
     
     
    -* Verify the installation.
    +7. Verify the installation.
     
     ```
          ./bin/spark-shell --master yarn-client --driver-memory 1g
          --executor-cores 2 --executor-memory 2G
     ```
    -  NOTE: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
    +  **NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
     
       Getting started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Operations on CarbonData](ddl-operation-on-carbondata.md)
     
     ## Query Execution Using CarbonData Thrift Server
     
    -### Starting CarbonData Thrift Server
    +### Starting CarbonData Thrift Server.
     
    -   a. cd ``<SPARK_HOME>``
    +   a. cd `<SPARK_HOME>`
     
        b. Run the following command to start the CarbonData thrift server.
     
     ```
     ./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true
     --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
    -$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
    +<SPARK_HOME>/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
     ```
     
     | Parameter | Description | Example |
     |---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
    -| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the ``"<SPARK_HOME>"/carbonlib/`` folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    -| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | ``hdfs//<host_name>:54310/user/hive/warehouse/carbon.store`` |
    +| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the "<SPARK_HOME>"/carbonlib/ folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    +| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | hdfs//<host_name>:54310/user/hive/warehouse/carbon.store |
    --- End diff --
   
    Maybe it's better to use `hdfs://hostname:port/user/hive/warehouse/carbon.store` or `hdfs://hacluster/user/hive/warehouse/carbon.store` as the HDFS path example.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103962828
 
    --- Diff: docs/quick-start-guide.md ---
    @@ -111,11 +111,9 @@ import org.apache.spark.sql.CarbonContext
     * Create an instance of CarbonContext in the following manner :
     
     ```
    -val cc = new CarbonContext(sc)
    +val cc = new CarbonContext(sc, "<hdfs store path>")
    --- End diff --
   
    fixed



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103962920
 
    --- Diff: docs/installation-guide.md ---
    @@ -40,42 +40,46 @@ followed by :
     
     ### Procedure
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar`.
    --- End diff --
   
    fixed



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103962969
 
    --- Diff: docs/installation-guide.md ---
    @@ -40,42 +40,46 @@ followed by :
     
     ### Procedure
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar`.
    --- End diff --
   
    We support 2.10 and 2.11; hence 2.1x is fine.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103963030
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    --- End diff --
   
    2.xx is not required, as only 2.10 and 2.11 are supported; hence 2.1x is fine as of now.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103963067
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    +
    +    **NOTE**: Create the carbonlib folder if it does not exists inside `<SPARK_HOME>` path.
     
    -      NOTE: Create the carbonlib folder if it does not exists inside ``"<SPARK_HOME>"`` path.
    +2. Copy the `./processing/carbonplugins` folder from CarbonData repository to `<SPARK_HOME>/carbonlib/` folder.
     
    -* Copy "carbonplugins" folder to ``"<SPARK_HOME>/carbonlib"`` folder from "./processing/" folder of CarbonData repository.
    -      carbonplugins will contain .kettle folder.
    +    **NOTE**: carbonplugins will contain .kettle folder.
     
    -* Copy the "carbon.properties.template" to ``"<SPARK_HOME>/conf/carbon.properties"`` folder from conf folder of CarbonData repository.
    -* Modify the parameters in "spark-default.conf" located in the ``"<SPARK_HOME>/conf``"
    +3. Copy the `./conf/carbon.properties.template` file from CarbonData repository to `<SPARK_HOME>/conf/` folder and rename the file to `carbon.properties`.
    +
    +4. Create `tar,gz` file of carbonlib folder and move it inside the carbonlib folder.
    --- End diff --
   
    Yes. YARN deploys these tar.gz files to all nodes of the cluster automatically.



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r103963111
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    +
    +    **NOTE**: Create the carbonlib folder if it does not exists inside `<SPARK_HOME>` path.
     
    -      NOTE: Create the carbonlib folder if it does not exists inside ``"<SPARK_HOME>"`` path.
    +2. Copy the `./processing/carbonplugins` folder from CarbonData repository to `<SPARK_HOME>/carbonlib/` folder.
     
    -* Copy "carbonplugins" folder to ``"<SPARK_HOME>/carbonlib"`` folder from "./processing/" folder of CarbonData repository.
    -      carbonplugins will contain .kettle folder.
    +    **NOTE**: carbonplugins will contain .kettle folder.
     
    -* Copy the "carbon.properties.template" to ``"<SPARK_HOME>/conf/carbon.properties"`` folder from conf folder of CarbonData repository.
    -* Modify the parameters in "spark-default.conf" located in the ``"<SPARK_HOME>/conf``"
    +3. Copy the `./conf/carbon.properties.template` file from CarbonData repository to `<SPARK_HOME>/conf/` folder and rename the file to `carbon.properties`.
    +
    +4. Create `tar.gz` file of carbonlib folder and move it inside the carbonlib folder.
    +
    +```
    + cd <SPARK_HOME>
    + tar -zcvf carbondata.tar.gz carbonlib/
    + mv carbondata.tar.gz carbonlib/
    +```
    +
    +5. Configure the properties mentioned in the following table in `<SPARK_HOME>/conf/spark-defaults.conf` file.
     
     | Property | Description | Value |
     |---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
    -| spark.master | Set this value to run the Spark in yarn cluster mode. | Set "yarn-client" to run the Spark in yarn cluster mode. |
    -| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbondata_xxx.jar`` |
    -| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  NOTE: You can enter multiple values separated by space. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| carbon.kettle.home | Path that will be used by CarbonData internally to create graph for loading the data. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonplugins`` |
    +| spark.master | Set this value to run the Spark in yarn cluster mode. | Set yarn-client to run the Spark in yarn cluster mode. |
    +| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |"<SPARK_HOME>"/conf/carbon.properties |
    +| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |"<SPARK_HOME>"/carbonlib/carbondata.tar.gz |
    +| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  **NOTE**: You can enter multiple values separated by space. |-Dcarbon.properties.filepath=carbon.properties |
    +| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |carbondata.tar.gz/carbonlib/* |
    +| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |"<SPARK_HOME>"/carbonlib/carbonlib/* |
    +| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |-Dcarbon.properties.filepath="<SPARK_HOME>"/conf/carbon.properties |
    +
     
    -* Add the following properties in ``<SPARK_HOME>/conf/ carbon.properties``:
    +6. Add the following properties in `<SPARK_HOME>/conf/carbon.properties`:
     
     | Property | Required | Description | Example | Default Value |
     |----------------------|----------|----------------------------------------------------------------------------------------|-------------------------------------|---------------|
     | carbon.storelocation | NO | Location where CarbonData will create the store and write the data in its own format. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | Propose to set HDFS directory|
    -| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | $SPARK_HOME/carbonlib/carbonplugins |  |
    +| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | carbondata.tar.gz/carbonlib/carbonplugins |  |
     
     
    -* Verify the installation.
    +7. Verify the installation.
     
     ```
          ./bin/spark-shell --master yarn-client --driver-memory 1g
          --executor-cores 2 --executor-memory 2G
     ```
    -  NOTE: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
    +  **NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
     
       Getting started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Operations on CarbonData](ddl-operation-on-carbondata.md)
     
     ## Query Execution Using CarbonData Thrift Server
     
    -### Starting CarbonData Thrift Server
    +### Starting CarbonData Thrift Server.
     
    -   a. cd ``<SPARK_HOME>``
    +   a. cd `<SPARK_HOME>`
     
        b. Run the following command to start the CarbonData thrift server.
     
     ```
     ./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true
     --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
    -$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
    +<SPARK_HOME>/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
     ```
     
     | Parameter | Description | Example |
     |---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
    -| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the ``"<SPARK_HOME>"/carbonlib/`` folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    -| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | ``hdfs//<host_name>:54310/user/hive/warehouse/carbon.store`` |
    +| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the "<SPARK_HOME>"/carbonlib/ folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    +| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | hdfs//<host_name>:54310/user/hive/warehouse/carbon.store |
    --- End diff --
   
    fixed
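Steps 1–4 of the quoted guide can be dry-run as a short shell sequence. Everything below is an illustrative sketch, not the project's official script: the Spark and CarbonData trees are simulated under a temp directory, and the jar name `carbondata_example.jar` is a placeholder.

```shell
# Simulate <SPARK_HOME> and a CarbonData checkout under a temp dir (placeholders).
BASE=$(mktemp -d)
SPARK_HOME="$BASE/spark"
CARBONDATA_HOME="$BASE/carbondata"
mkdir -p "$CARBONDATA_HOME/assembly/target/scala-2.10" \
         "$CARBONDATA_HOME/processing/carbonplugins/.kettle" \
         "$CARBONDATA_HOME/conf" \
         "$SPARK_HOME/conf"
touch "$CARBONDATA_HOME/assembly/target/scala-2.10/carbondata_example.jar"
touch "$CARBONDATA_HOME/conf/carbon.properties.template"

# Step 1: copy the assembly jar into <SPARK_HOME>/carbonlib, creating it if absent.
mkdir -p "$SPARK_HOME/carbonlib"
cp "$CARBONDATA_HOME"/assembly/target/scala-2.10/carbondata_*.jar "$SPARK_HOME/carbonlib/"

# Step 2: copy carbonplugins (it contains the hidden .kettle folder).
cp -r "$CARBONDATA_HOME/processing/carbonplugins" "$SPARK_HOME/carbonlib/"

# Step 3: copy the properties template and rename it to carbon.properties.
cp "$CARBONDATA_HOME/conf/carbon.properties.template" "$SPARK_HOME/conf/carbon.properties"

# Step 4: archive carbonlib and move the archive inside it
# (this is what spark.yarn.dist.archives later ships to executors).
cd "$SPARK_HOME"
tar -zcf carbondata.tar.gz carbonlib/
mv carbondata.tar.gz carbonlib/
```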



[GitHub] incubator-carbondata issue #611: [CARBONDATA-731] Enhance and correct quick ...

Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/611
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/996/




[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r104083241
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://github.com/apache/incubator-carbondata/blob/master/build/README.md) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    +
    +    **NOTE**: Create the carbonlib folder if it does not exists inside `<SPARK_HOME>` path.
     
    -      NOTE: Create the carbonlib folder if it does not exists inside ``"<SPARK_HOME>"`` path.
    +2. Copy the `./processing/carbonplugins` folder from CarbonData repository to `<SPARK_HOME>/carbonlib/` folder.
     
    -* Copy "carbonplugins" folder to ``"<SPARK_HOME>/carbonlib"`` folder from "./processing/" folder of CarbonData repository.
    -      carbonplugins will contain .kettle folder.
    +    **NOTE**: carbonplugins will contain .kettle folder.
     
    -* Copy the "carbon.properties.template" to ``"<SPARK_HOME>/conf/carbon.properties"`` folder from conf folder of CarbonData repository.
    -* Modify the parameters in "spark-default.conf" located in the ``"<SPARK_HOME>/conf``"
    +3. Copy the `./conf/carbon.properties.template` file from CarbonData repository to `<SPARK_HOME>/conf/` folder and rename the file to `carbon.properties`.
    +
    +4. Create `tar.gz` file of carbonlib folder and move it inside the carbonlib folder.
    +
    +```
    + cd <SPARK_HOME>
    + tar -zcvf carbondata.tar.gz carbonlib/
    + mv carbondata.tar.gz carbonlib/
    +```
    +
    +5. Configure the properties mentioned in the following table in `<SPARK_HOME>/conf/spark-defaults.conf` file.
     
     | Property | Description | Value |
     |---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
    -| spark.master | Set this value to run the Spark in yarn cluster mode. | Set "yarn-client" to run the Spark in yarn cluster mode. |
    -| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbondata_xxx.jar`` |
    -| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  NOTE: You can enter multiple values separated by space. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| carbon.kettle.home | Path that will be used by CarbonData internally to create graph for loading the data. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonplugins`` |
    +| spark.master | Set this value to run the Spark in yarn cluster mode. | Set yarn-client to run the Spark in yarn cluster mode. |
    +| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |"<SPARK_HOME>"/conf/carbon.properties |
    +| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |"<SPARK_HOME>"/carbonlib/carbondata.tar.gz |
    +| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  **NOTE**: You can enter multiple values separated by space. |-Dcarbon.properties.filepath=carbon.properties |
    +| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |carbondata.tar.gz/carbonlib/* |
    +| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |"<SPARK_HOME>"/carbonlib/carbonlib/* |
    +| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |-Dcarbon.properties.filepath="<SPARK_HOME>"/conf/carbon.properties |
    +
     
    -* Add the following properties in ``<SPARK_HOME>/conf/ carbon.properties``:
    +6. Add the following properties in `<SPARK_HOME>/conf/carbon.properties`:
     
     | Property | Required | Description | Example | Default Value |
     |----------------------|----------|----------------------------------------------------------------------------------------|-------------------------------------|---------------|
     | carbon.storelocation | NO | Location where CarbonData will create the store and write the data in its own format. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | Propose to set HDFS directory|
    -| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | $SPARK_HOME/carbonlib/carbonplugins |  |
    +| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | carbondata.tar.gz/carbonlib/carbonplugins |  |
     
     
    -* Verify the installation.
    +7. Verify the installation.
     
     ```
          ./bin/spark-shell --master yarn-client --driver-memory 1g
          --executor-cores 2 --executor-memory 2G
     ```
    -  NOTE: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
    +  **NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
     
       Getting started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Operations on CarbonData](ddl-operation-on-carbondata.md)
     
     ## Query Execution Using CarbonData Thrift Server
     
    -### Starting CarbonData Thrift Server
    +### Starting CarbonData Thrift Server.
     
    -   a. cd ``<SPARK_HOME>``
    +   a. cd `<SPARK_HOME>`
     
        b. Run the following command to start the CarbonData thrift server.
     
     ```
     ./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true
     --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
    -$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
    +<SPARK_HOME>/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
     ```
     
     | Parameter | Description | Example |
     |---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
    -| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the ``"<SPARK_HOME>"/carbonlib/`` folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    -| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | ``hdfs//<host_name>:54310/user/hive/warehouse/carbon.store`` |
    +| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the "<SPARK_HOME>"/carbonlib/ folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    +| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | hdfs//<host_name>:port/user/hive/warehouse/carbon.store |
     
    -### Examples
    +**Examples**
       
    -   * Start with default memory and executors
    +   * Start with default memory and executors.
     
     ```
     ./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true
     --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
    -$SPARK_HOME/carbonlib
    +<SPARK_HOME>/carbonlib
     /carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar
    -hdfs://hacluster/user/hive/warehouse/carbon.store
    +hdfs://1.1.1.1:54310/user/hive/warehouse/carbon.store
    --- End diff --
   
    please use `hacluster` or `hostname:port` in place of the real IP/port in the sample.
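Written out, the `spark-defaults.conf` table in the quoted diff corresponds to entries like the following. The values mirror the table verbatim; only the illustrative path `/opt/spark` is substituted for `<SPARK_HOME>`, so adjust it to your install.

```
spark.master                     yarn-client
spark.yarn.dist.files            /opt/spark/conf/carbon.properties
spark.yarn.dist.archives         /opt/spark/carbonlib/carbondata.tar.gz
spark.executor.extraJavaOptions  -Dcarbon.properties.filepath=carbon.properties
spark.executor.extraClassPath    carbondata.tar.gz/carbonlib/*
spark.driver.extraClassPath      /opt/spark/carbonlib/carbonlib/*
spark.driver.extraJavaOptions    -Dcarbon.properties.filepath=/opt/spark/conf/carbon.properties
```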



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r104083265
 
    --- Diff: docs/installation-guide.md ---
    @@ -171,10 +185,10 @@ hdfs://hacluster/user/hive/warehouse/carbon.store
     --executor-cores 32
     /srv/OSCON/BigData/HACluster/install/spark/sparkJdbc/lib
     /carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar
    -hdfs://hacluster/user/hive/warehouse/carbon.store
    +hdfs://1.1.1.1:54310/user/hive/warehouse/carbon.store
    --- End diff --
   
    please use hacluster or hostname:port in place of the real IP/port sample.
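Following that suggestion, a thrift-server launch using a logical cluster name rather than a raw IP might look like this. The snippet is a sketch that only prints the command instead of executing it; `SPARK_HOME` and the jar name are placeholders taken from the guide's example.

```shell
SPARK_HOME=/opt/spark  # placeholder: your Spark install path
CARBON_ASSEMBLY_JAR=carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar
STORE=hdfs://hacluster/user/hive/warehouse/carbon.store  # hacluster, not a raw IP

# Print (rather than execute) the spark-submit invocation from the guide.
echo "$SPARK_HOME/bin/spark-submit \
  --conf spark.sql.hive.thriftServer.singleSession=true \
  --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
  $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR $STORE"
```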



[GitHub] incubator-carbondata pull request #611: [CARBONDATA-731] Enhance and correct...

Github user Hexiaoqiao commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/611#discussion_r104083673
 
    --- Diff: docs/installation-guide.md ---
    @@ -92,77 +96,87 @@ To get started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Opera
     
        The following steps are only for Driver Nodes. (Driver nodes are the one which starts the spark context.)
     
    -* [Build the CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration) project and get the assembly jar from "./assembly/target/scala-2.10/carbondata_xxx.jar" and put in the ``"<SPARK_HOME>/carbonlib"`` folder.
    +1. [Build the CarbonData](https://github.com/apache/incubator-carbondata/blob/master/build/README.md) project and get the assembly jar from `./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to `<SPARK_HOME>/carbonlib` folder.
    +
    +    **NOTE**: Create the carbonlib folder if it does not exists inside `<SPARK_HOME>` path.
     
    -      NOTE: Create the carbonlib folder if it does not exists inside ``"<SPARK_HOME>"`` path.
    +2. Copy the `./processing/carbonplugins` folder from CarbonData repository to `<SPARK_HOME>/carbonlib/` folder.
     
    -* Copy "carbonplugins" folder to ``"<SPARK_HOME>/carbonlib"`` folder from "./processing/" folder of CarbonData repository.
    -      carbonplugins will contain .kettle folder.
    +    **NOTE**: carbonplugins will contain .kettle folder.
     
    -* Copy the "carbon.properties.template" to ``"<SPARK_HOME>/conf/carbon.properties"`` folder from conf folder of CarbonData repository.
    -* Modify the parameters in "spark-default.conf" located in the ``"<SPARK_HOME>/conf``"
    +3. Copy the `./conf/carbon.properties.template` file from CarbonData repository to `<SPARK_HOME>/conf/` folder and rename the file to `carbon.properties`.
    +
    +4. Create `tar.gz` file of carbonlib folder and move it inside the carbonlib folder.
    +
    +```
    + cd <SPARK_HOME>
    + tar -zcvf carbondata.tar.gz carbonlib/
    + mv carbondata.tar.gz carbonlib/
    +```
    +
    +5. Configure the properties mentioned in the following table in `<SPARK_HOME>/conf/spark-defaults.conf` file.
     
     | Property | Description | Value |
     |---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
    -| spark.master | Set this value to run the Spark in yarn cluster mode. | Set "yarn-client" to run the Spark in yarn cluster mode. |
    -| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbondata_xxx.jar`` |
    -| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  NOTE: You can enter multiple values separated by space. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. NOTE: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonlib/carbondata_xxx.jar`` |
    -| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |``-Dcarbon.properties.filepath="<YOUR_SPARK_HOME_PATH>"/conf/carbon.properties`` |
    -| carbon.kettle.home | Path that will be used by CarbonData internally to create graph for loading the data. |``"<YOUR_SPARK_HOME_PATH>"/carbonlib/carbonplugins`` |
    +| spark.master | Set this value to run the Spark in yarn cluster mode. | Set yarn-client to run the Spark in yarn cluster mode. |
    +| spark.yarn.dist.files | Comma-separated list of files to be placed in the working directory of each executor. |"<SPARK_HOME>"/conf/carbon.properties |
    +| spark.yarn.dist.archives | Comma-separated list of archives to be extracted into the working directory of each executor. |"<SPARK_HOME>"/carbonlib/carbondata.tar.gz |
    +| spark.executor.extraJavaOptions | A string of extra JVM options to pass to executors. For instance  **NOTE**: You can enter multiple values separated by space. |-Dcarbon.properties.filepath=carbon.properties |
    +| spark.executor.extraClassPath | Extra classpath entries to prepend to the classpath of executors. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the values in below parameter spark.driver.extraClassPath |carbondata.tar.gz/carbonlib/* |
    +| spark.driver.extraClassPath | Extra classpath entries to prepend to the classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. |"<SPARK_HOME>"/carbonlib/carbonlib/* |
    +| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |-Dcarbon.properties.filepath="<SPARK_HOME>"/conf/carbon.properties |
    +
     
    -* Add the following properties in ``<SPARK_HOME>/conf/ carbon.properties``:
    +6. Add the following properties in `<SPARK_HOME>/conf/carbon.properties`:
     
     | Property | Required | Description | Example | Default Value |
     |----------------------|----------|----------------------------------------------------------------------------------------|-------------------------------------|---------------|
     | carbon.storelocation | NO | Location where CarbonData will create the store and write the data in its own format. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | Propose to set HDFS directory|
    -| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | $SPARK_HOME/carbonlib/carbonplugins |  |
    +| carbon.kettle.home | YES | Path that will be used by CarbonData internally to create graph for loading the data. | carbondata.tar.gz/carbonlib/carbonplugins |  |
     
     
    -* Verify the installation.
    +7. Verify the installation.
     
     ```
          ./bin/spark-shell --master yarn-client --driver-memory 1g
          --executor-cores 2 --executor-memory 2G
     ```
    -  NOTE: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
    +  **NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
     
       Getting started with CarbonData : [Quick Start](quick-start-guide.md), [DDL Operations on CarbonData](ddl-operation-on-carbondata.md)
     
     ## Query Execution Using CarbonData Thrift Server
     
    -### Starting CarbonData Thrift Server
    +### Starting CarbonData Thrift Server.
     
    -   a. cd ``<SPARK_HOME>``
    +   a. cd `<SPARK_HOME>`
     
        b. Run the following command to start the CarbonData thrift server.
     
     ```
     ./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true
     --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
    -$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
    +<SPARK_HOME>/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
     ```
     
     | Parameter | Description | Example |
     |---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
    -| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the ``"<SPARK_HOME>"/carbonlib/`` folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    -| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | ``hdfs//<host_name>:54310/user/hive/warehouse/carbon.store`` |
    +| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the "<SPARK_HOME>"/carbonlib/ folder. | carbondata_2.10-0.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar |
    +| carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | hdfs//<host_name>:port/user/hive/warehouse/carbon.store |
    --- End diff --
   
    missing colon in the HDFS path? (`hdfs://hostname:port/path`)
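The missing-colon concern is easy to demonstrate mechanically: without the `://` separator, standard URI parsing (shown here with Python's `urllib.parse`, purely as an illustration) cannot recover the scheme or host at all.

```python
from urllib.parse import urlparse

# Well-formed HDFS URI: scheme, host, port, and path all parse cleanly.
good = urlparse("hdfs://hacluster:54310/user/hive/warehouse/carbon.store")
print(good.scheme, good.hostname, good.port, good.path)
# → hdfs hacluster 54310 /user/hive/warehouse/carbon.store

# "hdfs//host..." (colon missing) degrades to a bare relative path.
bad = urlparse("hdfs//hacluster:54310/user/hive/warehouse/carbon.store")
print(repr(bad.scheme), bad.hostname)
# → '' None
```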

