[GitHub] [carbondata] Indhumathi27 opened a new pull request #3885: [WIP] Support Presto with IndexServer


GitBox

Indhumathi27 opened a new pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-669965675


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3637/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-669968186


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1898/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670095485


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1907/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670143028


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3646/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670362767


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3648/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670369483


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1909/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670443439


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3652/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670444134


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1913/
   


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670617768


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3657/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670622887


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1918/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-671276686


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3674/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-671278053


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1935/
   


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468499341



##########
File path: docs/prestodb-guide.md
##########
@@ -301,3 +303,21 @@ Presto carbon only supports reading the carbon table which is written by spark c
 During reading, it supports the non-distributed indexes like block index and bloom index.
 It doesn't support Materialized View as it needs query plan to be changed and presto does not allow it.
 Also, Presto carbon supports streaming segment read from streaming table created by spark.
+
+## Presto Setup with CarbonData Distributed IndexServer

Review comment:
       Since `prestosql` is the default profile, add this section to `prestosql-guide.md`, and in the prestodb guide give a link to that section, as it is common to both Presto versions.




ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468499772



##########
File path: docs/prestodb-guide.md
##########
@@ -301,3 +303,21 @@ Presto carbon only supports reading the carbon table which is written by spark c
 During reading, it supports the non-distributed indexes like block index and bloom index.
 It doesn't support Materialized View as it needs query plan to be changed and presto does not allow it.
 Also, Presto carbon supports streaming segment read from streaming table created by spark.
+
+## Presto Setup with CarbonData Distributed IndexServer
+
+### Dependency jars
+After copying all the jars from ../integration/presto/target/carbondata-presto-X.Y.Z-SNAPSHOT
+to `plugin/carbondata` directory on all nodes, ensure copying the following jars as well.
+1. Copy ../integration/spark/target/carbondata-spark_X.Y.Z-SNAPSHOT.jar
+2. Copy corresponding Spark dependency jars to the location.
+
+### Configure properties
+Configure IndexServer configurations in carbon.properties file. Refer
+[Configuring IndexServer](https://github.com/apache/carbondata/blob/master/docs/index-server.md#Configurations) for more info.
+Add  `-Dcarbon.properties.filepath=<path>/carbon.properties` in jvm.config file.
+
+### Presto with IndexServer
+Start distributed index server. Launch presto CLI and fire SELECT query and check if the corresponding job
+is triggered in the index server application.

Review comment:
       Also mention that users can check the loaded cache from Spark by using the `SHOW METACACHE` command.
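
The "Configure properties" step in the quoted doc can be sanity-checked before Presto is started. The following is only a sketch: the three key names are assumptions recalled from the IndexServer guide linked in the diff and should be verified there, and the class `IndexServerConfigCheck` is hypothetical, not part of CarbonData.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of CarbonData.
class IndexServerConfigCheck {
  // Assumed key names; confirm them against the IndexServer guide before relying on them.
  static final String[] REQUIRED_KEYS = {
      "carbon.enable.index.server",
      "carbon.index.server.ip",
      "carbon.index.server.port"
  };

  // Parses carbon.properties content and returns the required keys that are absent or blank.
  static List<String> missingKeys(Reader carbonProperties) throws IOException {
    Properties props = new Properties();
    props.load(carbonProperties);
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED_KEYS) {
      String value = props.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }

  public static void main(String[] args) throws IOException {
    // Sample content with the port left out on purpose.
    String sample = "carbon.enable.index.server=true\n"
        + "carbon.index.server.ip=127.0.0.1\n";
    System.out.println("Missing keys: " + missingKeys(new StringReader(sample)));
  }
}
```

In a real setup the reader would wrap the file named by `-Dcarbon.properties.filepath`, which can be read with `System.getProperty("carbon.properties.filepath")`.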




ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468503482



##########
File path: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
##########
@@ -127,7 +128,8 @@ public CarbonLocalInputSplit(@JsonProperty("segmentId") String segmentId,
       @JsonProperty("deleteDeltaFiles") String[] deleteDeltaFiles,
       @JsonProperty("blockletId") String blockletId,
       @JsonProperty("detailInfo") String detailInfo,
-      @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal) {
+      @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal,
+      boolean isDistributedPruningEnabled) {

Review comment:
       Please also add a `@JsonProperty` annotation for this parameter.
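
The request is presumably motivated by the split being serialized to JSON, where each constructor parameter is bound by the name given in its annotation. A minimal sketch of how such an omission could be detected with reflection follows. It makes several assumptions: the local `@JsonProperty` is only a stand-in for Jackson's `com.fasterxml.jackson.annotation.JsonProperty` so that the code compiles on a bare JDK, and `SplitSketch` and `JsonPropertyCheck` are hypothetical classes that copy just two parameters from the diff above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

// Stand-in for Jackson's JsonProperty annotation; real code would import the Jackson one.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface JsonProperty {
  String value();
}

// Hypothetical, trimmed-down split with two of the constructor parameters from the diff.
class SplitSketch {
  final int fileFormatOrdinal;
  final boolean isDistributedPruningEnabled;

  SplitSketch(@JsonProperty("fileFormatOrdinal") int fileFormatOrdinal,
      boolean isDistributedPruningEnabled) { // unannotated, as in the diff under review
    this.fileFormatOrdinal = fileFormatOrdinal;
    this.isDistributedPruningEnabled = isDistributedPruningEnabled;
  }
}

class JsonPropertyCheck {
  // Returns the zero-based positions of constructor parameters that lack @JsonProperty.
  static List<Integer> unannotatedParameters(Constructor<?> ctor) {
    List<Integer> missing = new ArrayList<>();
    Parameter[] params = ctor.getParameters();
    for (int i = 0; i < params.length; i++) {
      if (params[i].getAnnotation(JsonProperty.class) == null) {
        missing.add(i);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Constructor<?> ctor = SplitSketch.class.getDeclaredConstructors()[0];
    System.out.println("Unannotated parameter positions: " + unannotatedParameters(ctor));
  }
}
```

Here the check reports position 1, the new boolean parameter. Annotating it as `@JsonProperty("isDistributedPruningEnabled")` would make the list empty.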




Indhumathi27 commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468518151



##########
File path: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
##########
@@ -127,7 +128,8 @@ public CarbonLocalInputSplit(@JsonProperty("segmentId") String segmentId,
       @JsonProperty("deleteDeltaFiles") String[] deleteDeltaFiles,
       @JsonProperty("blockletId") String blockletId,
       @JsonProperty("detailInfo") String detailInfo,
-      @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal) {
+      @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal,
+      boolean isDistributedPruningEnabled) {

Review comment:
       done




CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-671966844


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3698/
   


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-671967929


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1959/
   


kunal642 commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r469287811



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexJobs.scala
##########
@@ -55,14 +57,27 @@ class DistributedIndexJob extends AbstractIndexJob {
     val (response, time) = logTime {
       try {
         val spark = SparkSQLUtil.getSparkSession
-        indexFormat.setTaskGroupId(SparkSQLUtil.getTaskGroupId(spark))
-        indexFormat.setTaskGroupDesc(SparkSQLUtil.getTaskGroupDesc(spark))
+        // In case of presto with index server flow, sparksession will be null
+        if (null != spark) {

Review comment:
       Is it possible to add a configuration to CarbonProperties that tells us whether this is a Presto flow or not? The property should not be exposed to users; it is for internal use only.
   
   This null check looks dirty.
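
One way to read this suggestion is sketched below. It is written in Java rather than the Scala of `IndexJobs.scala` and uses plain `java.util.Properties` in place of `CarbonProperties`. The key `example.internal.query.from.presto` and the class `EngineFlagSketch` are invented for illustration and are not actual CarbonData names.

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch: branch on an explicit internal flag instead of a null session.
class EngineFlagSketch {
  // Invented key for this sketch; not a documented CarbonData property.
  static final String IS_PRESTO_FLOW = "example.internal.query.from.presto";

  // Reads the flag, treating a missing entry as "not a Presto flow".
  static boolean isPrestoFlow(Properties props) {
    return Boolean.parseBoolean(props.getProperty(IS_PRESTO_FLOW, "false"));
  }

  // Returns the task group id to attach to the index request, if any.
  static Optional<String> resolveTaskGroupId(Properties props, String sparkTaskGroupId) {
    if (isPrestoFlow(props)) {
      // Presto flow: there is no Spark session, so there is no task group to attach.
      return Optional.empty();
    }
    return Optional.ofNullable(sparkTaskGroupId);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println("Spark flow:  " + resolveTaskGroupId(props, "group-1"));
    props.setProperty(IS_PRESTO_FLOW, "true"); // would be set internally by the Presto integration
    System.out.println("Presto flow: " + resolveTaskGroupId(props, "group-1"));
  }
}
```

The flag states the intent of the branch at the call site, whereas a null check leaves the reader to infer why the session is absent.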



