[GitHub] carbondata pull request #2265: Added Performance Optimization for Presto by ...


qiuchenjian-2
GitHub user bhavya411 opened a pull request:

    https://github.com/apache/carbondata/pull/2265

    Added Performance Optimization for Presto by using MultiBlockSplit

    This PR improves the performance of the Presto integration layer. The major changes are:
    - Introducing CarbonMultiBlockSplit to reduce the Network Latency
    - Added Query Statistics support in Presto
    - Optimized DecimalStreamReader
    - Added log4j.properties to get the logs output on console
    - Optimized pom.xml to remove unnecessary stuff and duplicate dependencies
   
     - [X ] Any interfaces changed?
     
     - [ X] Any backward compatibility impacted?
     
     - [ X] Document update required?
   
     - [ Y] Testing done
         We ran the queries in a local environment and observed a performance improvement.
           
     - [ N] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bhavya411/incubator-carbondata UnsafeCode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2265.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2265
   
----
commit 9822c4dbfe46dc21f698d0b9d52b64e29dbfbe7a
Author: Bhavya <bhavya@...>
Date:   2018-04-16T06:24:17Z

    Added Performance Optimization for Presto by using MultiBlockSplit

----
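
The description above says that CarbonMultiBlockSplit reduces network latency but does not show how. One plausible reading is that several block-level splits are packed into a single split, so fewer splits have to be shipped to the workers. The sketch below illustrates only that grouping idea. `BlockSplit`, `MultiBlockSplit`, `SplitGrouper`, and the `maxBlocksPerSplit` parameter are hypothetical names and are not the CarbonData or Presto classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for one block-level split (not the CarbonData class).
final class BlockSplit {
    final String path;
    final long length;

    BlockSplit(String path, long length) {
        this.path = path;
        this.length = length;
    }
}

// Hypothetical stand-in for a split that carries several blocks at once.
final class MultiBlockSplit {
    final List<BlockSplit> blocks;

    MultiBlockSplit(List<BlockSplit> blocks) {
        // Copy defensively so later changes to the source list do not leak in.
        this.blocks = Collections.unmodifiableList(new ArrayList<>(blocks));
    }

    long totalLength() {
        long total = 0;
        for (BlockSplit block : blocks) {
            total += block.length;
        }
        return total;
    }
}

final class SplitGrouper {
    // Packs consecutive blocks into groups of at most maxBlocksPerSplit (assumed policy).
    static List<MultiBlockSplit> group(List<BlockSplit> blocks, int maxBlocksPerSplit) {
        if (maxBlocksPerSplit <= 0) {
            throw new IllegalArgumentException("maxBlocksPerSplit must be positive");
        }
        List<MultiBlockSplit> result = new ArrayList<>();
        for (int start = 0; start < blocks.size(); start += maxBlocksPerSplit) {
            int end = Math.min(start + maxBlocksPerSplit, blocks.size());
            result.add(new MultiBlockSplit(blocks.subList(start, end)));
        }
        return result;
    }
}
```

With five blocks and `maxBlocksPerSplit` set to 2, `SplitGrouper.group` returns three splits instead of five. The actual grouping policy in the PR may differ, for example grouping by node or by size.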


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4458/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5618/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4711/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Please create a JIRA for this PR.


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    You are using the existing CarbonMultiBlockSplit rather than adding a new one, right?


---

[GitHub] carbondata pull request #2265: Added Performance Optimization for Presto by ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r186277824
 
    --- Diff: integration/presto/pom.xml ---
    @@ -462,12 +462,6 @@
           <version>3.0.2</version>
    --- End diff --
   
    Please check whether the hadoop dependency below can be removed from presto/pom.xml.
   
    <dependency>
      <groupId>com.facebook.presto.hadoop</groupId>
      <artifactId>hadoop-apache2</artifactId>
      <version>2.7.3-1</version>
      <exclusions>
        <exclusion>
          <groupId>org.antlr</groupId>
          <artifactId>antlr4-runtime</artifactId>
        </exclusion>
        <exclusion>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-databind</artifactId>
        </exclusion>
      </exclusions>
    </dependency>


---

[GitHub] carbondata pull request #2265: Added Performance Optimization for Presto by ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r186359964
 
    --- Diff: integration/presto/pom.xml ---
    @@ -462,12 +462,6 @@
           <version>3.0.2</version>
    --- End diff --
   
    We cannot remove this for now, as compilation fails. Is there a particular reason we should not have this dependency? If so, I can look into replacing it with a carbon jar.


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    @jackylk Yes, we are using Carbon's existing CarbonMultiBlockSplit, but we had to write a class in Presto that wraps it for serialization. Internally it uses only CarbonMultiBlockSplit.
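
The wrapping idea described in this reply can be sketched as follows. This is a minimal illustration under stated assumptions, not the class from the PR: `MultiBlockPayload` stands in for the Carbon split, `SerializableSplitWrapper` is a hypothetical wrapper, and Java serialization plus Base64 is an assumed encoding chosen only because it needs no external libraries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical stand-in for the Carbon-side multi-block split.
final class MultiBlockPayload implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<String> blockPaths;

    MultiBlockPayload(List<String> blockPaths) {
        this.blockPaths = new ArrayList<>(blockPaths);
    }
}

// Hypothetical wrapper: carries the payload as a plain string so that it can
// travel through any transport that only understands simple fields.
final class SerializableSplitWrapper {
    private final String encoded;

    private SerializableSplitWrapper(String encoded) {
        this.encoded = encoded;
    }

    // Sender side: serialize the payload and encode the bytes as Base64 text.
    static SerializableSplitWrapper wrap(MultiBlockPayload payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize payload", e);
        }
        return new SerializableSplitWrapper(Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    // Receiver side: rebuild the wrapper from the transported string.
    static SerializableSplitWrapper fromEncoded(String encoded) {
        return new SerializableSplitWrapper(encoded);
    }

    String encoded() {
        return encoded;
    }

    // Decode the Base64 text and deserialize the original payload.
    MultiBlockPayload unwrap() {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (MultiBlockPayload) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize payload", e);
        }
    }
}
```

The wrapper owns only the transport concern, so the wrapped split type itself stays unchanged, which matches the statement that the Presto class uses the existing split internally.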


---

[GitHub] carbondata pull request #2265: Added Performance Optimization for Presto by ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r187825638
 
    --- Diff: integration/presto/pom.xml ---
    @@ -462,12 +462,6 @@
           <version>3.0.2</version>
    --- End diff --
   
    The presto module already has a hadoop dependency, shown below. Can we remove one of them?
    ```
    <groupId>org.apache.carbondata</groupId>
    <artifactId>carbondata-hadoop</artifactId>
    <version>${project.version}</version>
    ```


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Please resolve the conflict.


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6024/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5032/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4865/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    I have resolved the conflicts


---

[GitHub] carbondata pull request #2265: Added Performance Optimization for Presto by ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2265#discussion_r189779766
 
    --- Diff: integration/presto/pom.xml ---
    @@ -462,12 +462,6 @@
           <version>3.0.2</version>
    --- End diff --
   
    @chenliang613 We cannot remove this dependency because CarbonTableInputFormat is part of it. We have already excluded many of the dependencies it pulls in that were not needed.


---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6034/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5045/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4875/



---

[GitHub] carbondata issue #2265: Added Performance Optimization for Presto by using M...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2265
 
    Has a JIRA ticket been created for this PR? @bhavya411


---