[GitHub] incubator-carbondata pull request #650: [CARBONDATA-<Jira issue 728>](WIP) a...

qiuchenjian-2
GitHub user ffpeng90 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/650

    [CARBONDATA-<Jira issue 728>](WIP) add intergation with presto

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
   
     - [ ] Make sure the PR title is formatted like:
       `[CARBONDATA-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
     - [ ] Testing done
     
            Please provide details on
            - Whether new unit test cases have been added or why no new tests are required?
            - What manual testing you have done?
            - Any additional information to help reviewers in testing this change.
             
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
                     
    ---


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ffpeng90/incubator-carbondata add_presto

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #650
   
----
commit 82168f48a07b1757160147551332bb456df2e65a
Author: ffpeng90 <[hidden email]>
Date:   2017-03-12T12:27:32Z

    add presto integration 0.0.1

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #650: [CARBONDATA-<Jira issue 728>](WIP) add inte...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    Can one of the admins verify this patch?



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    add to whitelist



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1100/




[GitHub] incubator-carbondata pull request #650: [WIP] add intergation with presto

qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/650#discussion_r105558847
 
    --- Diff: integration/presto/pom.xml ---
    @@ -0,0 +1,167 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<project xmlns="http://maven.apache.org/POM/4.0.0"
    +         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <groupId>com.facebook.presto</groupId>
    --- End diff --
   
    Please change the groupId to org.apache.carbondata.
    You can use the integration/spark module as a reference and update accordingly.



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    Thanks for working on this. Can you describe which features are added, in terms of:
    1. What SQL syntax is supported? DDL and DML?
    2. I think it uses CarbonInputFormat to read, so are you pushing down column projection and filtering by setting the configuration in CarbonInputFormat?
    3. Is there any integration with Presto's optimizer, such as leveraging Carbon's global dictionary to do lazy decoding?



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    I also think this would be easier to review, and could be merged sooner, if you broke this PR down into smaller ones. Provide only the very basic functionality in the first round of the integration; you can add more functionality in subsequent PRs.



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user ffpeng90 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    Hi,
        1. This version only supports DML.
            All test tables are created through spark-sql (the DDL part),
            and I submit queries to Presto to get results.
            I have only tested SELECT queries, such as WHERE, GROUP BY, SUM, and JOIN.


        2. I use APIs such as createQueryPlan and resolveFilter from the class "CarbonInputFormatUtil".
           To read a Carbon-formatted table, I split the read process into several steps:
           a). load the table metadata
           b). get splits from the table (pushing down the filter to prune the data blocks of one segment, see CarbonTableReader.getInputSplits2)
           c). parse records (pushing down column projection and filtering into QueryModel, see CarbondataRecordSetProvider.getRecordSet)


        3. As described in step c) "parse records", I use QueryModel to get decoded records.
           For lazy decoding, I will keep exploring a better solution. Maybe we can take inspiration from the presto-orc and presto-parquet modules.
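
The three read steps described above can be sketched roughly as follows. This is a minimal, self-contained illustration in plain Java and not the code from the PR: Table, Block, getSplits, and readRecords are hypothetical names, and the min/max pruning on a single integer column is an assumption standing in for the real filter resolution done through CarbonInputFormatUtil and QueryModel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are CarbonData or Presto classes.
class ReadPipelineSketch {

    // A block of rows plus min/max statistics on column 0, used for pruning.
    static final class Block {
        final int min;
        final int max;
        final List<int[]> rows;

        Block(List<int[]> rows) {
            this.rows = rows;
            int lo = Integer.MAX_VALUE;
            int hi = Integer.MIN_VALUE;
            for (int[] row : rows) {
                lo = Math.min(lo, row[0]);
                hi = Math.max(hi, row[0]);
            }
            this.min = lo;
            this.max = hi;
        }
    }

    // Step a: the "table metadata" here is just column names and the block list.
    static final class Table {
        final List<String> columns;
        final List<Block> blocks;

        Table(List<String> columns, List<Block> blocks) {
            this.columns = columns;
            this.blocks = blocks;
        }
    }

    // Step b: filter pushdown at split level. For the filter "column 0 == value",
    // keep only blocks whose [min, max] range can contain the value.
    static List<Block> getSplits(Table table, int value) {
        List<Block> splits = new ArrayList<>();
        for (Block block : table.blocks) {
            if (block.min <= value && value <= block.max) {
                splits.add(block);
            }
        }
        return splits;
    }

    // Step c: apply the filter row by row and return only the projected columns.
    static List<int[]> readRecords(List<Block> splits, int value, int[] projection) {
        List<int[]> out = new ArrayList<>();
        for (Block block : splits) {
            for (int[] row : block.rows) {
                if (row[0] != value) {
                    continue;
                }
                int[] projected = new int[projection.length];
                for (int i = 0; i < projection.length; i++) {
                    projected[i] = row[projection[i]];
                }
                out.add(projected);
            }
        }
        return out;
    }

    // Made-up sample data: two blocks with columns (id, qty, price).
    static Table sampleTable() {
        Block b1 = new Block(Arrays.asList(new int[]{1, 10, 100}, new int[]{2, 20, 200}));
        Block b2 = new Block(Arrays.asList(new int[]{5, 50, 500}, new int[]{7, 70, 700}));
        return new Table(Arrays.asList("id", "qty", "price"), Arrays.asList(b1, b2));
    }

    public static void main(String[] args) {
        Table table = sampleTable();                              // step a
        List<Block> splits = getSplits(table, 5);                 // step b
        List<int[]> rows = readRecords(splits, 5, new int[]{2});  // step c, project "price"
        System.out.println(splits.size() + " split(s), " + rows.size()
                + " row(s), price=" + rows.get(0)[0]);
        // prints "1 split(s), 1 row(s), price=500"
    }
}
```

The point of the sketch is only the ordering: the filter is used once to discard whole blocks before any row is read, and again together with the projection when records are materialized.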
         
     
         
       
   
   
   
   
   
   



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user ffpeng90 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    I'm focusing on two things:
    1. Letting users debug presto-carbondata in their IDE.
    2. Using the new Presto API to support lazy decoding.
    Both should be ready soon.
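
For readers unfamiliar with the term, lazy decoding of a dictionary-encoded column can be illustrated as below. This is a hedged sketch in plain Java under simple assumptions (an in-memory string array as the dictionary, integer codes as the column); LazyDecodeSketch and countByValue are hypothetical names, and the code does not use the Presto or CarbonData APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of lazy dictionary decoding; not CarbonData's or Presto's API.
class LazyDecodeSketch {

    // Counts rows per value while working on dictionary codes only.
    // Each distinct code is decoded to its string exactly once, at the end.
    static Map<String, Integer> countByValue(String[] dictionary, int[] encodedColumn) {
        int[] counts = new int[dictionary.length];
        for (int code : encodedColumn) {
            counts[code]++;  // group on the integer code, no string lookup per row
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int code = 0; code < dictionary.length; code++) {
            if (counts[code] > 0) {
                result.put(dictionary[code], counts[code]);  // decoding happens only here
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration.
        String[] dictionary = {"beijing", "shenzhen", "hangzhou"};
        int[] encodedCity = {0, 2, 0, 1, 0, 2};
        System.out.println(countByValue(dictionary, encodedCity));
        // prints {beijing=3, shenzhen=1, hangzhou=2}
    }
}
```

An eager reader would instead turn every code into a string before grouping; deferring the lookup keeps the per-row work on integers.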
   
   
   
   
   
   
   
   
   
   
   
     



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    @ffpeng90 please also update the PR title.



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1290/




[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

qiuchenjian-2
Github user ffpeng90 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    As you wish.



[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    retest this please



[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    LGTM



[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1296/




[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
 
    @ffpeng90
    In presto/pom.xml, please change the groupId from "com.facebook.presto" to "org.apache.carbondata".



[GitHub] incubator-carbondata pull request #650: [CARBONDATA-728] add intergation wit...

qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/650

