GitHub user ffpeng90 opened a pull request:
https://github.com/apache/incubator-carbondata/pull/650

[CARBONDATA-<Jira issue 728>](WIP) add integration with presto

Be sure to do all of the following to help us incorporate your contribution quickly and easily:

- [ ] Make sure the PR title is formatted like: `[CARBONDATA-<Jira issue #>] Description of pull request`
- [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes).
- [ ] Replace `<Jira issue #>` in the title with the actual Jira issue number, if there is one.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
- [ ] Testing done
  Please provide details on
  - Whether new unit test cases have been added or why no new tests are required?
  - What manual testing you have done?
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

---

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ffpeng90/incubator-carbondata add_presto

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/650.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #650

----
commit 82168f48a07b1757160147551332bb456df2e65a
Author: ffpeng90 <[hidden email]>
Date: 2017-03-12T12:27:32Z

    add presto integration 0.0.1

----

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA.
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

Can one of the admins verify this patch?
---
Github user chenliang613 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

add to whitelist
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1100/
---
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/650#discussion_r105558847

    --- Diff: integration/presto/pom.xml ---
    @@ -0,0 +1,167 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<project xmlns="http://maven.apache.org/POM/4.0.0"
    +         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +  <modelVersion>4.0.0</modelVersion>
    +
    +  <parent>
    +    <groupId>com.facebook.presto</groupId>
    --- End diff --

Please change the groupId to org.apache.carbondata. You can take the integration/spark module as a reference and update accordingly.
---
Github user jackylk commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

Thanks for working on this. Can you describe what feature is added in terms of:

1. What SQL syntax is supported? DDL & DML?
2. I think it uses CarbonInputFormat to read, so are you pushing down column projection and filtering by setting the configuration in CarbonInputFormat?
3. Is there any SQL optimization integration with Presto's optimizer, like leveraging carbon's global dictionary to do lazy decode?
---
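For readers following question 3, here is a minimal sketch of what "lazy decode" against a global dictionary generally means: the aggregation runs on the integer surrogate keys, and strings are looked up only for the few keys that survive. The class name, the dictionary contents, and the column data are all assumptions made up for illustration; this is not CarbonData's dictionary API and not Presto's optimizer.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; this is not CarbonData's dictionary API.
final class LazyDecodeSketch {

    // Assumed global dictionary: surrogate key (array index) -> original string value.
    static final String[] DICTIONARY = {"beijing", "shanghai", "shenzhen"};

    // Aggregate directly on the encoded surrogate keys; no string is decoded here.
    static Map<Integer, Integer> countByKey(int[] encodedColumn) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int key : encodedColumn) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Decode once per distinct key, after the aggregation has shrunk the data.
    static Map<String, Integer> decode(Map<Integer, Integer> counts) {
        Map<String, Integer> decoded = new TreeMap<>();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            decoded.put(DICTIONARY[entry.getKey()], entry.getValue());
        }
        return decoded;
    }

    public static void main(String[] args) {
        // Six encoded rows of a hypothetical "city" column.
        int[] encodedCity = {0, 1, 0, 2, 0, 1};
        System.out.println(decode(countByKey(encodedCity)));
    }
}
```

In this toy case the dictionary is consulted three times (once per distinct key) instead of six times (once per row), which is the saving the question is asking about.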
Github user jackylk commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

I also think this would be easier to review, and could be merged sooner, if you broke this PR down into smaller ones. Just provide the very basic functionality in the first round of the integration; you can add more functionality in subsequent PRs.
---
Github user ffpeng90 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

Hi,

1. This version only supports DML. All test tables are created through spark-sql (DML part), and I submit queries to Presto to get the results. I have only tested the "Select" case, with clauses such as where, group by, sum, and join.
2. I use APIs such as createQueryPlan and resolveFilter from the class "CarbonInputFormatUtil". To read a carbon-formatted table, I split the read process into several steps:
   a) load the table metadata
   b) get splits from the table (pushing down filtering to filter the data blocks of one segment, see CarbonTableReader.getInputSplits2)
   c) parse records (pushing down column projection and filtering into the QueryModel, see CarbondataRecordSetProvider.getRecordSet)
3. As described in step c), "parse records", I use QueryModel to get decoded records. For lazy decoding, I will keep exploring a better solution. Maybe we can get inspiration from the presto-orc and presto-parquet modules.
---
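The three read steps described in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions: every type and method here (`ReadPipelineSketch`, `Block`, `loadMetadata`, `getSplits`, `readRecords`) is a hypothetical stand-in, not a CarbonData or Presto class, and block pruning is reduced to a single max statistic on an integer `id` column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration; none of these are CarbonData or Presto classes.
final class ReadPipelineSketch {

    /** A data block holding rows plus a max statistic on column 0 ("id"). */
    static final class Block {
        final int maxId;
        final List<int[]> rows;

        Block(int[]... rows) {
            this.rows = Arrays.asList(rows);
            int hi = Integer.MIN_VALUE;
            for (int[] row : rows) {
                hi = Math.max(hi, row[0]);
            }
            this.maxId = hi;
        }
    }

    // Step a) load table metadata: here, just the ordered column names.
    static List<String> loadMetadata() {
        return Arrays.asList("id", "price", "qty");
    }

    // Step b) get splits: drop blocks whose id range cannot satisfy "id >= minId".
    static List<Block> getSplits(List<Block> blocks, int minId) {
        List<Block> kept = new ArrayList<>();
        for (Block block : blocks) {
            if (block.maxId >= minId) {
                kept.add(block);
            }
        }
        return kept;
    }

    // Step c) parse records: apply the row-level filter, then keep only the projected columns.
    static List<int[]> readRecords(List<Block> splits, Predicate<int[]> filter,
                                   List<String> columns, List<String> projection) {
        List<int[]> out = new ArrayList<>();
        for (Block block : splits) {
            for (int[] row : block.rows) {
                if (!filter.test(row)) {
                    continue;
                }
                int[] projected = new int[projection.size()];
                for (int i = 0; i < projected.length; i++) {
                    projected[i] = row[columns.indexOf(projection.get(i))];
                }
                out.add(projected);
            }
        }
        return out;
    }

    // Runs the equivalent of: SELECT price FROM t WHERE id >= 6
    static List<int[]> run() {
        List<Block> blocks = Arrays.asList(
            new Block(new int[] {1, 10, 2}, new int[] {2, 20, 1}),
            new Block(new int[] {5, 50, 3}, new int[] {7, 70, 4}));
        List<String> columns = loadMetadata();
        List<Block> splits = getSplits(blocks, 6);
        return readRecords(splits, row -> row[0] >= 6, columns, Arrays.asList("price"));
    }

    public static void main(String[] args) {
        for (int[] row : run()) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The filter is applied twice on purpose: once coarsely at split time, so whole blocks are skipped without being read, and once per row, because a block that passes the range check can still contain non-matching rows.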
Github user ffpeng90 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

I'm focusing on two things:

1. Letting users debug presto-carbondata in their IDE.
2. Using the new Presto API to support lazy decode.

Both should be ready soon.
---
Github user chenliang613 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

@ffpeng90 Please also update the PR title.
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1290/
---
Github user ffpeng90 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

As you wish.
---
Github user chenliang613 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

retest this please
---
Github user chenliang613 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

LGTM
---
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1296/
---
Github user chenliang613 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/650

@ffpeng90 In presto/pom.xml, please change the groupId from "com.facebook.presto" to "org.apache.carbondata".
---
Github user asfgit closed the pull request at:
https://github.com/apache/incubator-carbondata/pull/650