[GitHub] carbondata pull request #3014: [WIP] Added load level SORT_SCOPE

[GitHub] carbondata pull request #3014: [WIP] Added load level SORT_SCOPE

qiuchenjian-2
GitHub user NamanRastogi opened a pull request:

    https://github.com/apache/carbondata/pull/3014

    [WIP] Added load level SORT_SCOPE

    ### Load level SORT_SCOPE
    ```sql
    LOAD DATA INPATH 'path/to/data.csv'
    INTO TABLE my_table
    OPTIONS (
       'sort_scope'='no_sort'
    )
    ```
   
    ### Priority of SORT_SCOPE
    1. Load level (if provided)
    2. Table level (if provided)
    3. Default
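
The priority order above can be sketched as a chain of optional values, where the first level that is set wins. This is a minimal illustration, not the code from the pull request: `SortScopeResolution`, `resolveSortScope`, and `AssumedDefault` are hypothetical names, and the default value is an assumption. The diff later in this thread also consults carbon properties before falling back to the default, which the `propertyLevel` parameter stands in for.

```scala
object SortScopeResolution {
  // Assumed placeholder for the built-in default. The real value is
  // CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT, which this thread does not show.
  val AssumedDefault = "LOCAL_SORT"

  /** Returns the first sort scope that is defined, checking the load level,
    * then the table level, then the property level, then the assumed default. */
  def resolveSortScope(
      loadLevel: Option[String],
      tableLevel: Option[String],
      propertyLevel: Option[String] = None): String =
    loadLevel
      .orElse(tableLevel)
      .orElse(propertyLevel)
      .getOrElse(AssumedDefault)
}
```

For example, `SortScopeResolution.resolveSortScope(Some("no_sort"), Some("GLOBAL_SORT"))` picks the load-level value, while passing `None` for the load level falls through to the table-level value.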
   
   
    =====
     - [x] Any interfaces changed?   --->   No
     - [x] Any backward compatibility impacted?   ---> No
     - [x] Document update required?   --->   Yes
     - [ ] Testing done
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NamanRastogi/carbondata load_sort_scope

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/3014.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3014
   
----
commit f2bb306acfb3af095b6306be4793acaf8b8380f2
Author: namanrastogi <naman.rastogi.52@...>
Date:   2018-12-21T07:33:30Z

    Added load level SORT_SCOPE

----


---

[GitHub] carbondata issue #3014: [WIP] Added load level SORT_SCOPE

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1897/



---

[GitHub] carbondata issue #3014: [WIP] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2107/



---

[GitHub] carbondata issue #3014: [WIP] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10152/



---

[GitHub] carbondata issue #3014: [WIP] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1958/



---

[GitHub] carbondata pull request #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3014#discussion_r244092283
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -191,10 +191,17 @@ case class CarbonLoadDataCommand(
         optionsFinal
           .put("complex_delimiter_level_4",
             ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value())
    -    optionsFinal.put("sort_scope", tableProperties.asScala.getOrElse("sort_scope",
    -      carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
    -        carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE,
    -          CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT))))
    +    optionsFinal.put(
    --- End diff --
   
    please format the code


---

[GitHub] carbondata pull request #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3014#discussion_r244092650
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -191,10 +191,17 @@ case class CarbonLoadDataCommand(
         optionsFinal
           .put("complex_delimiter_level_4",
             ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value())
    -    optionsFinal.put("sort_scope", tableProperties.asScala.getOrElse("sort_scope",
    -      carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
    -        carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE,
    -          CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT))))
    +    optionsFinal.put(
    +      "sort_scope",
    +      options.getOrElse(
    +        "sort_scope",
    +        tableProperties.asScala.getOrElse(
    +          "sort_scope",
    +          carbonProperty.getProperty(
    +            CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
    +            carbonProperty.getProperty(
    +              CarbonCommonConstants.LOAD_SORT_SCOPE,
    +              CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT)))))
    --- End diff --
   
    Please handle this for the SDK and the file format as well.


---

[GitHub] carbondata pull request #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3014#discussion_r244092819
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableWithSortScope.scala ---
    @@ -32,25 +32,6 @@ class TestCreateTableWithSortScope extends QueryTest with BeforeAndAfterAll {
         sql("DROP TABLE IF EXISTS tableWithBatchSort")
         sql("DROP TABLE IF EXISTS tableWithNoSort")
         sql("DROP TABLE IF EXISTS tableWithUnsupportSortScope")
    -    sql("DROP TABLE IF EXISTS tableLoadWithSortScope")
    -  }
    -
    -  test("Do not support load data with specify sort scope") {
    -    sql(
    -    s"""
    -       | CREATE TABLE tableLoadWithSortScope(
    -       | intField INT,
    -       | stringField STRING
    -       | )
    -       | STORED BY 'carbondata'
    -       | TBLPROPERTIES('SORT_COLUMNS'='stringField')
    -       """.stripMargin)
    -
    -    val exception_loaddata_sortscope: Exception = intercept[Exception] {
    -      sql("LOAD DATA LOCAL INPATH '/path/to/data' INTO TABLE tableLoadWithSortScope " +
    -          "OPTIONS('SORT_SCOPE'='GLOBAL_SORT')")
    -    }
    -    assert(exception_loaddata_sortscope.getMessage.contains("Error: Invalid option(s): sort_scope"))
    --- End diff --
   
    Please add a test case: set a different sort scope in CREATE TABLE, in the load options, and in the carbon property, then check the resulting sort scope with DESCRIBE FORMATTED.
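
The combinations the reviewer asks for can be laid out as a small table of cases. The sketch below only checks the precedence logic in isolation; it does not touch CarbonData or Spark, and a real test would instead run CREATE TABLE, LOAD DATA, and DESCRIBE FORMATTED through the project's test harness. `SortScopePrecedenceCheck`, `effectiveSortScope`, and `AssumedDefault` are hypothetical names, and the default value is an assumption.

```scala
object SortScopePrecedenceCheck {
  // Assumed default, used only for this illustration.
  val AssumedDefault = "LOCAL_SORT"

  // Levels are passed from highest to lowest priority; the first defined one wins.
  def effectiveSortScope(levels: Seq[Option[String]]): String =
    levels.flatten.headOption.getOrElse(AssumedDefault)

  // (load option, table property, carbon property) -> expected effective scope
  val cases: Seq[((Option[String], Option[String], Option[String]), String)] = Seq(
    (Some("NO_SORT"), Some("GLOBAL_SORT"), Some("BATCH_SORT")) -> "NO_SORT",
    (None, Some("GLOBAL_SORT"), Some("BATCH_SORT")) -> "GLOBAL_SORT",
    (None, None, Some("BATCH_SORT")) -> "BATCH_SORT",
    (None, None, None) -> AssumedDefault
  )

  // Describes every case whose resolved scope differs from the expected one.
  def failures: Seq[String] = cases.collect {
    case ((load, table, prop), expected)
        if effectiveSortScope(Seq(load, table, prop)) != expected =>
      s"load=$load table=$table property=$prop expected=$expected"
  }
}
```

An empty `SortScopePrecedenceCheck.failures` means every listed combination resolved to the expected level.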


---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10210/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2232/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2054/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10306/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    @NamanRastogi please fix test failure


---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2130/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10384/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2336/



---

[GitHub] carbondata pull request #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user NamanRastogi commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3014#discussion_r244947621
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
    @@ -191,10 +191,17 @@ case class CarbonLoadDataCommand(
         optionsFinal
           .put("complex_delimiter_level_4",
             ComplexDelimitersEnum.COMPLEX_DELIMITERS_LEVEL_4.value())
    -    optionsFinal.put("sort_scope", tableProperties.asScala.getOrElse("sort_scope",
    -      carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
    -        carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE,
    -          CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT))))
    +    optionsFinal.put(
    +      "sort_scope",
    +      options.getOrElse(
    +        "sort_scope",
    +        tableProperties.asScala.getOrElse(
    +          "sort_scope",
    +          carbonProperty.getProperty(
    +            CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
    +            carbonProperty.getProperty(
    +              CarbonCommonConstants.LOAD_SORT_SCOPE,
    +              CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT)))))
    --- End diff --
   
    No need to handle for SDK.
    Done for PreAgg.


---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2353/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10399/



---

[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3014
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2147/



---