[GitHub] carbondata pull request #2642: [CARBONDATA-2532][Integration] Carbon to supp...


[GitHub] carbondata pull request #2642: [CARBONDATA-2532][Integration] Carbon to supp...

qiuchenjian-2
GitHub user sujith71955 opened a pull request:

    https://github.com/apache/carbondata/pull/2642

    [CARBONDATA-2532][Integration] Carbon to support spark 2.3.1 version

    To hide the columnar vector API compatibility issues from the existing common classes, this PR introduces a proxy vector reader interface. The proxy vector readers
    handle the compatibility differences between Spark versions.
    This PR addresses the column vector and columnar batch interface compatibility issues. The changes correspond to the following modifications made to the Spark interfaces.
   
    Highlights:
    a) This is a refactoring of ColumnVector hierarchy and related classes. By Sujith
    b) make ColumnVector read-only. By Sujith
    c) introduce WritableColumnVector with write interface. By Sujith
    d) remove ReadOnlyColumnVector. By Sujith
    e) Fixed spark-carbon integration API compatibility issues. By Sandeep Katta
    f) Corrected the test cases for the Spark 2.3.0 behaviour changes. By Sandeep Katta
    g) Excluded the net.jpountz.lz4 dependency from the pom.xml files, because Spark 2.3.0 changed
    it to org.lz4; it is removed from the test classpath of spark2, spark-common-test and spark2-examples.
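
A minimal sketch of the proxy idea described above, under stated assumptions. Every name in it (`VectorProxy`, `OldStyleVector`, `NewStyleVector`, `NewStyleWritableVector`, the two proxy classes, `fillAndSum`) is hypothetical and only stands in for the Spark and CarbonData types; none of it is the real `CarbonVectorProxy` API. It shows shared code writing through one version-neutral interface, while one proxy per Spark line hides whether the underlying vector is a single read/write class or a read-only base type with a writable subtype.

```java
// Version-neutral contract that shared reader code programs against.
// Hypothetical: the method set is illustrative, not CarbonVectorProxy's real API.
interface VectorProxy {
  void putInt(int rowId, int value);
  int getInt(int rowId);
}

// Stand-in for a pre-2.3 style vector: a single class that reads and writes.
final class OldStyleVector {
  private final int[] data;
  OldStyleVector(int capacity) { data = new int[capacity]; }
  void putInt(int rowId, int value) { data[rowId] = value; }
  int getInt(int rowId) { return data[rowId]; }
}

// Stand-ins for the 2.3 style split: read-only base type, writable subtype.
abstract class NewStyleVector {
  abstract int getInt(int rowId);
}

final class NewStyleWritableVector extends NewStyleVector {
  private final int[] data;
  NewStyleWritableVector(int capacity) { data = new int[capacity]; }
  void putInt(int rowId, int value) { data[rowId] = value; }
  @Override int getInt(int rowId) { return data[rowId]; }
}

// One proxy per Spark line; only these classes know the concrete vector type.
final class OldStyleProxy implements VectorProxy {
  private final OldStyleVector vector;
  OldStyleProxy(int capacity) { vector = new OldStyleVector(capacity); }
  @Override public void putInt(int rowId, int value) { vector.putInt(rowId, value); }
  @Override public int getInt(int rowId) { return vector.getInt(rowId); }
}

final class NewStyleProxy implements VectorProxy {
  private final NewStyleWritableVector vector;
  NewStyleProxy(int capacity) { vector = new NewStyleWritableVector(capacity); }
  @Override public void putInt(int rowId, int value) { vector.putInt(rowId, value); }
  @Override public int getInt(int rowId) { return vector.getInt(rowId); }
}

// Shared code: it compiles against VectorProxy only, so the API split does not reach it.
final class VectorProxyDemo {
  static int fillAndSum(VectorProxy proxy, int[] values) {
    for (int i = 0; i < values.length; i++) {
      proxy.putInt(i, values[i]);
    }
    int sum = 0;
    for (int i = 0; i < values.length; i++) {
      sum += proxy.getInt(i);
    }
    return sum;
  }

  public static void main(String[] args) {
    int[] values = {1, 2, 3};
    System.out.println(fillAndSum(new OldStyleProxy(3), values));
    System.out.println(fillAndSum(new NewStyleProxy(3), values));
  }
}
```

In this sketch the shared `fillAndSum` method never names a concrete vector class, so supporting another Spark line would only mean adding another proxy implementation.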

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sujith71955/incubator-carbondata mas_mig_spark2.3_carbon_latest

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2642.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2642
   
----
commit 7359151b612d3403e53c4759c853e1ab681fae7f
Author: sujith71955 <sujithchacko.2010@...>
Date:   2018-05-24T05:51:50Z

    [CARBONDATA-2532][Integration] Carbon to support spark 2.3 version, ColumnVector Interface
   
    Column vector and Columnar Batch interface compatibility issues have been
    addressed in this PR. The changes relate to the following modifications
    made to the Spark interfaces:
    a) This is a refactoring of ColumnVector hierarchy and related classes.
    b) make ColumnVector read-only
    c) introduce WritableColumnVector with write interface
    d) remove ReadOnlyColumnVector
   
    To hide the columnar vector API compatibility issues from the existing
    common classes, this PR introduces a proxy vector reader interface. The
    proxy vector readers handle the compatibility differences between
    Spark versions.

commit 5934d975b53276b2490c6c178ae5b71f539dac60
Author: sandeep-katta <sandeep.katta2007@...>
Date:   2018-07-06T04:31:29Z

    [CARBONDATA-2532][Integration] Carbon to support spark 2.3 version, compatibility issues

    Addressed all compatibility issues found when supporting Spark 2.3.
    Added support for the pom profile -P"spark-2.3".

----


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
Github user sujith71955 commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    @sandeep-katta @gvramana


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6284/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7933/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6285/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6656/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6297/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sandeep-katta commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    4 test cases are failing in the SDV build, and they are not related to the code changes in this PR.
    The same 4 test cases are also failing in another PR; refer to https://github.com/apache/carbondata/pull/2643


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7946/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6669/



---

[GitHub] carbondata pull request #2642: [CARBONDATA-2532][Integration] Carbon to supp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2642#discussion_r211463185
 
    --- Diff: integration/spark2/src/main/java/org/apache/carbondata/spark/vectorreader/ColumnarVectorWrapper.java ---
    @@ -25,198 +25,204 @@
     import org.apache.carbondata.spark.util.CarbonScalaUtil;
     
     import org.apache.parquet.column.Encoding;
    -import org.apache.spark.sql.execution.vectorized.ColumnVector;
    +import org.apache.spark.sql.CarbonVectorProxy;
     import org.apache.spark.sql.types.Decimal;
     
     class ColumnarVectorWrapper implements CarbonColumnVector {
     
    -  private ColumnVector columnVector;
    +  private CarbonVectorProxy writableColumnVector;
    --- End diff --
   
    It would be better to give this member a general name instead of the name of a Spark 2.3 class.


---

[GitHub] carbondata pull request #2642: [CARBONDATA-2532][Integration] Carbon to supp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2642#discussion_r211841539
 
    --- Diff: integration/spark2/src/main/java/org/apache/carbondata/spark/vectorreader/ColumnarVectorWrapper.java ---
    @@ -25,198 +25,204 @@
     import org.apache.carbondata.spark.util.CarbonScalaUtil;
     
     import org.apache.parquet.column.Encoding;
    -import org.apache.spark.sql.execution.vectorized.ColumnVector;
    +import org.apache.spark.sql.CarbonVectorProxy;
     import org.apache.spark.sql.types.Decimal;
     
     class ColumnarVectorWrapper implements CarbonColumnVector {
     
    -  private ColumnVector columnVector;
    +  private CarbonVectorProxy writableColumnVector;
    --- End diff --
   
    agree


---

[GitHub] carbondata pull request #2642: [CARBONDATA-2532][Integration] Carbon to supp...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2642#discussion_r211841980
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableUsingSparkCarbonFileFormat.scala ---
    @@ -111,10 +112,10 @@ class TestCreateTableUsingSparkCarbonFileFormat extends QueryTest with BeforeAnd
         sql("DROP TABLE IF EXISTS sdkOutputTable")
     
         //data source file format
    -    if (sqlContext.sparkContext.version.startsWith("2.1")) {
    +    if (SparkUtil.isSparkVersionEqualToX("2.1")) {
    --- End diff --
   
    rename to `isSparkVersionEqualTo`
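
A minimal sketch of what a helper with the suggested name could look like, assuming it compares a version string against a major.minor prefix. The class `SparkVersionCheck` and the two-argument signature are hypothetical; CarbonData's real `SparkUtil` may be implemented differently, and in a running application the version string would come from the Spark context rather than a literal.

```java
// Hypothetical helper; not CarbonData's actual SparkUtil.
final class SparkVersionCheck {
  private SparkVersionCheck() {}

  // True when sparkVersion is exactly `expected`, or starts with `expected`
  // followed by a dot, so "2.3" matches "2.3.1" and "2.3.2" but not "2.30.0".
  static boolean isSparkVersionEqualTo(String sparkVersion, String expected) {
    return sparkVersion.equals(expected) || sparkVersion.startsWith(expected + ".");
  }

  public static void main(String[] args) {
    System.out.println(isSparkVersionEqualTo("2.3.1", "2.3"));  // true
    System.out.println(isSparkVersionEqualTo("2.3.2", "2.3"));  // true
    System.out.println(isSparkVersionEqualTo("2.1.0", "2.3"));  // false
    System.out.println(isSparkVersionEqualTo("2.10.0", "2.1")); // false
  }
}
```

Matching on the prefix plus a dot is a deliberate choice in this sketch: a plain `startsWith("2.1")` check, as in the removed line of the diff, would also accept a version such as "2.10.0".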


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user aaron-aa commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Hi @sujith71955, it's great to see you doing the integration work for the latest Spark release. What is the schedule for merging this pull request into master? My company's production Spark version is 2.3.1, so we are looking forward to your progress. Thanks very much!


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Hi @aaron-aa, I think it's better to use Spark 2.3.2. It fixes some big issues that were found in Spark 2.3.1, and it will be released soon. What do you think?


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user sujith71955 commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    @aaron-aa @zzcclp The rebase work is pending; I will probably finish it in a day or two. I will check with the committers regarding the merge plan for this feature.
    As @zzcclp said, it will be better to use Spark 2.3.2 because of some major defect fixes, but the release date for Spark 2.3.2 is currently unclear. In any case, once this feature is merged, rebasing onto Spark 2.3.2 will take very little effort.


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user aaron-aa commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    @sujith71955 @zzcclp
    Thanks a lot for the information, which helps me reschedule my plan in advance! I hope Spark 2.3.2 comes out soon; for now I will try to work on Spark 2.2.1.


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    Now that Spark 2.3.2 is about to be released, can this PR work with the whole Spark 2.3 branch, including 2.3.2?


---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6455/



---

[GitHub] carbondata issue #2642: [CARBONDATA-2532][Integration] Carbon to support spa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2642
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6456/



---