[GitHub] carbondata pull request #1538: [CARBONDATA-1779] GenericVectorizedReader

[GitHub] carbondata pull request #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
GitHub user bhavya411 opened a pull request:

    https://github.com/apache/carbondata/pull/1538

    [CARBONDATA-1779]  GenericVectorizedReader

    Be sure to complete the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - No  interfaces changed?
     
     - No backward compatibility impacted?
     
     - No Document update required?
   
     - [Yes] Testing done
            - All unit test cases are passing; no new unit test cases were needed, as this PR implements a generic vectorized reader.
            - Manual testing has been completed as well.
           
   
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bhavya411/incubator-carbondata CARBONDATA-1779

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1538.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1538
   
----
commit ef28391c656cc2d20082e52dd4ab729b0992cfb3
Author: Bhavya <[hidden email]>
Date:   2017-11-14T10:05:44Z

    Added Generic vectorized Reader

----


---

[GitHub] carbondata issue #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1538
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1312/



---

[GitHub] carbondata pull request #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1538#discussion_r152168110
 
    --- Diff: integration/presto/pom.xml ---
    @@ -31,7 +31,7 @@
       <packaging>presto-plugin</packaging>
     
       <properties>
    -    <presto.version>0.186</presto.version>
    --- End diff --
   
    Why was the Presto version changed again?


---

[GitHub] carbondata issue #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/1538
 
    @bhavya411 please add a detailed description for this pull request.


---

[GitHub] carbondata pull request #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1538#discussion_r152498613
 
    --- Diff: integration/presto/pom.xml ---
    @@ -31,7 +31,7 @@
       <packaging>presto-plugin</packaging>
     
       <properties>
    -    <presto.version>0.186</presto.version>
    --- End diff --
   
    There was an issue with running multiple queries on Presto 0.186, so that release has been declared unstable. The issue was fixed in 0.187, so I have upgraded to 0.187.


---

[GitHub] carbondata issue #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user bhavya411 commented on the issue:

    https://github.com/apache/carbondata/pull/1538
 
    This PR removes the Spark dependency that the Presto integration module needed in order to use CarbonVectorizedRecordReader. It also consolidates CarbonVectorizedRecordReader into a single implementation that is shared by all integration modules.
   
    In the earlier version of the Presto integration we were using Spark's ColumnarBatch, which is not good practice. Here we provide our own implementations of ColumnVector and VectorBatch to eliminate the Spark dependency altogether. This generic ColumnVector can now be used by any integration module that needs a vectorized reader to speed up processing.
   
    Some core module classes were changed to ensure that we use Java data types instead of Spark data types, Decimal being one of them.
   
    This PR tries to limit the changes to the core module.
   
    Newly added classes:
    1. CarbonColumnVectorImpl: This class implements the CarbonColumnVector interface and provides methods to store data in a vector and to retrieve data from it.
   
    2. CarbonVectorBatch: This class creates a VectorizedRowBatch, which is a set of rows organized with each column as a CarbonVector. It is the unit of query execution, organized to minimize the cost per row and achieve high instructions-per-cycle. The major fields are public by design to allow fast and convenient access by the vectorized query execution code.
   
    3. StructField: This class is used to pass the schema information to the Carbon columnar batch.
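
The relationship between the three classes described above can be illustrated with a minimal sketch. This is not the code from the PR: the names SketchField, SketchColumnVector and SketchVectorBatch, and all of their methods, are hypothetical and chosen only for illustration. The sketch assumes that a batch is built from a schema, holds one vector per column, and stores only plain Java types (for example java.math.BigDecimal for decimals) so that nothing depends on Spark.

```java
import java.math.BigDecimal;
import java.util.Arrays;

// Illustrative sketch only. Every name below is hypothetical and is not
// the API of the classes added by the PR.

/** One schema entry: a column name plus the plain Java type it holds. */
final class SketchField {
  final String name;
  final Class<?> javaType;

  SketchField(String name, Class<?> javaType) {
    this.name = name;
    this.javaType = javaType;
  }
}

/** A single column of values with a parallel null mask. */
final class SketchColumnVector {
  private final Class<?> javaType;
  private final Object[] values;
  private final boolean[] nulls;

  SketchColumnVector(Class<?> javaType, int capacity) {
    this.javaType = javaType;
    this.values = new Object[capacity];
    this.nulls = new boolean[capacity];
  }

  /** Stores a value, rejecting anything that is not the declared Java type. */
  void put(int rowId, Object value) {
    if (!javaType.isInstance(value)) {
      throw new IllegalArgumentException("expected " + javaType.getName());
    }
    values[rowId] = value;
    nulls[rowId] = false;
  }

  void putNull(int rowId) {
    values[rowId] = null;
    nulls[rowId] = true;
  }

  boolean isNullAt(int rowId) {
    return nulls[rowId];
  }

  Object get(int rowId) {
    return values[rowId];
  }

  /** Clears the column so the same arrays can be reused for the next batch. */
  void reset() {
    Arrays.fill(values, null);
    Arrays.fill(nulls, false);
  }
}

/** A fixed-capacity batch of rows, stored as one vector per schema field. */
final class SketchVectorBatch {
  private final SketchColumnVector[] columns;
  private final int capacity;
  private int rowCount;

  SketchVectorBatch(SketchField[] schema, int capacity) {
    this.capacity = capacity;
    this.columns = new SketchColumnVector[schema.length];
    for (int i = 0; i < schema.length; i++) {
      columns[i] = new SketchColumnVector(schema[i].javaType, capacity);
    }
  }

  SketchColumnVector column(int ordinal) {
    return columns[ordinal];
  }

  int rowCount() {
    return rowCount;
  }

  void setRowCount(int rowCount) {
    if (rowCount < 0 || rowCount > capacity) {
      throw new IllegalArgumentException("row count out of range: " + rowCount);
    }
    this.rowCount = rowCount;
  }

  /** Resets every column and the row count before the batch is refilled. */
  void reset() {
    for (SketchColumnVector c : columns) {
      c.reset();
    }
    rowCount = 0;
  }
}
```

In this sketch a reader would fill each column for up to `capacity` rows, call `setRowCount`, hand the batch to the query engine, and then call `reset` before reusing it. Keeping the value types to standard Java classes is what would let such a batch be shared across integration modules without pulling in an engine-specific dependency.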



---

[GitHub] carbondata pull request #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1538#discussion_r152741031
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/SafeVariableLengthDimensionDataChunkStore.java ---
    @@ -141,24 +135,25 @@ public SafeVariableLengthDimensionDataChunkStore(boolean isInvertedIndex, int nu
           // for last record
           length = (short) (this.data.length - currentDataOffset);
         }
    -    DataType dt = vector.getType();
    -    if ((!(dt instanceof StringType) && length == 0) || ByteUtil.UnsafeComparer.INSTANCE
    +    org.apache.carbondata.core.metadata.datatype.DataType dt = vector.getType();
    --- End diff --
   
    Why make this change? Why remove the import and use the fully qualified name here?


---

[GitHub] carbondata pull request #1538: [CARBONDATA-1779] GenericVectorizedReader

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1538


---