Hello All,
I am working on supporting the Binary data type. Please find the scope and design approach below.

**Scope:**
1. CREATE TABLE DDL support for Binary data type columns.
2. Support loading data into Binary data type columns (data load and INSERT INTO).
3. Support querying Binary data type columns.
4. DESCRIBE FORMATTED support to display Binary data type columns.

**Proposed Solution:**
1. Implement a binary converter, BinaryFieldConverterImpl, which takes a CarbonRow as input and converts the data to a hex-decoded byte array during the RowConverterImpl step.
2. Create a column page for the Binary data type, similar to the BYTE_ARRAY page.
3. Use DIRECT_COMPRESS as the encoding type for the Binary data type: compress the data obtained from getLVFlattenedBytePage() and return the encoded data. While decoding, check whether the column is of the BINARY data type and, if so, decode it into a newBinaryPage().
4. For querying Binary data type columns, implement a BinaryVectorFiller, which fills byte array data into a CarbonColumnVector.

Please provide your inputs and comments. Any suggestions from the community are most welcome.

Regards,
Indhumathi

-- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
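To make steps 1 and 3 of the proposal concrete, here is a minimal, standalone Java sketch of hex decoding followed by a length-value (LV) flattening round trip. It is only an illustration under assumptions: the class name BinaryColumnSketch and the methods hexToBytes, flattenLV, and unflattenLV are hypothetical and are not CarbonData APIs, the 4-byte length prefix is assumed and may not match what getLVFlattenedBytePage() actually writes, and the compression step is left out.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in CarbonData.
public class BinaryColumnSketch {

  // Step 1 (assumed behavior): decode a hex string such as "cafe" into raw bytes.
  public static byte[] hexToBytes(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("odd-length hex string");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("non-hex character in input");
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  // Step 3 (assumed layout): flatten values as [4-byte length][value bytes] pairs.
  // A real page would pass this buffer to a compressor afterwards.
  public static byte[] flattenLV(List<byte[]> values) {
    int total = 0;
    for (byte[] v : values) {
      total += Integer.BYTES + v.length;
    }
    ByteBuffer buf = ByteBuffer.allocate(total);
    for (byte[] v : values) {
      buf.putInt(v.length);
      buf.put(v);
    }
    return buf.array();
  }

  // Decoding side: read the length prefix, then that many value bytes, until exhausted.
  public static List<byte[]> unflattenLV(byte[] page) {
    ByteBuffer buf = ByteBuffer.wrap(page);
    List<byte[]> values = new ArrayList<>();
    while (buf.hasRemaining()) {
      byte[] v = new byte[buf.getInt()];
      buf.get(v);
      values.add(v);
    }
    return values;
  }

  public static void main(String[] args) {
    List<byte[]> column =
        Arrays.asList(hexToBytes("cafe"), hexToBytes(""), hexToBytes("00ff10"));
    byte[] page = flattenLV(column);
    for (byte[] v : unflattenLV(page)) {
      System.out.println(Arrays.toString(v));
    }
  }
}
```

The length prefix is what lets variable-length binary values, including empty ones, share a single flat buffer and still be split apart again when the page is decoded.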
There is an existing PR 2665 that works on the binary data type. Is your work based on that PR, or is it a new one?
Regards,
Jacky

> On September 14, 2018, at 2:30 PM, Indhumathi <[hidden email]> wrote:
What is the difference between string and binary datatype in your processing?
Will you introduce special UDFs for the binary datatype? Lastly, since you are adding a new datatype, test cases for index datamaps should also be covered.
Hi Jacky Li,
Yes, I am extending PR-2670 and working on top of it to support the binary data type.

Regards,
Indhumathi M