[DISCUSSION] Support Binary DataType

Indhumathi
Hello All,

I am working on supporting the Binary data type. The scope and design
approach are given below.

**Scope:**
  1. Support binary data type columns in the CREATE TABLE DDL.
  2. Support loading data into binary data type columns (data load and
     INSERT INTO).
  3. Support querying binary data type columns.
  4. Support displaying binary data type columns in DESCRIBE FORMATTED.

**Proposed Solution:**
 1. Implement a binary converter, BinaryFieldConverterImpl, which takes a
     CarbonRow as input and converts the data to a hex-decoded byte array
     during the RowConverterImpl step.
 2. Create a column page for the binary data type, similar to the
     BYTE_ARRAY page.
 3. Use DIRECT_COMPRESS as the encoding type for the binary data type:
     compress the data from getLVFlattenedBytePage() and return the
     encoded data. While decoding, check whether the column is of the
     BINARY data type and, if so, decode it into a newBinaryPage().
 4. For querying binary data type columns, implement a BinaryVectorFiller,
     which fills the byte array data into a CarbonColumnVector.
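
 As a rough illustration of steps 1 and 3, the sketch below hex-decodes an
 input string into a byte array and then flattens several values into one
 length-value (LV) buffer and back. This is not CarbonData code: the class
 and method names (BinaryColumnSketch, hexToBytes, flattenLV, unflattenLV)
 are hypothetical, and the 4-byte length prefix is an assumption about what
 an LV layout might look like. The compression step is left out.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CarbonData.
final class BinaryColumnSketch {

  // Step 1 (sketch): decode a hex string such as "cafe" into raw bytes.
  static byte[] hexToBytes(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("odd-length hex string");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("non-hex character in input");
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  // Step 3 (sketch): write each value as [4-byte length][bytes].
  // The 4-byte prefix is an assumption made for this example.
  static byte[] flattenLV(List<byte[]> values) {
    int total = 0;
    for (byte[] v : values) {
      total += Integer.BYTES + v.length;
    }
    ByteBuffer buf = ByteBuffer.allocate(total);
    for (byte[] v : values) {
      buf.putInt(v.length);
      buf.put(v);
    }
    return buf.array();
  }

  // Decoding side: read length-prefixed values back into separate arrays.
  static List<byte[]> unflattenLV(byte[] flat) {
    List<byte[]> values = new ArrayList<>();
    ByteBuffer buf = ByteBuffer.wrap(flat);
    while (buf.hasRemaining()) {
      byte[] v = new byte[buf.getInt()];
      buf.get(v);
      values.add(v);
    }
    return values;
  }
}
```

 A length prefix lets variable-length values share one contiguous buffer,
 which can then be handed to a compressor as a single block.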

 Please share your inputs and comments. Suggestions from the community are
 most welcome.

Regards,
Indhumathi




--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [DISCUSSION] Support Binary DataType

Jacky Li
There is an existing PR 2665 that works on the binary data type. Is your work based on that PR, or is it a new one?

Regards,
Jacky

> On September 14, 2018, at 2:30 PM, Indhumathi <[hidden email]> wrote:




Re: [DISCUSSION] Support Binary DataType

xuchuanyin
What is the difference between string and binary datatype in your processing?
Will you introduce special UDFs for the binary datatype?

Lastly, since you are adding a new data type, test cases for the index datamaps should also be considered.

Re: [DISCUSSION] Support Binary DataType

Indhumathi
In reply to this post by Jacky Li
Hi Jacky Li,

Yes. I am extending PR-2670 and continuing the work on the binary data type from there.

Regards,
Indhumathi M


