XOR encoding for floating point

geetikagupta
Hi Community,

I was looking into CARBONDATA-1128
<https://issues.apache.org/jira/browse/CARBONDATA-1128>. The document
attached to the Jira describes compression of timestamp and decimal
values. The decimal values are compressed using XOR. I would like to
contribute to one of its subtasks, CARBONDATA-1130
<https://issues.apache.org/jira/browse/CARBONDATA-1130>.

--
Regards,
Geetika Gupta

Re: XOR encoding for floating point

Liang Chen
Administrator
Hi Geetika

Very happy to see that you are interested in contributing this feature.
Please have the design discussion before you start to code.

Regards
Liang


Re: XOR encoding for floating point

Jacky Li
In reply to this post by geetikagupta
+1

Feel free to contribute :)
To implement this feature, I think you need to break it into the following subtasks:
1. Extend ColumnPageCodec to implement XOR encoding.
2. Come up with the criteria for selecting this encoding, and change the behavior of DefaultEncodingStrategy accordingly.
3. Add SQL syntax for this encoding.

The encoding override work is still in progress and the SQL syntax part is missing, so point 3 can be done later.
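
For reference, the core XOR step that such a codec would wrap can be sketched as below. This is only a minimal illustration, not CarbonData code: the class name XorDoubleSketch and its methods are hypothetical, it does not use the ColumnPageCodec interface, and it leaves out the bit-packing of the XOR results that an actual encoder would need to save space.

```java
// Hypothetical standalone sketch; not part of CarbonData and not tied to ColumnPageCodec.
public final class XorDoubleSketch {

    private XorDoubleSketch() {
    }

    // XOR each value's IEEE 754 bit pattern with that of the previous value.
    // The first value is XORed with 0, so it is stored unchanged.
    public static long[] encode(double[] values) {
        long[] out = new long[values.length];
        long prev = 0L;
        for (int i = 0; i < values.length; i++) {
            long bits = Double.doubleToRawLongBits(values[i]);
            out[i] = bits ^ prev;
            prev = bits;
        }
        return out;
    }

    // Reverse the XOR chain to recover the original values.
    public static double[] decode(long[] xored) {
        double[] out = new double[xored.length];
        long prev = 0L;
        for (int i = 0; i < xored.length; i++) {
            long bits = xored[i] ^ prev;
            out[i] = Double.longBitsToDouble(bits);
            prev = bits;
        }
        return out;
    }
}
```

When neighbouring values are equal or close, the XOR results are zero or contain long runs of zero bits, and that is what a later bit-packing step would exploit.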


Regards,
Jacky

> On July 5, 2017, at 3:32 PM, Geetika Gupta <[hidden email]> wrote:
>
> Hi Community,
>
> I was looking into CARBONDATA-1128
> <https://issues.apache.org/jira/browse/CARBONDATA-1128>. The document
> attached with the Jira describes about compression of timestamp and decimal
> values. The decimal values are compressed using XOR. So I would like to
> contribute to one of its subtask i.e. CARBONDATA-1130
> <https://issues.apache.org/jira/browse/CARBONDATA-1130>.
>
> --
> Regards,
> Geetika Gupta




Re: XOR encoding for floating point

geetikagupta
Hi Jacky Li,

XOR encoding mainly works on time series data, as discussed in the paper. We
looked into the classes you suggested and found that we will have the min
and max values for our data. However, we first need to identify whether the
data is a time series; only then can XOR encoding be effective.

So do we need to check for time series data before performing the encoding?
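
To make the question concrete, one possible check is sketched below. It is an assumption for discussion, not something taken from the paper or from CarbonData: the class name XorSuitabilitySketch, the method names, and the threshold parameter are all hypothetical. Instead of testing for time series directly, it estimates how many bits per value remain after XORing neighbouring values.

```java
// Hypothetical sketch of a selection heuristic; names and threshold are assumptions.
public final class XorSuitabilitySketch {

    private XorSuitabilitySketch() {
    }

    // Average number of "meaningful" bits left after XORing each value with its predecessor.
    // A zero XOR is counted as 1 bit; otherwise the bits between the leading and
    // trailing zero runs are counted. Fewer bits suggests XOR encoding would pay off.
    public static double estimateBitsPerValue(double[] values) {
        if (values.length < 2) {
            return 64.0; // nothing to compare against, assume no gain
        }
        long totalBits = 0;
        long prev = Double.doubleToRawLongBits(values[0]);
        for (int i = 1; i < values.length; i++) {
            long bits = Double.doubleToRawLongBits(values[i]);
            long xor = bits ^ prev;
            if (xor == 0L) {
                totalBits += 1;
            } else {
                totalBits += 64 - Long.numberOfLeadingZeros(xor) - Long.numberOfTrailingZeros(xor);
            }
            prev = bits;
        }
        return (double) totalBits / (values.length - 1);
    }

    // True when the estimate falls below a caller-chosen threshold (in bits per value).
    public static boolean looksSuitable(double[] values, double maxBitsPerValue) {
        return estimateBitsPerValue(values) < maxBitsPerValue;
    }
}
```

A check like this could be run on a sample of the column page, with the threshold tuned by experiment.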

--
Regards,
Geetika Gupta


