[Discussion] Support for Float and Byte data types

kunalkapoor
Hi dev,
I am working on supporting the Float and Byte data types.

*Background*
Currently, float is supported by internally storing the data as double and
changing the data type to Double. This causes problems when
SparkCarbonFileFormat is used to read float data.
Because the data type is changed internally from Float to Double, the
data is retrieved as a Double page instead of a Float page.
If a user creates a table through the file format and specifies float as
the data type of any column, the query fails. The user is *restricted
to using double to retrieve the data.*
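
To make the background concrete, here is a minimal Python sketch of what widening float to double does to a stored value. It is not CarbonData code: `to_float32` and `check_declared_type` are hypothetical helpers written for this illustration, and the type check only stands in for whatever validation actually makes the query fail.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def check_declared_type(declared: str, stored: str) -> None:
    """Hypothetical reader-side check: the declared column type must match
    the type recorded in storage."""
    if declared != stored:
        raise TypeError(f"column declared as {declared} but stored as {stored}")


value = to_float32(0.1)               # the value a float column would hold
as_double = struct.pack("<d", value)  # widened to double for storage
as_float = struct.pack("<f", value)   # native 32-bit representation

# Widening is lossless, but it doubles the width and changes the stored type.
assert struct.unpack("<d", as_double)[0] == value
assert (len(as_float), len(as_double)) == (4, 8)

try:
    check_declared_type(declared="float", stored="double")
except TypeError as err:
    mismatch = str(err)  # the declared and stored types disagree
```

The value itself survives the round trip, so the problem described above is one of type identity and width, not of precision.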

*Proposed Solution*
Add support for the float data type and store the data as a FloatPage. Most of
the methods that are used for double can be reused for float.

*A similar approach can be used for Byte*.
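
As a rough sketch of the proposal, the following Python models pages that differ only in element type. The class names `ColumnPage`, `DoublePage`, `FloatPage` and `BytePage` and their methods are assumptions made for this example and do not reflect the real CarbonData classes; the point is only that the shared methods can be reused while each page keeps its own width.

```python
from array import array


class ColumnPage:
    """Hypothetical fixed-width page; `typecode` selects the element type."""

    typecode = "d"        # C double
    data_type = "double"

    def __init__(self, values=()):
        self.values = array(self.typecode, values)

    def put(self, value):
        self.values.append(value)

    def get(self, row):
        return self.values[row]

    def size_in_bytes(self):
        return len(self.values) * self.values.itemsize


class DoublePage(ColumnPage):
    pass


class FloatPage(ColumnPage):
    typecode = "f"        # C float; reuses every method of the base page
    data_type = "float"


class BytePage(ColumnPage):
    typecode = "b"        # signed char, the same idea applied to Byte
    data_type = "byte"


rows = [1.5, 2.25, -0.5]
float_page = FloatPage(rows)
double_page = DoublePage(rows)
byte_page = BytePage([1, -2, 127])
```

With this layout `float_page.data_type` reports `float`, so a reader that asks for a float column gets a page of the matching type, and the float page occupies half the bytes of the double page for the same rows.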

Any suggestions from the community are most welcome.

 -Regards
Kunal Kapoor

Re: [Discussion] Support for Float and Byte data types

xuchuanyin
The actual storage data type for a column is recorded at the ColumnPage level.
In the previous implementation, columns with the literal data types 'float' and
'double' shared the same storage data type, 'double', and you want to
distinguish them by adding support for a 'float' storage data type.

Is my understanding right? If so, there will be no compatibility problems.
But we had better mention that for old stores written before 1.5.0, users can
only use double.
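
The compatibility rule above could be expressed roughly as follows. This is only a sketch in Python: `parse_version`, `storage_type` and `FLOAT_STORAGE_SINCE` are hypothetical names, and the 1.5.0 cutoff is taken from this thread rather than from any released code.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '1.5.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Assumed cutoff: stores written before this version hold float data as double.
FLOAT_STORAGE_SINCE = (1, 5, 0)


def storage_type(declared: str, store_version: str) -> str:
    """Return the storage data type to expect for a declared column type."""
    if declared == "float" and parse_version(store_version) < FLOAT_STORAGE_SINCE:
        return "double"  # old store: float columns were written as double
    return declared
```

For an old store, `storage_type("float", "1.4.1")` yields `double`, which is why users of such stores would still have to declare the column as double.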




Re: [Discussion] Support for Float and Byte data types

kunalkapoor
Hi xuchuanyin,
Yes, your understanding is correct, and I agree that the documentation has to
be updated to mention that for old stores the double data type should be used.
For the first phase, let us focus on writing/reading through the SDK and
FileFormat.
What are your thoughts?

On Fri, Sep 14, 2018 at 7:05 AM xuchuanyin <[hidden email]> wrote:

> The actual storage data type for a column is recorded at the ColumnPage level.
> In the previous implementation, columns with the literal data types 'float' and
> 'double' shared the same storage data type, 'double', and you want to
> distinguish them by adding support for a 'float' storage data type.
>
> Is my understanding right? If so, there will be no compatibility problems.
> But we had better mention that for old stores written before 1.5.0, users can
> only use double.

Re: [Discussion] Support for Float and Byte data types

Jacky Li
I think your proposal will also support CarbonSession, not only the SDK and FileFormat, right?

Regards,
Jacky

> On Sep 14, 2018, at 12:34 PM, Kunal Kapoor <[hidden email]> wrote:
>
> Hi xuchuanyin,
> Yes, your understanding is correct, and I agree that the documentation has to
> be updated to mention that for old stores the double data type should be used.
> For the first phase, let us focus on writing/reading through the SDK and
> FileFormat.
> What are your thoughts?

Re: [Discussion] Support for Float and Byte data types

kunalkapoor
Yes, it will support both CarbonSession and the SDK.

On Fri, Sep 14, 2018 at 1:13 PM Jacky Li <[hidden email]> wrote:

> I think your proposal will also support CarbonSession, not only the SDK
> and FileFormat, right?
>
> Regards,
> Jacky