[jira] [Updated] (CARBONDATA-45) Support MAP type

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala updated CARBONDATA-45:
    Fix Version/s:     (was: 1.4.0)

> Support MAP type
> ----------------
>                 Key: CARBONDATA-45
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-45
>             Project: CarbonData
>          Issue Type: New Feature
>          Components: core, sql
>            Reporter: cen yuhai
>            Assignee: Venkata Ramana G
>            Priority: Major
> {code:sql}
> >>CREATE TABLE table1 (
>                  deviceInformationId int,
>                  channelsId string,
>                  props map<key:int,value:string>)
>               STORED BY 'org.apache.carbondata.format'
> >>insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
> {code}
> Format of the data to be read from CSV, with '$' as the level 1 delimiter and '#' terminating map keys:
> {code:sql}
> >>load data local inpath '/tmp/data.csv' into table1 options ('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 'COMPLEX_DELIMITER_FOR_KEY'='#')
> 20,channel2,2#user2$100#usercommon
> 30,channel3,3#user3$100#usercommon
> 40,channel4,4#user3$100#usercommon
> >>select deviceInformationId, props[100] from table1 where deviceInformationId > 10;
> 20, usercommon
> 30, usercommon
> 40, usercommon
> >>select deviceInformationId, props from table1 where props[2] = 'user2';
> 20, {2,'user2', 100, 'usercommon'}
> {code}
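
The delimiter layout used in the load example can be illustrated outside Carbon. The following is a minimal Python sketch that assumes only what the sample rows show: '$' separates map entries and '#' separates a key from its value. `parse_map_cell` and `parse_row` are hypothetical helper names, not CarbonData APIs, and this is not how the CarbonData loader itself is implemented.

```python
def parse_map_cell(cell, entry_delim="$", key_delim="#"):
    """Turn a cell such as '2#user2$100#usercommon' into {2: 'user2', 100: 'usercommon'}."""
    result = {}
    if not cell:
        return result
    for entry in cell.split(entry_delim):
        # Split on the first key delimiter only, so the value is kept whole.
        key, value = entry.split(key_delim, 1)
        result[int(key)] = value
    return result


def parse_row(line):
    """Split one CSV line into (deviceInformationId, channelsId, props)."""
    device_id, channel, props = line.split(",", 2)
    return int(device_id), channel, parse_map_cell(props)


sample_lines = [
    "20,channel2,2#user2$100#usercommon",
    "30,channel3,3#user3$100#usercommon",
    "40,channel4,4#user3$100#usercommon",
]
rows = [parse_row(line) for line in sample_lines]
```

With the sample data, `rows[0][2][100]` corresponds to the `props[100]` lookup in the first query above.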
> The following cases need to be handled:
> ||Sub feature||Pending activity||Remarks||
> |Basic Map type support|Develop|Create table DDL, load map data from CSV, select * from maptable|
> |Map type lookup in projection and filter|Develop|Projection and filters need to be executed in Spark|
> |NULL values, UDFs, Describe support|Develop| |
> |Compaction support|Test + fix|As compaction works at the byte level, no changes are required. Test cases need to be added|
> |Insert into table|Develop|Source table data containing Map data needs to be converted from the Spark data type to string, as Carbon takes a string as the input row|
> |Support DDL for Map fields with Dictionary Include and Dictionary Exclude|Develop|CarbonDictionaryDecoder also needs to handle these|
> |Support multilevel Map|Develop|Currently the DDL is validated to allow only 2 levels; remove this restriction|
> |Support Map value to be a measure|Develop|Currently array and struct support only dimensions, which needs to change|
> |Support Alter table to add and remove Map column|Develop|Implement the DDL; requires default value handling|
> |Push down projection of Map lookup to Carbon|Develop|This is an optimization for when the Map holds a large number of values|
> |Push down filter on Map lookup to Carbon|Develop|This is an optimization for when the Map holds a large number of values|
> |Update Map values|Develop|Update a map value|
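
For the "Insert into table" case, where map data has to be turned into a string input row, the conversion could look roughly like the Python sketch below. It assumes the same '$' and '#' delimiters as the load example; `format_map_cell` and `format_row` are hypothetical names used for illustration only, and rejecting values that contain a delimiter is an assumption of this sketch rather than documented Carbon behaviour.

```python
def format_map_cell(mapping, entry_delim="$", key_delim="#"):
    """Serialize {1: 'user1', 101: 'root'} as '1#user1$101#root'."""
    parts = []
    for key, value in mapping.items():
        text = str(value)
        # Assumption: values containing a delimiter cannot be represented, so fail early.
        if entry_delim in text or key_delim in text:
            raise ValueError(f"value {text!r} contains a delimiter")
        parts.append(f"{key}{key_delim}{text}")
    return entry_delim.join(parts)


def format_row(device_id, channel, props):
    """Build one comma-separated input row with the map as its last field."""
    return f"{device_id},{channel},{format_map_cell(props)}"


line = format_row(10, "channel1", {1: "user1", 101: "root"})
```

Here `line` carries the same values as the `insert into table1 select 10,'channel1', map(1,'user1',101, 'root')` statement in the description.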
> h4. Design suggestion:
> Map can be stored internally as Array<Struct<key,value>>, so the data must be converted to the Map data type when it is handed to Spark. The schema will have a new column of map type, similar to Array.
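
The suggested representation can be sketched in a few lines of Python. This only illustrates the idea of keeping a map as an array of key/value structs and converting it back to a map on read; `Entry`, `to_entries`, `to_map` and `lookup` are hypothetical names, and the int/string types are taken from the `props` column in the example, not from any CarbonData code.

```python
from typing import Dict, List, NamedTuple, Optional


class Entry(NamedTuple):
    """One Struct<key,value> element of the stored array."""
    key: int
    value: str


def to_entries(mapping: Dict[int, str]) -> List[Entry]:
    """Convert a map into its array-of-struct storage form."""
    return [Entry(k, v) for k, v in mapping.items()]


def to_map(entries: List[Entry]) -> Dict[int, str]:
    """Convert the stored array back into a map for the query engine."""
    return {e.key: e.value for e in entries}


def lookup(entries: List[Entry], key: int) -> Optional[str]:
    """Linear scan over the stored array; returns None if the key is absent."""
    for e in entries:
        if e.key == key:
            return e.value
    return None


stored = to_entries({2: "user2", 100: "usercommon"})
restored = to_map(stored)
```

In this sketch a lookup on the stored form scans the array entry by entry, so its cost grows with the number of entries in the map.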

This message was sent by Atlassian JIRA