
     [ https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venkata Ramana G updated CARBONDATA-45:
---------------------------------------
    Description: 
{code:sql}
>>CREATE TABLE table1 (
     deviceInformationId int,
     channelsId string,
     props map<key:int,value:string>)
  STORED BY 'org.apache.carbondata.format'
>>insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
{code}
Format of the data to be read from CSV, with '$' as the level 1 delimiter and map keys terminated by '#':
{code:sql}
>>load data local inpath '/tmp/data.csv' into table1 options ('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 'COMPLEX_DELIMITER_FOR_KEY'='#')
20,channel2,2#user2$100#usercommon
30,channel3,3#user3$100#usercommon
40,channel4,4#user3$100#usercommon
>>select channelsId, props[100] from table1 where deviceInformationId > 10;
channel2, usercommon
channel3, usercommon
channel4, usercommon
>>select channelsId, props from table1 where props[2] == 'user2';
channel2, {2,'user2', 100, 'usercommon'}
{code}
The following cases need to be handled:
||Sub feature||Pending activity||Remarks||
|Basic Map type support|Develop|Create table DDL, load map data from CSV, select * from maptable|
|Map type lookup in projection and filter|Develop|Projection and filters need to be executed in Spark|
|NULL values, UDFs, Describe support|Develop| |
|Compaction support|Test + fix|As compaction works at the byte level, no changes are required. Test cases need to be added|
|Insert into table|Develop|Source table data containing Map data needs to be converted from the Spark data type to string, as Carbon takes a string as the input row|
|Support DDL for Map fields Dictionary Include and Dictionary Exclude|Develop|CarbonDictionaryDecoder also needs to handle Map fields|
|Support multilevel Map|Develop|Currently the DDL is validated to allow only 2 levels; remove this restriction|
|Support Map value to be a measure|Develop|Currently array and struct support only dimensions, which needs to change|
|Support Alter table to add and remove Map column|Develop|Implement the DDL; requires default value handling|
|Projection of Map lookup push down to Carbon|Develop|This is an optimization for when the Map holds a large number of values|
|Filter Map lookup push down to Carbon|Develop|This is an optimization for when the Map holds a large number of values|
|Update Map values|Develop|Update map value|

h4. Design suggestion:
Map can be represented and stored internally as Array<Struct<key,value>>, so the data has to be converted to the Map data type when it is handed to Spark. The schema will have a new column of map type, similar to Array.

  was:
{code:sql}
>>CREATE TABLE table1 (
     deviceInformationId int,
     channelsId string,
     props map<key:int,value:string>)
  STORED BY 'org.apache.carbondata.format'
>>insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
{code}
Format of the data to be read from CSV, with '$' as the level 1 delimiter and map keys terminated by '#':
{code:sql}
>>load data local inpath '/tmp/data.csv' into table1 options ('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 'COMPLEX_DELIMITER_FOR_KEY'='#')
20,channel2,2#user2$100#usercommon
30,channel3,3#user3$100#usercommon
40,channel4,4#user3$100#usercommon
>>select channelsId, props[100] from table1 where deviceInformationId > 10;
channel2, usercommon
channel3, usercommon
channel4, usercommon
>>select channelsId, props from table1 where props[2] == 'user2';
channel2, {2,'user2', 100, 'usercommon'}
{code}
The following cases need to be handled:
||Sub feature||Pending activity||Remarks||
|Basic Map type support|Develop|Create table DDL, load map data from CSV, select * from maptable|
|Map type lookup in projection and filter|Develop|Projection and filters need to be executed in Spark|
|NULL values, UDFs, Describe support|Develop| |
|Compaction support|Test + fix|As compaction works at the byte level, no changes are required. Test cases need to be added|
|Insert into table|Develop|Source table data containing Map data needs to be converted from the Spark data type to string, as Carbon takes a string as the input row|
|Support DDL for Map fields Dictionary Include and Dictionary Exclude|Develop|CarbonDictionaryDecoder also needs to handle Map fields|
|Support multilevel Map|Develop|Currently the DDL is validated to allow only 2 levels; remove this restriction|
|Support Map value to be a measure|Develop|Currently supports only dimensions|
|Support Alter table to add and remove Map column|Develop|Implement the DDL; requires default value handling|
|Projection of Map lookup push down to Carbon|Develop|This is an optimization for when the Map holds a large number of values|
|Filter Map lookup push down to Carbon|Develop|This is an optimization for when the Map holds a large number of values|
|Update Map values|Develop|Update map value|

h4. Design suggestion:
Map can be represented and stored internally as Array<Struct<key,value>>, so the data has to be converted to the Map data type when it is handed to Spark. The schema will have a new column of map type, similar to Array.

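
As a rough illustration of the CSV layout and the design suggestion described in this issue, here is a minimal Python sketch. It is not CarbonData code: the helper names parse_map_field, to_map and parse_row are hypothetical, and the only things taken from the issue are the delimiters ('$' between map entries, '#' between key and value) and the sample rows. Each map field is first parsed into an ordered list of (key, value) pairs, mirroring Array<Struct<key,value>>, and is turned into a real map only when a lookup is needed.

```python
from typing import Dict, List, Tuple

# Delimiters assumed from the load options quoted in the issue.
ENTRY_DELIMITER = "$"  # COMPLEX_DELIMITER_LEVEL_1: separates map entries
KEY_DELIMITER = "#"    # COMPLEX_DELIMITER_FOR_KEY: separates key from value

# Stand-in for Array<Struct<key,value>> with an int key and a string value.
Entries = List[Tuple[int, str]]


def parse_map_field(raw: str) -> Entries:
    """Parse one CSV map field into an ordered list of (key, value) pairs."""
    entries: Entries = []
    if not raw:
        return entries
    for entry in raw.split(ENTRY_DELIMITER):
        key, value = entry.split(KEY_DELIMITER, 1)
        entries.append((int(key), value))
    return entries


def to_map(entries: Entries) -> Dict[int, str]:
    """Convert the stored pair list into a map (a later duplicate key wins)."""
    return dict(entries)


def parse_row(line: str) -> Tuple[int, str, Entries]:
    """Split one CSV line into (deviceInformationId, channel, map entries)."""
    device_id, channel, props = line.split(",", 2)
    return int(device_id), channel, parse_map_field(props)


# Sample rows copied from the issue description.
csv_lines = [
    "20,channel2,2#user2$100#usercommon",
    "30,channel3,3#user3$100#usercommon",
    "40,channel4,4#user3$100#usercommon",
]
rows = [parse_row(line) for line in csv_lines]

# Projection lookup: props[100] for rows with deviceInformationId > 10.
lookup_100 = [
    (channel, to_map(props).get(100))
    for device_id, channel, props in rows
    if device_id > 10
]

# Filter lookup: rows where props[2] == 'user2'.
user2_rows = [
    (channel, props)
    for _, channel, props in rows
    if to_map(props).get(2) == "user2"
]
```

In this sketch a missing key yields None from the lookup, which is one possible reading of the "NULL values" item in the table; the issue itself does not specify that behaviour.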
> Support MAP type
> ----------------
>
>                 Key: CARBONDATA-45
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-45
>             Project: CarbonData
>          Issue Type: New Feature
>          Components: core, sql
>            Reporter: cen yuhai
>            Assignee: Venkata Ramana G
>             Fix For: 1.3.0
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)