[jira] [Updated] (CARBONDATA-1850) Support Standard Partitioning in CarbonData

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala updated CARBONDATA-1850:
----------------------------------------
    Attachment: Standard Partitioning Support in CarbonData (1).docx

> Support Standard Partitioning in CarbonData
> -------------------------------------------
>
>                 Key: CARBONDATA-1850
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1850
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Ravindra Pesala
>         Attachments: Standard Partitioning Support in CarbonData (1).docx
>
>
> Currently, CarbonData supports only its own custom partition implementation. Community users who want to use the partition features available in Spark and Hive with CarbonData therefore run into compatibility problems. CarbonData also has no built-in dynamic partitioning.
> To use Spark's partition feature, CarbonData should comply with the interfaces Spark provides for loading and reading data.
> Features supported by standard partitioning
> Creating table with partition
> {code}
> CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db_name.]table_name
>     [(col_name1 col_type1 [COMMENT col_comment1], ...)]
>     USING datasource
>     [OPTIONS (key1=val1, key2=val2, ...)]
>     [PARTITIONED BY (col_name1, col_name2, ...)]
>     [TBLPROPERTIES (key1=val1, key2=val2, ...)]
>     [AS select_statement];
> -- Or
> CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
>   [(col_name data_type , ...)]
>   [COMMENT table_comment]
>   [PARTITIONED BY (col_name data_type , ...)]
>   [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
>   [STORED BY file_format]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   [AS select_statement];
> {code}
> Load Data Using Static Partition
> {code}
>  LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.txt'
>       INTO TABLE partitioned_user
>       PARTITION (country = 'US', state = 'CA');
>
>  INSERT OVERWRITE TABLE partitioned_user
>       PARTITION (country = 'US', state = 'AL')
>       SELECT * FROM another_user au
>       WHERE au.country = 'US' AND au.state = 'AL';
> {code}
> Load Data Using Dynamic Partition
> {code}
>  LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.txt'
>       INTO TABLE partitioned_user
>       PARTITION (country, state);
>
>  INSERT OVERWRITE TABLE partitioned_user
>       PARTITION (country, state)
>       SELECT * FROM another_user;
> {code}
> Show Partitions
> {code}
> SHOW PARTITIONS [db_name.]table_name
> {code}
> Drop Partition
> {code}
> ALTER TABLE table_name DROP [IF EXISTS] (PARTITION part_spec, ...)
> {code}
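
The difference between the static and dynamic loads quoted above can be sketched in a few lines of Python. This is only an illustrative model, not CarbonData or Spark code: the helper names (`load_static`, `load_dynamic`, `show_partitions`) and the sample rows are hypothetical, and the `col=value` path naming simply borrows the Hive convention as a label, since the issue does not say how CarbonData will lay out partitions on disk.

```python
from collections import defaultdict

# Partition columns of the hypothetical partitioned_user table.
PARTITION_COLUMNS = ("country", "state")


def partition_path(spec):
    """Render a partition spec as a Hive-style label, e.g. country=US/state=CA."""
    return "/".join(f"{col}={spec[col]}" for col in PARTITION_COLUMNS)


def load_static(table, rows, spec):
    """Static partition: the statement names the partition, so every row goes there."""
    table[partition_path(spec)].extend(rows)


def load_dynamic(table, rows):
    """Dynamic partition: the partition is derived per row from its column values."""
    for row in rows:
        table[partition_path(row)].append(row)


def show_partitions(table):
    """Rough analogue of SHOW PARTITIONS: list the partitions that hold data."""
    return sorted(table)


table = defaultdict(list)

# Like: LOAD DATA ... PARTITION (country = 'US', state = 'CA')
load_static(table, [{"name": "ann"}], {"country": "US", "state": "CA"})

# Like: INSERT OVERWRITE ... PARTITION (country, state) SELECT * FROM another_user
load_dynamic(table, [
    {"name": "bob", "country": "US", "state": "AL"},
    {"name": "cy", "country": "IN", "state": "KA"},
    {"name": "dee", "country": "US", "state": "AL"},
])

partitions = show_partitions(table)
print(partitions)
```

In the static case the caller supplies the full partition spec once, while in the dynamic case a single load can create several partitions, one per distinct combination of partition column values found in the data.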



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)