[jira] [Updated] (CARBONDATA-1377) Implement hive partition

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cen yuhai updated CARBONDATA-1377:
----------------------------------
Attachment: the future of hive integration.png

> Implement hive partition
> ------------------------
>
> Key: CARBONDATA-1377
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1377
> Project: CarbonData
> Issue Type: Sub-task
> Components: hive-integration
> Reporter: cen yuhai
> Assignee: cen yuhai
> Attachments: the future of hive integration.png
>
>
> The current partition implementation is like a database. If we want CarbonData to replace Parquet on a large scale, using CarbonData must work the same way as using Parquet/ORC.
> Hive users should be able to switch to CarbonData for all newly created partitions. Hive supports specifying the format at the partition level.
> Example:
> {code:sql}
> create table rtestpartition (col1 string, col2 int) partitioned by (col3 int) stored as parquet;
> insert into rtestpartition partition(col3=10) select "pqt", 1;
> insert into rtestpartition partition(col3=20) select "pqt", 1;
> insert into rtestpartition partition(col3=10) select "pqt", 1;
> insert into rtestpartition partition(col3=20) select "pqt", 1;
> {code}
> {noformat}
> Hive creates folders like:
> /db1/table1/col3=10/0001_file.pqt
> /db1/table1/col3=10/0002_file.pqt
> /db1/table1/col3=20/0001_file.pqt
> /db1/table1/col3=20/0002_file.pqt
> {noformat}
> Hive users can now switch new partitions to CarbonData; however, old partitions remain in Parquet and require migration scripts to move them to the CarbonData format.
> {code:sql}
> alter table rtestpartition set fileformat carbondata;
> insert into rtestpartition partition(col3=30) select "cdata", 1;
> insert into rtestpartition partition(col3=40) select "cdata", 1;
> {code}
> {noformat}
> Hive creates folders like:
> /db1/table1/col3=10/0001_file.pqt
> /db1/table1/col3=10/0002_file.pqt
> /db1/table1/col3=20/0001_file.pqt
> /db1/table1/col3=20/0002_file.pqt
> /db1/table1/col3=30/<carbondatafiles>
> /db1/table1/col3=40/<carbondatafiles>
> {noformat}
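>
> A minimal sketch of how the per-partition format could be checked after the statements above. {{describe formatted ... partition}} is standard Hive DDL and reports the SerDe and input/output format stored for a single partition. The {{carbondata}} file format used for col3=30 is the feature proposed in this issue and is assumed here; it is not something Hive provides on its own.
> {code:sql}
> -- list the partitions of the example table
> show partitions rtestpartition;
> -- partition written before the format change: should still report the Parquet SerDe and input/output formats
> describe formatted rtestpartition partition (col3=10);
> -- partition written after the format change: would report the CarbonData classes, assuming the proposal is implemented
> describe formatted rtestpartition partition (col3=30);
> {code}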

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)