[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

GitHub user lionelcao opened a pull request:

https://github.com/apache/carbondata/pull/1192

[CARBONDATA-940] alter table add/split partition for spark 2.1

add/split partition function

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lionelcao/carbondata carbon_910_06

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1192.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1192

----
commit afd4428ae1e66879de33b775c5051fe2745b2379
Author: lionelcao <[hidden email]>
Date: 2017-07-19T06:36:18Z

alter table add/split partition for spark 2.1

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1192

Can one of the admins verify this patch?

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1192

Can one of the admins verify this patch?

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3177/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/582/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3178/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/583/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3179/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/584/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1192

retest this please

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3181/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/586/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/587/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3182/

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user QiangCai commented on the issue:

https://github.com/apache/carbondata/pull/1192

Please add a description explaining your design and modifications.

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129513477

--- Diff: integration/spark-common-test/src/test/resources/partition_data.csv ---
@@ -0,0 +1,27 @@
+id,vin,logdate,phonenumber,country,area,salary
--- End diff --

Can you try to reuse the existing CSV files, or generate the data instead?
I don't recommend adding so many CSV files to the repo.

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129514137

--- Diff: conf/carbon.properties.template ---
@@ -42,6 +42,9 @@ carbon.enableXXHash=true
#carbon.max.level.cache.size=-1
#enable prefetch of data during merge sort while reading data from sort temp files in data loading
#carbon.merge.sort.prefetch=true
+######## Alter Partition Configuration ########
+#Number of cores to be used while alter partition
+carbon.number.of.cores.while.altPartition=2
--- End diff --

1. Please check whether the parameter "carbon.number.of.cores.while.altPartition=2" is necessary.
2. If it is, I suggest using the full name directly: carbon.number.of.cores.while.alterPartition

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

Github user lionelcao commented on the issue:

https://github.com/apache/carbondata/pull/1192

# Feature Description
This feature adds support for the ADD and SPLIT partition operations on CarbonData tables.
# Scope
Range-partitioned and list-partitioned tables are supported.
# Syntax Example
Suppose a CarbonData table is list-partitioned on the COUNTRY column.
The current partition definition is ('China', 'US', 'UK', 'India', 'Canada, Japan, South Korea, North Korea')
### add a partition
ALTER TABLE t1 ADD PARTITION('Russia')
### split a partition
ALTER TABLE t1 SPLIT PARTITION(5) INTO ('Canada', 'Japan', '(South Korea, North Korea)')
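
The effect of the two statements above on the list partition definition can be sketched in Python. This is only an illustrative model, not CarbonData code: the function names `add_partition` and `split_partition` are hypothetical, each partition is modeled as a tuple of values, partition ids are assumed to be 1-based positions in the definition (as the SPLIT PARTITION(5) example suggests), and the rule that the new groups must cover exactly the values of the target partition is an assumption.

```python
def add_partition(list_info, new_group):
    """Return a new definition with new_group appended as one partition."""
    existing = {value for group in list_info for value in group}
    duplicates = existing.intersection(new_group)
    if duplicates:
        raise ValueError(f"values already partitioned: {sorted(duplicates)}")
    return list_info + [tuple(new_group)]


def split_partition(list_info, partition_id, new_groups):
    """Replace the partition at 1-based partition_id with new_groups."""
    if not 1 <= partition_id <= len(list_info):
        raise ValueError(f"no such partition: {partition_id}")
    index = partition_id - 1
    target = list_info[index]
    flattened = [value for group in new_groups for value in group]
    # Assumption: a split needs at least two groups that together hold
    # exactly the values of the partition being split.
    if len(new_groups) < 2 or sorted(flattened) != sorted(target):
        raise ValueError("new groups must cover exactly the target partition")
    return list_info[:index] + [tuple(g) for g in new_groups] + list_info[index + 1:]


# The definition from the example: four single-value partitions and one group.
t1 = [("China",), ("US",), ("UK",), ("India",),
      ("Canada", "Japan", "South Korea", "North Korea")]

after_add = add_partition(t1, ("Russia",))
after_split = split_partition(
    t1, 5, [("Canada",), ("Japan",), ("South Korea", "North Korea")])
```

Both functions return a new list and leave `t1` untouched, so a failed validation cannot leave a half-modified definition behind.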

# Modification
### parser
added a new parser to support the alter table add/split partition statements
### validate new RangeInfo and ListInfo
ensure the new rangeInfo is still in the correct order after adding/splitting
ensure a newly added listInfo value does not already exist
ensure the target listInfo can be split
### read target partition data
added a function to read the data of one partition in one segment
### use ALTER_PARTITION as key of temp directories
added isAltPartitionFlow to the getTempStoreLocationKey function
### repartition and write data
decode the partition column and repartition the data
write the result to new data blocks
### refresh cache
drop the old cache
### multi-threaded operation on different segments
support altering multiple segments in parallel
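
The rangeInfo ordering rule listed above can be illustrated with a small Python sketch. It is not the PR's implementation: `validate_range_info` and `add_range_partition` are hypothetical names, the bounds are plain integers for simplicity (a real check would compare values according to the partition column's data type), and appending the new bound at the end of the definition is an assumption.

```python
def validate_range_info(range_info):
    """Raise ValueError unless the range bounds are strictly ascending."""
    for previous, current in zip(range_info, range_info[1:]):
        if not previous < current:
            raise ValueError(
                f"range bounds out of order: {previous!r} then {current!r}")


def add_range_partition(range_info, new_bound):
    """Return a new definition with new_bound appended, after validation."""
    # Assumption: ADD PARTITION appends the new bound at the end.
    candidate = list(range_info) + [new_bound]
    validate_range_info(candidate)
    return candidate


bounds = [2014, 2015, 2016]
extended = add_range_partition(bounds, 2017)
```

Validating the candidate list before returning it means a rejected bound never reaches the stored definition.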

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129514953

--- Diff: integration/spark-common-test/src/test/resources/partition_data.csv ---
@@ -0,0 +1,27 @@
+id,vin,logdate,phonenumber,country,area,salary
--- End diff --

Hi @chenliang613, this CSV file already exists for the partition example and test cases. It is simple and makes the partition concept easy to understand. This PR only adds two columns to it.

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129515426

--- Diff: conf/carbon.properties.template ---
@@ -42,6 +42,9 @@ carbon.enableXXHash=true
#carbon.max.level.cache.size=-1
#enable prefetch of data during merge sort while reading data from sort temp files in data loading
#carbon.merge.sort.prefetch=true
+######## Alter Partition Configuration ########
+#Number of cores to be used while alter partition
+carbon.number.of.cores.while.altPartition=2
--- End diff --

Yes, it is used when altering multiple segments in parallel. This configuration lets users set the number of threads according to their hardware.
Sure, I will make the change.
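
As a rough illustration of how such a setting could drive per-segment parallelism, here is a minimal Python sketch. It is not CarbonData code: the helper names `parse_properties`, `alter_partition_cores` and `alter_segments` are hypothetical, the property name is the one suggested in the review, the default of 2 comes from the template in the diff, and falling back to the default on an invalid value is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Property name as suggested in the review; default taken from the template.
PROPERTY = "carbon.number.of.cores.while.alterPartition"
DEFAULT_CORES = 2


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def alter_partition_cores(props):
    """Return the configured thread count, or the default if unusable."""
    try:
        cores = int(props.get(PROPERTY, DEFAULT_CORES))
    except ValueError:
        return DEFAULT_CORES
    return cores if cores > 0 else DEFAULT_CORES


def alter_segments(segment_ids, repartition_segment, cores):
    """Run repartition_segment on every segment with at most `cores` threads."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # map() yields results in the order of segment_ids.
        return list(pool.map(repartition_segment, segment_ids))


props = parse_properties(
    "#Number of cores to be used while alter partition\n"
    "carbon.number.of.cores.while.alterPartition=4\n")
cores = alter_partition_cores(props)
results = alter_segments(["0", "1", "2"], lambda s: f"segment {s} done", cores)
```

Here the lambda stands in for the real per-segment work (read the target partition, repartition, write new blocks).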
