[GitHub] carbondata pull request #1812: [CARBONDATA-2033]support user specified segme...

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1812

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2786/

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1812

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2840/

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1812

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4085/

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user gvramana commented on the issue:

https://github.com/apache/carbondata/pull/1812

@Xaprice Currently, Minor and Major compaction have fixed meanings: Minor is based on the frequency of segments and Major is based on size. So it is better not to change the current meaning.
Also, CARBON_INPUT_SEGMENTS affects only read queries; it does not affect any other DDL/DML.

So you can add a new compaction type, CUSTOM, and pass the required segments in the same command, so that it does not create any confusion.
The command can be:
ALTER TABLE tablename compact 'CUSTOM' '1, 2, 3, 4'
The documentation must also mention that it will not respect other features such as preserve_segments, size, etc. Invalid segments in the list are ignored, and CUSTOM compacted segments will not participate in minor compaction triggered later.
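
As a rough illustration of the "invalid segments in the list are ignored" rule, here is a minimal Python sketch. The helper name `resolve_custom_segments` and the representation of valid segments as a set of ID strings are assumptions made for this example; this is not CarbonData code.

```python
from typing import List, Set


def resolve_custom_segments(requested: str, valid_segment_ids: Set[str]) -> List[str]:
    """Parse a comma-separated segment list and keep only the valid IDs.

    Hypothetical helper: it models the proposal that invalid segments
    in the list are ignored instead of causing an error.
    """
    seen = set()
    result = []
    for raw in requested.split(","):
        seg_id = raw.strip()
        # Skip empty tokens, duplicates, and IDs that are not valid segments.
        if not seg_id or seg_id in seen or seg_id not in valid_segment_ids:
            continue
        seen.add(seg_id)
        result.append(seg_id)
    return result


if __name__ == "__main__":
    # Segment 3 is assumed to be invalid here, so it is dropped.
    print(resolve_custom_segments("1, 2, 3, 4", {"0", "1", "2", "4"}))
```

Whether duplicates should also be dropped silently is a design choice of this sketch, not something stated in the proposal.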

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1812

@Xaprice @chenliang613 @ravipesala @gvramana

I think the syntax of segment compaction should be similar to that of the other segment management operations.
Currently in CarbonData, we delete segments using the syntax:
```
DELETE FROM TABLE CarbonDatabase.CarbonTable WHERE SEGMENT.ID IN (0,5,8)
```
And
```
DELETE FROM TABLE CarbonDatabase.CarbonTable WHERE SEGMENT.STARTTIME BEFORE '2017-06-01 12:05:06'
```

So we can imitate the above syntax and get the following:
```
ALTER TABLE [db_name.]table_name COMPACT 'MINOR/MAJOR' WHERE SEGMENT.ID IN (0,5,8)
```
And
```
ALTER TABLE [db_name.]table_name COMPACT 'MINOR/MAJOR' WHERE SEGMENT.STARTTIME BEFORE '2017-06-01 12:05:06' AND SEGMENT.STARTTIME AFTER '2017-05-01 12:05:06'
```
This way we can support compacting segments by specifying IDs and dates.
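
To make the intended selection semantics concrete, here is a minimal Python sketch of how the `SEGMENT.ID IN (...)` and `SEGMENT.STARTTIME BEFORE/AFTER` conditions could pick segments. The `Segment` class, the `select_segments` function, and the strict (exclusive) comparison for BEFORE and AFTER are all assumptions for illustration, not the actual CarbonData implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Segment:
    # Hypothetical, simplified view of a segment's metadata.
    segment_id: str
    start_time: datetime


def select_segments(
    segments: Iterable[Segment],
    ids: Optional[Set[str]] = None,
    before: Optional[datetime] = None,
    after: Optional[datetime] = None,
) -> List[str]:
    """Return the IDs of segments matching every condition that is given."""
    selected = []
    for seg in segments:
        if ids is not None and seg.segment_id not in ids:
            continue
        # Assumption: BEFORE and AFTER are exclusive bounds.
        if before is not None and not seg.start_time < before:
            continue
        if after is not None and not seg.start_time > after:
            continue
        selected.append(seg.segment_id)
    return selected


if __name__ == "__main__":
    segs = [
        Segment("0", datetime(2017, 4, 20)),
        Segment("5", datetime(2017, 5, 15)),
        Segment("8", datetime(2017, 6, 10)),
    ]
    # Mirrors: STARTTIME BEFORE '2017-06-01 12:05:06' AND STARTTIME AFTER '2017-05-01 12:05:06'
    print(select_segments(
        segs,
        before=datetime(2017, 6, 1, 12, 5, 6),
        after=datetime(2017, 5, 1, 12, 5, 6),
    ))
```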

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1812

@gvramana
I think 'major' and 'minor' are enough to describe compaction; there is no need to add another one. Also, 'custom' is somewhat ambiguous.

As described in the README,
```
In Major compaction, multiple segments can be merged into one large segment. User will specify the compaction size until which segments can be merged.
```
The previous major compaction (the default, without a condition) is size based: CarbonData chooses the segments by size. For the new major compaction (with a condition), we specify the segments and let CarbonData merge them into one large segment. The two are no different, so we don't need another compaction type.

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user manishgupta88 commented on the issue:

https://github.com/apache/carbondata/pull/1812

I agree with @gvramana
1. We should not use the Major/Minor compaction types, as they have a specific meaning and both are controlled by the system, which decides whether a segment is valid for compaction.
2. We should not use carbon.input.segments.default.seg_compact to set the segments to be compacted.
3. We should introduce a new compaction type, 'CUSTOM', in the DDL as suggested above, because it acts like a forced compaction of the given segments: it does not check the size or frequency of segments. We can work on using the syntax below for custom compaction.

**ALTER TABLE [db_name.]table_name COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (0,5,8)**

Once a table is compacted using custom compaction, minor compaction no longer applies to the custom compacted segment. A custom compacted segment should participate only in major compaction, and only if it satisfies the major compaction size property.
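
The participation rule can be sketched roughly as follows in Python. Everything here is hypothetical: the `SegmentInfo` class, the `custom_compacted` flag, and the simple per-segment size threshold, which is only a stand-in for the real major compaction size property.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SegmentInfo:
    # Hypothetical, simplified segment metadata.
    segment_id: str
    size_mb: int
    custom_compacted: bool = False


def minor_candidates(segments: Iterable[SegmentInfo]) -> List[str]:
    """Custom compacted segments never take part in minor compaction."""
    return [s.segment_id for s in segments if not s.custom_compacted]


def major_candidates(segments: Iterable[SegmentInfo], max_size_mb: int) -> List[str]:
    """Any segment, custom compacted or not, qualifies if it passes the size check.

    Assumption: the size property is modeled as a per-segment upper bound.
    """
    return [s.segment_id for s in segments if s.size_mb <= max_size_mb]


if __name__ == "__main__":
    segs = [
        SegmentInfo("0.1", 800, custom_compacted=True),
        SegmentInfo("6", 100),
        SegmentInfo("7", 2000),
    ]
    print(minor_candidates(segs))
    print(major_candidates(segs, max_size_mb=1024))
```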

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1812

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4293/

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user bill1208 commented on the issue:

https://github.com/apache/carbondata/pull/1812

I agree with @gvramana

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1812

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4465/

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user Xaprice commented on the issue:

https://github.com/apache/carbondata/pull/1812

retest this please

---

qiuchenjian-2

[GitHub] carbondata issue #1812: [CARBONDATA-2033]Support user specified segments in ...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1812

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4500/

---
