[jira] [Commented] (CARBONDATA-663) Major compaction is not working properly as per the configuration


Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831280#comment-15831280 ]

ravikiran commented on CARBONDATA-663:
--------------------------------------

Hi,
      Please find below how major compaction works.

1. Major compaction is a size-based compaction. Assume X is the configured size, for example X = 10 MB.
2. Major compaction is done as long as the segments are within this 10 MB size. The size considered here is not the CSV input file size; it is calculated from the segment's files, i.e. the carbondata files and index files of a segment.
3. If the size of a single segment is above 10 MB, that segment won't be considered for merging.
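
The rule described in the points above can be sketched roughly as follows. This only illustrates the rule as stated in this comment and is not CarbonData's actual code: the function name, the segment ids and the sizes are made up, and a segment whose size is exactly X is assumed to still qualify.

```python
def eligible_for_major_compaction(segment_sizes_mb, threshold_mb):
    """Return the ids of segments whose on-disk size is within the threshold.

    segment_sizes_mb maps a segment id to the size of its stored files
    (data files plus index files) in MB, not the size of the input CSV.
    """
    return [
        seg_id
        for seg_id, size_mb in segment_sizes_mb.items()
        if size_mb <= threshold_mb  # assumption: the boundary value is inclusive
    ]


# Hypothetical sizes: segment "1" exceeds X = 10 MB and is left out.
segments = {"0": 4.0, "1": 12.5, "2": 6.0}
print(eligible_for_major_compaction(segments, threshold_mb=10))
```

Note that a CSV file of a given size may occupy less space once it is stored as segment files, so a segment can still qualify even when the input file was larger than X.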

Based on the above description, please check whether the size is being calculated properly.
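
One way to check the size is to add up the data and index files of a segment folder. The sketch below assumes a local copy of the segment folder and assumes the file suffixes `.carbondata` and `.carbonindex`; the function name is hypothetical and this is not how CarbonData itself computes the size.

```python
from pathlib import Path

# Assumed suffixes for data and index files; adjust them to match
# what the segment folder actually contains.
SEGMENT_SUFFIXES = (".carbondata", ".carbonindex")


def segment_size_mb(segment_dir):
    """Sum the sizes of the data and index files directly inside segment_dir, in MB."""
    total_bytes = sum(
        path.stat().st_size
        for path in Path(segment_dir).iterdir()
        if path.is_file() and path.name.endswith(SEGMENT_SUFFIXES)
    )
    return total_bytes / (1024 * 1024)
```

The result can then be compared with the configured value of carbon.major.compaction.size.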




> Major compaction is not working properly as per the configuration
> -----------------------------------------------------------------
>
>                 Key: CARBONDATA-663
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-663
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>    Affects Versions: 1.0.0-incubating
>         Environment: Spark - 2.1
>            Reporter: Anurag Srivastava
>         Attachments: logs, sample_str_more1.csv, show_segment.png, show_segments_after_compaction.png
>
>
> I have set the property *carbon.major.compaction.size=3* and loaded data of size 5 MB. When I perform compaction, the segments get compacted, although compaction should not have been performed. Here are the queries:
> *create table :* create table test_major_compaction(id Int,name string)stored by 'carbondata';
> *Load Data :* Load two segments.
> LOAD DATA inpath 'hdfs://localhost:54310/sample_str_more1.csv' INTO table test_major_compaction options('DELIMITER'=',', 'FILEHEADER'='id, name','QUOTECHAR'='"');
> *Show segments :* show segments for table test_major_compaction;
> !https://issues.apache.org/jira/secure/attachment/12848287/show_segment.png!
> *Alter Table :* ALTER TABLE test_major_compaction COMPACT 'MAJOR';
> *Show segments :* Again see the segments :
> show segments for table test_major_compaction;
> !https://issues.apache.org/jira/secure/attachment/12848286/show_segments_after_compaction.png!
> I have attached all the data to this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)