
[jira] [Updated] (CARBONDATA-844) Avoid to get useless splits

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala updated CARBONDATA-844:
---------------------------------------
Fix Version/s: 1.1.1 (was: 1.1.0)

> Avoid to get useless splits
> ---------------------------
>
> Key: CARBONDATA-844
> URL: https://issues.apache.org/jira/browse/CARBONDATA-844
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.1.0
> Reporter: Yadong Qi
> Assignee: Yadong Qi
> Fix For: 1.1.1
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In the current implementation of CarbonInputFormat.getDataBlocksOfSegment:
> 1. Get all of the carbondata splits in the segment directory.
> 2. Read the carbonindex and construct the B-tree.
> 3. Apply the filter and get the matching splits.
> I think we compute some useless splits, and getSplits is an expensive operation, so it would be better to call getSplits after filtering:
> 1. List the segment directory and keep only the carbonindex paths.
> 2. Read the carbonindex and construct the B-tree.
> 3. Apply the filter and get the matching blocks.
> 4. Get carbondata splits from the filtered blocks only.
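
The proposed ordering can be illustrated with a small toy model. This is only a sketch of the idea, not CarbonData code: BlockIndexEntry, list_index_files, filtered_splits and the read_index / block_matches / get_split callbacks are hypothetical names, the index is scanned as a plain list instead of a B-tree, and the min/max pruning rule is an assumed example of a filter. The point it shows is that the expensive split lookup runs only for blocks that survive the filter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class BlockIndexEntry:
    """Hypothetical stand-in for one block's entry in a carbonindex file."""
    data_file: str   # carbondata file that this entry describes
    min_value: int   # assumed min of the filtered column in the block
    max_value: int   # assumed max of the filtered column in the block


def list_index_files(segment_files: Iterable[str]) -> List[str]:
    # Step 1: list the segment directory and keep only carbonindex paths.
    return [p for p in segment_files if p.endswith(".carbonindex")]


def filtered_splits(
    segment_files: Iterable[str],
    read_index: Callable[[str], List[BlockIndexEntry]],
    block_matches: Callable[[BlockIndexEntry], bool],
    get_split: Callable[[str], str],
) -> List[str]:
    matching: List[BlockIndexEntry] = []
    for index_path in list_index_files(segment_files):
        # Step 2: read the index (a list scan here, not a real B-tree).
        for entry in read_index(index_path):
            # Step 3: apply the filter to block metadata only.
            if block_matches(entry):
                matching.append(entry)
    # Step 4: pay the split cost only for the blocks that matched.
    return [get_split(entry.data_file) for entry in matching]


# Toy segment: three data files described by one index file.
segment_files = [
    "part-0.carbondata", "part-1.carbondata", "part-2.carbondata",
    "0.carbonindex",
]
index_contents = {
    "0.carbonindex": [
        BlockIndexEntry("part-0.carbondata", 0, 9),
        BlockIndexEntry("part-1.carbondata", 10, 19),
        BlockIndexEntry("part-2.carbondata", 20, 29),
    ]
}

split_calls: List[str] = []  # records every (expensive) split lookup


def get_split(path: str) -> str:
    split_calls.append(path)
    return f"split:{path}"


# Assumed filter: column == 15, pruned by each block's min/max range.
splits = filtered_splits(
    segment_files,
    read_index=lambda path: index_contents[path],
    block_matches=lambda e: e.min_value <= 15 <= e.max_value,
    get_split=get_split,
)
print(splits)
print(len(split_calls), "split lookup(s) for", len(segment_files) - 1, "data files")
```

In this toy segment only one of the three data files reaches get_split; under the current ordering described in the issue, all three would have been turned into splits before the filter ran.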

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)