
[jira] [Resolved] (CARBONDATA-3447) Index Server Performance Improvement

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor resolved CARBONDATA-3447.
--------------------------------------
Resolution: Fixed
Fix Version/s: 1.6.0

> Index Server Performance Improvement
> ------------------------------------
>
> Key: CARBONDATA-3447
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3447
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Reporter: kumar vishal
> Assignee: kumar vishal
> Priority: Major
> Fix For: 1.6.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Problem:
> When the number of splits is high, the index server is slower than the old flow (driver caching), because more data is transferred over the network, which becomes a performance bottleneck.
> Solution:
> # If the data to be transferred is small, send it over the network; when it grows, write it to a file and send only the file name. The main driver then reads the file and constructs the input splits.
> # Use Snappy to compress the data, so that less data is transferred over the network or written to file and I/O time does not impact performance.
> # In the main driver, pruning is done in multiple threads. The same has been added to the index executor, since the index executor now does the pruning.
> # In the block cache case, there is no need to send the blockletdetailinfo object, as it is large and can be reconstructed in the executor from the file footer.
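
The first two points of the solution (send small results inline, spill large ones to a file and send only its name, compressing in both cases) can be illustrated with a minimal sketch. This is not CarbonData code: the function names, the message layout, and the 1024-byte threshold are hypothetical, and zlib stands in for Snappy so that the example needs only the Python standard library.

```python
import os
import pickle
import tempfile
import zlib

# Hypothetical cutoff; the issue does not state the actual threshold.
INLINE_LIMIT_BYTES = 1024


def pack_splits(splits, spill_dir, inline_limit=INLINE_LIMIT_BYTES):
    """Executor side: compress pruned splits, spill to a file if too large."""
    # zlib is a stand-in for Snappy here (assumption made for portability).
    payload = zlib.compress(pickle.dumps(splits))
    if len(payload) <= inline_limit:
        # Small result: ship the compressed bytes directly.
        return {"kind": "inline", "data": payload}
    # Large result: write the bytes to a file and ship only its name.
    fd, path = tempfile.mkstemp(dir=spill_dir, suffix=".splits")
    with os.fdopen(fd, "wb") as out:
        out.write(payload)
    return {"kind": "file", "path": path}


def unpack_splits(message):
    """Driver side: rebuild the splits from inline bytes or the named file."""
    if message["kind"] == "inline":
        payload = message["data"]
    else:
        with open(message["path"], "rb") as src:
            payload = src.read()
        os.remove(message["path"])  # the spill file is single-use in this sketch
    return pickle.loads(zlib.decompress(payload))
```

The sketch assumes the spill directory is readable by both sides, as it would be on a shared file system; a real deployment would also need cleanup of spill files left behind by failed queries.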

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)