[ https://issues.apache.org/jira/browse/CARBONDATA-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
QiangCai updated CARBONDATA-2853:
---------------------------------
Description: During streaming ingestion, the streaming index file in a stream segment records a min/max meta index for each streaming file, so a filter query can use the min/max index to prune streaming files on the driver side and reduce the number of Spark tasks. Each streaming file also stores min/max values in the blocklet header, so a filter query can skip data while scanning the file. (was: Streaming index file in stream segment adds min/max meta index for each streaming file during streaming ingestion. So the filter query can use the file-level min/max index to prune the streaming files to reduce the number of the spark tasks.)
Summary: Add min/max index for streaming segment (was: Add file-level min/max index for streaming segment)
> Add min/max index for streaming segment
> ---------------------------------------
>
> Key: CARBONDATA-2853
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2853
> Project: CarbonData
> Issue Type: Sub-task
> Affects Versions: 1.5.0
> Reporter: QiangCai
> Assignee: QiangCai
> Priority: Major
> Fix For: 1.5.0
>
> Attachments: streaming_minmax_v2.pdf
>
> Time Spent: 8h 50m
> Remaining Estimate: 0h
>
> During streaming ingestion, the streaming index file in a stream segment records a min/max meta index for each streaming file, so a filter query can use the min/max index to prune streaming files on the driver side and reduce the number of Spark tasks. Each streaming file also stores min/max values in the blocklet header, so a filter query can skip data while scanning the file.
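
The two pruning levels described above can be illustrated with a minimal sketch. It is not CarbonData code: the names `FileMinMax`, `Blocklet`, `prune_files`, and `scan_file` are hypothetical, and it assumes a single integer column with an inclusive range filter. The first function stands in for driver-side file pruning using the index file, the second for skipping blocklets by their header min/max during a scan.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileMinMax:
    """Hypothetical index entry: min/max of one column for one streaming file."""
    path: str
    min_value: int
    max_value: int


@dataclass
class Blocklet:
    """Hypothetical blocklet: header min/max plus the column values it holds."""
    min_value: int
    max_value: int
    rows: List[int]


def may_contain(lo: int, hi: int, min_value: int, max_value: int) -> bool:
    """Return True if [min_value, max_value] overlaps the filter range [lo, hi]."""
    return min_value <= hi and max_value >= lo


def prune_files(index: List[FileMinMax], lo: int, hi: int) -> List[str]:
    """File-level pruning: keep only files whose min/max range can match the filter."""
    return [e.path for e in index if may_contain(lo, hi, e.min_value, e.max_value)]


def scan_file(blocklets: List[Blocklet], lo: int, hi: int) -> List[int]:
    """Blocklet-level pruning: skip blocklets whose header min/max rules them out."""
    matched: List[int] = []
    for blocklet in blocklets:
        if not may_contain(lo, hi, blocklet.min_value, blocklet.max_value):
            continue  # no value in this blocklet can satisfy the filter
        matched.extend(v for v in blocklet.rows if lo <= v <= hi)
    return matched
```

In this sketch, a file or blocklet is kept whenever its range overlaps the filter range, so pruning can only discard data that cannot match; rows in the surviving blocklets are still checked individually.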
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)