[ https://issues.apache.org/jira/browse/CARBONDATA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
kumar vishal resolved CARBONDATA-2895.
--------------------------------------
Resolution: Fixed
> [Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp files) scenario.
> ---------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-2895
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2895
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ajantha Bhat
> Assignee: Ajantha Bhat
> Priority: Major
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> problem: Query results mismatch with Batch-sort in the save to disk (sort temp files) scenario.
> scenario:
> a) Configure batch sort, but set a batch size larger than UnsafeMemoryManager.INSTANCE.getUsableMemory().
> b) Load data that is larger than the batch size. Observe that the unsafe memory manager saves to disk because it cannot process one batch in memory.
> c) The load therefore happens in 2 batches.
> d) Query the table. The result contains more rows than expected.
> root cause:
> createSortDataRows() is called for each batch.
> Files saved to disk while sorting the previous batch were also considered for the current batch.
> solution:
> Files saved to disk while sorting the previous batch should not be considered for the current batch.
> Hence, use the batchID as the rangeID field of the sort temp files,
> so that getFilesToMergeSort() selects only the files of the current batch.
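
The idea behind the fix can be sketched roughly as follows. This is a minimal illustration, not the actual CarbonData code: the class BatchSortTempFiles, the methods tempFileName() and filesToMergeSort(), and the file naming scheme <table>_<rangeId>_<sequence>.sorttemp are all hypothetical, and the table name is assumed to contain no underscore. It only shows how tagging each spilled file with the batch ID as its range ID lets the merge step ignore files left over from an earlier batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the CarbonData code base.
class BatchSortTempFiles {

    // Assumed naming scheme: <table>_<rangeId>_<sequence>.sorttemp,
    // where rangeId is set to the ID of the batch that spilled the file.
    static String tempFileName(String table, int rangeId, int sequence) {
        return table + "_" + rangeId + "_" + sequence + ".sorttemp";
    }

    // Select only the spilled files whose range ID equals the current batch ID.
    // Files written while sorting other batches are skipped.
    static List<String> filesToMergeSort(List<String> allFiles, String table, int batchId) {
        String prefix = table + "_" + batchId + "_";
        List<String> selected = new ArrayList<>();
        for (String name : allFiles) {
            if (name.startsWith(prefix) && name.endsWith(".sorttemp")) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

With two files spilled by batch 0 and one by batch 1 in the same temp folder, filesToMergeSort(files, "t1", 1) would return only the batch 1 file, so rows from batch 0 are not merged a second time.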