[jira] [Resolved] (CARBONDATA-2895) [Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp files) scenario.

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kumar vishal resolved CARBONDATA-2895.
--------------------------------------
    Resolution: Fixed

> [Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp files) scenario.
> ---------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2895
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2895
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Ajantha Bhat
>            Assignee: Ajantha Bhat
>            Priority: Major
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Problem: Query result mismatch with Batch-sort in the save-to-disk (sort temp files) scenario.
> Scenario:
> a) Configure batch sort, but set a batch size larger than UnsafeMemoryManager.INSTANCE.getUsableMemory().
> b) Load data that is larger than the batch size. Observe that UnsafeMemoryManager saves to disk because it cannot process one batch.
> c) The load therefore happens in 2 batches.
> d) Query the results. The result contains more data rows than expected.
> Root cause:
> createSortDataRows() is called for each batch.
> Files saved to disk while sorting the previous batch were also considered for the current batch.
> Solution:
> Files saved to disk while sorting the previous batch should not be considered for the current batch.
> Hence, use the batch ID as the rangeID field of the sort temp files,
> so that getFilesToMergeSort() selects only the files of the current batch.
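
The idea behind the fix can be illustrated with a minimal, self-contained sketch. It is not the CarbonData implementation: the class name, the `tempFileName` and `filesToMergeSort` helpers, and the `<table>_<rangeId>_<sequence>.sorttemp` naming scheme are all hypothetical, chosen only to show how tagging each temp file with the batch ID keeps files from an earlier batch out of the current merge.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSortTempFiles {

    // Hypothetical naming scheme (an assumption, not CarbonData's real one):
    // <table>_<rangeId>_<sequence>.sorttemp, where rangeId carries the batch ID.
    static String tempFileName(String table, int rangeId, int sequence) {
        return table + "_" + rangeId + "_" + sequence + ".sorttemp";
    }

    // Select only the temp files whose rangeId matches the given batch.
    // The trailing underscore in the prefix keeps batch 1 from matching batch 10.
    static List<String> filesToMergeSort(List<String> allFiles, String table, int rangeId) {
        String prefix = table + "_" + rangeId + "_";
        List<String> selected = new ArrayList<>();
        for (String name : allFiles) {
            if (name.startsWith(prefix) && name.endsWith(".sorttemp")) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> onDisk = new ArrayList<>();
        // Batch 0 spilled two files, batch 1 spilled one file.
        onDisk.add(tempFileName("t", 0, 0));
        onDisk.add(tempFileName("t", 0, 1));
        onDisk.add(tempFileName("t", 1, 0));

        // Merging batch 1 sees only its own file, so rows from batch 0
        // are not merged (and written) a second time.
        System.out.println(filesToMergeSort(onDisk, "t", 1));
    }
}
```

If every batch shared the same range ID, the selection for the second batch would also pick up the first batch's files, which matches the duplicate rows described in the report.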


