GitHub user kunal642 opened a pull request:
https://github.com/apache/carbondata/pull/2548
[CARBONDATA-2778] Fixed bug when select after delete and cleanup is showing empty records
Problem: When a delete operation removes all the data of one complete block, the status of that block is marked for delete, and during the next delete operation the block is deleted along with its carbonIndex file. The problem arises from deleting the carbonIndex file: one carbonIndex file represents one task, so a single carbonIndex file can cover multiple blocks. Removing it also drops the index entries for blocks that still hold valid data, so a subsequent select shows empty records.
Solution: Do not delete the carbondata and carbonIndex files during the delete operation. Compaction will automatically take care of deleting the stale data and stale segments afterwards.
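The failure mode described above can be illustrated with a small, self-contained sketch. This is not CarbonData code: the class `IndexDeletionSketch`, its methods, and the file names are all hypothetical, and the assumption that readers skip blocks marked for delete is mine rather than something stated in this pull request. It only models the relationship "one index file per task, many blocks per index file".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not the CarbonData API.
class IndexDeletionSketch {

    // One index file (one task) describing two block files. Names are made up.
    static Map<String, List<String>> sampleSegment() {
        Map<String, List<String>> indexFiles = new HashMap<>();
        indexFiles.put("task0.carbonindex",
                Arrays.asList("part-0-0.carbondata", "part-0-1.carbondata"));
        return indexFiles;
    }

    // Old behaviour as described in the Problem: the fully deleted block is
    // removed together with the index file that references it, so every
    // other block listed in that index file becomes unreachable as well.
    static List<String> readableAfterPhysicalCleanup(
            Map<String, List<String>> indexFiles, String fullyDeletedBlock) {
        List<String> readable = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : indexFiles.entrySet()) {
            boolean indexRemoved = entry.getValue().contains(fullyDeletedBlock);
            if (!indexRemoved) {
                readable.addAll(entry.getValue());
            }
        }
        return readable;
    }

    // Behaviour after the fix: files stay on disk. Assumption: the reader
    // skips the block whose status is marked for delete.
    static List<String> readableWithFilesKept(
            Map<String, List<String>> indexFiles, String markedForDelete) {
        List<String> readable = new ArrayList<>();
        for (List<String> blocks : indexFiles.values()) {
            for (String block : blocks) {
                if (!block.equals(markedForDelete)) {
                    readable.add(block);
                }
            }
        }
        return readable;
    }

    public static void main(String[] args) {
        Map<String, List<String>> segment = sampleSegment();
        System.out.println(readableAfterPhysicalCleanup(segment, "part-0-0.carbondata"));
        System.out.println(readableWithFilesKept(segment, "part-0-0.carbondata"));
    }
}
```

In this model, removing the shared index file leaves no readable blocks at all, while keeping the files leaves `part-0-1.carbondata` readable.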
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kunal642/carbondata iud_fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2548.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2548
----
commit b5f9c3c857a93d7633580417f63e9c745cbcbbe6
Author: kunal642 <kunalkapoor642@...>
Date: 2018-07-24T10:42:54Z
Fixed bug when select after delete and cleanup is showing empty records
----
---