GitHub user ravipesala opened a pull request:
https://github.com/apache/carbondata/pull/2596 [CARBONDATA-2806] Fix clean carbondata files when task has multiple carbondata files
Problem:
When a task has multiple blocks and blocklets, clean files does not clean up the carbondata files properly.
Solution:
The block paths read from the SegmentFile contain duplicates, so cleaning aborts partway through. The fix removes the duplicate block paths.
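As a rough illustration of the idea only (not the code in this pull request), removing duplicates from a list of block paths before deleting them could look like the sketch below. The class name BlockPathDedup, the method distinctBlockPaths, and the sample paths are all hypothetical and are not CarbonData APIs. The sketch assumes the paths are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BlockPathDedup {

    // Hypothetical helper: returns the block paths with duplicates removed,
    // keeping the order in which each path was first seen.
    static List<String> distinctBlockPaths(List<String> blockPaths) {
        Set<String> seen = new LinkedHashSet<>(blockPaths);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Made-up paths: the same block file is listed twice, as could happen
        // when one file holds several blocklets.
        List<String> fromSegmentFile = Arrays.asList(
            "/store/t1/part-0-0.carbondata",
            "/store/t1/part-0-1.carbondata",
            "/store/t1/part-0-0.carbondata");

        // Each file is now visited once, so a delete is never attempted
        // on a path that was already removed.
        for (String path : distinctBlockPaths(fromSegmentFile)) {
            System.out.println("delete " + path);
        }
    }
}
```

A LinkedHashSet is used here so the deletion order stays the same as the order in which paths were read. A plain HashSet would also remove duplicates if order does not matter.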
Be sure to complete the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ravipesala/incubator-carbondata flat-folder-delete-new
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2596.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2596
----
commit 6edadc82d488fc0e251a4c63b62e056aac3968ee
Author: ravipesala <ravi.pesala@...>
Date: 2018-08-01T16:05:59Z
Fix clean carbondata files when task has multiple carbondata files
----