[ https://issues.apache.org/jira/browse/CARBONDATA-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xuchuanyin updated CARBONDATA-1839:
-----------------------------------
    Description:
CarbonData provides an option to optimize the data load process by compressing the intermediate sort temp files.
The option is `carbon.is.sort.temp.file.compression.enabled` and its default value is `false`. In disk-constrained scenarios, users can turn on this feature by setting the option to `true`; it then compresses the file content before writing it to disk.
However, I have found bugs in the related code, and the data load failed after turning on this feature.
Error messages are shown below:
```
17/11/29 18:04:12 ERROR SortDataRows: SortDataRowPool:test1
java.lang.ClassCastException: [B cannot be cast to [Ljava.lang.Integer;
    at org.apache.carbondata.core.util.NonDictionaryUtil.getDimension(NonDictionaryUtil.java:93)
    at org.apache.carbondata.processing.sort.sortdata.UnCompressedTempSortFileWriter.writeDataOutputStream(UnCompressedTempSortFileWriter.java:52)
    at org.apache.carbondata.processing.sort.sortdata.CompressedTempSortFileWriter.writeSortTempFile(CompressedTempSortFileWriter.java:65)
    at org.apache.carbondata.processing.sort.sortdata.SortTempFileChunkWriter.writeSortTempFile(SortTempFileChunkWriter.java:72)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeSortTempFile(SortDataRows.java:245)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataTofile(SortDataRows.java:232)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.access$300(SortDataRows.java:45)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter.run(SortDataRows.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
```
```
17/11/29 18:04:13 ERROR SortDataRows: SafeParallelSorterPool:test1 exception occurred while trying to acquire a semaphore lock: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
17/11/29 18:04:13 ERROR ParallelReadMergeSorterImpl: SafeParallelSorterPool:test1
org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException:
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:173)
    at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:169)
    ... 4 more
```
```
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException:
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:173)
    at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:227)
    ... 3 more
Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:169)
    ... 4 more
```

  was:
Carbondata provide an option to optimize data load process by compressing the intermediate sort temp files.
The option is `carbon.is.sort.temp.file.compression.enabled` and its default value is `false`. In some disk tense scenario, user can turn on this feature by setting the option `true`, it will compress the file content before write it to disk.
How ever I have found bugs in the related code and the data load is failed after turn on this feature.
Error messages are shown as below:
```
17/11/29 18:04:12 ERROR SortDataRows: SortDataRowPool:test1
java.lang.ClassCastException: [B cannot be cast to [Ljava.lang.Integer;
    at org.apache.carbondata.core.util.NonDictionaryUtil.getDimension(NonDictionaryUtil.java:93)
    at org.apache.carbondata.processing.sort.sortdata.UnCompressedTempSortFileWriter.writeDataOutputStream(UnCompressedTempSortFileWriter.java:52)
    at org.apache.carbondata.processing.sort.sortdata.CompressedTempSortFileWriter.writeSortTempFile(CompressedTempSortFileWriter.java:65)
    at org.apache.carbondata.processing.sort.sortdata.SortTempFileChunkWriter.writeSortTempFile(SortTempFileChunkWriter.java:72)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeSortTempFile(SortDataRows.java:245)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataTofile(SortDataRows.java:232)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.access$300(SortDataRows.java:45)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter.run(SortDataRows.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
```
```
17/11/29 18:04:13 ERROR SortDataRows: SafeParallelSorterPool:test1 exception occurred while trying to acquire a semaphore lock: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
17/11/29 18:04:13 ERROR ParallelReadMergeSorterImpl: SafeParallelSorterPool:test1
org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException:
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:173)
    at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:169)
    ... 4 more
```
```
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException:
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:173)
    at org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:227)
    ... 3 more
Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.carbondata.processing.sort.sortdata.SortDataRows$DataSorterAndWriter@3d413b40 rejected from java.util.concurrent.ThreadPoolExecutor@cb56011[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at org.apache.carbondata.processing.sort.sortdata.SortDataRows.addRowBatch(SortDataRows.java:169)
    ... 4 more
```

> Data load failed when using compressed sort temp file
> -----------------------------------------------------
>
> Key: CARBONDATA-1839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1839
> Project: CarbonData
> Issue Type: Bug
> Reporter: xuchuanyin
> Assignee: xuchuanyin
>
> CarbonData provides an option to optimize the data load process by compressing the intermediate sort temp files.
> The option is `carbon.is.sort.temp.file.compression.enabled` and its default value is `false`. In disk-constrained scenarios, users can turn on this feature by setting the option to `true`; it then compresses the file content before writing it to disk.
> However, I have found bugs in the related code, and the data load failed after turning on this feature.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
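
A note on reading the first error: `[B` is the JVM's name for `byte[]` and `[Ljava.lang.Integer;` is its name for `Integer[]`, so the trace reports that a row field holding a `byte[]` was cast to `Integer[]` inside `NonDictionaryUtil.getDimension`. The sketch below reproduces the same kind of exception in isolation. The row layout, the class name `CastSketch`, and the accessor body are hypothetical, written only for illustration; this is not CarbonData's actual code, and it does not claim to show the real cause of the bug.

```java
class CastSketch {
    // Hypothetical accessor: assumes slot 0 of the row always holds an Integer[].
    // Illustration only; this is NOT CarbonData's NonDictionaryUtil.getDimension.
    static Integer getDimension(int index, Object[] row) {
        Integer[] dimensions = (Integer[]) row[0]; // throws if slot 0 is a byte[]
        return dimensions[index];
    }

    public static void main(String[] args) {
        // A row laid out the way the accessor expects.
        Object[] goodRow = { new Integer[] {7, 8} };
        System.out.println("dimension 1 = " + getDimension(1, goodRow));

        // A row whose slot 0 holds raw bytes instead of an Integer[].
        Object[] badRow = { new byte[] {1, 2, 3} };
        try {
            getDimension(0, badRow);
        } catch (ClassCastException e) {
            // The exact message wording depends on the JVM version.
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

The cast compiles because `row[0]` is statically just an `Object`; the mismatch only surfaces at run time, which would be consistent with the failure appearing only once the compression option is switched on.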