[ https://issues.apache.org/jira/browse/CARBONDATA-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506741#comment-15506741 ]

ASF GitHub Bot commented on CARBONDATA-260:
-------------------------------------------

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/180

[CARBONDATA-260] Equal or lesser value of MAXCOLUMNS option than column count in CSV header results into array index of bound exception

Problem: A MAXCOLUMNS option value equal to or less than the column count in the CSV header results in an array index out of bounds exception.

Analysis: If the column count in the CSV header is greater than or equal to the MAXCOLUMNS option value, the Univocity CSV parser throws an array index out of bounds exception. While parsing a row, the parser stores each parsed value in an array and increments the index; after incrementing, it performs one more operation using the incremented index, which runs past the end of the array. The relevant code snippet from the CSV parser is shown below.

    public void valueParsed() {
        this.parsedValues[column++] = appender.getAndReset();
        this.appender = appenders[column];
    }

e.g. In the above code, if the arrays have length 7, the valid indices are 0-6. When column is 6, the first line stores the value at index 6 and increments column to 7, so the second line reads appenders[7] and throws ArrayIndexOutOfBoundsException.

Fix: Whenever the column count in the CSV header is equal to or greater than the MAXCOLUMNS option value (or its default value), increment it by 1.
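The off-by-one described in the analysis can be reproduced in isolation. The following is only a minimal sketch: MaxColumnsSketch, overflows and adjustedMaxColumns are hypothetical names introduced for illustration, the class merely imitates the shape of the quoted valueParsed() method rather than using the Univocity classes, and adjustedMaxColumns encodes one possible reading of the fix (size the limit to the header column count plus one), which is an assumption and not necessarily what the pull request does.

```java
/** Hypothetical stand-in for the parser's output buffer; NOT the Univocity class. */
final class MaxColumnsSketch {
    private final String[] parsedValues;
    private final StringBuilder[] appenders;
    private StringBuilder appender;
    private int column = 0;

    /** maxColumns must be at least 1; both arrays get exactly maxColumns slots. */
    MaxColumnsSketch(int maxColumns) {
        parsedValues = new String[maxColumns];
        appenders = new StringBuilder[maxColumns];
        for (int i = 0; i < maxColumns; i++) {
            appenders[i] = new StringBuilder();
        }
        appender = appenders[0];
    }

    void append(String value) {
        appender.append(value);
    }

    /** Same shape as the quoted snippet: store, increment, then read the next slot. */
    void valueParsed() {
        parsedValues[column++] = appender.toString();
        appender.setLength(0);
        appender = appenders[column]; // throws once column == appenders.length
    }

    /** True if a row of columnCount values overflows a buffer sized for maxColumns. */
    static boolean overflows(int columnCount, int maxColumns) {
        MaxColumnsSketch out = new MaxColumnsSketch(maxColumns);
        try {
            for (int i = 0; i < columnCount; i++) {
                out.append("v" + i);
                out.valueParsed();
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    /** Assumed reading of the fix: keep one spare slot beyond the header width. */
    static int adjustedMaxColumns(int headerColumnCount, int maxColumns) {
        return headerColumnCount >= maxColumns ? headerColumnCount + 1 : maxColumns;
    }

    public static void main(String[] args) {
        int header = 7;
        int maxColumns = 7;
        System.out.println("before fix: " + overflows(header, maxColumns));
        System.out.println("after fix:  "
                + overflows(header, adjustedMaxColumns(header, maxColumns)));
    }
}
```

With a header of 7 columns and a limit of 7, the last call to valueParsed() reads one slot past the end of the appenders array; with one spare slot the same row parses without an exception.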
Impact: Data load flow

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/incubator-carbondata maxcolumns_array_indexOfBound

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/180.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #180

----
commit 3f32424e55615c8e45470d5169b817f9f703dc3e
Author: manishgupta88 <[hidden email]>
Date:   2016-09-20T14:21:33Z

    Problem: A MAXCOLUMNS option value equal to or less than the column count in the CSV header results in an array index out of bounds exception.

    Analysis: If the column count in the CSV header is greater than or equal to the MAXCOLUMNS option value, the Univocity CSV parser throws an array index out of bounds exception. While parsing a row, the parser stores each parsed value in an array and increments the index; after incrementing, it performs one more operation using the incremented index, which runs past the end of the array. The relevant code snippet from the CSV parser is shown below.

        public void valueParsed() {
            this.parsedValues[column++] = appender.getAndReset();
            this.appender = appenders[column];
        }

    Fix: Whenever the column count in the CSV header is equal to or greater than the MAXCOLUMNS option value (or its default value), increment it by 1.

    Impact: Data load flow

----

> Equal or lesser value of MAXCOLUMNS option than column count in CSV header results into array index of bound exception
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-260
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-260
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Manish Gupta
>            Assignee: Manish Gupta
>
> If the column count in the CSV header is greater than or equal to the MAXCOLUMNS option value, the Univocity CSV parser throws an array index out of bounds exception.
> This is because, while parsing a row, the parser stores each parsed value in an array and increments the index; after incrementing, it performs one more operation using the incremented index, which leads to an array index out of bounds exception.
> java.lang.OutOfMemoryError: Java heap space
> 	at com.univocity.parsers.common.ParserOutput.<init>(ParserOutput.java:86)
> 	at com.univocity.parsers.common.AbstractParser.<init>(AbstractParser.java:66)
> 	at com.univocity.parsers.csv.CsvParser.<init>(CsvParser.java:50)
> 	at org.apache.carbondata.processing.csvreaderstep.UnivocityCsvParser.initialize(UnivocityCsvParser.java:114)
> 	at org.apache.carbondata.processing.csvreaderstep.CsvInput.doProcessUnivocity(CsvInput.java:427)
> 	at org.apache.carbondata.processing.csvreaderstep.CsvInput.access$100(CsvInput.java:60)
> 	at org.apache.carbondata.processing.csvreaderstep.CsvInput$1.call(CsvInput.java:389)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)