ajantha-bhat opened a new pull request #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-606090142 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2593/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-606095130 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/885/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607130786 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2602/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607139937 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/894/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607249603 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/899/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607253777 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2607/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607391335 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/904/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607396547 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2612/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607583473 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607616961 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2616/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [WIP] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-607619932 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/907/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#discussion_r403476766 ########## File path: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ########## @@ -240,141 +238,137 @@ public boolean getIsColumnDictionary() { @Override public void writeByteArray(Object input, DataOutputStream dataOutputStream, - BadRecordLogHolder logHolder) throws IOException { - String parsedValue = - input == null ? null : DataTypeUtil.parseValue(input.toString(), carbonDimension); - String message = logHolder.getColumnMessageMap().get(carbonDimension.getColName()); - if (this.isDictionary) { - Integer surrogateKey; - if (null == parsedValue) { - surrogateKey = CarbonCommonConstants.MEMBER_DEFAULT_VAL_SURROGATE_KEY; - if (null == message) { - message = CarbonDataProcessorUtil - .prepareFailureReason(carbonDimension.getColName(), carbonDimension.getDataType()); - logHolder.getColumnMessageMap().put(carbonDimension.getColName(), message); - logHolder.setReason(message); - } - } else { - if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) { - surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input); - } else { - surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue); - } - if (surrogateKey == CarbonCommonConstants.INVALID_SURROGATE_KEY) { - surrogateKey = CarbonCommonConstants.MEMBER_DEFAULT_VAL_SURROGATE_KEY; - message = CarbonDataProcessorUtil - .prepareFailureReason(carbonDimension.getColName(), carbonDimension.getDataType()); - logHolder.getColumnMessageMap().put(carbonDimension.getColName(), message); - logHolder.setReason(message); - } - } - dataOutputStream.writeInt(surrogateKey); - } else { - // Transform into ByteArray for No Dictionary. - // TODO have to refactor and place all the cases present in NonDictionaryFieldConverterImpl - if (null == parsedValue && this.carbonDimension.getDataType() != DataTypes.STRING) { - updateNullValue(dataOutputStream, logHolder); - } else if (null == parsedValue || parsedValue.equals(nullFormat)) { + BadRecordLogHolder logHolder, Boolean isWithoutConverter) throws IOException { + String parsedValue = null; + if (null == input || + (this.carbonDimension.getDataType() == DataTypes.STRING && input.equals(nullFormat))) { + updateNullValue(dataOutputStream, logHolder); + return; + } + if (!isWithoutConverter) { + parsedValue = DataTypeUtil.parseValue(input.toString(), carbonDimension); + if (null == parsedValue || (this.carbonDimension.getDataType() == DataTypes.STRING + && parsedValue.equals(nullFormat))) { updateNullValue(dataOutputStream, logHolder); - } else { - String dateFormat = null; - if (this.carbonDimension.getDataType() == DataTypes.DATE) { - dateFormat = carbonDimension.getDateFormat(); - } else if (this.carbonDimension.getDataType() == DataTypes.TIMESTAMP) { - dateFormat = carbonDimension.getTimestampFormat(); - } - try { - if (!this.carbonDimension.getUseActualData()) { - byte[] value = null; - if (isDirectDictionary) { - int surrogateKey; - if (!(input instanceof Long) && !(input instanceof Integer)) { - SimpleDateFormat parser = new SimpleDateFormat(getDateFormat(carbonDimension)); - parser.parse(parsedValue); - } - // If the input is a long value then this means that logical type was provided by - // the user using AvroCarbonWriter. In this case directly generate surrogate key - // using dictionaryGenerator. - if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) { - surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input); - } else if (dictionaryGenerator instanceof DirectDictionary - && input instanceof Integer) { - // In case of file format, for complex type date or time type, input data comes as a - // Integer object, so just assign the surrogate key with the input object value - surrogateKey = (int) input; - } else { - surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue); - } - if (surrogateKey == CarbonCommonConstants.INVALID_SURROGATE_KEY) { - value = new byte[0]; + return; + } + } + // Transform into ByteArray for No Dictionary. + String dateFormat = null; + if (this.carbonDimension.getDataType() == DataTypes.DATE) { + dateFormat = carbonDimension.getDateFormat(); + } else if (this.carbonDimension.getDataType() == DataTypes.TIMESTAMP) { + dateFormat = carbonDimension.getTimestampFormat(); + } + try { + if (!this.carbonDimension.getUseActualData()) { + byte[] value; + if (isDirectDictionary) { + int surrogateKey; + // If the input is a long value then this means that logical type was provided by + // the user using AvroCarbonWriter. In this case directly generate surrogate key + // using dictionaryGenerator. + if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) { + surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input); + } else if (dictionaryGenerator instanceof DirectDictionary + && input instanceof Integer) { Review comment: move `&&` to previous line ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#discussion_r403476980 ########## File path: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ########## @@ -240,141 +238,137 @@ public boolean getIsColumnDictionary() { @Override public void writeByteArray(Object input, DataOutputStream dataOutputStream, Review comment: Can you break this function into smaller functions to make it more easy to understand ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#discussion_r403847723 ########## File path: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ########## @@ -240,141 +238,137 @@ public boolean getIsColumnDictionary() { @Override public void writeByteArray(Object input, DataOutputStream dataOutputStream, Review comment: This part of code is messed up. Need to check and refactor `useActualData` first (which needs some analysis). Without that these flows will be like this ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#discussion_r403847723 ########## File path: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ########## @@ -240,141 +238,137 @@ public boolean getIsColumnDictionary() { @Override public void writeByteArray(Object input, DataOutputStream dataOutputStream, Review comment: This part of code is messed up. Need to check and refactor `useActualData` first (which needs some analysis). Without that these flows will be like this. I will try to refactor to some extent now ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#discussion_r403855028 ########## File path: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ########## @@ -240,141 +238,137 @@ public boolean getIsColumnDictionary() { @Override public void writeByteArray(Object input, DataOutputStream dataOutputStream, - BadRecordLogHolder logHolder) throws IOException { - String parsedValue = - input == null ? null : DataTypeUtil.parseValue(input.toString(), carbonDimension); - String message = logHolder.getColumnMessageMap().get(carbonDimension.getColName()); - if (this.isDictionary) { - Integer surrogateKey; - if (null == parsedValue) { - surrogateKey = CarbonCommonConstants.MEMBER_DEFAULT_VAL_SURROGATE_KEY; - if (null == message) { - message = CarbonDataProcessorUtil - .prepareFailureReason(carbonDimension.getColName(), carbonDimension.getDataType()); - logHolder.getColumnMessageMap().put(carbonDimension.getColName(), message); - logHolder.setReason(message); - } - } else { - if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) { - surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input); - } else { - surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue); - } - if (surrogateKey == CarbonCommonConstants.INVALID_SURROGATE_KEY) { - surrogateKey = CarbonCommonConstants.MEMBER_DEFAULT_VAL_SURROGATE_KEY; - message = CarbonDataProcessorUtil - .prepareFailureReason(carbonDimension.getColName(), carbonDimension.getDataType()); - logHolder.getColumnMessageMap().put(carbonDimension.getColName(), message); - logHolder.setReason(message); - } - } - dataOutputStream.writeInt(surrogateKey); - } else { - // Transform into ByteArray for No Dictionary. - // TODO have to refactor and place all the cases present in NonDictionaryFieldConverterImpl - if (null == parsedValue && this.carbonDimension.getDataType() != DataTypes.STRING) { - updateNullValue(dataOutputStream, logHolder); - } else if (null == parsedValue || parsedValue.equals(nullFormat)) { + BadRecordLogHolder logHolder, Boolean isWithoutConverter) throws IOException { + String parsedValue = null; + if (null == input || + (this.carbonDimension.getDataType() == DataTypes.STRING && input.equals(nullFormat))) { + updateNullValue(dataOutputStream, logHolder); + return; + } + if (!isWithoutConverter) { + parsedValue = DataTypeUtil.parseValue(input.toString(), carbonDimension); + if (null == parsedValue || (this.carbonDimension.getDataType() == DataTypes.STRING + && parsedValue.equals(nullFormat))) { updateNullValue(dataOutputStream, logHolder); - } else { - String dateFormat = null; - if (this.carbonDimension.getDataType() == DataTypes.DATE) { - dateFormat = carbonDimension.getDateFormat(); - } else if (this.carbonDimension.getDataType() == DataTypes.TIMESTAMP) { - dateFormat = carbonDimension.getTimestampFormat(); - } - try { - if (!this.carbonDimension.getUseActualData()) { - byte[] value = null; - if (isDirectDictionary) { - int surrogateKey; - if (!(input instanceof Long) && !(input instanceof Integer)) { - SimpleDateFormat parser = new SimpleDateFormat(getDateFormat(carbonDimension)); - parser.parse(parsedValue); - } - // If the input is a long value then this means that logical type was provided by - // the user using AvroCarbonWriter. In this case directly generate surrogate key - // using dictionaryGenerator. - if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) { - surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input); - } else if (dictionaryGenerator instanceof DirectDictionary - && input instanceof Integer) { - // In case of file format, for complex type date or time type, input data comes as a - // Integer object, so just assign the surrogate key with the input object value - surrogateKey = (int) input; - } else { - surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue); - } - if (surrogateKey == CarbonCommonConstants.INVALID_SURROGATE_KEY) { - value = new byte[0]; + return; + } + } + // Transform into ByteArray for No Dictionary. + String dateFormat = null; + if (this.carbonDimension.getDataType() == DataTypes.DATE) { + dateFormat = carbonDimension.getDateFormat(); + } else if (this.carbonDimension.getDataType() == DataTypes.TIMESTAMP) { + dateFormat = carbonDimension.getTimestampFormat(); + } + try { + if (!this.carbonDimension.getUseActualData()) { + byte[] value; + if (isDirectDictionary) { + int surrogateKey; + // If the input is a long value then this means that logical type was provided by + // the user using AvroCarbonWriter. In this case directly generate surrogate key + // using dictionaryGenerator. + if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) { + surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input); + } else if (dictionaryGenerator instanceof DirectDictionary + && input instanceof Integer) { Review comment: done ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-609664557 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2646/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-609671853 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/936/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on issue #3687: [CARBONDATA-3761] Remove redundant conversion for complex type insert
URL: https://github.com/apache/carbondata/pull/3687#issuecomment-610286383 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
Free forum by Nabble | Edit this page |