GitHub user anubhav100 opened a pull request:
https://github.com/apache/incubator-carbondata/pull/607

[CARBONDATA-712] Resolve the bug where bad records are not written to the CSV file when 'BAD_RECORDS_ACTION'='REDIRECT'

https://issues.apache.org/jira/browse/CARBONDATA-712

When 'BAD_RECORDS_ACTION'='REDIRECT' is used at load time, the bad records are also not written to the BAD_RECORDS_LOCATION: both the in-progress file and the log file are empty.

Nothing in the I/O path guarantees that the data has reached disk. Data written to a buffered stream is not written out immediately; it is held in the buffer. We should therefore call flush() whenever we need to be sure that all data in the buffer has been written, so bufferedCSVWriter.write(logStrings.toString()) must be followed by bufferedCSVWriter.flush().

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/anubhav100/incubator-carbondata CARBONDATA-712

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/607.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #607

----
commit 02f1c6c4d7b7e03589a35ee09a8303795c1c0b41
Author: anubhav100 <[hidden email]>
Date: 2017-02-21T06:45:14Z

    resolved the issue for bad records not written in csv file when bad_records_action=redirect

----
---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at [hidden email] or file a JIRA ticket with INFRA.
---
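The buffering behaviour described in this pull request can be reproduced in isolation. The following is only a minimal sketch, not the CarbonData code: the class and method names (BufferingSketch, visibleBytes) are hypothetical, and an in-memory ByteArrayOutputStream stands in for the bad-records file so that the number of bytes that actually reached the sink can be inspected.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of CarbonData.
public class BufferingSketch {

  // Returns {bytes in the sink before flush(), bytes in the sink after flush()}.
  public static int[] visibleBytes(String record) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (BufferedWriter writer =
        new BufferedWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
      writer.write(record);
      int before = sink.size(); // a short record is still held in the writer's buffer
      writer.flush();
      int after = sink.size();  // flush() pushes the buffered characters to the sink
      return new int[] {before, after};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int[] sizes = visibleBytes("bad,record,row");
    System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
  }
}
```

This only shows why the files looked empty; whether flushing after every record is the right fix is a separate question, taken up in the review below.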
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/925/
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102144509

--- Diff: processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/BadRecordsLogger.java ---
@@ -179,6 +179,7 @@ private synchronized void writeBadRecordsToFile(StringBuilder logStrings) {
       }
       bufferedWriter.write(logStrings.toString());
+      bufferedWriter.flush();
--- End diff --

This is not the correct way. Please don't flush for every row; it slows things down a lot. Please close the streams properly; that will solve the problem.
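The reviewer's point rests on the fact that BufferedWriter.close() flushes the buffer before releasing the stream, so no per-row flush() is needed as long as the writer is closed once at the end. A minimal sketch of that idea follows; the class and method names (CloseOnceSketch, writeAll) are hypothetical and an in-memory sink is assumed in place of the real bad-records file.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of CarbonData.
public class CloseOnceSketch {

  // Writes every record without an explicit flush(), closes the writer once,
  // and returns the text that reached the sink.
  public static String writeAll(String[] records) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (BufferedWriter writer =
        new BufferedWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
      for (String record : records) {
        writer.write(record);
        writer.write('\n'); // fixed separator so the result is platform independent
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // try-with-resources has closed the writer here; close() flushed the buffer first.
    return new String(sink.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.print(writeAll(new String[] {"row1,bad", "row2,bad"}));
  }
}
```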
Github user anubhav100 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

@ravipesala the changes are done, can you have a look?
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/926/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/927/
Github user anubhav100 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102242075

--- Diff: processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/BadRecordsLogger.java ---
@@ -179,6 +179,7 @@ private synchronized void writeBadRecordsToFile(StringBuilder logStrings) {
       }
       bufferedWriter.write(logStrings.toString());
+      bufferedWriter.flush();
--- End diff --

The changes are done, can you review?
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102593162

--- Diff: processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/BadRecordsLogger.java ---
@@ -176,15 +176,13 @@ private synchronized void writeBadRecordsToFile(StringBuilder logStrings) {
         bufferedWriter = new BufferedWriter(new OutputStreamWriter(outStream, Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET)));
-
       }
       bufferedWriter.write(logStrings.toString());
-      bufferedWriter.flush();
       bufferedWriter.newLine();
     } catch (FileNotFoundException e) {
       LOGGER.error("Bad Log Files not found");
     } catch (IOException e) {
-      LOGGER.error("Error While writing bad log File");
+      LOGGER.error("Error While writ1ing bad log File");
--- End diff --

"writ1ing" is wrong.
Github user anubhav100 commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102628013

--- Diff: processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/BadRecordsLogger.java ---
@@ -176,15 +176,13 @@ private synchronized void writeBadRecordsToFile(StringBuilder logStrings) {
         bufferedWriter = new BufferedWriter(new OutputStreamWriter(outStream, Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET)));
-
       }
       bufferedWriter.write(logStrings.toString());
-      bufferedWriter.flush();
       bufferedWriter.newLine();
     } catch (FileNotFoundException e) {
       LOGGER.error("Bad Log Files not found");
     } catch (IOException e) {
-      LOGGER.error("Error While writing bad log File");
+      LOGGER.error("Error While writ1ing bad log File");
--- End diff --

@chenliang613 I corrected it, can you review?
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/936/
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102637522

--- Diff: processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/BadRecordsLogger.java ---
@@ -176,7 +176,6 @@ private synchronized void writeBadRecordsToFile(StringBuilder logStrings) {
         bufferedWriter = new BufferedWriter(new OutputStreamWriter(outStream, Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET)));
-
--- End diff --

Don't make unnecessary changes.
Github user ravipesala commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

@anubhav100 it is not correct to close the streams for every row. How will the other rows be added to the bad records if the stream is closed after each row?
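The objection can be illustrated with a small sketch. It assumes the writer is not reopened after being closed (whether the real logger reopens it is not shown in this thread), and the class and method names (ClosedWriterSketch, secondWriteFails) are hypothetical. Once a BufferedWriter has been closed, any further write() throws an IOException, so a second bad record has nowhere to go.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of CarbonData.
public class ClosedWriterSketch {

  // Returns true if writing a second record after close() is rejected.
  public static boolean secondWriteFails() {
    BufferedWriter writer = new BufferedWriter(
        new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.UTF_8));
    try {
      writer.write("first bad record");
      writer.close(); // closing after the first row ...
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    try {
      writer.write("second bad record"); // ... leaves no open stream for the next row
      return false;
    } catch (IOException e) {
      return true; // a closed BufferedWriter rejects further writes
    }
  }

  public static void main(String[] args) {
    System.out.println("second write fails: " + secondWriteFails());
  }
}
```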
Github user anubhav100 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

@ravipesala can you have another look?
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/938/
Github user CarbonDataQA commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/939/
Github user anubhav100 commented on the issue:
https://github.com/apache/incubator-carbondata/pull/607

@ravipesala can you review?
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102887780

--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataConverterProcessorStepImpl.java ---
@@ -93,6 +93,7 @@ protected CarbonRowBatch processRowBatch(CarbonRowBatch rowBatch, RowConverter l
     while (rowBatch.hasNext()) {
       newBatch.addRow(localConverter.convert(rowBatch.next()));
     }
+    createBadRecordLogger().closeStreams();
--- End diff --

Here, too, we should not close the bad-records streams after every batch is processed.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102887816

--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataConverterProcessorStepImpl.java ---
@@ -93,6 +93,7 @@ protected CarbonRowBatch processRowBatch(CarbonRowBatch rowBatch, RowConverter l
     while (rowBatch.hasNext()) {
       newBatch.addRow(localConverter.convert(rowBatch.next()));
     }
+    createBadRecordLogger().closeStreams();
--- End diff --

Please call closeStreams from the close method.
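The lifecycle being asked for can be sketched as follows: the logger is created once, every batch writes through the same open stream, and the stream is closed a single time from the step's close method. This is only an illustration under stated assumptions. RecordLogger and Step are hypothetical stand-ins, not the real BadRecordsLogger or DataConverterProcessorStepImpl, and treating an empty row as a bad record is an invented rule for the demo.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo classes; not part of CarbonData.
public class StepLifecycleSketch {

  // Stand-in for a bad-records logger that owns one buffered stream.
  static final class RecordLogger {
    private final BufferedWriter writer;

    RecordLogger(OutputStream out) {
      writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }

    void log(String record) throws IOException {
      writer.write(record);
      writer.write('\n');
    }

    void closeStreams() throws IOException {
      writer.close(); // flushes whatever is still buffered, then closes
    }
  }

  // Stand-in for a processing step: many batches, one close.
  static final class Step implements AutoCloseable {
    private final RecordLogger logger;

    Step(OutputStream out) {
      logger = new RecordLogger(out);
    }

    void processBatch(String[] batch) throws IOException {
      for (String row : batch) {
        if (row.isEmpty()) { // invented rule: an empty row counts as a bad record
          logger.log("bad record: empty row");
        }
      }
      // No closeStreams() here: the stream must stay open for the next batch.
    }

    @Override
    public void close() throws IOException {
      logger.closeStreams(); // closed once, when the step itself is closed
    }
  }

  // Processes all batches through one step and returns how many bad records were logged.
  public static int run(String[][] batches) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (Step step = new Step(sink)) {
      for (String[] batch : batches) {
        step.processBatch(batch);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    String text = new String(sink.toByteArray(), StandardCharsets.UTF_8);
    return (int) text.chars().filter(c -> c == '\n').count();
  }

  public static void main(String[] args) {
    System.out.println(run(new String[][] {{"a", ""}, {"", "b", ""}}));
  }
}
```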
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102887832

--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/converter/impl/RowConverterImpl.java ---
@@ -135,6 +135,7 @@ public CarbonRow convert(CarbonRow row) throws CarbonDataLoadingException {
         return null;
       }
     }
+
--- End diff --

Remove the unnecessary line.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/607#discussion_r102887862

--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataConverterProcessorWithBucketingStepImpl.java ---
@@ -120,6 +120,7 @@ protected CarbonRowBatch processRowBatch(CarbonRowBatch rowBatch, RowConverter l
       convertRow.bucketNumber = (short) partitioner.getPartition(next.getData());
       newBatch.addRow(convertRow);
     }
+    createBadRecordLogger().closeStreams();
--- End diff --

Call it from the close method here as well.