GitHub user zzcclp opened a pull request:
https://github.com/apache/carbondata/pull/1516

[CARBONDATA-1729] Fix the compatibility issue with hadoop <= 2.6 and 2.7

1. Restore the 'hadoop-2.2.0' profile in pom.xml
2. Use the reflection mechanism to implement the 'truncate' method

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done. Please provide details on:
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach the test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata CARBONDATA-1729

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1516.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1516

----
commit 66e349b277251ebfb46adc48a833569de32e1799
Author: Zhang Zhichao <[hidden email]>
Date: 2017-11-17T02:29:12Z

    [CARBONDATA-1729] Fix the compatibility issue with hadoop <= 2.6 and 2.7

    1. Restore the 'hadoop-2.2.0' profile in pom.xml
    2. Use the reflection mechanism to implement the 'truncate' method
----

---
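For context, the "reflection mechanism" the PR describes can be sketched roughly as follows. This is a minimal sketch, not the PR's exact code (the real change appears in the diffs later in this thread): FileSystem.truncate(Path, long) only exists from Hadoop 2.7 onwards, so it is looked up at runtime instead of being referenced directly.

    import java.lang.reflect.Method;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Look up FileSystem.truncate at runtime so this code still loads and runs
    // against Hadoop < 2.7, where the method does not exist.
    static boolean truncateViaReflection(FileSystem fs, Path path, long newSize)
        throws ReflectiveOperationException {
      Method truncate = fs.getClass().getMethod("truncate", Path.class, long.class);
      return (boolean) truncate.invoke(fs, path, newSize);
    }

On Hadoop < 2.7 the getMethod call throws NoSuchMethodException, which a caller can catch to fall back to a copy-and-rename implementation, as the diffs below do.

---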
GitHub user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/1516

@QiangCai @jackylk please review, thanks.

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Failed with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1202/

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Failed with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1206/

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Success with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1213/

---
GitHub user zzcclp commented on the issue:
https://github.com/apache/carbondata/pull/1516

I have compiled successfully with the commands below:

1. mvn -Pspark-2.1 -Pbuild-with-format -Dspark.version=2.1.2 clean package
2. mvn -Pspark-2.1 -Phadoop-2.2.0 -Pbuild-with-format -Dspark.version=2.1.2 -Dhadoop.version=2.6.0-cdh5.7.1

@QiangCai @jackylk @chenliang613 please review, thanks.

---
GitHub user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151621845

--- Diff: pom.xml ---
@@ -453,9 +453,9 @@
       </build>
     </profile>
     <profile>
-      <id>hadoop-2.7.2</id>
+      <id>hadoop-2.2.0</id>
--- End diff --

You can add a new profile for hadoop-2.2.0; there is no need to overwrite hadoop-2.7.2. By default, hadoop-2.7.2 should be used.

---
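Roughly what that suggestion amounts to (a sketch with illustrative values only, not the actual pom.xml contents): keep hadoop-2.7.2 as the default profile and add hadoop-2.2.0 as a separate, opt-in one.

    <profile>
      <id>hadoop-2.7.2</id>
      <!-- default profile: used when no -Phadoop-* flag is given -->
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <properties>
        <hadoop.version>2.7.2</hadoop.version>
      </properties>
    </profile>
    <profile>
      <id>hadoop-2.2.0</id>
      <!-- opt-in profile, selected with -Phadoop-2.2.0 -->
      <properties>
        <hadoop.version>2.2.0</hadoop.version>
      </properties>
    </profile>

---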
GitHub user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151622204

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
    * @throws IOException
    */
   public static void truncateFile(String path, FileType fileType, long newSize)
       throws IOException {
-    path = path.replace("\\", "/");
-    FileChannel fileChannel = null;
-    switch (fileType) {
-      case LOCAL:
-        path = getUpdatedFilePath(path, fileType);
-        fileChannel = new FileOutputStream(path, true).getChannel();
-        try {
-          fileChannel.truncate(newSize);
-        } finally {
-          if (fileChannel != null) {
-            fileChannel.close();
-          }
-        }
-        return;
-      case HDFS:
-      case ALLUXIO:
-      case VIEWFS:
-      case S3:
-        Path pt = new Path(path);
-        FileSystem fs = pt.getFileSystem(configuration);
-        fs.truncate(pt, newSize);
--- End diff --

I think it is better to use Java reflection for line 485 only; there is no need to modify the rest of the file.

---
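A rough sketch of this narrower change (hypothetical code, not the merged fix): keep the existing truncateFile flow and swap only the direct fs.truncate(pt, newSize) call for a reflective one, so FileFactory still links against Hadoop < 2.7.

    import java.io.IOException;
    import java.lang.reflect.Method;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical replacement for the single fs.truncate(pt, newSize) line above.
    static void truncateOnDfs(Path pt, Configuration conf, long newSize) throws IOException {
      try {
        FileSystem fs = pt.getFileSystem(conf);
        // reflective lookup removes the compile-time dependency on Hadoop >= 2.7
        Method truncate = fs.getClass().getMethod("truncate", Path.class, long.class);
        truncate.invoke(fs, pt, newSize);
      } catch (ReflectiveOperationException e) {
        throw new IOException("FileSystem.truncate is unavailable before Hadoop 2.7", e);
      }
    }

---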
GitHub user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151626078

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java ---
@@ -154,52 +155,68 @@ public boolean delete() {
    * This method will delete the data in file data from a given offset
    */
   @Override public boolean truncate(String fileName, long validDataEndOffset) {
-    DataOutputStream dataOutputStream = null;
-    DataInputStream dataInputStream = null;
     boolean fileTruncatedSuccessfully = false;
-    // if bytes to read less than 1024 then buffer size should be equal to the given offset
-    int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
-        CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
-        (int) validDataEndOffset;
-    // temporary file name
-    String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
-    FileFactory.FileType fileType = FileFactory.getFileType(fileName);
     try {
-      CarbonFile tempFile;
-      // delete temporary file if it already exists at a given path
-      if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
-        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-        tempFile.delete();
-      }
-      // create new temporary file
-      FileFactory.createNewFile(tempWriteFilePath, fileType);
-      tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-      byte[] buff = new byte[bufferSize];
-      dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
-      // read the data
-      int read = dataInputStream.read(buff, 0, buff.length);
-      dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
-      dataOutputStream.write(buff, 0, read);
-      long remaining = validDataEndOffset - read;
-      // anytime we should not cross the offset to be read
-      while (remaining > 0) {
-        if (remaining > bufferSize) {
-          buff = new byte[bufferSize];
-        } else {
-          buff = new byte[(int) remaining];
+      // if hadoop version >= 2.7, it can call method 'truncate' to truncate file,
+      // this method was new in hadoop 2.7
+      FileSystem fs = fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
+      Method truncateMethod = fs.getClass().getDeclaredMethod("truncate",
+          new Class[]{Path.class, long.class});
+      fileTruncatedSuccessfully = (boolean) truncateMethod.invoke(fs,
+          new Object[]{fileStatus.getPath(), validDataEndOffset});
+    } catch (NoSuchMethodException e) {
+      LOGGER.error("there is no 'truncate' method in FileSystem, the version of hadoop is"
+          + " below 2.7, It needs to implement truncate file by other way.");
+      DataOutputStream dataOutputStream = null;
+      DataInputStream dataInputStream = null;
+      // if bytes to read less than 1024 then buffer size should be equal to the given offset
+      int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
+          CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
+          (int) validDataEndOffset;
+      // temporary file name
+      String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
+      FileFactory.FileType fileType = FileFactory.getFileType(fileName);
+      try {
+        CarbonFile tempFile;
+        // delete temporary file if it already exists at a given path
+        if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
+          tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+          tempFile.delete();
        }
-        read = dataInputStream.read(buff, 0, buff.length);
+        // create new temporary file
+        FileFactory.createNewFile(tempWriteFilePath, fileType);
+        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+        byte[] buff = new byte[bufferSize];
+        dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
+        // read the data
+        int read = dataInputStream.read(buff, 0, buff.length);
+        dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
         dataOutputStream.write(buff, 0, read);
-        remaining = remaining - read;
+        long remaining = validDataEndOffset - read;
+        // anytime we should not cross the offset to be read
+        while (remaining > 0) {
+          if (remaining > bufferSize) {
+            buff = new byte[bufferSize];
+          } else {
+            buff = new byte[(int) remaining];
+          }
+          read = dataInputStream.read(buff, 0, buff.length);
+          dataOutputStream.write(buff, 0, read);
+          remaining = remaining - read;
+        }
+        CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
+        // rename the temp file to original file
+        tempFile.renameForce(fileName);
+        fileTruncatedSuccessfully = true;
+      } catch (IOException ioe) {
+        LOGGER.error("IOException occurred while truncating the file " + ioe.getMessage());
+      } finally {
+        CarbonUtil.closeStreams(dataOutputStream, dataInputStream);
      }
-      CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
-      // rename the temp file to original file
-      tempFile.renameForce(fileName);
-      fileTruncatedSuccessfully = true;
    } catch (IOException e) {
-      LOGGER.error("Exception occurred while truncating the file " + e.getMessage());
-    } finally {
-      CarbonUtil.closeStreams(dataOutputStream, dataInputStream);
+      LOGGER.error("IOException occurred while truncating the file " + e.getMessage());
+    } catch (Exception e) {
--- End diff --

Why the finally block, and why do we need to add the catch again:

    catch (Exception e) {
      LOGGER.error("Other exception occurred while truncating the file " + e.getMessage());
    }

---
GitHub user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151628700

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
    * @throws IOException
    */
   public static void truncateFile(String path, FileType fileType, long newSize)
       throws IOException {
-    path = path.replace("\\", "/");
--- End diff --

Why remove this code?

---
GitHub user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151632100

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
    * @throws IOException
    */
   public static void truncateFile(String path, FileType fileType, long newSize)
       throws IOException {
-    path = path.replace("\\", "/");
-    FileChannel fileChannel = null;
-    switch (fileType) {
-      case LOCAL:
-        path = getUpdatedFilePath(path, fileType);
-        fileChannel = new FileOutputStream(path, true).getChannel();
-        try {
-          fileChannel.truncate(newSize);
-        } finally {
-          if (fileChannel != null) {
-            fileChannel.close();
-          }
-        }
-        return;
-      case HDFS:
-      case ALLUXIO:
-      case VIEWFS:
-      case S3:
-        Path pt = new Path(path);
-        FileSystem fs = pt.getFileSystem(configuration);
-        fs.truncate(pt, newSize);
--- End diff --

After discussing with @QiangCai offline, the plan is to just use the interface 'CarbonFile.truncate' to truncate files uniformly. @QiangCai, what do you think about this?

---
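If truncation goes through the CarbonFile interface uniformly, truncateFile could reduce to a thin delegation, along these lines (a sketch based on the names in the diffs above, not committed code):

    // All FileTypes (LOCAL, HDFS, ALLUXIO, VIEWFS, S3) would share one truncate path:
    // the concrete CarbonFile implementation owns path normalisation and the
    // actual truncate strategy (native truncate vs. copy-and-rename).
    public static void truncateFile(String path, FileType fileType, long newSize)
        throws IOException {
      FileFactory.getCarbonFile(path, fileType).truncate(path, newSize);
    }

---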
GitHub user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151632600

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
    * @throws IOException
    */
   public static void truncateFile(String path, FileType fileType, long newSize)
       throws IOException {
-    path = path.replace("\\", "/");
--- End diff --

We want to use the interface 'CarbonFile.truncate' to truncate files uniformly.

---
GitHub user zzcclp commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1516#discussion_r151633344

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java ---
@@ -154,52 +155,68 @@ public boolean delete() {
    * This method will delete the data in file data from a given offset
    */
   @Override public boolean truncate(String fileName, long validDataEndOffset) {
-    DataOutputStream dataOutputStream = null;
-    DataInputStream dataInputStream = null;
     boolean fileTruncatedSuccessfully = false;
-    // if bytes to read less than 1024 then buffer size should be equal to the given offset
-    int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
-        CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
-        (int) validDataEndOffset;
-    // temporary file name
-    String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
-    FileFactory.FileType fileType = FileFactory.getFileType(fileName);
     try {
-      CarbonFile tempFile;
-      // delete temporary file if it already exists at a given path
-      if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
-        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-        tempFile.delete();
-      }
-      // create new temporary file
-      FileFactory.createNewFile(tempWriteFilePath, fileType);
-      tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-      byte[] buff = new byte[bufferSize];
-      dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
-      // read the data
-      int read = dataInputStream.read(buff, 0, buff.length);
-      dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
-      dataOutputStream.write(buff, 0, read);
-      long remaining = validDataEndOffset - read;
-      // anytime we should not cross the offset to be read
-      while (remaining > 0) {
-        if (remaining > bufferSize) {
-          buff = new byte[bufferSize];
-        } else {
-          buff = new byte[(int) remaining];
+      // if hadoop version >= 2.7, it can call method 'truncate' to truncate file,
+      // this method was new in hadoop 2.7
+      FileSystem fs = fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
+      Method truncateMethod = fs.getClass().getDeclaredMethod("truncate",
+          new Class[]{Path.class, long.class});
+      fileTruncatedSuccessfully = (boolean) truncateMethod.invoke(fs,
+          new Object[]{fileStatus.getPath(), validDataEndOffset});
+    } catch (NoSuchMethodException e) {
+      LOGGER.error("there is no 'truncate' method in FileSystem, the version of hadoop is"
+          + " below 2.7, It needs to implement truncate file by other way.");
+      DataOutputStream dataOutputStream = null;
+      DataInputStream dataInputStream = null;
+      // if bytes to read less than 1024 then buffer size should be equal to the given offset
+      int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
+          CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
+          (int) validDataEndOffset;
+      // temporary file name
+      String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
+      FileFactory.FileType fileType = FileFactory.getFileType(fileName);
+      try {
+        CarbonFile tempFile;
+        // delete temporary file if it already exists at a given path
+        if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
+          tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+          tempFile.delete();
        }
-        read = dataInputStream.read(buff, 0, buff.length);
+        // create new temporary file
+        FileFactory.createNewFile(tempWriteFilePath, fileType);
+        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+        byte[] buff = new byte[bufferSize];
+        dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
+        // read the data
+        int read = dataInputStream.read(buff, 0, buff.length);
+        dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
         dataOutputStream.write(buff, 0, read);
-        remaining = remaining - read;
+        long remaining = validDataEndOffset - read;
+        // anytime we should not cross the offset to be read
+        while (remaining > 0) {
+          if (remaining > bufferSize) {
+            buff = new byte[bufferSize];
+          } else {
+            buff = new byte[(int) remaining];
+          }
+          read = dataInputStream.read(buff, 0, buff.length);
+          dataOutputStream.write(buff, 0, read);
+          remaining = remaining - read;
+        }
+        CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
+        // rename the temp file to original file
+        tempFile.renameForce(fileName);
+        fileTruncatedSuccessfully = true;
+      } catch (IOException ioe) {
+        LOGGER.error("IOException occurred while truncating the file " + ioe.getMessage());
+      } finally {
+        CarbonUtil.closeStreams(dataOutputStream, dataInputStream);
      }
-      CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
-      // rename the temp file to original file
-      tempFile.renameForce(fileName);
-      fileTruncatedSuccessfully = true;
    } catch (IOException e) {
-      LOGGER.error("Exception occurred while truncating the file " + e.getMessage());
-    } finally {
-      CarbonUtil.closeStreams(dataOutputStream, dataInputStream);
+      LOGGER.error("IOException occurred while truncating the file " + e.getMessage());
+    } catch (Exception e) {
--- End diff --

Method.invoke may throw other exceptions at runtime, such as IllegalAccessException, IllegalArgumentException, and InvocationTargetException, so this catch is needed for those exceptions. The finally block belongs to the inner try block inside the catch (NoSuchMethodException) branch.

---
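The control flow being described, stripped to a skeleton (illustrative only; names follow the diff above, and the fallback body is elided):

    import java.lang.reflect.Method;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Skeleton of the exception handling under discussion, not the PR code.
    static boolean truncateSafely(FileSystem fs, Path path, long validDataEndOffset) {
      boolean fileTruncatedSuccessfully = false;
      try {
        // resolves only on Hadoop >= 2.7, where FileSystem declares truncate
        Method truncateMethod =
            fs.getClass().getDeclaredMethod("truncate", Path.class, long.class);
        fileTruncatedSuccessfully =
            (boolean) truncateMethod.invoke(fs, path, validDataEndOffset);
      } catch (NoSuchMethodException e) {
        // Hadoop < 2.7: fall back to copy-and-rename truncation; that branch's
        // own inner try/finally is what closes the streams
      } catch (Exception e) {
        // invoke() can also throw IllegalAccessException, IllegalArgumentException
        // and InvocationTargetException at runtime; this broad catch covers them
      }
      return fileTruncatedSuccessfully;
    }

---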
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Success with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1296/

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Success with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1298/

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Success with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1300/

---
GitHub user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1516

SDV Build Fail, Please check CI
http://144.76.159.231:8080/job/ApacheSDVTests/1765/

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Failed with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1303/

---
GitHub user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1516

Build Success with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1305/

---
GitHub user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1516

SDV Build Fail, Please check CI
http://144.76.159.231:8080/job/ApacheSDVTests/1767/

---