CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-584197385 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/219/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-584234139 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1921/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#discussion_r377419646 ########## File path: streaming/src/main/java/org/apache/carbondata/streaming/CarbonStreamRecordWriter.java ########## @@ -188,7 +190,7 @@ private void initializeAtFirstRow() throws IOException { compressorName = carbonTable.getTableInfo().getFactTable().getTableProperties().get( CarbonCommonConstants.COMPRESSOR); if (null == compressorName) { - compressorName = CompressorFactory.getInstance().getCompressor().getName(); + compressorName = CompressorFactory.NativeSupportedCompressor.SNAPPY.getName(); Review comment: ok, please add a TODO ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#discussion_r377419707 ########## File path: streaming/src/main/java/org/apache/carbondata/streaming/CarbonStreamRecordWriter.java ########## @@ -140,7 +140,9 @@ private void initialize(TaskAttemptContext job) throws IOException { segmentDir = CarbonTablePath.getSegmentPath( carbonTable.getAbsoluteTableIdentifier().getTablePath(), segmentId); - fileName = CarbonTablePath.getCarbonDataFileName(0, taskNo + "", 0, 0, "0", segmentId); + fileName = CarbonTablePath.getCarbonDataFileName( Review comment: ok, please add a TODO ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-585133694 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/257/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-585156992 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1960/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-589554209 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/386/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-589588444 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2087/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
Indhumathi27 commented on a change in pull request #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382532147 ########## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ########## @@ -409,11 +433,10 @@ public static String getTimeStampFromFileName(String carbonDataFileName) { * @return */ public static Boolean compareCarbonFileTimeStamp(String fileName, Long timestamp) { - int lastIndexOfHyphen = fileName.lastIndexOf("-"); - int lastIndexOfDot = fileName.lastIndexOf("."); - if (lastIndexOfHyphen > 0 && lastIndexOfDot > 0) { - return fileName.substring(fileName.lastIndexOf("-") + 1, fileName.lastIndexOf(".")) - .equals(timestamp.toString()); + int startIndex = fileName.lastIndexOf("-"); Review comment: Suggest to use CarbonCommonConstants for "-". "." ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382908617 ########## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ########## @@ -409,11 +433,10 @@ public static String getTimeStampFromFileName(String carbonDataFileName) { * @return */ public static Boolean compareCarbonFileTimeStamp(String fileName, Long timestamp) { - int lastIndexOfHyphen = fileName.lastIndexOf("-"); - int lastIndexOfDot = fileName.lastIndexOf("."); - if (lastIndexOfHyphen > 0 && lastIndexOfDot > 0) { - return fileName.substring(fileName.lastIndexOf("-") + 1, fileName.lastIndexOf(".")) - .equals(timestamp.toString()); + int startIndex = fileName.lastIndexOf("-"); Review comment: This func is not used anywhere, I deleted it ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3681][CARBONDATA-3688] Add compressor name in data file name and change default compressor to ZSTD
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382908617 ########## File path: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java ########## @@ -409,11 +433,10 @@ public static String getTimeStampFromFileName(String carbonDataFileName) { * @return */ public static Boolean compareCarbonFileTimeStamp(String fileName, Long timestamp) { - int lastIndexOfHyphen = fileName.lastIndexOf("-"); - int lastIndexOfDot = fileName.lastIndexOf("."); - if (lastIndexOfHyphen > 0 && lastIndexOfDot > 0) { - return fileName.substring(fileName.lastIndexOf("-") + 1, fileName.lastIndexOf(".")) - .equals(timestamp.toString()); + int startIndex = fileName.lastIndexOf("-"); Review comment: fixed ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-589954095 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/399/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#issuecomment-589960864 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2100/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382961481 ########## File path: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/describeTable/TestDescribeTable.scala ########## @@ -68,15 +68,6 @@ class TestDescribeTable extends QueryTest with BeforeAndAfterAll { assert(descPar.exists(_.toString().contains("Partition Parameters:"))) } - test(testName = "Compressor Type update from carbon properties") { - sql("drop table if exists b") - sql(sqlText = "create table b(a int,b string) STORED AS carbondata") - CarbonProperties.getInstance().addProperty(CarbonCommonConstants.COMPRESSOR, "gzip") - val result = sql(sqlText = "desc formatted b").collect() - assert(result.filter(row => row.getString(0).contains("Data File Compressor")).head.getString Review comment: As now we are not changing the default compressor, we can revert this change ? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382961494 ########## File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala ########## @@ -141,11 +141,6 @@ private[sql] case class CarbonDescribeFormattedCommand( Strings.formatSize( tblProps.getOrElse(CarbonCommonConstants.CARBON_LOAD_MIN_SIZE_INMB, CarbonCommonConstants.CARBON_LOAD_MIN_SIZE_INMB_DEFAULT).toFloat), ""), - ("Data File Compressor ", tblProps - .getOrElse(CarbonCommonConstants.COMPRESSOR, - CarbonProperties.getInstance() Review comment: As now we are not changing the default compressor, we can revert this change ? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382961552 ########## File path: streaming/src/main/java/org/apache/carbondata/streaming/CarbonStreamRecordWriter.java ########## @@ -139,7 +139,9 @@ private void initialize(TaskAttemptContext job) throws IOException { segmentDir = CarbonTablePath.getSegmentPath( carbonTable.getAbsoluteTableIdentifier().getTablePath(), segmentId); - fileName = CarbonTablePath.getCarbonDataFileName(0, taskNo + "", 0, 0, "0", segmentId); + fileName = CarbonTablePath.getCarbonDataFileName( + 0, taskNo + "", 0, 0, "0", + segmentId, CompressorFactory.NativeSupportedCompressor.SNAPPY.getName()); Review comment: revert this or use CCC default compressor ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r382961565 ########## File path: streaming/src/main/java/org/apache/carbondata/streaming/CarbonStreamRecordWriter.java ########## @@ -186,7 +188,7 @@ private void initializeAtFirstRow() throws IOException { compressorName = carbonTable.getTableInfo().getFactTable().getTableProperties().get( CarbonCommonConstants.COMPRESSOR); if (null == compressorName) { - compressorName = CompressorFactory.getInstance().getCompressor().getName(); + compressorName = CompressorFactory.NativeSupportedCompressor.SNAPPY.getName(); Review comment: revert this or use CCC default compressor ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r383083186 ########## File path: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/describeTable/TestDescribeTable.scala ########## @@ -68,15 +68,6 @@ class TestDescribeTable extends QueryTest with BeforeAndAfterAll { assert(descPar.exists(_.toString().contains("Partition Parameters:"))) } - test(testName = "Compressor Type update from carbon properties") { - sql("drop table if exists b") - sql(sqlText = "create table b(a int,b string) STORED AS carbondata") - CarbonProperties.getInstance().addProperty(CarbonCommonConstants.COMPRESSOR, "gzip") - val result = sql(sqlText = "desc formatted b").collect() - assert(result.filter(row => row.getString(0).contains("Data File Compressor")).head.getString Review comment: No, this print is not correct. For example, when loading, user set to use ZSTD compressor, and then set to SNAPPY again after loading, then this command still prints SNAPPY which is misleading ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r383083186 ########## File path: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/describeTable/TestDescribeTable.scala ########## @@ -68,15 +68,6 @@ class TestDescribeTable extends QueryTest with BeforeAndAfterAll { assert(descPar.exists(_.toString().contains("Partition Parameters:"))) } - test(testName = "Compressor Type update from carbon properties") { - sql("drop table if exists b") - sql(sqlText = "create table b(a int,b string) STORED AS carbondata") - CarbonProperties.getInstance().addProperty(CarbonCommonConstants.COMPRESSOR, "gzip") - val result = sql(sqlText = "desc formatted b").collect() - assert(result.filter(row => row.getString(0).contains("Data File Compressor")).head.getString Review comment: No, this print is not correct. For example, when loading, user set to use ZSTD compressor, and then set to SNAPPY again after loading, then this command still prints SNAPPY which is misleading. Compressor should be file level by checking the file name or the meta inside the file ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3606: [CARBONDATA-3688] Add compressor name in data file name
URL: https://github.com/apache/carbondata/pull/3606#discussion_r383083275 ########## File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala ########## @@ -141,11 +141,6 @@ private[sql] case class CarbonDescribeFormattedCommand( Strings.formatSize( tblProps.getOrElse(CarbonCommonConstants.CARBON_LOAD_MIN_SIZE_INMB, CarbonCommonConstants.CARBON_LOAD_MIN_SIZE_INMB_DEFAULT).toFloat), ""), - ("Data File Compressor ", tblProps - .getOrElse(CarbonCommonConstants.COMPRESSOR, - CarbonProperties.getInstance() Review comment: explained above ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
Free forum by Nabble | Edit this page |