[GitHub] [carbondata] niuge01 opened a new pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
niuge01 opened a new pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564
 
 
    ### Why is this PR needed?
    Currently, a struct field value can only be set as a delimited string, which sometimes does not work well for struct<binary> fields.
   
    ### What changes were proposed in this PR?
    Support setting a base64 string as a struct<binary> field value.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
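One way a delimited string can break down for binary data is sketched below. This is an illustration only: the `$` delimiter and the helper names `joinFields` and `splitFields` are assumptions made for the example, not CarbonData code. It shows a binary payload that happens to contain the delimiter byte being split into an extra field, while the base64 form of the same payload is not, because the standard base64 alphabet does not include `$`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Pattern;

public class StructBinaryDelimiterDemo {
  // Hypothetical delimiter chosen for the example; real delimiters are configurable.
  static final String DELIMITER = "$";

  // Joins the fields of a struct into one delimited string.
  public static String joinFields(String... fields) {
    return String.join(DELIMITER, fields);
  }

  // Splits on the literal delimiter, keeping trailing empty fields.
  public static String[] splitFields(String row) {
    return row.split(Pattern.quote(DELIMITER), -1);
  }

  public static void main(String[] args) {
    // A binary payload whose middle byte equals the delimiter character.
    byte[] payload = {'a', '$', 'b'};

    // Raw form: the payload is copied into the row as text.
    String raw = new String(payload, StandardCharsets.ISO_8859_1);
    // Base64 form: only A-Z, a-z, 0-9, '+', '/' and '=' can appear.
    String encoded = Base64.getEncoder().encodeToString(payload);

    // The two-field struct is parsed as three fields in the raw form.
    System.out.println(splitFields(joinFields("id1", raw)).length);
    // The base64 form keeps the expected two fields.
    System.out.println(splitFields(joinFields("id1", encoded)).length);
  }
}
```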
   
       
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services

[GitHub] [carbondata] niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571539841
 
 
   please test this


[GitHub] [carbondata] Zhangshunyu commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
Zhangshunyu commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r363706267
 
 

 ##########
 File path: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala
 ##########
 @@ -63,16 +66,22 @@ object FieldConverter {
         case b: java.lang.Boolean => b.toString
         case s: java.lang.Short => s.toString
         case f: java.lang.Float => f.toString
-        case bs: Array[Byte] => new String(bs,
-          Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
+        case bs: Array[Byte] =>
+          if (isInsertFlow) {
+            Base64.getEncoder.encodeToString(bs)
 
 Review comment:
   Could you please add a comment here explaining why base64 is used when 'isInsertFlow' is true?
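   A likely reason, sketched here under the assumption that the default charset is UTF-8: decoding arbitrary bytes as text is lossy, because byte sequences that are not valid in the charset are replaced during decoding, while base64 round-trips any byte array. The class and method names below are made up for the illustration and are not part of the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryRoundTripDemo {
  // Converts bytes to text with a charset and back again (the removed approach,
  // assuming UTF-8 as the charset).
  public static byte[] viaCharset(byte[] bytes) {
    String text = new String(bytes, StandardCharsets.UTF_8);
    return text.getBytes(StandardCharsets.UTF_8);
  }

  // Converts bytes to a base64 string and back again (the approach in the diff).
  public static byte[] viaBase64(byte[] bytes) {
    String text = Base64.getEncoder().encodeToString(bytes);
    return Base64.getDecoder().decode(text);
  }

  public static void main(String[] args) {
    // 0xFF and 0xFE can never appear in well-formed UTF-8.
    byte[] payload = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};

    // false: the invalid bytes were replaced while decoding.
    System.out.println(Arrays.equals(payload, viaCharset(payload)));
    // true: base64 preserves every byte.
    System.out.println(Arrays.equals(payload, viaBase64(payload)));
  }
}
```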


[GitHub] [carbondata] Zhangshunyu commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
Zhangshunyu commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r363709880
 
 

 ##########
 File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
 ##########
 @@ -70,7 +70,7 @@ object CarbonScalaUtil {
       level: Int = 0): String = {
     FieldConverter.objectToString(value, serializationNullFormat, complexDelimiters,
       timeStampFormat, dateFormat, isVarcharType = isVarcharType, isComplexType = isComplexType,
-      level)
+      level, true)
 
 Review comment:
   This is a static method in a util class. Why is the last parameter, 'isInsertFlow', always 'true'?



[GitHub] [carbondata] niuge01 commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r363713449
 
 

 ##########
 File path: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
 ##########
 @@ -70,7 +70,7 @@ object CarbonScalaUtil {
       level: Int = 0): String = {
     FieldConverter.objectToString(value, serializationNullFormat, complexDelimiters,
       timeStampFormat, dateFormat, isVarcharType = isVarcharType, isComplexType = isComplexType,
-      level)
+      level, true)
 
 Review comment:
   Currently, this method is only invoked by the insert flow, so the 'isInsertFlow' argument is fixed to 'true'.
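   The pattern under discussion can be sketched as follows. This is a simplified, hypothetical model rather than the project's code: `bytesToString` stands in for the converter that takes the flag, `bytesToStringForInsert` stands in for the util wrapper that always passes `true`, and the non-insert branch is assumed to keep the old charset decoding with UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class FieldToStringSketch {
  // Hypothetical, simplified converter: the flag selects the string form of a binary value.
  public static String bytesToString(byte[] bytes, boolean isInsertFlow) {
    if (isInsertFlow) {
      // Insert flow: base64 keeps arbitrary bytes intact inside a delimited row.
      return Base64.getEncoder().encodeToString(bytes);
    }
    // Other flows (assumed): keep decoding the bytes as text.
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Hypothetical wrapper used only by the insert flow, so the flag is fixed to true.
  public static String bytesToStringForInsert(byte[] bytes) {
    return bytesToString(bytes, true);
  }

  public static void main(String[] args) {
    byte[] value = "hi".getBytes(StandardCharsets.UTF_8);
    System.out.println(bytesToStringForInsert(value)); // aGk=
    System.out.println(bytesToString(value, false));   // hi
  }
}
```

   Giving the wrapper a name that mentions the insert flow is one way to make the fixed argument self-explanatory at the call site.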


[GitHub] [carbondata] niuge01 commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r363713592
 
 

 ##########
 File path: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala
 ##########
 @@ -63,16 +66,22 @@ object FieldConverter {
         case b: java.lang.Boolean => b.toString
         case s: java.lang.Short => s.toString
         case f: java.lang.Float => f.toString
-        case bs: Array[Byte] => new String(bs,
-          Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
+        case bs: Array[Byte] =>
+          if (isInsertFlow) {
+            Base64.getEncoder.encodeToString(bs)
 
 Review comment:
   OK


[GitHub] [carbondata] niuge01 commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r363716048
 
 

 ##########
 File path: streaming/src/main/scala/org/apache/carbondata/streaming/parser/FieldConverter.scala
 ##########
 @@ -63,16 +66,22 @@ object FieldConverter {
         case b: java.lang.Boolean => b.toString
         case s: java.lang.Short => s.toString
         case f: java.lang.Float => f.toString
-        case bs: Array[Byte] => new String(bs,
-          Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET))
+        case bs: Array[Byte] =>
+          if (isInsertFlow) {
+            Base64.getEncoder.encodeToString(bs)
 
 Review comment:
   done


[GitHub] [carbondata] niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571557965
 
 
   retest this please


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571583737
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1505/
   


[GitHub] [carbondata] niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571667948
 
 
   retest this please


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571700026
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1512/
   


[GitHub] [carbondata] niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571718103
 
 
   retest this please


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571750466
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1515/
   


[GitHub] [carbondata] niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571763295
 
 
   retest this please


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571796245
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1516/
   


[GitHub] [carbondata] niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
niuge01 commented on issue #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#issuecomment-571842368
 
 
   retest this please


[GitHub] [carbondata] jackylk commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r364030859
 
 

 ##########
 File path: integration/flink/src/main/java/org/apache/carbon/flink/CarbonLocalWriter.java
 ##########
 @@ -51,19 +52,29 @@
   ) {
     super(factory, identifier, table);
     final Properties writerProperties = factory.getConfiguration().getWriterProperties();
+    final Properties carbonProperties = factory.getConfiguration().getCarbonProperties();
     final String commitThreshold =
         writerProperties.getProperty(CarbonLocalProperty.COMMIT_THRESHOLD);
     this.writerFactory = new WriterFactory(table, writePath) {
       @Override
       protected org.apache.carbondata.sdk.file.CarbonWriter newWriter(
           final Object[] row) {
         try {
-          return org.apache.carbondata.sdk.file.CarbonWriter.builder()
+          final CarbonWriterBuilder writerBuilder =
+              org.apache.carbondata.sdk.file.CarbonWriter.builder()
               .outputPath(super.getWritePath(row))
               .writtenBy("flink")
               .withSchemaFile(CarbonTablePath.getSchemaFilePath(table.getTablePath()))
-              .withCsvInput()
-              .build();
+              .withCsvInput();
+          for (String propertyName : carbonProperties.stringPropertyNames()) {
+            try {
+              writerBuilder.withLoadOption(propertyName,
+                  carbonProperties.getProperty(propertyName));
+            } catch (IllegalArgumentException ignore) {
+              // Ignore.
 
 Review comment:
   Suggest printing a log to warn the user.
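   A minimal sketch of what such a warning could look like, under stated assumptions: `LoadOptionSink` is a hypothetical stand-in for the builder's `withLoadOption` call shown in the diff, and `java.util.logging` is used only to keep the example self-contained (the project may use a different logging facility).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.logging.Logger;

public class LoadOptionWarningSketch {
  private static final Logger LOGGER =
      Logger.getLogger(LoadOptionWarningSketch.class.getName());

  /** Hypothetical stand-in for the writer builder's withLoadOption(name, value). */
  public interface LoadOptionSink {
    void withLoadOption(String name, String value);
  }

  // Applies every property as a load option; rejected options are logged, not silently dropped.
  public static List<String> applyLoadOptions(Properties properties, LoadOptionSink sink) {
    List<String> rejected = new ArrayList<>();
    for (String name : properties.stringPropertyNames()) {
      try {
        sink.withLoadOption(name, properties.getProperty(name));
      } catch (IllegalArgumentException e) {
        // Warn instead of ignoring, so a mistyped option name is visible to the user.
        LOGGER.warning("Ignoring unsupported load option '" + name + "': " + e.getMessage());
        rejected.add(name);
      }
    }
    return rejected;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("known_option", "1");
    props.setProperty("unknown_option", "2");

    // A sink that accepts only one option name, for demonstration.
    List<String> rejected = applyLoadOptions(props, (name, value) -> {
      if (!name.equals("known_option")) {
        throw new IllegalArgumentException("unsupported option");
      }
    });
    System.out.println(rejected);
  }
}
```

   Returning the rejected names is an addition made for the example so that the behavior can be checked; a logging call alone would satisfy the review comment.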



[GitHub] [carbondata] jackylk commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3564: [CARBONDATA-3655] Support set base64 string as struct<binary> field value
URL: https://github.com/apache/carbondata/pull/3564#discussion_r364030946
 
 

 ##########
 File path: integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java
 ##########
 @@ -54,20 +55,30 @@
   ) {
     super(factory, identifier, table);
     final Properties writerProperties = factory.getConfiguration().getWriterProperties();
+    final Properties carbonProperties = factory.getConfiguration().getCarbonProperties();
     final String commitThreshold =
         writerProperties.getProperty(CarbonS3Property.COMMIT_THRESHOLD);
     this.writerFactory = new WriterFactory(table, writePath) {
       @Override
       protected org.apache.carbondata.sdk.file.CarbonWriter newWriter(
           final Object[] row) {
         try {
-          return org.apache.carbondata.sdk.file.CarbonWriter.builder()
+          final CarbonWriterBuilder writerBuilder =
+              org.apache.carbondata.sdk.file.CarbonWriter.builder()
               .outputPath(super.getWritePath(row))
               .writtenBy("flink")
               .withSchemaFile(CarbonTablePath.getSchemaFilePath(table.getTablePath()))
               .withCsvInput()
-              .withHadoopConf(configuration)
-              .build();
+              .withHadoopConf(configuration);
+          for (String propertyName : carbonProperties.stringPropertyNames()) {
+            try {
+              writerBuilder.withLoadOption(propertyName,
+                  carbonProperties.getProperty(propertyName));
+            } catch (IllegalArgumentException ignore) {
+              // Ignore.
 
 Review comment:
   Suggest printing a log to warn the user.
