[GitHub] [carbondata] niuge01 opened a new pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table

GitBox
niuge01 opened a new pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532
 
 
   Support writing Flink streaming data to a partitioned carbon table using the stage file format.
   
    - [ ] Any interfaces changed?
   Yes, adds a property [COMMIT_THRESHOLD] to the carbon writer.
   
    - [ ] Any backward compatibility impacted?
   No
   
    - [ ] Document update required?
   
    - [ ] Testing done
           Please provide details on
           - Whether new unit test cases have been added or why no new tests are required?
           - How it is tested? Please attach test report.
           - Is it a performance related change? Please attach the performance test report.
           - Any additional information to help reviewers in testing this change.
         
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   NA
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-568833034
 
 
   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1293/
   

----------------------------------------------------------------
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-568833126
 
 
   Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1282/
   

----------------------------------------------------------------
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-568833241
 
 
   Build Failed with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1272/
   

----------------------------------------------------------------
jackylk commented on a change in pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#discussion_r361770939
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/statusmanager/StageInput.java
 ##########
 @@ -39,6 +39,8 @@
    */
   private Map<String, Long> files;
 
+  private List<PartitionLocation> locations;
 
 Review comment:
   please add comment

----------------------------------------------------------------
jackylk commented on a change in pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#discussion_r361770968
 
 

 ##########
 File path: integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java
 ##########
 @@ -139,15 +152,16 @@ public void commit() throws IOException {
         );
       }
       dataPath = dataPath + this.table.getDatabaseName() + CarbonCommonConstants.FILE_SEPARATOR +
-          this.table.getTableName() + CarbonCommonConstants.FILE_SEPARATOR +
-          this.writePartition + CarbonCommonConstants.FILE_SEPARATOR;
-      Map<String, Long> fileList =
-          this.uploadSegmentDataFiles(this.writePath + "Fact/Part0/Segment_null/", dataPath);
+          this.table.getTableName() + CarbonCommonConstants.FILE_SEPARATOR;
+      StageInput stageInput = this.uploadSegmentDataFiles(this.writePath, dataPath);
+      if (stageInput == null) {
+        return;
+      }
       try {
         String stageInputPath = CarbonTablePath.getStageDir(
             table.getAbsoluteTableIdentifier().getTablePath()) +
-            CarbonCommonConstants.FILE_SEPARATOR + this.writePartition;
-        StageManager.writeStageInput(stageInputPath, new StageInput(dataPath, fileList));
+            CarbonCommonConstants.FILE_SEPARATOR + UUID.randomUUID();// TODO UUID
 
 Review comment:
   remove TODO
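
   For readers following the hunk above: the new code names each stage input file with a random UUID instead of the write partition. A minimal Java sketch of that naming step follows. The class name `StagePathSketch`, the helper `newStageInputPath`, the local `FILE_SEPARATOR` constant, and the example directory are all hypothetical stand-ins, not CarbonData APIs, and the reading that the UUID is there to give each commit a distinct file name is an assumption, not something the PR states.

```java
import java.util.UUID;

public class StagePathSketch {
  // Hypothetical stand-in for CarbonCommonConstants.FILE_SEPARATOR.
  static final String FILE_SEPARATOR = "/";

  // Hypothetical helper: appends a random UUID as the stage input file name,
  // mirroring the string concatenation shown in the diff.
  static String newStageInputPath(String stageDir) {
    return stageDir + FILE_SEPARATOR + UUID.randomUUID();
  }

  public static void main(String[] args) {
    // Illustrative directory only; the real value comes from CarbonTablePath.getStageDir(...).
    String stageDir = "/tmp/example_table/stage";
    String first = newStageInputPath(stageDir);
    String second = newStageInputPath(stageDir);
    System.out.println(first);
    System.out.println(second);
    // Two calls produce different names, so separate commits do not overwrite each other.
    System.out.println("same name: " + first.equals(second));
  }
}
```

   Under this assumption the name no longer depends on the partition, which fits the hunk's change from one partition per commit to a single StageInput covering several partition locations.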

----------------------------------------------------------------
jackylk commented on a change in pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#discussion_r361770999
 
 

 ##########
 File path: integration/flink/src/test/scala/org/apache/carbon/flink/TestSource.scala
 ##########
 @@ -1,25 +1,27 @@
 package org.apache.carbon.flink
 
+import java.util.Random
+
 import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
 import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
 import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
 import org.apache.flink.streaming.api.functions.source.SourceFunction
 
-abstract class TestSource(val dataCount: Int) extends SourceFunction[String] with CheckpointedFunction {
+abstract class TestSource(val dataCount: Int) extends SourceFunction[Array[AnyRef]] with CheckpointedFunction {
 
 Review comment:
   Please add more test cases to verify that the write output is correct.

----------------------------------------------------------------
niuge01 commented on a change in pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#discussion_r361780868
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/statusmanager/StageInput.java
 ##########
 @@ -39,6 +39,8 @@
    */
   private Map<String, Long> files;
 
+  private List<PartitionLocation> locations;
 
 Review comment:
   done

----------------------------------------------------------------
niuge01 commented on a change in pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#discussion_r361780902
 
 

 ##########
 File path: integration/flink/src/test/scala/org/apache/carbon/flink/TestSource.scala
 ##########
 @@ -1,25 +1,27 @@
 package org.apache.carbon.flink
 
+import java.util.Random
+
 import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
 import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
 import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
 import org.apache.flink.streaming.api.functions.source.SourceFunction
 
-abstract class TestSource(val dataCount: Int) extends SourceFunction[String] with CheckpointedFunction {
+abstract class TestSource(val dataCount: Int) extends SourceFunction[Array[AnyRef]] with CheckpointedFunction {
 
 Review comment:
   done

----------------------------------------------------------------
niuge01 commented on a change in pull request #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#discussion_r361780938
 
 

 ##########
 File path: integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java
 ##########
 @@ -139,15 +152,16 @@ public void commit() throws IOException {
         );
       }
       dataPath = dataPath + this.table.getDatabaseName() + CarbonCommonConstants.FILE_SEPARATOR +
-          this.table.getTableName() + CarbonCommonConstants.FILE_SEPARATOR +
-          this.writePartition + CarbonCommonConstants.FILE_SEPARATOR;
-      Map<String, Long> fileList =
-          this.uploadSegmentDataFiles(this.writePath + "Fact/Part0/Segment_null/", dataPath);
+          this.table.getTableName() + CarbonCommonConstants.FILE_SEPARATOR;
+      StageInput stageInput = this.uploadSegmentDataFiles(this.writePath, dataPath);
+      if (stageInput == null) {
+        return;
+      }
       try {
         String stageInputPath = CarbonTablePath.getStageDir(
             table.getAbsoluteTableIdentifier().getTablePath()) +
-            CarbonCommonConstants.FILE_SEPARATOR + this.writePartition;
-        StageManager.writeStageInput(stageInputPath, new StageInput(dataPath, fileList));
+            CarbonCommonConstants.FILE_SEPARATOR + UUID.randomUUID();// TODO UUID
 
 Review comment:
   done

----------------------------------------------------------------
niuge01 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569393927
 
 
   please test this

----------------------------------------------------------------
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569394437
 
 
   Build Failed with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1309/
   

----------------------------------------------------------------
niuge01 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569396166
 
 
   please test this

----------------------------------------------------------------
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569397235
 
 
   Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1311/
   

----------------------------------------------------------------
niuge01 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569398296
 
 
   please test this

----------------------------------------------------------------
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569399269
 
 
   Build Failed with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1314/
   

----------------------------------------------------------------
niuge01 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569400607
 
 
   please test this

----------------------------------------------------------------
niuge01 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569401658
 
 
   please test this

----------------------------------------------------------------
niuge01 removed a comment on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569400607
 
 
   please test this

----------------------------------------------------------------
CarbonDataQA1 commented on issue #3532: [CARBONDATA-3557] Write flink streaming data to partition table
URL: https://github.com/apache/carbondata/pull/3532#issuecomment-569402636
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1337/
   
