[GitHub] [carbondata] QiangCai opened a new pull request #3826: [CARBONDATA-3889] Cleanup code in carbondata-streaming module

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai opened a new pull request #3826: [CARBONDATA-3889] Cleanup code in carbondata-streaming module

GitBox

QiangCai opened a new pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826


    ### Why is this PR needed?
    need cleanup code in carbondata-streaming module
   
    ### What changes were proposed in this PR?
    Cleanup code in carbondata-streaming module
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3826: [CARBONDATA-3889] Cleanup code in carbondata-streaming module

GitBox

CarbonDataQA1 commented on pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#issuecomment-653794570


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1566/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3826: [CARBONDATA-3889] Cleanup code in carbondata-streaming module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#issuecomment-653794729


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3303/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r451469174



##########
File path: streaming/pom.xml
##########
@@ -95,7 +95,7 @@
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-surefire-plugin</artifactId>
         <version>2.18</version>
-        <!-- Note config is repeated in scalatest config -->
+        <!-- Note config is repeated in scala test config -->

Review comment:
       Please add license header in this POM file

##########
File path: streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
##########
@@ -155,38 +160,21 @@ public static String close(CarbonTable table, String segmentId)
             break;
           }
         }
-
-        int newSegmentId = SegmentStatusManager.createNewSegmentId(details);
-        LoadMetadataDetails newDetail = new LoadMetadataDetails();
-        newDetail.setLoadName(String.valueOf(newSegmentId));
-        newDetail.setFileFormat(FileFormat.ROW_V1);
-        newDetail.setLoadStartTime(System.currentTimeMillis());
-        newDetail.setSegmentStatus(SegmentStatus.STREAMING);
-
-        LoadMetadataDetails[] newDetails = new LoadMetadataDetails[details.length + 1];
-        int i = 0;
-        for (; i < details.length; i++) {
-          newDetails[i] = details[i];
-        }
-        newDetails[i] = newDetail;
-        SegmentStatusManager
-            .writeLoadDetailsIntoFile(CarbonTablePath.getTableStatusFilePath(
-                table.getTablePath()), newDetails);
-        return newDetail.getLoadName();
+        return createNewSegment(table, details);
       } else {
         LOGGER.error(
-            "Not able to acquire the lock for stream table status updation for table " + table
+            "Not able to acquire the lock for stream table status update for table " + table

Review comment:
       ```suggestion
               "Not able to acquire the status update lock for streaming table " + table
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kevinjmh commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

kevinjmh commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r451471456



##########
File path: streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
##########
@@ -400,12 +386,7 @@ public static void recoverFileIfRequired(
   public static CarbonFile[] listDataFiles(String segmentDir) {
     CarbonFile carbonDir = FileFactory.getCarbonFile(segmentDir);
     if (carbonDir.exists()) {
-      return carbonDir.listFiles(new CarbonFileFilter() {
-        @Override
-        public boolean accept(CarbonFile file) {
-          return CarbonTablePath.isCarbonDataFile(file.getName());
-        }
-      });
+      return carbonDir.listFiles(file -> CarbonTablePath.isCarbonDataFile(file.getName()));

Review comment:
       Are we target on jdk8 from now on? If yes, codes in `org.apache.carbondata.common` like `Maps`, `Strings` can be removed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r456252916



##########
File path: streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
##########
@@ -400,12 +386,7 @@ public static void recoverFileIfRequired(
   public static CarbonFile[] listDataFiles(String segmentDir) {
     CarbonFile carbonDir = FileFactory.getCarbonFile(segmentDir);
     if (carbonDir.exists()) {
-      return carbonDir.listFiles(new CarbonFileFilter() {
-        @Override
-        public boolean accept(CarbonFile file) {
-          return CarbonTablePath.isCarbonDataFile(file.getName());
-        }
-      });
+      return carbonDir.listFiles(file -> CarbonTablePath.isCarbonDataFile(file.getName()));

Review comment:
       For carbon 2.0, the answer is yes, in my opinion.
   
   we already use new featrues of jdk8(lambda, stream API)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r456252916



##########
File path: streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
##########
@@ -400,12 +386,7 @@ public static void recoverFileIfRequired(
   public static CarbonFile[] listDataFiles(String segmentDir) {
     CarbonFile carbonDir = FileFactory.getCarbonFile(segmentDir);
     if (carbonDir.exists()) {
-      return carbonDir.listFiles(new CarbonFileFilter() {
-        @Override
-        public boolean accept(CarbonFile file) {
-          return CarbonTablePath.isCarbonDataFile(file.getName());
-        }
-      });
+      return carbonDir.listFiles(file -> CarbonTablePath.isCarbonDataFile(file.getName()));

Review comment:
       For carbon 2.0, the answer is yes, in my opinion.
   
   we already used new features of jdk8(lambda, stream API)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r456257306



##########
File path: streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
##########
@@ -155,38 +160,21 @@ public static String close(CarbonTable table, String segmentId)
             break;
           }
         }
-
-        int newSegmentId = SegmentStatusManager.createNewSegmentId(details);
-        LoadMetadataDetails newDetail = new LoadMetadataDetails();
-        newDetail.setLoadName(String.valueOf(newSegmentId));
-        newDetail.setFileFormat(FileFormat.ROW_V1);
-        newDetail.setLoadStartTime(System.currentTimeMillis());
-        newDetail.setSegmentStatus(SegmentStatus.STREAMING);
-
-        LoadMetadataDetails[] newDetails = new LoadMetadataDetails[details.length + 1];
-        int i = 0;
-        for (; i < details.length; i++) {
-          newDetails[i] = details[i];
-        }
-        newDetails[i] = newDetail;
-        SegmentStatusManager
-            .writeLoadDetailsIntoFile(CarbonTablePath.getTableStatusFilePath(
-                table.getTablePath()), newDetails);
-        return newDetail.getLoadName();
+        return createNewSegment(table, details);
       } else {
         LOGGER.error(
-            "Not able to acquire the lock for stream table status updation for table " + table
+            "Not able to acquire the lock for stream table status update for table " + table

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r456257399



##########
File path: streaming/pom.xml
##########
@@ -95,7 +95,7 @@
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-surefire-plugin</artifactId>
         <version>2.18</version>
-        <!-- Note config is repeated in scalatest config -->
+        <!-- Note config is repeated in scala test config -->

Review comment:
       added




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kevinjmh commented on a change in pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

kevinjmh commented on a change in pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#discussion_r456266671



##########
File path: streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
##########
@@ -400,12 +386,7 @@ public static void recoverFileIfRequired(
   public static CarbonFile[] listDataFiles(String segmentDir) {
     CarbonFile carbonDir = FileFactory.getCarbonFile(segmentDir);
     if (carbonDir.exists()) {
-      return carbonDir.listFiles(new CarbonFileFilter() {
-        @Override
-        public boolean accept(CarbonFile file) {
-          return CarbonTablePath.isCarbonDataFile(file.getName());
-        }
-      });
+      return carbonDir.listFiles(file -> CarbonTablePath.isCarbonDataFile(file.getName()));

Review comment:
       OK.
   
   FYI: I check [spark release note](https://spark.apache.org/docs/2.2.0/), support for Java 7 was removed since 2.2
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#issuecomment-659992230


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1672/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#issuecomment-660008418


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3414/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kevinjmh commented on pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

kevinjmh commented on pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826#issuecomment-660632458


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3826: [CARBONDATA-3889] Cleanup code for carbondata-streaming module

GitBox
In reply to this post by GitBox

asfgit closed pull request #3826:
URL: https://github.com/apache/carbondata/pull/3826


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]