[GitHub] [carbondata] QiangCai opened a new pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai opened a new pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox

QiangCai opened a new pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827


    ### Why is this PR needed?
    need cleanup code for carbondata-hadoop module
   
    ### What changes were proposed in this PR?
   Cleanup code for carbondata-hadoop module
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox

CarbonDataQA1 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-654746270


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3314/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-654748600


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1576/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

Indhumathi27 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-655461877


   In hadoop/src/main/java/org/apache/carbondata/hadoop/InputMetricsStats.java, remove unused method, `updateByValue`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kevinjmh commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

kevinjmh commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r451475710



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
##########
@@ -67,7 +65,7 @@
   public CarbonTable getOrCreateCarbonTable(Configuration configuration) throws IOException {
     CarbonTable carbonTableTemp;
     if (carbonTable == null) {
-      // carbon table should be created either from deserialized table info (schema saved in
+      // carbon table should be created either from deserialize table info (schema saved in

Review comment:
       google translate said it is correct ...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kevinjmh commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

kevinjmh commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r451482549



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamBlockletReader.java
##########
@@ -175,7 +175,7 @@ public int readIntFromStream() throws IOException {
     int ch4 = in.read();
     if ((ch1 | ch2 | ch3 | ch4) < 0) throw new EOFException();
     pos += 4;
-    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
+    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4));

Review comment:
       ```suggestion
       return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + ch4);
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

Indhumathi27 commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r451494782



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
##########
@@ -65,7 +65,7 @@
  * creates new segment folder and manages the folder through tablestatus file.
  * It also generate and writes dictionary data during load only if dictionary server is configured.
  */
-// TODO Move dictionary generater which is coded in spark to MR framework.
+// TODO Move dictionary generator which is coded in spark to MR framework.
 public class CarbonTableOutputFormat extends FileOutputFormat<NullWritable, ObjectArrayWritable> {
 
   protected static final String LOAD_MODEL = "mapreduce.carbontable.load.model";

Review comment:
       1. Rename UPADTE_TIMESTAMP variable
   2. Remove unused OPERATION_CONTEXT

##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonInputFormatUtil.java
##########
@@ -48,7 +48,7 @@
 
   public static <V> CarbonFileInputFormat<V> createCarbonFileInputFormat(
       AbsoluteTableIdentifier identifier, Job job) throws IOException {
-    CarbonFileInputFormat<V> carbonInputFormat = new CarbonFileInputFormat<V>();
+    CarbonFileInputFormat<V> carbonInputFormat = new CarbonFileInputFormat<>();

Review comment:
       Looks like this method is Unused and LOGGER also

##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##########
@@ -217,16 +217,6 @@
     return splits;
   }
 
-  /**

Review comment:
       Update Line No:105

##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
##########
@@ -259,14 +259,7 @@ public static void setColumnProjection(Configuration configuration, CarbonProjec
     if (projection == null || projection.isEmpty()) {
       return;
     }
-    String[] allColumns = projection.getAllColumns();
-    StringBuilder builder = new StringBuilder();
-    for (String column : allColumns) {
-      builder.append(column).append(",");
-    }
-    String columnString = builder.toString();
-    columnString = columnString.substring(0, columnString.length() - 1);
-    configuration.set(COLUMN_PROJECTION, columnString);
+    setColumnProjection(configuration, projection.getAllColumns());

Review comment:
       1. Please check line no 190 and fix
   2. Remove unused method `setFgIndexPruning`
   3. Looks like isFgIndexPruningEnable will always be true. Can check and remove
   4. Line No. 394, can update the method defintion




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456260367



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
##########
@@ -67,7 +65,7 @@
   public CarbonTable getOrCreateCarbonTable(Configuration configuration) throws IOException {
     CarbonTable carbonTableTemp;
     if (carbonTable == null) {
-      // carbon table should be created either from deserialized table info (schema saved in
+      // carbon table should be created either from deserialize table info (schema saved in

Review comment:
       Reverted




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456269091



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
##########
@@ -259,14 +259,7 @@ public static void setColumnProjection(Configuration configuration, CarbonProjec
     if (projection == null || projection.isEmpty()) {
       return;
     }
-    String[] allColumns = projection.getAllColumns();
-    StringBuilder builder = new StringBuilder();
-    for (String column : allColumns) {
-      builder.append(column).append(",");
-    }
-    String columnString = builder.toString();
-    columnString = columnString.substring(0, columnString.length() - 1);
-    configuration.set(COLUMN_PROJECTION, columnString);
+    setColumnProjection(configuration, projection.getAllColumns());

Review comment:
       1.  Reverted the change (deserialized => deserialize)
   2.  Removed
   3.  If the configuration exists and not equal to 'true', it will be false
   4.  Updated




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456271059



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##########
@@ -217,16 +217,6 @@
     return splits;
   }
 
-  /**

Review comment:
       updated




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456271731



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
##########
@@ -65,7 +65,7 @@
  * creates new segment folder and manages the folder through tablestatus file.
  * It also generate and writes dictionary data during load only if dictionary server is configured.
  */
-// TODO Move dictionary generater which is coded in spark to MR framework.
+// TODO Move dictionary generator which is coded in spark to MR framework.
 public class CarbonTableOutputFormat extends FileOutputFormat<NullWritable, ObjectArrayWritable> {
 
   protected static final String LOAD_MODEL = "mapreduce.carbontable.load.model";

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456272176



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamBlockletReader.java
##########
@@ -175,7 +175,7 @@ public int readIntFromStream() throws IOException {
     int ch4 = in.read();
     if ((ch1 | ch2 | ch3 | ch4) < 0) throw new EOFException();
     pos += 4;
-    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
+    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4));

Review comment:
       accept




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456272851



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonInputFormatUtil.java
##########
@@ -48,7 +48,7 @@
 
   public static <V> CarbonFileInputFormat<V> createCarbonFileInputFormat(
       AbsoluteTableIdentifier identifier, Job job) throws IOException {
-    CarbonFileInputFormat<V> carbonInputFormat = new CarbonFileInputFormat<V>();
+    CarbonFileInputFormat<V> carbonInputFormat = new CarbonFileInputFormat<>();

Review comment:
       remove Unused variable and method

##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonInputFormatUtil.java
##########
@@ -48,7 +48,7 @@
 
   public static <V> CarbonFileInputFormat<V> createCarbonFileInputFormat(
       AbsoluteTableIdentifier identifier, Job job) throws IOException {
-    CarbonFileInputFormat<V> carbonInputFormat = new CarbonFileInputFormat<V>();
+    CarbonFileInputFormat<V> carbonInputFormat = new CarbonFileInputFormat<>();

Review comment:
       removed Unused variable and method




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456274947



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
##########
@@ -67,7 +65,7 @@
   public CarbonTable getOrCreateCarbonTable(Configuration configuration) throws IOException {
     CarbonTable carbonTableTemp;
     if (carbonTable == null) {
-      // carbon table should be created either from deserialized table info (schema saved in
+      // carbon table should be created either from deserialize table info (schema saved in

Review comment:
       Reverted




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#discussion_r456275230



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamBlockletReader.java
##########
@@ -175,7 +175,7 @@ public int readIntFromStream() throws IOException {
     int ch4 = in.read();
     if ((ch1 | ch2 | ch3 | ch4) < 0) throw new EOFException();
     pos += 4;
-    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
+    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4));

Review comment:
       accept




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-659938594


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3417/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-659940523


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1675/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-660793729


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3427/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-660794567


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1685/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kevinjmh commented on pull request #3827: [CARBONDATA-3889] Cleanup code for carbondata-hadoop module

GitBox
In reply to this post by GitBox

kevinjmh commented on pull request #3827:
URL: https://github.com/apache/carbondata/pull/3827#issuecomment-660979881


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


12