[GitHub] [carbondata] nihal0107 opened a new pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 opened a new pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox

nihal0107 opened a new pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936


    ### Why is this PR needed?
   Reading data from empty carbontable through hive beeline was giving 'Unable read Carbon Schema' exception, when no data is present in the carbon table.
   
    ### What changes were proposed in this PR?
   Return the empty data if no data is present.
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694441987


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4121/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694442422


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2380/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744293


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2382/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744886


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4123/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490834840



##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##########
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
     } else {
       carbonInputFormat = new CarbonFileInputFormat<>();
     }
-    List<org.apache.hadoop.mapreduce.InputSplit> splitList =
-        carbonInputFormat.getSplits(jobContext);
+    List<org.apache.hadoop.mapreduce.InputSplit> splitList;
+    try {
+      splitList = carbonInputFormat.getSplits(jobContext);
+    } catch (IOException ex) {
+      if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment:
       I see that hive supports only non-transactional table support. Non trascational tables are often created by the path provided by user.  so, when query if the path doesn't have carbon files. It is must to throw exception.
   
   If external table is created on wrong path. Now query gives 0 rows and user will not know what went wrong.

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
       columns = columns + "," + partitionColumns;
       columnTypes = columnTypes + ":" + partitionColumnTypes;
     }
-    String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+    String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
     CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
         HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-            sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+            sortColumns, validatedColumnsAndTypes[0],
+            validatedColumnsAndTypes[1], new ArrayList<>()));
     carbonTable.setTransactionalTable(false);
     return carbonTable;
   }
 
+  private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+    String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
       what are these changes for didn't see anything about this in descriptions. please add comments  

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws MetaException {
     List<String> tokens = new ArrayList();
     StringBuilder stack = new StringBuilder();
     int openingCount = 0;
-    for (int i = 0; i < schema.length(); i++) {
-      if (schema.charAt(i) == '<') {
+    int length = schema.length();

Review comment:
       same comment as above

##########
File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
##########
@@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet expected) throws SQLExcep
     }
     Collections.sort(expectedValuesList);Collections.sort(actualValuesList);
     Assert.assertArrayEquals(expectedValuesList.toArray(), actualValuesList.toArray());
-    Assert.assertTrue(rowCountExpected > 0);

Review comment:
       why removed this validation ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284



##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
       columns = columns + "," + partitionColumns;
       columnTypes = columnTypes + ":" + partitionColumnTypes;
     }
-    String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+    String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
     CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
         HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-            sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+            sortColumns, validatedColumnsAndTypes[0],
+            validatedColumnsAndTypes[1], new ArrayList<>()));
     carbonTable.setTransactionalTable(false);
     return carbonTable;
   }
 
+  private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+    String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
       In case of empty table some additional columns are getting added in the configuration. Here I have validated if any additional column and removed that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490853673



##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##########
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
     } else {
       carbonInputFormat = new CarbonFileInputFormat<>();
     }
-    List<org.apache.hadoop.mapreduce.InputSplit> splitList =
-        carbonInputFormat.getSplits(jobContext);
+    List<org.apache.hadoop.mapreduce.InputSplit> splitList;
+    try {
+      splitList = carbonInputFormat.getSplits(jobContext);
+    } catch (IOException ex) {
+      if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment:
       so test for external table and don't skip exception for external table




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694802437


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4124/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490869778



##########
File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
##########
@@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet expected) throws SQLExcep
     }
     Collections.sort(expectedValuesList);Collections.sort(actualValuesList);
     Assert.assertArrayEquals(expectedValuesList.toArray(), actualValuesList.toArray());
-    Assert.assertTrue(rowCountExpected > 0);

Review comment:
       As now we are checking for empty table also which may contain zero row, so I have removed this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694803147


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2383/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490870410



##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##########
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
     } else {
       carbonInputFormat = new CarbonFileInputFormat<>();
     }
-    List<org.apache.hadoop.mapreduce.InputSplit> splitList =
-        carbonInputFormat.getSplits(jobContext);
+    List<org.apache.hadoop.mapreduce.InputSplit> splitList;
+    try {
+      splitList = carbonInputFormat.getSplits(jobContext);
+    } catch (IOException ex) {
+      if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment:
       The external table is not getting queried from this getSplits() of this file.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490872439



##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws MetaException {
     List<String> tokens = new ArrayList();
     StringBuilder stack = new StringBuilder();
     int openingCount = 0;
-    for (int i = 0; i < schema.length(); i++) {
-      if (schema.charAt(i) == '<') {
+    int length = schema.length();

Review comment:
       Here I have handled the schema parsing to consider comma-separated values.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284



##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
       columns = columns + "," + partitionColumns;
       columnTypes = columnTypes + ":" + partitionColumnTypes;
     }
-    String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+    String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
     CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
         HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-            sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+            sortColumns, validatedColumnsAndTypes[0],
+            validatedColumnsAndTypes[1], new ArrayList<>()));
     carbonTable.setTransactionalTable(false);
     return carbonTable;
   }
 
+  private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+    String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
       In case of empty table some additional columns are getting added in the configuration. Here I have validated if any additional column and removed that. Also added the comments.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890026


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4127/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890983


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2386/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694895620


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

asfgit closed pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913664


   spark 3.4 one test failure is a random issue I presume as these changes are unrelated and build passed before adding comments. We need to track this and fix


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913880


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]