Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] [carbondata] nihal0107 opened a new pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

Classic

List

20 messages Options

Options

GitBox

[GitHub] [carbondata] nihal0107 opened a new pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

nihal0107 opened a new pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936

### Why is this PR needed?
Reading data from empty carbontable through hive beeline was giving 'Unable read Carbon Schema' exception, when no data is present in the carbon table.

### What changes were proposed in this PR?
Return the empty data if no data is present.

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694441987

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4121/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694442422

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2380/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744293

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2382/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744886

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4123/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490834840

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##########
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
} else {
carbonInputFormat = new CarbonFileInputFormat<>();
}
- List<org.apache.hadoop.mapreduce.InputSplit> splitList =
- carbonInputFormat.getSplits(jobContext);
+ List<org.apache.hadoop.mapreduce.InputSplit> splitList;
+ try {
+ splitList = carbonInputFormat.getSplits(jobContext);
+ } catch (IOException ex) {
+ if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment:
I see that hive supports only non-transactional table support. Non trascational tables are often created by the path provided by user. so, when query if the path doesn't have carbon files. It is must to throw exception.

If external table is created on wrong path. Now query gives 0 rows and user will not know what went wrong.

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
columns = columns + "," + partitionColumns;
columnTypes = columnTypes + ":" + partitionColumnTypes;
}
- String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+ String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
- sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+ sortColumns, validatedColumnsAndTypes[0],
+ validatedColumnsAndTypes[1], new ArrayList<>()));
carbonTable.setTransactionalTable(false);
return carbonTable;
}

+ private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+ String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
what are these changes for didn't see anything about this in descriptions. please add comments

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws MetaException {
List<String> tokens = new ArrayList();
StringBuilder stack = new StringBuilder();
int openingCount = 0;
- for (int i = 0; i < schema.length(); i++) {
- if (schema.charAt(i) == '<') {
+ int length = schema.length();

Review comment:
same comment as above

##########
File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
##########
@@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet expected) throws SQLExcep
}
Collections.sort(expectedValuesList);Collections.sort(actualValuesList);
Assert.assertArrayEquals(expectedValuesList.toArray(), actualValuesList.toArray());
- Assert.assertTrue(rowCountExpected > 0);

Review comment:
why removed this validation ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
columns = columns + "," + partitionColumns;
columnTypes = columnTypes + ":" + partitionColumnTypes;
}
- String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+ String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
- sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+ sortColumns, validatedColumnsAndTypes[0],
+ validatedColumnsAndTypes[1], new ArrayList<>()));
carbonTable.setTransactionalTable(false);
return carbonTable;
}

+ private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+ String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
In case of empty table some additional columns are getting added in the configuration. Here I have validated if any additional column and removed that.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490853673

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##########
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
} else {
carbonInputFormat = new CarbonFileInputFormat<>();
}
- List<org.apache.hadoop.mapreduce.InputSplit> splitList =
- carbonInputFormat.getSplits(jobContext);
+ List<org.apache.hadoop.mapreduce.InputSplit> splitList;
+ try {
+ splitList = carbonInputFormat.getSplits(jobContext);
+ } catch (IOException ex) {
+ if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment:
so test for external table and don't skip exception for external table

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694802437

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4124/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490869778

##########
File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
##########
@@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet expected) throws SQLExcep
}
Collections.sort(expectedValuesList);Collections.sort(actualValuesList);
Assert.assertArrayEquals(expectedValuesList.toArray(), actualValuesList.toArray());
- Assert.assertTrue(rowCountExpected > 0);

Review comment:
As now we are checking for empty table also which may contain zero row, so I have removed this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694803147

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2383/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490870410

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##########
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
} else {
carbonInputFormat = new CarbonFileInputFormat<>();
}
- List<org.apache.hadoop.mapreduce.InputSplit> splitList =
- carbonInputFormat.getSplits(jobContext);
+ List<org.apache.hadoop.mapreduce.InputSplit> splitList;
+ try {
+ splitList = carbonInputFormat.getSplits(jobContext);
+ } catch (IOException ex) {
+ if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment:
The external table is not getting queried from this getSplits() of this file.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490872439

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws MetaException {
List<String> tokens = new ArrayList();
StringBuilder stack = new StringBuilder();
int openingCount = 0;
- for (int i = 0; i < schema.length(); i++) {
- if (schema.charAt(i) == '<') {
+ int length = schema.length();

Review comment:
Here I have handled the schema parsing to consider comma-separated values.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284

##########
File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##########
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
columns = columns + "," + partitionColumns;
columnTypes = columnTypes + ":" + partitionColumnTypes;
}
- String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+ String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
- sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+ sortColumns, validatedColumnsAndTypes[0],
+ validatedColumnsAndTypes[1], new ArrayList<>()));
carbonTable.setTransactionalTable(false);
return carbonTable;
}

+ private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+ String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
In case of empty table some additional columns are getting added in the configuration. Here I have validated if any additional column and removed that. Also added the comments.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890026

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4127/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890983

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2386/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694895620

LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] asfgit closed pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

asfgit closed pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913664

spark 3.4 one test failure is a random issue I presume as these changes are unrelated and build passed before adding comments. We need to track this and fix

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]

GitBox

[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

In reply to this post by GitBox

ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913880

retest this please

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]