[GitHub] [carbondata] Karan980 opened a new pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

classic Classic list List threaded Threaded
56 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Karan980 opened a new pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox

Karan980 opened a new pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970


    ### Why is this PR needed?
   Fix multiple issues occurred in SDK_IUD.
   a) TupleId always have linux file separator independent of the system.
   b) Filtered rows array size gives ArrayOutOfBound exception if number of deleted rows is greater than 4096.
   c) While writing the data of date column type during update operation gives bad record exception as dates are converted to Integer during read.
   
    ### What changes were proposed in this PR?
   a) Changed the tupleId file separator to linux file separator.
   b) Change the filtered rows size to default column page rows size.
   c) Converted the integer back to date type before writing the data during update operation.
   
     ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-704794039


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4319/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-704794823


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2569/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r503759359



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
##########
@@ -279,7 +279,8 @@ public void setAllColumnProjectionIfNotConfigured(JobContext job, CarbonTable ca
   private String[] getDeleteDeltaFiles(String segmentFilePath, List<String> allDeleteDeltaFiles) {
     List<String> deleteDeltaFiles = new ArrayList<>();
     String segmentFileName = null;
-    String[] pathElements = segmentFilePath.split(Pattern.quote(File.separator));
+    segmentFilePath = segmentFilePath.replace("\\", "/");

Review comment:
       Use WINDOWS_FILE_SEPARATOR and FILE_SEPARATOR at place of string




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r503762013



##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##########
@@ -155,7 +158,15 @@ public void update(String path, Expression filterExpression,
     Schema schema = CarbonSchemaReader.readSchema(indexFiles.get(0)).asOriginOrder();
     Field[] fields = schema.getFields();
     String[] projectionColumns = new String[fields.length + 1];
+    List<Integer> dateIndexes = new ArrayList<>();
+    List<Integer> timeStampIndexes = new ArrayList<>();
     for (int i = 0; i < fields.length; i++) {
+      if (fields[i].getDataType() == DataTypes.DATE) {
+        dateIndexes.add(i);
+      }
+      if (fields[i].getDataType() == DataTypes.TIMESTAMP) {

Review comment:
       ```suggestion
         else if (fields[i].getDataType() == DataTypes.TIMESTAMP) {
   ```
   and format the line 167 to 166




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] nihal0107 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

nihal0107 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r503767871



##########
File path: sdk/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonIUDTest.java
##########
@@ -72,6 +72,43 @@ public void testDelete() throws Exception {
     FileUtils.deleteDirectory(new File(path));
   }
 
+  @Test
+  public void testUpdateOnDateType() throws Exception {

Review comment:
       I think this testcase is for update on date type but here you have updated on int type. Please check once.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] kunal642 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

kunal642 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r504400394



##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##########
@@ -173,10 +184,16 @@ public void update(String path, Expression filterExpression,
     RecordWriter<NullWritable, ObjectArrayWritable> deleteDeltaWriter =
         CarbonTableOutputFormat.getDeleteDeltaRecordWriter(path);
     ObjectArrayWritable writable = new ObjectArrayWritable();
-
+    long day = 24L * 3600 * 1000;
     while (reader.hasNext()) {
       Object[] row = (Object[]) reader.readNextRow();
       writable.set(Arrays.copyOfRange(row, row.length - 1, row.length));
+      for (Integer dateIndex : dateIndexes) {
+        row[dateIndex] = new Date((day * ((int) row[dateIndex])));

Review comment:
       Date and Timestamp conversion should be in SDK reader. Please fix the reader




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708434559


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2688/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708436627


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4441/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Karan980 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

Karan980 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r504827655



##########
File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
##########
@@ -279,7 +279,8 @@ public void setAllColumnProjectionIfNotConfigured(JobContext job, CarbonTable ca
   private String[] getDeleteDeltaFiles(String segmentFilePath, List<String> allDeleteDeltaFiles) {
     List<String> deleteDeltaFiles = new ArrayList<>();
     String segmentFileName = null;
-    String[] pathElements = segmentFilePath.split(Pattern.quote(File.separator));
+    segmentFilePath = segmentFilePath.replace("\\", "/");

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Karan980 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

Karan980 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r504827820



##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##########
@@ -155,7 +158,15 @@ public void update(String path, Expression filterExpression,
     Schema schema = CarbonSchemaReader.readSchema(indexFiles.get(0)).asOriginOrder();
     Field[] fields = schema.getFields();
     String[] projectionColumns = new String[fields.length + 1];
+    List<Integer> dateIndexes = new ArrayList<>();
+    List<Integer> timeStampIndexes = new ArrayList<>();
     for (int i = 0; i < fields.length; i++) {
+      if (fields[i].getDataType() == DataTypes.DATE) {
+        dateIndexes.add(i);
+      }
+      if (fields[i].getDataType() == DataTypes.TIMESTAMP) {

Review comment:
       Code removed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Karan980 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

Karan980 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r504829543



##########
File path: sdk/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonIUDTest.java
##########
@@ -72,6 +72,43 @@ public void testDelete() throws Exception {
     FileUtils.deleteDirectory(new File(path));
   }
 
+  @Test
+  public void testUpdateOnDateType() throws Exception {

Review comment:
       Testcase was added to check conversion of dateType columns to integer while reading. As date column is present in schema. so no need to perform update on date column.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Karan980 commented on a change in pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

GitBox
In reply to this post by GitBox

Karan980 commented on a change in pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#discussion_r504829616



##########
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
##########
@@ -173,10 +184,16 @@ public void update(String path, Expression filterExpression,
     RecordWriter<NullWritable, ObjectArrayWritable> deleteDeltaWriter =
         CarbonTableOutputFormat.getDeleteDeltaRecordWriter(path);
     ObjectArrayWritable writable = new ObjectArrayWritable();
-
+    long day = 24L * 3600 * 1000;
     while (reader.hasNext()) {
       Object[] row = (Object[]) reader.readNextRow();
       writable.set(Arrays.copyOfRange(row, row.length - 1, row.length));
+      for (Integer dateIndex : dateIndexes) {
+        row[dateIndex] = new Date((day * ((int) row[dateIndex])));

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708601810


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2689/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708605209


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4442/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

Karan980 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708624107


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708675919


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4446/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708676144


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2693/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708958736


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4454/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-708968372


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2700/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


123