[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox

ShreelekhyaG opened a new pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896


    ### Why is this PR needed?
     1. Fix load failures due to daylight saving time changes.
     2. During load, date/timestamp values whose year has more than 4 digits should fail or be set to null, according to the bad records action property.
   
    ### What changes were proposed in this PR?
   Added a new property that, when enabled, retries parsing with `setLenient(true)` according to the timestamp format if strict parsing fails.
   Added validation for timestamp range values.
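
To illustrate why a lenient retry helps: a wall-clock time that falls inside a daylight saving gap does not exist, so a strict `SimpleDateFormat` rejects it, while a lenient one shifts it forward. This is a minimal sketch, assuming America/New_York on 2021-03-14 as the example gap (not the zone or data used in this PR). The class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class LenientDstSketch {

  // Builds a formatter pinned to a zone with a well-known DST gap
  // (America/New_York skipped 02:00-03:00 on 2021-03-14).
  static SimpleDateFormat newFormatter(boolean lenient) {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    format.setTimeZone(TimeZone.getTimeZone("America/New_York"));
    format.setLenient(lenient);
    return format;
  }

  // Returns true if the value parses under the given leniency setting.
  static boolean parses(String value, boolean lenient) {
    try {
      newFormatter(lenient).parse(value);
      return true;
    } catch (ParseException e) {
      return false;
    }
  }

  // Parses leniently and formats the result back in the same zone.
  static String lenientRoundTrip(String value) {
    SimpleDateFormat format = newFormatter(true);
    try {
      return format.format(format.parse(value));
    } catch (ParseException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    String inGap = "2021-03-14 02:30:00";
    System.out.println("strict parses:  " + parses(inGap, false));
    System.out.println("lenient parses: " + parses(inGap, true));
    System.out.println("lenient result: " + lenientRoundTrip(inGap));
  }
}
```

The lenient result is a different wall-clock time from the input, which is why logging the adjusted value, as the patch does, seems worthwhile.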
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox

VenuReddy2103 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472278473



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          throw new NumberFormatException(ex.getMessage());
+        }
+      } else {
+        throw new NumberFormatException(e.getMessage());
+      }
+    }
+  }
+
+  private static Long validateTimeStampRange(Long timeValue) {
+    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

Review comment:
       Instead of creating a `SimpleDateFormat` instance on every call, I suggest using the existing `DateDirectDictionaryGenerator.MIN_VALUE` and `DateDirectDictionaryGenerator.MAX_VALUE` to validate the range.
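
The idea behind this suggestion can be sketched as follows: compute the range bounds once and compare plain `long` values on each call, with no formatter involved. This is only an illustration. The class, the constants and the UTC bounds are assumptions, not the CarbonData constants named above, and `java.time` uses the proleptic Gregorian calendar, so the lower bound may differ by a few days from one parsed with `SimpleDateFormat`.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class TimestampRangeSketch {

  // Hypothetical bounds, computed once in UTC when the class loads.
  static final long MIN_MILLIS =
      LocalDateTime.of(1, 1, 1, 0, 0, 0).toInstant(ZoneOffset.UTC).toEpochMilli();
  static final long MAX_MILLIS =
      LocalDateTime.of(9999, 12, 31, 23, 59, 59).toInstant(ZoneOffset.UTC).toEpochMilli();

  // Returns the value unchanged when it is in range, otherwise throws.
  static long validateRange(long timeValue) {
    if (timeValue < MIN_MILLIS || timeValue > MAX_MILLIS) {
      throw new NumberFormatException("timestamp out of range: " + timeValue);
    }
    return timeValue;
  }

  public static void main(String[] args) {
    System.out.println(validateRange(0L));
    // A five-digit year falls outside the accepted range.
    long year10000 =
        LocalDateTime.of(10000, 1, 1, 0, 0, 0).toInstant(ZoneOffset.UTC).toEpochMilli();
    try {
      validateRange(year10000);
    } catch (NumberFormatException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```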




----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675546141


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2028/
   


----------------------------------------------------------------

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472285299



##########
File path: core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
##########
@@ -153,6 +154,12 @@ private boolean validateKeyValue(String key, String value) throws InvalidConfigu
       case ENABLE_UNSAFE_IN_QUERY_EXECUTION:
       case ENABLE_AUTO_LOAD_MERGE:
       case CARBON_PUSH_ROW_FILTERS_FOR_VECTOR:
+      case CARBON_LOAD_SETLENIENT_ENABLE:

Review comment:
       This can be a fall-through case. Lines 158-162 can be removed.
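
For reference, a fall-through case means stacking the new label on the existing group so that all boolean-valued keys share one validation body. This is a minimal sketch with hypothetical keys and a simplified check, not the actual `SessionParams` code.

```java
public class FallThroughSketch {

  // Hypothetical keys, used only to illustrate grouped case labels.
  static final String ENABLE_AUTO_LOAD_MERGE = "example.auto.load.merge";
  static final String LOAD_SETLENIENT_ENABLE = "example.load.setlenient.enable";

  // Returns true when the value is acceptable for the given key.
  static boolean validateKeyValue(String key, String value) {
    switch (key) {
      // Both keys take boolean values, so they fall through to one body.
      case ENABLE_AUTO_LOAD_MERGE:
      case LOAD_SETLENIENT_ENABLE:
        return "true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value);
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(validateKeyValue(LOAD_SETLENIENT_ENABLE, "true"));
    System.out.println(validateKeyValue(LOAD_SETLENIENT_ENABLE, "maybe"));
  }
}
```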




----------------------------------------------------------------

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472288683



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -306,6 +307,39 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
     }
   }
 
+  test("test load, update data with daylight saving time from different timezone") {
+    CarbonProperties.getInstance().addProperty(
+      CarbonCommonConstants.CARBON_LOAD_SETLENIENT_ENABLE, "true")
+    val defaultTimeZone = TimeZone.getDefault
+    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
+    sql("DROP TABLE IF EXISTS t3")
+    sql(
+      """
+           CREATE TABLE IF NOT EXISTS t3
+           (ID Int, date date, starttime Timestamp, country String,
+           name String, phonetype String, serialname String, salary Int)
+           STORED AS carbondata TBLPROPERTIES('dateformat'='yyyy/MM/dd',
+           'timestampformat'='yyyy-MM-dd HH:mm')
+        """)
+    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData3.csv' into table t3")
+    sql(s"insert into t3 select 11,'2015-7-23','1941-3-15 00:00:00','china','aaa1','phone197'," +
+        s"'ASD69643',15000")
+    sql("update t3 set (starttime) = ('1941-3-15 00:00:00') where name='aaa2'")
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 1"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 11"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 2"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    TimeZone.setDefault(defaultTimeZone)

Review comment:
       Remove `CARBON_LOAD_SETLENIENT_ENABLE` from the carbon properties at the end of the test case.
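
One way to make this cleanup robust is to restore the global state in a `finally` block, so a failing assertion does not leak the property or the time zone into later tests. The sketch below is in Java rather than the Scala of the test suite, and it uses a JVM system property as a stand-in for the carbon property store. The key and helper names are hypothetical.

```java
import java.util.TimeZone;
import java.util.function.Supplier;

public class RestoreStateSketch {

  // Hypothetical property key; a JVM system property stands in for the real store.
  static final String LENIENT_KEY = "example.load.setlenient.enable";

  // Runs the body with a temporary default time zone and property value,
  // then restores both even if the body throws.
  static <T> T withLenientLoad(String zoneId, Supplier<T> body) {
    TimeZone previousZone = TimeZone.getDefault();
    String previousValue = System.getProperty(LENIENT_KEY);
    TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
    System.setProperty(LENIENT_KEY, "true");
    try {
      return body.get();
    } finally {
      TimeZone.setDefault(previousZone);
      if (previousValue == null) {
        System.clearProperty(LENIENT_KEY);
      } else {
        System.setProperty(LENIENT_KEY, previousValue);
      }
    }
  }

  public static void main(String[] args) {
    String zoneInside = withLenientLoad("Asia/Shanghai", () -> TimeZone.getDefault().getID());
    System.out.println("inside:  " + zoneInside);
    System.out.println("outside: " + TimeZone.getDefault().getID());
  }
}
```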




----------------------------------------------------------------

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472294029



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          throw new NumberFormatException(ex.getMessage());
+        }
+      } else {
+        throw new NumberFormatException(e.getMessage());
+      }
+    }
+  }
+
+  private static Long validateTimeStampRange(Long timeValue) {
+    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

Review comment:
       Here, `DateDirectDictionaryGenerator.MIN_VALUE` is ("0001-01-01"), which is not equal to the timestamp minimum value ("0001-01-01 00:00:00"). As the formats differ, parsing them will give different long values.




----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675563624


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3770/
   


----------------------------------------------------------------

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472327182



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "

Review comment:
       `"Changing setLenient to true for TimeStamp: " + dimensionValue` is redundant. We have already logged it at line 452.




----------------------------------------------------------------

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472332314



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {

Review comment:
       `validateTimeStampRange()` throws `NumberFormatException`. You would want to call `dateFormatter.setLenient(false);` in that case too.
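
One possible way to cover every exit path is to reset the flag in a `finally` block instead of repeating the call in each branch. This is a minimal sketch. The helper name and the validator parameter are hypothetical and stand in for `validateTimeStampRange()`.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;
import java.util.function.LongUnaryOperator;

public class LenientResetSketch {

  // Parses leniently with a (possibly shared) formatter and always
  // switches it back to strict mode before returning or throwing.
  static long parseLeniently(DateFormat formatter, String value, LongUnaryOperator validator) {
    formatter.setLenient(true);
    try {
      return validator.applyAsLong(formatter.parse(value).getTime());
    } catch (ParseException e) {
      throw new NumberFormatException(e.getMessage());
    } finally {
      // Runs on success, on ParseException, and on a NumberFormatException
      // thrown by the validator, so the formatter never stays lenient.
      formatter.setLenient(false);
    }
  }

  public static void main(String[] args) {
    SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
    try {
      // The validator rejects everything, simulating an out-of-range value.
      parseLeniently(formatter, "2020-01-01 00:00:00", millis -> {
        throw new NumberFormatException("out of range: " + millis);
      });
    } catch (NumberFormatException e) {
      System.out.println("lenient after failure: " + formatter.isLenient());
    }
  }
}
```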




----------------------------------------------------------------

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472380864



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {

Review comment:
       ok added

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -306,6 +307,39 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
     }
   }
 
+  test("test load, update data with daylight saving time from different timezone") {
+    CarbonProperties.getInstance().addProperty(
+      CarbonCommonConstants.CARBON_LOAD_SETLENIENT_ENABLE, "true")
+    val defaultTimeZone = TimeZone.getDefault
+    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
+    sql("DROP TABLE IF EXISTS t3")
+    sql(
+      """
+           CREATE TABLE IF NOT EXISTS t3
+           (ID Int, date date, starttime Timestamp, country String,
+           name String, phonetype String, serialname String, salary Int)
+           STORED AS carbondata TBLPROPERTIES('dateformat'='yyyy/MM/dd',
+           'timestampformat'='yyyy-MM-dd HH:mm')
+        """)
+    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData3.csv' into table t3")
+    sql(s"insert into t3 select 11,'2015-7-23','1941-3-15 00:00:00','china','aaa1','phone197'," +
+        s"'ASD69643',15000")
+    sql("update t3 set (starttime) = ('1941-3-15 00:00:00') where name='aaa2'")
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 1"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 11"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 2"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    TimeZone.setDefault(defaultTimeZone)

Review comment:
       done




----------------------------------------------------------------

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472381408



##########
File path: core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
##########
@@ -153,6 +154,12 @@ private boolean validateKeyValue(String key, String value) throws InvalidConfigu
       case ENABLE_UNSAFE_IN_QUERY_EXECUTION:
       case ENABLE_AUTO_LOAD_MERGE:
       case CARBON_PUSH_ROW_FILTERS_FOR_VECTOR:
+      case CARBON_LOAD_SETLENIENT_ENABLE:

Review comment:
       done




----------------------------------------------------------------

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472381642



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "

Review comment:
       agree. removed.




----------------------------------------------------------------

[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472382210



##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          throw new NumberFormatException(ex.getMessage());
+        }
+      } else {
+        throw new NumberFormatException(e.getMessage());
+      }
+    }
+  }
+
+  private static Long validateTimeStampRange(Long timeValue) {
+    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

Review comment:
       Rechecked and made use of the existing values from `DateDirectDictionaryGenerator.MIN_VALUE` and `DateDirectDictionaryGenerator.MAX_VALUE`.
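
As a quick check of the earlier concern about formats: with `SimpleDateFormat`, a date-only pattern leaves the time fields at midnight, so "0001-01-01" and "0001-01-01 00:00:00" parse to the same long value when both formatters use the same time zone. The sketch below assumes that shared zone. Whether the CarbonData constants are parsed in the same zone as timestamps is not stated in this thread, and the helper name is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class MinValueFormatSketch {

  // Strictly parses the value with the given pattern in the given zone.
  static long parse(String pattern, String value, String zoneId) {
    SimpleDateFormat format = new SimpleDateFormat(pattern);
    format.setTimeZone(TimeZone.getTimeZone(zoneId));
    format.setLenient(false);
    try {
      return format.parse(value).getTime();
    } catch (ParseException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    long dateOnly = parse("yyyy-MM-dd", "0001-01-01", "UTC");
    long timestamp = parse("yyyy-MM-dd HH:mm:ss", "0001-01-01 00:00:00", "UTC");
    System.out.println("same zone, equal: " + (dateOnly == timestamp));

    // A different zone shifts the value, so the bounds and the parsed
    // timestamps need to agree on the zone for the comparison to be exact.
    long otherZone = parse("yyyy-MM-dd", "0001-01-01", "Asia/Shanghai");
    System.out.println("other zone, equal: " + (dateOnly == otherZone));
  }
}
```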




----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675687520


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2035/
   


----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675687794


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3777/
   


----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [CARBONDATA-3955] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675990742


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2044/
   


----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [CARBONDATA-3955] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-676013906


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3786/
   


----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [CARBONDATA-3955] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-676679976


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3800/
   


----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [CARBONDATA-3955] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#issuecomment-676683099


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2059/
   


----------------------------------------------------------------

[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [CARBONDATA-3955] Fix load failures due to daylight saving time changes

GitBox
In reply to this post by GitBox

VenuReddy2103 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r473665479



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
##########
@@ -816,10 +816,20 @@ object CarbonDataRDDFactory {
       val partitionByRdd = keyRDD.partitionBy(
         new SegmentPartitioner(segmentIdIndex, segmentUpdateParallelism))
 
+      val conf = SparkSQLUtil.broadCastHadoopConf(sqlContext.sparkSession.sparkContext, hadoopConf)
+      val carbonSessionInfo: CarbonSessionInfo = {

Review comment:
       Instead of this code block at lines 820-827, you could have just used
   `carbonSessionInfo = ThreadLocalSessionInfo.getCarbonSessionInfo`
   
   As this object contains a configuration object, I think it is better to broadcast it and use it.




----------------------------------------------------------------