[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
GitHub user akashrn5 opened a pull request:

    https://github.com/apache/carbondata/pull/1524

    [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option

    (1) Remove column level dateformat option
    (2) Support dateformat and timestampformat in load options(table level)
   
    Be sure to do all of the following checklist to help us incorporate
    your contribution quickly and easily:
   
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [X] Document update required?
             2 new load level properties are added. Document to be updated accordingly.
     - [X] Testing done
            UT Added
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akashrn5/incubator-carbondata timeformat

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1524
   
----
commit f7b253c672c4eed21466806cd4bc9990264a8a37
Author: akashrn5 <[hidden email]>
Date:   2017-11-17T11:25:33Z

    [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151673677
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java ---
    @@ -52,6 +52,14 @@
       public static final String CARBON_OPTIONS_DATEFORMAT =
           "carbon.options.dateformat";
       public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = "";
    +
    +  /**
    +   * option to specify the load option
    +   */
    +  @CarbonProperty
    +  public static final String CARBON_OPTIONS_TIMESTAMPFORMAT =
    +          "carbon.options.dateformat";
    --- End diff --
   
    I think `carbon.options.dateformat ` should be `carbon.options.timestampformat`


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151674183
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    +   *
    +   * @param optionValue
    +   * @param optionName
    +   */
    +  def validateDateTimeFormat(optionValue: String, optionName: String): Unit = {
    --- End diff --
   
    give proper names to parameters


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151674289
 
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -799,18 +799,19 @@ object CarbonDataRDDFactory {
           throw new DataLoadingException("Partition column not found.")
         }
     
    -    val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat)
    -    val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase))
    -    val timeStampFormat = if (specificFormat.isDefined) {
    -      new SimpleDateFormat(specificFormat.get)
    +    val specificTimestampFormat = carbonLoadModel.getTimestampformat
    +    val specificDateFormat = carbonLoadModel.getDateFormat
    +    val timeStampFormat = if (specificTimestampFormat != null &&
    +                              !specificTimestampFormat.trim.isEmpty) {
    --- End diff --
   
    format it properly.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1231/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1233/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1236/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1241/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1252/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1261/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831351
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    +   *
    +   * @param dateTimeLoadFormat
    +   * @param dateTimeLoadOption
    +   */
    +  def validateDateTimeFormat(dateTimeLoadFormat: String, dateTimeLoadOption: String): Unit = {
         // allowing empty value to be configured for dateformat option.
    -    if (dateFormat != null && dateFormat.trim != "") {
    -        val dateFormats: Array[String] = dateFormat.split(CarbonCommonConstants.COMMA)
    -        for (singleDateFormat <- dateFormats) {
    -          val dateFormatSplits: Array[String] = singleDateFormat.split(":", 2)
    -          val columnName = dateFormatSplits(0).trim.toLowerCase
    -          if (!dimensions.exists(_.getColName.equals(columnName))) {
    -            throw new MalformedCarbonCommandException("Error: Wrong Column Name " +
    -              dateFormatSplits(0) +
    -              " is provided in Option DateFormat.")
    -          }
    -          if (dateFormatSplits.length < 2 || dateFormatSplits(1).trim.isEmpty) {
    -            throw new MalformedCarbonCommandException("Error: Option DateFormat is not provided " +
    -              "for " + "Column " + dateFormatSplits(0) +
    -              ".")
    -          }
    -        }
    +    if (dateTimeLoadFormat != null && dateTimeLoadFormat.trim != "") {
    +      try {
    +        new SimpleDateFormat(dateTimeLoadFormat)
           }
    +      catch {
    --- End diff --
   
    move to previous line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831356
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    +   *
    +   * @param dateTimeLoadFormat
    +   * @param dateTimeLoadOption
    +   */
    +  def validateDateTimeFormat(dateTimeLoadFormat: String, dateTimeLoadOption: String): Unit = {
         // allowing empty value to be configured for dateformat option.
    -    if (dateFormat != null && dateFormat.trim != "") {
    -        val dateFormats: Array[String] = dateFormat.split(CarbonCommonConstants.COMMA)
    -        for (singleDateFormat <- dateFormats) {
    -          val dateFormatSplits: Array[String] = singleDateFormat.split(":", 2)
    -          val columnName = dateFormatSplits(0).trim.toLowerCase
    -          if (!dimensions.exists(_.getColName.equals(columnName))) {
    -            throw new MalformedCarbonCommandException("Error: Wrong Column Name " +
    -              dateFormatSplits(0) +
    -              " is provided in Option DateFormat.")
    -          }
    -          if (dateFormatSplits.length < 2 || dateFormatSplits(1).trim.isEmpty) {
    -            throw new MalformedCarbonCommandException("Error: Option DateFormat is not provided " +
    -              "for " + "Column " + dateFormatSplits(0) +
    -              ".")
    -          }
    -        }
    +    if (dateTimeLoadFormat != null && dateTimeLoadFormat.trim != "") {
    +      try {
    +        new SimpleDateFormat(dateTimeLoadFormat)
           }
    +      catch {
    +        case _: IllegalArgumentException =>
    +          throw new MalformedCarbonCommandException("Error: Wrong option: " + dateTimeLoadFormat +
    --- End diff --
   
    use $ for variable


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831369
 
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -799,18 +799,19 @@ object CarbonDataRDDFactory {
           throw new DataLoadingException("Partition column not found.")
         }
     
    -    val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat)
    -    val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase))
    -    val timeStampFormat = if (specificFormat.isDefined) {
    -      new SimpleDateFormat(specificFormat.get)
    -    } else {
    -      val timestampFormatString = CarbonProperties.getInstance().getProperty(CarbonCommonConstants
    -        .CARBON_TIMESTAMP_FORMAT, CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
    -      new SimpleDateFormat(timestampFormatString)
    -    }
    +    val specificTimestampFormat = carbonLoadModel.getTimestampformat
    +    val specificDateFormat = carbonLoadModel.getDateFormat
    +    val timeStampFormat =
    +      if (specificTimestampFormat != null && !specificTimestampFormat.trim.isEmpty) {
    +        new SimpleDateFormat(specificTimestampFormat)
    +      } else {
    +        val timestampFormatString = CarbonProperties.getInstance().getProperty(CarbonCommonConstants
    --- End diff --
   
    move `CarbonCommonConstants` to next line, give one parameter in one line


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831621
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ---
    @@ -75,55 +76,19 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
           assert(false)
         } catch {
           case ex: MalformedCarbonCommandException =>
    -        assertResult(ex.getMessage)("Error: Option DateFormat is not provided for Column date.")
    +        assertResult(ex.getMessage)("Error: Wrong option: date is provided for option DateFormat")
           case _: Throwable=> assert(false)
         }
     
         try {
           sql(s"""
                LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData1.csv' into table t3
    -           OPTIONS('dateformat' = 'fasfdas:yyyy/MM/dd')
    --- End diff --
   
    why delete these testcase? `dateformat` is still a valid loading option, right?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831633
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    --- End diff --
   
    validate timestamp format and date format. please change


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831713
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java ---
    @@ -52,6 +52,14 @@
       public static final String CARBON_OPTIONS_DATEFORMAT =
           "carbon.options.dateformat";
       public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = "";
    +
    +  /**
    +   * option to specify the load option
    +   */
    +  @CarbonProperty
    +  public static final String CARBON_OPTIONS_TIMESTAMPFORMAT =
    --- End diff --
   
    please add comment for these two options to explain how it should be used: `CARBON_OPTIONS_DATEFORMAT` and `CARBON_OPTIONS_TIMESTAMPFORMAT`


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151832053
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    --- End diff --
   
    this validates both timestamp and date format


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1266/



---
12