GitHub user akashrn5 opened a pull request:
https://github.com/apache/carbondata/pull/1524 [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option (1) Remove column level dateformat option (2) Support dateformat and timestampformat in load options(table level) Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [X] Document update required? 2 new load level properties are added. Document to be updated accordingly. - [X] Testing done UT Added - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/akashrn5/incubator-carbondata timeformat Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1524.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1524 ---- commit f7b253c672c4eed21466806cd4bc9990264a8a37 Author: akashrn5 <[hidden email]> Date: 2017-11-17T11:25:33Z [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option ---- --- |
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151673677 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java --- @@ -52,6 +52,14 @@ public static final String CARBON_OPTIONS_DATEFORMAT = "carbon.options.dateformat"; public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = ""; + + /** + * option to specify the load option + */ + @CarbonProperty + public static final String CARBON_OPTIONS_TIMESTAMPFORMAT = + "carbon.options.dateformat"; --- End diff -- I think `carbon.options.dateformat ` should be `carbon.options.timestampformat` --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151674183 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala --- @@ -17,35 +17,32 @@ package org.apache.carbondata.spark.load -import scala.collection.JavaConverters._ +import java.text.SimpleDateFormat -import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.metadata.schema.table.CarbonTable -import org.apache.carbondata.processing.loading.model.CarbonLoadModel import org.apache.carbondata.processing.loading.sort.SortScopeOptions import org.apache.carbondata.spark.exception.MalformedCarbonCommandException object ValidateUtil { - def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = { - val dimensions = table.getDimensionByTableName(tableName).asScala + + /** + * validates both timestamp and date for illegal values + * + * @param optionValue + * @param optionName + */ + def validateDateTimeFormat(optionValue: String, optionName: String): Unit = { --- End diff -- give proper names to parameters --- |
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151674289 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -799,18 +799,19 @@ object CarbonDataRDDFactory { throw new DataLoadingException("Partition column not found.") } - val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat) - val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase)) - val timeStampFormat = if (specificFormat.isDefined) { - new SimpleDateFormat(specificFormat.get) + val specificTimestampFormat = carbonLoadModel.getTimestampformat + val specificDateFormat = carbonLoadModel.getDateFormat + val timeStampFormat = if (specificTimestampFormat != null && + !specificTimestampFormat.trim.isEmpty) { --- End diff -- format it properly. --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1231/ --- |
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/1524 retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1233/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1236/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1241/ --- |
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/1524 Retest this please --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1252/ --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1261/ --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151831351 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala --- @@ -17,35 +17,32 @@ package org.apache.carbondata.spark.load -import scala.collection.JavaConverters._ +import java.text.SimpleDateFormat -import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.metadata.schema.table.CarbonTable -import org.apache.carbondata.processing.loading.model.CarbonLoadModel import org.apache.carbondata.processing.loading.sort.SortScopeOptions import org.apache.carbondata.spark.exception.MalformedCarbonCommandException object ValidateUtil { - def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = { - val dimensions = table.getDimensionByTableName(tableName).asScala + + /** + * validates both timestamp and date for illegal values + * + * @param dateTimeLoadFormat + * @param dateTimeLoadOption + */ + def validateDateTimeFormat(dateTimeLoadFormat: String, dateTimeLoadOption: String): Unit = { // allowing empty value to be configured for dateformat option. - if (dateFormat != null && dateFormat.trim != "") { - val dateFormats: Array[String] = dateFormat.split(CarbonCommonConstants.COMMA) - for (singleDateFormat <- dateFormats) { - val dateFormatSplits: Array[String] = singleDateFormat.split(":", 2) - val columnName = dateFormatSplits(0).trim.toLowerCase - if (!dimensions.exists(_.getColName.equals(columnName))) { - throw new MalformedCarbonCommandException("Error: Wrong Column Name " + - dateFormatSplits(0) + - " is provided in Option DateFormat.") - } - if (dateFormatSplits.length < 2 || dateFormatSplits(1).trim.isEmpty) { - throw new MalformedCarbonCommandException("Error: Option DateFormat is not provided " + - "for " + "Column " + dateFormatSplits(0) + - ".") - } - } + if (dateTimeLoadFormat != null && dateTimeLoadFormat.trim != "") { + try { + new SimpleDateFormat(dateTimeLoadFormat) } + catch { --- End diff -- move to previous line --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151831356 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala --- @@ -17,35 +17,32 @@ package org.apache.carbondata.spark.load -import scala.collection.JavaConverters._ +import java.text.SimpleDateFormat -import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.metadata.schema.table.CarbonTable -import org.apache.carbondata.processing.loading.model.CarbonLoadModel import org.apache.carbondata.processing.loading.sort.SortScopeOptions import org.apache.carbondata.spark.exception.MalformedCarbonCommandException object ValidateUtil { - def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = { - val dimensions = table.getDimensionByTableName(tableName).asScala + + /** + * validates both timestamp and date for illegal values + * + * @param dateTimeLoadFormat + * @param dateTimeLoadOption + */ + def validateDateTimeFormat(dateTimeLoadFormat: String, dateTimeLoadOption: String): Unit = { // allowing empty value to be configured for dateformat option. - if (dateFormat != null && dateFormat.trim != "") { - val dateFormats: Array[String] = dateFormat.split(CarbonCommonConstants.COMMA) - for (singleDateFormat <- dateFormats) { - val dateFormatSplits: Array[String] = singleDateFormat.split(":", 2) - val columnName = dateFormatSplits(0).trim.toLowerCase - if (!dimensions.exists(_.getColName.equals(columnName))) { - throw new MalformedCarbonCommandException("Error: Wrong Column Name " + - dateFormatSplits(0) + - " is provided in Option DateFormat.") - } - if (dateFormatSplits.length < 2 || dateFormatSplits(1).trim.isEmpty) { - throw new MalformedCarbonCommandException("Error: Option DateFormat is not provided " + - "for " + "Column " + dateFormatSplits(0) + - ".") - } - } + if (dateTimeLoadFormat != null && dateTimeLoadFormat.trim != "") { + try { + new SimpleDateFormat(dateTimeLoadFormat) } + catch { + case _: IllegalArgumentException => + throw new MalformedCarbonCommandException("Error: Wrong option: " + dateTimeLoadFormat + --- End diff -- use $ for variable --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151831369 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -799,18 +799,19 @@ object CarbonDataRDDFactory { throw new DataLoadingException("Partition column not found.") } - val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat) - val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase)) - val timeStampFormat = if (specificFormat.isDefined) { - new SimpleDateFormat(specificFormat.get) - } else { - val timestampFormatString = CarbonProperties.getInstance().getProperty(CarbonCommonConstants - .CARBON_TIMESTAMP_FORMAT, CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT) - new SimpleDateFormat(timestampFormatString) - } + val specificTimestampFormat = carbonLoadModel.getTimestampformat + val specificDateFormat = carbonLoadModel.getDateFormat + val timeStampFormat = + if (specificTimestampFormat != null && !specificTimestampFormat.trim.isEmpty) { + new SimpleDateFormat(specificTimestampFormat) + } else { + val timestampFormatString = CarbonProperties.getInstance().getProperty(CarbonCommonConstants --- End diff -- move `CarbonCommonConstants` to next line, give one parameter in one line --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151831621 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala --- @@ -75,55 +76,19 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA assert(false) } catch { case ex: MalformedCarbonCommandException => - assertResult(ex.getMessage)("Error: Option DateFormat is not provided for Column date.") + assertResult(ex.getMessage)("Error: Wrong option: date is provided for option DateFormat") case _: Throwable=> assert(false) } try { sql(s""" LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData1.csv' into table t3 - OPTIONS('dateformat' = 'fasfdas:yyyy/MM/dd') --- End diff -- why delete these testcase? `dateformat` is still a valid loading option, right? --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151831633 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala --- @@ -17,35 +17,32 @@ package org.apache.carbondata.spark.load -import scala.collection.JavaConverters._ +import java.text.SimpleDateFormat -import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.metadata.schema.table.CarbonTable -import org.apache.carbondata.processing.loading.model.CarbonLoadModel import org.apache.carbondata.processing.loading.sort.SortScopeOptions import org.apache.carbondata.spark.exception.MalformedCarbonCommandException object ValidateUtil { - def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = { - val dimensions = table.getDimensionByTableName(tableName).asScala + + /** + * validates both timestamp and date for illegal values --- End diff -- validate timestamp format and date format. please change --- |
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151831713 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java --- @@ -52,6 +52,14 @@ public static final String CARBON_OPTIONS_DATEFORMAT = "carbon.options.dateformat"; public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = ""; + + /** + * option to specify the load option + */ + @CarbonProperty + public static final String CARBON_OPTIONS_TIMESTAMPFORMAT = --- End diff -- please add comment for these two options to explain how it should be used: `CARBON_OPTIONS_DATEFORMAT` and `CARBON_OPTIONS_TIMESTAMPFORMAT` --- |
In reply to this post by qiuchenjian-2
Github user dhatchayani commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1524#discussion_r151832053 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala --- @@ -17,35 +17,32 @@ package org.apache.carbondata.spark.load -import scala.collection.JavaConverters._ +import java.text.SimpleDateFormat -import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.metadata.schema.table.CarbonTable -import org.apache.carbondata.processing.loading.model.CarbonLoadModel import org.apache.carbondata.processing.loading.sort.SortScopeOptions import org.apache.carbondata.spark.exception.MalformedCarbonCommandException object ValidateUtil { - def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = { - val dimensions = table.getDimensionByTableName(tableName).asScala + + /** + * validates both timestamp and date for illegal values --- End diff -- this validates both timestamp and date format --- |
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1524 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1266/ --- |
Free forum by Nabble | Edit this page |