
[GitHub] carbondata pull request #1032: [WIP] Fixed range info overlapping values iss...


[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/131/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
   
    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/carbondata-pr-spark-1.6/630/



---

[GitHub] carbondata pull request #1032: [CARBONDATA-1149] Fixed range info overlappin...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1032#discussion_r124024397
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ---
    @@ -288,6 +297,69 @@ object CommonUtil {
         result
       }
     
    +  def validateForOverLappingRangeValues(desType: Option[String],
    +      rangeInfoArray: Array[String]): Boolean = {
    --- End diff --
   
    Better to use the same comparator class as the range partitioner.


---

[GitHub] carbondata pull request #1032: [CARBONDATA-1149] Fixed range info overlappin...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1032#discussion_r124182621
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ---
    @@ -288,6 +297,69 @@ object CommonUtil {
         result
       }
     
    +  def validateForOverLappingRangeValues(desType: Option[String],
    +      rangeInfoArray: Array[String]): Boolean = {
    --- End diff --
   
    @QiangCai Please correct me if I am wrong, but Scala already has predefined methods for comparing array elements. Writing a Comparator class would mean writing our own logic, which would be extra overhead to maintain.
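
For illustration, a minimal sketch of the standard-library comparison discussed above. `sorted` and `sameElements` are existing Scala collection methods (the diff itself uses them); the `SortedCheck` object and the `isSorted` helper are hypothetical names and are not part of the pull request.

```scala
object SortedCheck {
  // True when the values are already in non-decreasing order.
  // Only the standard library is used: `sorted` picks up the implicit
  // Ordering[T], so no hand-written comparator class is needed.
  def isSorted[T: Ordering](values: Seq[T]): Boolean =
    values.sameElements(values.sorted)
}

// Hypothetical range boundaries, compared as strings and as integers.
val asStrings = Seq("9", "10")
val asInts = asStrings.map(_.toInt)
```

The same boundaries can pass or fail depending on the element type: `Seq("9", "10")` is out of order when compared as strings but in order once converted with `_.toInt`, which is presumably why the diff converts the values by data type before comparing them.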


---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    LGTM


---

[GitHub] carbondata pull request #1032: [CARBONDATA-1149] Fixed range info overlappin...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1032#discussion_r124548518
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ---
    @@ -288,6 +297,69 @@ object CommonUtil {
         result
       }
     
    +  def validateForOverLappingRangeValues(desType: Option[String],
    +      rangeInfoArray: Array[String]): Boolean = {
    +    val rangeInfoValuesValid = desType match {
    +      case Some("IntegerType") | Some("int") =>
    +        val intRangeInfoArray = rangeInfoArray.map(_.toInt)
    +        val sortedRangeInfoArray = intRangeInfoArray.sorted
    +        intRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("StringType") | Some("string") =>
    +        val sortedRangeInfoArray = rangeInfoArray.sorted
    +        rangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case a if (desType.get.startsWith("varchar") || desType.get.startsWith("char")) =>
    +        val sortedRangeInfoArray = rangeInfoArray.sorted
    +        rangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("LongType") | Some("long") | Some("bigint") =>
    +        val longRangeInfoArray = rangeInfoArray.map(_.toLong)
    +        val sortedRangeInfoArray = longRangeInfoArray.sorted
    +        longRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("FloatType") | Some("float") =>
    +        val floatRangeInfoArray = rangeInfoArray.map(_.toFloat)
    +        val sortedRangeInfoArray = floatRangeInfoArray.sorted
    +        floatRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("DoubleType") | Some("double") =>
    +        val doubleRangeInfoArray = rangeInfoArray.map(_.toDouble)
    +        val sortedRangeInfoArray = doubleRangeInfoArray.sorted
    +        doubleRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("ByteType") | Some("tinyint") =>
    +        val byteRangeInfoArray = rangeInfoArray.map(_.toByte)
    +        val sortedRangeInfoArray = byteRangeInfoArray.sorted
    +        byteRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("ShortType") | Some("smallint") =>
    +        val shortRangeInfoArray = rangeInfoArray.map(_.toShort)
    +        val sortedRangeInfoArray = shortRangeInfoArray.sorted
    +        shortRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("BooleanType") | Some("boolean") =>
    +        true
    +      case a if (desType.get.startsWith("DecimalType") || desType.get.startsWith("decimal")) =>
    +        val decimalRangeInfoArray = rangeInfoArray.map(value => BigDecimal(value))
    +        val sortedRangeInfoArray = decimalRangeInfoArray.sorted
    +        decimalRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("DateType") | Some("date") =>
    +        val dateRangeInfoArray = rangeInfoArray.map { value =>
    --- End diff --
   
    Dictionary generation can produce duplicate values, so a duplicate-value check is required.
    The same applies to the timestamp case.
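
A minimal sketch of one way to add such a duplicate check, assuming the boundaries have already been converted to a comparable type. The `StrictRangeCheck` object and `isStrictlyIncreasing` helper are hypothetical names, not code from the pull request. Unlike comparing the array with its sorted copy, which accepts repeated values, this requires every boundary to be strictly greater than the one before it.

```scala
object StrictRangeCheck {
  // True only when every value is strictly greater than its predecessor,
  // so both out-of-order and duplicate boundaries are rejected.
  def isStrictlyIncreasing[T](values: Seq[T])(implicit ord: Ordering[T]): Boolean =
    values.sliding(2).forall {
      case Seq(a, b) => ord.lt(a, b)
      case _         => true // zero or one element: nothing to compare
    }
}

// Hypothetical boundaries: the second list repeats a value.
val distinctBounds = Seq(10, 20, 30)
val repeatedBounds = Seq(10, 20, 20)
```

Here `repeatedBounds` would pass a sorted-equals-original comparison but fails the strict check, which is the situation the comment above describes for values that collapse to the same dictionary entry.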


---

[GitHub] carbondata pull request #1032: [CARBONDATA-1149] Fixed range info overlappin...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1032#discussion_r124549140
 
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ---
    @@ -288,6 +297,69 @@ object CommonUtil {
         result
       }
     
    +  def validateForOverLappingRangeValues(desType: Option[String],
    +      rangeInfoArray: Array[String]): Boolean = {
    +    val rangeInfoValuesValid = desType match {
    +      case Some("IntegerType") | Some("int") =>
    +        val intRangeInfoArray = rangeInfoArray.map(_.toInt)
    +        val sortedRangeInfoArray = intRangeInfoArray.sorted
    +        intRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("StringType") | Some("string") =>
    +        val sortedRangeInfoArray = rangeInfoArray.sorted
    +        rangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case a if (desType.get.startsWith("varchar") || desType.get.startsWith("char")) =>
    +        val sortedRangeInfoArray = rangeInfoArray.sorted
    +        rangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("LongType") | Some("long") | Some("bigint") =>
    +        val longRangeInfoArray = rangeInfoArray.map(_.toLong)
    +        val sortedRangeInfoArray = longRangeInfoArray.sorted
    +        longRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("FloatType") | Some("float") =>
    +        val floatRangeInfoArray = rangeInfoArray.map(_.toFloat)
    +        val sortedRangeInfoArray = floatRangeInfoArray.sorted
    +        floatRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("DoubleType") | Some("double") =>
    +        val doubleRangeInfoArray = rangeInfoArray.map(_.toDouble)
    +        val sortedRangeInfoArray = doubleRangeInfoArray.sorted
    +        doubleRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("ByteType") | Some("tinyint") =>
    +        val byteRangeInfoArray = rangeInfoArray.map(_.toByte)
    +        val sortedRangeInfoArray = byteRangeInfoArray.sorted
    +        byteRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("ShortType") | Some("smallint") =>
    +        val shortRangeInfoArray = rangeInfoArray.map(_.toShort)
    +        val sortedRangeInfoArray = shortRangeInfoArray.sorted
    +        shortRangeInfoArray.sameElements(sortedRangeInfoArray)
    +      case Some("BooleanType") | Some("boolean") =>
    +        true
    +      case a if (desType.get.startsWith("DecimalType") || desType.get.startsWith("decimal")) =>
    +        val decimalRangeInfoArray = rangeInfoArray.map(value => BigDecimal(value))
    --- End diff --
   
    BigDecimal precision and scale need to be considered; otherwise two ranges can overlap after the values are converted during data load.
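
A minimal sketch of the concern above, assuming the column's scale is known at validation time and that values are rounded with `HALF_UP` during load (the rounding mode is an assumption for illustration, not something the thread states). `DecimalRangeCheck` and `isValidAfterScaling` are hypothetical names. `BigDecimal.setScale` and `RoundingMode` are from the Scala standard library.

```scala
import scala.math.BigDecimal.RoundingMode

object DecimalRangeCheck {
  // Round each boundary to the column's scale first, then require strictly
  // increasing values, so boundaries that collapse to the same value after
  // conversion are rejected. HALF_UP is an assumed rounding mode.
  def isValidAfterScaling(rangeInfo: Seq[String], scale: Int): Boolean = {
    val scaled = rangeInfo.map(v => BigDecimal(v).setScale(scale, RoundingMode.HALF_UP))
    scaled.zip(scaled.drop(1)).forall { case (a, b) => a < b }
  }
}

// Hypothetical boundaries that differ only in the third decimal place.
val bounds = Seq("1.231", "1.234")
```

With a scale of 3 the two boundaries stay distinct, but with a scale of 2 both become 1.23, so the ranges would overlap even though the raw strings are in order.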


---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    Can one of the admins verify this patch?


---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/45/



---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/60/



---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    SDV Build Failed, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1234/



---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/603/



---

[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/1032
 
    Not required, as the partition feature has been re-implemented.


---

[GitHub] carbondata pull request #1032: [CARBONDATA-1149] Fixed range info overlappin...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user manishgupta88 closed the pull request at:

    https://github.com/apache/carbondata/pull/1032


---