[GitHub] carbondata pull request #1565: [CARBONDATA-1518]Support creating timeseries ...

[GitHub] carbondata issue #1565: [CARBONDATA-1518][Pre-Aggregate]Support creating tim...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1565
 
    Build succeeded with Spark 2.1.0. Please check CI: http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1793/



---

[GitHub] carbondata issue #1565: [CARBONDATA-1518][Pre-Aggregate]Support creating tim...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1565
 
    LGTM


---

[GitHub] carbondata pull request #1565: [CARBONDATA-1518][Pre-Aggregate]Support creat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1565


---

[GitHub] carbondata pull request #1565: [CARBONDATA-1518][Pre-Aggregate]Support creat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1565#discussion_r163140389
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/timeseries/TimeseriesUtil.scala ---
    @@ -0,0 +1,159 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.execution.command.timeseries
    +
    +import java.sql.Timestamp
    +
    +import org.apache.spark.sql.execution.command.{DataMapField, Field}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.metadata.datatype.DataTypes
    +import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    +import org.apache.carbondata.core.preagg.TimeSeriesUDF
    +import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
    +
    +/**
    + * Utility class for time series to keep
    + */
    +object TimeSeriesUtil {
    +
    +  /**
    +   * Below method will be used to validate whether column mentioned in time series
    +   * is timestamp column or not
    +   *
    +   * @param dmproperties
    +   * data map properties
    +   * @param parentTable
    +   * parent table
    +   * @return whether time stamp column
    +   */
    +  def validateTimeSeriesEventTime(dmproperties: Map[String, String],
    +      parentTable: CarbonTable) {
    +    val eventTime = dmproperties.get(CarbonCommonConstants.TIMESERIES_EVENTTIME)
    +    if (!eventTime.isDefined) {
    +      throw new MalformedCarbonCommandException("Eventtime not defined in time series")
    +    } else {
    +      val carbonColumn = parentTable.getColumnByName(parentTable.getTableName, eventTime.get)
    +      if (carbonColumn.getDataType != DataTypes.TIMESTAMP) {
    +        throw new MalformedCarbonCommandException(
    +          "Timeseries event time is only supported on Timestamp " +
    +          "column")
    +      }
    +    }
    +  }
    +
    +  /**
    +   * Below method will be used to validate the hierarchy of time series and its value
    +   * validation will be done whether hierarchy order is proper or not and hierarchy level
    +   * value
    +   *
    +   * @param timeSeriesHierarchyDetails
    +   * time series hierarchy string
    +   */
    +  def validateAndGetTimeSeriesHierarchyDetails(timeSeriesHierarchyDetails: String): Array[
    +    (String, String)] = {
    +    val updatedtimeSeriesHierarchyDetails = timeSeriesHierarchyDetails.toLowerCase
    +    val timeSeriesHierarchy = updatedtimeSeriesHierarchyDetails.split(",")
    +    val hierBuffer = timeSeriesHierarchy.map {
    +      case f =>
    +        val splits = f.split("=")
    +        // checking hierarchy name is valid or not
    +        if (!TimeSeriesUDF.INSTANCE.TIMESERIES_FUNCTION.contains(splits(0).toLowerCase)) {
    +          throw new MalformedCarbonCommandException(s"Not supported heirarchy type: ${ splits(0) }")
    --- End diff --
   
    Should "heirarchy" be spelled "hierarchy"?


---

[GitHub] carbondata pull request #1565: [CARBONDATA-1518][Pre-Aggregate]Support creat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1565#discussion_r163140644
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/timeseries/TimeseriesUtil.scala ---
    @@ -0,0 +1,159 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.execution.command.timeseries
    +
    +import java.sql.Timestamp
    +
    +import org.apache.spark.sql.execution.command.{DataMapField, Field}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.metadata.datatype.DataTypes
    +import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    +import org.apache.carbondata.core.preagg.TimeSeriesUDF
    +import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
    +
    +/**
    + * Utility class for time series to keep
    + */
    +object TimeSeriesUtil {
    +
    +  /**
    +   * Below method will be used to validate whether column mentioned in time series
    +   * is timestamp column or not
    +   *
    +   * @param dmproperties
    +   * data map properties
    +   * @param parentTable
    +   * parent table
    +   * @return whether time stamp column
    +   */
    +  def validateTimeSeriesEventTime(dmproperties: Map[String, String],
    +      parentTable: CarbonTable) {
    +    val eventTime = dmproperties.get(CarbonCommonConstants.TIMESERIES_EVENTTIME)
    +    if (!eventTime.isDefined) {
    +      throw new MalformedCarbonCommandException("Eventtime not defined in time series")
    +    } else {
    +      val carbonColumn = parentTable.getColumnByName(parentTable.getTableName, eventTime.get)
    +      if (carbonColumn.getDataType != DataTypes.TIMESTAMP) {
    +        throw new MalformedCarbonCommandException(
    +          "Timeseries event time is only supported on Timestamp " +
    +          "column")
    +      }
    +    }
    +  }
    +
    +  /**
    +   * Below method will be used to validate the hierarchy of time series and its value
    +   * validation will be done whether hierarchy order is proper or not and hierarchy level
    +   * value
    +   *
    +   * @param timeSeriesHierarchyDetails
    +   * time series hierarchy string
    +   */
    +  def validateAndGetTimeSeriesHierarchyDetails(timeSeriesHierarchyDetails: String): Array[
    +    (String, String)] = {
    +    val updatedtimeSeriesHierarchyDetails = timeSeriesHierarchyDetails.toLowerCase
    +    val timeSeriesHierarchy = updatedtimeSeriesHierarchyDetails.split(",")
    +    val hierBuffer = timeSeriesHierarchy.map {
    +      case f =>
    +        val splits = f.split("=")
    +        // checking hierarchy name is valid or not
    +        if (!TimeSeriesUDF.INSTANCE.TIMESERIES_FUNCTION.contains(splits(0).toLowerCase)) {
    +          throw new MalformedCarbonCommandException(s"Not supported heirarchy type: ${ splits(0) }")
    +
    +        }
    +        // validating hierarchy level is valid or not
    +        if (!splits(1).equals("1")) {
    +          throw new MalformedCarbonCommandException(
    +            s"Unsupported Value for hierarchy:" +
    +            s"${ splits(0) }=${ splits(1) }")
    +        }
    +        (splits(0), splits(1))
    +    }
    +    // checking whether hierarchy is in proper order or not
    +    // get the index of first hierarchy
    +    val indexOfFirstHierarchy = TimeSeriesUDF.INSTANCE.TIMESERIES_FUNCTION
    +      .indexOf(hierBuffer(0)._1.toLowerCase)
    +    val index = 0
    --- End diff --
   
    What is the index variable used for?
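
    The order check that the code comments describe can be written without a separate counter. The following is only a sketch under stated assumptions: the granularity list is taken from the hierarchy string used in this PR's tests (the real list lives in TimeSeriesUDF.INSTANCE.TIMESERIES_FUNCTION and may differ), and HierarchyOrderSketch and isInOrder are hypothetical names, not part of the PR.

```scala
object HierarchyOrderSketch {
  // Assumed ordering, copied from the 'timeseries.hierarchy' value in the tests;
  // the actual list in TimeSeriesUDF may contain more entries.
  val granularities: Seq[String] = Seq("second", "hour", "day", "month", "year")

  // True when every name is known and the names appear in strictly increasing
  // granularity order. Adjacent positions are compared pairwise, so no mutable
  // index variable is needed.
  def isInOrder(names: Seq[String]): Boolean = {
    val positions = names.map(n => granularities.indexOf(n.toLowerCase))
    !positions.contains(-1) &&
      positions.zip(positions.drop(1)).forall { case (a, b) => a < b }
  }
}
```

    With this shape, a hierarchy such as hour, day, year, month is rejected because year sits after month in the assumed list.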


---

[GitHub] carbondata pull request #1565: [CARBONDATA-1518][Pre-Aggregate]Support creat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1565#discussion_r163249066
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/timeseries/TimeseriesUtil.scala ---
    @@ -0,0 +1,159 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.execution.command.timeseries
    +
    +import java.sql.Timestamp
    +
    +import org.apache.spark.sql.execution.command.{DataMapField, Field}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.metadata.datatype.DataTypes
    +import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    +import org.apache.carbondata.core.preagg.TimeSeriesUDF
    +import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
    +
    +/**
    + * Utility class for time series to keep
    + */
    +object TimeSeriesUtil {
    +
    +  /**
    +   * Below method will be used to validate whether column mentioned in time series
    +   * is timestamp column or not
    +   *
    +   * @param dmproperties
    +   * data map properties
    +   * @param parentTable
    +   * parent table
    +   * @return whether time stamp column
    +   */
    +  def validateTimeSeriesEventTime(dmproperties: Map[String, String],
    +      parentTable: CarbonTable) {
    +    val eventTime = dmproperties.get(CarbonCommonConstants.TIMESERIES_EVENTTIME)
    +    if (!eventTime.isDefined) {
    +      throw new MalformedCarbonCommandException("Eventtime not defined in time series")
    +    } else {
    +      val carbonColumn = parentTable.getColumnByName(parentTable.getTableName, eventTime.get)
    +      if (carbonColumn.getDataType != DataTypes.TIMESTAMP) {
    +        throw new MalformedCarbonCommandException(
    +          "Timeseries event time is only supported on Timestamp " +
    +          "column")
    +      }
    +    }
    +  }
    +
    +  /**
    +   * Below method will be used to validate the hierarchy of time series and its value
    +   * validation will be done whether hierarchy order is proper or not and hierarchy level
    +   * value
    +   *
    +   * @param timeSeriesHierarchyDetails
    +   * time series hierarchy string
    +   */
    +  def validateAndGetTimeSeriesHierarchyDetails(timeSeriesHierarchyDetails: String): Array[
    +    (String, String)] = {
    +    val updatedtimeSeriesHierarchyDetails = timeSeriesHierarchyDetails.toLowerCase
    +    val timeSeriesHierarchy = updatedtimeSeriesHierarchyDetails.split(",")
    +    val hierBuffer = timeSeriesHierarchy.map {
    +      case f =>
    +        val splits = f.split("=")
    +        // checking hierarchy name is valid or not
    +        if (!TimeSeriesUDF.INSTANCE.TIMESERIES_FUNCTION.contains(splits(0).toLowerCase)) {
    +          throw new MalformedCarbonCommandException(s"Not supported heirarchy type: ${ splits(0) }")
    +
    +        }
    +        // validating hierarchy level is valid or not
    +        if (!splits(1).equals("1")) {
    --- End diff --
   
    Why must splits(1).equals("1") hold?
    Can't we support hour=2 or other values?
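
    One possible way to keep today's restriction while leaving room for larger levels is to parse the level as an integer and compare it against a configurable bound. This is a sketch only: HierarchyParseSketch, parse and maxLevel are hypothetical names, the granularity list is assumed from the tests in this PR, and errors are returned as Left values here instead of throwing MalformedCarbonCommandException. It also guards against entries with no "=" at all, which the diff's splits(1) would not.

```scala
import scala.util.Try

object HierarchyParseSketch {
  // Assumed names, copied from the 'timeseries.hierarchy' value in the tests.
  val granularities: Seq[String] = Seq("second", "hour", "day", "month", "year")

  // Parses "hour=1,day=1" into (name, level) pairs.
  // maxLevel = 1 reproduces the current "only 1 is allowed" rule;
  // a larger bound would admit values such as hour=2.
  def parse(details: String, maxLevel: Int = 1): Either[String, Seq[(String, Int)]] = {
    val entries = details.toLowerCase.split(",").toSeq.map(_.trim)
    val parsed: Seq[Either[String, (String, Int)]] = entries.map { entry =>
      entry.split("=") match {
        case Array(name, level) if granularities.contains(name) =>
          Try(level.toInt).toOption match {
            case Some(n) if n >= 1 && n <= maxLevel => Right(name -> n)
            case _ => Left(s"Unsupported value for hierarchy: $entry")
          }
        case Array(name, _) => Left(s"Not supported hierarchy type: $name")
        case _ => Left(s"Malformed hierarchy entry: $entry")
      }
    }
    // Report the first error, otherwise return all parsed pairs in order.
    parsed.collectFirst { case Left(err) => err } match {
      case Some(err) => Left(err)
      case None => Right(parsed.collect { case Right(pair) => pair })
    }
  }
}
```

    Whether levels other than 1 make sense depends on how the pre-aggregate tables are built, which this sketch does not address.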


---

[GitHub] carbondata pull request #1565: [CARBONDATA-1518][Pre-Aggregate]Support creat...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1565#discussion_r164043115
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala ---
    @@ -0,0 +1,93 @@
    +package org.apache.carbondata.integration.spark.testsuite.timeseries
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +class TestTimeSeriesCreateTable extends QueryTest with BeforeAndAfterAll {
    +
    +  override def beforeAll: Unit = {
    +    sql("drop table if exists mainTable")
    +    sql("CREATE TABLE mainTable(dataTime timestamp, name string, city string, age int) STORED BY 'org.apache.carbondata.format'")
    +    sql("create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES ('timeseries.eventTime'='dataTime', 'timeseries.hierarchy'='second=1,hour=1,day=1,month=1,year=1') as select dataTime, sum(age) from mainTable group by dataTime")
    +  }
    +
    +  test("test timeseries create table Zero") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0_second"), true, "maintable_agg0_second")
    +    sql("drop datamap agg0_second on table mainTable")
    +  }
    +
    +  test("test timeseries create table One") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0_hour"), true, "maintable_agg0_hour")
    +    sql("drop datamap agg0_hour on table mainTable")
    +  }
    +  test("test timeseries create table two") {
    +    checkExistence(sql("DESCRIBE FORMATTED maintable_agg0_day"), true, "maintable_agg0_day")
    +    sql("drop datamap agg0_day on table mainTable")
    +  }
    +  test("test timeseries create table three") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0_month"), true, "maintable_agg0_month")
    +    sql("drop datamap agg0_month on table mainTable")
    +  }
    +  test("test timeseries create table four") {
    +    checkExistence(sql("DESCRIBE FORMATTED mainTable_agg0_year"), true, "maintable_agg0_year")
    +    sql("drop datamap agg0_year on table mainTable")
    +  }
    +
    +  test("test timeseries create table five") {
    +    try {
    +      sql(
    +        "create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES ('timeseries.eventTime'='dataTime', 'timeseries.hierarchy'='sec=1,hour=1,day=1,month=1,year=1') as select dataTime, sum(age) from mainTable group by dataTime")
    +      assert(false)
    +    } catch {
    +      case _:Exception =>
    +        assert(true)
    +    }
    +  }
    +
    +  test("test timeseries create table Six") {
    +    try {
    +      sql(
    +        "create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES ('timeseries.eventTime'='dataTime', 'timeseries.hierarchy'='hour=2') as select dataTime, sum(age) from mainTable group by dataTime")
    +      assert(false)
    +    } catch {
    +      case _:Exception =>
    +        assert(true)
    +    }
    +  }
    +
    +  test("test timeseries create table seven") {
    +    try {
    +      sql(
    +        "create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES ('timeseries.eventTime'='dataTime', 'timeseries.hierarchy'='hour=1,day=1,year=1,month=1') as select dataTime, sum(age) from mainTable group by dataTime")
    +      assert(false)
    +    } catch {
    +      case _:Exception =>
    +        assert(true)
    +    }
    +  }
    +
    +  test("test timeseries create table Eight") {
    +    try {
    +      sql(
    +        "create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES ('timeseries.eventTime'='name', 'timeseries.hierarchy'='hour=1,day=1,year=1,month=1') as select name, sum(age) from mainTable group by name")
    +      assert(false)
    --- End diff --
   
    This test always passes, whether or not the SQL statement succeeds: the failure raised by assert(false) is itself caught by the catch block.
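
    The effect can be reproduced without any test framework. In the sketch below, TestFailed and MalformedCommand are hypothetical stand-ins (for the exception a failed assertion raises and for MalformedCarbonCommandException respectively), and both helper functions are illustrative names, not code from the PR.

```scala
object SwallowedAssertionSketch {
  // Hypothetical stand-in for the exception a test framework raises when an
  // assertion fails (ScalaTest's TestFailedException is also a RuntimeException).
  final class TestFailed(msg: String) extends RuntimeException(msg)

  // Hypothetical stand-in for MalformedCarbonCommandException.
  final class MalformedCommand(msg: String) extends Exception(msg)

  // Pattern from the diff: the failure raised inside `try` is caught by the
  // broad `case _: Exception`, so this returns true whether or not `body` throws.
  def brokenExpectFailure(body: => Unit): Boolean =
    try {
      body
      throw new TestFailed("expected an exception")
    } catch {
      case _: Exception => true
    }

  // Safer pattern: catch only the expected exception type and report
  // false when `body` completes normally.
  def expectFailure(body: => Unit): Boolean =
    try { body; false } catch { case _: MalformedCommand => true }
}
```

    In ScalaTest itself, intercept[MalformedCarbonCommandException] { sql(...) } expresses the same intent, assuming that is the exception type that actually reaches the test.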


---