[jira] [Updated] (CARBONDATA-2516) Filter Greater than in timestamp datatype not generating


Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sourabh Verma updated CARBONDATA-2516:
--------------------------------------
    Description:
Scenario - carbon-data Table 'load_table' with columns 'integer', 'datetime'

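// Table creation and load code (Spark)
import java.sql.Timestamp
import scala.util.Random
import org.apache.spark.sql.SaveMode
import spark.implicits._ // needed for toDF; assumes a SparkSession named `spark`

val random = new Random()
// Reference time; not defined in the original snippet, assumed to be "now"
val currentMillis = System.currentTimeMillis()
// One row per second, going back 365 * 24 * 360 seconds from currentMillis
val df = spark.sparkContext.parallelize(1 to (365 * 24 * 360))
  .map(x => (random.nextInt(200), new Timestamp(currentMillis - (x * 1000L))))
  .toDF("integer", "datetime")

// Saves dataframe to carbondata file
df.write.format("carbondata")
  .option("tableName", "load_table")
  .option("compress", "true")
  .option("tempCSV", "false")
  .mode(SaveMode.Overwrite)
  .save()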

SQL (through Presto CLI) - select * from load_table where datetime > date_parse('2018-05-10 18:22:15', '%Y-%m-%d %T');

Issue - CarbonData performs a full scan over the files even though the query passes a greater-than filter on the timestamp column.

Cause - PrestoFilterUtil does not create a greater-than Expression for the timestamp datatype.
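
To illustrate why the missing expression leads to a full scan, here is a minimal sketch of the idea. It is not the actual PrestoFilterUtil code: every type and method name below (FilterExpr, LowerBound, BlockStats, toFilter, blocksToScan) is hypothetical, and it assumes that blocks are pruned by comparing the filter value against per-block min/max statistics. When no filter expression is produced, nothing can be pruned and every block is read.

```scala
import java.sql.Timestamp

// Hypothetical, simplified model. These are NOT CarbonData or Presto classes.
sealed trait FilterExpr
final case class GreaterThan(column: String, value: Long) extends FilterExpr
final case class GreaterThanOrEqual(column: String, value: Long) extends FilterExpr

// A one-sided range on a timestamp column, in epoch milliseconds.
final case class LowerBound(value: Long, inclusive: Boolean)

// Assumed per-block min/max statistics for the timestamp column.
final case class BlockStats(name: String, minMillis: Long, maxMillis: Long)

object TimestampFilterSketch {
  // Translate the lower bound into a filter expression.
  // An exclusive bound corresponds to "datetime > value".
  def toFilter(column: String, bound: LowerBound): FilterExpr =
    if (bound.inclusive) GreaterThanOrEqual(column, bound.value)
    else GreaterThan(column, bound.value)

  // For a lower bound only the block maximum matters: a block can be
  // skipped when even its largest value fails the comparison.
  def mayMatch(block: BlockStats, filter: Option[FilterExpr]): Boolean =
    filter match {
      case Some(GreaterThan(_, v))        => block.maxMillis > v
      case Some(GreaterThanOrEqual(_, v)) => block.maxMillis >= v
      case None                           => true // no expression: full scan
    }

  def blocksToScan(blocks: Seq[BlockStats], filter: Option[FilterExpr]): Seq[String] =
    blocks.filter(mayMatch(_, filter)).map(_.name)
}
```

With a cutoff of Timestamp.valueOf("2018-05-10 18:22:15").getTime, passing None to blocksToScan returns every block, while passing Some(toFilter("datetime", LowerBound(cutoff, inclusive = false))) returns only the blocks whose maximum lies after the cutoff.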

> Filter Greater than in timestamp datatype not generating
> ---------------------------------------------------------
>
>                 Key: CARBONDATA-2516
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2516
>             Project: CarbonData
>          Issue Type: Bug
>          Components: presto-integration
>    Affects Versions: 1.4.0
>            Reporter: Sourabh Verma
>            Priority: Major
>             Fix For: 1.4.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)