[ANNOUNCE] Apache CarbonData 1.3.0 release

Liang Chen
Hi

The Apache CarbonData PMC team is happy to announce the release of Apache
CarbonData version 1.3.0.

What’s New in Version 1.3.0?

This version of CarbonData adds the following new features to improve
performance, compatibility, and usability.

1)Support Spark 2.2.1
Spark 2.2.1 is the latest stable version; it adds new features and improves
performance. CarbonData 1.3.0 integrates with it so that users get these
benefits after upgrading.

2)Support Streaming
Supports streaming ingestion for real-time data. After the real-time data
is ingested into the carbon store, it can be queried from a compute engine
such as SparkSQL.

3)Pre Aggregate Support
Supports pre-aggregation of data so that "group by" queries can fetch data
much faster (around 10X). You can create as many aggregate tables as
required, as datamaps, to improve query performance.
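
As a rough illustration of the pre-aggregate feature, the sketch below holds a datamap DDL statement and a matching "group by" query as strings and runs them through a session object. The table `sales`, its columns, the datamap name `agg_sales`, and the helper `create_and_query` are hypothetical; `session` is assumed to be a Spark session already configured for CarbonData (setup not shown). Please check the DDL against the pre-aggregate datamap guide before relying on it.

```python
# Hypothetical table, column, and datamap names, for illustration only.
CREATE_AGG_DATAMAP = """
CREATE DATAMAP agg_sales
ON TABLE sales
USING "preaggregate"
AS
SELECT country, sum(quantity), avg(price)
FROM sales
GROUP BY country
"""

# A "group by" query of the shape the aggregate table is meant to serve.
GROUP_BY_QUERY = "SELECT country, sum(quantity) FROM sales GROUP BY country"


def create_and_query(session):
    """Create the datamap, then run the group-by query.

    `session` is assumed to expose a `sql(str)` method, as a Spark
    session does; nothing here is specific to this helper.
    """
    session.sql(CREATE_AGG_DATAMAP)
    return session.sql(GROUP_BY_QUERY)
```

The query is written against the main table; the intent of the feature is that the aggregate table is picked up without changing the query text.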

4)Support Time Series (Alpha feature)
Supports creating multiple pre-aggregate tables for the time hierarchy, and
CarbonData can automatically roll up queries on these hierarchies.
Note that this is an alpha feature.

5)CTAS (CREATE TABLE AS SELECT)
Supports creating a CarbonData table from any Parquet/Hive/Carbon
table. This is useful when you want to create a CarbonData table from
another Parquet/Hive table and use the Carbon query engine to query it and
achieve better query results. It can also be used for backing up data.
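
A minimal sketch of what a CTAS statement might look like, built as a string in Python. The helper `ctas_statement` and the table names are hypothetical; the resulting statement would be passed to the `sql` method of a Spark session configured for CarbonData (not shown here).

```python
def ctas_statement(target_table, source_table):
    """Build a CREATE TABLE AS SELECT statement for a CarbonData target.

    Both table names are placeholders chosen by the caller; the source
    could be a Parquet, Hive, or Carbon table.
    """
    return (
        f"CREATE TABLE {target_table} "
        "STORED BY 'carbondata' "
        f"AS SELECT * FROM {source_table}"
    )


# Example with hypothetical table names.
statement = ctas_statement("sales_carbon", "sales_parquet")
```

Here `statement` holds `CREATE TABLE sales_carbon STORED BY 'carbondata' AS SELECT * FROM sales_parquet`.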

6)Standard Partitioning
Supports standard Hive partitioning, which allows you to use any columns
to create a partition for improving query performance significantly.
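
To illustrate standard partitioning, the sketch below shows a table definition partitioned by two columns and a query that filters on one of them. The table and column names and the helper `create_partitioned_table` are hypothetical, and `session` is assumed to be a Spark session configured for CarbonData.

```python
# Hypothetical schema; the partition columns are declared only in the
# PARTITIONED BY clause, as in Hive-style DDL.
CREATE_PARTITIONED_TABLE = """
CREATE TABLE IF NOT EXISTS product_sales (
  product_number INT,
  product_name STRING,
  sale_quantity INT,
  revenue INT
)
PARTITIONED BY (product_category STRING, product_batch STRING)
STORED BY 'carbondata'
"""

# A filter on a partition column is the case partitioning is meant to help.
FILTERED_QUERY = (
    "SELECT sum(revenue) FROM product_sales "
    "WHERE product_category = 'phone'"
)


def create_partitioned_table(session):
    """Create the table, then run a query that filters on a partition column."""
    session.sql(CREATE_PARTITIONED_TABLE)
    return session.sql(FILTERED_QUERY)
```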

7)Support External DB & Table Path
Supports external DB and table paths. When creating a DB or table, you
can now specify the location where it should be stored.

8)Support Query Data with Specified Dataload
Supports querying data from specified segments (one dataload generates one
segment). Users can restrict a query to the data they actually need, which
is very helpful for improving query performance.
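
A small sketch of how a query might be limited to specific segments. It builds the `SET carbon.input.segments.<database>.<table>` statement as a string; the helper names, the database `default`, and the table `sales` are hypothetical, and the statements would be run through a Spark session configured for CarbonData. Please confirm the exact property syntax in the segment management documentation.

```python
def show_segments_statement(database, table):
    """Build the statement that lists a table's segments and their IDs."""
    return f"SHOW SEGMENTS FOR TABLE {database}.{table}"


def segment_scope_statement(database, table, segment_ids):
    """Build the SET statement that limits later queries to the given segments."""
    ids = ",".join(str(segment_id) for segment_id in segment_ids)
    return f"SET carbon.input.segments.{database}.{table} = {ids}"


# Example: query only the data loaded as segments 1 and 3 (hypothetical IDs).
scope = segment_scope_statement("default", "sales", [1, 3])
```

Here `scope` holds `SET carbon.input.segments.default.sales = 1,3`; queries on `default.sales` issued afterwards in the same session would then read only those segments.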

9)Support Boolean Data Type



You can follow this document to use these artifacts:

https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md

You can find the latest CarbonData documentation and learn more at:
http://carbondata.apache.org


Please find the detailed JIRA list:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220&version=12341004

Regards
Liang