CarbonStore Java & REST API proposal


Jacky Li

 

Motivation

Currently, CarbonData is mainly used through integration with big data compute frameworks like Spark, Presto, etc. This is useful for complex data analytics workflows, such as those using SQL or Spark’s DataFrame/DataSet API. However, there are also simpler analytic scenarios that can leverage CarbonData, like filtering a single table or just scanning a given folder that contains carbondata files.

 

We believe application performance can benefit from CarbonData’s acceleration techniques, such as indexing and materialized views. To enable such scenarios, we would like to provide a CarbonStore Java API so that any Java application can use it in-process. We would also like to provide a REST API for applications that prefer a microservice architecture.



Since Google Docs is not available to some of us, I have put the full proposal in JIRA CARBONDATA-2688 (https://issues.apache.org/jira/browse/CARBONDATA-2688).

Please feel free to provide feedback. Any feedback is welcome.


Regards,
Jacky

Re: CarbonStore Java & REST API proposal

David CaiQiang
+1



-----
Best Regards
David Cai
--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: CarbonStore Java & REST API proposal

xuchuanyin
In reply to this post by Jacky Li
Hi Jacky, please check the following comments:


1. Do we need to provide other interfaces, such as `listTable`,
`renameTable`...?

2. What's the difference between the functions of 'Carbon-SDK' and
'CarbonStore'?

As for the CarbonStore API `createTable`:
3. Will it make use of the existing `TableSchemaBuilder`?
4. Better to return the `TableIdentifier` instead of `void`?

As for the CarbonStore API `loadData`:
5. Is it possible to return the number of records that have been loaded?

6. What's the relationship between `CarbonStore` and Session? (for set
command)





Re: CarbonStore Java & REST API proposal

Jacky Li
Answers inline.

> On July 6, 2018, at 3:24 PM, xuchuanyin <[hidden email]> wrote:
>
> Hi, jacky, please check the following comments:
>
>
> 1. Do we need to provide other interfaces, such as `listTable`,
> `renameTable`…

As of now, I have added getTable to return the CarbonTable. As for more metadata interfaces like the listTable and renameTable you mentioned, I would like to add them later if there are use cases for them.
>
> 2. What's the difference between the function of 'Carbon-SDK' and
> 'CarbonStore'

Carbon SDK is used for reading and writing carbondata files, so it works at the file level, while CarbonStore is a store-level interface that operates on a “Table”. CarbonStore is built on top of Carbon-SDK.

>
> As for the CarbonStore API `createTable`:
> 3. Will it make use of the existing `TableSchemaBuilder`?
Yes, it will.

> 4. Better to return the `TableIdentifier` instead of `void`?

Right now, TableIdentifier is a simple object that contains the table name and an optional database name. I think the user will want to provide this when creating a table. I guess you want createTable to return a CarbonTable? This can be considered; I just prefer to keep the interface simple as a first step.
 
>
> As for the CarbonStore API `loadData`:
> 5. Is it possible to return the number of records that have been loaded?
I think this can be considered. Are you looking for a specific use case?

>
> 6. What's the relationship between `CarbonStore` and Session? (for set
> command)

CarbonStore is intended as a library that works independently, without a compute framework like Spark. As we add more functionality to Carbon, I think we had better add it in CarbonStore in the future, and Spark/Presto can then integrate CarbonStore in their integration modules. So to answer your question, one possibility is that CarbonSession will use CarbonStore internally; to be specific, CarbonScanRDD will use CarbonStore.
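
For readers following the thread, the interface shape under discussion (createTable returning void, getTable for metadata, and a loadData that reports the number of loaded records, as suggested in comment 5) might be sketched roughly as below. This is only an illustration under stated assumptions: every type here (`Store`, `InMemoryStore`, `TableInfo`, and this simplified `TableIdentifier`) is a hypothetical stand-in and not the actual CarbonData class, the `"default"` database name is an assumption, and the signatures in CARBONDATA-2688 may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are the real CarbonData classes.
public class CarbonStoreSketch {

    /** Table name plus an optional database name, as described in the thread. */
    static final class TableIdentifier {
        final String databaseName; // may be null
        final String tableName;

        TableIdentifier(String databaseName, String tableName) {
            this.databaseName = databaseName;
            this.tableName = tableName;
        }

        /** Lookup key; falling back to "default" is an assumption of this sketch. */
        String key() {
            String db = (databaseName == null) ? "default" : databaseName;
            return db + "." + tableName;
        }
    }

    /** Stand-in for the table metadata that getTable would return. */
    static final class TableInfo {
        final TableIdentifier identifier;
        final List<String> columnNames;

        TableInfo(TableIdentifier identifier, List<String> columnNames) {
            this.identifier = identifier;
            this.columnNames = new ArrayList<>(columnNames);
        }
    }

    /** Store-level interface that operates on tables rather than files. */
    interface Store {
        void createTable(TableIdentifier id, List<String> columnNames);

        TableInfo getTable(TableIdentifier id);

        /** Returns the number of records loaded (the suggestion in comment 5). */
        long loadData(TableIdentifier id, List<String[]> rows);
    }

    /** Toy in-memory implementation so the sketch can actually run. */
    static final class InMemoryStore implements Store {
        private final Map<String, TableInfo> tables = new HashMap<>();
        private final Map<String, List<String[]>> rowsByTable = new HashMap<>();

        @Override
        public void createTable(TableIdentifier id, List<String> columnNames) {
            if (tables.containsKey(id.key())) {
                throw new IllegalStateException("Table already exists: " + id.key());
            }
            tables.put(id.key(), new TableInfo(id, columnNames));
            rowsByTable.put(id.key(), new ArrayList<>());
        }

        @Override
        public TableInfo getTable(TableIdentifier id) {
            TableInfo info = tables.get(id.key());
            if (info == null) {
                throw new IllegalArgumentException("No such table: " + id.key());
            }
            return info;
        }

        @Override
        public long loadData(TableIdentifier id, List<String[]> rows) {
            TableInfo info = getTable(id);
            // Validate every row first so a bad row leaves the table unchanged.
            for (String[] row : rows) {
                if (row.length != info.columnNames.size()) {
                    throw new IllegalArgumentException("Row width does not match schema");
                }
            }
            rowsByTable.get(id.key()).addAll(rows);
            return rows.size();
        }
    }

    public static void main(String[] args) {
        Store store = new InMemoryStore();
        TableIdentifier id = new TableIdentifier(null, "events");
        store.createTable(id, Arrays.asList("id", "name"));

        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "a"});
        rows.add(new String[] {"2", "b"});

        long loaded = store.loadData(id, rows);
        System.out.println("Loaded " + loaded + " rows into " + store.getTable(id).identifier.key());
    }
}
```

In this sketch the caller builds the identifier up front and passes it to createTable, which is why a void return loses nothing; the record count from loadData is the only value the caller could not have known in advance.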

