[jira] [Comment Edited] (CARBONDATA-1572) Support Streaming Ingest

Akash R Nilugal (Jira)

    [ https://issues.apache.org/jira/browse/CARBONDATA-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205412#comment-16205412 ]

Jacky Li edited comment on CARBONDATA-1572 at 10/16/17 4:05 AM:
----------------------------------------------------------------

My understanding is:

1. The handoff process is like the compaction process: it first reads the files in the streaming segment and creates new columnar files. After all files are converted, it marks the old streaming segment for deletion. The segment will be deleted after a timeout, or the user can delete it manually with the CLEAN command.

2. As a first step, I think landing on HDFS directly is simpler, and it can also provide data consistency. The only problem is that the write throughput may not be comparable to HBase. But if we implement it like HBase, we need a lot of infrastructure, such as a mem-store, WAL, recovery process, etc. That is a long way to go.
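
The handoff flow in point 1 can be sketched roughly as follows. This is only an illustration in Python, not CarbonData code: the `Segment` class, the `handoff` and `clean` functions, the status names, and the in-memory file model are all hypothetical, and it assumes every row in a file has the same fields.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Segment:
    """Hypothetical in-memory stand-in for a segment on disk."""
    segment_id: int
    fmt: str                                   # "streaming" (row files) or "columnar"
    files: Dict[str, Any] = field(default_factory=dict)  # file name -> content
    status: str = "ACTIVE"                     # ACTIVE or MARKED_FOR_DELETE
    marked_at: Optional[float] = None          # when the segment was marked


def rows_to_columns(rows: List[dict]) -> Dict[str, list]:
    """Pivot a list of row dicts into a dict of column lists."""
    columns: Dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def handoff(streaming: Segment, new_id: int, now: Optional[float] = None) -> Segment:
    """Convert every file of a streaming segment, then mark it for deletion."""
    columnar = Segment(segment_id=new_id, fmt="columnar")
    for name, rows in streaming.files.items():
        columnar.files[name + ".columnar"] = rows_to_columns(rows)
    # The old segment is marked only after all files have been converted.
    streaming.status = "MARKED_FOR_DELETE"
    streaming.marked_at = time.time() if now is None else now
    return columnar


def clean(segments: List[Segment], timeout_s: float,
          now: Optional[float] = None, force: bool = False) -> List[Segment]:
    """Return the segments that survive cleanup.

    Marked segments are dropped once the timeout has passed, or
    immediately when force=True (modelling a manual clean).
    """
    now = time.time() if now is None else now
    return [
        s for s in segments
        if not (s.status == "MARKED_FOR_DELETE"
                and (force or now - s.marked_at >= timeout_s))
    ]
```

In this sketch the streaming segment stays readable until `clean` removes it, so a reader that started before the handoff would not lose its input in the middle of a query.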



was (Author: jackylk):
My understanding is:

1. The handoff process is like the compaction process: it first reads the files in the streaming segment and creates new columnar files. After all files are converted, it marks the old streaming segment for deletion. The segment will be deleted after a timeout, or the user can delete it manually with the CLEAN command.

2. As a first step, I think landing on HDFS directly is simpler. If we implement it like HBase, we need a lot of infrastructure, such as a mem-store, WAL, recovery process, etc. That is a long way to go.


> Support Streaming Ingest
> ------------------------
>
>                 Key: CARBONDATA-1572
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1572
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: QiangCai
>         Attachments: CarbonData Streaming Ingest.pdf
>
>
> CarbonData should support streaming ingest.
> [^CarbonData Streaming Ingest.pdf]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)