[DISCUSSION] Support write Flink streaming data to Carbon

[DISCUSSION] Support write Flink streaming data to Carbon

niuge
The write process is:

1. Write the Flink streaming data to the local file system of the Flink task node, using the Flink StreamingFileSink and the Carbon SDK;
2. Copy the local carbon data files to the carbon data store system, such as HDFS or S3 (a sketch of steps 1 and 2 follows this list);
3. Generate and write a segment file to ${tablePath}/load_details;
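
For concreteness, here is a minimal sketch of steps 1 and 2 using the Carbon SDK CarbonWriter and the Hadoop FileSystem API. The schema, the local staging directory and the HDFS address are assumptions for illustration only; in the actual design this logic would run inside the Flink StreamingFileSink on each task node rather than in a standalone program.

import java.net.URI;

import org.apache.carbondata.core.metadata.datatype.DataTypes;
import org.apache.carbondata.sdk.file.CarbonWriter;
import org.apache.carbondata.sdk.file.Field;
import org.apache.carbondata.sdk.file.Schema;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalWriteAndCopySketch {
  public static void main(String[] args) throws Exception {
    // Assumed paths: a staging directory on the task node's local file
    // system, and the table location in the carbon data store.
    String localDir = "/tmp/flink-carbon/stage";
    String storeDir = "hdfs://namenode:8020/carbon/store/db/tbl";

    // Step 1: write streaming records as carbon data files on the local FS.
    Schema schema = new Schema(new Field[] {
        new Field("id", DataTypes.LONG),
        new Field("event", DataTypes.STRING)
    });
    CarbonWriter writer = CarbonWriter.builder()
        .outputPath(localDir)
        .withCsvInput(schema)
        .writtenBy("flink-streaming-sink")
        .build();
    writer.write(new String[] {"1", "click"});
    writer.write(new String[] {"2", "view"});
    writer.close();

    // Step 2: copy the finished local carbon data files to the store.
    FileSystem fs = FileSystem.get(URI.create(storeDir), new Configuration());
    fs.copyFromLocalFile(new Path(localDir), new Path(storeDir));
  }
}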

Run "alter table ${tableName} collect segments" command on server, to compact segment files in ${tablePath}/load_details, and then move the compacted segment file to ${tablePath}/Metadata/Segments/,update table status file finally.

I have raised a JIRA, https://issues.apache.org/jira/browse/CARBONDATA-3557, and attached the design document to it. Please take a look.

Your opinions and suggestions are welcome.

Re: [DISCUSSION] Support write Flink streaming data to Carbon

Jacky Li-3
+1 for this feature. In my opinion, flink-carbon is a good fit for near real-time analytics.

One doubt: in your design, the Collect Segment command and the Compaction command are two separate commands, right?

The Collect Segment command modifies the metadata files (the tablestatus file and the segment file), while the Compaction command merges small data files and builds indexes.

Is my understanding right?
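
To put my understanding into a sketch (COLLECT SEGMENTS is the syntax proposed in this thread, COMPACT is the existing CarbonData compaction command, and the table name is a placeholder):

import org.apache.spark.sql.SparkSession;

public class CollectVsCompactSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("collect-vs-compact")
        .enableHiveSupport()
        .getOrCreate();

    // Metadata only: merge the small segment files under load_details into
    // one segment file and update the tablestatus file; data files untouched.
    spark.sql("ALTER TABLE my_table COLLECT SEGMENTS");

    // Data operation: merge small carbon data files and rebuild indexes.
    spark.sql("ALTER TABLE my_table COMPACT 'MINOR'");
  }
}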

Regards,
Jacky


Re: [DISCUSSION] Support write Flink streaming data to Carbon

sraghunandan
+1
