http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Optimize-and-refactor-insert-into-command-tp88449p89389.html
> Hi Ajantha,
>
> Thanks for your initiative. I have a couple of questions, though.
>
> a) As per your explanation, the dataset validation is already done as part
> of the source table. Is that what you mean? My understanding is that
> insert-select queries will benefit since we skip some additional steps.
>
> b) What if your destination table has different table properties, e.g. a
> few columns may have non-null constraints, or the date format or decimal
> precision/scale may differ? You may still need bad-record support then;
> how are you going to handle such scenarios? Correct me if I misinterpreted
> your points.
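>
> To make the concern concrete (hypothetical tables, not from the thread),
> a value that is valid in the source can overflow the destination's
> precision:
>
> // Hypothetical illustration in spark-shell; the table names are made up.
> spark.sql("CREATE TABLE src (amount DECIMAL(10,4)) STORED AS carbondata")
> spark.sql("CREATE TABLE dst (amount DECIMAL(5,2)) STORED AS carbondata")
> // A src value like 123456.7890 fits DECIMAL(10,4) but overflows
> // DECIMAL(5,2), so this insert may still need bad-record handling:
> spark.sql("INSERT INTO dst SELECT amount FROM src")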
>
> Regards,
> Sujith
>
>
> On Fri, 20 Dec 2019 at 5:25 AM, Ajantha Bhat <[hidden email]> wrote:
>
> > Currently carbondata "insert into" uses the CarbonLoadDataCommand itself.
> > The load process has steps such as parsing and a converter step with
> > bad-record support. Insert into doesn't require these steps, as the data
> > is already validated and converted by the source table or dataframe.
> >
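> > A minimal sketch of that distinction (hypothetical names, not
> > CarbonData's actual classes): full load parses and converts raw input
> > with bad-record checks, while insert-select can pass through rows that
> > the source relation has already typed and validated.
> >
> > object WriteFlow {
> >   // Full load: raw string rows need parsing/conversion; bad-record
> >   // checks would live inside parseAndConvert.
> >   def fullLoad(rawRows: Iterator[Array[String]]): Iterator[Array[Object]] =
> >     rawRows.map(parseAndConvert)
> >
> >   // Insert-select: rows are already typed by the source table or
> >   // dataframe, so no parse/convert step is needed.
> >   def directInsert(typedRows: Iterator[Array[Object]]): Iterator[Array[Object]] =
> >     typedRows
> >
> >   private def parseAndConvert(row: Array[String]): Array[Object] =
> >     row.map(s => s: Object) // placeholder for the real conversion logic
> > }
> >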
> > Some identified changes are below.
> >
> > 1. Need to refactor and separate load and insert at the driver side to
> > skip the converter step and unify the flow for no-sort and global-sort
> > insert.
> > 2. Need to avoid reordering each row, by changing the select dataframe's
> > projection order itself during the insert (see the first sketch after
> > this list).
> > 3. For carbon-to-carbon insert, need to provide the ReadSupport and use
> > the RecordReader (the vector reader currently doesn't support
> > ReadSupport) to handle null values and the timestamp cutoff (direct
> > dictionary) from the scanRDD result (see the second sketch after this
> > list).
> > 4. Need to handle insert into partition and non-partition tables in the
> > local sort, global sort, no sort, range column, and compaction flows.
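> >
> > For (2), a minimal sketch of the idea, using the plain Spark DataFrame
> > API (illustrative, not the actual CarbonData code): reorder the select
> > dataframe's projection once at plan level so its column order already
> > matches the destination schema, instead of shuffling the fields of every
> > row at write time.
> >
> > import org.apache.spark.sql.DataFrame
> > import org.apache.spark.sql.functions.col
> >
> > // destColumnOrder is the destination table's column order.
> > def alignProjection(selectDf: DataFrame,
> >     destColumnOrder: Seq[String]): DataFrame =
> >   selectDf.select(destColumnOrder.map(col): _*)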
> >
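> > For (3), a rough sketch of the row-level handling; the class name and
> > cutOffMillis parameter here only mirror the shape of a ReadSupport and
> > are assumptions, not the project's actual API:
> >
> > class InsertRowConverter(cutOffMillis: Long) {
> >   // Nulls pass through; timestamps below the direct-dictionary cutoff
> >   // are treated as null, mimicking bad-record-to-null behavior.
> >   def readRow(raw: Array[Object]): Array[Object] =
> >     raw.map {
> >       case null => null
> >       case ts: java.sql.Timestamp if ts.getTime < cutOffMillis => null
> >       case other => other
> >     }
> > }
> >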
> > The final goal is to improve insert performance by keeping only the
> > required logic, and also to decrease the memory footprint.
> >
> > If you have any other suggestions or optimizations related to this let me
> > know.
> >
> > Thanks,
> > Ajantha
> >
>