Posted by rahul_kumar on Feb 04, 2019; 1:28pm
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-read-latest-schema-in-case-of-external-table-and-file-format-tp74986p74996.html
Hi Akash,
I have one concern about this change:
*Concern*: Why are we skipping the old data file? Even if the user does not
give a schema, I think we should still read the old data file. We can fill
columns *d* and *e* with *None* values.
I guess *if a data file is present at the given location, it means the user
wants to read data from all files*.
*Suggestion*: If we are somehow maintaining the schema in the internal flow,
we can use the alter table flow as well.
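
To make the idea concrete, here is a minimal sketch in plain Python. It is not CarbonData code: `DataFile`, `resolve_schema` and `read_all` are hypothetical names, and a "file" is modelled as a write timestamp, a column list and some rows. It only illustrates picking the latest schema (or the user's schema, if given), reading every file, and filling columns missing from older files with None.

```python
from typing import Any, Dict, List, NamedTuple, Optional, Sequence


class DataFile(NamedTuple):
    # Hypothetical stand-in for one carbondata file.
    written_at: int                 # assumed: larger value means newer file
    columns: Sequence[str]          # column names stored in this file
    rows: List[Dict[str, Any]]      # rows keyed by column name


def resolve_schema(files: Sequence[DataFile],
                   user_schema: Optional[Sequence[str]] = None) -> List[str]:
    # A user-specified schema wins; otherwise use the newest file's schema.
    if user_schema is not None:
        return list(user_schema)
    newest = max(files, key=lambda f: f.written_at)
    return list(newest.columns)


def read_all(files: Sequence[DataFile],
             user_schema: Optional[Sequence[str]] = None) -> List[Dict[str, Any]]:
    schema = resolve_schema(files, user_schema)
    result = []
    # No file is skipped: older files are read too.
    for f in sorted(files, key=lambda f: f.written_at):
        for row in f.rows:
            # Columns absent from this file become None;
            # columns outside the chosen schema are dropped.
            result.append({col: row.get(col) for col in schema})
    return result


old = DataFile(1, ["a", "b", "c"], [{"a": 1, "b": 2, "c": 3}])
new = DataFile(2, ["a", "b", "c", "d", "e"],
               [{"a": 4, "b": 5, "c": 6, "d": 7, "e": 8}])
rows = read_all([old, new])
```

With no user schema, the row from the old file comes back with *d* and *e* set to None instead of the file being skipped or the query failing. Passing `user_schema=["a", "b"]` would restrict both files to those two columns.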
On Mon, Feb 4, 2019, 4:25 PM Liang Chen <[hidden email]> wrote:
> Hi
>
> Can you explain which scenario will generate two carbondata files with
> different schema?
>
> Regards
> Liang
>
>
> akashrn5 wrote
> > Hi dev,
> >
> > Currently we have a validation that fails the query if there are two
> > carbondata files with different schemas in a location. I think there is
> > no need to fail. The parquet behavior also makes this clear.
> >
> > I think failing here is not good. We can read the latest schema from the
> > latest carbondata file in the given location, read all the files based on
> > that schema, and give the query output. For data files that do not
> > contain the new columns, those columns will have null values.
> >
> > But here we basically do not merge schemas. We can keep that behavior
> > now as well; the only change is that we take the latest schema.
> >
> > For example:
> > 1. One data file has columns a, b and c; a second file has columns
> > a, b, c, d and e. Then we can read the files and create the table with
> > 5 columns or 3 columns, whichever is latest (this applies when the user
> > does not specify a schema). If the user specifies a schema, the table
> > will be created with the specified schema.
> >
> > I have created a jira for this:
> > https://issues.apache.org/jira/browse/CARBONDATA-3287
> >
> > If you have any input, please let me know.
> >
> > Regards,
> > Akash
>
> --
> Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/