Hi Dev,
I observed that we have added many buildXXXWriter in CarbonWriterBuilder in SDK, but all of them is just different parameter combination. I think it is better to add those parameter by withXXX function, which complies to Builder pattern. Otherwise it is hard to maintain if we keep adding these parameter and new build functions. Any suggestion on this? public CarbonWriter buildWriterForCSVInput(Schema schema, Configuration configuration) public CarbonWriter buildThreadSafeWriterForCSVInput(Schema schema, short numOfThreads, Configuration configuration) public CarbonWriter buildWriterForAvroInput(org.apache.avro.Schema avroSchema, Configuration configuration) public CarbonWriter buildThreadSafeWriterForAvroInput(org.apache.avro.Schema avroSchema, short numOfThreads, Configuration configuration) public JsonCarbonWriter buildWriterForJsonInput(Schema carbonSchema, Configuration configuration) public JsonCarbonWriter buildThreadSafeWriterForJsonInput(Schema carbonSchema, short numOfThreads, Configuration configuration) Regards, Jacky |
Yeah, it actually belongs to 'Builder Pattern'. We should simplify this
before they are widely used. -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ |
Yes, should not have this many methods, we will remove those methods and
add those to builder. Regards, Ravindra. On Thu, 13 Sep 2018 at 9:14 AM, xuchuanyin <[hidden email]> wrote: > Yeah, it actually belongs to 'Builder Pattern'. We should simplify this > before they are widely used. > > > > -- > Sent from: > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ > |
yes,
finally, we can keep 3 API with below signatures. *buildWriterForCSVInput(carbonSchema)buildWriterForAvroInput(avroSchema)buildWriterForJsonInput(carbonSchema)* and create new methods for *a) withThreadSafe(NumOfThreads) b) withHadoopConf(configuration)* Along with these change, I want to simplify some more API. 1) Below 5 table properties methods can be made private. so that end user uses only withTableProperties API instead of each separate API. *public CarbonWriterBuilder sortBy(String[] sortColumns)public CarbonWriterBuilder withBlockSize(int blockSize)public CarbonWriterBuilder localDictionaryThreshold(int localDictionaryThreshold)public CarbonWriterBuilder enableLocalDictionary(boolean enableLocalDictionary)public CarbonWriterBuilder withBlockletSize(int blockletSize)* this is the exposed API for the above configurations. *public CarbonWriterBuilder withTableProperties(Map<String, String> options)* 2) Also at SDK side, name transactional and nontransactional doesn't make sense. As long as we don't write the metadata folder, it is always nonTransactional table at write side. *public CarbonWriterBuilder isTransactionalTable(boolean isTransactionalTable)* can we rename above API to isFlatFolderPath ? Also, at SDK reader side also we can rename this. and by default keep flat folder as true. let me know if any suggestions. So, that I can go ahead with the changes Thanks, AB On Thu, Sep 13, 2018 at 11:15 AM Ravindra Pesala <[hidden email]> wrote: > Yes, should not have this many methods, we will remove those methods and > add those to builder. > > Regards, > Ravindra. > On Thu, 13 Sep 2018 at 9:14 AM, xuchuanyin <[hidden email]> wrote: > > > Yeah, it actually belongs to 'Builder Pattern'. We should simplify this > > before they are widely used. > > > > > > > > -- > > Sent from: > > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ > > > |
In reply to this post by ravipesala
can you provide the full signature of these 3 API?
I’d like to call the method like this : WriterBuilder .csv(CarbonSchma) .withXXX(…) .build WriterBuilder .avro(AvroSchma) .withXXX(…) .build WriterBuilder .json(CarbonSchma) .withXXX(…) .build |
+1 for the proposal to clear SDK APIs.
Thanks Ajantha for initiating the code changes. For schema input for writer creation, I also feel we should unify to all writer creation methods to Builder. API looks cleaner if we provide just build() without out any more arguments. "withTableProperties(Map<String, String> options) vs sortBy(..),withBlockSize(...),etc" ----- I think both of these methods can serve for different purposes. withTableProperties(Map<String, String> options) can be used by customer apps which takes property input directly by end users who is familiar with carbon create table syntax. Individual methods can be used by customers app code to avoid problems like wrong spells or wrong names. "public CarbonWriterBuilder isTransactionalTable(boolean isTransactionalTable)" -- I think we can remove if we are not clear on the usecase at this moment and to avoid confusions -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
-
Kanaka |
@xuchuanyin:
yes, method signatures will be like you specified. @kanaka: I still think we should keep only table properties Map as we validate "wrong_spells and names". More options will create more confusion. So, just keeping table properties Map can simplify configurations. End user can form a map and pass. Just like existing withLoadOptions map Any other suggestions are welcome Thanks, AB . On Thu, Sep 20, 2018 at 10:55 PM kanaka <[hidden email]> wrote: > +1 for the proposal to clear SDK APIs. > Thanks Ajantha for initiating the code changes. > > For schema input for writer creation, I also feel we should unify to all > writer creation methods to Builder. API looks cleaner if we provide just > build() without out any more arguments. > > > "withTableProperties(Map<String, String> options) vs > sortBy(..),withBlockSize(...),etc" > ----- I think both of these methods can serve for different purposes. > withTableProperties(Map<String, String> options) can be used by customer > apps which takes property input directly by end users who is familiar with > carbon create table syntax. > Individual methods can be used by customers app code to avoid problems like > wrong spells or wrong names. > > "public CarbonWriterBuilder isTransactionalTable(boolean > isTransactionalTable)" > -- I think we can remove if we are not clear on the usecase at this moment > and to avoid confusions > > > > > > > -- > Sent from: > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ > |
Also now we that we support Hadoop conf,
we don't require below API. we can remove them from CarbonWriterBuilder. *setAccessKeysetAccessKeysetSecretKeysetSecretKeysetEndPointsetEndPoint* Thanks, AB On Thu, Sep 20, 2018 at 11:16 PM Ajantha Bhat <[hidden email]> wrote: > > @xuchuanyin: > > yes, method signatures will be like you specified. > > @kanaka: I still think we should keep only table properties Map as we > validate "wrong_spells and names". More options will create more confusion. > So, just keeping table properties Map can simplify configurations. End > user can form a map and pass. Just like existing withLoadOptions map > > Any other suggestions are welcome > > Thanks, > AB > . > > On Thu, Sep 20, 2018 at 10:55 PM kanaka <[hidden email]> > wrote: > >> +1 for the proposal to clear SDK APIs. >> Thanks Ajantha for initiating the code changes. >> >> For schema input for writer creation, I also feel we should unify to all >> writer creation methods to Builder. API looks cleaner if we provide just >> build() without out any more arguments. >> >> >> "withTableProperties(Map<String, String> options) vs >> sortBy(..),withBlockSize(...),etc" >> ----- I think both of these methods can serve for different purposes. >> withTableProperties(Map<String, String> options) can be used by customer >> apps which takes property input directly by end users who is familiar with >> carbon create table syntax. >> Individual methods can be used by customers app code to avoid problems >> like >> wrong spells or wrong names. >> >> "public CarbonWriterBuilder isTransactionalTable(boolean >> isTransactionalTable)" >> -- I think we can remove if we are not clear on the usecase at this moment >> and to avoid confusions >> >> >> >> >> >> >> -- >> Sent from: >> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ >> > |
+1
I feel it is better to remove the transactional flag from sdk API as it is redundant currently. we better support it in a better way in future. -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/ |
Free forum by Nabble | Edit this page |