[Discussion] Refactor dynamic configuration

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[Discussion] Refactor dynamic configuration

xubo245
*Background:*
In CarbonData, there are many configuration in:
   org.apache.carbondata.core.constants.CarbonCommonConstants,
   org.apache.carbondata.core.constants.CarbonV3DataFormatConstants,
   org.apache.carbondata.core.constants.CarbonLoadOptionConstants;
and so on. Which one can be dynamic configure, which one can not be dynamic
configure, it's not clear.
User will confuse when they use the CarbonData configuration. So we should
optimize it.

*Plan:*
Refactor dynamic configuration for carbon:
1. Decide and collect all dynamic configurations which can be SET in
carbondata application like in beeline.

2. For every dynamic configuration, use an annotation to tag them. (re-use
the CarbonProperty annotation but change its name). This annotation should
be used for validation when user invoking SET command.

*Implement:*
1.change CarbonProperty annotation to DynamicConfigurable, add
NonDynamicConfigurable, DefaultValue,Threshold annotation and so on

2.unify default value variable name: XX_XX_DEFAULT
CARBON_MERGE_SORT_READER_THREAD_DEFAULTVALUE
=>CARBON_MERGE_SORT_READER_THREAD_DEFAULT
SORT_SIZE_DEFAULT_VAL                                              
=>SORT_SIZE_DEFAULT
CARBON_TIMESTAMP_DEFAULT_FORMAT                           =>
CARBON_TIMESTAMP_FORMAT_DEFAULT
DEFAULT_COMPRESSOR                                                   =>
COMPRESSOR_DEFAULT

3.optimize the properties: remove some warning, such as:

carbon.query.validate.directqueryondatamap =>
carbon.query.validate.direct.query.on.datamap
carbon.scheduler.minregisteredresourcesratio=>
carbon.scheduler.min.registered.resources.ratio





--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: [Discussion] Refactor dynamic configuration

xuchuanyin
Does the annotations have other effects other than just providing literal
coding contract?

For example we can use these annotations to
1. generate docs
2. restrict some operations (for example some configurations should not
support SET command)
3. limit scope for usage (for example some configurations can only be used
during loading while some can only be used during query)



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: [Discussion] Refactor dynamic configuration

Jacky Li

Hi Xubo,

Since you have modified so many place for this feature, I think instead of adding more annotation, it is better to the CarbonProperty more object-oriented. For example, we can change the CarbonProperty to

class Property<T> {
   String name;
   T value;
   T default;
   String doc;
   boolean dynamicConfigurable;

   static PropertyBuilder<String> buildStringProperty() {…}
}

And provide property builder for it, then you can use it in CarbonCommonConstant to replace current String implementation.
For example

public static final Property CARBON_BAD_RECORDS_ACTION = Property.buildStringProperty().
        .name(“carbon.bad.records.action”)
        .default(“FAIL”)
        .doc(“keep the same description as .md file”)
        .dynamic(true)
        .build()

Then in the core module, we can still use CarbonProperties.getInstance().getProperty(CCC. CARBON_BAD_RECORDS_ACTION) to get its value.


Regards,
Jacky

> 在 2018年10月31日,下午4:07,xuchuanyin <[hidden email]> 写道:
>
> Does the annotations have other effects other than just providing literal
> coding contract?
>
> For example we can use these annotations to
> 1. generate docs
> 2. restrict some operations (for example some configurations should not
> support SET command)
> 3. limit scope for usage (for example some configurations can only be used
> during loading while some can only be used during query)
>
>
> --
> Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>

Reply | Threaded
Open this post in threaded view
|

Re: [Discussion] Refactor dynamic configuration

xubo245
In reply to this post by xuchuanyin
the annotation mainly providing literal explain whether this parameter can be
dynamic configurable. What's more, it will throw exception if add
@CarbonProperty and can't be dynamic configurable. If don't add
@CarbonProperty for parameter, it won't throw exception and also won't take
effect



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: [Discussion] Refactor dynamic configuration

xubo245
In reply to this post by Jacky Li
ok, this is better:

public static final Property CARBON_BAD_RECORDS_ACTION =
Property.buildStringProperty().
        .name(“carbon.bad.records.action”)
        .default(“FAIL”)
        .doc(“keep the same description as .md file”)
        .dynamic(true)
        .build()

I will raise another PR for it. And try to consider other scenario, such as:

/**
 * Below class will be used to keep all the internal property constants,
 * which will be used internally. These property will not be exposed to
users
 */
public interface CarbonCommonConstantsInternal {

  String QUERY_ON_PRE_AGG_STREAMING = "carbon.query.on.preagg.streaming.";

}



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: [Discussion] Refactor dynamic configuration

xubo245
This post was updated on .
In reply to this post by Jacky Li
I rasie a PR for it and write a
demo:https://github.com/apache/carbondata/pull/2914

Please check it.

1. Optimize the String implementation, we can provide property builder for it, then can use it in CarbonCommonConstant to replace current String implementation.
For example:

  public static final Property CSV_READ_BUFFER_SIZE = Property.buildIntProperty()
      .name("carbon.csv.read.buffersize.byte")
      .defaultValue(1048576)
      .minValue(10240)
      .maxValue(10485760)
      .doc("CSV_READ_BUFFER_SIZE, default value is 1mb." +
          "min value for csv read buffer size, 10 kb." +
          "max value for csv read buffer size, 10 mb")
      .build();
2.add interface for Property:
org.apache.carbondata.core.util.CarbonProperties#addProperty(org.apache.carbondata.core.util.Property, java.lang.String)
org.apache.carbondata.core.util.CarbonProperties#addProperty(org.apache.carbondata.core.util.Property)
org.apache.carbondata.core.util.CarbonProperties#getProperty(org.apache.carbondata.core.util.Property, java.lang.String)


If the demo is ok, I will change other properties in this PR



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: [Discussion] Refactor dynamic configuration

ravipesala
Hi,

Please check my views on it.

Basic design should be there is a clear separation between modules. Like
Spark based configurations are no means to presto, So every module can have
there owned conf and constant classes.
1. No need for CarbonProperties and  CarbonCommonConstants.
2. Should have CarbonStaticConf and CarbonRuntimeConf classes to
differentiate between static and runtime configurations.
3. Don't add description as string to it as it consumes lot of memory,
adding as comment should be enough.
4. Create CarbonCoreConstants for keeping all core level constants to it.
5. Every integration layer should have its own constants classes like Spark,
presto and Hive.
6. Even configurations also should be different for each integration layer.
That means spark should have CarbonSparkConf which can extends or wrap
CarbonStaticConf. And same for presto and hive as well.



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/