[jira] [Commented] (CARBONDATA-276) Add trim option

Akash R Nilugal (Jira)

[ https://issues.apache.org/jira/browse/CARBONDATA-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568332#comment-15568332 ]

ASF GitHub Bot commented on CARBONDATA-276:
-------------------------------------------

Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/200#discussion_r82977592

--- Diff: processing/src/main/java/org/apache/carbondata/processing/csvreaderstep/UnivocityCsvParser.java ---
@@ -102,8 +102,8 @@ public void initialize() throws IOException {
parserSettings.setMaxColumns(
getMaxColumnsForParsing(csvParserVo.getNumberOfColumns(), csvParserVo.getMaxColumns()));
parserSettings.setNullValue("");
- parserSettings.setIgnoreLeadingWhitespaces(false);
- parserSettings.setIgnoreTrailingWhitespaces(false);
+ parserSettings.setIgnoreLeadingWhitespaces(csvParserVo.getTrim());
--- End diff --

One consequence of this approach: suppose a user loads dirty data in one load, then realizes it needs trimming and enables the option for the next load. The dictionary will then hold both the untrimmed and the trimmed values, which increases the dictionary space, and the dictionary lookup overhead at query time will increase as well.
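
To make the concern concrete, here is a minimal sketch of the dictionary growing when the same dirty values are loaded once without trim and once with it. `DictionaryGrowthSketch`, `load` and `size` are hypothetical names for illustration only; this is a plain map standing in for a column dictionary, not CarbonData's actual dictionary implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DictionaryGrowthSketch {
    // Hypothetical stand-in for a column dictionary: distinct value -> surrogate key.
    private final Map<String, Integer> dictionary = new LinkedHashMap<>();

    // Adds every value of one load to the dictionary, trimming first if requested.
    public void load(List<String> values, boolean trim) {
        for (String value : values) {
            String key = trim ? value.trim() : value;
            dictionary.putIfAbsent(key, dictionary.size() + 1);
        }
    }

    public int size() {
        return dictionary.size();
    }

    public static void main(String[] args) {
        List<String> dirty = Arrays.asList(" apple", "banana ", "cherry");
        DictionaryGrowthSketch dict = new DictionaryGrowthSketch();

        dict.load(dirty, false); // first load, trim disabled
        System.out.println(dict.size()); // 3

        dict.load(dirty, true);  // second load, trim enabled
        System.out.println(dict.size()); // 5: "apple" and "banana" are new entries
    }
}
```

In this sketch only "cherry" is shared between the two loads, so the values that carried whitespace end up in the dictionary twice.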

> Add trim option
> ---------------
>
> Key: CARBONDATA-276
> URL: https://issues.apache.org/jira/browse/CARBONDATA-276
> Project: CarbonData
> Issue Type: Bug
> Reporter: Lionx
> Assignee: Lionx
> Priority: Minor
>
> Fix a bug and add trim option.
> Bug: When a string contains leading or trailing whitespace, the query result is null. This is because the dictionary ignores the leading and trailing whitespace and the csvInput does not.
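
The mismatch described in the issue can be sketched roughly as follows, assuming a dictionary that trims values when they are added and a CSV reader that passes fields through unchanged. `TrimMismatchSketch`, `buildTrimmedDictionary` and `lookup` are hypothetical names; the sketch uses a plain `HashMap` and does not reproduce CarbonData's real load path.

```java
import java.util.HashMap;
import java.util.Map;

public class TrimMismatchSketch {
    // Hypothetical dictionary that strips leading/trailing whitespace on insert.
    static Map<String, Integer> buildTrimmedDictionary(String... values) {
        Map<String, Integer> dict = new HashMap<>();
        for (String value : values) {
            dict.putIfAbsent(value.trim(), dict.size() + 1);
        }
        return dict;
    }

    // Looks up a CSV field; the field is trimmed first only when trim is enabled.
    static Integer lookup(Map<String, Integer> dict, String csvField, boolean trim) {
        return dict.get(trim ? csvField.trim() : csvField);
    }

    public static void main(String[] args) {
        Map<String, Integer> dict = buildTrimmedDictionary(" abc ");

        // Untrimmed CSV field does not match the trimmed dictionary entry.
        System.out.println(lookup(dict, " abc ", false)); // null

        // With trim applied on the CSV side, both sides agree.
        System.out.println(lookup(dict, " abc ", true));  // 1
    }
}
```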

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)