[jira] [Updated] (CARBONDATA-3519) A new column page MemoryBlock is allocated at each row addition to table page if having string column with local dictionary enabled.

Akash R Nilugal (Jira)

     [ https://issues.apache.org/jira/browse/CARBONDATA-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venugopal Reddy K updated CARBONDATA-3519:
------------------------------------------
    Description:
*Context:*

For a string column with local dictionary enabled, a column page of `{{UnsafeFixLengthColumnPage}}` with datatype `{{DataTypes.BYTE_ARRAY}}` is created for `{{encodedPage}}`, along with the regular `{{actualPage}}` of `{{UnsafeVarLengthColumnPage}}`.

`{{UnsafeFixLengthColumnPage}}` has a `*{{capacity}}*` field that indicates the capacity of the `{{memoryBlock}}` allocated for the page. The `{{ensureMemory()}}` method is called while adding rows to check whether `{{totalLength + requestSize > capacity}}`. If there is no room for the next row, it allocates a new memoryBlock, copies the old content (previous rows) into it, and frees the old memoryBlock.

 

*Issues:*
 # When `{{UnsafeFixLengthColumnPage}}` with datatype `{{DataTypes.BYTE_ARRAY}}` is created for `{{encodedPage}}`, the *`{{capacity}}`* field is not assigned the size of the allocated memory block. Hence, for each row added to the tablePage, the `{{ensureMemory()}}` check always finds insufficient room: a new column page memoryBlock is allocated, the old content (previous rows) is copied, and the old memoryBlock is freed. This allocate-and-free cycle happens at every row addition for string columns with local dictionary enabled.
 # In `VarLengthColumnPageBase`, we have a `rowOffset` column page of type `UnsafeFixLengthColumnPage` to maintain the offset to each row of variable-length columns. This `rowOffset` page is 
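
The effect described in the first issue can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not CarbonData code: the class `ColumnPageSketch`, the `trackCapacity` flag, the 64-byte initial block, the 4-byte rows, and the doubling growth policy are all hypothetical, and a plain `byte[]` stands in for the unsafe `memoryBlock`. It assumes that an unassigned `capacity` stays at 0 for the life of the page, which is what "for each row" in the description implies.

```java
/**
 * Simplified, hypothetical model of a fixed-length column page.
 * A byte[] stands in for the off-heap memoryBlock; this is not CarbonData code.
 */
final class ColumnPageSketch {
    private byte[] block;              // stand-in for memoryBlock
    private int capacity;              // bytes the page believes it can hold
    private int totalLength;           // bytes written so far
    private int reallocations;         // how often ensureMemory() had to grow the block
    private final boolean trackCapacity;

    ColumnPageSketch(int initialSize, boolean trackCapacity) {
        this.block = new byte[initialSize];
        this.trackCapacity = trackCapacity;
        // Assumption: a capacity field that is never assigned keeps the default value 0.
        this.capacity = trackCapacity ? initialSize : 0;
    }

    private void ensureMemory(int requestSize) {
        if (totalLength + requestSize > capacity) {
            // Assumed growth policy: double the known capacity, or fit the request.
            int newSize = Math.max(2 * capacity, totalLength + requestSize);
            byte[] newBlock = new byte[newSize];
            // Copy the rows written so far, then drop the old block.
            System.arraycopy(block, 0, newBlock, 0, totalLength);
            block = newBlock;
            reallocations++;
            if (trackCapacity) {
                capacity = newSize;
            }
        }
    }

    void addRow(byte[] row) {
        ensureMemory(row.length);
        System.arraycopy(row, 0, block, totalLength, row.length);
        totalLength += row.length;
    }

    /** Adds the given number of 4-byte rows and reports how many reallocations occurred. */
    static int countReallocations(boolean trackCapacity, int rows) {
        ColumnPageSketch page = new ColumnPageSketch(64, trackCapacity);
        byte[] row = new byte[4];
        for (int i = 0; i < rows; i++) {
            page.addRow(row);
        }
        return page.reallocations;
    }

    public static void main(String[] args) {
        System.out.println("capacity unset:   " + countReallocations(false, 1000));
        System.out.println("capacity tracked: " + countReallocations(true, 1000));
    }
}
```

With 1000 rows, the model reallocates once per row when `capacity` is left unset, and only 6 times when it is kept in sync with the block size. The absolute numbers depend on the assumed sizes and growth policy; the point is the per-row allocate, copy, and free cycle.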



> A new column page MemoryBlock is allocated at each row addition to table page if having string column with local dictionary enabled.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3519
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3519
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: core
>            Reporter: Venugopal Reddy K
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)