Saturday, February 25, 2012

Can anyone explain this processing schedule?

Hi,

I have a fairly complicated cube which has:

74 Measure groups

Each measure group contains 12 partitions

= 888 partitions in my cube

I have a parameterized XMLA processing script that processes 2 partitions in each measure group - i.e. 148 partitions. It does this inside a <Parallel></Parallel> element.
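For illustration, the generated batch has roughly the following shape (the object IDs below are placeholders rather than my real object names, and ProcessFull is just an example process type):

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <!-- Two partitions from the first measure group -->
    <Process>
      <Object>
        <DatabaseID>MyDatabase</DatabaseID>
        <CubeID>MyCube</CubeID>
        <MeasureGroupID>MeasureGroup01</MeasureGroupID>
        <PartitionID>MeasureGroup01_Month01</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>
    <Process>
      <Object>
        <DatabaseID>MyDatabase</DatabaseID>
        <CubeID>MyCube</CubeID>
        <MeasureGroupID>MeasureGroup01</MeasureGroupID>
        <PartitionID>MeasureGroup01_Month02</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>
    <!-- ...and so on: two <Process> elements for each of the 74 measure groups, 148 in total -->
  </Parallel>
</Batch>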

When I executed the XMLA script I recorded the progress in Profiler. I saw that, prior to processing the 148 partitions, there were lots and lots of "Progress Report End" EventClass records with a message of:

Finished processing the '<partition name>' partition

There were actually 518 of these records, so whatever it is doing, it is not doing it for every single partition. Note that most of the listed partitions were NOT specified in my XMLA processing script.

I also had a trace running against the underlying source data and could see that none of these events resulted in a query being fired against the source.

So, the question is fairly simple. Why am I seeing all these "Finished processing..." events for partitions that:

    I never asked to be processed, and

    don't result in a query against the source?

Any explanation would be most welcome.

Thanks

Jamie

P.S. The .trc file from tracing AS is 2.14MB. I can provide it if anyone is interested.

It is possible that what you are seeing is Analysis Server, as part of preparing to load new data, removing data from the partitions that are no longer valid.

It does that by internally issuing a ProcessClear command. ProcessClear will generate ProgressReportBegin and ProgressReportEnd events.
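For reference, an explicit ProcessClear against a single partition looks roughly like this in XMLA (the object IDs are placeholders); the server performs the equivalent operation internally:

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>MyDatabase</DatabaseID>
    <CubeID>MyCube</CubeID>
    <MeasureGroupID>MeasureGroup01</MeasureGroupID>
    <PartitionID>MeasureGroup01_Month12</PartitionID>
  </Object>
  <!-- ProcessClear drops the data in the object without reloading it -->
  <Type>ProcessClear</Type>
</Process>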

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

|||

Edward Melomed wrote:

It is possible that what you are seeing is Analysis Server, as part of preparing to load new data, removing data from the partitions that are no longer valid.

It does that by internally issuing a ProcessClear command. ProcessClear will generate ProgressReportBegin and ProgressReportEnd events.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

Thanks Edward, but that doesn't yet explain what I am seeing.


In my scenario all the other partitions are perfectly valid. What circumstances might cause AS to think that the data in those partitions was NOT valid and thus do a ProcessClear? If this is what IS happening then I am very worried.

Regards

|||

What does your processing command list?

Does it list the entire measure group, or does it list individual partitions?

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

|||

Edward Melomed wrote:

What does your processing command list?

Does it list the entire measure group, or does it list individual partitions?

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

Partitions only!

It also does a ProcessUpdate on all my dimensions.
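The dimension part of the script is essentially the following, repeated once per dimension (IDs are placeholders again):

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>MyDatabase</DatabaseID>
    <DimensionID>Customer</DimensionID>
  </Object>
  <!-- ProcessUpdate re-reads the dimension table and applies member changes in place -->
  <Type>ProcessUpdate</Type>
</Process>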

-Jamie

|||

That explains it. ProcessUpdate on a dimension will cause Analysis Server to drop aggregations and indexes from your partitions. These are the events you are seeing.

To learn more about what to expect during processing, you can click the Impact Analysis button in the processing dialog.

The processing dialog sends a kind of "prepare" statement to the server and receives back a rowset with all objects affected by the processing command.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

|||

Edward Melomed wrote:

That explains it. ProcessUpdate on a dimension will cause Analysis Server to drop aggregations and indexes from your partitions. These are the events you are seeing.

To learn more about what to expect during processing, you can click the Impact Analysis button in the processing dialog.

The processing dialog sends a kind of "prepare" statement to the server and receives back a rowset with all objects affected by the processing command.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.


Edward,

I knew we'd get there in the end! :)

That's great service. Thank you very much.

-Jamie

|||

And I've done a small write-up here: https://blogs.conchango.com/jamiethomson/archive/2006/07/26/4272.aspx

-Jamie

|||

If I understand this correctly, we will lose aggregations on the cube when we do a ProcessUpdate against dimensions.

I have the following scenario; what will happen in this case? I loop through a list of 16 dimensions and perform a ProcessUpdate on each by executing a DDL statement. Once I have updated the dimensions, I fully process 3 partitions (Current Fiscal Period, Current Fiscal Period-1, Current Fiscal Period-2) which we expect to receive new/delta fact data.

I have 7 years' worth of historical data loaded into the cube. Aggregations were built based on user reporting requirements.

The question is: will I lose aggregations altogether when I do a ProcessUpdate on dimensions every night?

Thanks

Sutha

|||

Yes, you might lose some of the aggregations.

For more information about this situation, search for the term "flexible aggregations".

In short: ProcessUpdate on a dimension allows a hierarchy member to move from one parent to another. For example, JonD can change city from Seattle to Portland. If you allow such a change in your dimension, Analysis Server needs to drop any aggregation that sums up data across cities.

If you say that a customer cannot move from one city to another, by setting the attribute relationship between the customer attribute and the city attribute to Rigid, Analysis Server will not drop aggregations going across cities. So one way to treat this situation is to set all attribute relationship types to Rigid across all dimensions.
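In the dimension's ASSL definition that looks roughly like this (the attribute names here are only examples):

<Attribute>
  <ID>Customer</ID>
  <Name>Customer</Name>
  <AttributeRelationships>
    <AttributeRelationship>
      <AttributeID>City</AttributeID>
      <Name>City</Name>
      <!-- Rigid tells the server that members cannot move between City values,
           so aggregations across City survive a ProcessUpdate -->
      <RelationshipType>Rigid</RelationshipType>
    </AttributeRelationship>
  </AttributeRelationships>
</Attribute>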

Another solution is to use ProcessAdd for the dimension. This option only adds new members to the dimension and can change some dimension member properties, but it does not move a member from one parent to another, so Analysis Server doesn't need to touch aggregations in this case. To use ProcessAdd you need to make sure you build a view that restricts the source to only the new members.
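A dimension ProcessAdd command looks roughly like this in XMLA (IDs are placeholders), on the assumption that the dimension's source view has been restricted to return only the new members:

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>MyDatabase</DatabaseID>
    <DimensionID>Customer</DimensionID>
  </Object>
  <!-- ProcessAdd only inserts new members, so existing aggregations are left intact -->
  <Type>ProcessAdd</Type>
</Process>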

And your last option is to run a Process Index command across all of your old partitions to make sure all aggregations get built.
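For example, issuing it at the measure-group level (XMLA type ProcessIndexes) rebuilds indexes and aggregations for every partition in that measure group; the IDs below are placeholders:

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>MyDatabase</DatabaseID>
    <CubeID>MyCube</CubeID>
    <MeasureGroupID>MeasureGroup01</MeasureGroupID>
  </Object>
  <!-- Rebuilds bitmap indexes and aggregations from data that is already processed -->
  <Type>ProcessIndexes</Type>
</Process>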

Hope that helps.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

|||

Edward

Thanks. Yes it does help. One quick question though.

In my scenario I have a measure group with multiple billions of rows, but I only process 3 partitions daily. Do I have to "Process Index" just those 3 partitions, or the whole measure group? If I understand you correctly, I need to do a Process Index against the measure group, not just the 3 partitions I have just processed.

Thanks

Sutha
