Friday, February 5, 2010

Developing a Reporting Solution with Hyperion Essbase – Part1

Identifying the need for Essbase

This is the first part in the series - Developing a Reporting Solution with Hyperion Essbase - which considers an ERP system, PSGL, as the source.
We will discuss the issues and solution design approaches in all the phases of developing an Essbase application.

We could do all our reporting from the ERP system, so why should we use Essbase to do it?
The answer comes down to speed, performance, and complicated business logic processing, plus the need for multiple users to generate reports simultaneously and quickly.

How can we expect speed and performance from Hyperion Essbase?
Essbase is an OLAP server that handles all data storage, caching, calculations, and data security. It supports drilling down, drilling up, and slicing and dicing.
The key strengths of Essbase are fast reporting and analysis, a powerful calculation engine for complicated business logic, data organized in dimensions (a.k.a. the business view), and write-back ability.
Essbase is a blank sheet of paper: it can be designed and tailored for any business, any industry, any application.

To explain OLAP - On-Line Analytical Processing, we will first look into OLTP.
This stands for On-Line Transaction Processing. ERP (Enterprise Resource Planning) systems, such as Oracle E-Business Suite, SAP, and PeopleSoft, are considered to be OLTP systems. That is, they are designed and optimized primarily for transaction processing.

OLAP is a category of applications and technology for collecting, processing and presenting multidimensional data for analysis and management purposes.
On-Line Analytical Processing (OLAP) differs from On-Line Transaction Processing (OLTP) in that its primary function is to perform analytical processing on the transactions recorded in the OLTP Enterprise Resource Planning system.
Although ERP systems provide a reporting capability, there are significant advantages to using an OLAP system for analyzing and reporting on the transactional data.

In short, the functional requirements for OLAP are as follows:
-> Rich dimensional structuring with hierarchical referencing
-> Efficient specification of dimensions and dimensional calculations
-> Separation of structure and representation
-> Flexibility
-> Sufficient speed to support ad hoc analysis
-> Complex business logic processing
-> Multi-user support
-> Robust security

System Speed and Performance
Consider a typical General Ledger, which contains the raw data for most management and statutory reporting. Usually, ERP systems store the individual transactions (i.e. GL journals and postings from sub-modules such as Accounts Payable, Purchasing, Fixed Assets, and Inventory), but they do not store the account balances.


When you run a report which requires account balances, these are calculated by adding up the individual transactions for the period(s) in question. Although this is usually a task undertaken by the underlying relational database, and is relatively efficient, this still places a considerable load on the system.
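To make that load concrete, here is a minimal Python sketch (the journal lines and account codes are made up) of what the relational engine effectively does every time such a report runs: scan and sum the individual transactions for the requested period.

```python
from collections import defaultdict

# Hypothetical journal lines: (account code, period, amount)
journal_lines = [
    ("4000-COGS-PURCH", "2010-01", 1250.00),
    ("4000-COGS-PURCH", "2010-01", 310.50),
    ("4010-COGS-STOCK", "2010-01", -75.25),
    ("4000-COGS-PURCH", "2010-02", 980.00),
]

def balances_for_period(lines, period):
    """Derive account balances by summing every transaction for the period.

    Because the ERP stores no balances, this aggregation is repeated for
    every report request, which is the load described above.
    """
    totals = defaultdict(float)
    for account, line_period, amount in lines:
        if line_period == period:
            totals[account] += amount
    return dict(totals)

print(balances_for_period(journal_lines, "2010-01"))
# {'4000-COGS-PURCH': 1560.5, '4010-COGS-STOCK': -75.25}
```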

This might be acceptable outside of normal working hours (assuming that the system is not accessed 24/7 from multiple time zones) but if many such reports are run during the working day, performance can deteriorate for other users performing the tasks for which the ERP system is designed and optimised, namely entering transactions.

The relational database on which the ERP system sits is not designed to 'understand' the tree hierarchies for each segment of our GL accounts.

The report user typically understands that a number such as Cost of Goods is made up of a number of separate elements, such as Purchases, Stock Movement, Carriage and Settlement Discounts. These elements may also each be represented by a number of separate GL accounts.
If the user needs to analyse the detail behind one of these numbers on a report (i.e. drill-down), it is usually necessary to run a completely separate report to get the detail, placing a repeated load on the database and further reducing system performance.
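As a rough illustration, the sketch below models such a hierarchy in Python. The grouping names and account codes are hypothetical; the point is that the rollup tree lives in the reporting layer rather than in the relational schema, so every drill-down means expanding the tree and querying the database again.

```python
# Hypothetical rollup tree: which GL accounts make up each reporting line.
rollup = {
    "Cost of Goods": ["Purchases", "Stock Movement",
                      "Carriage", "Settlement Discounts"],
    "Purchases": ["4000", "4001"],            # individual GL accounts
    "Stock Movement": ["4010"],
    "Carriage": ["4020"],
    "Settlement Discounts": ["4030"],
}

def leaf_accounts(member):
    """Resolve a reporting line down to the GL accounts behind it."""
    children = rollup.get(member)
    if children is None:                      # already a GL account code
        return [member]
    accounts = []
    for child in children:
        accounts.extend(leaf_accounts(child))
    return accounts

print(leaf_accounts("Cost of Goods"))
# ['4000', '4001', '4010', '4020', '4030'] -> each needs another balance query
```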

Apart from the system performance problems with this approach, the production of the report itself can often be a time-consuming process. Certainly the generation of the numbers is not likely to be up to the speed at which the user can request and analyze them.

Considering Essbase as a Solution
Typically, data is extracted from the General Ledger (or other modules) either every night, or perhaps just at Period End, depending on the requirement.

The account balances (not the individual transactions) are loaded into the Essbase cube, which not only knows the account codes but also contains the groupings or consolidation levels that provide the higher-level 'roll-ups', such as the Cost of Goods example we discussed earlier. Not only are the individual account balances stored; the system also calculates and stores the rolled-up numbers.

It is now possible for the report user to interrogate the OLAP system directly. He or she might start with a high-level report showing, for example, a complete Profit and Loss account. The report is generated almost instantaneously as all the required numbers are immediately available from the database without additional calculation.

If further detail is required on a number such as Cost of Goods, the user can drill-down on the number and get an immediate response showing the individual balances which make up the rolled-up number.
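The following toy Python sketch (with made-up members and balances, not the Essbase API) captures the idea: once the rolled-up numbers are stored alongside the leaf balances, both the top-level report and the drill-down become simple lookups rather than fresh aggregations.

```python
# Toy stand-in for the cube after the batch load and calculation: leaf
# balances and their rolled-up parents are both already stored.
cube = {
    ("Cost of Goods", "Jan"): 15400.0,        # stored rollup
    ("Purchases", "Jan"): 12100.0,            # stored leaf balances
    ("Stock Movement", "Jan"): 1800.0,
    ("Carriage", "Jan"): 900.0,
    ("Settlement Discounts", "Jan"): 600.0,
}

children = {"Cost of Goods": ["Purchases", "Stock Movement",
                              "Carriage", "Settlement Discounts"]}

def report(member, period):
    return cube[(member, period)]             # instant: no summing at query time

def drill_down(member, period):
    return {c: cube[(c, period)] for c in children[member]}

print(report("Cost of Goods", "Jan"))         # 15400.0
print(drill_down("Cost of Goods", "Jan"))
```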

Lastly, the extra load on the ERP relational database caused by ad-hoc report generation is entirely removed and replaced by a single extraction routine which can be controlled centrally to ensure that it is run at an appropriate time to minimize user inconvenience.

Conclusion
The design of the OLAP reporting systems and the interfaces between them and the ERP systems are critically important - effort in this area should not be underestimated.
It is also sensible to integrate the thought processes behind OLAP system design, reporting requirements and GL Chart-Of-Accounts and data conversion at the earliest possible stage of an implementation project, particularly if the OLAP tools are to be used for Budgeting and Forecasting in addition to their more traditional reporting roles.

Saturday, January 30, 2010

Data Block: The Building Block of Essbase

The basic unit of storage in Essbase is the “data block”. Blocks contain the actual data of your database and are stored in page files (.pag file type) located on one or more server drives.
Essbase loads, calculates and retrieves data by data block.
This is key! Knowing this basic fact will enable you to think through optimization of data loading, calculation scripts and retrievals.

The data block and index explained:
The creators of Arbor Essbase enabled dimensions to be tagged, or defined, as either dense or sparse. When a dimension is tagged as dense, it becomes part of the storage structure called the data block.

Consider an Essbase cube with the Dimensionality
DIM (160, 19, 3, 30, 200, 10000)

Tagging some dimensions dense creates the block
DIM (160d, 19d, 3d, 30s, 200s, 10000s)

Every data block that is created in the database has an identical structure.
In this example, it contains precisely 160 * 19 * 3 = 9,120 cells, or intersection points.
Data block size in bytes would be 9,120 * 8 = 72,960. This is the fully expanded block size.
Physical block size calculation: it is the product of the stored members of the dense dimensions * 8 bytes.
All data blocks are stored on disk within the ESS*.PAG files.
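The arithmetic above can be written out as a small Python sketch, assuming every member of the three dense dimensions is a stored member and each cell occupies 8 bytes.

```python
# Block-size arithmetic for the example cube: only the dense dimensions
# determine the size of a block.
dense_member_counts = [160, 19, 3]       # dense dimensions
cells_per_block = 1
for count in dense_member_counts:
    cells_per_block *= count

expanded_block_bytes = cells_per_block * 8

print(cells_per_block)         # 9120 cells (intersection points) per block
print(expanded_block_bytes)    # 72960 bytes: the fully expanded block size
```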

Addressing, or locating, blocks of data is provided by means of sparse member combinations.
These combinations become part of the storage structure called the index and are stored on disk within the ESS*.IND files.

The data block is a fixed-format data structure whose existence is driven by data-relevant sparse member combinations in the index. By data-relevant we mean that a data block is generated only where business data actually exists for a sparse member combination.
For example, if we do not have any business for the product P00 in January in the country Jordan, we do not reserve any space in our cube for those intersection coordinates.
The small subcomponents of the cube (the data block and its index address) are quite readily moved between disk and working memory. These structures mesh very well with the general user requirement of only being interested in sub-sets of information from the cube at any one point in time.
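A toy Python model of this behaviour (the member names are made up, and the real index and block structures are binary, not dictionaries) shows why empty sparse combinations cost nothing: a block only comes into existence when data arrives for its sparse combination.

```python
# Toy index: one entry per sparse combination that actually holds data,
# each pointing at its block. Combinations with no business data, such as
# ("P00", "Jordan"), simply never appear, so they consume no space.
index = {}   # sparse member combination -> data block

def load_cell(product, country, dense_coords, value):
    key = (product, country)                 # assumed sparse combination
    block = index.get(key)
    if block is None:                        # first data for this combination:
        block = {}                           # the block is created on demand
        index[key] = block
    block[dense_coords] = value              # dense coordinates address a cell in the block

load_cell("P10", "USA", ("Jan", "Sales", "Actual"), 500.0)
load_cell("P10", "USA", ("Feb", "Sales", "Actual"), 620.0)

print(len(index))   # 1 -> only one block exists; ("P00", "Jordan") was never created
```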

Block creation explored:
The above Essbase cube contains 6 dimensions with the following configuration:
• dense dimension #1 containing 160 members
• dense dimension #2 containing 19 members
• dense dimension #3 containing 3 members
• sparse dimension #1 containing 30 members
• sparse dimension #2 containing 200 members
• sparse dimension #3 containing 10000 members

Unique combinations of the sparse dimension members form the index.
Data blocks are actually created only for the unique sparse combinations that contain data.

Essbase Cube Explosion:
The three defining characteristics of an Essbase cube's storage structure are:
1. The number of dimensions
2. The number of members within each dimension
3. The hierarchical relationship of the members within each dimension
Data explosion can occur across each characteristic individually, and when they change together the impact is combined (that is, Cartesian).
If we increase sparse dimension #1 by 60 more members (from 30 to 90), the number of potential intersection points is tripled.
Adding a completely new dimension multiplies the number of potential intersections yet again.
We load the data at the leaf level in Essbase, and all higher hierarchical levels form the consolidation levels. The more consolidation levels there are, the greater the block explosion.
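The potential-block arithmetic for the example cube can be sketched in a few lines of Python, using the member counts given above.

```python
from math import prod

# Potential blocks are bounded by the Cartesian product of the sparse
# member counts (dense dimensions do not add blocks, only cells per block).
sparse_counts = [30, 200, 10000]
print(prod(sparse_counts))                  # 60,000,000 potential blocks

# Growing sparse dimension #1 by 60 members (30 -> 90) triples the potential:
print(prod([90, 200, 10000]))               # 180,000,000

# Adding a brand-new 50-member sparse dimension multiplies it again:
print(prod([30, 200, 10000, 50]))           # 3,000,000,000
```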

The first rule of design in Essbase modeling is to minimize the number of dimensions in a database design.

While minimizing dimensionality is our first guiding principle, there is a dynamic tension between it and the reporting needs of the enterprise. The point is that the possibility of block explosion needs to be bounded in such a way as to produce intersections that are actually used for business analysis.

Guidelines:
a) The final criterion for dense/sparse settings is the combination of settings that produces the lowest overall calculation time with acceptable storage configurations.
b) Tips to reduce the block size: change a dense dimension to a sparse dimension, or make members in a dense dimension non-stored by tagging them Label Only or Dynamic Calc.
c) The goal of the Essbase designer is twofold: create as few blocks as possible, and create blocks that are as densely populated as possible (a rough way to measure both is sketched below).
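As a rough illustration of guideline (c), the sketch below computes two simple health measures for the example cube. The block and cell counts that would normally come from database statistics are hypothetical figures, used only to show the arithmetic.

```python
# Two rough health measures for the example cube, using the member counts
# from above and hypothetical figures for the runtime statistics.
potential_blocks = 30 * 200 * 10000      # product of the sparse member counts
existing_blocks = 450_000                # hypothetical: actual block count from statistics
cells_per_block = 160 * 19 * 3           # product of the dense member counts
avg_filled_cells = 1_100                 # hypothetical: average filled cells per block

block_ratio = existing_blocks / potential_blocks      # fewer blocks is better
block_density = avg_filled_cells / cells_per_block    # denser blocks are better

print(f"{block_ratio:.2%} of potential blocks exist")   # 0.75%
print(f"average block density: {block_density:.1%}")    # 12.1%
```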

Note:
a) Data blocks contain Cells: A cell is created for every intersection of stored members of the dimensions that are set as dense in the Data Storage dialog.
b) Stored members are those defined in the outline with storage types: Store Data, Dynamic Calc and Store, and Never Share.
c) All other storage types (i.e. Dynamic Calc, Label Only and Shared Member) are not stored members and therefore do not take up space (i.e. cells) within a data block.
d) Each cell, as defined by the stored dense members, takes up 8 bytes of storage space on disk (before compression) and in memory.
e) The size of a data block can be calculated: it is the product of the stored members of the dense dimensions * 8 bytes (see the sketch below).
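As a sketch of note (e), the following snippet counts only the stored members of some entirely hypothetical dense dimensions and derives the block size from them; tagging dense members Dynamic Calc or Label Only shrinks every block in the database.

```python
# Only members whose storage type makes them "stored" count toward the
# block. The outline below is hypothetical and exists only for illustration.
STORED_TYPES = {"Store Data", "Dynamic Calc and Store", "Never Share"}

dense_dims = {
    "Accounts": {"Sales": "Store Data", "COGS": "Store Data",
                 "Margin": "Dynamic Calc", "Accounts": "Label Only"},
    "Time":     {"Jan": "Store Data", "Feb": "Store Data",
                 "Q1": "Dynamic Calc and Store"},
    "Scenario": {"Actual": "Store Data", "Budget": "Store Data"},
}

block_cells = 1
for members in dense_dims.values():
    stored = [name for name, storage in members.items() if storage in STORED_TYPES]
    block_cells *= len(stored)

print(block_cells, "cells,", block_cells * 8, "bytes per block")
# 2 * 3 * 2 = 12 cells -> 96 bytes in this toy outline
```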

Friday, January 29, 2010

Essbase Practice - Blog Introduction

Dear everyone,

I have been practicing Essbase for more than five years.
I have worked with different customers and across different business processes.

The purpose of this blog is to post a series of articles that will discuss the various issues, solutions, design approaches, practices and the standards followed in the Essbase development.

This is the kind of resource I always googled for in my early career, one that would share Essbase information.

Any article on recommended practices of any kind, it seems, should not require a lengthy explanation or argument in favor of the corresponding technology.
It should concentrate on providing the reader with guidelines on how best to use the technology, how to avoid misinterpreting its functionality, how to prioritize implementation goals to get a high cost/benefit ratio, and so on.
As an author, I assume that the reader has sufficient understanding of the underlying fundamental principles to be able to visualize the raw power of this technology.


Thanks & Regards,
Srinivas Bobbala