
data warehousing fundamentals by paulraj ponniah solution manual free download

A data warehouse is an environment, not a product. Data warehousing is the only viable means to resolve the information crisis and to provide strategic information. List four reasons to support this assertion and explain them.

Match the columns: 1. OLTP application produce ad hoc reports explosive growth despite lots of data data cleaned and transformed users go to get information used for decision making environment, not product for day-to-day operations simple, easy to use.

Explain via some examples how exactly technology trends do help. You are the IT Director of a nationwide insurance company. Write a memo to the Executive Vice President explaining the types of opportunities that can be realized with readily available strategic information.

For an airline company, how can strategic information increase the number of frequent flyers? Discuss giving specific details.

You are a Senior Analyst in the IT department of a company manufacturing automobile parts. The marketing VP is complaining about the poor response by IT in providing strategic information. Draft a proposal to him explaining the reasons for the problems and why a data warehouse would be the only viable solution.

As we have seen in the last chapter, the data warehouse is an information delivery system. In this system, you integrate and transform enterprise data into information suitable for strategic decision making. You take all the historic data from the various operational systems, combine this internal data with any relevant data from outside sources, and pull them together. You resolve any conflicts in the way data resides in different systems and transform the integrated data content into a format suitable for providing information to the various classes of users.

Finally, you implement the information delivery methods. In order to set up this information delivery system, you need different components or building blocks. These building blocks are arranged together in the optimal way to serve the intended purpose. They are arranged in a suitable architecture. Before we get into the individual components and their arrangement in the overall architecture, let us first look at some fundamental features of the data warehouse.

Bill Inmon, considered to be the father of data warehousing, provides the following definition: "A data warehouse is a subject-oriented, integrated, nonvolatile, and time-variant collection of data in support of management's decisions." Sean Kelly, another leading data warehousing practitioner, defines the data warehouse in the following way.

The data in the data warehouse is: separate, available.

What about the nature of the data in the data warehouse?

How is this data different from the data in any operational system? Why does it have to be different?

How is the data content in the data warehouse used?

Subject-Oriented Data

In operational systems, we store data by individual applications. In the data sets for an order processing application, we keep the data for that particular application. These data sets provide the data for all the functions for entering orders, checking stock, verifying customers' credit, and assigning the order for shipment. But these data sets contain only the data that is needed for those functions relating to this particular application.

We will have some data sets containing data about individual orders, customers, stock status, and detailed transactions, but all of these are structured around the processing of orders. Similarly, for a banking institution, data sets for a consumer loans application contain data for that particular application.

Data sets for other distinct applications of checking accounts and savings accounts relate to those specific applications. Again, in an insurance company, different data sets support individual applications such as automobile insurance, life insurance, and workers' compensation insurance.

In every industry, data sets are organized around individual applications to support those particular operational systems. These individual data sets have to provide data for the specific applications to perform the specific functions efficiently. Therefore, the data sets for each application need to be organized around that specific application. In striking contrast, in the data warehouse, data is stored by subjects, not by applications. If data is stored by business subjects, what are business subjects?

Business subjects differ from enterprise to enterprise. These are the subjects critical for the enterprise.

For a manufacturing company, sales, shipments, and inventory are critical business subjects. For a retail store, sales at the check-out counter is a critical subject. Figure distinguishes between how data is stored in operational systems and in the data warehouse. In the operational systems shown, data for each application is organized separately by application: order processing, consumer loans, customer billing, accounts receivable, claims processing, and savings accounts.

For example, Claims is a critical business subject for an insurance company. Claims under automobile insurance policies are processed in the Auto Insurance application. Claims data for automobile insurance is organized in that application. Similarly, claims data for workers' compensation insurance is organized in the Workers Comp Insurance application. But in the data warehouse for an insurance company, claims data are organized around the subject of claims and not by individual applications of Auto Insurance and Workers Comp.
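
The contrast between application-oriented and subject-oriented storage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layouts, field names (`policy_no`, `claim_amount`), and values are hypothetical and do not come from any real insurance system.

```python
# Hypothetical per-application data sets, one list per operational application.
auto_insurance_app = [
    {"policy_no": "A-100", "claim_amount": 1200.0},
    {"policy_no": "A-101", "claim_amount": 450.0},
]
workers_comp_app = [
    {"policy_no": "W-200", "claim_amount": 3000.0},
]


def build_claims_subject(sources):
    """Gather claim records from every application into one subject area.

    Each record keeps a note of where it came from, but the resulting
    collection is organized around the subject (claims), not the application.
    """
    claims = []
    for application, records in sources.items():
        for record in records:
            claims.append({"source_application": application, **record})
    return claims


claims_subject = build_claims_subject(
    {"auto_insurance": auto_insurance_app, "workers_comp": workers_comp_app}
)

# A question that cuts across applications is now a single pass over one set.
total_claims = sum(claim["claim_amount"] for claim in claims_subject)
```

With the claims gathered in one place, a question such as "what is the total claim amount across all lines of insurance" no longer requires visiting each application separately.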

In the data warehouse, data is not stored by operational applications, but by business subjects. In a data warehouse, there is no application flavor. The data in a data warehouse cut across applications.

Integrated Data

For proper decision making, you need to pull together all the relevant data from the various applications.

The data in the data warehouse comes from several operational systems. Source data are in different databases, files, and data segments. These are disparate applications, so the operational platforms and operating systems could be different. The file layouts, character code representations, and field naming conventions all could be different. In addition to data from internal operational systems, for many enterprises, data from outside sources is likely to be very important.

Companies such as Metro Mail, A. C. Nielsen, and IRI specialize in providing vital data on a regular basis. Your data warehouse may need data from such sources. This is one more variation in the mix of source data for a data warehouse.

Figure illustrates a simple process of data integration for a banking institution. Here the data fed into the subject area of account in the data warehouse comes from three different operational applications. Even within just three applications, there could be several variations. Naming conventions could be different; attributes for data items could be different.

The account number in the Savings Account application could be eight bytes long, but only six bytes in the Checking Account application. Before the data from various disparate sources can be usefully stored in a data warehouse, you have to remove the inconsistencies. You have to standardize the various data elements and make sure of the meanings of data names in each source application.
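
As a rough illustration of removing such inconsistencies, the sketch below renames fields to one convention and brings account numbers to a common width. Everything here is an assumption made for the example: the field names, the mapping table `FIELD_MAP`, the eight-character standard, and the choice of zero-padding. A real integration would also have to confirm that padded numbers from different applications cannot collide.

```python
# Hypothetical source layouts: each application names and sizes the field differently.
savings_rows = [{"acct_no": "00123456", "bal": 250.0}]      # eight-character key
checking_rows = [{"account_num": "654321", "balance": 75.5}]  # six-character key

# Assumed mapping from each application's field names to the warehouse names.
FIELD_MAP = {
    "savings": {"acct_no": "account_number", "bal": "balance"},
    "checking": {"account_num": "account_number", "balance": "balance"},
}
ACCOUNT_WIDTH = 8  # assumed warehouse standard, not taken from the text


def standardize(application, row):
    """Rename fields to the warehouse convention and pad the account number."""
    renamed = {FIELD_MAP[application][name]: value for name, value in row.items()}
    renamed["account_number"] = renamed["account_number"].zfill(ACCOUNT_WIDTH)
    renamed["source_application"] = application
    return renamed


integrated = [standardize("savings", row) for row in savings_rows]
integrated += [standardize("checking", row) for row in checking_rows]
```

After this step, every record in `integrated` uses the same field names and the same account number length, whichever application it came from.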

Before moving the data into the data warehouse, you have to go through a process of transformation, consolidation, and integration of the source data. Here are some of the items that would need standardization:

- Naming conventions
- Codes
- Data attributes
- Measurements

Time-Variant Data

For an operational system, the stored data contains the current values. In an accounts receivable system, the balance is the current outstanding balance in the customer's account.

In an order entry system, the status of an order is the current status of the order. In a consumer loans application, the balance amount owed by the customer is the current amount.

Of course, we store some past transactions in operational systems, but, essentially, operational systems reflect current information because these systems support day-to-day current operations.

On the other hand, the data in the data warehouse is meant for analysis and decision making. If a user is looking at the buying pattern of a specific customer, the user needs data not only about the current purchase, but on the past purchases as well. When a user wants to find out the reason for the drop in sales in the North East division, the user needs all the sales data for that division over a period extending back in time.

When an analyst in a grocery chain wants to promote two or more products together, that analyst wants sales of the selected products over a number of past quarters. A data warehouse, because of the very nature of its purpose, has to contain historical data, not just current values. Data is stored as snapshots over past and current periods. Every data structure in the data warehouse contains the time element. You will find histor-. This aspect of the data warehouse is quite significant for both the design and the implementation phases.

For example, in a data warehouse containing units of sale, the quantity stored in each file record or table row relates to a specific time element.

Depending on the level of the details in the data warehouse, the sales quantity in a record may relate to a specific date, week, month, or quarter. The time-variant nature of the data in a data warehouse:

- Allows for analysis of the past
- Relates information to the present
- Enables forecasts for the future

Nonvolatile Data

Data extracted from the various operational systems and pertinent data obtained from outside sources are transformed, integrated, and stored in the data warehouse.

The data in the data warehouse is not intended to run the day-to-day business. When you want to process the next order received from a customer, you do not look into the data warehouse to find the current stock status. The operational order entry application is meant for that purpose. In the data warehouse, you keep the extracted stock status data as snapshots over time. You do not update the data warehouse every time you process a single order. Data from the operational systems are moved into the data warehouse at specific intervals.

Depending on the requirements of the business, these data movements take place twice a day, once a day, once a week, or once in two weeks. In fact, in a typical data warehouse, data movements to different data sets may take place at different frequencies. The changes to the attributes of the products may be moved once a week.

Any revisions to geographical setup may be moved once a month. The units of sales may be moved once a day. You plan and schedule the data movements or data loads based on the requirements of your users. As illustrated in Figure , every business transaction does not update the data in the data warehouse.

The business transactions update the operational system databases in real time. We add, change, or delete data from an operational system as each transaction happens but do not usually update the data in the data warehouse. You do not delete the data in the data warehouse in real time. Once the data is captured in the data warehouse, you do not run individual transactions to change the data there.
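
The difference between the two kinds of storage can be sketched as follows. This is a minimal illustration, assuming a single hypothetical product and made-up dates: the operational store is updated in place by every order, while the warehouse only receives dated snapshots at load time and never changes them afterward.

```python
from datetime import date

operational_stock = {"widget": 100}  # operational system: current value only
warehouse_snapshots = []             # data warehouse: append-only history


def process_order(product, quantity):
    """A business transaction updates the operational data immediately."""
    operational_stock[product] -= quantity


def load_snapshot(as_of):
    """A scheduled load copies the current status into the warehouse.

    Existing snapshots are never modified; each load only appends rows.
    """
    for product, on_hand in operational_stock.items():
        warehouse_snapshots.append(
            {"as_of": as_of, "product": product, "on_hand": on_hand}
        )


load_snapshot(date(2024, 1, 1))
process_order("widget", 30)  # no warehouse update here
process_order("widget", 20)  # nor here
load_snapshot(date(2024, 1, 2))
```

After the two loads, the operational store holds only the latest quantity, while the warehouse holds one row per load date, which is what makes analysis over time possible.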

Data updates are commonplace in an operational database; not so in a data warehouse. The data in a data warehouse is not as volatile as the data in an operational database is. The data in a data warehouse is primarily for query and analysis.

Data Granularity

In an operational system, data is usually kept at the lowest level of detail. In a point-of-sale system for a grocery store, the units of sale are captured and stored at the level of units of a product per transaction at the check-out counter.

In an order entry system, the quantity ordered is captured and stored at the level of units of a product per order received from the customer. Whenever you need summary data, you add up the individual transactions.

If you are looking for units of a product ordered this month, you read all the orders entered for the entire month for that product and add up. You do not usually keep summary data in an operational system. When a user queries the data warehouse for analysis, he or she usually starts by looking at summary data. The user may start with total sale units of a product in an entire region.

Then the user may want to look at the breakdown by states in the region. The next step may be the examination of sale units by the next level of individual stores. Frequently, the analysis begins at a high level and moves down to lower levels of detail. In a data warehouse, therefore, you find it efficient to keep data summarized at different levels.

Depending on the query, you can then go to the particular level of detail and satisfy the query. Data granularity in a data warehouse refers to the level of detail. The lower the level of detail, the finer the data granularity. Of course, if you want to keep data in the lowest level of detail, you have to store a lot of data in the data warehouse. You will have to decide on the granularity levels based on the data types and the expected system performance for queries.
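
Keeping data summarized at several levels can be illustrated with a small roll-up. The sketch assumes hypothetical sales rows with `region`, `state`, `store`, and `units` fields; the names and figures are made up for the example.

```python
from collections import defaultdict

# Hypothetical lowest-level rows: units sold per store.
sales = [
    {"region": "East", "state": "NJ", "store": "S1", "units": 5},
    {"region": "East", "state": "NJ", "store": "S2", "units": 3},
    {"region": "East", "state": "NY", "store": "S3", "units": 7},
]


def summarize(rows, level):
    """Total the units at the given level of detail.

    `level` is a tuple of field names; fewer fields means coarser granularity.
    """
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[field] for field in level)
        totals[key] += row["units"]
    return dict(totals)


by_region = summarize(sales, ("region",))                 # coarsest
by_state = summarize(sales, ("region", "state"))          # one level down
by_store = summarize(sales, ("region", "state", "store")) # finest kept here
```

An analysis that starts with the regional total and then drills down to states and stores can be answered from `by_region`, `by_state`, and `by_store` in turn, at the cost of storing each level.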

Figure shows examples of data granularity in a typical data warehouse. Many who are new to this paradigm are confused about these terms. Some authors and vendors use the two terms synonymously.

Some make distinctions that are not clear enough. At this point, it would be worthwhile for us to examine these two terms and take our position. Writing in a leading trade magazine, Bill Inmon stated, "The single most important issue facing the IT manager this year is whether to build the data warehouse first." This statement is true even today. Let us examine this statement and take a stand.

[Figure: Data granularity. Data granularity refers to the level of detail. Depending on the requirements, multiple levels of detail may be present. Many data warehouses have at least dual levels of granularity.]

Before deciding to build a data warehouse for your organization, you need to ask the following basic and fundamental questions and address the relevant issues:

- Top-down or bottom-up approach?
- Enterprise-wide or departmental?
- Which first: data warehouse or data mart?
- Build pilot or go with a full-fledged implementation?
- Dependent or independent data marts?

These are critical issues requiring careful examination and planning. Should you look at the big picture of your organization, take a top-down approach, and build a mammoth data warehouse?

Or, should you adopt a bottom-up approach, look at the individual local and departmental requirements, and build bite-size departmental data marts? Should you build a large data warehouse and then let that repository feed data into local, departmental data marts? On the other hand, should you build individual local data marts, and combine them to form your overall data warehouse? Should these local data marts be independent of one another? Or, should they be dependent on the overall data warehouse for data feed?

Should you build a pilot data mart? These are crucial questions.

How Are They Different?

Let us take a close look at the figure. Here are the two different basic approaches: (1) an overall data warehouse feeding dependent data marts, and (2) several departmental or local data marts.

In the first approach, you extract data from the operational systems; you then transform, clean, integrate, and keep the data in the data warehouse. So, which approach is best in your case, the top-down or the bottom-up approach? Let us examine these two approaches carefully.

Top-Down Versus Bottom-Up Approach

Top-Down Approach

The advantages of this approach are:

- A truly corporate effort, an enterprise view of data
- Inherently architected, not a union of disparate data marts
- Single, central storage of data about the content
- Centralized rules and control
- May see quick results if implemented with iterations

This is the big-picture approach in which you build the overall, big, enterprise-wide data warehouse. Here you do not have a collection of fragmented islands of information. The data warehouse is large and integrated. This approach, however, would take longer to build and has a high risk of failure. If you do not have experienced professionals on your team, this approach could be dangerous. Also, it will be difficult to sell this approach to senior management and sponsors. They are not likely to see results soon enough.

Bottom-Up Approach

The advantages of this approach are:

- Faster and easier implementation of manageable pieces
- Favorable return on investment and proof of concept
- Less risk of failure
- Inherently incremental; can schedule important data marts first
- Allows project team to learn and grow

The disadvantages are:

- Each data mart has its own narrow view of data
- Permeates redundant data in every data mart
- Perpetuates inconsistent and irreconcilable data
- Proliferates unmanageable interfaces

In this bottom-up approach, you build your departmental data marts one by one. You would set a priority scheme to determine which data marts you must build first. The most severe drawback of this approach is data fragmentation. Each independent data mart will be blind to the overall requirements of the entire organization.

A Practical Approach

In order to formulate an approach for your organization, you need to examine what exactly your organization wants.

Is your organization looking for long-term results or fast data marts for only a few subjects for now? Does your organization want quick, proof-of-concept, throw-away implementations? Or, do you want to look into some other practical approach? Although both the top-down and the bottom-up approaches each have their own advantages and drawbacks, a compromise approach accommodating both views appears to be practical.

The chief proponent of this practical approach is Ralph Kimball, an eminent author and data warehouse expert. The steps in this practical approach are as follows:

1. Plan and define requirements at the overall corporate level
2. Create a surrounding architecture for a complete warehouse
3. Conform and standardize the data content
4. Implement the data warehouse as a series of supermarts, one at a time

In this practical approach, you go to the basics and determine what exactly your organization wants in the long term.

The key to this approach is that you first plan at the enterprise level. You gather requirements at the overall level. You establish the architecture for the complete warehouse. Then you determine the data content for each supermart. Supermarts are carefully architected data marts. You implement these supermarts, one at a time. Before implementation, you make sure that the data content among the various supermarts is conformed in terms of data types, field lengths, precision, and semantics.

A certain data element must mean the same thing in every supermart. This will avoid spread of disparate data across several data marts. A data mart, in this practical approach, is a logical subset of the complete data warehouse, a sort of pie-wedge of the whole data warehouse.

A data warehouse, therefore, is a conformed union of all data marts. Individual data marts are targeted to particular business groups in the enterprise, but the collection of all the data marts forms an integrated whole, called the enterprise data warehouse. When we refer to data warehouses and data marts in our discussions here, we use the meanings as understood in this practical approach.
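
The requirement that data content be conformed across supermarts in data types, field lengths, and precision lends itself to a simple automated check. The sketch below is only one possible way to do it, under stated assumptions: each supermart is described by a hypothetical dictionary mapping a column name to a `(type, length, precision)` tuple, and the mart and column names are invented for the example. Semantic conformance, which the text also requires, cannot be verified this way.

```python
# Hypothetical column definitions per supermart: name -> (type, length, precision).
sales_mart = {"customer_id": ("char", 10, 0), "amount": ("decimal", 12, 2)}
shipments_mart = {"customer_id": ("char", 8, 0), "weight": ("decimal", 8, 1)}


def find_nonconformed(marts):
    """Return the columns whose definitions differ between supermarts."""
    definitions = {}
    for mart_name, columns in marts.items():
        for column, definition in columns.items():
            definitions.setdefault(column, {})[mart_name] = definition

    conflicts = {}
    for column, by_mart in definitions.items():
        # More than one distinct definition means the column is not conformed.
        if len(set(by_mart.values())) > 1:
            conflicts[column] = by_mart
    return conflicts


conflicts = find_nonconformed({"sales": sales_mart, "shipments": shipments_mart})
```

Here `customer_id` is reported because its length differs between the two marts; columns that appear in only one mart are not flagged.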

For us, a data warehouse means a collection of the constituent data marts. We have established our position on what the term data warehouse means to us. Now we are ready to examine its components. When we build an operational system such as order entry, claims processing, or savings account, we put together several components to make up the system. The front-end component consists of the GUI (graphical user interface) to interface with the users for data input. The display component is the set of screens and reports for the users.

The data interfaces and the network software form the connectivity component. Depending on the information requirements and the framework of our organization, we arrange these components in the optimal way. Architecture is the proper arrangement of the components. You build a data warehouse with software and hardware components.

To suit the requirements of your organization you arrange these building blocks in a certain way for maximum benefit. You may want to lay special emphasis on one component; you may want to bolster up another component with extra tools and services. All of this depends on your circumstances. Figure shows the basic components of a typical warehouse.

You see the Source Data component shown on the left. The Data Staging component serves as the next building block. In the middle, you see the Data Storage component that manages the data warehouse data. This component not only stores and manages the data, it also keeps track of the data by means of the metadata repository.

The Information Delivery component shown on the right consists of all the different ways of making the information from the data warehouse available to the users.

Whether you build a data warehouse for a large manufacturing company on the Fortune list, a leading grocery chain with stores all over the country, or a global banking institution, the basic components are the same.

Each data warehouse is put together with the same building blocks. The essential difference for each organization is in the way these building blocks are arranged. The variation is in the manner in which some of the blocks are made stronger than others in the architecture. We will now take a closer look at each of the components. At this stage, we want to know what the components are and how each fits into the architecture.

We also want to review specific issues relating to each particular component.

Source Data Component

Source data coming into the data warehouse may be grouped into four broad categories, as discussed here.

Production Data. This category of data comes from the various operational systems of the enterprise. Based on the information requirements in the data warehouse, you choose segments of data from the different operational systems.

While dealing with this data, you come across many variations in the data formats. You also notice that the data resides on different hardware platforms. Further, the data is supported by different database systems and operating systems. This is data from many vertical applications. In operational systems, information queries are narrow. You query an operational system for information about specific instances of business objects.

You may want just the name and address of a single customer. Or, you may need the orders placed by a single customer in a single week. Or, you may just need to look at a single invoice and the items billed on that single invoice. In operational systems, you do not have broad queries. You do not query the operational system in unexpected ways. The queries are all predictable. Again, you do not expect a particular query to run across different operational systems.

What does all of this mean? Simply this: there is no conformance of data among the various operational systems of an enterprise.

A term like an account may have different meanings in different systems. The significant and disturbing characteristic of production data is disparity. Your great challenge is to standardize and transform the disparate data from the various production systems, convert the data, and integrate the pieces into useful data for storage in the data warehouse.

Internal Data. In every organization, users keep their private spreadsheets, documents, customer profiles, and sometimes even departmental databases. This is the internal data, parts of which could be useful in a data warehouse. If your organization does business with the customers on a one-to-one basis and the contribution of each customer to the bottom line is significant, then detailed customer profiles with ample demographics are important in a data warehouse.

Profiles of individual customers become very important for consideration. When your account representatives talk to their assigned customers or when your marketing department wants to make specific offerings to individual customers, you need the details. Although much of this data may be extracted from production systems, a lot of it is held by individuals and departments in their private files. You cannot ignore the internal data held in private files in your organization.

It is a collective judgment call on how much of the internal data should be included in the data warehouse. The IT department must work with the user departments to gather the internal data. Internal data adds additional complexity to the process of transforming and integrating the data before it can be stored in the data warehouse.

You have to determine strategies for collecting data from spreadsheets, find ways of taking data from textual documents, and tie into departmental databases to gather pertinent data from those sources. Again, you may want to schedule the acquisition of internal data. Initially, you may want to limit yourself to only some significant portions before going live with your first data mart.

Archived Data. Operational systems are primarily intended to run the current business. In every operational system, you periodically take the old data and store it in archived files. The circumstances in your organization dictate how often and which portions of the operational databases are archived for storage.

Some data is archived after a year. Sometimes data is left in the operational system databases for as long as five years. Many different methods of archiving exist. There are staged archival methods. At the first stage, recent data is archived to a separate archival database that may still be online.

At the second stage, the older data is archived to flat files on disk storage. At the next stage, the oldest data is archived to tape cartridges or microfilm and even kept off-site.

As mentioned earlier, a data warehouse keeps historical snapshots of data. You essentially need historical data for analysis over time. For getting historical information, you look into your archived data sets.

Depending on your data warehouse requirements, you have to include sufficient historical data. This type of data is useful for discerning patterns and analyzing trends.

External Data. Most executives depend on data from external sources for a high percentage of the information they use. They use statistics relating to their industry produced by external agencies.

They use market share data of competitors. They use standard values of financial indicators for their business to check on their performance. For example, the data warehouse of a car rental company contains data on the current production schedules of the leading automobile manufacturers. This external data in the data warehouse helps the car rental company plan for their fleet management.

The purposes served by such external data sources cannot be fulfilled by the data available within your organization itself. The insights gleaned from your production data and your archived data are somewhat limited. They give you a picture based on what you are doing or have done in the past. In order to spot industry trends and compare performance against other organizations, you need data from external sources. Usually, data from outside sources do not conform to your formats. You have to devise.

You have to organize the data transmissions from the external sources. Some sources may provide information at regular, stipulated intervals. Others may give you the data on request. You need to accommodate the variations.

Data Staging Component

After you have extracted data from various operational systems and from external sources, you have to prepare the data for storing in the data warehouse.

The extracted data coming from several disparate sources needs to be changed, converted, and made ready in a format that is suitable to be stored for querying and analysis. Three major functions need to be performed for getting the data ready.

You have to extract the data, transform the data, and then load the data into the data warehouse storage. These three major functions of extraction, transformation, and preparation for loading take place in a staging area. The data staging component consists of a workbench for these functions.
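
The three staging functions can be sketched as three small Python functions chained together. This is a toy illustration, not a description of any particular tool: the source names, the `customer` field, and the cleaning rule (trim whitespace, normalize capitalization, drop duplicates by name) are all assumptions chosen to keep the example short.

```python
def extract(sources):
    """Pull rows from every source into the staging area, tagging their origin."""
    staged = []
    for source_name, rows in sources.items():
        for row in rows:
            staged.append({"source": source_name, **row})
    return staged


def transform(staged):
    """Clean and deduplicate staged rows (first occurrence of a name wins)."""
    cleaned = {}
    for row in staged:
        name = row["customer"].strip().title()
        if name not in cleaned:
            cleaned[name] = {"customer": name, "source": row["source"]}
    return list(cleaned.values())


def load(rows, warehouse):
    """Append the prepared rows to the warehouse storage."""
    warehouse.extend(rows)
    return len(rows)


# Hypothetical source data with inconsistent formatting and a duplicate.
sources = {
    "orders": [{"customer": " alice smith "}],
    "billing": [{"customer": "Alice Smith"}, {"customer": "bob jones"}],
}
warehouse = []
loaded = load(transform(extract(sources)), warehouse)
```

Three rows are extracted, but only two reach the warehouse, because the two spellings of the same customer are reconciled in the staging step before loading.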

Data staging provides a place and an area with a set of functions to clean, change, combine, convert, deduplicate, and prepare source data for storage and use in the data warehouse. Why do you need a separate place or component to perform the data preparation?

Can you not move the data from the various sources into the data warehouse storage itself and then prepare the data? When we implement an operational system, we are likely to pick up data from different sources, move the data into the new operational system database, and run data conversions. Why can't this method work for a data warehouse? The essential difference here is this: in a data warehouse you pull in data from many source operational systems.

Remember that data in a data warehouse is subject-oriented and cuts across operational applications. A separate staging area, therefore, is a necessity for preparing data for the data warehouse.

Now that we have clarified the need for a separate data staging component, let us understand what happens in data staging. We will now briefly discuss the three major functions that take place in the staging area.

Data Extraction. This function has to deal with numerous data sources. You have to employ the appropriate technique for each data source.

Source data may be from different source machines in diverse data formats. Part of the source data may be in relational database systems. Some data may be on other legacy network and hierarchical data models. Many data sources may still be in flat files. You may want to include data from spreadsheets and local departmental data sets. Data extraction may become quite complex.

Tools are available on the market for data extraction. You may want to consider using outside tools suitable for certain data sources. For the other data sources, you may want to develop in-house programs to do the data extraction.

Purchasing outside tools may entail high initial costs. In-house programs, on the other hand, may mean ongoing costs for development and maintenance.

After you extract the data, where do you keep the data for further preparation? You may perform the extraction function in the legacy platform itself if that approach suits your framework. More frequently, data warehouse implementation teams extract the source data into a separate physical environment, from which the data is then moved into the data warehouse. In the separate environment, you may extract the source data into a group of flat files, or a data-staging relational database, or a combination of both.
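
Extracting source data into flat files in a separate staging location might look like the sketch below, which uses Python's standard `csv` and `tempfile` modules. The file name `orders.csv`, the field names, and the sample rows are hypothetical; a temporary directory stands in for the separate physical environment.

```python
import csv
import tempfile
from pathlib import Path


def extract_to_flat_file(rows, fieldnames, path):
    """Write extracted source rows to a delimited flat file in the staging area."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def read_flat_file(path):
    """Read the staged flat file back for further preparation."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Hypothetical rows pulled from a source system; values kept as text on purpose.
source_rows = [
    {"order_id": "1001", "units": "4"},
    {"order_id": "1002", "units": "9"},
]

# A temporary directory plays the role of the separate staging environment.
with tempfile.TemporaryDirectory(prefix="staging_") as staging_dir:
    orders_file = Path(staging_dir) / "orders.csv"
    written = extract_to_flat_file(source_rows, ["order_id", "units"], orders_file)
    staged_rows = read_flat_file(orders_file)
```

A flat file read back this way yields text values only, so any type conversion is left to the transformation step that follows extraction.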

Data Transformation. In every system implementation, data conversion is an important function. For example, when you implement an operational system such as a magazine subscription application, you have to initially populate your database with data from the prior system records. You may be converting over from a manual system.

Or, you may be moving from a file-oriented system to a modern system supported with relational database tables. In either case, you will convert the data from the prior systems.

Therefore, the data sets for each application need to be organized around that specific application. In striking contrast, in the data warehouse, data is stored by subjects, not by applications.

If data is stored by business subjects, what are business subjects? Business subjects differ from enterprise to enterprise. These are the subjects critical for the enterprise. For a manufacturing company, sales, shipments, and inventory are critical business subjects.

For a retail store, sales at the check-out counter is a critical subject. Figure distinguishes between how data is stored in operational systems and in the data warehouse. In the operational systems shown, data for each application is organized separately by application: order processing, consumer loans, customer billing, accounts receivable, claims processing, and savings accounts.

For example, Claims is a critical business subject for an insurance company. Claims under automobile insurance policies are processed in the Auto Insurance application. Claims data for automobile insurance is organized in that application. In a data warehouse, there is no application flavor. The data in a data warehouse cut across applications.

Integrated Data

For proper decision making, you need to pull together all the relevant data from the various applications.

The data in the data warehouse comes from several operational systems. Source data are in different databases, files, and data segments. These are disparate applications, so the operational platforms and operating systems could be different. The file layouts, character code representations, and field naming conventions all could be different.

In addition to data from internal operational systems, for many enterprises, data from outside sources is likely to be very important. Companies such as Metro Mail, A. C. Nielsen, and IRI specialize in providing vital data on a regular basis. Your data warehouse may need data from such sources. This is one more variation in the mix of source data for a data warehouse. Figure illustrates a simple process of data integration for a banking institution.

Here the data fed into the subject area of account in the data warehouse comes from three different operational applications. Even within just three applications, there could be several variations. Naming conventions could be different; attributes for data items could be different. The account number in the Savings Account application could be eight bytes long, but only six bytes in the Checking Account application. Before the data from various disparate sources can be usefully stored in a data warehouse, you have to remove the inconsistencies.
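To make the account number example concrete, here is a small Python sketch of one way to bring the two layouts to a common form. Only the eight-byte and six-byte lengths come from the discussion above; the field names, the zero-padding rule, and the choice of eight characters as the warehouse width are assumptions made for illustration.

```python
# Hypothetical source layouts: the field names are invented for this sketch.
SOURCE_LAYOUTS = {
    "savings": {"key_field": "acct_no", "width": 8},
    "checking": {"key_field": "account_num", "width": 6},
}
WAREHOUSE_WIDTH = 8  # assumed standard width in the warehouse


def standardize_account(record, source):
    """Map a source record to a common account layout for the warehouse."""
    layout = SOURCE_LAYOUTS[source]
    raw = str(record[layout["key_field"]]).strip()
    if len(raw) > layout["width"]:
        raise ValueError(f"{source} account number too long: {raw!r}")
    return {
        "account_number": raw.zfill(WAREHOUSE_WIDTH),  # left-pad with zeros
        "source_application": source,
    }


print(standardize_account({"acct_no": "12345678"}, "savings"))
print(standardize_account({"account_num": "123456"}, "checking"))
```

The source application is carried along with the padded number because padding alone could make a checking account number collide with an unrelated savings account number.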

You have to standardize the various data elements and make sure of the meanings of data names in each source application. Before moving the data into the data warehouse, you have to go through a process of transformation, consolidation, and integration of the source data.

In an order entry system, the status of an order is the current status of the order. In a consumer loans application, the balance amount owed by the customer is the current amount.

Of course, we store some past transactions in operational systems, but, essentially, operational systems reflect current information because these systems support day-to-day current operations. On the other hand, the data in the data warehouse is meant for analysis and decision making. If a user is looking at the buying pattern of a specific customer, the user needs data not only about the current purchase, but on the past purchases as well. When a user wants to find out the reason for the drop in sales in the North East division, the user needs all the sales data for that division over a period extending back in time.

When an analyst in a grocery chain wants to promote two or more products together, that analyst wants sales of the selected products over a number of past quarters. A data warehouse, because of the very nature of its purpose, has to contain historical data, not just current values. Data is stored as snapshots over past and current periods. Every data structure in the data warehouse contains the time element.

This aspect of the data warehouse is quite significant for both the design and the implementation phases. For example, in a data warehouse containing units of sale, the quantity stored in each file record or table row relates to a specific time element.

Depending on the level of the details in the data warehouse, the sales quantity in a record may relate to a specific date, week, month, or quarter. The data in the data warehouse is not intended to run the day-to-day business.

When you want to process the next order received from a customer, you do not look into the data warehouse to find the current stock status. The operational order entry application is meant for that purpose. In the data warehouse, you keep the extracted stock status data as snapshots over time. You do not update the data warehouse every time you process a single order.
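The contrast between updating in place and keeping snapshots can be sketched in a few lines of Python. The product name, quantities, and dates below are invented for illustration, and plain in-memory structures stand in for the operational database and the warehouse.

```python
from datetime import date

operational_stock = {}    # product -> current quantity only
warehouse_snapshots = []  # (snapshot_date, product, quantity) rows, append-only


def process_order(product, quantity):
    """Operational update: overwrite the current stock in place."""
    operational_stock[product] = operational_stock.get(product, 0) - quantity


def load_snapshot(snapshot_date):
    """Periodic load: copy the current stock status with its time element."""
    for product, quantity in sorted(operational_stock.items()):
        warehouse_snapshots.append((snapshot_date, product, quantity))


operational_stock["widget"] = 100
process_order("widget", 30)
load_snapshot(date(2024, 1, 1))  # hypothetical load dates
process_order("widget", 20)
process_order("widget", 10)
load_snapshot(date(2024, 1, 2))

print(operational_stock)    # only the latest value survives here
print(warehouse_snapshots)  # one row per load, each carrying its date
```

Three orders were processed, but the warehouse holds only two rows, one per load. The operational store remembers nothing but the current quantity, while the warehouse retains the history.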

Data from the operational systems are moved into the data warehouse at specific intervals. Depending on the requirements of the business, these data movements take place twice a day, once a day, once a week, or once in two weeks.

In fact, in a typical data warehouse, data movements to different data sets may take place at different frequencies. The changes to the attributes of the products may be moved once a week.

Any revisions to geographical setup may be moved once a month. The units of sales may be moved once a day. You plan and schedule the data movements or data loads based on the requirements of your users. As illustrated in Figure , not every business transaction updates the data in the data warehouse.
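The idea of different data sets moving at different frequencies can be written down as a simple schedule. The Python sketch below is only an illustration: the data set names are made up, "once a month" is approximated as 30 days, and a real warehouse would normally rely on a job scheduler rather than a hand-rolled check.

```python
from datetime import date

# Load intervals in days, following the frequencies mentioned above.
LOAD_INTERVAL_DAYS = {
    "units_of_sale": 1,          # once a day
    "product_attributes": 7,     # once a week
    "geographical_setup": 30,    # once a month (approximation)
}


def loads_due(last_loaded, today):
    """Return the data sets whose load interval has elapsed."""
    due = []
    for data_set, interval in LOAD_INTERVAL_DAYS.items():
        if (today - last_loaded[data_set]).days >= interval:
            due.append(data_set)
    return due


# Hypothetical state: every data set was last loaded on the same day.
last_loaded = {name: date(2024, 1, 1) for name in LOAD_INTERVAL_DAYS}

print(loads_due(last_loaded, date(2024, 1, 2)))
print(loads_due(last_loaded, date(2024, 1, 8)))
```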

The business transactions update the operational system databases in real time. We add, change, or delete data from an operational system as each transaction happens but do not usually update the data in the data warehouse. You do not delete the data in the data warehouse in real time. Once the data is captured in the data warehouse, you do not run individual transactions to change the data there.

Data updates are commonplace in an operational database; not so in a data warehouse. The data in a data warehouse is not as volatile as the data in an operational database is. The data in a data warehouse is primarily for query and analysis.

Data Granularity

In an operational system, data is usually kept at the lowest level of detail. In a point-of-sale system for a grocery store, the units of sale are captured and stored at the level of units of a product per transaction at the check-out counter. In an order entry system, the quantity ordered is captured and stored at the level of units of a product per order received from the customer. If you are looking for units of a product ordered this month, you read all the orders entered for the entire month for that product and add up.

You do not usually keep summary data in an operational system. When a user queries the data warehouse for analysis, he or she usually starts by looking at summary data. The user may start with total sale units of a product in an entire region. Then the user may want to look at the breakdown by states in the region.

The next step may be the examination of sale units by the next level of individual stores. Frequently, the analysis begins at a high level and moves down to lower levels of detail. In a data warehouse, therefore, you find it efficient to keep data summarized at different levels.

Depending on the query, you can then go to the particular level of detail and satisfy the query. Data granularity in a data warehouse refers to the level of detail. The lower the level of detail, the finer the data granularity. Of course, if you want to keep data in the lowest level of detail, you have to store a lot of data in the data warehouse. You will have to decide on the granularity levels based on the data types and the expected system performance for queries. Figure shows examples of data granularity in a typical data warehouse.
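A short Python sketch may help to show what summarizing at different levels means. The regions, states, stores, and unit counts below are invented sample data, and a real warehouse would typically precompute and store such summaries rather than derive them on each query.

```python
from collections import defaultdict

# Hypothetical detail rows at the finest grain: (region, state, store, units).
SALES = [
    ("Northeast", "NY", "Store 1", 120),
    ("Northeast", "NY", "Store 2", 80),
    ("Northeast", "NJ", "Store 3", 50),
    ("South", "TX", "Store 4", 200),
]
LEVELS = ("region", "state", "store")  # coarsest to finest


def summarize(rows, level):
    """Total the sale units at the requested level of detail."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for row in rows:
        totals[row[:depth]] += row[3]  # key is the leading part of the hierarchy
    return dict(totals)


print(summarize(SALES, "region"))  # start with totals for an entire region
print(summarize(SALES, "state"))   # then the breakdown by states
print(summarize(SALES, "store"))   # then down to individual stores
```

The finer the level requested, the more rows come back. That is the storage and performance trade-off you weigh when you decide on granularity levels.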

Some authors and vendors use the terms data warehouse and data mart synonymously. Some make distinctions that are not clear enough. At this point, it would be worthwhile for us to examine these two terms and take our position. Let us examine this statement and take a stand. These are critical issues requiring careful examination and planning. Should you look at the big picture of your organization, take a top-down approach, and build a mammoth data warehouse? Or, should you adopt a bottom-up approach, look at the individual local and departmental requirements, and build bite-size departmental data marts?

Should you build a large data warehouse and then let that repository feed data into local, departmental data marts? On the other hand, should you build individual local data marts, and combine them to form your overall data warehouse? Should these local data marts be independent of one another? Or, should they be dependent on the overall data warehouse for data feed?

Should you build a pilot data mart? These are crucial questions.

How Are They Different?

Let us take a close look at Figure . Depending on the requirements, multiple levels of detail may be present. Many data warehouses have at least dual levels of granularity. In the first approach, you extract data from the operational systems; you then transform, clean, integrate, and keep the data in the data warehouse.

So, which approach is best in your case, the top-down or the bottom-up approach? Let us examine these two approaches carefully.

Here you do not have a collection of fragmented islands of information. The data warehouse is large and integrated. This approach, however, would take longer to build and has a high risk of failure. If you do not have experienced professionals on your team, this approach could be dangerous.

Also, it will be difficult to sell this approach to senior management and sponsors. They are not likely to see results soon enough. You would set a priority scheme to determine which data marts you must build first. The most severe drawback of this approach is data fragmentation.

Each independent data mart will be blind to the overall requirements of the entire organization.

A Practical Approach

In order to formulate an approach for your organization, you need to examine what exactly your organization wants.

Is your organization looking for long-term results or fast data marts for only a few subjects for now? Does your organization want quick, proof-of-concept, throw-away implementations? Or, do you want to look into some other practical approach? Although both the top-down and the bottom-up approaches each have their own advantages and drawbacks, a compromise approach accommodating both views appears to be practical.

The chief proponent of this practical approach is Ralph Kimball, an eminent author and data warehouse expert. The steps in this practical approach are as follows:

1. Plan and define requirements at the overall corporate level
2. Create a surrounding architecture for a complete warehouse
3. Conform and standardize the data content
4. Implement the data warehouse as a series of supermarts, one at a time

In this practical approach, you go to the basics and determine what exactly your organization wants in the long term.

The key to this approach is that you first plan at the enterprise level. You gather requirements at the overall level. You establish the architecture for the complete warehouse. Then you determine the data content for each supermart. Supermarts are carefully architected data marts.

You implement these supermarts, one at a time. Before implementation, you make sure that the data content among the various supermarts is conformed in terms of data types, field lengths, precision, and semantics.

A certain data element must mean the same thing in every supermart. This will avoid spread of disparate data across several data marts.
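Part of this conformance check lends itself to automation. The Python sketch below compares the definitions of same-named data elements across supermarts and reports any that differ. The supermart names, element names, and the (data type, field length, precision) tuple format are all hypothetical. Note that it can only compare declared types, lengths, and precision; whether an element means the same thing in every supermart still has to be settled by people.

```python
def find_nonconforming(supermarts):
    """Return {element: {supermart: definition}} for elements defined differently."""
    definitions = {}
    for mart, elements in supermarts.items():
        for name, definition in elements.items():
            definitions.setdefault(name, {})[mart] = definition
    return {
        name: by_mart
        for name, by_mart in definitions.items()
        if len(set(by_mart.values())) > 1  # more than one distinct definition
    }


# Hypothetical definitions as (data type, field length, precision).
supermarts = {
    "sales": {"customer_id": ("CHAR", 10, 0), "amount": ("DECIMAL", 12, 2)},
    "shipments": {"customer_id": ("CHAR", 8, 0), "amount": ("DECIMAL", 12, 2)},
}

print(find_nonconforming(supermarts))
```

Here `amount` is conformed, while `customer_id` is flagged because its field length differs between the two supermarts.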

A data mart, in this practical approach, is a logical subset of the complete data warehouse, a sort of pie-wedge of the whole data warehouse.

A data warehouse, therefore, is a conformed union of all data marts. Individual data marts are targeted to particular business groups in the enterprise, but the collection of all the data marts forms an integrated whole, called the enterprise data warehouse.

When we refer to data warehouses and data marts in our discussions here, we use the meanings as understood in this practical approach. For us, a data warehouse means a collection of the constituent data marts.

We have established our position on what the term data warehouse means to us. Now we are ready to examine its components. When we build an operational system such as order entry, claims processing, or savings account, we put together several components to make up the system.

The front-end component consists of the GUI (graphical user interface) to interface with the users for data input. The display component is the set of screens and reports for the users. The data interfaces and the network software form the connectivity component. Depending on the information requirements and the framework of our organization, we arrange these components in the optimal way.

Architecture is the proper arrangement of the components. You build a data warehouse with software and hardware components. To suit the requirements of your organization you arrange these building blocks in a certain way for maximum benefit. You may want to lay special emphasis on one component; you may want to bolster up another component with extra tools and services. All of this depends on your circumstances. Figure shows the basic components of a typical warehouse.

You see the Source Data component shown on the left. The Data Staging component serves as the next building block. In the middle, you see the Data Storage component that manages the data warehouse data. This component not only stores and manages the data, it also keeps track of the data by means of the metadata repository. The Information Delivery component shown on the right consists of all the different ways of making the information from the data warehouse available to the users.

Whether you build a data warehouse for a large manufacturing company on the Fortune list, a leading grocery chain with stores all over the country, or a global banking institution, the basic components are the same.

Each data warehouse is put together with the same building blocks. The essential difference for each organization is in the way these building blocks are arranged. The variation is in the manner in which some of the blocks are made stronger than others in the architecture. We will now take a closer look at each of the components. At this stage, we want to know what the components are and how each fits into the architecture.

We also want to review specific issues relating to each particular component.

Source Data Component

Source data coming into the data warehouse may be grouped into four broad categories, as discussed here.

Production Data. This category of data comes from the various operational systems of the enterprise. Based on the information requirements in the data warehouse, you choose segments of data from the different operational systems.

While dealing with this data, you come across many variations in the data formats. You also notice that the data resides on different hardware platforms. Further, the data is supported by different database systems and operating systems.

This is data from many vertical applications. In operational systems, information queries are narrow. You query an operational system for information about specific instances of business objects.

You may want just the name and address of a single customer. Or, you may need the orders placed by a single customer in a single week. Or, you may just need to look at a single invoice and the items billed on that single invoice.

In operational systems, you do not have broad queries. You do not query the operational system in unexpected ways. The queries are all predictable.

Again, you do not expect a particular query to run across different operational systems. What does all of this mean? Simply this: there is no conformance of data among the various operational systems of an enterprise. A term like an account may have different meanings in different systems.

The significant and disturbing characteristic of production data is disparity. Your great challenge is to standardize and transform the disparate data from the various production systems, convert the data, and integrate the pieces into useful data for storage in the data warehouse.

Internal Data. This is the internal data, parts of which could be useful in a data warehouse. If your organization does business with the customers on a one-to-one basis and the contribution of each customer to the bottom line is significant, then detailed customer profiles with ample demographics are important in a data warehouse. Profiles of individual customers become very important for consideration.

When your account representatives talk to their assigned customers or when your marketing department wants to make specific offerings to individual customers, you need the details. Although much of this data may be extracted from production systems, a lot of it is held by individuals and departments in their private files. You cannot ignore the internal data held in private files in your organization. It is a collective judgment call on how much of the internal data should be included in the data warehouse.

The IT department must work with the user departments to gather the internal data. Internal data adds complexity to the process of transforming and integrating the data before it can be stored in the data warehouse.

You have to determine strategies for collecting data from spreadsheets, find ways of taking data from textual documents, and tie into departmental databases to gather pertinent data from those sources.

Again, you may want to schedule the acquisition of internal data. Initially, you may want to limit yourself to only some significant portions before going live with your first data mart. Archived Data. Operational systems are primarily intended to run the current business. In every operational system, you periodically take the old data and store it in archived files. The circumstances in your organization dictate how often and which portions of the operational databases are archived for storage.

Some data is archived after a year. Sometimes data is left in the operational system databases for as long as five years. Many different methods of archiving exist. There are staged archival methods. At the first stage, recent data is archived to a separate archival database that may still be online. At the second stage, the older data is archived to flat files on disk storage.
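A staged archival policy of this kind amounts to classifying data by age. The Python sketch below shows one possible rule. The one-year and five-year cut-offs are assumptions chosen only to echo the figures mentioned above, and the stage names are illustrative; your own circumstances dictate the actual thresholds.

```python
from datetime import date


def archive_stage(record_date, today, online_years=1, archive_db_years=5):
    """Classify a record by age into a storage stage (thresholds are assumed)."""
    age_years = (today - record_date).days / 365.25
    if age_years < online_years:
        return "operational database"
    if age_years < archive_db_years:
        return "online archival database"  # first stage
    return "flat files on disk"            # second stage


today = date(2024, 1, 1)  # hypothetical reference date
for record_date in (date(2023, 6, 1), date(2021, 1, 1), date(2015, 1, 1)):
    print(record_date, "->", archive_stage(record_date, today))
```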
