DATA WAREHOUSE VS. DATA MARTS. Data warehouses come in two configurations: corporate-wide and departmental data marts. The corporate-wide data warehouse presents all the data in the same manner to all end users, as the data fields are described using a universally standard naming convention. In contrast, the data mart consists of data which is specific to a smaller organizational group. The corporate sales, finance or marketing departments may each have a data mart for their own use, and the naming conventions used by each data mart may differ. Within this data mart scheme, there may be no single data warehouse where all the data can be checked and validated. Even though individual data marts may overlap and duplicate the same data, each may yield different results because of inconsistencies in the data field naming conventions.
A better solution is to employ data marts in conjunction with a “master” data warehouse. In such a scenario, the data marts are aggregated and summarized from the already validated data warehouse and may be deployed using stored procedures and trigger programs from within the data warehouse. Results from aggregation can easily be joined with other tables in queries. Data warehouse-generated data marts ensure informational integrity from mart to mart with the CA-MANMAN transactional database. What follows is a case study of one such data warehouse/data mart deployment.
THE NEED. Gen-Probe had a need to bring customer, sales, invoicing and shipment information to their remote sales force. Each of 35 sales representatives received from three to five reports of 10 to 1,000 pieces of paper each at the end of each month. They often worked from reports that were 30 days old and had no view of sales, shipments or invoices in the current month. Gen-Probe’s initial approach was to use interpreted SQL for data extraction and end-user query reporting. They found, however, that performance was inadequate for many of their complex queries, as it took up to 10 hours to process a single report.
- Retrieving the data must be quick.
- Each sales representative must be able to get his subset of information easily.
- Information must be up-to-date and accurate. Must be able to get changed data from CA-MANMAN to the data mart once each day.
- Must have minimal setup and training on the client computer.
- The system must replace the need to send printed reports.
- Must be able to create indexes as needed to optimize queries.
- Must be able to join tables from multiple databases quickly and efficiently.
- Must be able to automate operations using stored procedures and the Microsoft SQL Server Scheduler (no manual processes or downloads).
- There must be no additional load placed on the CA-MANMAN computer system.
THE SOLUTION. Gen-Probe opened a project to improve timeliness and availability of CA-MANMAN data. They acquired and deployed the eNVy Systems Data Warehouse Suite, hosting it on a DEC Alpha 2100 /NT server running Microsoft SQL Services.
eNVy Systems provided a turn-key solution to create and populate two MANDB (OMAR, MFG) data warehouses. The data was then replicated to the MS SQL Server database (the eNVy Data Replicator for NT built the SQL Server database and copied the CA-MANMAN data to it). Subsequently, indexes were added to improve query performance. The administrator could add indexes at will to improve specific end-user queries (e.g. product description, customer name). Stored procedures were added to the MS SQL Server database to automate the creation of the Shipment and Sales data mart tables.
The data warehouse is updated once a day using the eNVy Data Replicator for NT. A control .INI file was built for each database, thus enabling MS SQL Server Scheduler to execute Data Replicator automatically early each morning.
To simplify data delivery to the end users, the data was made available in a database on Gen-Probe’s intranet network. Using Microsoft Internet Information Server and Active Server pages to perform server-side queries, formatted HTML pages were generated and sent via a WEB browser to the users’ 28,800-baud modems. There were no additional costs; WEB browser plug-ins, special drivers (DLLs) and custom application programs were not required.
CONCLUSION. Most companies have adopted system-of-record informational databases for processing day-to-day transactions. The data warehouse—the complete copy of the transactional database—shares this system-of-record label. As with the transactional database, the data warehouse consists of a constant and reliable representation of the data. Creating data marts in conjunction with such a data warehouse expands the functionality, permitting consolidated and aggregated subsets of specific user- and department-centric data for analysis. The eNVy Data Warehouse Suite facilitates all this power and flexibility, while maintaining data and informational integrity with the transactional database.
By using eNVy’s Warehouse Plus Suite, Gen-Probe found the data warehouse and data mart databases easy to create, implement and maintain, thus giving Gen-Probe a tremendous informational advantage over the previously implemented interpreted SQL, 4 G/L and 3 G/L technologies.
eNVy Systems is the manufacturer of the eNVy Data Warehouse Plus Suite for ORACLE CODASYL DBMS, installed and operational at companies ranging from Fortune 100 to Fortune 1000, a proven technology for quick and reliable deployment of CA-MANMAN data to Microsoft SQL Server, Oracle Rdbms, SYBASE, CA-Ingres and other RDBMS. eNVy also provides consultation ranging from system and data integration to CODASYL DBMS administration, de-corruption and VAX, Alpha AXP – OpenVMS systems management.
Gen-Probe is the recognized world leader in the development, manufacture and commercialization of diagnostic products based on its patented genetic probe technology. It develops and markets DNA probe products that test for a wide range of microorganisms that cause tuberculosis, strep throat, pneumonia, fungal infections and sexually transmitted diseases. Gen-Probe is located on the WEB at: http://gen-probe.com.