SAP BusinessObjects Data Services, part of the SAP BusinessObjects Enterprise Information Management (EIM) suite, is not only an extract, transform and load (ETL) tool, but also a data quality suite integrated with many traditional SAP products. With the release of SAP BusinessObjects BI 4.0, the EIM product stack includes many new features.
SAP BusinessObjects BI 4.0 security integration
In version 4.0, the Data Services architecture has changed significantly through integration with SAP BusinessObjects BI 4.0 security. There is now one platform for managing users and groups across the entire BusinessObjects suite, and administrators manage Data Services users and repository connections in the Central Management Console (Figure 1). Beyond consolidating security in one place, this gives SAP BusinessObjects Data Services access to a more advanced security model that supports security groups, password policies and integration with Active Directory for single sign-on. Together, these enhancements let administrators manage permissions at a more granular level and with more flexibility.
Figure 1. More groups are available to Data Services with the combined security model.
This flexibility complicates both installation and configuration. Because SAP BusinessObjects BI 4.0 is now tied to Data Services, an installation requires additional steps. For companies that are not existing BusinessObjects customers, SAP BusinessObjects Data Services ships with a mini Central Management Server. That server includes all of the support components needed for the SAP BusinessObjects EIM products, but none of the components needed to support SAP BusinessObjects reporting and dashboard services.
New 64-bit for Windows
Although 64-bit capabilities have been available for Data Services server installations on other operating systems for some time, this functionality has been lacking in Windows. In version 4.0, this is no longer the case. These 64-bit systems offer fewer memory restrictions, so the days of a 2-gigabyte limit per process are gone. Users are now bound only by the amount of memory on the job server, so it is beneficial to install the Windows Enterprise Server x64 operating system and load as much memory on the job server as possible. This allows much more headroom for expensive caching operations and gives developers and ETL architects many more tuning options.
Full ANSI join support
From an SAP developer’s perspective, this is one of the most exciting enhancements. Data Services versions 3.x and earlier supported joins only in the WHERE clause of the query transform. In version 4.0, the FROM clause in the query transform is fully functional, supporting not only OUTER joins but any combination of INNER and OUTER joins. This is a huge leap forward in terms of development options. In previous versions, developers would sometimes split a data flow into multiple flows because they couldn’t mix join types. Figure 2 shows how you can now specify complex joins in one query transform.
Figure 2. The FROM clause tab is now fully functional, allowing mixing of INNER and OUTER joins.
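To make the idea of mixing join types concrete, here is a minimal sketch in plain ANSI SQL, run through Python's built-in sqlite3 module rather than Data Services itself. The tables, columns and sample rows are hypothetical and exist only for illustration; the point is simply that one FROM clause can combine an INNER join with an OUTER join.

```python
import sqlite3

# Hypothetical tables and rows, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE shipments (order_id INTEGER, carrier TEXT);

    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders VALUES (10, 1), (11, 2);
    INSERT INTO shipments VALUES (10, 'UPS');
""")

# One FROM clause mixing join types:
# - the INNER join drops customers that have no orders (Initech)
# - the LEFT OUTER join keeps orders that have no shipment yet (order 11)
query = """
    SELECT c.name, o.order_id, s.carrier
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    LEFT OUTER JOIN shipments AS s ON s.order_id = o.order_id
    ORDER BY o.order_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In a tool that cannot mix join types in a single step, the same result would typically need two passes: one for the INNER join and a second for the OUTER join against the intermediate result.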
Enhanced validation transform
The validation transform has also improved with version 4.0. SAP developers could always validate their data and capture statistics and data into metadata reports, but now validation is much more capable because it supports multiple rules per column and complex rules across multiple columns. In versions 3.x and earlier, the validation transform could only handle one rule per column.
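The difference between one rule per column and several rules, including rules that span columns, can be sketched in a few lines of Python. This is not how the validation transform is implemented; the rule names, column names and sample rows below are assumptions made up for the example, and splitting rows into "passed" and "failed" sets is just one simple way to show the outcome.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Rule = Tuple[str, Callable[[Row], bool]]

# Hypothetical rules: two on the same column (zip) and one across two columns.
RULES: List[Rule] = [
    ("zip_not_empty", lambda r: bool(r["zip"])),
    ("zip_is_5_digits", lambda r: r["zip"].isdigit() and len(r["zip"]) == 5),
    # ISO-formatted dates compare correctly as plain strings.
    ("ship_after_order", lambda r: r["ship_date"] >= r["order_date"]),
]

def validate(rows: List[Row]) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Split rows into passed rows and failed rows with the rules they broke."""
    passed, failed = [], []
    for row in rows:
        broken = [name for name, check in RULES if not check(row)]
        if broken:
            failed.append((row, broken))
        else:
            passed.append(row)
    return passed, failed

sample_rows = [
    {"zip": "30301", "order_date": "2011-03-01", "ship_date": "2011-03-03"},
    {"zip": "3030", "order_date": "2011-03-05", "ship_date": "2011-03-04"},
]
passed, failed = validate(sample_rows)
for row, broken in failed:
    print(row["zip"], "failed:", broken)
```

Recording which rules a row broke, rather than a single pass/fail flag, is what makes per-rule statistics possible in a report.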
Unstructured data processing
Unstructured data processing is one of the most significant changes in version 4.0, and probably the most exciting for business users. Unstructured data is growing faster than traditional, structured data, and it poses many challenges that are simply not present when processing structured data with a predefined data model.
Unstructured data must be categorized, or “tagged,” with metadata to be useful, and SAP BusinessObjects Data Services is now capable of this type of categorization through the new Entity_Extraction transform. This transform can extract information from any text, HTML or XML content and generate output metadata. That metadata can be used in a variety of ways, such as input to other transforms that drive other ETL processes, or as additional attributes stored in database tables. This allows linking structured information to unstructured data to make new connections and gain insight that wasn’t possible before.
Sentiment analysis on unstructured data is also possible with this new transform (Figure 3). Imagine analyzing press releases, Twitter streams or RSS feeds to gauge customer sentiment and create meaning from data in any source or format. The developer can use predefined entities such as company, person, firm and city. The transform also supports a variety of languages.
Figure 3. Analyzing unstructured data: This example took a press release and split the data stream into all of the word components.
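The tagging idea behind entity extraction can be illustrated with a deliberately simple Python sketch. It is not the Entity_Extraction transform and does not reflect how that transform works internally: it only matches names from a small, hypothetical dictionary, whereas a real extractor relies on linguistic analysis. The entity lists, the sample sentence and the customer lookup table are all invented for the example.

```python
import re
from typing import Any, Dict, List

# Hypothetical dictionary of known entities, keyed by entity type.
KNOWN_ENTITIES = {
    "COMPANY": ["SAP", "Acme Corp"],
    "CITY": ["Atlanta", "Walldorf"],
}

# Hypothetical structured data: customer IDs keyed by company name.
CUSTOMER_IDS = {"Acme Corp": 1001}

def extract_entities(text: str) -> List[Dict[str, Any]]:
    """Return one metadata record per dictionary match, ordered by position."""
    found = []
    for entity_type, names in KNOWN_ENTITIES.items():
        for name in names:
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text):
                found.append(
                    {"type": entity_type, "text": name, "offset": match.start()}
                )
    return sorted(found, key=lambda e: e["offset"])

press_release = "Acme Corp opened a new office in Atlanta."
entities = extract_entities(press_release)

# Link the unstructured text back to structured data via the extracted company.
linked = [
    (e["text"], CUSTOMER_IDS[e["text"]])
    for e in entities
    if e["type"] == "COMPANY" and e["text"] in CUSTOMER_IDS
]
print(entities)
print(linked)
```

The output records play the role of the metadata described above: they can be stored as extra attributes or used as keys to join the text to existing tables.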
ABOUT THE AUTHOR
Don Loden is a business intelligence consultant with full life cycle data warehouse development experience in multiple verticals. He is an SAP Certified Application Associate on SAP BusinessObjects Data Integrator and has more than 10 years of information technology experience. Find him on Twitter @donloden.