SAP projects should start with data cleansing, expert says

Consolidating multiple legacy systems onto SAP is a daunting task, but one expert says that a little in-house data cleansing helps ease migration headaches.

LAS VEGAS -- Extracting data from multiple legacy systems dating back to the 1980s -- mainframe files, raw data, PC databases and GPS data -- and migrating it to one goliath SAP system can be a daunting task for any enterprise. When the City Council of Tacoma, Wash., decided to forgo best-of-breed applications for an enterprise-based system, the data extraction team quickly realized that data cleansing would be the most arduous task of all.

Cleansing the data from the source systems required lots of manual work for the staff, according to Kathy Palon, SAP technical development lead for the city's implementation plan. In a presentation at SAP TechEd 2006 in Las Vegas, Palon explained the city's 18-month migration to SAP R/3 Enterprise.

"We were pulling city developers from every branch, even people who are now DBAs," Palon said of the data cleansing and migration project.

The project, which began in 2002, combined data from all of the city's governing systems -- utilities, public works, public facilities and public safety -- and incorporated that information into one SAP R/3 Enterprise system.

The project wasn't a smooth move. The city absorbed a number of cost overruns. Problems with the rollout of SAP's financial module resulted in delays to the city budget, and integration issues with SAP's Customer Care and Services (CCS) software slowed down customer service at the city's public utility operations.

Palon attributes the problems to a lack of adequate employee training, and a recent audit of the project, conducted by IBM, supports that assertion.

The city plans to take what it learned from the project and apply it in two years to an upgrade to mySAP ERP 2005, Palon said.

In her presentation, "Migration: Everything but the kitchen sink," Palon focused on three data cleansing steps that she said deserve the most attention: migration sources, business partner objects and matching.

Migration sources

A migration source is any system or repository that holds the original data. Tacoma had numerous sources that had to be cleansed and then migrated, such as individual HR programs, finance and asset systems, PC databases, aerial photograph maps and other documents.

"Everything was hierarchical; partners had to go first, then accounts, then more and more detail until you got way down to payments and interest," Palon said. "If you didn't load all your partners 100% perfectly, you couldn't get all your accounts in, then you couldn't get all your payments, and it escalated and escalated."
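The cascading failure Palon describes can be sketched in a few lines of Python. This is only an illustration, not the city's load program: the three-level order (partner, account, payment) comes from her quote, while the function name, record layout and sample IDs are hypothetical.

```python
# Load order taken from the quote; everything else here is a hypothetical sketch.
LOAD_ORDER = ["partner", "account", "payment"]


def load_in_order(records):
    """Load records level by level, rejecting any record whose parent did not load.

    records: dict mapping level name -> list of (record_id, parent_id) tuples.
    Returns (loaded, rejected): loaded maps level -> set of ids,
    rejected maps level -> list of ids.
    """
    loaded = {level: set() for level in LOAD_ORDER}
    rejected = {level: [] for level in LOAD_ORDER}
    for depth, level in enumerate(LOAD_ORDER):
        parent_level = LOAD_ORDER[depth - 1] if depth > 0 else None
        for record_id, parent_id in records.get(level, []):
            if parent_level is not None and parent_id not in loaded[parent_level]:
                # The parent never made it in, so this record cannot load either.
                rejected[level].append(record_id)
            else:
                loaded[level].add(record_id)
    return loaded, rejected


sample = {
    "partner": [("P1", None), ("P2", None)],  # P3 is missing, e.g. it failed cleansing
    "account": [("A1", "P1"), ("A2", "P3")],
    "payment": [("PAY1", "A1"), ("PAY2", "A2")],
}
loaded, rejected = load_in_order(sample)
print(rejected)
```

In this sample, one missing partner blocks one account, which in turn blocks one payment, which is the escalation the quote describes.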

The technical team decided to start with its utilities systems because they were the largest migration source.

"It was difficult to reconcile all the detail," Palon said. "We thought we had 20,000 assets but -- oops -- guess what, we had more than 40,000 assets."

Surprises like these are common when data is not all stored on the same system, she said. Complexity only increases as the separate systems are loaded with data that is not subject to any control parameters.

Business partner objects

A business partner object is any entity that conducts any form of business with the city. This includes businesses, homeowners, city government personnel and residents.

The city thought it conducted business with more than 320,000 partners but found about 35,000 duplicate entries during the data cleansing process, Palon said, adding that about 180,000 partners came from the utility billing system.

While cleansing business partner data, one question became obvious to Palon and her colleagues: "How do we consolidate these duplicates into one single instance of a business partner object?"

All these business partners had to be cleansed and categorized to establish a single instance of each business partner object within the SAP system, Palon said. Duplicates were eliminated immediately, and matches were made using ABAP code, which let Palon's team clean and update the data so that one uniform business partner object was entered into the R/3 system.
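The article does not show the ABAP code the team wrote. As a rough idea of what consolidating duplicates involves, here is a minimal Python sketch that groups records by a normalized name and address. The normalization rule, the field names and the sample records are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace (an assumed rule)."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())


def consolidate(partners):
    """Group partner records that share a normalized (name, address) key.

    Returns a dict mapping the first record id seen for each key to the
    list of all source ids merged into it.
    """
    groups = defaultdict(list)
    for partner in partners:
        key = (normalize(partner["name"]), normalize(partner["address"]))
        groups[key].append(partner["id"])
    return {ids[0]: ids for ids in groups.values()}


# Hypothetical records from two source systems.
partners = [
    {"id": "U-100", "name": "Jane Q. Doe", "address": "12 Main St."},
    {"id": "F-207", "name": "jane q doe", "address": "12 main st"},
    {"id": "U-101", "name": "Acme Supply", "address": "9 Dock Rd"},
]
merged = consolidate(partners)
print(merged)
```

Exact-key grouping like this only catches duplicates that differ in case, punctuation or spacing. Real cleansing work usually needs looser matching and manual review on top of it.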

Priority matching

Priority matching means that every department user can access the same uniform information from a business partner object using defined parameters such as addresses, unique business identifiers, social security numbers, and name characters from reliable departments.

To ensure the proper functionality, Palon's team wanted to create priority matching that was uniform throughout the system. Heads of the departments met to discuss their individual preferences.

"Get all those people together, and guess what … everyone wants the address done their way … including the mailroom," Palon said.

To clean the address data, the team decided to implement a hybrid foundation address for the combination tables in SAP, one that would suit everyone. That process alone took one week. Finally, the city decided on its priorities for matches based on the cleanest data. Today, multiple accounts can be traced back to one business partner object.
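One way to picture priority matching is as an ordered list of fields that are tried in turn against the partners already loaded. The Python sketch below assumes a priority order and field names. The article names the kinds of parameters used (addresses, unique business identifiers, social security numbers and name characters) but not the order the city settled on.

```python
# Hypothetical priority order; the city's actual order is not given in the article.
MATCH_PRIORITY = ["business_id", "ssn", "address", "name_prefix"]


def find_match(record, index):
    """Try each match field in priority order and return the first hit.

    record: dict of field -> value for the incoming record.
    index: dict mapping (field, value) -> existing business partner id.
    Returns (partner_id, field_matched_on), or (None, None) if nothing matches.
    """
    for field in MATCH_PRIORITY:
        value = record.get(field)
        if value and (field, value) in index:
            return index[(field, value)], field
    return None, None


# Hypothetical index of partners already in the target system.
index = {
    ("business_id", "UBI-555"): "BP-1",
    ("address", "12 main st"): "BP-2",
}
incoming = {"business_id": "", "ssn": None, "address": "12 main st", "name_prefix": "doe"}
partner, matched_on = find_match(incoming, index)
print(partner, matched_on)
```

Putting the fields with the cleanest data first means a weaker field, such as a name fragment, is consulted only when the more reliable ones are empty or unmatched.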

With two years of SAP R/3 Enterprise under the city's belt, Palon is very happy with the migration.

"Migration issues have been less than 2%," she said. "I'm sure there is still dirty data out there," she admitted, but nothing is ever perfect.
