Strategize and avoid SAP data integration failures

Software projects often fail because of a lack of proper testing before the project goes live. Expert Ethan Jewett details testing approaches he swears by.

There’s no shortage of ways SAP data integration projects can fail, as many companies have discovered.

Take the example of a company that’s moving invoice data from an ERP system to a business intelligence (BI) reporting system to perform analysis on open invoices and profitability according to customer and product. Perhaps too much data gets loaded, which doubles the values for some invoices. Or, during the testing process, the company finds that cancellations aren’t being handled correctly. Or maybe it discovers that a piece of the profitability puzzle, like sales commissions, is missing from the picture, and that data needs to be loaded from a separate system to complete the analysis.

Although it’s unlikely that a company would see all these problems in a single extraction and reporting project, each of them can be an issue in software integration projects. Sometimes it seems like anything that can go wrong will go wrong -- so why take a chance with inadequate testing strategies?

SAP data integration projects, especially in data warehousing or BI, pose different testing challenges than traditional application development does. For example, integrations usually span very different systems, which makes traditional unit testing tools unsuitable.

Plus, integration projects are often conceptualized differently than traditional programs are. The design is defined in terms of examples of records of data and fields, rather than specific "if-then" logic.

This can cause difficulties in applying traditional testing methodologies. While the testing concepts used for traditional programs are almost always useful in integration projects, we often need to rework them.

Even in an all-SAP landscape, where the problem of spanning heterogeneous systems is mitigated, integration can still be a challenge. In a heterogeneous landscape, testing SAP data integrations becomes even more problematic because of a lack of testing tools and differing system architectures. I will address a few strategies and experiences to get you thinking about testing your integration projects.

Test coverage in SAP data integration projects

Test coverage is a metric identifying the percentage of a business process or code that the tests check. Missing test coverage can cause problems that are caught late in the development process or not at all. Full test coverage means identifying all transaction types that need to pass through the integration code and writing at least one test case for each of them.

In an SAP data integration project for a global client, I worked on integrating data from a large number of ERP systems into a single data warehouse. My team worked with system owners to find ways to test the integration points.

During the project, the company introduced a new year-end financial process, along with new transaction types. But our team didn’t identify the new process as a test case because it was introduced late in our project, and cross-team communication channels broke down. The tests we designed caught bugs in other processes, but the untested year-end process introduced a major integration bug that made it through acceptance testing because our integration code wasn’t designed to handle the new types of transactions.

Ideally, each type of transaction that flows through the integration code will be covered by tests. In Figure 1, we have three types of transactions: blue, light blue and gray. These types of records might be general-ledger postings for invoices, returns and manual journals. Each type of posting may require different logic in the integration code, so a lack of test coverage means that bugs may slip through. Tests cover the blue (invoices) and light blue (returns) transaction types, but the gray transactions (manual journals) are untested.

Figure 1. Each type of posting may require different logic in the integration code, so a lack of test coverage means that bugs may slip through. In this diagram, tests cover the blue (invoices) and light blue (returns) transaction types, but the gray transactions (manual journals) are untested.

To avoid these problems, identify the types of transactions that will be moving through the integration code and develop tests to cover all the types that are expected. Then check in regularly with those individuals responsible for the data integration to make sure that no new and unexpected transactions are introduced.
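The coverage check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction type names, the record layout and the function find_untested_types are hypothetical, not part of any SAP tool.

```python
# Hypothetical transaction types the team expects, following the Figure 1 example.
EXPECTED_TYPES = {"invoice", "return", "manual_journal"}

def find_untested_types(observed_records, tested_types):
    """Return every expected or observed transaction type that no test covers."""
    observed = {rec["type"] for rec in observed_records}
    return sorted((EXPECTED_TYPES | observed) - set(tested_types))

# Sample of records pulled from the source system (made-up data).
records = [
    {"id": 1, "type": "invoice"},
    {"id": 2, "type": "return"},
    {"id": 3, "type": "year_end_adjustment"},  # introduced late in the project
]

untested = find_untested_types(records, tested_types={"invoice", "return"})
print(untested)  # ['manual_journal', 'year_end_adjustment']
```

Running a check like this against a fresh sample of source data on a regular basis is one way to notice a new transaction type even when nobody announces it.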

Two testing strategies for SAP software integration

Developers typically first create applications using a small, representative subset of data and then run the integration scenario with a full data set. Sometimes integration logic appears to work well on a subset of data, but it slows down immensely or never finishes running when used with the full data set.

I use two strategies to avoid this problem:

  • Test full-size and realistic data sets. If such a data set is not available, manufacture it, keeping in mind the issue of coverage.
  • Test all requirements early and often. This is especially important with less tangible requirements such as performance and usability.
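When no full-size data set is available, a manufactured one can be generated with a short script. The sketch below assumes a hypothetical record layout (id, type, amount) and hypothetical transaction types; the point is only that every type is guaranteed to appear, so volume does not come at the expense of coverage.

```python
import random

# Hypothetical transaction types that the integration code must handle.
TRANSACTION_TYPES = ["invoice", "return", "manual_journal"]

def manufacture_records(n, seed=0):
    """Build n synthetic records; assumes n >= len(TRANSACTION_TYPES)."""
    rng = random.Random(seed)  # fixed seed keeps test runs repeatable
    records = []
    for i in range(n):
        if i < len(TRANSACTION_TYPES):
            # The first records cycle through the types so each one appears at least once.
            tx_type = TRANSACTION_TYPES[i]
        else:
            tx_type = rng.choice(TRANSACTION_TYPES)
        amount = round(rng.uniform(-5000, 5000), 2)
        records.append({"id": i, "type": tx_type, "amount": amount})
    return records

full_set = manufacture_records(100_000)
print(len(full_set))
```

Feeding a set of this size through the integration early in the project gives a first reading on run time long before production data is available.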

When testing integrations, some testers look at the end result of processing millions of records and break this result down using a few key dimensions such as company, time period, account, product or customer. If the totals for every dimension fall within a predefined threshold of the expected value -- for example, 1% or 0.1% -- then all is well.
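A minimal sketch of this aggregate comparison, assuming records are available as simple dictionaries with an amount field and a dimension field such as company (both names are hypothetical), might look like this:

```python
from collections import defaultdict

def totals_by(records, dimension):
    """Sum the amount field for each value of the chosen dimension."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[dimension]] += rec["amount"]
    return dict(totals)

def check_aggregates(source, target, dimension, tolerance=0.01):
    """Return {key: (expected, actual)} where totals differ by more than tolerance."""
    expected = totals_by(source, dimension)
    actual = totals_by(target, dimension)
    failures = {}
    for key in expected.keys() | actual.keys():
        exp = expected.get(key, 0.0)
        act = actual.get(key, 0.0)
        # Relative difference; fall back to the absolute difference when expected is zero.
        diff = abs(act - exp) / abs(exp) if exp else abs(act - exp)
        if diff > tolerance:
            failures[key] = (exp, act)
    return failures

source = [
    {"id": 1, "company": "1000", "amount": 100.0},
    {"id": 2, "company": "1000", "amount": 50.0},
    {"id": 3, "company": "2000", "amount": 200.0},
]
# Made-up target in which record 3 was loaded twice, doubling its value.
target = source + [{"id": 3, "company": "2000", "amount": 200.0}]

failures = check_aggregates(source, target, "company")
print(failures)  # {'2000': (200.0, 400.0)}
```

The default tolerance of 0.01 corresponds to the 1% threshold mentioned above; a stricter project would pass 0.001 instead.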

Other testers look at individual records they know are important and verify only if those records are correct.
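This record-level approach can also be sketched in Python. The business rule used here (returns are stored with a negative sign in the target) is purely an assumption for illustration, as are the field and function names.

```python
def expected_amount(source_rec):
    """Hypothetical rule: returns carry a negative sign in the target system."""
    sign = -1 if source_rec["type"] == "return" else 1
    return sign * source_rec["amount"]

def check_records(source, target, ids_to_check):
    """Compare selected records before and after the integration code."""
    source_by_id = {rec["id"]: rec for rec in source}
    target_by_id = {rec["id"]: rec for rec in target}
    mismatches = []
    for rec_id in ids_to_check:
        want = expected_amount(source_by_id[rec_id])
        got_rec = target_by_id.get(rec_id)
        got = got_rec["amount"] if got_rec is not None else None
        if got != want:
            mismatches.append((rec_id, want, got))
    return mismatches

source = [
    {"id": 1, "type": "invoice", "amount": 100.0},
    {"id": 2, "type": "return", "amount": 40.0},
]
# Made-up target in which the sign flip for returns was not applied.
target = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": 40.0},
]

mismatches = check_records(source, target, ids_to_check=[1, 2])
print(mismatches)  # [(2, -40.0, 40.0)]
```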

I think of these two strategies as testing “in the large” and testing “in the small,” as illustrated in Figure 2. “Testing in the large” looks at values aggregated across all transactions and verifies that the aggregate values are within an acceptable margin after the software integration project has run. “Testing in the small” compares values on individual records before and after the record passes through the integration code to make sure the result is as expected.

Figure 2. This diagram illustrates “testing in the large” and “testing in the small” strategies. “Testing in the large” looks at values aggregated across all transactions. “Testing in the small” compares values on individual records.

Early in my career, I thought it was enough to pursue these two strategies separately. Then I worked with someone who combined them, and my team found that testing “in the large” turned up significant issues we hadn’t anticipated “in the small.”

My colleague identified small problems in the test system at a high level, and then went through the data thoroughly until he found detailed reasons for each discrepancy. As a result, we often found problems with the integration business logic implemented in the test system. By the end of the project, these painstaking sessions of drilling down into the numbers resulted in significant improvements in system quality.
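The drill-down step can be approximated in code as well. The sketch below assumes an aggregate check has already flagged one dimension value (here, a hypothetical company "2000") and lists record-level reasons for the discrepancy. The record layout and the reason labels are illustrative assumptions, not a description of my colleague's actual method.

```python
from collections import Counter

def drill_down(source, target, dimension, key):
    """List record-level reasons why totals for one dimension value disagree."""
    src = {r["id"]: r for r in source if r[dimension] == key}
    tgt_rows = [r for r in target if r[dimension] == key]
    tgt_counts = Counter(r["id"] for r in tgt_rows)
    tgt = {r["id"]: r for r in tgt_rows}
    reasons = []
    for rec_id in sorted(src.keys() | tgt.keys()):
        if rec_id not in tgt:
            reasons.append((rec_id, "missing in target"))
        elif rec_id not in src:
            reasons.append((rec_id, "unexpected in target"))
        elif tgt_counts[rec_id] > 1:
            reasons.append((rec_id, f"loaded {tgt_counts[rec_id]} times"))
        elif tgt[rec_id]["amount"] != src[rec_id]["amount"]:
            reasons.append((rec_id, "amount changed"))
    return reasons

source = [
    {"id": 3, "company": "2000", "amount": 200.0},
    {"id": 4, "company": "2000", "amount": 75.0},
]
# Made-up target: record 3 was loaded twice and record 4 never arrived.
target = [
    {"id": 3, "company": "2000", "amount": 200.0},
    {"id": 3, "company": "2000", "amount": 200.0},
]

reasons = drill_down(source, target, "company", "2000")
print(reasons)  # [(3, 'loaded 2 times'), (4, 'missing in target')]
```

Each reason then becomes a candidate test case “in the small,” which is how the two strategies feed each other.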

Balancing risk and reward

Incorporating different testing tactics also calls for balancing risk and reward. Different organizations and situations demand different appetites for risk. Sensitivity to the variables controlling the risk equation is important while developing a testing strategy.

A project team cannot test everything. Testers need to balance the desire to deliver a flawless application against the budget, timeline and the team’s ability to react to problems. Focus on the areas that are central to the application’s value or the ones hardest to fix. The idea is to deliver as much value as possible and give the team the chance to fix any problems uncovered later in the project.

On a recent data warehouse software upgrade, the conversation about balancing risk and reward played out between different people involved in the project. Based on outside time constraints, the capabilities of the team and support from the vendor, the team adopted an aggressive timeline. This approach entailed higher risk because there was less time for testing and fixing issues.

The team developed a testing strategy focusing on central functions that the team could not easily address by go-live. As a result, the team identified many issues and addressed them early in the project. Although more issues arose in relatively untested areas nearer the go-live, the team managed these problems and delivered an upgrade much faster than had previously been accomplished.
