XML processing in ABAP

Simple Transformations are an SAP-proprietary programming language that was integrated into ABAP in kernel release 6.40. Learn about the power of this technology in this code-filled tip.

Part 1 - Expressiveness of Simple Transformations

Simple Transformations are an SAP-proprietary programming language that was integrated into ABAP via CALL TRANSFORMATION in kernel release 6.40. Its concept differs from other transformation languages, like JiBX, which is used for XML-Java mapping, or languages suitable both for queries and transformations, like Lore, XCerpt, XDuce or even XSLT, which is also supported in ABAP by CALL TRANSFORMATION. By looking at the examples in ABAP package SST_DEMO you will understand how ST works. In contrast to XSLT, Simple Transformations (ST) are fast, memory efficient and symmetric, but they lack expressiveness.

Unfortunately I can't describe the power of ST in a mathematical way because, as usual, conditions cause most of the technical problems and <tt:cond> seems to be difficult to handle. In fact, despite their name, Simple Transformations have very powerful and sometimes complicated commands.

So, when I'm asked about expressiveness, I usually explain how many times an ABAP or XML node can be accessed, and mention the linear order during processing, the lookahead of 1 and so on. If I start to talk about the possibilities of using parameters and breaking the symmetry between serialization and deserialization, clever people will start to think about how far they can go. A typical question is:

Can I create relational data models from nested structures with an unbounded number of occurrences during deserialization?

This is a natural question when you are exchanging data with non-SAP systems using standardized XML frameworks and you want to save your data in transparent tables with additional foreign keys. If you are working with small datasets you won't have problems when you copy nested internal tables into the target data structure in ABAP. But let's ask whether we can avoid heavy post processing. Of course, I would accept a pragmatic solution: simple post processing in ABAP without copying data from one internal table into another would be acceptable.

The answer to the question above is: "No chance. You have to do heavy post processing." The reason is very simple and has to do with the <tt:loop> statement. Let's go into detail. In the following we consider an XML document with a simple nested structure:


    <A>
      <name>A1</name>
      <ZLS>
        <Z>01</Z>
        <Z>02</Z>
      </ZLS>
    </A>
    <A>
      <name>A2</name>
      <ZLS>
        <Z>03</Z>
      </ZLS>
    </A>

A typical example of an ST is one that transforms this XML document into an internal table whose table line contains a component 'nummer' and another internal table:


    <?sap.transform simple?>
    <tt:transform template="temp1" xmlns:tt="http://www.sap.com/transformation-templates">
      <tt:root name="ROOT"/>
      <tt:template name="temp1">
        <tt:loop ref=".ROOT" name="a">
          <A>
            <name>
              <tt:value ref="$a.name" />
            </name>
            <ZLS>
              <tt:loop ref="$a.zls" name="z">
                <Z>
                  <tt:value ref="$z.nummer" />
                </Z>
              </tt:loop>
            </ZLS>
          </A>
        </tt:loop>
      </tt:template>
    </tt:transform>

Note that the transformation above is symmetric and can be used in both directions to serialize and deserialize.

If we want to transform the XML stream above into a relational data model, we first have to break up the nested structure. The values of the <A> elements have to be put into one internal table and the <Z> elements into another one. Afterwards we have to deal with the problem of how to link these internal tables. Without ABAP calls from ST we will have problems generating the foreign keys we need. But perhaps this could be done using clever mapping methods. Whenever a new <A> element starts, we write additional information to its <Z> elements. Think of a mark that alternates between '+' and '-' each time a new <A> occurs. Later we could calculate foreign keys from these marks in a post processing step in ABAP. The result after the transformation would be as follows:

Internal table for <A> elements:

Counter calculated afterwards | Value of element <name>

Internal table for <Z> elements:

Additional mark | Value of element <Z>
+               | 01
+               | 02
-               | 03

Now we can do two loops in ABAP. In the first we increment the counter for our <A> elements. In the second we introduce a counter for our <Z> elements that is incremented each time the mark changes from '+' to '-' or vice versa. This yields the following result, which is exactly what we are looking for:

Internal table for <A> elements:

Counter during postprocessing | Value of element <name>
1                             | A1
2                             | A2

Internal table for <Z> elements:

Additional mark after postprocessing | Value of element <Z>
1                                    | 01
1                                    | 02
2                                    | 03
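The two post processing loops described above can be sketched in ABAP as follows. This is only a minimal illustration under stated assumptions: the type names, the extra component a_key (which holds the computed foreign key instead of overwriting the mark) and the hand-filled sample rows are hypothetical and stand in for data that a transformation would have to deliver.

```abap
REPORT z_mark_to_key_sketch.

* Hypothetical line types: ty_a holds the <A> rows, ty_z the <Z> rows.
TYPES: BEGIN OF ty_a,
         counter TYPE i,
         name    TYPE string,
       END OF ty_a,
       BEGIN OF ty_z,
         mark   TYPE c LENGTH 1,   " alternating '+' / '-'
         a_key  TYPE i,            " assumed extra field for the foreign key
         nummer TYPE string,
       END OF ty_z.

DATA: t_a       TYPE STANDARD TABLE OF ty_a WITH DEFAULT KEY,
      t_z       TYPE STANDARD TABLE OF ty_z WITH DEFAULT KEY,
      wa_a      TYPE ty_a,
      wa_z      TYPE ty_z,
      last_mark TYPE c LENGTH 1,
      a_count   TYPE i.

FIELD-SYMBOLS: <a> TYPE ty_a,
               <z> TYPE ty_z.

* Sample data, filled by hand to mirror the tables in the text.
wa_a-name = `A1`. APPEND wa_a TO t_a.
wa_a-name = `A2`. APPEND wa_a TO t_a.
wa_z-mark = '+'. wa_z-nummer = `01`. APPEND wa_z TO t_z.
wa_z-mark = '+'. wa_z-nummer = `02`. APPEND wa_z TO t_z.
wa_z-mark = '-'. wa_z-nummer = `03`. APPEND wa_z TO t_z.

* Loop 1: number the <A> rows in document order.
LOOP AT t_a ASSIGNING <a>.
  <a>-counter = sy-tabix.
ENDLOOP.

* Loop 2: every change of the mark starts the next <A> group.
CLEAR: last_mark, a_count.
LOOP AT t_z ASSIGNING <z>.
  IF <z>-mark <> last_mark.
    a_count   = a_count + 1.
    last_mark = <z>-mark.
  ENDIF.
  <z>-a_key = a_count.
ENDLOOP.
```

With the sample rows, the <Z> entries 01 and 02 receive key 1 and entry 03 receives key 2, matching the tables above. Note that the scheme relies on every <A> element having at least one <Z> child: an <A> without children leaves no trace in the marks, and the keys of all following rows would be off by one.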

So let's start to code. The following transformation is a first attempt at this job, without calculating the alternating marks yet. Just have a look at the inner loop: we change the root from ROOT1 to ROOT2.


    <?sap.transform simple?>
    <tt:transform template="temp1" xmlns:tt="http://www.sap.com/transformation-templates">
      <tt:root name="ROOT1"/>
      <tt:root name="ROOT2"/>
      <tt:template name="temp1">
        <tt:loop ref=".ROOT1" name="a">
          <A>
            <name>
              <tt:value ref="$a.name"/>
            </name>
            <ZLS>
              <tt:loop ref=".ROOT2" name="z">
                <Z>
                  <tt:value ref="$z.nummer" />
                </Z>
              </tt:loop>
            </ZLS>
          </A>
        </tt:loop>
      </tt:template>
    </tt:transform>

Then we test it with the following quick hack:


    DATA xml_string TYPE string.

    * Line of the <Z> table
    DATA: BEGIN OF z,
            mark   TYPE c,
            nummer TYPE string,
          END OF z.

    * Line of the <A> table
    DATA: BEGIN OF a,
            counter TYPE i,
            name    TYPE string,
            zl      LIKE TABLE OF z,
          END OF a.

    DATA t_a LIKE TABLE OF a.
    DATA t_z LIKE TABLE OF z.

    xml_string = `<A>` &
                   `<name>A1</name>` &
                   `<ZLS>` &
                     `<Z>01</Z>` &
                     `<Z>02</Z>` &
                   `</ZLS>` &
                 `</A>` &
                 `<A>` &
                   `<name>A2</name>` &
                   `<ZLS>` &
                     `<Z>03</Z>` &
                   `</ZLS>` &
                 `</A>`.

    * Deserialize: ROOT1 receives the <A> rows, ROOT2 the <Z> rows
    CALL TRANSFORMATION my_first
      SOURCE XML xml_string
      RESULT ROOT1 = t_a
             ROOT2 = t_z.

The result is annoying. The internal table t_a contains the data of both <A> elements, but t_z has only one entry, with the value "03".

The explanation is simple. When you want to deal with internal tables you have to use <tt:loop> statements. But every time the inner <tt:loop> statement is executed during deserialization, the internal table is cleared, so in fact you are losing information in the example above. This example may look strange at first: just run the transformation in the other direction and look at the result. It will differ from the XML document above! At first glance we have asymmetric behaviour although we didn't use any asymmetric commands. At first I thought of it as a bug, but allowing the root to be switched makes sense. Why should the ST inventors forbid such mechanisms and reduce the expressiveness of the language?
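To see the asymmetry for yourself, you can serialize the deserialized tables again. The following sketch repeats the quick hack in compact form and adds the reverse call. The variable xml_back is an assumption introduced here, and my_first is the transformation name used in the quick hack.

```abap
REPORT z_st_back_and_forth_sketch.

DATA: xml_string TYPE string,
      xml_back   TYPE string.     " assumed name for the re-serialized XML

DATA: BEGIN OF z,
        mark   TYPE c,
        nummer TYPE string,
      END OF z.

DATA: BEGIN OF a,
        counter TYPE i,
        name    TYPE string,
        zl      LIKE TABLE OF z,
      END OF a.

DATA t_a LIKE TABLE OF a.
DATA t_z LIKE TABLE OF z.

xml_string = `<A><name>A1</name><ZLS><Z>01</Z><Z>02</Z></ZLS></A>` &
             `<A><name>A2</name><ZLS><Z>03</Z></ZLS></A>`.

* Deserialize: the inner tt:loop refills t_z for every <A> element.
CALL TRANSFORMATION my_first
  SOURCE XML xml_string
  RESULT ROOT1 = t_a
         ROOT2 = t_z.

* Serialize the same data with the same transformation.
CALL TRANSFORMATION my_first
  SOURCE ROOT1 = t_a
         ROOT2 = t_z
  RESULT XML xml_back.

* Compare xml_string and xml_back, for example in the debugger.
```

During serialization the inner loop runs over the whole of t_z for every row of t_a, so with the deserialization result described above each <A> element in xml_back should carry the same <Z> children. The values "01" and "02" are gone.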

In the following I want to mention a second aspect that seems to be confusing when you are confronted with ST for the first time.

You can assign the value of an element only once during deserialization!

In the rest of this issue we try to solve the following task: we want to deserialize the content of the <name> elements into one table, and the nested <Z> elements into another, but only if the associated <name> element has the value "A1". Perhaps you would start by defining a variable <tt:variable name="N"/> that stores the name of the current <A> element. I guess your code might look something like the following:


    <?sap.transform simple?>
    <tt:transform template="temp1" xmlns:tt="http://www.sap.com/transformation-templates">
      <tt:root name="ROOT1"/>
      <tt:root name="ROOT2"/>
      <tt:variable name="N"/>
      <tt:template name="temp1">
        <tt:loop ref=".ROOT1" name="a">
          <A>
            <name>
              <tt:value ref="$a.name"/>
              <tt:read type="C" var="N"/>
            </name>
            <tt:switch-var>
              <tt:cond-var check="var(N)=C('A1')">
                <ZLS>
                  <tt:loop ref=".ROOT2" name="z">
                    <Z>
                      <tt:value ref="$z.nummer" />
                    </Z>
                  </tt:loop>
                </ZLS>
              </tt:cond-var>
              <tt:cond-var>
                <tt:skip name="ZLS" count="1"/>
              </tt:cond-var>
            </tt:switch-var>
          </A>
        </tt:loop>
      </tt:template>
    </tt:transform>

If you run this transformation you won't get the expected result. Why? In contrast to XSLT, variables in ST behave like variables in any procedural language, and you can assign them more than once. But the following two lines don't work:


    <tt:value ref="$a.name"/>
    <tt:read type="C" var="N"/>

First you want to read it into an ABAP structure and then bind it to a variable. Even changing the order of the two commands wouldn't help you. Let's test it with an easier example:


    <?sap.transform simple?>
    <tt:transform template="temp1" xmlns:tt="http://www.sap.com/transformation-templates">
      <tt:root name="ROOT1"/>
      <tt:root name="ROOT2"/>
      <tt:template name="temp1">
        <X1>
          <tt:value ref=".ROOT1" />
          <tt:value ref=".ROOT2" />
        </X1>
      </tt:template>
    </tt:transform>

Here is the ABAP code for running this ST:


    DATA xml_string TYPE string.
    DATA field1 TYPE string.
    DATA field2 TYPE string.

    xml_string = `<X1> Test </X1>`.

    * Both roots try to read the content of the same element <X1>
    CALL TRANSFORMATION my_transformation
      SOURCE XML xml_string
      RESULT ROOT1 = field1
             ROOT2 = field2.

You will see that ROOT2 (that is, field2) is empty afterwards, so this doesn't work. Therefore we have to find another solution to implement our transformation, perhaps using the condition construct. But here we have another problem: the content of the conditional is either a template (in which case you can't, for example, assign variables) or it is evaluated unconditionally during deserialization. So, in fact, I can't tell you whether this will lead to success.

So let's summarize what we have learned. Programming ST is not difficult, and ST does a great job transforming an XML document into nested ABAP structures and back, because it is designed for this task. So to be honest, it's not surprising that transformations to a relational model are impossible without heavy post processing, because for this task we have to create a set of internal tables from an XML tree and add foreign keys. In our small example above those keys are not part of our model, so we would have to generate them and, of course, break the symmetry of our transformation in doing so. The impossibility is caused by the fact that we can't increment variables and we can't append deserialized data to an internal table.
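For comparison, the "heavy post processing" route could look roughly like the following sketch: deserialize into the nested internal table with the first, symmetric transformation, then copy the rows into two flat tables and generate the foreign keys in ABAP. The transformation name z_nested_demo, all type names and the key component a_id are assumptions made for illustration.

```abap
REPORT z_nested_to_relational_sketch.

* Nested types matching the first transformation (components name, zls, nummer).
TYPES: BEGIN OF ty_z_nested,
         nummer TYPE string,
       END OF ty_z_nested,
       ty_z_nested_tab TYPE STANDARD TABLE OF ty_z_nested WITH DEFAULT KEY,
       BEGIN OF ty_a_nested,
         name TYPE string,
         zls  TYPE ty_z_nested_tab,
       END OF ty_a_nested,
* Flat, relational line types; a_id is the generated key / foreign key.
       BEGIN OF ty_a_rel,
         a_id TYPE i,
         name TYPE string,
       END OF ty_a_rel,
       BEGIN OF ty_z_rel,
         a_id   TYPE i,
         nummer TYPE string,
       END OF ty_z_rel.

DATA: xml_string TYPE string,
      t_nested   TYPE STANDARD TABLE OF ty_a_nested WITH DEFAULT KEY,
      t_a_rel    TYPE STANDARD TABLE OF ty_a_rel WITH DEFAULT KEY,
      t_z_rel    TYPE STANDARD TABLE OF ty_z_rel WITH DEFAULT KEY,
      wa_a_rel   TYPE ty_a_rel,
      wa_z_rel   TYPE ty_z_rel.

FIELD-SYMBOLS: <a> TYPE ty_a_nested,
               <z> TYPE ty_z_nested.

xml_string = `<A><name>A1</name><ZLS><Z>01</Z><Z>02</Z></ZLS></A>` &
             `<A><name>A2</name><ZLS><Z>03</Z></ZLS></A>`.

* z_nested_demo is a hypothetical name for the first transformation (root ROOT).
CALL TRANSFORMATION z_nested_demo
  SOURCE XML xml_string
  RESULT ROOT = t_nested.

* Post processing: copy every row and hand the generated key down to the children.
LOOP AT t_nested ASSIGNING <a>.
  wa_a_rel-a_id = sy-tabix.
  wa_a_rel-name = <a>-name.
  APPEND wa_a_rel TO t_a_rel.

  LOOP AT <a>-zls ASSIGNING <z>.
    wa_z_rel-a_id   = wa_a_rel-a_id.
    wa_z_rel-nummer = <z>-nummer.
    APPEND wa_z_rel TO t_z_rel.
  ENDLOOP.
ENDLOOP.
```

This is exactly the copying from one internal table into another that the question at the beginning wanted to avoid. It is straightforward, but its cost grows with the size of the document.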

If we want to describe the (lack of) power of Simple Transformations, I suggest collecting some simple examples of transformations we can't realize with ST and reducing other problems to them.

But does the lack of expressiveness really bother you? I don't believe in solutions that can solve every problem without causing new problems. Perhaps some new features that could help us will be added to ST in post 6.40 development but designers will have to be careful not to make the language too complex.

On the other hand, a software developer should try to choose the right technology (iXML, XSLT, XSLT with ABAP calls or Java enhancement, ST or even JAXB) and not misuse a certain technology to create programs that solve a problem in an unexpected way but are hard to understand and possibly difficult to maintain.

Check back soon for part 2 of this series!

This content is reposted from the SAP Developer Network.
Copyright 2006, SAP Developer Network


Author: Tobias Trapp. More from this author can be found in his weblog on SDN.

