Chapter 11: Local Area Network Solutions
Enterprises need to maintain applications that can adapt to changing environments and changing business processes. But companies also need infrastructures that are just as adaptive, available, and problem-free as the software. SAP Adaptive Computing allows users to incorporate applications into a holistic overall solution. The SAP Adaptive Computing Controller provides an interface between the infrastructure and the application, and as such moves toward joining two previously separate worlds.
In this chapter, Michael Missbach, Senior Consultant at the SAP HP Competence Center in Walldorf, Germany, along with his co-authors, explores the aspects of local area networks pertaining to the end-to-end availability of SAP systems. Learn about increasing the availability for local area networks, wires and fibers, wireless networks, and voice-data convergence, as well as different adaptive hardware infrastructures for SAP environments.
This section explores why link aggregation supports only parallel point-to-point connections between two devices. Learn why installing redundant cables and switches for separate work centers can lead to high costs and complex configurations, and learn why up to 50% of work centers can lose their network connections after a backbone switch breaks down, even if the operation of the enterprise is maintained.
Adaptive hardware infrastructures for SAP, Ch. 11
Table of contents:
How to attain high availability for SAP and local networks
Configuring wires and fibers in adaptive SAP hardware infrastructures
WLAN standards and integrating WLAN into SAP hardware infrastructures
SAP NetWeaver solutions on the internal data highway
Network infrastructures installed on a company's premises are called local area networks (LANs). Modern LAN technologies are capable of providing extremely high bandwidths over an area of several square miles.
When our first book on SAP infrastructures was published, many company networks were still based on proprietary terminal-based applications with their own cable and plug types, bus systems based on coaxial cables, and architectures that were built from hubs and bridges. With the advent of SAP, cabling and network technology often had to be completely redone.
Since then, a unified infrastructure based on full switched Ethernet fiber optics, or twisted pair cables, RJ45 plugs, and hierarchical architectures from department and backbone switches has become standard, which is in keeping with the requirements of SAP solutions.
At the same time, mobile network technologies have undergone further rapid development. Therefore, we have devoted a specific section of this chapter to the characteristics, and the advantages and disadvantages of the wireless network.
Today's LANs are integrated data highways that transport a variety of data flows. These data flows can be divided into three categories: business-critical, time-critical, and other data traffic. The average bandwidth requirements of typical client applications are:
- SAP GUI: 1.5kbps (depending on the content)
- VoIP: 12kbps (depending on the compression)
- Web browser: 30kbps (depending on the content)
- MPEG video: 1.45Mbps (depending on the compression)
- File transfer: many Mbps but not time-critical
If you compare the bandwidth consumption of these typical online activities, you'll notice that of all the applications, the SAP system generates the smallest data load on the network. Since only a few users use SAP and video-on-demand simultaneously, a bandwidth of 10Mbps is more than adequate for the individual end-user connection.
The individual data flows are merged at the work group level. For 12 to 24 users, a bandwidth of 100Mbps will generally suffice. The data flows are merged a second time at the building level where a multiple of 100Mbps or 1,000Mbps is required as a bandwidth. In the computer center, this bandwidth is then distributed with 100Mbps or 1,000Mbps connections to the server systems on which the applications run.
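The aggregation arithmetic above can be sketched as a simple back-of-envelope estimate. The per-application figures come from the list above; the user count and the worst-case assumption that every user is active simultaneously are illustrative:

```python
# Back-of-envelope bandwidth estimate for one work group.
# Per-application averages are taken from the list above;
# the user count is an illustrative assumption.
users = 24
per_user_kbps = {
    "sap_gui": 1.5,   # kbps, depending on content
    "voip": 12.0,     # kbps, depending on compression
    "browser": 30.0,  # kbps, depending on content
}

# Worst case: all users run all three applications at once.
peak_kbps = users * sum(per_user_kbps.values())
print(f"Aggregate work-group load: {peak_kbps / 1000:.2f} Mbps")
```

Even under this pessimistic assumption the aggregate stays around 1 Mbps, which shows why 100Mbps is generally sufficient at the work group level; bulk file transfers and video are what actually drive the higher backbone bandwidths.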
Optical and electrical signals are subject to various negative effects on their path. As these effects increase in proportion to the product of frequency and distance, the achievable bandwidths and range of communication lines are subject to physical limits. Therefore, increasing the signal frequency alone will not help you to attain a higher bandwidth. Special encoding algorithms must ensure that the signal that was originally put into the medium on the sender's side can be correctly reconstructed from the signal that contains added noise on the recipient's side.
Fast Ethernet and Gigabit Ethernet could only be developed so quickly because existing, tried-and-tested coding algorithms could be reused. Fast Ethernet is based on technologies that were originally developed for the Fiber Distributed Data Interface (FDDI), according to ISO 9314. For Gigabit Ethernet, the encoding algorithms of Fibre Channel (see Chapter 7) were adapted. Meanwhile, even switches with 10 gigabit uplinks are no longer considered exotic.
11.1 High Availability for Local Networks
As already discussed, the local network architectures generally used contain several points of failure regarding both the active components and the cable connections. The worst-case scenario is represented by a failure of the central backbone. This is similar to a blackout for the entire company. A disruption at the work group level affects the work of an entire department.
In order to achieve high availability of local networks, it is apparent that all cable connections and active components should be redundantly designed. Unfortunately, for Ethernet networks, this is not always possible because redundant connections would generate loops in the data path, which must be avoided under all circumstances within a broadcast domain.
To understand this problem, we must look at the functionality of switches. A switch learns the addresses of the hosts that are reachable through its different ports from the transferred packets that arrive at these ports. In an internal address table, the switch saves information about which hosts can be reached via which connection.
When a data packet arrives at a port, a check is performed to verify whether the destination address already exists in the address table. If this is the case, the packet is forwarded through the corresponding connection. Otherwise, the switch replicates the packet through all its connections, with the exception of the connection through which the packet was received. This means that target hosts, which had been unknown until this point, can be reached, and due to their response, the address table can be updated. This is exactly where the problem of redundant configuration lies, as shown in Figure 11.1.
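The learn-then-forward behavior described above can be sketched in a few lines. This is an illustrative model of a transparent bridge, not a real switch implementation; port and MAC representations are assumptions:

```python
class LearningSwitch:
    """Minimal sketch of transparent-bridge forwarding (illustrative)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # source MAC -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        # Learn: the sender is reachable through the ingress port.
        self.mac_table[src_mac] = in_port
        # Forward: known destination -> single port.
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown destination -> flood through all other ports.
        return [p for p in self.ports if p != in_port]

sw = LearningSwitch(ports=[1, 2, 3])
print(sw.receive(in_port=1, src_mac="a", dst_mac="b"))  # unknown b: floods [2, 3]
print(sw.receive(in_port=2, src_mac="b", dst_mac="a"))  # a was learned: [1]
```

The flood on an unknown destination is exactly what makes redundant paths dangerous, as the next example shows.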
In this example, there are two potential paths from computer a to computer b, which, in each case, lead through a connection to switches 1 and 2. What happens if computer a sends a packet to computer b, but computer b is not yet listed in the address tables of either switch?
- Computer a in network Segment A transfers a packet. Both switches learn that computer a can be reached through connection 1.1 or 2.1, and broadcast the packet through all the other connections as they do not yet know the address of computer b.
- Thus, there are two identical packets in Segment B. Because both switches are connected with each other via Segment B, these packets also reach both switches cross-over, then through connection 1.2 or 2.2 respectively. Since the packet still contains computer a as the sender, both switches learn that computer a suddenly seems to be located in Segment B.
- As computer b is still not recognized, the packet is once again replicated through connections 1.1 and 2.1 back into Segment A, where it originated, and is duplicated again. Because the switches are unaware of each other and each one continuously re-broadcasts the packet into the other segment, an endless loop is generated.
Due to the endless replication of the broadcast packets, the loop creates an avalanche effect. In a short space of time, one of the feared broadcast storms floods the network. In such a situation, no more communication is possible as the broadcasts use up the entire available bandwidth. This causes a network meltdown. The PCs connected are so strongly overloaded with interrupts that the systems freeze, which disturbs the entire data processing. For this reason, you should definitely avoid starting loops within a broadcast domain.
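The avalanche effect can be made tangible with a toy simulation of the loop in Figure 11.1. The doubling factor is a simplification (each frame in a segment is re-flooded by both switches into the opposite segment), but it shows how quickly the frame count explodes:

```python
# Toy simulation of a broadcast storm in the loop of Figure 11.1.
# Simplifying assumption: every frame in a segment is flooded by
# BOTH switches into the other segment, doubling the count per round.
def storm(rounds):
    frames = 1            # the original frame from computer a
    history = [frames]
    for _ in range(rounds):
        frames *= 2       # each switch re-floods every frame once
        history.append(frames)
    return history

print(storm(10))  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
```

After only ten forwarding rounds, a single frame has become over a thousand; at wire speed this saturates the available bandwidth within milliseconds.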
To solve this problem, Radia Perlman developed the spanning tree algorithm (IEEE 802.1D). Switches exchange information using the spanning tree protocol in order to recognize parallel paths. These paths are then shut down in sequence until only one is left. The remaining loop-free paths result in a tree structure that spans from the data center to the end devices, which is why it is called a spanning tree.
The disadvantage of the spanning tree process is that of each set of redundant paths, only one link can be used for data transport, while all other links are switched to standby mode. Investments in these cables and connections are therefore not exploited until there is a breakdown. In the event of a breakdown, the necessary recalculation of the spanning tree is also a relatively time-consuming process. During this time, the connections don't forward any packets at all.
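The core idea of the algorithm can be sketched as a graph computation. Real IEEE 802.1D elects a root bridge and compares path costs and bridge IDs via BPDU exchange; this sketch simplifies that to a breadth-first search from an assumed root, blocking every redundant link it finds:

```python
from collections import deque

def spanning_tree(links, root):
    """Pick a loop-free subset of links by BFS from a given root.
    Simplified sketch of IEEE 802.1D: real STP elects the root and
    compares path costs and bridge IDs instead of using plain BFS."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    active, blocked = set(), set()
    seen, queue = {root}, deque([root])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            edge = frozenset((node, nbr))
            if nbr not in seen:
                seen.add(nbr)
                active.add(edge)   # first path found: stays forwarding
                queue.append(nbr)
            elif edge not in active:
                blocked.add(edge)  # redundant path: switched to standby

    return active, blocked

# The redundant pair from Figure 11.1: both switches connect
# segments A and B, so one inter-segment path must be blocked.
links = [("A", "sw1"), ("A", "sw2"), ("sw1", "B"), ("sw2", "B")]
active, blocked = spanning_tree(links, "A")
print(len(active), len(blocked))  # 3 active links, 1 blocked
```

The blocked link is exactly the idle investment criticized above: it carries no traffic until a failure forces a recalculation.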
Therefore, the spanning tree concept doesn't play an important role anymore. Modern switches have mechanisms that limit broadcast storms. These mechanisms are based on the assumption that typically a certain ratio between user data and broadcasts is not exceeded. In the case of a broadcast storm, those broadcasts that exceed the limit are discarded. You should, however, implement such mechanisms with caution. On the one hand, there is the danger that legitimate broadcast packets could also be eliminated; on the other hand, a real broadcast storm could be disguised. In both cases, problems that are difficult to identify can emerge.
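The threshold mechanism can be sketched as follows. The ratio-based admission logic mirrors the description above; the concrete threshold value and per-port counting are illustrative assumptions, as real switches enforce this in hardware per port and time interval:

```python
class StormControl:
    """Sketch of ratio-based broadcast storm control (illustrative).
    Real switches enforce this per port over fixed time intervals."""

    def __init__(self, max_broadcast_ratio=0.1):
        self.max_ratio = max_broadcast_ratio  # assumed threshold
        self.total = 0
        self.broadcast = 0

    def admit(self, is_broadcast):
        self.total += 1
        if is_broadcast:
            self.broadcast += 1
            if self.broadcast / self.total > self.max_ratio:
                return False  # over the limit: frame is discarded
        return True           # frame is forwarded
```

Note that the dropped frames are chosen only by arrival order, which is precisely the caveat above: a legitimate broadcast can be discarded while the storm itself is merely throttled and hidden.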
11.1.1 Link Aggregation
Different manufacturers provide different technologies that can bundle several 100Mbps or 1 Gbit/s connections (link aggregation). For the operating system, the bundled connections represent a single logical interface with a single MAC and IP address. Due to this aggregation, the load is distributed to the parallel connections. This provides higher performance and redundant paths. If a connection breaks down, the data traffic is automatically transferred to the remaining connections within the bundle.
However, link aggregation supports only parallel point-to-point connections between two devices. This means the switches that are linked through link aggregation still represent single points of failure. In addition, the different cables of a bundle are generally placed in the same position so that they're exposed to the same potential risks and can be simultaneously destroyed.
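The load distribution and failover behavior described above is commonly implemented by hashing flow identifiers, so that each flow stays on one physical link and frames arrive in order. The following is a sketch under that assumption; the hash inputs and link representation are illustrative:

```python
# Sketch of per-flow load distribution over an aggregated bundle,
# as commonly done in link aggregation: hashing flow identifiers
# (here: the MAC pair) pins each flow to one physical link.
def pick_link(src_mac, dst_mac, links):
    alive = [link for link in links if link["up"]]
    if not alive:
        raise RuntimeError("all links in the bundle are down")
    # Same flow -> same index -> same link, preserving frame order.
    return alive[hash((src_mac, dst_mac)) % len(alive)]

links = [{"id": 0, "up": True}, {"id": 1, "up": True}]
first = pick_link("00:11", "00:22", links)
links[first["id"]]["up"] = False          # simulate a link failure
second = pick_link("00:11", "00:22", links)
assert second["up"]  # traffic moves to a surviving link automatically
```

Note that the failover happens entirely inside the bundle; as the text points out, it does nothing against the failure of the switch at either end.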
11.1.2 Highly Available Network Clusters for Business-critical Applications
Installing redundant cables and switches for each important work center in a company would lead to exorbitantly high costs and immensely complex configurations. An alternative approach, developed by one of the authors, designs highly available networks for business-critical applications based on a view of the business functions in an enterprise.
From this hands-on approach, you can assume that high availability is essential at the department level, but that for individual work centers, a certain downtime can be tolerated. This is because at the department level there is always a functional redundancy, since each user is assigned a substitute for leave, sickness, and so on. This substitute is often the colleague at the next desk. Even if the substitutes themselves are not there, the employee whose connection to the SAP system is disrupted can use the colleague's PC (or the colleague's wall socket). This fact can be exploited to ensure that the work still gets done even if a network device goes down.
In order to avoid single points of failure (SPoF) at the business function level, network clusters must be configured according to the following simple rules:
- An employee's PC that fulfills an important business function should never be connected to the same network switch as that of his or her substitute.
- For this reason, every network cabinet must contain at least two switches with separate connections (uplinks) to the data center.
- If possible, the connections should be executed on different paths.
- There must be at least two backbone switches installed in the data center (in different fire protection zones).
- Each clustered SAP server must be connected through separate network cards to these backbone switches.
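The first of these rules lends itself to an automated check against the patching documentation. The data layout below (employee-to-switch and employee-to-substitute mappings) is an assumption for illustration:

```python
# Sketch of an automated check for the first cluster rule above:
# an employee and his or her designated substitute must never be
# patched to the same access switch. Data layout is an assumption.
def check_substitute_rule(port_map, substitutes):
    """port_map: employee -> switch; substitutes: employee -> substitute.
    Returns all (employee, substitute, shared_switch) violations."""
    violations = []
    for emp, sub in substitutes.items():
        if port_map.get(emp) == port_map.get(sub):
            violations.append((emp, sub, port_map.get(emp)))
    return violations

port_map = {"miller": "sw-sales-1", "jones": "sw-sales-2",
            "smith": "sw-sales-1", "brown": "sw-sales-1"}
substitutes = {"miller": "jones", "smith": "brown"}
print(check_substitute_rule(port_map, substitutes))
# [('smith', 'brown', 'sw-sales-1')] -> smith and brown share a switch
```

Running such a check whenever the patch panel documentation changes keeps the "intelligent patching" from silently eroding as people and desks move.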
When this concept is implemented methodically, single points of failure at the business function level are avoided, as are network loops and inactive standby connections. Network clusters can be implemented with plug & play components from any manufacturer.
Two simple examples illustrate the network cluster concept. In Figure 11.2, the sales department's PCs are located on the left-hand side and the logistics department's PCs are located on the right-hand side.
- Scenario 1: One of the switches (or its uplink) in the sales department breaks down. Every second work center loses its connection, but the others remain operational. In a typical network environment in which all PCs of the same department are connected to the same switch, the entire department could not process any more orders.
- Scenario 2: One of the backbone switches in the data center breaks down. Half of the hosts lose their connection; the rest, however, remain operational. If it takes too long to replace the backbone switch, cross connections between the switches can be used as a bypass.
The network cluster concept ensures that, from a network perspective, at least one in every two PCs of a department has a connection to the SAP system at any time, and the business tasks of a department can be executed in any situation. The investments are the same as for a redundantly designed network based on the spanning tree concept. With a network cluster, however, there is no convergence time, and the available bandwidth is substantially higher due to the utilization of all available links and connections.
11.1.3 Error-tolerant Meshed Networks
For a network cluster, even when a backbone switch or link breaks down (see Scenario 2 in the previous example), the operation of the enterprise can be maintained. However, in this case, up to 50% of the work centers can lose their network connection. With switch meshing, the network cluster concept can be extended so that even the breakdown of a backbone component can be absorbed. This means that the local network is extensively error tolerant.
Switch meshing is a technology originally developed by Hewlett-Packard that enables the creation of a completely meshed local network infrastructure without generating the risk of loops and subsequent network meltdown due to broadcast storms. Currently, other manufacturers implement this technology as well. All links and connections are always active. On the basis of load statistics, algorithms distribute the data traffic equally across all links and prevent broadcast storms.
If the "intelligent patching" described for network clusters is implemented with completely meshed switches, even in the case of a failing backbone component, the full operation of the network can be ensured. The switchover time in an SAP cluster test environment, after turning off a backbone switch, was under two seconds in a running operation. The switch took place transparently for the application and user without losing any transactions.
Note: To ensure the availability of the SAP infrastructure in areas with frequent voltage fluctuations, Uninterruptible Power Supply (UPS) units should be used. The ability of a UPS unit to filter and stabilize the power supply voltage is more important than bridging long power outages. A switching operation in the high-voltage grid of the electricity supplier may only cause the office lighting to flicker, but it can cause switches and routers to restart and thus lead to network downtime. Bear in mind, however, that UPS units are also active components that must be monitored and maintained.