
Planning a HANA big data strategy with SAP HANA Vora

SAP has worked hard to position HANA as a big data platform. To formulate a viable big data strategy, you need to know the tools, such as SAP HANA Vora.

As big data continues to flood into the enterprise, companies need to determine how best to control and use it. SAP has invested significantly in extending HANA's capabilities as the underlying database for processing big data, but actually using HANA as a big data platform is no simple project. You need to plan a big data strategy around HANA and its companion tools, such as SAP HANA Vora, which SAP believes can form the foundation of an enterprise big data platform.

To formulate a big data strategy around HANA, it's important to understand the tools you have at your disposal. First and foremost is SAP HANA Vora, an in-memory query engine that extends Apache Spark to integrate with data stored in Hadoop. CenterPoint Energy Inc., a Houston-based electric and natural gas utility, is one of the first enterprises to use HANA as a big data platform. Its data is stored in Hadoop, and SAP HANA Vora is the engine that analyzes and makes business sense of that data. Because CenterPoint Energy delivers power to more than 2.3 million consumers in six states and collects electronic meter data every 15 minutes for energy use reporting, its data-storage costs are very high. To help address this, SAP and CenterPoint Energy built a testing environment that processed over 5 billion data records with Hadoop, HANA and SAP HANA Vora. The test gives anyone planning a HANA-based big data strategy a good sense of the business value they can derive from SAP HANA Vora.
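To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes one meter per consumer and exactly one reading per meter in every 15-minute interval; neither assumption comes from CenterPoint Energy or SAP, and the function names are purely illustrative. Under those assumptions, it estimates how many readings arrive per day and how many days of collection the 5 billion test records would represent.

```python
# Back-of-the-envelope estimate of smart-meter reading volume.
# Assumptions (not from the article): one meter per consumer, and every
# meter reports exactly one reading per 15-minute interval.

CONSUMERS = 2_300_000          # "more than 2.3 million consumers" (from the article)
INTERVAL_MINUTES = 15          # collection interval (from the article)
TEST_RECORDS = 5_000_000_000   # "over 5 billion data records" (from the article)


def readings_per_day(consumers: int, interval_minutes: int) -> int:
    """Readings collected per day across all consumers."""
    intervals_per_day = (24 * 60) // interval_minutes
    return consumers * intervals_per_day


def days_to_accumulate(records: int, daily_readings: int) -> float:
    """Days of collection needed to accumulate the given record count."""
    return records / daily_readings


if __name__ == "__main__":
    daily = readings_per_day(CONSUMERS, INTERVAL_MINUTES)
    days = days_to_accumulate(TEST_RECORDS, daily)
    print(f"Readings per day: {daily:,}")
    print(f"Days to reach the test volume: {days:.1f}")
```

With these inputs, the estimate works out to roughly 220 million readings per day, so 5 billion records would correspond to a little over three weeks of meter data. The real figures depend on how many meters each customer has and on what the test data set actually contained.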

However, deploying and running HANA or HANA Vora is challenging, and it's good to understand some best practices for using them in big data projects. The first decision, for example, may be whether to deploy HANA on premises or in the cloud, and this usually depends on how companies treat their data. If security and data governance are serious concerns, it might be better to deploy HANA on premises. If a company is moving data from an older SAP system onto HANA, it makes sense to determine what data to move because HANA is a costly platform for data storage and not all data may be needed.

Companies should also plan for the amount of memory required to support their applications, because this provides the basis for the hardware recommendations. Before starting a big data HANA project, it's also important to select the right implementation partner. Companies should investigate which potential partners have experience in their industries and understand the focus of the project.

Understanding the technical challenges of implementing HANA and tools like SAP HANA Vora for big data applications is important, but it's only part of the story if these projects are going to be successful. There's much that companies need to consider as they integrate big data into their business processes, and technology is only one of those considerations, according to David Jonker, SAP's senior director of big data initiatives. Jonker said that many people in the tech industry still don't understand the human implications of big data for business. One of the first and most crucial steps in developing a big data strategy is defining the business problems that big data applications can solve. Companies need to determine the insights that they can derive from big data and apply them to day-to-day operations. After this, they need to determine which platform fits best and how it can be configured to solve the problem. Above all, a big data strategy has the best chance of succeeding if the business and IT organizations are aligned around a common framework and language.

