Companies considering text analytics software, whether to monitor social media channels and gauge customer sentiment, ensure compliance with e-discovery requirements or solve other business problems, first need to ask themselves what they want out of the software. Only then can they choose a vendor, given the wide range of differences among analytics tools, according to experts.
"Rather than starting with the technology, [companies] need to start with the need: who the users are, what the task is and what the outcome should be," said Susan Feldman, research vice president for search and discovery technologies at the Framingham, Mass.-based analyst firm IDC. The needs can range from broad sentiment analysis to precise analytics that home in on key words and phrases for legal compliance purposes, she added.
According to Feldman, approximately 80% of information isn't structured in a classic computing sense, like in a database. That's where text analytics comes in: it helps users understand the inherent structure and relationships between words. It can be used for everything from gauging what users are saying on Twitter about a company to finding documents containing a particular phrase in a legal situation.
The text analytics marketplace
SAP users have an array of choices, including SAP BusinessObjects Data Services, as well as Social Media Analytics by Netbase, which SAP resells. SAP shops can also choose from mega-vendors like IBM and SAS, as well as enterprise content management (ECM) and search platform vendors whose text analytics application is at the core of their offerings. There are also specialized vendors that provide some or all of the capabilities, and vendors that provide text analytics as a feature, not as the core functionality, according to Gartner's "Who’s Who in Text Analytics." The study recommended evaluating a range of vendors, because so many of them support e-discovery, customer voice and social media.
Determining what the goals are
According to Tom Reamy, chief knowledge architect at KAPS Group LLC, an Oakland, Calif.-based text analytics consultancy, companies need to focus on two things before diving into text analytics: the capabilities that text analytics offers, and what their goals are for text analytics.
According to a recent survey Reamy conducted, most businesses don't know what text analytics can do for them. "To me, it's always critical to start there. Once you have that [understanding of text analytics], then we can do an evaluation and figure out exactly what features you really need," he said.
Choosing the right product depends on what the business plans to use text analytics for, Reamy said. For example, one documentation company needed its text analytics software to support as many languages as possible. Some products support only three or four languages, while others support more than 40. "If that's important, that's a big decider," Reamy said. "It's really critical to start by what you're going to do with the text analytics."
Once they have decided on a goal for text analytics, most companies start by classifying their data, identifying strings and sentences in the text, and then reporting on those findings, said IDC's Feldman. However, no matter what the company wants to do, it should start with the goals for text analytics and work backward, she said.
"Always start with the information you need to get, then back-check and say, 'OK, in order to get to that point, what do we have to do? Which information is it, and how do we have to analyze it?'" she said. For example, an online retailer would need its text analytics tool to search for, categorize and tag data, returning the data by category type in a report.
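As a rough illustration of the retailer example, the search, categorize, tag and report steps could look something like the sketch below. It is only a minimal keyword-matching sketch, not how any vendor's product works: the category names, keyword lists and function names (tag_text, build_report) are hypothetical, and a commercial tool would rely on linguistic analysis and a tuned taxonomy rather than simple word lookups.

```python
import re
from collections import defaultdict

# Hypothetical categories and keywords, chosen only for illustration.
CATEGORY_KEYWORDS = {
    "shipping": {"delivery", "shipping", "arrived", "late"},
    "pricing": {"price", "expensive", "cheap", "refund"},
    "quality": {"broken", "defective", "sturdy", "quality"},
}


def tag_text(text):
    """Return the sorted categories whose keywords appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if words & keywords
    )


def build_report(comments):
    """Group comments by category; untagged ones go under 'uncategorized'."""
    report = defaultdict(list)
    for comment in comments:
        tags = tag_text(comment) or ["uncategorized"]
        for tag in tags:
            report[tag].append(comment)
    return dict(report)


if __name__ == "__main__":
    sample_comments = [
        "The delivery was late and the box arrived damaged.",
        "Great price, but the handle was broken.",
        "Love this store!",
    ]
    # Print a simple per-category count, as a stand-in for a real report.
    for category, items in sorted(build_report(sample_comments).items()):
        print(f"{category}: {len(items)} comment(s)")
```

Working backward in the way Feldman describes, the report format (comments grouped by category) is decided first, and the tagging step exists only to produce it.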
How does SAP text analytics stack up?
SAP's text analytics offering is very competitive and offers plenty of capabilities, but may not be the best tool for every SAP shop, said Seth Grimes, IT strategy consultant and leading industry analyst with consultancy Alta Plana, based in Takoma Park, Md.
"It will be natural for an existing SAP customer to try out the SAP solution. It does handle a lot of different human languages … although that's not something everyone needs," Grimes said. "The real question is how well-integrated the SAP solution is for text analytics and sentiment analysis with the business problems you're trying to solve." That can be complying with a discovery request in a lawsuit, analyzing social media sentiment or a range of text-parsing uses.
SAP has been making a play for the text analytics market with HANA, its in-memory database platform, but experts agreed that in-memory analytics may not make much of a difference with text analytics.
In-memory technology's quick response time gives it an advantage for interactive, visual or exploratory analysis, but it isn't the best solution in every case, Grimes said. "If you're dealing with a truly massive number of records, it's not going to be possible to fit all that data in-memory," he said. SAP HANA is trying to scale beyond traditional in-memory capabilities, and it offers high levels of compression that allow users to store more information in-memory, he added.
In-memory doesn't work if most of your data sources are external, said IDC's Feldman. "In-memory means that you've already tagged, stored and analyzed information so it's available very fast, but it also means you've had to do the processing ahead of time," she said. Information will increasingly be stored in a variety of locations, so that is something to consider, Feldman added.
The bottom line is to evaluate the text analytics package as it fits the company's business needs, experts said. "Different products are better adapted to different business problems," Grimes said. He advised examining whether the vendor has experience dealing with industry-specific problems and whether it can adapt to the company's business needs, as well as evaluating expected issues of accuracy and speed.
This was first published in November 2012