
New angles on data mining

Barry Grushkin is Chairman and CTO of the Machine Intelligence Company, a provider of decision support solutions. He is also a columnist for Intelligent Enterprise magazine. Here Grushkin provides insight into the burgeoning data mining market and its revolutionary technology for SearchDatabase readers.

What's unique about PolyVista's cube?
What PolyVista does that no one else seems to be doing is let you ask a global question of all levels of the cube at one time. It has a proprietary representation of the cube that does not require accessing the original database, but its speed will most likely have limitations as the cube grows.

What's a cube?
Cube is a term from OLAP (Online Analytical Processing). It refers to the model created by the intersection of numerous dimensions. In OLAP, a dimension is usually a hierarchical representation of some key aspect of a business. A geography dimension might at the highest level represent the world, then drill down to countries, then to states or provinces, then to cities. A time dimension might represent years, broken down further into months and days. These are sometimes called locator dimensions.

What sort of data can it yield?
At the intersection of any number of locator dimensions you can read off values of measure dimensions such as sales, costs, growth rates, and so forth. For example, at the intersection of New York and June 24, 2002 you can read the sales, i.e. sales for the business in New York on that date.

Are database vendors offering this technology?
Oracle and Teradata are actively integrating data mining methods as part of their systems. They are working to optimize where and how the calculations are done. Oracle 9i includes cube building capabilities. You can do calculations on that cube; again, you write code and access its library of data mining routines. Many of these routines came from the Darwin data mining system purchased by Oracle from Thinking Machines.

Where does SAS fit in this market?
SAS has a relational database language integrated with its statistical methods. They remain the most widely used statistical package. Additionally, they have slicing and dicing methods just as any OLAP system might. Again, you can write code to do calculations and data mining on any aggregations, but the key is you have to hardwire what you want to look for.

Are there more powerful data mining tools?
Others are far more industrial strength. They are designed for mega data applications. Oracle and Teradata are made for large business demands. Torrent has parallelized SAS for those who need rapid output for big data mining calculations. Microsoft's cube has its scaling limitations. MicroStrategy claims many users and large database capabilities, but I have heard mixed reviews from users.

PolyVista says they have no competitors. Is this true?
I guess they are in competition with anyone who has a method for looking at data and coming up with useful business information. Everyone has an angle. It seems to me there will never be a best solution, and different methods will always reveal different things. Directly there is nothing like PolyVista out there, but I can say this about other data mining and visualization systems. They have succeeded in uniquely integrating data mining, visualization, and OLAP. I think their next step should be finding partners and applications where their unique approach can add the most value, and building on these verticals.
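
The cube described earlier in the interview can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not the representation used by PolyVista, Oracle, SAS, or any other vendor mentioned here: the fact rows, the `ALL` marker, and the functions `build_cube` and `sales_at` are all hypothetical names and data invented for illustration. It pre-aggregates a sales measure at every level of a geography dimension and a time dimension, so a value can be read off at any intersection.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, state, city, year, month, day, sales)
FACTS = [
    ("USA", "New York", "New York", 2002, 6, 24, 1200.0),
    ("USA", "New York", "Albany", 2002, 6, 24, 300.0),
    ("USA", "California", "San Francisco", 2002, 6, 25, 800.0),
    ("Canada", "Ontario", "Toronto", 2002, 7, 1, 500.0),
]

ALL = "*"  # marks a level of a dimension that has been rolled up


def build_cube(facts):
    """Sum sales at every combination of geography level and time level."""
    cube = defaultdict(float)
    for country, state, city, year, month, day, sales in facts:
        # Each locator dimension is a hierarchy, listed from coarsest to finest.
        geo_levels = [
            (ALL, ALL, ALL),          # world
            (country, ALL, ALL),      # country
            (country, state, ALL),    # state or province
            (country, state, city),   # city
        ]
        time_levels = [
            (ALL, ALL, ALL),          # all time
            (year, ALL, ALL),         # year
            (year, month, ALL),       # month
            (year, month, day),       # day
        ]
        for geo in geo_levels:
            for time in time_levels:
                cube[(geo, time)] += sales
    return cube


def sales_at(cube, geo, time):
    """Read the sales measure at one intersection of the two dimensions."""
    return cube.get((geo, time), 0.0)


cube = build_cube(FACTS)

# Sales in New York City on June 24, 2002
nyc_day = sales_at(cube, ("USA", "New York", "New York"), (2002, 6, 24))

# Rolled up: New York State for all of June 2002
ny_state_june = sales_at(cube, ("USA", "New York", ALL), (2002, 6, ALL))
```

Storing every level up front trades memory for lookup speed, which is one way to see why any precomputed cube can run into limits as the number of dimensions and levels grows.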

