Big Data Analysis and SAP HANA
Big Data is a popular term used to describe a massive volume of both structured and unstructured data. We live in the era of Big Data, which is amassed from everywhere, in particular from social networks, sensors, commercial transactions, and communications. A staggering and often-cited statistic is that 90% of the data in the world today was collected in just the past two years.
While the term Big Data may seem to refer only to the volume of data, that isn't always the case. The term may also refer to the new generation of technologies and architectures designed to extract value economically from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis. This definition includes hardware, software, and services that integrate, organize, manage, analyze, and present data characterized by the 3Vs: volume, variety, and velocity. Variety is a critical attribute of Big Data. The combination of data from a variety of sources and in a variety of formats is a key criterion in determining whether an application can be considered Big Data.
SAP offers a range of technologies that address Big Data use cases and requirements:
The SAP HANA platform enables instant analysis of large amounts of multistructured data and the embedding of analytics into operational applications. Operational data is captured in memory and made available for instant analysis.
Enterprise information management solutions from SAP help to manage Big Data from a variety of data sources. In particular, SAP Data Services delivers ETL, data quality management, and text data processing capabilities for structured or unstructured data residing in databases, data warehouses, or distributed file systems such as Hadoop.
New solutions, such as SAP Predictive Analysis and SAP Lumira, enable business users to manipulate data quickly. SAP Predictive Analysis is a statistical analysis and data mining solution that enables users to build predictive models to discover hidden insights and relationships in data, from which they can make predictions about future events. It offers an intuitive, drag-and-drop, code-free experience with enough power for data scientists to conduct more sophisticated analysis using Big Data, yet it is simple enough to allow business analysts to conduct forward-looking analysis using departmental data from Excel.
Instead of installing a separate application to perform the analysis, a data scientist can use the Predictive Analysis Library (PAL) component of SAP HANA, a library of in-database predictive analysis algorithms. The Predictive Analysis Library defines functions that can be called from within SQLScript procedures to perform classic and universal predictive analysis algorithms.
Predictive analysis can be simply defined as quantitative analysis that supports the making of predictions, for example of product sales, costs, headcount, key performance metrics, customer churn, creditworthiness, cross-selling and up-selling opportunities, marketing campaign response, anomalies, and possible fraud. It is a relatively new term, but not a new topic, given its foundation in such disciplines as statistical analysis, machine learning, and operations research. Until recently it was usually referred to as data mining. The heart of predictive analytics is finding the relationship between known variables and a predicted variable, using past occurrences. This relationship is then used to predict an unknown outcome.
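To make the idea of learning a relationship from past occurrences concrete, here is a minimal sketch in plain Python. It does not use SAP HANA or PAL. It assumes a single known variable and a straight-line relationship fitted by ordinary least squares, and the advertising and sales figures are made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new known value."""
    slope, intercept = model
    return intercept + slope * x


# Hypothetical past occurrences (illustrative numbers, not real data):
# advertising spend is the known variable, units sold the predicted one.
ad_spend = [10, 20, 30, 40, 50]
units_sold = [25, 45, 65, 85, 105]

# Step 1: find the relationship from past occurrences.
model = fit_line(ad_spend, units_sold)

# Step 2: use that relationship to predict an unknown outcome.
forecast = predict(model, 60)
print(f"slope={model[0]}, intercept={model[1]}, forecast={forecast}")
```

Real predictive models usually involve many known variables and more flexible algorithms, but the two steps stay the same: fit on history, then apply the fitted relationship to new inputs.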
- On 07/11/2014