Oracle DBMS_PREDICTIVE_ANALYTICS
Version 12.1.0.1

General Information
Library Note Morgan's Library Page Footer
The Library is currently in the process of being upgraded from Oracle Database Version 12.1.0.1 to 12.1.0.2. Demos are being upgraded to reflect the new Container paradigm as well as EBR (Edition Based Redefinition) and may contain references to CDBs, PDBs, and other objects you may not be familiar with such as CDB_OBJECTS_AE: Welcome to 12c.

Interested in Oracle GoldenGate? Check out the IGGOUG,
Purpose Provides routines for predictive analytics operations useful in data mining
AUTHID CURRENT_USER
Data Mining Option Installation Check col parameter format a30
col value format a30

SELECT parameter, value
FROM gv$option
WHERE parameter = 'Data Mining';

col comp_name format a35

SELECT comp_name, version, status
FROM dba_registry
ORDER BY 1;
Dependencies
DBMS_ASSERT DMP_SEC DUAL
DBMS_DATA_MINING DMP_SYS ODM_MODEL_UTIL
DBMS_SQL DM_QGEN PLITBLM
DBMS_SYS_ERROR    
Documented Yes
First Available Not Known
Security Model Owned by SYS with EXECUTE granted to PUBLIC
Source {ORACLE_HOME}/rdbms/admin/prvtdmmi.plb
System Prvileges The package exposes APIs which are leveraged by the Oracle Data Mining option. Users who wish to invoke procedures in this package require the CREATE MINING MODEL system privilege (as well as the CREATE TABLE and CREATE VIEW system privilege).
Subprograms
 
EXPLAIN
Used for identifying attributes that are important/useful for explaining the variation on an attribute of interest (e.g., a measure of an OLAP fact table). Only known cases (i.e. cases where the value of the explain column is not null) will be taken into consideration when assessing the importance of the input attributes upon the dependent attribute. The resulting table will contain one row for each of the input attributes. dbms_predictive_analytics.explain(
data_table_name     IN VARCHAR2,
explain_column_name IN VARCHAR2,
result_table_name   IN VARCHAR2,
data_schema_name    IN VARCHAR2 DEFAULT NULL);
-- session 1
conn sh/sh@pdborcl

CREATE VIEW sales_view AS
SELECT country_name country, prod_name prod, calendar_year year,
SUM(amount_sold) sale, COUNT(amount_sold) cnt
FROM sales, times, customers, countries, products
WHERE sales.time_id = times.time_id AND
sales.prod_id = products.prod_id 
AND sales.cust_id = customers.cust_id
AND customers.country_id = countries.country_id
GROUP BY country_name, prod_name, calendar_year;

SELECT view_name
FROM user_views;

-- session 2
conn sh/sh@pdborcl

CREATE SEQUENCE seq;

CREATE TABLE pa_demo AS
SELECT seq.NEXTVAL ID, country, prod, year, sale, cnt
FROM sales_view;

UPDATE pa_demo
SET country = 'US'
WHERE country = 'United States of America';
COMMIT;

set linesize 121
col country format a20

SELECT *
FROM pa_demo;

BEGIN
  dbms_predictive_analytics.explain ('PA_DEMO','COUNTRY','PA_RESULTS');
END;
/

-- session 1: while session 2 is running
SELECT view_name
FROM user_views;

-- the actual view names will be different
desc DM$TCAS_BLWUXAMINMARTTNMWABYPH

desc DM$TPBV_ZQIVNCNBKYGLMXHYJSVUSD

-- session 2: continue
desc pa_results

set linesize 121
col attribute_name format a30
col attribute_subname format a20

SELECT *
FROM pa_results;

drop table pa_results purge;

BEGIN
  dbms_predictive_analytics.explain('PA_DEMO', 'PROD', 'PA_RESULTS');
END;
/

SELECT *
FROM pa_results;

drop table pa_results purge;

BEGIN
  dbms_predictive_analytics.explain ('PA_DEMO', 'YEAR', 'PA_RESULTS');
END;
/

SELECT *
FROM pa_results;

drop table pa_results purge;

BEGIN
  dbms_predictive_analytics.explain ('PA_DEMO','SALE','PA_RESULTS');
END;
/

SELECT *
FROM pa_results;

drop table pa_results purge;

BEGIN
  dbms_predictive_analytics.explain ('PA_DEMO','CNT','PA_RESULTS');
END;
/

SELECT *
FROM pa_results;

-- session 1: when procedure run is completed
SELECT view_name
FROM user_views;

desc pa_results

SELECT *
FROM pa_results;

-- the higher the value the more important the attribute is to
-- whether any specific record in SALES is related to the sale.


-- another example
SELECT promo_id, COUNT(DISTINCT TO_CHAR(time_id, 'DDD'))
FROM sales
GROUP BY promo_id;

BEGIN
  dbms_predictive_analytics.explain ('SALES', 'PROMO_ID', 'RTAB');
END;
/

SELECT promo_id, COUNT(DISTINCT TO_CHAR(time_id, 'DDD'))
FROM sales
GROUP BY promo_id;

SELECT DISTINCT promo_id FROM sales ORDER BY 1;

SELECT promo_id, COUNT(DISTINCT TO_CHAR(time_id, 'DDD'))
FROM sales
GROUP BY promo_id;

SELECT DISTINCT TO_CHAR(time_id, 'DDD')
FROM sales
WHERE promo_id = 33
INTERSECT
SELECT DISTINCT TO_CHAR(time_id, 'DDD')
FROM sales
WHERE promo_id = 350;

SELECT DISTINCT TO_CHAR(time_id, 'DDD')
FROM sales
WHERE promo_id = 33
INTERSECT
SELECT DISTINCT TO_CHAR(time_id, 'DDD')
FROM sales
WHERE promo_id = 351;
 
PREDICT
Produces predictions for unknown targets. The input data table should contain records where the target value is known (not null). The known cases will be used to train and test a model. Any cases where the target is unknown, i.e. where the target value is null, will not be considered during model training. Once a mining model is built internally, it will be used to score the records from the input data (both known and unknown), and a table will be persisted containing the results. In the case of binary classification, an ROC analysis of the results will be performed, and the predictions will be adjusted to support the optimal probability threshold resulting in the highest True Positive Rate (TPR) versus False Positive Rate (FPR). dbms_predictive_analytics.predict(
accuracy            OUT NUMBER,
data_table_name     IN VARCHAR2,
case_id_column_name IN VARCHAR2,
target_column_name  IN VARCHAR2,
result_table_name   IN VARCHAR2,
data_schema_name    IN VARCHAR2 DEFAULT NULL);
conn sh/sh@pdborcl

set serveroutput on

CREATE SEQUENCE seq;

CREATE TABLE pa_demo AS
SELECT seq.NEXTVAL ID, country, prod, year, sale, cnt
FROM sales_view;

UPDATE pa_demo
SET country = 'US'
WHERE country = 'United States of America';
COMMIT;

set linesize 121
col country format a20

SELECT *
FROM pa_demo;

DECLARE
 v_accuracy NUMBER(30,10);
BEGIN
  dbms_predictive_analytics.predict(
  ACCURACY => v_accuracy,
  DATA_TABLE_NAME => 'PA_DEMO',
  CASE_ID_COLUMN_NAME => 'ID',
  TARGET_COLUMN_NAME => 'CNT',
  RESULT_TABLE_NAME => 'PA_RESULTS');

  dbms_output.put_line('*** Accuracy ***');
  dbms_output.put_line(v_accuracy);
END;
/

desc pa_results

SELECT COUNT(*)
FROM pa_results;

SELECT prediction, COUNT(*)
FROM pa_results
GROUP BY prediction
ORDER BY 2;

DROP TABLE pa_results PURGE;
 
PROFILE
Segment data based on some target attribute and value. Creates profiles or rules for records where the specific attribute and value exist, in some sense it can be seen directed or supervised segmentation. dbms_predictive_analytics.profile(
data_table_name    IN VARCHAR2,
target_column_name IN VARCHAR2,
result_table_name  IN VARCHAR2,
data_schema_name   IN VARCHAR2 DEFAULT NULL);
exec dbms_predictive_analytics.profile('PA_DEMO', 'CNT', 'PA_RESULTS', 'SH');

desc pa_results

SELECT COUNT(*)
FROM pa_results;

set linesize 121
set long 1000000

SELECT profile_id, record_count, description
FROM pa_results
ORDER BY 1;

DROP TABLE pa_results PURGE;

Related Topics
DM_QGEN
Packages

Morgan's Library Page Footer
This site is maintained by Dan Morgan. Last Updated: This site is protected by copyright and trademark laws under U.S. and International law. © 1998-2014 Daniel A. Morgan All Rights Reserved