There are several complex ways of generating PL/SQL ETL modules based on metadata. Now we have to avoid all the rules such as:
Open metadata interface
Custom - dynamic ETL patterns (templates)
...
Following stuff is dedicated to be used in places with the minimum impact into infrastructure requirement.
XMLMap concept of generated mappings supposes generating of package body based on package specification enhanced by XML metadata as a comment in the end of specification. Solution requires two following packages:
CREATE OR REPLACE package BW_Buffer AUTHID CURRENT_USER as
--------------------------------------------------------------
--Purpose: Execute and Output buffer to store texts --
--Author: Bob Jankovsky, copyleft 2005 --
--Version: 1.2 /09-FEB-2005 --
--------------------------------------------------------------
TYPE TA_CHR is table of VARCHAR2(32767) index by binary_integer;
g_Buffer TA_CHR;
g_append_pos Integer;
g_append_len Integer;
CREATE OR REPLACE PACKAGE BW_SIMPLE_GEN AS
--------------------------------------------------------------
--Purpose: Module generates package body based on XML --
-- specification of mapping stored as comment --
-- in package specification. --
--Author: Bob Jankovsky, copyleft 2007 --
--Version: 1.0 /19-OCT-2007 --
--Exceptions: 20006 ... Unsuccessfully compiled module --
--------------------------------------------------------------
Procedure GenPackageBody(
SRCTAB/WHERE - specifies local filter condition associated with particular table. It's usefull to define it here instead of common FILTER cause of more efficient manipulation with matadata.
SRCTAB/ON - specifies join conditions.
CREATE OR REPLACE PACKAGE ETL_PT_CUST_M IS
/* ===================================================
Purpose: Fill the PARTY table from CUSTOMERS
Version: 1.0
Change History:
Date Author Change
20.08.2008 Bob Jankovsky New module (v 1.0)
*/
procedure main
(p_process_key in number default -1,
p_effective_date in date default to_date('10000101','YYYYMMDD'),
p_load_partition in varchar2 default null
);
The FILTER element
describes various kind of filters valid over all mapping.
SOURCE - filtering condition valid for input set of data. It use input notation (source names and aliases)
TARGET - filtering condition valid for target set of data. It use target notation (target names and TRG$$ as alias)
SLURCE_OUTER - like SOURCE but in a target notation. It should be used for evaluation of used analytical functions
ON - additional contition for MERGE ON in a target notation
INSET - additional contition for INSERT WHERE in a target notation
UPDATE - additional contition for UPDATE WHERE in a target notation
Attributes:
Attribute
Type
Description
TargetColumn
Mandatory
Unique name of target column.
SourceAlias
Optional
Alias of source table of column.
SourceAlias
Optional
Alias of source table of column. It should be specified in some SRCTAB specification.
SourceColumn
Optional
Name of source column. It should be specified in pair with SourceAlias.
MatchFlag
{Y|N}
Flag of column used as a matching key.
UpdateFlag
{Y|N}
Flag of update could be used to suppress updating of some values specified just during insert. Logically it is unset for coumn with MatchFlag set.
DirectFlag
{Y|N}
Flag of direct usage of constant or independent value - useful for non-deterministic function such as sequences.
DummyFlag
{Y|N}
Flag of column used just for filtering or deduplication but for target modufication.
PartFlag
{Y|N}
Flag specific for CREEPING DEATH algorithm.
ValidFromFlag
{Y|N}
Flag specific for CREEPING DEATH algorithm. It specifies column with VALID_FROM meaning.
ValidToFlag
{Y|N}
Flag specific for CREEPING DEATH algorithm. It specifies column with VALID_THRU meaning.
NullableFlag
{Y|N}
Flag, that target column can be null.
Expr
Optional
Identifier of simple expression from the EXPR element.
UpdateExpr
Optional
Identifier of simple expression from the EXPR element specific just for update.
InsertExpr
Optional
Identifier of simple expression from the EXPR element specific just for insert.
package body ETL_PT_CUST_M as
--Procedure MAIN
procedure main
(p_process_key in number default -1,
p_effective_date in date default to_date('10000101','YYYYMMDD'),
p_load_partition in varchar2 default null
) is
--Audit variables
v_processed_rows Integer:=0;
v_Module_name Varchar2(100):='ETL_PT_CUST_M';
Begin
--Log info -- some logging procedure should be added here
--Mapping statement
MERGE INTO A2.PARTY TRG$$
USING(
SELECT
'SR1' as PT_SID
,to_char(CUST.ID) as PT_ID
,CUST.CUST_FIRST_NAME as PT_FIRST_NAME
,CUST.CUST_LAST_NAME as PT_LAST_NAME
,CUST.CUST_BIRTH_DATE as PT_BIRTH_DATE
,CUST.CUST_CODE as PT_CODE
,CUST.CUST_DELETED_FLAG as PT_DELETED_FLAG
,p_effective_date as UPD_EFF_DATE
,sysdate as INS_DT
,sysdate as UPD_DT
,p_process_key as INS_PROCESS
,p_process_key as UPD_PROCESS
FROM A2.CUSTOMER CUST
) SRC$$
ON(
TRG$$.PT_SID=SRC$$.PT_SID
AND TRG$$.PT_ID=SRC$$.PT_ID
)
WHEN MATCHED THEN UPDATE SET
TRG$$.PT_FIRST_NAME=SRC$$.PT_FIRST_NAME
,TRG$$.PT_LAST_NAME=SRC$$.PT_LAST_NAME
,TRG$$.PT_BIRTH_DATE=SRC$$.PT_BIRTH_DATE
,TRG$$.PT_CODE=SRC$$.PT_CODE
,TRG$$.PT_DELETED_FLAG=SRC$$.PT_DELETED_FLAG
,TRG$$.UPD_EFF_DATE=SRC$$.UPD_EFF_DATE
,TRG$$.INS_DT=SRC$$.INS_DT
,TRG$$.UPD_DT=SRC$$.UPD_DT
,TRG$$.INS_PROCESS=SRC$$.INS_PROCESS
,TRG$$.UPD_PROCESS=SRC$$.UPD_PROCESS
WHERE (TRG$$.PT_FIRST_NAME!=SRC$$.PT_FIRST_NAME
OR TRG$$.PT_LAST_NAME!=SRC$$.PT_LAST_NAME
OR TRG$$.PT_BIRTH_DATE!=SRC$$.PT_BIRTH_DATE
OR TRG$$.PT_CODE!=SRC$$.PT_CODE
OR TRG$$.PT_DELETED_FLAG!=SRC$$.PT_DELETED_FLAG
OR TRG$$.UPD_EFF_DATE!=SRC$$.UPD_EFF_DATE
OR TRG$$.INS_DT!=SRC$$.INS_DT
OR TRG$$.UPD_DT!=SRC$$.UPD_DT
OR TRG$$.INS_PROCESS!=SRC$$.INS_PROCESS
OR TRG$$.UPD_PROCESS!=SRC$$.UPD_PROCESS
)--/differential part
WHEN NOT MATCHED THEN INSERT
(
TRG$$.PT_SEQ
,TRG$$.PT_SID
,TRG$$.PT_ID
,TRG$$.PT_FIRST_NAME
,TRG$$.PT_LAST_NAME
,TRG$$.PT_BIRTH_DATE
,TRG$$.PT_CODE
,TRG$$.PT_DELETED_FLAG
,TRG$$.UPD_EFF_DATE
,TRG$$.INS_DT
,TRG$$.UPD_DT
,TRG$$.INS_PROCESS
,TRG$$.UPD_PROCESS
)VALUES(
PT_SEQ.nextval
,SRC$$.PT_SID
,SRC$$.PT_ID
,SRC$$.PT_FIRST_NAME
,SRC$$.PT_LAST_NAME
,SRC$$.PT_BIRTH_DATE
,SRC$$.PT_CODE
,SRC$$.PT_DELETED_FLAG
,SRC$$.UPD_EFF_DATE
,SRC$$.INS_DT
,SRC$$.UPD_DT
,SRC$$.INS_PROCESS
,SRC$$.UPD_PROCESS
)
;
v_processed_rows:=v_processed_rows+SQL%ROWCOUNT;
--Log info -- some logging procedure should be added here
End Main;
End ETL_PT_CUST_M;
NOTE:
I don't recommend to use mentioned solution as a system way of ETL architecture. If you have to solve that, contact me, I'll recommend you better way. Anyway, this way is useful for minimum impact requirements and for illustration and education cases.
Particular patterns used in a transformation phase of ETL and advanced options of this simple generator will be described in later in separate articles.
The PREMAPPING/POSTMAPPING element
can contain PL/SQL code executed either before (PREMAPPING) or after (POSTMAPPING) mapping. It solves tasks such as truncate or create proper partition, calculate statistics etc.
The EXECCONDITION element
can contain condition when the mapping has to be executed. It is evaluated before PREMAPPING.
<EXECCONDITION Comment="Do not work when the month was loaded yet">
C_EFFECTIVE_MONTH>C_LAST_WCP_MONTH
</EXECCONDITION>
Attributes:
Attribute
Type
Description
Comment
Optional
Comment of the condition.
The SRCTAB element - advanced options
Following options are possible using subelements of SRCTAB:
Deduplication (DEDUP_PARTITION,DEDUP_ORDER)
Aggregation (AGG)
Subquery (SUBJOIN)
The SRCTAB/DEDUP_PARTITION element Key attributes of deduplication set.
The SRCTAB/DEDUP_ORDER element Order based preference for choise proper of duplicates to be used.
Example:
<SRCTAB Owner="A2" Name="PARTY_ADDR" Alias="PTADR" Type="JOIN">
<ON>PTADR.PT_SEQ=PT.PT_SEQ</ON>
<WHERE>P_EFFECTIVE_DATE between PTADR.VALID_FROM and PTADR.VALID_THRU</WHERE>
</SRCTAB>
could be daduplicated following way:
<SRCTAB Owner="A2" Name="PARTY_ADDR" Alias="PTADR" Type="JOIN">
<ON>PTADR.PT_SEQ=PT.PT_SEQ</ON>
<WHERE>P_EFFECTIVE_DATE between PTADR.VALID_FROM and PTADR.VALID_THRU</WHERE>
<DEDUP_PARTITION>PT_KEY</DEDUP_PARTITION>
<DEDUP_ORDER>PTADR_TYPE desc,VALID_FROM desc</DEDUP_ORDER>
</SRCTAB>
It deduplicates addresses prefering some types and more recent records.
The SRCTAB/AGG element allows specify an aggregation function used on source table.
For each aggregated column one tag. All unspecified columns used in mappings will be used in GROUP BY clause.
Example:
Aliased columns are further named as:
M_CC.SUM_ACCFC_BALANCE_CZK a M_CC.MIN_ACCFC_BALANCE_CZK.
Despite ColumName and AggFunction attributes are marked as mandatory, there is one exception. Following example realizes simple DISTINCT of the result set.
The PTTP.PTTP_SOURCE_ID column will be further named as PTH.PTTP_ID.
Attributes:
Attribute
Type
Description
ColumnName
Mandatory
Inner name of column.
AliasName
Mandatory
Further name - alias
LOOP and LOOP/VAR elements allows repeat execution of the same module several times with various values of defined constants. Execution is repeated including EXECCONDITION, PREMAPPING, POSTMAPPING. Every variable should be defined as a constant before.
Example: