
Monday, 9 March 2015

Static vs Flow control in Oracle Data Integrator

FLOW CONTROL :

Data quality validation is done before loading the data into target tables.

The Check Knowledge Module (CKM) creates the E$ table and the SNP_CHECK_TAB table for the data quality check.

It validates the data in the I$ table before it is inserted into the target table. Rows that fail a check are deleted from the I$ table and inserted into the E$ table, and a summary error message along with the interface name is recorded in SNP_CHECK_TAB.

STATIC CONTROL :

Data quality validation is done after loading the data into the target tables.

The CKM validates the data on the target table; any erroneous rows are inserted into the E$ table and logged in SNP_CHECK_TAB. Note that, unlike in flow control, the incorrect rows are not deleted from the target table.
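The flow control behaviour described above can be sketched in plain Python. This is an illustrative sketch only, not ODI code: the constraint names and row fields are hypothetical stand-ins for ODI conditions, and the lists stand in for the I$ and E$ tables.

```python
# Sketch of CKM-style flow control: split staging (I$-like) rows into
# loadable rows and error (E$-like) rows BEFORE the target load.

def flow_control(i_rows, constraints):
    """Return (rows to load into the target, rows diverted to the error table)."""
    good, errors = [], []
    for row in i_rows:
        violations = [name for name, check in constraints.items() if not check(row)]
        if violations:
            # In ODI the violating row is removed from I$ and written to E$
            # together with an error message.
            errors.append({**row, "ERR_MESS": ", ".join(violations)})
        else:
            good.append(row)
    return good, errors

# Hypothetical constraints standing in for ODI conditions / mandatory columns:
constraints = {
    "NOT_NULL_NAME": lambda r: r.get("name") is not None,
    "POSITIVE_SALARY": lambda r: r.get("salary", 0) > 0,
}

staging = [
    {"id": 1, "name": "Ann", "salary": 100},
    {"id": 2, "name": None, "salary": 50},
]
good, errors = flow_control(staging, constraints)
# Only `good` would be inserted into the target; `errors` go to the E$ table.
```

With static control, by contrast, the same checks would run after the load and the failing rows would be copied to the error list without being removed from the target.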

What are the different temporary tables created in ODI?

Oracle Data Integrator creates several temporary tables during data transformation and loading/integration.
Let us look at each of them.

  1. E$ table: Generated by the Check Knowledge Module (CKM) when flow control is enabled; it holds the rows that failed the data quality checks.
  2. I$ table: Created by the Integration Knowledge Module (IKM). After the LKM loads the data into the C$ table, the IKM loads it into the I$ table.
  3. J$ table: Created by the Journalizing Knowledge Module (JKM) when Change Data Capture (CDC) is implemented.
  4. C$ table: Created by the LKM in the staging area; it holds the records loaded from the source server.
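
Note that at run time these tables participate in a different order than listed above: C$ is populated first, then I$, with E$ receiving rejects along the way. A small sketch of that order (the table name `CUSTOMER` is a hypothetical example; the $ tables are created and dropped by the KMs at run time):

```python
# Illustrative run-time order of the temporary tables in a load:
pipeline = [
    ("C$_CUSTOMER", "LKM stages the source rows in the staging area"),
    ("I$_CUSTOMER", "IKM builds the integration (flow) table from C$"),
    ("E$_CUSTOMER", "CKM diverts rows that fail constraints (flow control)"),
    ("CUSTOMER",    "IKM inserts/updates the remaining rows into the target"),
]
for table, role in pipeline:
    print(f"{table:12s} -> {role}")
```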

Saturday, 7 March 2015

Types of Work repositories


When creating a work repository under a master repository, you must choose the type of repository. There are two types of work repositories, as explained below.
  1. Development: This type of repository allows management of design-time objects such as data models and projects (including interfaces, procedures, etc.). A development repository also includes the run-time objects (scenarios and sessions). This type of repository is suitable for development environments.
  2. Execution: This type of repository only includes run-time objects (scenarios, schedules and sessions). It allows launching and monitoring of data integration jobs in Operator Navigator. Such a repository cannot contain any design-time artifacts. Designer Navigator cannot be used with it. An execution repository is suitable for production environments.

Oracle Data Integrator repositories

There are two types of repositories:
  1. Master repositories
  2. Work repositories
These repositories are databases stored in an RDBMS. Usually there is one master repository with several work repositories attached to it.

Master repository: The master repository stores the information defined in the Topology and Security navigators:
  1. Security information, i.e. users, profiles and access privileges
  2. Topology information, including technologies, data servers, physical and logical schemas, contexts, agents and languages
  3. ODI versions and objects
Work repository: A work repository must always be attached to a master repository.
The work repository stores the information defined in Designer Navigator:
  1. Data models: metadata, constraints, references, data lineage, etc.
  2. Projects: interfaces, packages, procedures, folders, knowledge modules, etc.
  3. Scenarios, scheduling information and logs
Work repository contents are managed through the Designer and Operator navigators.

Friday, 6 March 2015

ETL vs ELT

Traditional Extract Transform Load (ETL) tools operate by first Extracting the data from various sources, Transforming the data in a proprietary, middle-tier ETL engine that is used as the staging area, and then loading the transformed data into the target data warehouse, integration server, or Hadoop cluster. Hence the term ETL represents both the names and the order of the operations performed.

Drawbacks of ETL approach

  • The data transformation step of the ETL process is by far the most compute-intensive, and is performed entirely by the proprietary ETL engine on a dedicated server. The ETL engine performs data transformations and data quality checks on a row-by-row basis, and hence, can easily become the bottleneck in the overall process. 
  • Data must be moved over the network twice – once between the sources and the ETL server, and again between the ETL server and the target data warehouse or Hadoop cluster. 
  • To ensure referential integrity by comparing data flow references against values from the target data warehouse, the referenced data must be downloaded from the target to the engine, thus further increasing network traffic, download time, and leading to additional performance issues. 

Extract Load Transform (ELT) approach by Oracle Data Integrator (ODI)


In response to the issues raised by ETL architectures, a new architecture has emerged, which in many ways incorporates the best aspects of manual coding and automated code-generation approaches. Known as E-LT, this new approach changes where and how data transformation takes place, and leverages existing developer skills, RDBMS and Big Data engines, and server hardware to the greatest extent possible. In essence, E-LT moves the data transformation step to the target RDBMS, changing the order of operations to: Extract the data from the source tables, Load the tables into the destination server, and then transform the data on the target RDBMS using native SQL operators. Note, with E-LT there is no need for a middle-tier engine or server.
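
The contrast can be sketched in a few lines. This is a minimal illustration of the E-LT idea, using SQLite as a stand-in for the target RDBMS; the table names and the currency-conversion rule are hypothetical examples, not anything ODI generates.

```python
# E-LT sketch: Extract + Load the rows as-is, then Transform with a single
# set-based SQL statement running inside the target engine, instead of
# row-by-row processing in a middle-tier ETL server.
import sqlite3

con = sqlite3.connect(":memory:")        # stand-in for the target RDBMS
con.execute("CREATE TABLE stg_orders (id INTEGER, amount REAL)")
con.execute("CREATE TABLE dw_orders (id INTEGER, amount_usd REAL)")

# Extract + Load: rows land on the target untouched.
con.executemany("INSERT INTO stg_orders VALUES (?, ?)",
                [(1, 10.0), (2, 25.5)])

# Transform: one native SQL statement executed by the target engine.
con.execute("""
    INSERT INTO dw_orders (id, amount_usd)
    SELECT id, amount * 1.1 FROM stg_orders
""")
print(con.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0])
```

Because the transformation is a single set-based statement on the target, the data crosses the network once and no dedicated transformation server sits in the middle.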

Monday, 23 February 2015

Oracle Data Integrator Skills needed to become a good ETL Developer

The recruiter will look for the following skills while searching for an ETL or ODI developer. Particular positions may need different skills but these are the general ODI skills needed.

  1. Know everything about ODI architecture and ODI components.
  2. Working knowledge of creating interfaces, packages, constraints, procedures, SQL joins, functions, agents, etc.
  3. Extend the base ODI Knowledge Modules to create customized versions that will be re-usable across the company's data integration projects.
  4. Map business rules into Oracle Data Integrator interfaces, packages, load plans and source-to-target mappings.
  5. Take care of business rules using ODI Constraints and Data quality functions.
  6. Know the differences between Oracle Data Integrator versions like 10g, 11g and 12c.
  7. Working knowledge of Change Data Capture (CDC).
  8. Knowledge of data warehouse concepts such as dimensional modelling and entity-relationship modelling.
Apart from knowing the tool, the developer must understand data warehouse concepts in depth. Strong SQL skills are also a must.

Sunday, 22 February 2015

What are the Oracle Data Integrator GUI components?

Oracle Data Integrator is made up of the following graphical user interface (GUI) components: the Designer, Operator, Topology and Security navigators.
In current releases of ODI, all of these GUI components are combined into Oracle Data Integrator Studio.

What is an agent in Oracle Data Integrator (ODI)?

We know that there are run time and design time components in Oracle Data Integrator.

ODI agent is a run time component.

What is an ODI agent?

The Oracle Data Integrator agent is a Java program that listens on a particular TCP/IP port. The agent executes scenarios when invoked from one of the ODI GUI modules, or schedules these executions when started as a scheduler.


There are two types of ODI Agents available.

  1. Standalone Java agent: When a scenario execution request is received, it orchestrates the scenario execution and updates the session status in the work repository. Parent and child standalone agents can be configured in ODI Studio to provide load balancing between standalone agents.
  2. Java EE agent: This agent is deployed on WebLogic Server and listens on a particular port. When a scenario execution request is received, it orchestrates the scenario execution and updates the session status in the work repository. For a Java EE agent, load balancing is supported through WebLogic clusters.
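
The core loop both agent types share can be sketched conceptually. This is not ODI internals; the class names, the scenario name and the status strings are hypothetical stand-ins for the agent's "execute a scenario, record session status in the work repository" behaviour.

```python
# Conceptual sketch of an agent handling a scenario execution request.
import datetime

class WorkRepository:
    """Stand-in for the ODI work repository's session log."""
    def __init__(self):
        self.sessions = []

    def log(self, scenario, status):
        self.sessions.append((scenario, status, datetime.datetime.now()))

class Agent:
    def __init__(self, work_repo):
        self.work_repo = work_repo

    def execute(self, scenario, run):
        # Record the session, run the scenario, then record the outcome.
        self.work_repo.log(scenario, "RUNNING")
        try:
            run()
            self.work_repo.log(scenario, "DONE")
        except Exception:
            self.work_repo.log(scenario, "ERROR")

repo = WorkRepository()
agent = Agent(repo)
agent.execute("LOAD_SALES Version 001", lambda: None)
print([status for _, status, _ in repo.sessions])
```

A real agent additionally listens on its TCP/IP port for requests and, when started as a scheduler, triggers these executions from stored schedules.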

Difference between Oracle Data Integrator features 10g, 11g and 12c


Differences between Oracle Data Integrator 12c and 11g: 

ODI 12c is the latest version of ODI.


  1. Loading multiple target tables as part of a single mapping. An interface is termed a mapping from ODI 12c onwards.
  2. Redesigned declarative flow-based user interface, reusable mappings, runtime performance enhancements, Oracle GoldenGate integration improvements, Big Data and XML improvements, Oracle Warehouse Builder integration.
Differences between Oracle Data Integrator 11g and 10g:
  1. ODI 11g has all the ODI components(Designer, Operator, Topology and Security) integrated together as Oracle Data Integrator Studio.
  2. ODI 11g provides a Java API to manipulate both the design-time and run-time artifacts of the product. This API allows you, for example, to create or modify interfaces programmatically, create your topology, perform import or export operations, and launch or monitor sessions. It can be used in any Java SE or Java EE application, or in the context of Java-based scripting languages such as Groovy or Jython.
  3. External Password Storage, to have source/target data server (and context) passwords stored in an enterprise credential store, and External Authentication, to have user/password information stored in an enterprise identity store (e.g. LDAP, Oracle Directory, Active Directory), with ODI authenticating against this store. These two features let you optionally keep critical information in dedicated stores rather than within the ODI repository. The ODI Console may also use Oracle's single sign-on systems with ODI.

Friday, 20 February 2015

Oracle Data Integrator (ODI) Interview questions

1) What components make up Oracle Data Integrator?

"Oracle Data Integrator" comprises of the following

  1. Oracle Data Integrator + Topology Manager + Designer + Operator + Agent. 
  2. Oracle Data Quality for Data Integrator.
  3. Oracle Data Profiling.
2) What is a model in ODI?

A data model corresponds to a group of tabular data structures stored in a data server. It is based on a Logical Schema defined in the topology and contains only metadata.


3) What are User Parameters in Oracle Data Integrator?


Oracle Data Integrator saves user parameters such as default directories, window positions, etc. User parameters are saved in the userpref.xml file in /bin.


Tuesday, 27 January 2015

Oracle Data Integrator 11g Certified Implementation Essentials Syllabus

Introduction to Oracle Data Integrator (ODI)
  1. Describe the benefits of ODI (including ODI architecture - Extract Load Transform) and its typical use cases
  2. Describe the ODI components and their uses 
  3. Architecture Overview 
  4. Describe ODI Architecture 
  5. Install and Configure the physical and logical architecture 
  6. Explain the Fusion Middleware Components - Upgrade Assistant, Repository Creation Utility and Java Enterprise Edition (JEE) Components and High Availability