Dataware housing ETL Interview question and answers

What are snapshots? What are materialized views & where do we use them? What is a materialized view log?
Materialized view is a view in wich data is also stored in some temp table.i.e if we will go with the View concept in DB in that we only store query and once we call View it extract data from DB.But In materialized View data is stored in some temp tables.

What are the various tools?
The various ETL tools are as follows.
Informatica
Datastage
Business Objects Data Integrator
OLAp tools are as follows.
Cognos
Business Objects 

Can Informatica load heterogeneous targets from heterogeneous sources?
yes! it loads from heterogeneous sources..

What is the difference between Power Center & Power Mart?
Power Center : we can connect to single and multiple Repositories, generally used in big Enterprices.
Power Mart : we can connect to only a single Repository.

What is a mapping, session, worklet, workflow, mapplet?
Mapping – represents the flow and transformation of data from source to taraget.
Mapplet – a group of transformations that can be called within a mapping.
Session – a task associated with a mapping to define the connections and other configurations for that mapping.
Workflow – controls the execution of tasks such as commands, emails and sessions.
Worklet – a workflow that can be called within a workflow.

What are parameter files ? Where do we use them?
Parameter file is any text file where u can define a value for the parameter defined in the informatica session, this parameter file can be referenced in the session properties,When the informatica sessions runs the values for the parameter is fetched from the specified file. For eg : $$ABC is defined in the infomatica mapping and the value for this variable is defined in the file called abc.txt as
[foldername_session_name]
ABC=’hello world”
In the session properties u can give in the parameter file name field abc.txt
What are the different Lookup methods used in Informatica?
connected lookup will receive input from the pipeline and sends output to the pipeline and can return any number of values.it does not contain retun port .
Unconnected lookup can return only one column. it containn return port. 

Can we lookup a table from source qualifier transformation. ie. unconnected lookup
You cannot lookup from a source qualifier directly. However, you can override the SQL in the source qualifier to join with the lookup table to perform the lookup.

What is ODS (operation data source)
ODS is the Operational Data Source which is also called transactional data ODS is the source of a warehouse. Data from ODs is staged, transformed and then moved to datawarehouse.

What is the difference between etl tool and olap tools
ETL tool is ment for extraction data from the legecy systems and load into specified data base with some process of cleansing data.
ex: Informatica,data stage ….etc
OLAP is ment for Reporting purpose.in OLAP data avaliable in Mulitidimectional model. so that u can write smple query to extract data fro the data base.ex: Businee objects,Cognos….etc

What is Full load & Incremental or Refresh load?
By Full Load or One-time load we mean that all the data in the Source table(s) should be processed. This contains historical data usually. Once the historical data is loaded we keep on doing incremental loads to process the data that came after one-time load.

What is partitioning? What are the types of partitioning?
Partitioning is a part of physical data warehouse design that is carried out to improve performance and simplify stored-data management. Partitioning is done to break up a large table into smaller, independently-manageable components because it:
1. reduces work involved with addition of new data.
2. reduces work involved with purging of old data.
Two types of partitioning are:
1. Horizontal partitioning.
2. Vertical partitioning (reduces efficiency in the context of a data warehouse).

No comments: