Ab Initio interview questions and answers
1. What does dependency analysis mean in Ab Initio?
Dependency analysis answers questions about data lineage: where the data comes from, and which applications produce and depend on it. To generate surrogate keys for new records, retrieve the maximum surrogate key from the existing data, then use a Scan, or next_in_sequence() in a Reformat, to continue the sequence.
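The surrogate-key step described above can be sketched in Python rather than in Ab Initio DML. This is only an illustration: the function assign_surrogate_keys and the field name id are hypothetical, and the counter merely stands in for what next_in_sequence() or a Scan would do inside a graph.

```python
from itertools import count


def assign_surrogate_keys(existing, new_records, key="id"):
    """Give each new record the next key after the current maximum."""
    # Highest key already present; 0 when there is no existing data.
    start = max((rec[key] for rec in existing), default=0)
    # The counter plays the role of next_in_sequence(): one value per record.
    sequence = count(start + 1)
    return [{**rec, key: next(sequence)} for rec in new_records]


existing = [{"id": 101, "name": "a"}, {"id": 102, "name": "b"}]
new = [{"name": "c"}, {"name": "d"}]
print(assign_surrogate_keys(existing, new))
```

The new records receive keys 103 and 104, continuing from the maximum of 102 found in the existing data.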
2. When using multiple DML statements to perform a single unit of work, is it preferable to use implicit or explicit transactions, and why?
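Explicit transactions are preferable. With implicit transactions each statement is committed on its own, whereas an explicit transaction lets all the statements in the unit of work commit or roll back together.
3. Explain what is the architecture of Abinitio?
The architecture of Abinitio includes:
GDE (Graphical Development Environment)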
Co>Operating System
Enterprise Meta>Environment (EME)
Conduct>It
4. What is MAX CORE of a component?
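MAX CORE is the maximum amount of memory that a component may use for its processing
Different components have different MAX CORE values
The MAX CORE setting influences the performance of the component
The process may run slower or faster depending on the value: a value that is too low makes the component write to disk sooner, and a value that is too high takes memory away from other components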
5. Explain what is de-partition in Abinitio?
De-partitioning reads data from multiple flows or partitions and re-joins the records into a single flow. Several de-partition components are available, including Gather, Merge, Interleave, and Concatenate.
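The difference between these components is mostly the order in which records come out. The sketch below models three of them on plain Python lists. It is not Ab Initio code: the function names are illustrative, and interleave is simplified to one record per flow per turn. Gather is left out because it reads records in whatever order they arrive, so its output order is not deterministic.

```python
import heapq
from itertools import chain, zip_longest


def concatenate(*flows):
    """Append the flows one after another, in the order given."""
    return list(chain(*flows))


def merge(*flows):
    """Combine flows that are already sorted, keeping the sort order."""
    return list(heapq.merge(*flows))


def interleave(*flows):
    """Take one record from each flow in turn (round-robin)."""
    missing = object()  # marks flows that ran out before the others
    return [rec
            for group in zip_longest(*flows, fillvalue=missing)
            for rec in group
            if rec is not missing]


p0, p1 = [1, 4, 7], [2, 3, 9]
print(concatenate(p0, p1))  # [1, 4, 7, 2, 3, 9]
print(merge(p0, p1))        # [1, 2, 3, 4, 7, 9]
print(interleave(p0, p1))   # [1, 2, 4, 3, 7, 9]
```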
6. How do you add default rules in transformer?
The following is the process to add default rules in a transformer:
Double-click the transform parameter on the Parameters tab of the component properties
Open the Edit menu in the Transform Editor
Select Add Default Rules from the dropdown list box.
It shows Match Names and Wildcard options. Select either of them.
7. Mention what is the role of Co-operating system in Abinitio?
The Abinitio co-operating system provides features such as:
Managing and running Abinitio graphs and controlling the ETL processes
Providing Abinitio extensions to the operating system
Monitoring and debugging ETL processes
Managing metadata and interacting with the EME
8. Describe the Grant/Revoke DDL facility and how it is implemented?
Granting and revoking permissions is part of the database administrator's (DBA's) responsibilities. GRANT gives permissions to a user, for example GRANT CREATE TABLE, GRANT CREATE VIEW, and many more.
REVOKE cancels permissions that were granted earlier. Both commands are therefore normally issued by the DBA.
9. State the first_defined function with an example.
This function is similar to the NVL() function in Oracle
It returns the first of its arguments that is not NULL, and that value is assigned to the variable
Example: A set of variables, say v1,v2,v3,v4,v5,v6, are assigned NULL.
Another variable num is assigned the value 340 (num=340)
num = first_defined(NULL, v1,v2,v3,v4,v5,v6,num)
The result of num is 340
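The same behaviour can be mimicked in Python, with None standing in for NULL. This is an analogue written for illustration, not the DML function itself.

```python
def first_defined(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


v1 = v2 = v3 = None
num = 340
print(first_defined(None, v1, v2, v3, num))  # 340
```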
10. Explain what is SANDBOX?
A sandbox is a collection of graphs and related files that are stored in a single directory tree and treated as a group for the purposes of navigation, version control, and migration.
11. How to run a graph infinitely?
To run a graph infinitely, the end script of the graph should call the graph's own .ksh file.
If the graph name is abc.mp, then its end script should call abc.ksh.
12. Explain what does dependency analysis mean in Abinitio?
In Abinitio, dependency analysis is a process through which the EME examines an entire project and traces how data is transferred and transformed, from component to component and field by field, within and between graphs.
13. Explain PDL with an example?
PDL (Parameter Definition Language) is used to make a graph behave dynamically
Suppose a dynamic field has to be added to a predefined DML while the graph is executing
Then a graph-level parameter can be defined
Use this parameter when embedding the DML in the output port.
For example: define a parameter named myfield with the value string("|") name;
Use ${myfield} when embedding the DML in the out port.
Use $ substitution as the interpretation option
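The effect of $ substitution can be imitated with Python's string.Template, which also replaces ${name} placeholders. This only illustrates the idea and is not how the Co>Operating System resolves parameters. The record layout and the parameter value string("|") name; are assumptions made for the example.

```python
from string import Template

# Hypothetical embedded DML with a placeholder for the dynamic field.
dml_template = Template(
    "record\n"
    '  decimal("|") id;\n'
    "  ${myfield}\n"
    "end;"
)

# Value that would be supplied through the graph-level parameter.
parameters = {"myfield": 'string("|") name;'}

# Resolve the placeholder, as $ substitution would before the graph runs.
resolved = dml_template.substitute(parameters)
print(resolved)
```

Changing the value of myfield changes the record format without editing the embedded DML itself.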
14. Describe the elements you would review to ensure multiple scheduled batch jobs do not collide with each other?
The main element to review is the dependency between jobs. Each job depends on another job: for example, if the first job finishes successfully then the next job executes, otherwise it must not start.
15. What is a local lookup?
• A local lookup file holds records that can be placed in main memory
• Transform functions retrieve these records much faster than they could retrieve them from disk.
16. Mention how can you connect EME to Abinitio Server?
There are several ways to connect to the Abinitio Server:
• Set AB_AIR_ROOT
• Log in to the EME web interface: http://serverhost:[serverport]/abinitio
• Through the GDE, you can connect to the EME data-store
• Through air commands
17. Describe the order of evaluation of parameters.
Following is the order of evaluation:
• Host setup script will be executed first
• All common (that is, included) parameters are evaluated
• All Sandbox parameters are evaluated
• The project script – project-start.ksh is executed
• All form parameters are evaluated
• Graph parameters are evaluated
• The Start Script of graph is executed
18. Explain what is Sort Component in Abinitio?
The Sort Component in Abinitio re-orders the data. It has two parameters, “Key” and “Max-core”.
• Key: This parameter determines the collation order
• Max-core: This parameter controls how often the sort component dumps data from memory to disk
19. Describe the process steps you would perform when defragmenting a data table. This table contains mission critical data?
There are several ways to do this:
1) We can move the table in the same or other tablespace and rebuild all the indexes on the table.
alter table
Then run analyze table table_name compute statistics to capture the updated statistics.
2) A reorg can also be done by taking a dump of the table, truncating the table, and importing the dump back into the table.
20. What is a ramp limit?
• Limit is an integer parameter that represents an absolute number of reject events
• Ramp is a real-number parameter that represents the rate of reject events per record processed
• The formula is: number of bad records allowed = limit + (number of records x ramp)
• Ramp is a fraction between 0 and 1.
• Together, these two parameters give the threshold for bad records.
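The formula above can be worked through in a few lines of Python. The function names are made up for this sketch, and treating "more rejects than allowed" as the abort condition is an assumption.

```python
def rejects_allowed(limit, ramp, records_processed):
    """Threshold from the formula: limit + number of records x ramp."""
    return limit + ramp * records_processed


def should_abort(rejects, limit, ramp, records_processed):
    """True once the rejects seen so far exceed the threshold."""
    return rejects > rejects_allowed(limit, ramp, records_processed)


# With limit = 50 and ramp = 0.01, 10,000 processed records allow 150 rejects.
print(rejects_allowed(limit=50, ramp=0.01, records_processed=10_000))  # 150.0
print(should_abort(151, limit=50, ramp=0.01, records_processed=10_000))  # True
```

With ramp set to 0 the threshold is a fixed count, and with limit set to 0 it is purely proportional to the number of records processed.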
21. Mention what information does a .dbc file extension provides to connect to the database?
A .dbc file provides the GDE with the information it needs to connect to the database:
• Name and version number of the data-base to which you want to connect
• Name of the computer on which the data-base instance or server to which you want to connect runs, or on which the database remote access software is installed
• Name of the server, database instance or provider to which you want to link
22. Explain the methods to improve performance of a graph?
The following are the ways to improve the performance of a graph :
• Make sure that a limited number of components are used in a particular phase
• Use optimum max core values for sort and join components.
• Use as few sort components as possible
• Use as few sorted join components as possible and replace them with in-memory join / hash join, if needed and possible
• Pass only the needed fields through sort, reformat, and join components
• Use phasing or flow buffers with merges and sorted joins
• Use a sorted join when both inputs are huge; otherwise use a hash join
23. How can you force the optimizer to use a particular index?
Use hints /*+
24. Have you used rollup component? Describe how?
If you want to group records on particular field values, rollup is the best way to do that. Rollup is a multi-stage transform function and it contains the following mandatory functions.
1. initialize
2. rollup
3. finalize
You also need to declare a temporary variable if you want to get counts for a particular group.
For each group, it first calls the initialize function once, then calls the rollup function for each record in the group, and finally calls the finalize function once after the last rollup call.
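The calling sequence described above can be sketched in Python. This imitates the three stages and is not Ab Initio DML. The field names region and amount, the driver function run_rollup, and passing the last record of the group to finalize are all assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def initialize():
    """Called once per group: create the temporary variable."""
    return {"count": 0, "total": 0}


def rollup(temp, rec):
    """Called for every record in the group: update the temporary."""
    temp["count"] += 1
    temp["total"] += rec["amount"]
    return temp


def finalize(temp, last_rec):
    """Called once after the last record: build the output record."""
    return {"region": last_rec["region"],
            "count": temp["count"],
            "total": temp["total"]}


def run_rollup(records, key="region"):
    """Drive the three stages over records grouped on the key."""
    ordered = sorted(records, key=itemgetter(key))
    output = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        temp = initialize()
        last = None
        for rec in group:
            temp = rollup(temp, rec)
            last = rec
        output.append(finalize(temp, last))
    return output


records = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]
print(run_rollup(records))
```

The temporary dictionary plays the role of the temporary variable mentioned above: it carries the running count and total from one rollup call to the next.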
25. We know rollup component in Abinitio is used to summarize group of data record then why do we use aggregation?
• Aggregation and Rollup, both are used to summarize the data.
• Rollup is more convenient to use.
• Rollup offers some additional functionality, such as input filtering and output filtering of records.
• Aggregate does not expose the intermediate results in main memory, whereas Rollup can.
• Analyzing a particular summarization is much simpler with Rollup than with Aggregate.
26. Mention what is Abinitio?
“Ab initio” is a Latin phrase meaning “from the beginning.” Abinitio is a tool used to extract, transform and load data. It is also used for data analysis, data manipulation, batch processing, and graphical user interface based parallel processing.
27. What are the operations that support avoiding duplicate record?
Duplicate records can be avoided by using the following:
• Using the Dedup Sorted component
• Performing aggregation
• Utilizing the Rollup component
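The idea behind de-duplicating sorted data can be sketched in Python: once records with the same key are adjacent, keeping one record per run of equal keys removes the duplicates. The function below is an illustration with made-up names and is not the Ab Initio component. It assumes the input is already sorted on the key.

```python
from itertools import groupby
from operator import itemgetter


def dedup_sorted(records, key, keep="first"):
    """Keep one record per key from input already sorted on that key."""
    output = []
    for _, group in groupby(records, key=itemgetter(key)):
        group = list(group)
        # Keep either the first or the last record of each run of equal keys.
        output.append(group[0] if keep == "first" else group[-1])
    return output


rows = [
    {"cust": 1, "city": "Pune"},
    {"cust": 1, "city": "Delhi"},
    {"cust": 2, "city": "Agra"},
]
print(dedup_sorted(rows, "cust"))
print(dedup_sorted(rows, "cust", keep="last"))
```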
28. Mention what is Rollup Component?
The Rollup component lets users group records on certain field values. It is a multi-stage function consisting of initialize, rollup, and finalize.
29. What kind of layouts does Abinitio support?
• Abinitio supports serial and parallel layouts.
• A single graph can use both serial and parallel layouts at the same time.
• A parallel layout depends on the degree of data parallelism
• For example, a multi-file system may be a 4-way parallel system
• A component that uses that layout then runs 4-way parallel.
31. What is $mpjret? Where it is used in ab-initio?
$mpjret holds the return status of the graph. You can use it in the end script like this:
if [ "$mpjret" -eq 0 ]
then
    echo "success"
else
    mailx -s "[graphname] failed" mailid < /dev/null
fi
32. What is local and formal parameter?
Both are graph-level parameters. A local parameter must be given a value when it is declared, whereas a formal parameter needs no value at declaration: the graph prompts for it at run time.
33. What is m_dump?
m_dump command prints the data in a formatted way.
m_dump
34. What is AB_LOCAL expression where do you use it in ab-initio?
ablocal_expr is a parameter of the Input Table component of Ab Initio. ABLOCAL() is replaced by the contents of ablocal_expr, which is useful in parallel unloads. There are two forms of the ABLOCAL() construct: one with no arguments, and one with a single argument that names a table (the driving table).
Some complex SQL statements contain grammar that the Ab Initio parser does not recognize when unloading in parallel. In that case you can use the ABLOCAL() construct to prevent the Input Table component from parsing the SQL (it is passed through to the database). It also specifies which table to use for the parallel clause.
35. What is the latest version that is available in Ab-initio?
The latest version of the GDE is 1.15, and the latest version of the Co>Operating System is 2.14
36. What are differences between different versions of Co-op?
Version 1.10 is a non-key version and the rest are key versions.
Many components were added and revised in the later versions.
37. What is the importance of EME in abinitio?
EME is the repository in Ab Initio. It is used to check graphs in and out, and it also maintains graph versions.
38. How to get DML using Utilities in UNIX?
If your source is a COBOL copybook, there is a UNIX command that generates the required DML for Ab Initio.