Talend & Stored Procedures using SSAS 2008
ETL and ELT process requires large volume of data. Processing large volume of data all at once without a filter can cause performance issues.
- The Database, Data Warehouse, Data Mart or the application systems may be active requiring multiple connections which can slow down other processes due to ACID properties.
- The system may be distributed across different locations which may cause the query to timeout due to security settings.
Talend Open Studio , Informatica Power Center , Pentaho, Microsoft SSIS and other ETL applications utilizes both stored procedures to restrict the follow of data based on specific criteria using context in terms of Talend or Variable (SSIS). How do you pass down the stored procedure parameter in Talend to filter the data based on specific value ?
The process is not that hard at first. It begins with Metadata which contains all the database connection properties as illustrated below. Once the database connection has been established and the corresponding schema has been retrieved, the next process is to design the Talend job.
Microsoft SQL Server is being utilized, the stored procedure has been created using the SQL Sever Management Studio. Other application like Oracle SQL Developer or IBM Data Studio can be used to connect to SQL Server and created various objects. A linked server was created to connect to SSAS 2008 to allow the MDX query to be passed through OpenQuery.
Stored Procedure with SSAS linked Server name
Linked Server Connection to SSAS
Drag the tFixedFlowInput component to the job and the database connection from the Metadata which will open the dialog box. Since stored procedure are being utilized, select the last option with sp for stored procedure into the job site. Connect the two component together.
Click on tFixedFlowInput and edit schema as illustrated below. Type in the stored procedure parameter and select the right data type. Close it and now edit the database connection.
Click on the database connection and enter the stored procedure name as illustrated below. Enter the stored procedure name created for the job in double quotations and click sync columns.
This will populate the columns together by matching the output column with the input columns. For Talend to capture the output of the data, an additional column has to be created with object type. Click on the edit schema which will open the dialog box and output column , click the plus sign and add additional column as illustrated below.
Add the stored procedure name
Add new column
Now add the input and output parameters by clicking the plus sign.
Drag the tlogicRow to the job site. Click the Sync columns and edit schema.
Next the output of the stored procedure has to be extracted into a columns which can be accomplished using the tParaseRecordSet. Click the edit schema and enter the corresponding output columns as defined by your stored procedure. Ensure that the datatype matches.
Add another tlogicRow to the job site and sync the columns. After wards edit the schema.
Depending on the destination of your project different component like flat file or XML palette may be utilized. For this exercise, the end destination is a SQL Server table.
Now run your job.
For additional reading please see Transforming SSAS data
Comments
Post a Comment