How to select distinct values from a file and load to postgres table.
I have a delimited file and I want to load a distinct set of records from two field into the postgres table. I do not see a distinct option in the file source instance similar to other database sources. What option I have other that manipulating the file at unix level.
In order to select the distinct rows from a delimited file source and load into postgres follow the below steps.
- create a dataflow of type postgres native and drag the File source and postgres target into the dataflow.
- Link the file source to TSTAGE transformation and add a derived attribute in it.
- The derived attribute has Row_Number() function applied to the column for which we are selecting distinct values.
4. TStage transformation is where we can store the data/results of inflow transformations in it.
5. Then add a Filter transformation which filters and gives only the distinct values of that particular column and loads into the target by using the below filter condition
6. The target table will have transformed data as per the logic only the distinct department_id values are inserted.