sql - How to Import Flat File in SSIS with different first Row Header Columns and Appending Headers to each data row -
i'm trying load data fixed width flat file in ssis (2008 r2), first row contains data that:
- needs parsed different fixed widths data below and
- the parsed data first row needs appended each item in data below it, after data has been separately parsed.
what best way approach this? i'm relatively new ssis, i've tried using row count , conditional split separate out first row, i'm not sure how parse data outside of flat file importer. i've read using script transform work, don't know code should be...
by way of example, if had flat data looks like:
hamilton beach 20150410 sunny
bob male blue black
bill male brownbrown
georgemale greenblonde
jackiefemalegreenblack
jill femaleblue black
it should in output table as:
hamilton beach, 20150410, sunny, bob, male, blue, black
hamilton beach, 20150410, sunny, bill, male, brown, brown
hamilton beach, 20150410, sunny, george, male, green, blonde
hamilton beach, 20150410, sunny, jackie, female, green, black
hamilton beach, 20150410, sunny, jill, female, blue, black
you in luck. ssis not support mixed record types can away because have 1 header row.
my implementation script task reads first line of file , data flow task reads rest of data.
read first line
this 1 simple. create ssis variable, call firstline
of type string. pass value read/write value script task.
use code answer
read first few lines of text file
now need push value of line1
our ssis level variable. looks like
dts.variables["user::firstline"].value = line1;
this assumes want whole line stored firstline. if need portion out individual fields, you'll need implement logic. don't provide guidance on how delimit "hamilton beach 20150410 sunny" individual pieces above logic holds true. parse , assign different ssis level variables.
my specific implementation created 3 ssis variables, of type string
- user::headerihavenoideawhatthisis
- user::headerobservationdate
- user::headerweather
the following code represents what's been linked
using system; using system.data; using system.io; using microsoft.sqlserver.dts.runtime; using system.windows.forms; namespace st_7edd5e6df63a4837afac15b86c21d639.csproj { [system.addin.addin("scriptmain", version = "1.0", publisher = "", description = "")] public partial class scriptmain : microsoft.sqlserver.dts.tasks.scripttask.vstartscriptobjectmodelbase { #region vsta generated code enum scriptresults { success = microsoft.sqlserver.dts.runtime.dtsexecresult.success, failure = microsoft.sqlserver.dts.runtime.dtsexecresult.failure }; #endregion public void main() { // user::headerihavenoideawhatthisis,user::headerobservationdate,user::headerweather // https://stackoverflow.com/questions/9439733/read-only-the-first-few-lines-of-text-from-a-file string line1 = string.empty; using (streamreader reader = new streamreader(@"c:\ssisdata\so_29811494.txt")) { line1 = reader.readline(); } // magic here understand how split out. assuming fixed width // horrible, hard coded brittle approach taken //hamilton beach 20150410 sunny string h1, h2, h3; h1 = line1.substring(0, 20).trimend(); h2 = line1.substring(20, 12).trimend(); h3 = line1.substring(32, line1.length - 32); dts.variables["user::headerihavenoideawhatthisis"].value = h1; dts.variables["user::headerobservationdate"].value = h2; dts.variables["user::headerweather"].value = h3; dts.taskresult = (int)scriptresults.success; } } }
read rest of data
in flat file connection manager, want change value skip header rows 0 1. indicates validation of data , parsing should not begin until we've read first n rows. define connection manager usual.
connect data flow task above script task. within data flow task, use flat file source , connect derived column component. derived column component how we're going value our ssis variable data flow. add new column called headercolumn
, use expression @[user::firstline]
.
if notice column on right indicates data type of dt_ntext
, that's not going match target column definition. might need substring variable substring(@[user::firstline], 1, 20)
. results in data type of dt_wstr , length of 20. goal make match target definition.
you may need make dt_str data type instead of dt_wstr. in case, add explicit cast substring operations (dt_str, 20, 1252)substring(@[user::firstline], 1, 20)
source data
i defined file based on supplied data (click edit on question definitions without stripping of white space)
hamilton beach 20150410 sunny bob male blue black bill male brownbrown georgemale greenblonde jackiefemalegreenblack jill femaleblue black