Tuesday, June 22, 2010

How I created a very large import spec in Access 2007 without doing it the hard way


I had a situation arise at my workplace where I needed to create an import specification in MS Access 2007 for input containing over 260 fields/record. Once I found the solution I started out by writing these steps down as a future reference for myself, but since I had very little success finding any info online I decided to post this in the hopes that someone else might find this useful down the line.

We receive score data in a non-delimited text file. There is one row/record and each row belongs to one person, but a single row is nearly a thousand characters long and contains anywhere from 1 to 30 sets of scores.

The purpose of having this data in an Access DB is that we need to have the data in a human-readable format that is also searchable based on fields (e.g. search for last name containing "Smith"). We get, at most, a few thousand records a year, so there is very close to zero benefit to be had from spending the resources to develop and maintain a properly normalized database for this. The score data from the records are entered into the production computer system through other means, but so we need to be able to go back to the source data at any time (I'm told we could get charged penalties if we ask for too many re-sends).

Description of the file layout I had to work with:
- 30 sets of scores @ 7 fields each
- 8 more sets @ 2 field each that I don't need
- 268 total fields
- By subtracting those I don't want I whittled it down to 258 fields, 251 of which I needed to import (just under the 255 max imposed by Access!)

Steps
- Start to import data and create a new import specification
    - Optionally, set the general import spec options (date format, separators, etc.)
- Make note of the specification name (optionally, change it to what you want) & save
- Cancel out of the import process
- Show System Objects in the Navigation Pane
    - right-click on "All Access Objects" & choose "Navigation Options..."
    - click the "Show System Objects" checkbox (optionally also the "Show Hidden Objects" box) and click OK
- Open the table "MSysIMEXSpecs" and make note of the SpecID for the SpecName you just created
- Close the table "MSysIMEXSpecs"
- Open the table "MSysIMEXColumns" and filter the rows to show only those with SpecID equal to the one you noted above
- At this point I copied & pasted those into an Excel spreadsheet for reference.
- I made an temporary spec in order to figure out a few of the values that I needed:
{DataType & Indexing value conversion tables}
SkipColumn is TRUE/FALSE (ok, Yes/No)
- I created the Field Names based on the names given in the file layout. For those that repeated I tacked on a number to the end. I found the simplest way was to use an Excel spreadsheet. You could also write a simple script in something like perl or python. I chose Excel because I want my solution to still work if they make me remove cygwin and perl. ;-)
- I used 0 (zero) for all Attributes
- Set the SpecID to the number previously noted.
{screenshot to show the Excel table}
- Go back to Access and delete all rows in the MSysIMEXColumns table with SpecID = X
- Import External Data from the Excel document into the MSysIMEXColumns table
- Save and done. I can now use my import spec to pull in my flat file data!

No comments:

Post a Comment