Automating Dataset Downloads with PowerShell

Challenge

Automate the weekly download of the “Death Master File” (DMF) from the National Technical Information Service (NTIS). The DMF is an ASCII file served from an SSL-protected website that requires a username and password.

Edit: There’s a Bash version of this post and script for *nix (developed on OS X).

Solution

Create a Windows PowerShell script (.ps1 extension) with the following code. Schedule it to run daily with the Windows Task Scheduler; on days when no new file has been posted, the script logs that and exits.

 #
 # PowerShell (2.0) script to download the weekly Death Master File (DMF) from NTIS.gov
 # Created by Colin A. White in April 2012
 #
 $dmfpath = "D:\DMF";
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Scheduled Job Started." ;
 $date = Get-Date -format yyMMdd;
 $url = "https://dmf.ntis.gov/dmldata/weekly/WA$date";

Try{
 $request = New-Object System.Net.Webclient;
 $passwd = ConvertTo-SecureString "**your_password**" -AsPlainText -Force;
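 $cred = New-Object System.Management.Automation.PSCredential ("**your_username**", $passwd);
 # WebClient.Credentials expects an ICredentials object, so pass the NetworkCredential form explicitly
 $request.Credentials = $cred.GetNetworkCredential();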
 $request.Downloadstring($url) | Out-File $dmfpath\WA$date.txt -force;
 }

Catch [System.exception] {
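 # Any exception lands here; on days with no new file the server is expected to answer 404
 Write-Host "Download failed (most likely 404). Nothing to download."
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Download failed (most likely 404). Nothing to download." ;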
 Exit;
 }

Finally{
 # Runs after the download attempt, even when the Catch block above calls Exit
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Download attempt finished. Checking for a new data file." ;
 If (!(Test-Path -Path $dmfpath\WA$date.txt)) {
 Write-Host "No file to clean."
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) 404. No new data file found. Job Terminating" ;
 Exit;
 }
 else {

Try{
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Got NEW data. Attempting to cleanse..." ;
 # Insert delimiters into each fixed-width record; every index already counts the characters inserted before it
 Get-Content "$dmfpath\WA$date.txt" | ForEach-Object {$_.insert(1,",").insert(5,"-").insert(8,"-").insert(13,",").insert(34,",").insert(39,",").insert(55,",").insert(71,",").insert(73,",").insert(76,"/").insert(79,"/").insert(84,",").insert(87,"/").insert(90,"/")} | Out-File "$dmfpath\tmp.txt";
 # Overwrite the raw download only after every line converted without error
 Move-Item "$dmfpath\tmp.txt" "$dmfpath\WA$date.txt" -force;
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Cleanse succeeded." ;
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) NEW file WA$date cleaned. All done." ;
 }

Catch [System.exception] {
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Non-Fatal System Exception Caught! Raw file left as downloaded." ;
 }

Finally {
 # Discard a partial temp file left behind by a failed cleanse
 If (Test-Path -Path "$dmfpath\tmp.txt") { Remove-Item "$dmfpath\tmp.txt" -force; }
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Cleaned up temp files." ;
 Exit;
 }
 }
 }
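
The cleansing line in the script works because each Insert index already accounts for the separators inserted before it (index 5 is offset 4 in the raw record, index 8 is offset 6, and so on). A minimal sketch of the same idea, assuming a lookup table of offsets measured in the raw record and applied right to left so that nothing shifts. The function name Add-FieldSeparator, the table and the sample record are made up for illustration, and the field names in the comments are an assumed reading of the offsets, not something the script states.

```powershell
# Offsets into the ORIGINAL fixed-width record, derived from the script's Insert chain
# by subtracting the number of separators inserted earlier. Field names are assumptions.
$separators = @(
    @(1,  ','),   # after the 1-character record type
    @(4,  '-'),   # inside the 9-digit number: 3-2-4
    @(6,  '-'),
    @(10, ','),   # before last name
    @(30, ','),   # before suffix
    @(34, ','),   # before first name
    @(49, ','),   # before middle name
    @(64, ','),   # before the 1-character code
    @(65, ','),   # before the first date
    @(67, '/'),   # MM/DD/YYYY
    @(69, '/'),
    @(73, ','),   # before the second date
    @(75, '/'),
    @(77, '/')
)

function Add-FieldSeparator([string]$Line, [object[]]$Table) {
    # Walk the table backwards: inserting at the highest offset first
    # leaves every lower offset untouched.
    for ($i = $Table.Length - 1; $i -ge 0; $i--) {
        $Line = $Line.Insert($Table[$i][0], $Table[$i][1])
    }
    return $Line
}

# Synthetic 81-character record, NOT real DMF data.
$sample = 'A' + '123456789' + 'DOE'.PadRight(20) + ''.PadRight(4) +
          'JOHN'.PadRight(15) + 'Q'.PadRight(15) + 'V' + '01152012' + '03041930'

$delimited = Add-FieldSeparator $sample $separators
Write-Host $delimited
```

For the synthetic record the result starts with A,123-45-6789,DOE and ends with ,V,01/15/2012,03/04/1930, which is the same string the chained Insert calls in the script produce. Keeping raw offsets in a table makes the layout easier to check against a record specification than the pre-shifted indexes.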

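The daily schedule can also be registered from a PowerShell prompt instead of the Task Scheduler GUI. This is only a sketch: it assumes the script was saved as D:\DMF\Get-DMF.ps1 (a hypothetical file name), and the task name and the 06:00 start time are placeholders to adjust.

```powershell
# Hypothetical values: adjust the script path, task name and start time.
$scriptPath = "D:\DMF\Get-DMF.ps1"
$taskName   = "DMF-Download"

# Command line the task will run; the -File switch is available from PowerShell 2.0 onwards.
$action = "powershell.exe -NoProfile -ExecutionPolicy Bypass -File $scriptPath"

# Register a task that runs once a day at 06:00 (an elevated prompt may be required).
schtasks.exe /Create /TN $taskName /TR $action /SC DAILY /ST 06:00
```

Running daily rather than weekly means the exact posting day does not matter: on days without a new file the script logs the failed download and exits.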