Downloading Medicare Data with Webclient

Challenge

Automate the download of data sets from data.medicare.gov data mart using their Socrata Open Data API (SODA).

Solution

This version pulls Hospital HCAHPS data in XML but can be easily edited to bring back csv or xls formats.

# # Check and Correct the .Net CLR Env settings
if ([Environment]::Version.Major -lt 4) { $configFile = Join-Path $PSHOME -ChildPath '.\powershell.exe.config'if (-not(Test-Path $configFile)) { @' <?xml version="1.0"?> <configuration> <startup useLegacyV2RuntimeActivationPolicy="true"> <supportedRuntime version="v4.0.30319" /> <supportedRuntime version="v2.0.50727" /> </startup> </configuration> '@ | Out-File -FilePath$configFile -Encoding UTF8 "Created $configFile" } 'Restart powershell in order to make it run in .NET 4' }

# Set the extention to xml, csv, xls
$xml_source = "https://data.medicare.gov/resource/jp3v-djai.xml";

# Substitute your password for **PWD**
$dest_pwd = ConvertTo-SecureString "**PWD**" -AsPlainText -Force;

# Specify where to put the downloaded file and make sure the extension matches
$xml_destination = "V:\test.xml";

# Use the default user settings for your enterprise Proxy access
$proxy = [System.Net.WebRequest]::DefaultWebProxy;
$proxy.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials;

# Instantiate the .net WebClient object and call the proxy and credential methods
$Wc = New-Object System.Net.Webclient;
$Wc.Proxy = $proxy;

# Substitute **UID** for your login User ID
$Wc.Credentials = New-Object System.Management.Automation.PSCredential ("**UID**", $dest_pwd);

# Call the actual WebCLient DownloadFile method
$Wc.DownloadFile($xml_source,$xml_destination);

# Return some console message for debugging
if ($Wc -ne 0){Write-Host "Opened $url successfully" -ForegroundColor Red }else {Write-Host "Unable to access $url" -ForegroundColor Yellow }

Automating Dataset Downloads with Powershell

Challenge

Automate the weekly download of the “Death Master File” (DMF) from the National Technical Information Service (NTIS). This is an ASCII file downloaded from an SSL protected website requiring authentication credentials.

Edit: There’s a Bash version of this post and script for *nix (developed on OS X).

Solution

Create a Windows Powershell script (.ps1 extension) with the following code. Schedule it to run daily with the Windows Scheduler.

 #
 # Powershell (2.0) Script to download the weekly Death Master File (DMF) from NTIS.gov
 # Created by Colin A. White in April 2012
 #
 $dmfpath = "D:\DMF";
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Scheduled Job Started." ;
 $date = Get-Date -format yyMMdd;
 $url = "https://dmf.ntis.gov/dmldata/weekly/WA$date";

Try{
 $request = New-Object System.Net.Webclient;
 $passwd = ConvertTo-SecureString "**your_password**" -AsPlainText -Force;
 $request.Credentials = New-Object System.Management.Automation.PSCredential ("**your_username**", $passwd);
 $request.Downloadstring($url) | Out-File $dmfpath\WA$date.txt -force;
 }

Catch [System.exception] {
 Write-Host "404. Nothing to download."
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) 404. Nothing to download." ;
 Exit;
 }

Finally{
 # Process the file and log the results
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Logging into NTIS to request download." ;
 If (!(Test-Path -Path $dmfpath\WA$date.txt)) {
 Write-Host "No file to clean."
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) 404. No new data file found. Job Terminating" ;
 Exit;
 }
 else {

Try{
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Got NEW data. Attempting to cleanse..." ;
 $conv = gc "$dmfpath\WA$date.txt" | %{$_.insert(1,",").insert(5,"-").insert(8,"-").insert(13,",").insert(34,",").insert(39,",").insert(55,",").insert(71,",").insert(73,",").insert(76,"/").insert(79,"/").insert(84,",").insert(87,"/").insert(90,"/")} | Out-file "$dmfpath\tmp.txt";
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Cleansed Succeeded." ;
 }

Catch [System.exception] {
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Non-Fatal System Exception Caught!" ;
 }

Finally {
 Move-Item "$dmfpath\tmp.txt" "$dmfpath\WA$date.txt" -force;
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) Cleaned up temp files." ;
 Add-Content "$dmfpath\log.txt" -value "$(Get-Date) NEW file WA$date cleaned. All done." ;
 Exit;
 }
 }
 }