ProcRun;

Stamping the SAS Log with all Macro Variable Values

The topic of the day around the office was stamping all of the user-created macro variables into the log. I’ve got a few programs that have a lot (20+) of auto-generated dates, etc., and the following is just one of the many ways to print what each macro variable resolves to in your log file.

Code:

data _null_;
  y = today();
  x = intnx('month',y,-1);
  call symput('prev_month',strip(put(x,monname9.)));
  call symput('today',put(y,yymmddn8.));
run;

data _null_;
 set sashelp.vmacro;
 where value > "" and scope = "GLOBAL"
       and substr(name, 1, 3) not in ('SQL','SYS');
 put "Macro Variable: " name  "resolves to: " value;
run;

Output into the Log File:

1    data _null_;
2      y = today();
3      x = intnx('month',y,-1);
4      call symput('prev_month',strip(put(x,monname9.)));
5      call symput('today',put(y,yymmddn8.));
6    run;
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds

7   data _null_;
8    set sashelp.vmacro;
9    where value > "" and scope = "GLOBAL"
10          and substr(name, 1, 3) not in ('SQL','SYS');
11   put "Macro Variable: " name  "resolves to: " value;
12  run;

Macro Variable: PREV_MONTH resolves to: March
Macro Variable: TODAY resolves to: 20120403
NOTE: There were 2 observations read from the data set SASHELP.VMACRO.
      WHERE (value>' ') and (scope='GLOBAL') and
      SUBSTR(name, 1, 3) not in ('SQL', 'SYS');
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds

The where clause of the second data step can be edited to include the SQL and SYS macro variables if need be. The scope can also be changed to print just the local macro variables within a macro (if you put the code inside of the macro); local variables are listed with the macro’s name as their scope.


Rails and Recent d3.js Examples

A lot of my free time has been spent running and developing a running log using Rails. I want greater control over monitoring and visualizing each training cycle than the sites out there will let me have (no APIs...). Thus, I’ll just build my own!

This whole thing was spurred by my dabbling in d3. It’s pretty much the most awesome (and the only) visualization library you’ll need. I’ve been slowly adding links to all of the example graphics I’ve put together to the d3 Examples Page.

There are three out there right now that’ll let you see what I’ve done so far…

  • Last.fm_bar - Bar chart of a Last.fm user’s top 50 artists. Calls the Last.fm API and parses the returned JSON. Will update as the user scrobbles more tracks to Last.fm. Feel FREE to make fun of my music choices in the comments...
  • Rolling 7 day Training Load - Area chart of my cumulative 7 day training load for miles run. Much more useful for analyzing total stress and cycles than weekly or monthly miles.
  • Running Calendar - Calendar view of the daily miles run. Similar to the R running calendar. Adapted from the d3 example, which was inspired by Rick Wicklin and Robert Allison’s winning poster (pdf).
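
For anyone curious what the rolling 7 day load actually computes, here is a minimal Python sketch. It is not the code behind the chart (that one is d3), just the arithmetic: for each calendar day, add up the miles from that day and the six days before it. The function name `rolling_load` and the sample miles are made up for illustration, and rest days are assumed to be missing from the input rather than recorded as zero.

```python
from datetime import date, timedelta


def rolling_load(daily_miles, window=7):
    """Return {day: total miles over the `window` days ending on that day}.

    `daily_miles` maps a date to the miles run on it; days with no entry
    count as zero miles.
    """
    if not daily_miles:
        return {}
    first, last = min(daily_miles), max(daily_miles)
    totals = {}
    day = first
    while day <= last:
        # Sum today plus the previous (window - 1) days.
        totals[day] = sum(
            daily_miles.get(day - timedelta(days=k), 0.0) for k in range(window)
        )
        day += timedelta(days=1)
    return totals


# Made-up sample data (hypothetical, not from my actual log).
sample = {
    date(2012, 4, 1): 5.0,
    date(2012, 4, 2): 3.0,
    date(2012, 4, 4): 8.0,
    date(2012, 4, 9): 4.0,
}
load = rolling_load(sample)
```

The day after a big week still carries most of that week’s miles, which is why this view shows accumulated stress better than a calendar-week total that resets every Monday.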

There are a few data sets that I really want to work with, but I need to do some serious data cleaning in SAS first. Writing a JSON exporter from SAS is next up on the list when I get a down moment.


Simple Python Scraper for Daytum

I use one of the major running sites to keep track of my running log, but unfortunately there’s no public API. The underlying HTML is a mess and would be utterly complicated to scrape. For this training cycle, I’ve been keeping a second log using Daytum since their iPhone app makes updating it way easy.

The following Python script scrapes the data off of Daytum and outputs a .csv file.

# Program: scrapper.py
# Author : Andy McNeice
#
# Purpose: Scrapes my daytum running log and creates a csv file of the
# entries.
import re, urllib2, csv
from BeautifulSoup import BeautifulSoup
from datetime import datetime

# Daytum Url
log = urllib2.urlopen('http://daytum.com/amcneice/items/1183813')

# csv output
csv = csv.writer(file(r'data.csv','wb'))

# Working lists
parse = [] # holder for Beautiful Soup finds
data = [] # final array

# Use BeautifulSoup to parse the web page
html = log.read()
soup = BeautifulSoup(html)

# Iterate through the soup to find the text matching the regular expressions
for text in soup.findAll(text=True):
  if re.search(r"\d{1,}\.\d{1,}", text):
    if not (re.search('Total',text) or re.search('DOC',text)):
      parse.append(text.strip())
  elif re.search(r"\d{2}\/\d{2}\/\d{4}", text):
    # Text looks like "<date>,<time of day>": split it on the comma
    temp = text.split(',')
    parse.append(datetime.strptime(temp[0].strip(),"%m/%d/%Y").strftime("%Y-%m-%d"))
    parse.append(temp[1].strip())

# Reshape the flat list into rows of three consecutive values
for i in xrange(len(parse) / 3):
  data.append(parse[i*3:i*3+3])

# Sort by day
data.sort(key=lambda x: x[1])

# csv headers for label data
data.insert(0, ["Miles", "Time", "Time of Day"])

# Write out csv to use data elsewhere
csv.writerows(data)

The Beautiful Soup code is a bit of a mess and will only work for quantitative data at the moment. It was easier for me to use regular expressions to pluck the important info out of the soup than to try to parse based on the underlying HTML tags.
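
To show the regular expression idea on its own, here is a rough Python 3 sketch that uses only the standard library. It skips the download and Beautiful Soup steps and starts from a plain list of text strings. The sample strings are made up and are not actual Daytum markup, and the helper names `pluck` and `to_rows` are hypothetical; the patterns and the "Total" / "DOC" filter mirror the script above.

```python
import csv
import io
import re
from datetime import datetime

NUMBER = re.compile(r"\d+\.\d+")         # decimal values such as "6.2"
DATE = re.compile(r"\d{2}/\d{2}/\d{4}")  # dates such as "04/03/2012"


def pluck(texts):
    """Collect the interesting values from the text nodes as one flat list."""
    found = []
    for text in texts:
        if NUMBER.search(text):
            # Skip summary text, as the original script does.
            if not ("Total" in text or "DOC" in text):
                found.append(text.strip())
        elif DATE.search(text):
            parts = text.split(",")
            day = datetime.strptime(parts[0].strip(), "%m/%d/%Y")
            found.append(day.strftime("%Y-%m-%d"))
            found.append(parts[1].strip())
    return found


def to_rows(flat, width=3):
    """Group the flat list into rows of `width` values, dropping leftovers."""
    return [flat[i:i + width] for i in range(0, len(flat) - width + 1, width)]


# Made-up text nodes standing in for soup.findAll(text=True).
texts = ["Running", "6.2", "04/03/2012, 6:30 am",
         "Total 9.3", "3.1", "04/01/2012, 5:45 pm"]

rows = to_rows(pluck(texts))
rows.sort(key=lambda row: row[1])  # ISO dates sort correctly as strings

# Write to an in-memory buffer here; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Miles", "Time", "Time of Day"])
writer.writerows(rows)
output = buffer.getvalue()
```

Naming the writer `writer` instead of `csv` keeps the `csv` module usable afterwards, and slicing the flat list replaces the nested loop.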

Right now the script is kicked off by a cron job each hour to capture any updates.  Eventually I’ll have a pretty awesome dashboard (using d3) to monitor each of my training cycles.

I’m not a Python guru, so if you’ve got suggestions on how to clean up the code, I’m all ears.

Full code and output .csv available on GitHub.