ST 590G -- Computation for Data Analysis
Second Assignment -- due Tuesday, 25 September 2012

In the 'ushcn' directory are a few files of interest taken from the US Historical Climatology Network page on the CDIAC/ORNL site. The file 'data_format.txt' describes how the data files are organized. I have downloaded and uncompressed files for several states to the 'ushcn' directory. See the file 'h2f12who' for which file/state you should analyze.

1) From the stations file 'ushcn-stations.txt', read the station id number, latitude, longitude, and elevation. Note any missing values -- you may not want to deal with those stations.

2) From the data file, read the TMAX values and the flags for each day and month. Notice that each record holds the information for 31 days. (Note that missing values are coded as '-9999'.) If a flag is thrown -- either the MFLAG or the QFLAG -- then record that observation as missing. Read only the element of interest (TMAX, maximum daily temperature) and delete the others.

3) Since we have daily values, there's a lot of data -- monthly would be easier. So construct a dataset with one value for each month and year from the daily values, using the mean of TMAX. If too many days are missing, denote that month as missing.

4) For each station in the state, run the ANACOVA model with monthly effects and year as the covariate:

     proc glm data=whatever ;
       class month ;
       model tmax = year month / solution ;
     run ;

   Here 'tmax' stands for whatever you named the monthly mean of TMAX.

5) Assemble these results in a SAS dataset (use ODS) and keep only the slopes (and maybe t-statistics) of the covariate effects (year coefficient).

6) Merge these back with the station information to get lat/lon.

7) Construct some coding of the slopes and/or t-statistics so we can discern whether the slopes are big or small, positive or negative.

8) Using your coding from (7) as the plotting symbol, plot these effects by latitude and longitude. (The shape of the plot should resemble the shape of the state.)
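One possible outline of steps 2 through 5 is sketched below; it is a starting point under stated assumptions, not the required solution. The file name 'ushcn/stateNN.txt', the dataset names, and the 20-day cutoff for "too many days missing" are all hypothetical choices. The column positions (station id in 1-6, year in 7-10, month in 11-12, element in 13-16, then 31 groups of a 5-character value followed by three 1-character flags) and the rule that a blank flag means "no flag" are assumptions that should be checked against 'data_format.txt'.

```sas
%let min_days = 20 ;   /* hypothetical cutoff: fewer valid days => month is missing */

/* Step 2: one output row per station/year/month/day, TMAX only.      */
/* ASSUMED layout; verify every column position in data_format.txt.   */
data daily ;
  infile 'ushcn/stateNN.txt' truncover lrecl=264 ;   /* hypothetical file name */
  input @1 coop_id $6. @7 year 4. @11 month 2. @13 element $4. @ ;
  if element ne 'TMAX' then delete ;
  do day = 1 to 31 ;
    /* the trailing @ holds the record so the next group of 8 columns is read */
    input tmax 5. mflag $char1. qflag $char1. sflag $char1. @ ;
    /* -9999 or a non-blank MFLAG/QFLAG => treat the observation as missing */
    if tmax = -9999 or mflag ne ' ' or qflag ne ' ' then tmax = . ;
    output ;
  end ;
  keep coop_id year month day tmax ;
run ;

/* Step 3: monthly mean of TMAX and the count of non-missing days. */
proc means data=daily noprint nway ;
  class coop_id year month ;
  var tmax ;
  output out=monthly mean=tmax_mean n=n_days ;
run ;

data monthly ;
  set monthly ;
  if n_days < &min_days then tmax_mean = . ;
  drop _type_ _freq_ ;
run ;

/* Steps 4-5: fit the model by station and capture the estimates with ODS. */
proc sort data=monthly ;
  by coop_id ;
run ;

ods output ParameterEstimates=est ;
proc glm data=monthly ;
  by coop_id ;
  class month ;
  model tmax_mean = year month / solution ;
run ;
quit ;

/* Keep only the year coefficient (the slope) for each station. */
data slopes ;
  set est ;
  where upcase(strip(Parameter)) = 'YEAR' ;
  keep coop_id Estimate StdErr tValue Probt ;
run ;
```

The resulting 'slopes' dataset has one row per station, keyed by 'coop_id', which is what step 6 needs for the merge with the station information.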
The information on the US Historical Climatology Network can be found at their site:

     http://cdiac.ornl.gov/epubs/ndp/ushcn/ushcn.html

*) If you want to do a fancier statistical model with the daily data, then fit a time trend with at least three harmonics of a yearly (365.25 day) cycle instead of the ANACOVA model. If you don't understand what I'm talking about, then ignore this point.
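For anyone attempting the optional point, one common reading of "a time trend with three harmonics" is the regression

     tmax(t) = a + b*t + sum over k = 1, 2, 3 of [ A_k * sin(2*pi*k*t) + B_k * cos(2*pi*k*t) ]

with t measured in years (days divided by 365.25), so that b is the trend in degrees per year. The sketch below fits that model under this interpretation. It assumes a daily dataset named 'daily' with variables coop_id, year, month, day, and tmax (flagged and missing values already set to missing); those names are hypothetical.

```sas
/* Build the trend variable and three sine/cosine pairs. */
data harm ;
  set daily ;                           /* hypothetical daily dataset */
  if tmax = . then delete ;             /* also drops impossible dates such as Feb 30 */
  date = mdy(month, day, year) ;        /* SAS date value, in days */
  t = date / 365.25 ;                   /* time in years */
  array s{3} s1-s3 ;
  array c{3} c1-c3 ;
  do k = 1 to 3 ;
    s{k} = sin(2 * constant('pi') * k * t) ;
    c{k} = cos(2 * constant('pi') * k * t) ;
  end ;
  drop k ;
run ;

proc sort data=harm ;
  by coop_id ;
run ;

/* The coefficient of t is the trend; the s and c terms absorb the seasonal cycle. */
ods output ParameterEstimates=harm_est ;
proc glm data=harm ;
  by coop_id ;
  model tmax = t s1 c1 s2 c2 s3 c3 / solution ;
run ;
quit ;
```

The row of 'harm_est' with Parameter equal to 't' plays the same role as the year coefficient in the ANACOVA model, so steps 5 through 8 can proceed unchanged from there.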