Getting asked for Advice on being a Data Analyst

I got a message on LinkedIn from a former colleague of mine from Arizona Priority Care asking me:

Wanted to pick your brain on something. what do you think the outlook is for a data analyst? Debating a masters program in that and covers a few things but also includes certifications in SAS. Trying to decide if that will “pay off” in the long run or if I should explore different disciplines.

This was a really good question and I thought about it a bit. My response was:

I think Data Analysis (or Data Science, or Analytics) are all going to play a huge role in business going forward and that it would be a smart move to get a masters degree in one of those. I would avoid any certification programs though, just because they can be less rigorous and don’t seem to have the same weight as a full degree.

SAS is an interesting language, but I’d investigate what companies use SAS and make sure that you’d like to work for them (or in the industry). Many companies are turning towards open source Data Analytics tools (like R and Python). But in general, don’t get too hung up on the tool (SAS, Python, R); really understand what you’re doing with it. Why would I choose a standard regression over Two Stage Least Squares? When do I want to use a logistic regression model, and why? What does the output tell me, and what is it missing?

Developing that understanding will allow you to really stand out.

Good luck with your decision. Let me know which direction you decide to go in,

Best,

Ryan

I hope that I was able to help my former colleague and was super happy that he reached out to me.

I wanted to write this up in a more public form just in case it helps someone, or just in case I look back on it at some point and it helps me.

How to pick a team to root for (when the Dodgers aren’t playing)

I’ve been thinking a bit about how to decide which team to root for. Mostly I just want to stay logically consistent with the way I choose to root for a team (when the Dodgers aren’t playing obviously).

After much thought (and sketches on my iPad) I’ve come up with this table to help me determine who to root for:

| Opp 1 / Opp 2 | NL West | NL Central | NL East | AL West | AL Central | AL East |
|---|---|---|---|---|---|---|
| NL West | Root for team that helps the Dodgers | NL Central Team | NL East Team | NL West Team, unless it hurts the Dodgers | NL West Team, unless it hurts the Dodgers | NL West Team, unless it hurts the Dodgers |
| NL Central | NL Central Team | Root for underdog | NL Central Team | NL Central Team | NL Central Team | NL Central Team |
| NL East | NL East Team | NL Central Team | Root for underdog | NL East Team | NL East Team | NL East Team |
| AL West | NL West Team, unless it hurts the Dodgers | NL Central Team | NL East Team | The Angels over the A’s over the Mariners over the Rangers over the Astros | AL West Team | AL West Team |
| AL Central | NL West Team, unless it hurts the Dodgers | NL Central Team | NL East Team | AL West Team | Root for underdog | AL Central Team |
| AL East | NL West Team, unless it hurts the Dodgers | NL Central Team | NL East Team | AL West Team | AL Central Team | Root for underdog (unless it’s the Yankees) |

The basic rule is to root for the team that helps the Dodgers’ playoff chances, then National League over American League, and finally West over Central over East (from a division perspective).
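For what it’s worth, the table can be sketched in a few lines of Python. This is only one reading of it: the division ordering in `PRIORITY` is inferred from the cells where the two divisions differ, and the names `PRIORITY`, `SAME_DIVISION`, and `who_to_root_for` are made up for illustration.

```python
# Division priority inferred from the table's off-diagonal cells.
# This ordering is an interpretation, not something the table states outright.
PRIORITY = ["NL Central", "NL East", "NL West", "AL West", "AL Central", "AL East"]

# Diagonal cells: what to do when both teams are in the same division.
SAME_DIVISION = {
    "NL West": "Root for team that helps the Dodgers",
    "NL Central": "Root for underdog",
    "NL East": "Root for underdog",
    "AL West": "The Angels over the A's over the Mariners over the Rangers over the Astros",
    "AL Central": "Root for underdog",
    "AL East": "Root for underdog (unless it's the Yankees)",
}


def who_to_root_for(div1, div2):
    """Return the table cell for a game between teams from div1 and div2."""
    if div1 == div2:
        return SAME_DIVISION[div1]
    # Pick whichever division comes first in the inferred priority list.
    winner = min(div1, div2, key=PRIORITY.index)
    if winner == "NL West":
        return "NL West Team, unless it hurts the Dodgers"
    return f"{winner} Team"


print(who_to_root_for("AL East", "NL West"))
print(who_to_root_for("AL Central", "AL Central"))
```

The lookup is symmetric, so it doesn’t matter which team is listed first.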

There were a couple of cool sketches I made, on real paper and my iPad. Turns out, sometimes you really need to think about things before you write them down and commit to them.

Of course, this is all subject to change depending on the impact any game would have on the Dodgers.

Daylight Savings Time

Dr. Drang has posted on Daylight Savings in the past, but in a recent post he critiqued (rightly so) the data presentation by a journalist at the Washington Post on Daylight Savings, and that got me thinking.

In the post he generated a chart showing both the total number of daylight hours and the sunrise / sunset times in Chicago. However, initially he didn’t post the code he used to generate it. The next day, in a follow-up post, he did, and that really got me thinking.

I wonder what the chart would look like for cities up and down the west coast (say from San Diego, CA to Seattle WA)?

Drang’s post had all of the code necessary to generate the graph, but for the data munging, he indicated:

If I were going to do this sort of thing on a regular basis, I’d write a script to handle this editing, but for a one-off I just did it “by hand.”

Doing it by hand wasn’t going to work for me if I was going to do several cities and so I needed to write a parser for the source of the data (The US Naval Observatory).

The entire script is on my GitHub sunrisesunset repo. I won’t go into the nitty gritty details, but I will call out a couple of things that I discovered during the development process.

Writing a parser is hard. Like really hard. Each time I thought I had it, I didn’t. I was finally able to get the parser to work on cities with 01, 29, 30, or 31 in their longitude / latitude combinations.
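To give a flavor of the problem, here is a minimal sketch of parsing one row of a fixed-width sunrise / sunset table. It is not the code from my repo, and the layout is an assumption: each data row starts with a two-digit day, followed by twelve month blocks of `hhmm hhmm` (rise, then set), with blanks where a day doesn’t exist. The constants and the function names `parse_hhmm` and `parse_row` are hypothetical, so check them against the real file before trusting them.

```python
from datetime import time

DAY_WIDTH = 2     # assumed: rows start with a two-digit day of month
FIRST_BLOCK = 4   # assumed: column where the January block begins
BLOCK_WIDTH = 11  # assumed: "hhmm hhmm" plus two spaces of padding


def parse_hhmm(text):
    """Turn 'hhmm' into a datetime.time, or None if the cell is blank or invalid."""
    text = text.strip()
    if len(text) != 4 or not text.isdigit():
        return None
    hour, minute = int(text[:2]), int(text[2:])
    if hour > 23 or minute > 59:
        return None
    return time(hour, minute)


def parse_row(line):
    """Return (day, list of 12 (rise, set) tuples or None), or None for non-data lines."""
    day_text = line[:DAY_WIDTH]
    if not day_text.isdigit() or not 1 <= int(day_text) <= 31:
        return None
    months = []
    for m in range(12):
        start = FIRST_BLOCK + m * BLOCK_WIDTH
        block = line[start:start + BLOCK_WIDTH]
        rise, sunset = parse_hhmm(block[0:4]), parse_hhmm(block[5:9])
        # Blank cells (for example February 30) become None instead of shifting columns.
        months.append((rise, sunset) if rise is not None and sunset is not None else None)
    return int(day_text), months
```

Slicing by column position, rather than splitting on whitespace, is what keeps a blank cell from shifting every later month one column to the left.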

I generated the same graph as Dr. Drang for the following cities:

  • Phoenix, AZ
  • Eugene, OR
  • Portland, OR
  • Salem, OR
  • Seaside, OR
  • Eureka, CA
  • Indio, CA
  • Long Beach, CA
  • Monterey, CA
  • San Diego, CA
  • San Francisco, CA
  • San Luis Obispo, CA
  • Ventura, CA
  • Ferndale, WA
  • Olympia, WA
  • Seattle, WA

Why did I pick a city in Arizona? They don’t do Daylight Savings and I wanted to have a comparison of what it’s like for them!

The charts in latitude order (from south to north) are below:

San Diego

Phoenix

Indio

Long Beach

Ventura

San Luis Obispo

Monterey

San Francisco

Eureka

Eugene

Salem

Portland

Seaside

Olympia

Seattle

Ferndale

While these images do show the different impact of Daylight Savings, I think the images are more compelling when shown as a GIF:

We see just how different the impacts of DST are on each city depending on their latitude.

One of Dr. Drang’s main points in support of DST is:

If, by the way, you think the solution is to stay on DST throughout the year, I can only tell you that we tried that back in the 70s and it didn’t turn out well. Sunrise here in Chicago was after 8:00 am, which put school children out on the street at bus stops before dawn in the dead of winter. It was the same on the East Coast. Nobody liked that.

I think that comment says more about our school system and less about the need for DST.

For this whole argument I’m way more on the side of CGP Grey, who does a great job of explaining what Daylight Savings Time is.

I think we may want to start looking at a Universal Planetary time (say UTC) and base all activities on that regardless of where you are in the world. The only reason 5am seems early (to some people) is because we’ve collectively decided that 5am (depending on the time of the year) is either WAY before sunrise or just a bit before sunrise, but really it’s just a number.

If we used UTC in California (where I’m at), 5am would be 12pm. Normally 12pm would be lunch time, but that’s only a convention that we have constructed. It could just as easily be the crack of dawn as it could be lunch time.
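The arithmetic is easy to check with Python’s standard library. This is just a quick sketch: the dates are arbitrary ones I picked to land in summer and winter, `pacific_to_utc` is a made-up helper name, and `zoneinfo` needs Python 3.9 or later with time zone data available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_to_utc(year, month, day, hour):
    """Convert a wall-clock hour in California to the same instant in UTC."""
    local = datetime(year, month, day, hour, tzinfo=PACIFIC)
    return local.astimezone(timezone.utc)


summer = pacific_to_utc(2018, 7, 1, 5)  # illustrative date during DST (UTC-7)
winter = pacific_to_utc(2018, 1, 1, 5)  # illustrative date on standard time (UTC-8)

print("5am in July is", summer.strftime("%H:%M"), "UTC")
print("5am in January is", winter.strftime("%H:%M"), "UTC")
```

The 5am to 12pm mapping holds while California is on DST; on standard time the same wall-clock hour lands at 13:00 UTC, which is one more wrinkle a single planetary clock would remove.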

Do I think a conversion like this will ever happen? No. I just really hope that at some point in the distant future when aliens finally come and visit us, we aren’t late (or them early) because we have such a wacky time system here.