Each year for the past six years, we’ve asked data scientists and analytics professionals whether they prefer to use SAS, R, or Python and examined their responses to see how tool preferences vary across a range of factors.
What began as a “spirited” debate between SAS and R users in 2014 and 2015 became even more animated when Python was added to the mix in 2016 and 2017, culminating in a dead heat among all three competitors in our 2018 survey results.
With the stakes for victory running high, responses were fiery this year with over 1,000 votes counted.
For comparison purposes, we only ask one question: Which do you prefer to use – SAS, R, or Python?
And the overall 2019 winner is… Python!
As you can see from the six-year trend, Python has more than doubled its share of the vote since its introduction in 2016, from 20% to 41%. R squeaked by SAS to narrowly claim second place, with 30% of the vote to SAS’s 29%.
SAS, R, or Python Preferences Examined by Demographic Factors
Each year we also combine participant responses with demographic information to show how respondent preferences vary by factors like region, industry, years of experience, education, and data scientists vs. other predictive analytics professionals.
Since we’ve noticed early career professionals increasingly gravitating towards open source tools like R, and especially Python, in the past few years, this year we thought it would be interesting to survey the data science and analytics students in our network to see how their preferences compare with those of working professionals.
*The “College/graduate students” category shows the preferences of Bachelor’s, Master’s, and PhD students, the majority of whom are (or will be) in the graduating classes of 2018-2020. Respondents who have already graduated were counted as students if they are still in internships, completing post-graduate work, or newly graduated and not yet employed in a data science or analytics position. Because they are not yet employed in the field, these students are excluded from the other samples (overall results, industry, region, etc.).
Note: Since Burtch Works is a recruiting firm, we do not ask professionals for their age. However, we do know their years of quantitative work experience, which is highly correlated with age and reflects how many years have passed since they first entered the analytics or data science fields (this might be after university or, for those who changed careers, when they transitioned into the quantitative fields).
In past surveys, Bachelor’s and Master’s degree holders have typically shown stronger support for SAS, while PhD holders have consistently favored open source tools like R and, increasingly, Python. However, this year all groups preferred Python.
Area of Study
A new factor that we examined this year was the area of study for the respondent’s highest degree earned, which is an attribute we showcase in our salary reports, and this showed some interesting tool preference trends.
The votes were relatively close for four out of the seven areas of study that we examined, with professionals who studied Economics showing the strongest support for SAS and those in the Social Sciences showing the strongest preference for R. Professionals whose area of study was Engineering, Computer Science, or Natural Sciences all overwhelmingly preferred Python.
In our recent salary report, we noted that data scientists who mainly analyze unstructured data are far more likely than other predictive analytics professionals to come from a computer science, engineering, or natural science educational background.
While preferences in the West Coast and Mountain regions were very similar to those that we reported last year, there was a surge in votes for Python among respondents in the Midwest, Southeast, and Northeast regions. In 2018, SAS was still leading in the Midwest and Southeast; however, this year Python is in the lead in every region.
While Tech/Telecom has always had a strong preference for open source tools (especially Python), this year we were especially interested to see the preference shift in Financial Services.
Continuing a trend that we highlighted in our analysis video last year, Python surged among financial services professionals from 19% of the vote in 2017 to 28% in 2018, and this year it leapt again to 41% to capture first place. In 2018, SAS held the lead among professionals in financial services firms with 42% of the vote, but in 2019 it dropped to 35%.
This shift in tool preferences is happening alongside another trend we noted recently, where more financial services firms are supporting open source tools like R and Python, or at least allowing them. We’ve also seen several major financial services employers, which began to transition away from SAS a few years ago, complete their transition to open source tools over the past year, which has made the surge towards Python even more apparent than before.
As we’ve pointed out, tool choice can be crucial for these organizations in landing top quantitative talent, especially at the early career level.
Comparing Data Scientists to other Predictive Analytics Professionals
While Burtch Works has always regarded data scientists as a specialized subset of predictive analytics professionals, we’ve often segmented data scientists and predictive analytics professionals in our salary and other analyses to compare the two groups. This is primarily because of differences in skillsets that result in differing salary bands, but, as we pointed out in our recent salary report, there are a number of demographic distinctions between the two groups as well (such as common areas of study or industries of employment).
As we’ve defined them, data scientists work primarily with unstructured or streaming data, whereas other predictive analytics professionals mostly focus on structured data. Although the two areas have become more blended of late, and the rest of this post combines the two groups, we thought it might be interesting to show how their tool preferences differ.
As you can see, data scientists overwhelmingly prefer Python (and none chose SAS), whereas preferences amongst predictive analytics professionals were relatively even.
A Few Lively Comments
Every year we enjoy reading the interesting responses that we get along with your survey votes, and wanted to share a few!
I think I’m going with R this time, I’m a recent convert from the gospel of SAS.
Python. Been a SAS guy my entire career but it’s been dropped corporately (terrible decision).
SAS. I don’t know the other two. 😉
R – aka pirate language.
It’s an interesting question to ponder how many more years you’ll have SAS on the survey. My guess is that next year will be the last year
I definitely prefer Python. Even if it wasn’t the objectively superior language (which it is), it’s by far the best looking 😁
I am still using SAS but this time next year it will be Python.
I use both SAS and R. Neither one does everything I need.
Python. This shouldn’t be a question anymore.
As always, this survey gives us a lot to think about! What did you think of this year’s results? Did anything surprise you or do you have any predictions for future years? Let us know in the comments below.
Interested in our salary research on data scientists and predictive analytics professionals? Download our studies using the button below.
Want to learn more? Check out our extended analysis below, where we go into more depth with additional trends that we found in this year’s data.