The Must-Have Skills You Need to Become a Data Scientist
Update 2018: Years after publication, this post continues to generate a lot of interest! As such, we've updated it to more accurately reflect the skills that employers are looking for now, as well as current statistics on data scientists' education.
Over the past four years, interest in data science has soared. Nate Silver is a household name, companies everywhere are searching for unicorns, and professionals in many different disciplines have begun eyeing the well-salaried profession as a possible career move.
In our recruiting searches here at Burtch Works, we’ve spoken to many analytics professionals who are considering adapting their skills to the growing field of data science, and who have questions about how to do so. From my perspective as a recruiter, I wanted to put together a list of technical and non-technical skills that are critical to success in data science careers, and at the top of hiring managers’ lists.
Every company will value skills and tools a bit differently, and this is by no means an exhaustive list, but if you have experience in these areas you will be making a strong case for yourself as a data scientist.
1. Education
Data scientists are highly educated – 91% have at least a Master’s degree and 48% have PhDs – and while there are notable exceptions, a very strong educational background is usually required to develop the depth of knowledge necessary to be a data scientist. Their most common fields of study are Mathematics and Statistics (25%), followed by Computer Science (20%), Natural Sciences such as Physics (20%), and Engineering (18%). You can learn more about this in our Burtch Works Study: Salaries of Data Scientists, which can be downloaded here.
2. Python Coding
Python is the coding language I most often see required in data science jobs, along with Java, Perl, or C/C++.
3. Machine Learning
Machine learning techniques have quickly become an integral part of all the data science jobs we work on, so if you're looking to transition into the field this is a must-know!
4. Hadoop Platform
Although this isn’t always a requirement, it is heavily preferred in many cases. Having experience with Hive or Pig is also a strong selling point. Familiarity with cloud tools such as AWS can also be beneficial.
5. Unstructured data
It is critical that a data scientist be able to work with unstructured data, whether it is from social media, video feeds, sensor data or audio.
6. Intellectual curiosity
No doubt you’ve seen this phrase everywhere lately, especially as it relates to data scientists. Frank Lo describes what it means, and talks about other necessary “soft skills” in his guest blog posted a few months ago.
7. Business acumen
To be a data scientist in a business role, you’ll need a solid understanding of the industry you’re working in and of the business problems your company is trying to solve. For data science careers, this means being able to discern which problems are most important for the business to solve, as well as identifying new ways the business should be leveraging its data.
8. Communication skills
Companies searching for a strong data scientist are looking for someone who can clearly and fluently translate their technical findings to a non-technical team, such as the Marketing or Sales departments. A data scientist must enable the business to make decisions by arming them with quantified insights, in addition to understanding the needs of their non-technical colleagues in order to wrangle the data appropriately. Check out our flash survey for more information on communication skills for quantitative professionals.
The next question I always get is, “What can I do to develop these skills?” There are many resources around the web, but I don’t want to give anyone the mistaken impression that the data scientist career path is as simple as taking a few MOOCs. Unless you already have a strong quantitative background, the road to becoming a data scientist will be challenging – but not impossible. If you’re sincerely interested in the field and have a passion for data and lifelong learning, don’t let your background discourage you from pursuing data science as a career. Here are a few of the resources we’ve found to be helpful:
- Advanced Degree – More Data Science programs are popping up to serve the current demand, but there are also many Mathematics, Statistics, and Computer Science programs.
- MOOCs – Coursera, Udacity, and Codecademy are good places to start.
- Certifications – KDnuggets has compiled an extensive list.
- Bootcamps – For more information about how this approach compares to degree programs or MOOCs, check out this guest blog from the data scientists at Datascope Analytics.
- Kaggle – Kaggle hosts data science competitions where you can practice, hone your skills with messy, real-world data, and tackle actual business problems. Employers take Kaggle rankings seriously, as they can be seen as relevant, hands-on project work.
- GitHub – As employers look to verify the technical prowess of their applicants, many have been asking for links to GitHub profiles to look over samples of code. If you're applying for a data science role, make sure you have examples of work you can provide!
- KDnuggets – KDnuggets is a good resource for staying at the forefront of industry trends in data science.
- The Burtch Works Study: Salaries of Data Scientists – If you’re looking for more information about the salaries and demographics of current data scientists be sure to download our data scientist salary study.
I’m sure there are items I may have missed, so if there’s a crucial skill or resource you think would be helpful to any data scientist hopefuls, feel free to share it in the comments below!