4 Ways to Spot a Fake Data Scientist
Editor's Note, 2018: Even several years after publication, this blog post continues to generate a fair amount of traffic and discussion, with many asking what our complete criteria are for discerning who is a data scientist. We posted the complete (and updated for 2018) criteria that we use for our data science salary studies, which you can find here, if you're interested.
In June 2013, with the data science hype picking up steam, I took to my blog to write Data Scientists… or Data Wannabes? – a post bemoaning the plague of professionals who had begun changing their titles to Data Scientist without any of the necessary qualifications. At the time, the Data Scientist title was still loosely defined, which resulted in confusion in the market, obfuscation in resumes, and exaggeration of skills.
Two years later, the trend has gotten even worse. With the media inflating Data Scientists’ already-high salaries (I’ve yet to see a newly-graduated data scientist making $300,000+, as has been reported), data scientists have captured the imagination of job seekers thinking that they can write Hadoop on their resume and get a 50% raise.
I’m here to tell you that, from all of my conversations with data scientists and “data scientists,” I’ve discovered four telltale signs that a professional is not a true data scientist:
1. Lack of a highly quantitative advanced degree – It’s incredibly rare for someone without an advanced quantitative degree to have the technical skills necessary to be a data scientist. In our data science salary report we found that 88% of data scientists have at least a Master’s degree, and 48% have a Ph.D. The areas of study may vary, but the vast majority are very rigorous quantitative, technical, or scientific programs, including Math, Statistics, Computer Science, Engineering, Economics, and Operations Research.
2017 Update - Although it is becoming slightly more common for data scientists to have a quantitative Bachelor's or Master's layered with a top-tier bootcamp instead of a PhD, without a strong foundation in a technically rigorous program, it is very difficult to master all of the statistical and computer science concepts and skills necessary to be hired as a data scientist.
2. No concrete examples of experience with unstructured data or statistical analysis – Lists of tools such as Hadoop, Python, and AWS need to be accompanied by projects that show those skills being put to good use. If a professional cannot provide clear examples of their experience with unstructured data, or mentions data science projects but keeps their involvement very vague, then they are probably not a data scientist. If their specific role in or impact on a Big Data project is unclear, that is cause for concern.
2017 Update - This is also true for statistical analysis. If a professional has experience organizing and structuring large data sets, but little-to-no experience with statistical concepts and analytics, then they are likely a Data Engineer, not a Data Scientist.
3. Purely academic or research background – Now, this is not to say that someone with a stellar academic or research background won’t make a great corporate data scientist, but a key component to being a data scientist in a corporate setting is business acumen. Understanding how findings affect business goals and delivering actionable insights to leaders is critical to a data scientist’s success. Many research academics have exceptional data skills, but without strong business savvy they are not data scientists… yet.
4. List of basic business skills – If I see a list of tools on a “data scientist” resume like Omniture, Google Analytics, SPSS, Excel, or any other Microsoft Office tool, you can be sure that I will take a harder look at whether or not this professional makes the grade. These skills are basic business qualifications that, by themselves, are insufficient for most data science jobs, and are not indicative of a true data scientist.