9 Predictions for Data Engineering in 2021 and Beyond
This post is contributed by Caroline Evans, Burtch Works’ data engineering recruiting specialist.
After so much change and disruption over the past year, I know many of us are glad to be looking forward to the fresh year ahead. As a recruiter who specializes in data engineering roles (including machine learning engineers and big data engineers, among others), I thought it might be interesting to participate in the Burtch Works tradition of Annual Predictions!
While we’re undoubtedly in for even more unforeseeable hiring market shifts in 2021, I wanted to share some of my own insights on where I think the data engineering space is headed. Here are some of my predictions and trends for this year – and beyond! – based on my conversations with hundreds of data engineers and their employers over the past few years.
Data Engineering Predictions and Trends to Watch in 2021 and Beyond
1. Product-focused data engineering roles are on the rise
In many cases, when a data product will serve external customers as opposed to internal ones, it needs to be customizable and scalable to serve a variety of different customers. An example could be a company like Microsoft customizing their products for a wide range of external clients, or a company that caters to clients in a specific industry, like healthcare.
Rather than looking for data engineers who can create a data product that will be recycled for various internal clients, many of my searches over the past six months have specifically sought data engineers coming from a data-driven product company. This background is becoming more sought after because it is relatively uncommon, and I expect we’ll see more of these roles popping up in the future.
2. Data Engineers being asked to do DevOps work
With the rise of the machine learning engineer, many teams are looking for engineers who are adept at cleaning data, as well as building machine learning pipelines and productionalizing models. However, I’ve also been hearing from some data engineers that companies want people who can effectively test and troubleshoot in a production environment, which is traditionally considered DevOps work. I have found that while many engineers have these skills, they are much more interested in sticking with the data engineering work.
3. The lines between Big Data Engineers and Data Scientists will continue to blur
Following along with #2, I’m also seeing increased interest from clients looking to hire big data engineers who deal with the data from end to end – including the statistics and modeling! I had one client describe a machine learning engineer to me as a “full stack data scientist”, and I’ve been specifically noticing this in the medical- and healthcare-related fields. This focus on finding someone who can “do it all” could potentially be because of the type of data these professionals are dealing with (IoT or streaming data). And, this brings me to my next prediction…
4. Hybrid roles – the unicorn is coming back?
Predictions 2, 3, and 4 are all part of a larger trend that I’m seeing, which is the increasing prevalence of hybrid roles (or companies attempting to find someone for a hybrid role). One example could be a role with data engineering, machine learning engineering, and DevOps all rolled into one position.
There are a few potential reasons why we might see companies do this. One is that if the data organization is still relatively green, they might not be exactly sure what they need yet or what makes the most sense for them, so they’re looking to find someone who can do a bit of everything. Another reason could be that a small company has limited ability to hire a large team of engineers, so it wants to find someone (or a few people) who can wear many hats.
Unfortunately, from what I’ve seen and heard from engineers, most companies that attempt to fill positions like this are doing themselves a disservice. Not only is it very difficult to find someone who is an expert in many fields at once, but these roles can also be genuinely misleading and frustrating for the data engineers hired into them if they find themselves often working on tasks that are not in their wheelhouse or don’t fall under what they’d prefer to be doing most of the time.
5. More data engineering jobs are going 100% remote
As my colleagues pointed out in the trends section of their data science salary report, there has been a noticeable increase in remote working opportunities, and this has a variety of advantages for the talent market. It opens doors both for data engineers looking for opportunities in a variety of locations and for employers who are looking to broaden their talent reach.
6. Huge acceleration of on-premises databases transitioning to the cloud
I’ve already been seeing this happen in my own conversations with data engineers and employers in this space, and the migration to the cloud is only accelerating. In fact, Gartner predicts that “By 2022, 75% of all databases will be deployed or migrated to a cloud platform, with only 5% ever considered for repatriation to on-premises”. Moving to the cloud has many advantages, including cost and time savings, reliability, and mobility.
7. With the shift to cloud, BI Engineering will shift as well
With this transition, of course, will come different tools and new ways to accomplish goals for the organization. While dashboards have long been a mainstay in business intelligence, we’re already seeing many teams move away from static dashboards to solutions that are more easily tailored to the individual user.
8. The lines between data lakes and data warehouses will continue to blur
Data lakes were initially designed to take on unstructured data, and while data warehouses have also evolved, I think we may see more of a mix between these two approaches. Data lakes will be more fine-tuned and have more capabilities, which has huge potential to advance machine learning and artificial intelligence.
9. It will be harder than ever to attract data engineers to a Hadoop ecosystem
Hadoop has been on the decline for years, but now more than ever it will be extremely difficult to attract data engineering talent with a Hadoop ecosystem and no cloud capabilities. Cloud is everywhere right now, and strong talent is drawn to organizations that are using the latest tools and technologies to push the business forward.
In terms of cloud tools, AWS, Azure, and GCP are always highly in demand. AWS especially has a lot of flexibility, so it’s a highly desired addition to your toolkit! There are many different tools that go along with each of these platforms and serve different functions, so it’s becoming increasingly common for employers to target their searches around specific tools.