Kathleen Martin
I lived the life of a data engineer. The majority of my time as a data engineer was dedicated to writing or maintaining data pipelines. Every time I thought I was on top of things, something new would set me back. There was always a new request for faster analytics, new data in the pipeline, or scaling to an impossible level.
A typical day in my life as a data engineer went like this:
- Fielding around five requests a day to create new tables, update schemas, and change transformations. (Of course, from the point of view of my internal customer, each of these requests was urgent.)
- Starting work at 2 a.m. because operations wouldn’t allow me to change data pipelines during work hours.
- Responding to calls from the network operations center about production data pipelines that had not completed, which meant profiling the problem, restarting servers, increasing server sizes, and cleaning up temporary data that hadn't been purged -- all under extreme time pressure.
Data Engineers Feel Burnt Out
It turns out that I was not the only one suffering burnout due to data pipeline engineering. We have been hearing a great deal lately about data engineers feeling exhausted and ready to leave their jobs. One aspect that is rarely talked about in this discussion is the role that manual operations for data pipelines play in the mounting frustration levels for data engineers.
According to an October 2021 study, 97 percent of data engineers reported experiencing burnout in their daily jobs. Nearly 80 percent reported they were considering switching careers. Four in five respondents (78 percent) wished they’d had a therapist available to help them cope with work stress.
This is troubling for a number of reasons. First, as tech leaders, we want professionals to be in jobs they enjoy and where they feel they are making important contributions without feeling overwhelmed. Second, as the U.S. deals with the Great Resignation and a large number of professionals are leaving their jobs and careers, the tech industry is feeling the pain.
Right now we need more -- not fewer -- data engineers. Data engineering is already suffering from an intense talent gap. To get a sense of how big the shortage is, I recently looked on LinkedIn and discovered that there are 217,000 open data engineer positions in the U.S. for only 33,000 employed data engineers, roughly a 6.5-to-1 ratio of jobs to people. This ratio eclipses that of data science, a profession that has for years been the poster child for skills deficits. I also found roughly 398,000 open data science positions and 78,000 employed data scientists, a 5-to-1 ratio, which is bad but still a better situation than for data engineering.
Finally, the advent of big, complex, and streaming data, combined with real-time analytics and machine learning users, has made data engineering much more difficult than it was when I experienced this pain. Data engineers used to be productive with SQL and Oracle under their belts. Today they need to manage several data platforms, write production code on complex distributed systems, and perform more manual operations (orchestration, file system management, and state management, for example) than ever before.
Continue reading: https://tdwi.org/articles/2022/05/16/diq-all-how-to-prevent-data-pipeline-engineering-burnout.aspx