
The One Thing Every Data Engineer Must Do

Every Data Engineer has the power to revolutionize data analytics for their organization if they choose.


While Data Engineers are hard at work, their job is getting harder. The volume of data is exploding and is predicted to reach 149 zettabytes by the end of 2024. It would take more than 582 billion iPhones to hold that much data.

Compounding the skyrocketing data volumes is the growing number of data technologies and growing demand for Data Engineers to provide data to answer important business questions and make data-driven decisions.


For the Data Engineer and their peers, this means backlogs are getting bigger, jobs are taking longer to finish, and it is unrealistic to consolidate all data into one data warehouse. When the business doesn’t know what questions it will need to answer tomorrow, how can the Data Team possibly be prepared to help answer those questions quickly?


And here is the one thing every Data Engineer must do: always be looking for and suggesting new ways to deliver more, faster. If they don't, their job will get harder, and the business will not be able to use data to optimize distribution, sales, or customer engagement.

 
 

It happens to everyone: as the workload grows, we work harder and never feel that we have the time to find ways to improve. It feels as if taking your foot off the gas for even one minute will put you even further behind, with no chance of catching up.


It may seem impossible or even counterintuitive, but it is important to make time to stop and look objectively at the process you and your team use to answer questions with data.


It will help to think about where the problems are at each step of the process. Typically there are nine steps end to end:

  1. Ask the question

  2. Discover Data

  3. Move Data

  4. Verify Data

  5. Assemble Data

  6. Prep Data

  7. Query Data

  8. Visualize Data

  9. Return Answer


For each step in the process ask these questions:

  1. How long does it take (use a range from shortest to longest)?

  2. How do we do it today (process, people, tools used, partners)?

  3. Why do we do it that way?

  4. What if we could [insert idea]? Examples: reduce the number of tools used, or automate the step.

  5. What solutions can I find on the web for this step?


Below is an example that you can use as a template to get started:
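
One way to capture the answers is a small worksheet with one entry per step. The following is a minimal sketch in Python, not a definitive template: the `StepReview` fields, the `blank_worksheet` and `slowest_steps` helpers, and the hour values are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, field

# The nine steps of the process, in order.
STEPS = [
    "Ask the question",
    "Discover Data",
    "Move Data",
    "Verify Data",
    "Assemble Data",
    "Prep Data",
    "Query Data",
    "Visualize Data",
    "Return Answer",
]


@dataclass
class StepReview:
    """Answers to the five review questions for one step (hypothetical layout)."""
    step: str
    shortest_hours: float = 0.0   # 1. How long does it take? (low end of the range)
    longest_hours: float = 0.0    # 1. How long does it take? (high end of the range)
    how_today: str = ""           # 2. Process, people, tools used, partners
    why: str = ""                 # 3. Why do we do it that way?
    what_if: list[str] = field(default_factory=list)          # 4. Ideas to try
    solutions_found: list[str] = field(default_factory=list)  # 5. Findings from the web


def blank_worksheet() -> list[StepReview]:
    """Return one empty review entry per step, ready to fill in."""
    return [StepReview(step=name) for name in STEPS]


def slowest_steps(worksheet: list[StepReview], top: int = 3) -> list[StepReview]:
    """Return the entries with the largest worst-case duration, slowest first."""
    return sorted(worksheet, key=lambda r: r.longest_hours, reverse=True)[:top]


# Illustrative usage with made-up numbers.
worksheet = blank_worksheet()
move = worksheet[2]
move.shortest_hours, move.longest_hours = 4.0, 40.0
move.how_today = "Nightly batch copy into the warehouse"
move.what_if.append("Query the data where it lives instead of copying it")

for review in slowest_steps(worksheet, top=1):
    print(f"{review.step}: {review.shortest_hours}-{review.longest_hours} hours")
```

Sorting by the longest duration is only one way to decide where to start; ranking by the width of the range would instead highlight the least predictable steps.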


If in your job you don’t perform each step, then focus only on the steps you perform. Talk to your colleagues and learn from them for the steps they perform.


Some of the obvious causes of long process times or productivity issues include:

  • Rework - the need to repeat steps

  • Duplicating work - two or more people create the same thing

  • Moving data when it isn’t necessary

  • Not giving your data consumers self-service access


Don’t make this into a big task; break it into smaller pieces. Set aside 10 minutes every other day to spend time thinking and exploring. If nothing else, search the internet. It is likely that someone else has experienced the same problem and has shared their experience or solution.


Remember, it is part of every Data Engineer's job to find ways to improve. Best of luck!

 

What to read next


Learn more about saving time at each data management and visualization step.


Download the 451 report with 6 trends driving a more direct path from data to decision.


Read the report from experts Eckerson Group - When a Data Catalog is Not Enough.

 
