
What is the workflow or process of a data scientist? What tools do they use in data science workflows?

  • Dr Dilek Celik
  • Jun 25

Updated: Jul 9

Figure: Flowchart of the data science process (Ask, Get, Explore, Model, and Communicate), with questions and tasks in blue and black text.
Image credit: This framework was introduced by Professors Joe Blitzstein and Hanspeter Pfister in their Harvard course, Introduction to Data Science.

Note all the purple arrows pointing backward—the data science workflow is non-linear, iterative, and cyclical. You can’t know the best path from the start.

Each stage demands different skills and tools.

Functional stack:

  • Stage 1: Ask a question (relevant to your organization)

    Skills: scientific thinking, domain knowledge, curiosity, business sense

    Tools: your brain, expert input, experience

  • Stage 2: Get the data

    Skills: cleaning, querying, scraping, coding

    Tools: SQL, Python, pandas, (Spark)

  • Stage 3: Explore the data

    Skills: pattern recognition, hypothesis building

    Tools: matplotlib, numpy, scipy, pandas, (Spark)

  • Stage 4: Model the data

    Skills: regression, ML, validation

    Tools: scikit-learn, pandas, (Spark, MLlib)

  • Stage 5: Communicate the results

    Skills: storytelling, visuals, writing

    Tools: matplotlib, Illustrator, PowerPoint

  • Stage 6: Implementation

    Skills: product sense, communication, organizational savvy

    This stage is crucial. Without pushing your work to implementation, you're just a consultant. You may not implement it alone—but it’s still your job to push it forward.

  • Stage 7: Test and measure impact

    Skills: all of the above

    Did it work? Was it worth it? You’re best placed to answer.

  • Conclusion: Data science is a cyclical process: Ask, Get the Data, Explore, Model, Communicate, Implement, and Measure.

    • The strongest data scientists follow through all the way.

    • This framework stands out for stressing two things: asking meaningful questions, and revising those questions as you learn more.
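
The Get, Explore, Model, and Communicate stages can be sketched in a few lines of code. The example below is only an illustration, not a reference implementation. To stay runnable without installing anything, it uses Python's built-in sqlite3 and statistics modules as stand-ins for the SQL, pandas, and scikit-learn tools listed above. The campaigns table, its ad_spend and sales columns, the numbers in it, and every function name are hypothetical.

```python
import sqlite3
from statistics import mean

# Made-up (ad_spend, sales) pairs, for illustration only.
ROWS = [
    (1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8),
    (5.0, 11.1), (6.0, 12.9), (7.0, 15.2), (8.0, 16.8),
]


def get_data():
    """Stage 2 (Get): load rows with SQL from a hypothetical in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE campaigns (ad_spend REAL, sales REAL)")
    conn.executemany("INSERT INTO campaigns VALUES (?, ?)", ROWS)
    rows = conn.execute(
        "SELECT ad_spend, sales FROM campaigns "
        "WHERE sales IS NOT NULL ORDER BY ad_spend"
    ).fetchall()
    conn.close()
    return rows


def correlation(xs, ys):
    """Stage 3 (Explore): Pearson correlation between two columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Stage 4 (Model): ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    return slope, my - slope * mx


def mean_abs_error(model, xs, ys):
    """Stage 4 (Validate): average absolute prediction error."""
    slope, intercept = model
    return mean(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


def run_workflow():
    rows = get_data()
    r = correlation([x for x, _ in rows], [y for _, y in rows])

    # Hold out every fourth row so the model is checked on unseen data.
    train = [row for i, row in enumerate(rows) if i % 4 != 3]
    held_out = [row for i, row in enumerate(rows) if i % 4 == 3]
    model = fit_line([x for x, _ in train], [y for _, y in train])
    mae = mean_abs_error(
        model, [x for x, _ in held_out], [y for _, y in held_out]
    )

    # Stage 5 (Communicate): turn the numbers into one plain sentence.
    summary = (
        f"Across {len(rows)} campaigns, each extra unit of ad spend is "
        f"associated with about {model[0]:.2f} more units of sales "
        f"(held-out mean absolute error: {mae:.2f})."
    )
    return {
        "n_rows": len(rows),
        "correlation": r,
        "slope": model[0],
        "intercept": model[1],
        "holdout_mae": mae,
        "summary": summary,
    }


if __name__ == "__main__":
    print(run_workflow()["summary"])
```

Calling run_workflow() returns the intermediate numbers along with the one-sentence summary. In a real project the path would rarely be this straight: a weak correlation or a poor held-out error would usually send you back to an earlier stage, or to the original question.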

There are many tools out there—this is just one powerful stack.

