Happy Thursday, everyone! This is Melissa Strong and today I have for you the last installment of my Open Source Series (you can find the rest here: one, two, three, four, five).

Git: Collaboration and Version Control for Visualization Projects

If you have ever ended up with files named final_v3_revised_REAL_final.csv, you already know the problem Git is trying to solve, and you also know the particular despair of opening that folder and genuinely not knowing which one is current.

Visualization projects rarely move in a straight line. Data gets updated, annotations change, charts are revised, and filenames start multiplying in increasingly unhelpful ways. Git brings order to that chaos by recording how a project changes over time, so you can understand not just what a visualization is, but how it became what it is.

Why does this matter? Visualization work isn't only about the final chart. It usually includes datasets, cleaning scripts, notebooks, design experiments, written interpretation, and documentation. Git can track all of those text-based pieces together, creating a shared history of decisions rather than a pile of disconnected files. Instead of trying to reconstruct when a change was introduced or which version is "the real one," you can follow the project's development step by step.

This is why Git belongs in a conversation about visualization, even though it's usually framed as a tool for software development. In practice, it's just as useful for anyone building data stories. A revision to an axis label, a change in data filtering, or a rewritten paragraph of interpretation can shape the meaning of a visual just as much as a code change can. Git treats those revisions as part of the work, not as invisible background activity that disappears once the project ships.

It also makes collaboration more manageable. Visualization projects often involve multiple kinds of work happening at once: one person cleaning data, another refining a chart, another revising text or reviewing the framing. Git lets those efforts develop in parallel and come back together in a controlled way. That's especially valuable when the work is iterative, which visualization almost always is. There's a lot of experimenting, comparing, and revising before anything feels ready to publish.
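To make the idea of parallel work concrete, here is a minimal Python sketch that drives the `git` command line in a throwaway folder. It assumes Git is installed and on your PATH. The file names, the branch names (`clean-data`, `revise-text`), and the author identity are all invented for illustration, so treat it as one possible workflow rather than the recommended one.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its trimmed output."""
    cmd = ["git", "-c", "user.name=Example Author",   # placeholder identity
           "-c", "user.email=author@example.com", *args]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def commit_file(repo: Path, name: str, text: str, message: str) -> None:
    """Write a file, stage it, and record a commit."""
    (repo / name).write_text(text, encoding="utf-8")
    git(repo, "add", name)
    git(repo, "commit", "--quiet", "-m", message)


def parallel_work_demo() -> dict:
    """Two people work on separate branches, then both are merged back."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "--quiet")
        commit_file(repo, "notes.md", "Draft interpretation.\n",
                    "Start project notes")
        # The default branch name varies by Git setup, so ask Git for it.
        base = git(repo, "rev-parse", "--abbrev-ref", "HEAD")

        # Person one: adds a (hypothetical) data cleaning script.
        git(repo, "checkout", "--quiet", "-b", "clean-data")
        commit_file(repo, "clean.py", "# drop rows with missing values\n",
                    "Add cleaning script")

        # Person two: starts from the same base and revises the text.
        git(repo, "checkout", "--quiet", base)
        git(repo, "checkout", "--quiet", "-b", "revise-text")
        commit_file(repo, "notes.md", "Revised interpretation.\n",
                    "Revise notes")

        # Bring both lines of work back together, one merge at a time.
        git(repo, "checkout", "--quiet", base)
        git(repo, "merge", "--quiet", "--no-ff",
            "-m", "Merge cleaning work", "clean-data")
        git(repo, "merge", "--quiet", "--no-ff",
            "-m", "Merge text revisions", "revise-text")

        return {
            "files": sorted(p.name for p in repo.iterdir()
                            if p.name != ".git"),
            "notes": (repo / "notes.md").read_text(encoding="utf-8"),
            "merge_commits": int(git(repo, "rev-list", "--merges",
                                     "--count", "HEAD")),
        }
```

Because the two branches touch different files, the merges go through without conflicts. When two people edit the same lines, Git stops and asks a human to decide, which is the "controlled" part of the story.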

Source: EDrawMax

There's also something fitting about Git from a visualization perspective. Its structure is often represented as a branching tree, with commits connected through time as a network of decisions and relationships. In that sense, Git isn't just a tool for managing projects. It's a form of data representation in its own right: a visual map of how work evolves. Branches split, merge, and diverge, and the history itself becomes a kind of diagram showing process, collaboration, and change over time.
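The "network of decisions" idea can be sketched as a small graph in Python. This is a toy model, not Git's actual storage format: real commits are identified by hashes and carry an author, a timestamp, and a snapshot of the files. The `Commit` class, the single-letter ids, and the commit messages below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    id: str
    message: str
    parents: tuple = ()  # ids of the commits this one builds on


def reachable(history: dict, start: str) -> set:
    """Collect every commit id you can reach by following parent links."""
    seen = set()
    stack = [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        stack.extend(history[commit_id].parents)
    return seen


def edges(history: dict) -> list:
    """List (child, parent) pairs, ready to hand to a graph-drawing tool."""
    return sorted((c.id, p) for c in history.values() for p in c.parents)


# A history that splits after "a" and comes back together in "d".
TOY_HISTORY = {c.id: c for c in [
    Commit("a", "Initial chart"),
    Commit("b", "Relabel y-axis", ("a",)),
    Commit("c", "Filter outliers", ("a",)),
    Commit("d", "Merge filtering into main line", ("b", "c")),
]}
```

Here `reachable(TOY_HISTORY, "d")` contains all four commits, while `reachable(TOY_HISTORY, "b")` contains only `a` and `b`. A commit with two parents is what a merge looks like in this picture.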

Platforms like GitHub and GitLab build on this by making the collaborative process easier to share and review. They create space for discussion around changes before those changes become permanent. In a visualization context, that review process isn't only technical. It can include questions about clarity, interpretation, accessibility, and bias. A pull request might be about whether a dataset was transformed correctly, but it can also be about whether a visual choice overstates a pattern or whether an annotation is guiding readers fairly. Making that discussion visible supports more thoughtful, more accountable work.

Git is also valuable because it preserves the evolution of a project. Finished visualizations tend to hide the drafts, false starts, and alternate directions that shaped them. With Git, that history doesn't disappear. Earlier versions stay accessible, which is useful for learning, review, and reproducibility. If someone wants to understand how a project has changed, or return to an earlier state, the record is there.
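Here is a small, hedged sketch of what "the history doesn't disappear" looks like in practice. It uses Python to call the `git` command line in a temporary folder, assuming Git is installed. The file `caption.txt`, its wording, and the author identity are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its trimmed output."""
    cmd = ["git", "-c", "user.name=Example Author",   # placeholder identity
           "-c", "user.email=author@example.com", *args]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def history_demo() -> dict:
    """Commit two versions of a caption, then read back the older one."""
    drafts = [
        ("Sales rose in 2023.\n", "Add first caption"),
        ("Sales rose sharply in 2023.\n", "Strengthen caption wording"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "--quiet")
        caption = repo / "caption.txt"
        for text, message in drafts:
            caption.write_text(text, encoding="utf-8")
            git(repo, "add", "caption.txt")
            git(repo, "commit", "--quiet", "-m", message)
        return {
            # Commit subjects, newest first.
            "log": git(repo, "log", "--format=%s").splitlines(),
            # The file as it stood one commit before the latest.
            "earlier": git(repo, "show", "HEAD~1:caption.txt"),
            "current": caption.read_text(encoding="utf-8").strip(),
        }
```

The `git show HEAD~1:caption.txt` call reads the older wording without touching the working copy, so you can compare drafts side by side before deciding whether to go back.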

Publishing a project openly on GitHub or GitLab lets others see the code, the data pipeline, the documentation, and the reasoning behind the final result. Instead of presenting a chart as a polished endpoint, you can reveal the process behind it. That openness invites scrutiny, but it also builds trust, because people can inspect the choices that shaped the outcome, reuse the work in their own projects, or contribute improvements.

Git isn't perfect for every asset. Large binary files (videos, design exports, that kind of thing) can be awkward in a standard Git workflow, and the tool can feel intimidating if you haven't used version control before. But for the core materials that define most visualization projects (code, text, structured data, and documentation), it's a strong fit.
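One way to spot the awkward assets before they land in a repository is to scan the project folder for large files. The sketch below is plain Python with no Git calls. The function name `find_large_files` and the 5 MB default threshold are my own assumptions, not a Git rule, so adjust them to taste.

```python
from pathlib import Path


def find_large_files(root: Path, limit_bytes: int = 5_000_000) -> list:
    """Return (relative path, size in bytes) for files above the limit,
    largest first, skipping Git's own .git folder."""
    hits = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if ".git" in rel.parts or not path.is_file():
            continue
        size = path.stat().st_size
        if size > limit_bytes:
            hits.append((rel.as_posix(), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files(Path("."))` from a project folder gives a shortlist to think about. What to do with those files is a judgment call: some teams list them in `.gitignore` and store them elsewhere, and others use an extension such as Git LFS.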

In the context of this series, Git serves as the connective tissue. Penpot addressed open design, Observable Plot focused on chart construction, Bootstrap supported presentation across devices, and Docker reinforced reproducibility. Git ties those pieces together by tracking how they change, making collaboration workable, and turning process into something visible rather than hidden. For visualization work, that's not a side benefit. It's part of what makes the work more open, more reliable, and more useful to others.

I hope you've enjoyed this series as much as I have. Putting it together pushed me to look more carefully at the tools I already use and understand how they fit together.

Until next time.
