Code Collaboration: Persisted State Sharing

May 22, 2022

Opening a new writing segment called “Something I should build one day” that allows me to expound on (with a bit of ambiguity) a product/project/thing that could be brought into existence. With the ethos being something that I should do (probably won’t), it allows me to shout into the void and flex some creative thought patterns in product development. Think of it as a loose technical or product memo and not a pitch deck. (Look away, there may actually be no market here.) There’s also the potential to write a few more of these or not. Anything’s a dice throw at this point.

With much of my time spent writing and sharing code artifacts in various languages and forms, I’ve felt devoid of the ability to test, share and execute “snippets” (awful term) with others and plainly myself at a future point in time. Many development tools converge in this area, but are frankly focusing on different outcomes. Thinking about the notion for several months now, I might ultimately have some modicum of material to explain.

Does it save the state of the shell?
— Pat Migliaccio (@pat_migliaccio) November 23, 2021

Looking Back

Historically, collaborative software development has progressed through different mediums characterized in the following non-exhaustive list.

Plugboard wiring
- Introduction of manual reprogramming
- e.g. ENIAC
Punched cards and tapes
- Brought about portability, with static reproducibility
- e.g. IBM100
Files
- Promoted sharing and iterative development
Version Control Systems (VCS)
- Offered change management and distributed collaboration
- e.g. svn, git
Online or cloud development environments
- Real-time, multiplayer collaboration with a low barrier to entry and minimal compute resources required

Each is a natural progression beyond the prior model, innovating on the basic notion of an engineer defining a set of logic that can then be read, manipulated and executed by the next.

A Background Scenario

Let’s start with a problem to align on the current state of production software development. We have X, a full-stack engineer, probably at an early to mid-stage organization, who works across a variety of team functions. Over the course of a general working day, they may write some code, review some code, collaborate on a feature or fix a bug. Most of their toolset lives locally in their integrated development environment (IDE) or collaboratively in a relatively static venue like their shared code repository (e.g. GitHub). They augment their workflow with a messaging application (e.g. Slack), project management tools and some note-taking or documentation software.

Suppose that X would like the opinion of Y (a second engineer) on a bug they are working to resolve. For effect, Y has never worked in this part of the system before but is knowledgeable in the coding paradigm and may provide valuable insight to X. The process of sharing can happen via a few paths, each with their own deficiencies.

X copies an isolated segment of code and messages it to Y
- The code lacks context within the file and less importantly syntax highlighting
X shares the full code file to Y
- The area in question isn’t isolated and lacks context in the broader application
X shares a reference to the line via permalink within their shared repository
- The working example must be pushed to the repository
X shares their screen with Y using a video sharing or live sharing application in a pair programming model
- The collaboration is synchronous and requires Y to present an answer on the spot

All of these are missing two key characteristics: they neither allow the second engineer to execute the code in its current context on their own machine, nor provide a seamless sharing experience between the two engineers.

Imagine a model that was networked, with dynamic reproducibility. Say persistence at runtime, shareable via hyperlink for the recipient to engage at the current state of execution. For analogous intent, a video link (e.g. YouTube) at the 23:22 minute mark but for code. Indefinite persistence.

Think of it as similar to orthogonal persistence in a virtual machine hibernation model, but backed by a state language for reproducibility.¹

Suppose the integrated development environment is streamed to the local machine with a single source of state persisted in the cloud. A fixture in time within the debugging execution could theoretically be shared, allowing the next individual to capture that specific moment and continue processing.

Retaining the state indefinitely poses challenges in both execution time and memory allocation, but through a model of on-demand persistence and runtime checkpointing, it is certainly feasible.² The state could move from machine to machine, persisting at runtime without the code having to be stopped or redeployed.³
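To make the checkpointing idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Checkpoint` shape, the `running_total` function and the `cart.py` file name are hypothetical, and a real runtime would have to capture far more than a couple of local variables. It only shows the principle of pausing at a point, serializing the state, and letting someone else resume from it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Checkpoint:
    """Hypothetical snapshot of one paused moment of execution."""
    file: str
    line: int
    variables: dict


def save_checkpoint(cp: Checkpoint) -> str:
    # Serialize to JSON so the snapshot can travel between machines.
    return json.dumps(asdict(cp), sort_keys=True)


def load_checkpoint(payload: str) -> Checkpoint:
    return Checkpoint(**json.loads(payload))


def running_total(items, start_index=0, total=0, pause_at=None):
    """Sum items; if pause_at is set, return a Checkpoint at that index instead."""
    for i in range(start_index, len(items)):
        if pause_at is not None and i == pause_at:
            # File name and line number are placeholders for illustration.
            return Checkpoint(file="cart.py", line=42,
                              variables={"i": i, "total": total})
        total += items[i]
    return total


# X pauses partway through and shares the serialized state.
items = [5, 10, 20, 40]
checkpoint = running_total(items, pause_at=2)
payload = save_checkpoint(checkpoint)

# Y restores the state and continues from the same moment.
restored = load_checkpoint(payload)
result = running_total(items,
                       start_index=restored.variables["i"],
                       total=restored.variables["total"])
```

The resumed run picks up at the third item with the partial total already in hand, so Y ends up with the same answer as an uninterrupted run without replaying the earlier steps.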

Other conceivable enhancements also become available, like “time-travel debugging”, where an engineer can step backwards through time to understand the prior execution path in the stack.⁴
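A rough sketch of the stepping-backwards part, again in Python and again hypothetical: the `TimeTravelRecorder` name and its methods are mine, not any debugger’s API. It assumes the runtime can hand over a dictionary of state after each step, which is the hard part this sketch conveniently skips.

```python
from dataclasses import dataclass, field


@dataclass
class TimeTravelRecorder:
    """Keeps every recorded state so a debugger could step backwards."""
    history: list = field(default_factory=list)
    cursor: int = -1

    def record(self, state: dict) -> None:
        # Recording after stepping back discards the abandoned future states.
        del self.history[self.cursor + 1:]
        # Shallow copy so later reassignment of keys cannot alter history.
        self.history.append(dict(state))
        self.cursor = len(self.history) - 1

    def step_back(self) -> dict:
        if self.cursor <= 0:
            raise IndexError("already at the earliest recorded state")
        self.cursor -= 1
        return dict(self.history[self.cursor])

    def step_forward(self) -> dict:
        if self.cursor >= len(self.history) - 1:
            raise IndexError("already at the latest recorded state")
        self.cursor += 1
        return dict(self.history[self.cursor])


# Record the state after each loop iteration, then step back once.
recorder = TimeTravelRecorder()
total = 0
for price in [5, 10, 20]:
    total += price
    recorder.record({"price": price, "total": total})

previous = recorder.step_back()
```

Here `previous` holds the state as it was after the second iteration, which is the kind of “what did this look like one step ago” question the feature would answer.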

Productization

Adoption of cloud-based development environments is gaining momentum, but for large scale projects there is still quite the dependence on the localized model. Presented in a minimum viable product form, an IDE extension that tracks file manipulations and synchronizes changes up to the cloud may be a start to bridging the gap. But that still leaves the problem of syncing the more cumbersome runtime execution data.
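The file-tracking half of that extension is the easy bit, and could be sketched as content fingerprinting. This is only an assumption of how it might work: the function names are hypothetical, and files are passed in as an in-memory mapping of path to text rather than read from disk.

```python
import hashlib


def fingerprint(files: dict) -> dict:
    """Map each path to a SHA-256 digest of its contents."""
    return {
        path: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for path, text in files.items()
    }


def diff_fingerprints(before: dict, after: dict) -> dict:
    """Work out which paths would need to be synchronized up to the cloud."""
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(
            path for path in before.keys() & after.keys()
            if before[path] != after[path]
        ),
    }


# Two hypothetical workspace states, before and after an edit session.
before = fingerprint({"app.py": "print('a')", "util.py": "x = 1"})
after = fingerprint({"app.py": "print('b')", "notes.md": "todo"})
changes = diff_fingerprints(before, after)
```

Only the paths listed in `changes` would need to go over the wire, which keeps the sync cheap compared to the runtime data.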

A streaming application, where all interaction happens in the cloud, would offer a more seamless experience. On execution, the runtime is persisted in a state file; when a link is generated, that file is preserved and the link contains a reference to the fixed point in time. If no link is shared, the persisted state would simply be ephemeral and removed when the execution is terminated, conserving system memory.
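The ephemeral-unless-shared rule could look something like the following Python sketch. The `SnapshotStore` class, its method names and the `example.invalid` URL are all made up for illustration, and an in-memory dictionary stands in for whatever the real state file storage would be.

```python
import uuid


class SnapshotStore:
    """Hypothetical store: snapshots are ephemeral until a link is generated."""

    def __init__(self, base_url="https://example.invalid/s/"):
        self.base_url = base_url
        self._snapshots = {}   # snapshot id -> captured state
        self._shared = set()   # ids that have had a link generated

    def capture(self, state: dict) -> str:
        snapshot_id = uuid.uuid4().hex
        self._snapshots[snapshot_id] = dict(state)
        return snapshot_id

    def share(self, snapshot_id: str) -> str:
        if snapshot_id not in self._snapshots:
            raise KeyError(snapshot_id)
        self._shared.add(snapshot_id)
        return self.base_url + snapshot_id

    def open_link(self, link: str) -> dict:
        # The id is the final path segment of the link.
        snapshot_id = link.rsplit("/", 1)[-1]
        return dict(self._snapshots[snapshot_id])

    def terminate(self) -> None:
        # When execution ends, drop every snapshot that was never shared.
        self._snapshots = {
            sid: state for sid, state in self._snapshots.items()
            if sid in self._shared
        }


store = SnapshotStore()
kept = store.capture({"line": 42, "total": 15})
dropped = store.capture({"line": 43, "total": 35})
link = store.share(kept)
store.terminate()
reopened = store.open_link(link)
```

After `terminate()`, only the snapshot behind the generated link survives, so memory is spent solely on moments someone actually chose to share.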

The developer experience might look like a right-click on a line of code that was just modified in the local environment. When selecting to share, a link is generated and copied to the clipboard. The developer would paste the link into their messaging application. A screenshot of the breakpoint with the segment of code is displayed, showing some values for the observed variables inline. The recipient developer would click the link and be brought into their own local environment at the exact point in time of the runtime execution.

Engineering this process might involve a compute layer that requires the application to be containerized with some available test data, like that of an integration test, so that executions are identically reproducible. Alternatively, the environment can be backed by a database snapshot for further process execution.

Objections

Being quite inspired by Dorsey and Square’s 140 reasons why it will fail, I have given myself four.

Where is the market differentiation?
- Present a solution that allows developers to engage in this model without changing their current development workflow.
What is to stop one of these companies from just complementing their existing offering?
- Nothing. But as mentioned earlier, the desired outcome is different. Most tooling is at the stage of the hobbyist and entry-level engineer.
Is it not the assumption that online development environments become the de facto standard?
- Maybe. There is always room to adopt the full-SaaS model, but client-server streaming presents the opportunity to take existing applications and start developing for production now.
Doesn’t persisting the state of innumerable application links pose a significant technical hurdle?
- It is well within the limits of physics.

The minute this is shared, someone may produce another “what about” that I welcome with full exposure.

Go-to-Market

Candidly, at this juncture, I care very little about the nature of bringing an idea such as this to market. But if I were to back it by some language, I would suggest that a combination of a bottom-up B2B SaaS approach, mixed with a freemium model for developers, followed by some organic network effects of sharing code in their messaging ecosystem could be a decent first round strategy.

As always, open and welcome to comments.

References

¹ Atkinson, M.P., Bailey, P.J., Chisholm, K.J., Cockshott, W.P. & Morrison, R. “PS-algol: A Language for Persistent Programming”. In Proc. 10th Australian National Computer Conference, Melbourne, Australia (1983) pp 70-79.

² Liu, Z. (1996). A persistent runtime system using persistent data structures. Proceedings of the 1996 ACM Symposium on Applied Computing - SAC ‘96, 429–439. https://doi.org/10.1145/331119.331420

³ Nicoara, A., & Alonso, G. (2007). Making applications persistent at run-time. 2007 IEEE 23rd International Conference on Data Engineering. https://doi.org/10.1109/icde.2007.369013

⁴ Garreau, M., & Faurot, W. (2018). Chapter 3. Debugging Redux applications - Redux in Action. Redux in Action. Retrieved May 22, 2022, from https://livebook.manning.com/book/redux-in-action/chapter-3/12


Photo by Jan Canty on Unsplash