Console #76 -- Symbolica, Mastodon, and Uptime Kuma
Interview With Matt and Iain of Symbolica
Mastodon is a free, open-source social network server based on ActivityPub where users can follow friends and discover new ones.
language: Ruby, stars: 25136, watchers: 639, forks: 4168, issues: 1975
last commit: October 21, 2021, first commit: February 20, 2016
Uptime Kuma is a self-hosted monitoring tool like “Uptime Robot”.
last commit: October 22, 2021, first commit: June 24, 2021
Symbolica is symbolic execution as a service. Find bugs faster by exploring every reachable state of your program.
language: C#, stars: 30, watchers: 2, forks: 2, issues: 1
last commit: October 13, 2021, first commit: May 19, 2021
Hey guys! Thanks for joining us! Let’s start with your backgrounds. Where have you worked in the past, where are you from, how did you learn how to program, what languages or frameworks do you like?
Matt - I learnt to program on my first job after graduating with a physics degree. I joined a computational fluid dynamics team that had a huge C++ codebase. It was an initiation by fire because I’d only taken a single course in C and some computational labs in languages like Matlab as part of my degree. I was really frustrated and didn’t enjoy programming to begin with, I just wanted to do physics (whatever that meant!). It was after I moved to a non-software team that I realised I really missed programming. I missed the feeling of building something out of nothing by incrementally solving little puzzles and at the same time I felt an urge to master this new skill.
Iain - I first discovered programming at quite a young age, in computing class at school. I think this has now been replaced with something more like Computer Science, but at the time it was about general use of a computer, and only a fraction was spent on simple programming in Visual Basic. However, I found it engaging enough to try more interesting things in my own time, such as scripting some simple games. I then learned some C and C++ out of general interest and used them for small projects. My degree was in Information Engineering, but there was a lot of flexibility on module choices and I took a few programming and Software Engineering modules to try to formalize things a bit.
My first job as a Software Engineer was in defence, using C and C++. I then moved to finance and switched to Python and C#.
Who or what are your biggest influences as developers?
Domain-Driven Design by Eric Evans has had a huge impact on us and how we think about composing large systems. We’ve found it interesting how often it turns out that by finding the correct terminology to describe the domain, you end up with the cleanest and most optimal solution. One really striking example of this is that if you model your aggregates correctly then they should line up with the transactional boundaries in your system, which means you can avoid the need for things like distributed transactions and two-phase commits. It also means your persistence layer is simpler because you can usually then get away with just writing the data to a document store (like Mongo), which avoids a lot of the overhead around SQL object mapping and the need for an ORM.
What’s your most controversial programming opinion?
You probably don’t need dependency injection (DI). We used to be big fans, but over time we’ve realised that DI is just one way to achieve inversion of control (IoC). These two things are often conflated. We definitely still want loose coupling with IoC, but find DI is often too “magic” and can make it very hard to understand the application at the top level. As we’ve written more F# we’ve found that there’s generally less ceremony: a dependency can be as simple as a single function, so it’s easy enough to wire up the application by hand.
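To make the idea concrete, here is a minimal sketch of inversion of control without a DI container, written in Python rather than F#. All names (`make_welcome`, `lookup_email`, `send_email`, the example address) are hypothetical and chosen for illustration only; this is not code from Symbolica.

```python
from typing import Callable

# Each dependency is just a function type (hypothetical names for illustration).
LookupEmail = Callable[[int], str]
SendEmail = Callable[[str, str], None]


def make_welcome(lookup_email: LookupEmail, send_email: SendEmail) -> Callable[[int], None]:
    """Build the use case from the functions it depends on.

    The use case never picks its own collaborators, so control is inverted,
    but no container or reflection is involved.
    """
    def welcome(user_id: int) -> None:
        send_email(lookup_email(user_id), "Welcome aboard!")
    return welcome


def main() -> list[tuple[str, str]]:
    # Composition root: the whole application is wired by hand in one visible place.
    users = {1: "user1@example.com"}          # stand-in for a real user store
    outbox: list[tuple[str, str]] = []        # stand-in for a real mail gateway
    welcome = make_welcome(
        lookup_email=users.__getitem__,
        send_email=lambda to, body: outbox.append((to, body)),
    )
    welcome(1)
    return outbox
```

Swapping in a fake for a test is then just a matter of passing a different function, which is roughly the loose coupling described above.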
What is your favorite software tool?
It’s got to be git. It’s saved our bacon so many times when we thought we’d lost all our work; git reflog is a lifesaver. It’s also a constant source of new tricks because it has so many powerful features. For example, recently we had a difficult-to-debug issue with one of our dependencies, and we were able to use git bisect to automatically find the exact commit that introduced the regression by getting it to run a test on each commit it searched through.
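The search that git bisect automates is essentially a binary search over history. The sketch below illustrates that idea in Python under simplifying assumptions: a linear list of commits, a known-good first commit, a known-bad last commit, and a regression that stays broken once introduced. The commit names and the `failing_test` function are made up; real git works on a commit graph and is driven with `git bisect run <test command>`.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history in which the regression landed in "c5".
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
tested: list[str] = []


def failing_test(commit: str) -> bool:
    """Stand-in for checking out a commit and running the test suite."""
    tested.append(commit)
    return history.index(commit) >= 5


culprit = first_bad_commit(history, failing_test)
```

With eight commits only three of them need to be tested, which is why the approach scales well to long histories.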
If you could dictate that everyone in the world should read one book, what would it be?
Matt - Finite and Infinite Games comes to mind because I think too often people get trapped playing finite games and living in an artificial zero sum world, which is really limiting.
Iain - The Bonfire of the Vanities, because its social satire and understanding of human nature might somehow be even more relevant today than when it was written in the 80s.
If you had to suggest 1 person developers should follow, who would it be?
It’s hard to pick just one person. We have a few people that have influenced us as developers, but we find that nobody gets everything right.
If you could teach every 12 year old in the world one thing, what would it be and why?
Statistics. The world is becoming increasingly data driven, and the ability to make sense of this in order to properly understand risks and to avoid disinformation campaigns is, and will continue to be, a critical skill. We’re also entering a phase where people think they’re making data-driven decisions, but have misinterpreted the data. Conversely, they also believe that without enough data they can’t make a decision. We need to teach kids critical thinking skills and stats so they can sift through this data and make good decisions with or without it.
If I gave you $10 million to invest in one thing right now, where would you put it?
Right now we would have to say Symbolica. We’ve still got so many ideas in front of us to add to the project, and being able to hire some extra developers would be amazing. For example, we want to make testing more powerful and end-to-end by working directly with syscalls and assembly. We also want to incorporate some threading support to catch race conditions and deadlocks. These are big jobs and will take time and serious effort.
If I gave you $100 million to invest in one thing right now, where would you put it?
Quantum computing. Once this technology matures it will bring about a huge advancement in the quality of human life because it will allow us to simulate large complex molecules which will give us exponential improvements in technologies such as batteries, carbon capture and drug synthesis as well as unlocking new technologies that were previously out of reach.
What are you currently learning?
Matt - Over the last few years I’ve been making this broad shift from OOP to functional programming, so I’m definitely still learning lots about FP. More specifically I’ve been working through SICP lately, so that’s taught me some Lisp and I’ve been trying to learn Haskell when I can.
Iain - I have been learning about LLVM as part of our current work. I hadn’t previously looked into the internals of compilers in much detail, but it is quite fascinating to see some of the amazing things people have done.
Matt, would you say that learning Lisp has been worth the effort?
I think it's a language that's had quite a large impact on other languages, so it's nice to have some familiarity with it and the influence it's had. At a high level, I think it has a similar effect on your brain to learning any functional language: it makes you think differently compared to imperative languages.
What have you been listening to lately?
Matt - I’m a big fan of BBC Radio 6, so I often have that on. I recently rediscovered a band called Delorean, so I’ve had them on quite a bit. There’s also a song called London Gangs by Sault which has been a real earworm for me lately, and according to my Spotify I seem to have had a Placebo revival too. Oh, and with it being the 30th anniversary of Nirvana’s Nevermind recently, I’ve had that on quite a bit. In terms of podcasts I really like CoRecursive, and Software Engineering Radio and Exponential View are regulars for me too.
Iain - I jump around a lot, but most recently Doves and White Lies.
How do you separate good project ideas from bad ones?
For personal projects just follow your interests and try not to worry too much about whether it’s good or bad, just whether it’s interesting, fun or helps you learn something new.
If it’s a work project then we think the best technique is to really understand the problem you’re solving before you start building anything. Getting to know the user/customer, the pain they’re currently facing and how much value this solution could generate for them is really important. As software developers we have a tendency to immediately start building because it’s exciting, but often we can miss simpler solutions which might not even need us to build anything bespoke at all. We always try to look for an off-the-shelf solution first. If there’s nothing already available or the current solutions don’t work then we try to understand why. At this point you’re usually in a much better position to design a solution and can be more sure it’s going to be worth your time working on it.
Why was Symbolica started?
We wanted to write a tool that would let us prove that programs, before and after a refactoring, or written in different languages, were effectively equivalent. For instance, if someone wanted to port a program from a clean reference implementation to an efficient but complicated implementation, they could check it using this tool. We tried using existing proof tools for a while, such as KLEE, but we found that its abstractions weren’t amenable to the extensions we wanted to make. A concrete example of this was that when we were trying to check that a small program written in Python was equivalent to one written in C, we needed to symbolically execute the Python interpreter and we weren’t able to get this to work with KLEE. So that put us on the path of writing our own symbolic execution engine from scratch with some different design goals in mind.
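As a rough illustration of the "clean reference versus efficient but complicated implementation" check, the sketch below compares two 8-bit bit-counting functions. It deliberately swaps in brute-force enumeration of all 256 inputs in place of symbolic execution, so it only works because the input space is tiny; it is not how Symbolica or KLEE operate, and every function name here is hypothetical.

```python
from typing import Callable, Iterable, Optional


def popcount_reference(x: int) -> int:
    """Clean reference: count set bits one at a time."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def popcount_fast(x: int) -> int:
    """Efficient but harder-to-read version for 8-bit inputs (parallel bit sums)."""
    x = x - ((x >> 1) & 0x55)
    x = (x & 0x33) + ((x >> 2) & 0x33)
    return (x + (x >> 4)) & 0x0F


def popcount_buggy(x: int) -> int:
    """A deliberately broken port that forgets the top bit."""
    return popcount_fast(x & 0x7F)


def find_counterexample(
    f: Callable[[int], int], g: Callable[[int], int], inputs: Iterable[int]
) -> Optional[int]:
    """Return the first input on which f and g disagree, or None if they agree on all."""
    for x in inputs:
        if f(x) != g(x):
            return x
    return None


all_bytes = range(256)
agree = find_counterexample(popcount_reference, popcount_fast, all_bytes)
disagree = find_counterexample(popcount_reference, popcount_buggy, all_bytes)
```

Here `agree` is `None`, meaning the two implementations match on every 8-bit input, while `disagree` is the first input that exposes the broken port. For 64-bit inputs enumeration is hopeless, which is where reasoning about symbolic values rather than concrete ones becomes attractive.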
Where did the name for Symbolica come from?
We were originally going to call it Symbolic, but the name was already associated with another company, so we settled on Symbolica as we liked how it sounded.
What was the biggest inspiration for Symbolica?
Some of the core technology is most similar to an existing symbolic executor called KLEE, and we have taken a lot of great ideas from this and its forks.
Are there any overarching goals of Symbolica that drive design or implementation?
We broadly have two design goals.
One is for it to be easily parallelizable. During symbolic execution, each time a fork in the code is found we can create smaller problems to represent each path, so we want to be able to run these subproblems in parallel, as well as apply some other optimizations to avoid going down paths we’ve explored before.
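The forking described here can be sketched with a toy explorer. In the Python below, a "program" is a small tree of branches over a single 8-bit input, each fork produces two independent subproblems carrying their own path constraints, and infeasible paths are pruned. The brute-force `witness` function is a stand-in for a real constraint solver, and the whole design (names, data types, worklist) is an assumption made for illustration rather than a description of Symbolica's internals.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union

Pred = Callable[[int], bool]


@dataclass(frozen=True)
class Leaf:
    name: str


@dataclass(frozen=True)
class Branch:
    cond: Pred
    then: "Node"
    orelse: "Node"


Node = Union[Leaf, Branch]


def witness(constraints: tuple[Pred, ...]) -> Optional[int]:
    """Stand-in for a solver: find an 8-bit input satisfying every constraint."""
    for x in range(256):
        if all(c(x) for c in constraints):
            return x
    return None


def negate(p: Pred) -> Pred:
    return lambda x: not p(x)


def explore(program: Node) -> dict[str, int]:
    """Map each reachable leaf to one concrete input that reaches it."""
    reached: dict[str, int] = {}
    worklist: list[tuple[Node, tuple[Pred, ...]]] = [(program, ())]
    while worklist:
        node, constraints = worklist.pop()
        example = witness(constraints)
        if example is None:
            continue  # infeasible path: prune it
        if isinstance(node, Leaf):
            reached[node.name] = example
        else:
            # Fork: the two states share nothing mutable, so each could be
            # handed to a different worker.
            worklist.append((node.then, constraints + (node.cond,)))
            worklist.append((node.orelse, constraints + (negate(node.cond),)))
    return reached


program = Branch(
    lambda x: x > 100,
    Branch(lambda x: x % 2 == 0, Leaf("big_even"), Leaf("big_odd")),
    Branch(lambda x: x > 200, Leaf("unreachable"), Leaf("small")),
)
paths = explore(program)
```

The leaf named `unreachable` never appears in `paths`, because no input is both at most 100 and greater than 200. This sketch runs the worklist sequentially; distributing the items across workers is the parallelism the answer refers to.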
The second is to really focus on the usability from the developer’s point of view. We want it to be really easy for developers to run this against their code so that we can reduce the friction people face when trying to test their software and get people writing more tests and shipping fewer bugs.
What trade-offs have been made in Symbolica as a consequence of these goals?
We have a lot of immutability throughout, to make parallelization as trivial as possible. Of course, this is generally good for simplicity and robustness too, but has led to some interesting design questions and efficiency problems. For instance, implementing these immutable data structures efficiently takes some care because we have to ensure we’re not copying unnecessary data whenever there is a modification. We also need to be able to distribute these data structures over multiple machines when parallelizing across a compute cluster. So we have to be smart about how much data we’re copying and when we do that. In many cases we can get away with lazily fetching the data on demand.
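One common way to avoid copying on every modification is structural sharing. The Python sketch below uses an immutable linked stack, where pushing creates one new node and reuses everything beneath it. It is only meant to illustrate the general technique; the `Frame` type and helper names are hypothetical, and the data structures Symbolica actually uses are not described in the interview.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """One immutable stack cell; `below` is shared, never copied."""
    value: int
    below: Optional["Frame"]


def push(stack: Optional[Frame], value: int) -> Frame:
    # O(1): allocate a single new cell that points at the existing stack.
    return Frame(value, stack)


def to_list(stack: Optional[Frame]) -> list[int]:
    out = []
    while stack is not None:
        out.append(stack.value)
        stack = stack.below
    return out


base = push(push(None, 1), 2)

# Two "forked" states diverge from the same base without copying it.
left = push(base, 10)
right = push(base, 20)
```

Both `left` and `right` point at the very same `base` object, so forking costs one allocation regardless of how large the shared history is, and neither fork can disturb the other.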
What is the most challenging problem that’s been solved in Symbolica, so far?
So far, most of the work has gone into decomposing the overall problem into its fundamental abstractions, which we believe will allow us to support the features we need.
What sort of fundamental abstractions have you found thus far?
The necessary abstractions are probably not that surprising in hindsight, they’re just those that describe how programs work at a very basic level, such as functions, instructions, operands and expressions etc. Really perfecting these and their relationships takes time, but it has paid off for us in that several features have since “fallen out in the wash”. For example, it became self-evident that we could treat all data as fixed-size expressions, whether constant or symbolic, and we could build everything from a small set of operations on these types.
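A minimal sketch of "all data as fixed-size expressions, whether constant or symbolic" might look like the following. The types `Const`, `Sym` and `Add` and the `add` helper are invented for illustration and cover a single operation; they are assumptions about what such a model could look like, not Symbolica's actual types.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    bits: int    # fixed width of the value
    value: int


@dataclass(frozen=True)
class Sym:
    bits: int
    name: str    # an unknown input, e.g. a function argument


@dataclass(frozen=True)
class Add:
    bits: int
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Sym, Add]


def add(a: Expr, b: Expr) -> Expr:
    """Add two expressions of the same width, wrapping on overflow."""
    if a.bits != b.bits:
        raise ValueError("operand widths differ")
    if isinstance(a, Const) and isinstance(b, Const):
        # Both sides known: fold to a constant, modulo 2**bits.
        return Const(a.bits, (a.value + b.value) % (1 << a.bits))
    # At least one side is symbolic: keep the operation as a tree.
    return Add(a.bits, a, b)


folded = add(Const(8, 250), Const(8, 10))
symbolic = add(Sym(8, "x"), Const(8, 1))
```

The same `add` call handles concrete and symbolic operands, so the code that interprets instructions does not need to know which kind of data it is working with.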
What was the most surprising thing you learned while working on Symbolica?
It’s amazing how far a poor but minor design decision can cascade throughout a complex system.
What is your typical approach to debugging issues filed in the Symbolica repo?
As the project is pretty new we’ve not actually had any submitted yet. On other projects we’ve worked on together we’d try and write a failing test and then fix from there.
What is the release process like for Symbolica?
It’s really simple at the moment: we just tag the master branch when we want to create a new release. If we want to patch an old version we’ll just create a release branch from the tag that’s broken and start creating patches on that, which we’ll then propagate forward to newer versions if we need to. As it’s a library we plan to follow semver, but we’re not at v1 yet, so things are moving about a lot and we’re not being strict about avoiding breaking changes at the moment.
How is Symbolica intended to eventually be monetized?
Our intention is to offer a cloud-hosted version of Symbolica that people can pay for on a consumption basis. We believe our solution is amenable to fairly large parallelization, and so our goal is to make it efficient and cost-effective for everyone by running it on some elastic infrastructure in the cloud.
How do you balance your work on open-source with your day job and other responsibilities?
Luckily for us Symbolica is our day job. Before starting Symbolica we used to work together at another company and they were generally happy for us to open source things that we’d built internally, providing they weren’t commercially sensitive. So we’ve both always been fortunate enough to be able to work on open-source projects during work time.
Do you think any of your projects do more harm than good?
I think this is one of the trickiest things with new technology. It can be really hard to predict the higher order effects it will have on society or how others will use it. I can’t think of anything particularly harmful about Symbolica because our aim is to help people find more bugs in their code base and so hopefully we’ll help to prevent some really bad outcomes like security vulnerabilities or loss of life.
What is the best way for a new developer to contribute to Symbolica?
The best way to contribute is to open an issue and let us know if there’s anything you want to add to the project. We’re friendly and we’d love external contributions. We are currently working on the hosted part, but we plan to spend some time on the open-source docs soon, so people know how to get started building and running the project on their machines. You can also email us at firstname.lastname@example.org or message us on Twitter @SymbolicaDev.
Where do you see the project heading next?
We want to get an efficient and scalable version working in the cloud so that we can handle large problems. We also want to develop much better tooling around it so that it’s easy for developers to integrate symbolic tests into their code base and development workflows.
What motivates you to continue contributing to Symbolica?
Well partly because we hope to be able to commercialize it and earn a living this way, but also because the problem is really interesting. We’re both big proponents of software testing, but we also understand that many developers feel like they don’t get a worthwhile return on investment when they write tests. We believe that with better tools, like Symbolica, we can change this.
Are there any other projects besides Symbolica that you’re working on?
Symbolica takes up all of our focus at the moment, but we have carved some internal bits out into other open source libraries. For instance, we’re building most of the cloud hosted version using F# and we recently wrote a small library to make configuration binding more type safe. The de facto .NET config binder was a common source of null values for us, which was frustrating when the rest of F#’s type system is really designed as if null isn’t a valid value, so our library takes a more F# friendly approach and uses a Result type to model the fact that binding might fail.
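The general idea of modelling a binding failure as a value instead of a null can be sketched as follows. This is Python, not F#, and it does not reproduce the API of the library mentioned above; `Ok`, `Error`, `DbConfig` and `bind_db_config` are hypothetical names used only to show the shape of the approach.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass(frozen=True)
class DbConfig:
    host: str
    port: int


@dataclass(frozen=True)
class Ok:
    value: DbConfig


@dataclass(frozen=True)
class Error:
    messages: tuple[str, ...]


def bind_db_config(raw: Mapping[str, str]) -> Union[Ok, Error]:
    """Bind raw string settings to DbConfig, collecting every problem found."""
    errors = []

    host = raw.get("host")
    if not host:
        errors.append("host is missing")

    port = None
    port_text = raw.get("port")
    if port_text is None:
        errors.append("port is missing")
    else:
        try:
            port = int(port_text)
        except ValueError:
            errors.append("port is not an integer")

    if errors:
        return Error(tuple(errors))
    # Reaching here means both fields were bound; no None can leak out.
    return Ok(DbConfig(host, port))


good = bind_db_config({"host": "db", "port": "5432"})
bad = bind_db_config({"port": "abc"})
```

Callers have to inspect whether they received `Ok` or `Error` before touching the configuration, so a missing setting surfaces at startup as an explicit error list rather than as a null somewhere deep in the application.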
Where do you see software development heading next?
We think there are a couple of big themes. The first one is that we’re going to see this increase in low-code tools continue. We agree with Fred Brooks when he says there’s no silver bullet, so we don’t think these are some kind of magic solution, but we think we’ll see domain-specific tools come out that lower the barrier to entry. So, we think we’ll see this rise of people with specific domain expertise who architect and design domain-specific tools. They’ll still have to put a whole load of thought into the information architecture and how different parts of the system interact, but they won’t need to care about lower level details of the machine etc. So it will become much more about domain modeling and we might see people employed primarily for their domain knowledge and then trained on how to use whatever low-code platform it is that they build solutions in. Developers can get quite cagey about this because they think it’s going to erode their value, but we disagree because there’s still going to be a huge demand for people who know the lower level languages, in part to be able to build and support these low code tools / environments, and there'll still be plenty of other things that are better off written in lower level languages.
The second is an increase in popularity of statically typed and functional languages. We personally can’t imagine working on a code base of any real size without having the compiler check our work and we think TypeScript, with its structural typing, has done a great job of showing that you can have static type checking with little ceremony and still retain the flexibility of dynamically typed languages. We think switching to F# has also really opened our eyes to just how much baggage OOP carries with it and how it’s actually not a very natural mental model for most types of programming today. For example, REST has a stateless architectural model, yet we often build REST APIs using OOP, which is all about state. We think OOP still has a place for programs that really need to encapsulate mutable state (e.g. games) in long-lived objects, but a lot of people only use it because it was the predominant paradigm in the first language they learnt.
Where do you see open-source heading next?
Personally we hope that the people who put in the large amount of effort and hours to maintain critical open source projects that are depended on by millions of developers get the recognition and compensation they deserve. Unfortunately, we’ve seen recently with projects like Babel that it’s often difficult to get enough funding to keep even a handful of developers working full time on something that’s a critical piece of tooling to many companies. We’re not sure what the exact solution is, but we expect to see more “open-core” business models. To us, this seems to strike a nice balance between developing in the open and also being able to capture some value from those efforts, where larger enterprises are willing to pay for a hosted version of the tool because it works out cheaper for them than running it themselves. Obviously, though, this doesn’t work in all cases.
Do you have any suggestions for someone trying to make their first contribution to an open-source project?
Always open an issue and speak with the core project maintainers first. It’s really frustrating if you spend your time building a feature or fixing a bug and then the PR gets rejected because it’s not in keeping with the rest of the project. It’s always best to scope the work out with the maintainers up front. You’ll save yourself a lot of time and you’ll end up with a better designed solution too, as well as getting to learn a bit about how the core maintainers think and work.