Opening up Open Source (by alaric)
One of the awesome things about free/open source software (FOSS) is that, as you have access to the source code, you have exactly as much power as the original authors of the software to modify and extend it.
When you are upset about something in closed-source software, or one hosted on the owner's servers on the Internet as a web app or API, all you can do is beg and plead with them, and threaten to take your future business elsewhere; they control the source code, so they have ultimate power. With FOSS software, in theory, the original authors have no special powers.
However, it often doesn't quite work out like that in practice.
Most FOSS software comes onto people's computers as a precompiled binary; they download it and run it. This is convenient and efficient. But if they then want to use their theoretical right to get in there and modify it, they need to do the following things:
- Learn the programming language(s) it's written in, and the libraries/frameworks/APIs it builds on top of. (This is particularly tricky if they have no previous programming experience).
- Find and download the sources (which shouldn't be too hard, but can still be tricky).
- Set up a build environment that can compile the thing. This may involve installing compilers, and development versions of libraries in order to get the header files, and so on.
- Actually make it compile. A lot of the time, a source package as shipped by the authors won't compile perfectly outright, as you need to apply patches written by the people who maintain the package for your platform, as each platform has their own conventions as to where things are placed in the filesystem, and their interaction with less-standardised bits of infrastructure such as service management frameworks, management of network interfaces, low-level hardware access, and so on.
- Learn the workings of the codebase. For a large project, this can be VERY daunting, even to a seasoned programmer.
- Actually design, implement, and debug the change. If the codebase is poorly architected, this can be made unnecessarily difficult, and require refactoring of existing code elsewhere within the codebase.
- Install the software once it's building. As it's been custom-built, it may not interact nicely with your platform's package manager, so you may end up running it out of
/usr/local/bin
or~/bin
or similar, and have trouble making managed packages that depend on the one you're tinkering with correctly linking to it, interaction with system configuration tools, and so on.
I think this is harder than it needs to be. It puts a lower bound on the effort required to make even a trivial change that, in many cases, means it's not worth making it; we're as beholden to the whims of the developers of the package as we would be to a closed-source software company, and the supposed benefits of open source are denied to you.
We can break down the barriers into a number of categories, and look at what can be done to solve each.
Learning the programming environment
Whether you're a seasoned programmer or not, any given piece of software you didn't write will involve a set of languages, tools, libraries, and other things, that you may not be familiar with, and these will need to be learnt.
What might help is better standards for automatically-generated API documentation, so you can find the documentation of API functions as they are used in the software you're learning how to edit by just pressing a button in your editor.
But if it became easier, commonplace, and expected for people to dig around in other people's source code, for curiosity or to make their own changes, developers of infrastructure components would feel more compelled to document their interfaces in ways that casual programmers can quickly pick them up, and to make their interfaces simpler and easier to learn, because the expected audience would be less dominated by seasoned programmers.
And, conversely, if more people were casually getting involved in simple programming tasks in order to improve the software they use (if it became easier to do so), then the general programming ability of the population would also rise, giving more people the grounding in basic conventions and concepts required to understand programming tools...
Migrating from a normal installation of the software to a hackable one
This is perhaps the biggest hurdle, and yet, the most amenable to being overcome with better technology. It covers the whole spectrum from finding and downloading the source code, setting up a build environment, getting it building on your platform, and getting it installed as a first-class citizen in the eyes of your package manager.
I can think of two technical fixes to this problem, and the best bit is, they're both things that already exist out there rather than my usual kinds of crazy new reinvent-the-wheel thinking!
Firstly, package managers like Nix make it easy to establish build environments on your own hardware, as the build environment of any package can be requested, and automatically set up for you. Also, they are built around installing software from sources in the first place, and offer downloading pre-built binaries as an optimisation. It's quite easy to adapt a nix expression that builds a software package from downloaded sources to one that builds from a source tarball you've made yourself, and install it into an isolate "profile" in such a way that it's easily kept isolated from other software you have running and might not want to risk being broken by your experimental changes yet, and to roll the change back if it doesn't work out.
I suspect that many traditional package managers are written with users in mind, and not developers, which sounds laudable; but in practice it forces the distinction between user and developer, not allowing the former to easily migrate into the latter. Nix feels written for developers, of course acknowledging that developers are also users and still want to be able to install off-the-shelf prebuilt binaries easily. The inbuilt package manager for Chicken Scheme is likewise developer-friendly, letting you directly build from arbitrary checked-out source trees into a properly installed package; my development process for most Chicken software is to run chicken-install
in my in-progress sources as the first port of call to compile and run it, rather than the usual idiom in most languages of compiling and running from the source checkout then "installing" the binaries as an optional, later, step. And yet a "mere" end-user of my software can type chicken-install ugarit
, and Chicken will download the latest public Ugarit release from the Internet and install it for them. If they want to join me in hacking Ugarit, they can check out the latest sources from my web site and get stuck in pretty quickly.
Secondly, the move away from compiling software ahead-of-time into distributable binaries, towards on-demand compilation of source code at run time (with caching of compiled forms, of course), means that the normal installation of some software is the source code, there is no need for an external "build environment" to convert your changes into a runnable package, and changes to the source code can be immediately used without going through any kind of build/install phase. Because this makes tinkering with the software so much easier, it can make it become a routine part of using the software, rather than something special done only by special people. The typical Emacs user will have overridden various internal functions of Emacs from within their personal configuration; although many configurations can be done without doing so, customisation through function overriding is so accessible that people routinely customise their Emacs installations in ways that the original developers didn't think (or didn't have time/energy) to add as a configurable option. I wish all open-source projects were written in such an open manner, but it will require a lot of migration away from the batch-compilation model of C, C++, and Java.
Poorly-architected existing code
This is a thorny issue; even if you can easily get into the code of your application and make the changes you want, and you understand all the tools it's built with, it can still be hard to make the changes you want because of one of a number of kinds of inherent "fragility" in the way the software is constructed.
Usually, this boils down to some variant on the idea of some information being repeated all over the code, rather than kept in one place. If your code relies on communicating between its components by using a special file, for instance, and every place where this file is read or written contains its own code to read and write this file directly, then changes such as storing the file in a different place, or adding some extra information to it, or replacing it with access to a database or something, will be difficult. You'll need to find all the places where the file is used, and individually re-write them to reflect your changes. This is laborious, and you might miss some, leading to bugs when those bits of code are run but don't reflect the changes.
However, if the mechanism of accessing this shared state (reading and writing the file) was isolated into one place in the software, with an interface that is used wherever required and reflects only the essentials of the access to the shared state, then that mechanism can easily be changed to another, as long as it still preserves those essentials, relatively painlessly and safely.
Software developers, before they even write a line of code, be thinking about how their software might be changed in future, and make sure that they split it into modules with clean interfaces, to make that easier. As side effects, it also makes their code easier to test and debug, as the interfaces serve to define and clarify the responsibilities and expectations of each module, which makes it easy to write comprehensive tests for the modules.
If this seems like hard work, then you're doing it wrong. We've all heard of code (usually in Java, for some reason) that seems to have taken the Design Patterns book as a checklist of things to do, and features pages and pages of AbstractFactoryWrappers that just indirect everything; the actual code that does the task at hand seems to be scattered thinly amongst all this framework. That's not what I mean by designing your software to be extensible. I just mean splitting it into bits with an explicit interface between each, and makings those interfaces reveal as little as possible about the workings behind them, and putting duplicated code into modules behind interfaces rather than writing the same logic more than once, rather than making it into one big ball of inter-related mud. If you're not saving time by writing software like this in the first place, or if it seems like a burden, then you need to re-think how you write code.
I think programming languages can do a lot to help us write cleaner code, too. I find that when I'm writing C and C++, it's often hard to cleanly pull bits of functionality out into other functions due to the manual memory management which complicates interfaces, and the lack of lexically-scoped first-class functions. Code written in Lispy languages tends to be a lot more easy to refactor as it grows, leading to cleaner interfaces (on average), and the automatic memory management tends to make the interfaces simpler as well.
Also, a culture of open extensibility in software means that extensibility of your code is high in the programmer's mind at all times. Developers of Emacs packages seem to expect bits of their software to be overridden, and write it accordingly.
Conclusion
I think that making programming more accessible has very many good consequences. It gives people more power to get more out of their computers. It gives people more reason to trust computers (and as we move to a more online society, people are forced to place their trust in computers in order to take part; but being forced to place your trust in something you don't trust is a harrowing experience), as they can peer inside the software to see how it works, and fix it if it doesn't. It also means that everyday users of computers have an easy, and natural, transition into learning programming, which is a very rewarding pastime; and more people contributing to open-source projects means we all get to have a better quality of life.
So, open-source software developers, I implore you to consider these points. Try to make your software open and welcoming to newcomers!