A fun way to look at programming from the intentional perspective is to notice the similarities with encryption.

As we know, encryption means applying a difficult-to-invert function to some clear text and a key to produce a coded message. To decode the message without the key – that is, to recover the clear text – the encryption process would have to be inverted, which is why the function is made as difficult to invert as possible. The goal of encryption is to prevent decoding, or at least to drive up its cost.

When we program, we combine two sets of information: the problem details and the implementation details. For example, if we work in banking, the problem details have to do with accounting, contract details, and legal requirements. The implementation details have to do with the rules of the programming language, the operating system, databases, file formats, and so on.

The result of programming is called source code – here is that pesky word “code” again. We all know that programming is “difficult-to-invert”, meaning that when we look at the code it is frequently unclear what the problem was. In this sense, programming is like encryption. With slight exaggeration we could say that one way to keep a company’s business processes secret is to implement them in COBOL and publish the code. So what’s wrong with that?

The problem is that whenever we do anything to the software – test it, optimize it, prove it correct, port it to a new platform, extend its functionality, change it to track changes in the problem – these activities are motivated and defined by the inputs to the programming process, not by its result.

This is very clear when we look at real encryption. If a secret message needs a follow-up – for example, a military commander wants to attack target B instead of target A – this will definitely not be implemented by directly editing the encoded message! Instead, the clear text will be edited and the encryption re-run. Why the seeming complication? In the case of encryption the answer is obvious: it would be very difficult to find the encoded “image” of A in the result and replace it with the encoded “image” of B. Because the military has a computer that does the encryption, and it also has the key, it is much easier to change the original intention in the clear text from A to B and re-encrypt the whole message.

Of course this is just a thought (“Gedanken”) experiment because modern encryption techniques are so good that the image of A would be impossible to find.
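The re-encryption workflow can be sketched in a few lines of Python. The cipher below is a deliberately insecure XOR stream derived from SHA-256 – a toy construction invented purely for illustration, not a real cipher – but it shows the workflow: the commander edits the clear text and re-runs the encryption, rather than hunting for the image of A in the ciphertext.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Derive a repeatable pseudo-random byte stream from the key.
    # Toy construction for illustration only -- not cryptographically secure.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # XOR with the keystream; applying encrypt() twice recovers the input.
    return bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))

key = b"field-key"
order_a = encrypt(key, b"attack target A at dawn")

# The intention changes: edit the *clear text* and re-encrypt the whole order.
order_b = encrypt(key, b"attack target B at dawn")
```

With this toy XOR stream the two ciphertexts happen to differ in only one byte; a real cipher diffuses every bit of the plaintext across the output, which is exactly why patching the ciphertext directly is hopeless.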

Still, in programming we are in effect editing the encoded message all the time. When the intention in the problem changes from A to B, the programmers in fact have to replay the whole thought experiment described above:

  1. identify the image of A in the code – effectively all the places where A had an effect on the code. This is called code scattering in aspect-oriented programming.
  2. encode B in terms of the same implementation assumptions that were in force when A and the rest of the code were first coded.
  3. edit the code by removing the image of A and inserting the image of B.

This is true for all the other activities mentioned earlier – for testing, optimization, or even documentation. Everything that is done to software – other than literal execution on a computer – makes sense only in terms of the original problem. When we look at the code we need to map it back to the problem statement; otherwise the code is devoid of meaning. In other words, a statement such as:

    i=1;

could be correct, or it could be a fatal error; we can decide which only in the context of what the problem was. Programming languages were not designed for the express purpose of encrypting the problem intentions, but they were not designed with the express purpose of retaining the intentions either. In fact, the best means most languages offer for preserving intentions is the trivial “comment” facility, with its well-known problems.
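To illustrate, here is a hypothetical pair of functions (the names and the "problem statements" in the comments are invented for this example). Their bodies are identical; only the recorded intention decides which one is correct:

```python
def first_line_number() -> int:
    # Problem statement: invoice line numbers are 1-based.
    return 1  # correct with respect to this intention

def initial_retry_count() -> int:
    # Problem statement: "start with zero retries performed".
    return 1  # the identical code is now a fatal off-by-one error
```

The compiler is equally happy with both; only a reader who knows the problem can tell them apart, and a comment is the only place the language gives us to record it.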

Programming also suffers relative to encryption in that programming is done completely manually – hence it can be slow and error-prone – while the encryption process is always in an executable form. But the greatest drawback of programming is that the “inputs” to the programming process, namely the problem statement and the implementation assumptions, are not available in a database where a program could process them. For example, in step 2 above, the implementation assumptions are not readily available and have to be manually extracted from their encoded consequences.

On a positive note, we can also notice that there have been remarkable improvements in programming techniques, and they have frequently addressed preserving problem intentions and separating implementation assumptions. For example, OOP techniques have helped in many cases with the former, and type declarations and modules have helped with the latter.
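As a hypothetical sketch of the latter, type declarations can record an implementation assumption that would otherwise be encrypted into the code. The names `Cents` and `Transfer` below are invented for this banking-flavored illustration:

```python
from dataclasses import dataclass
from typing import NewType

# The type declaration records the assumption: amounts are integral
# cents, never floating-point dollars.
Cents = NewType("Cents", int)

@dataclass(frozen=True)
class Transfer:
    """Move money between accounts; the problem term 'transfer' survives in the code."""
    source: str
    target: str
    amount: Cents

def apply_transfer(balance: Cents, t: Transfer) -> Cents:
    # The signature, not a comment, states the assumption that
    # balances and amounts share a unit.
    return Cents(balance - t.amount)
```

Here a reader (or a tool) can recover part of the intention from the declarations themselves, instead of reverse-engineering it from bare integers scattered through the code.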

But if we keep the encryption metaphor in mind, we can get a clearer idea of what we need to do in order to improve the speed and reliability of programming even more.


6 Responses to Is programming a form of encryption?

  1. Michael Dynin says:

    The “encryption” metaphor is a stretch — the primary intent of encryption
    is to obscure the message, and *hopefully* this is not the primary
    intent of programming. Rather, I think of coding as compression.
    The programming language can then be thought of as a
    compression scheme. Two factors are important: to what extent
    is the compression lossy (to what extent the intent is lost),
    and what is the compression rate (how efficient is the language
    at eliminating redundancy). Obviously, less lossy and higher-rate
    compression is good.
    Better compression is usually achieved by domain-specific languages,
    just as custom compression schemes for pictures, audio and video
    can leave general-purpose compression algorithms in the dust.
    Language-oriented development promises a general-purpose
    “compression scheme” that performs as well as domain-specific ones.
    A compression algorithm can be arbitrarily complex, but there is a cost
    associated with the complexity of the programming language. The real
    challenge is in providing a sufficiently better “compression scheme”
    that is simple enough that the cost of switching is acceptable.

  2. Magnus Christerson says:

    Dear Michael,
    Yes, the encryption metaphor is a stretch – as we said it’s a “fun” way to compare the similarities. Of course no reasonable programmer intentionally tries to hide the problem statement in their code!
    Your compression analogy is good and it points out the potential lossiness of a software program – some intentions are not only encrypted or compressed, but actually lost! This is true both for design information (for example, an architectural decision can easily be lost in code) and for problem domain information (today many legacy business rules are lost in the code).
    However, compression is used to get a more compact format for the message. This is not usually the case when we write a computer program. As we have blogged before, the problem statement of a program can typically be expressed in a more compact form than our typical source code.
    And you are right: domain-specific languages, or more precisely, languages which have the appropriate level of abstraction for the problem at hand, do a better job of encoding the intentions of the software to be built. So ultimately encoding is the ideal scenario, as opposed to today’s processes, which are more similar to encryption or even compression.

  3. Michael Dynin says:

    There are in fact some truly encrypted problem statements — at Obfuscated Code Contests… :)
    I agree that encoding is the ideal scenario — with 1:1 mapping between decisions the programmer makes and constructs of the program representation.

  4. Hans Hurvig says:

    The analogy with encryption also holds when you move one abstraction level down, from source code to machine code.
    Machine code is essentially ciphertext, and with optimizing compilers it is very unclear which bits of machine code reflect which parts of the source code – the “image of A”.
    Machine code is gibberish to most programmers, and no sane person would modify it when they have the source code and can modify and recompile it.
    Indeed, most programmers can be fully productive while being blissfully ignorant of the messy details of machine code, even its existence.
    How did we attain this happy state? By having powerful tools that completely shield us from these details.
    Now, of course, moving one abstraction level back up, the trick is to attain the same level of power in your intentional framework so we can start forgetting about source code. Good luck!

  5. Hej Hans,
    You mean that intentional code is to generated code as high-level source is to the machine code generated by optimizing compilers. We agree. So the optimizing compiler represents knowledge about the target machine architecture, as well as some general programming tricks such as constant folding and strength reduction.
    Generative programming is an extension of the programmer’s power to include other domain details and specific system architecture details into the “compilation process”.
    Now, moving up the abstraction level above the source code has been tried before; for example, the current model-driven tools all try to do this as they generate source code. However, many of these tools still require you to also work at the level of the generators’ output, i.e. the source code – for example when debugging, or when adding some lower-level code in the source code language to complement the model information. Since there is much more to say about this approach, we’ll write it up as a separate blog entry.
    Thanks for your post!

  6. Jason Haley says:

    re: Disassemble vs. Decompile