Hungarian notation

Hungarian notation has had a fascinating history. Because I am still getting quite a few emails asking for details, history, or advice, I decided to sit down and write up some of the answers, even though Hungarian, on the face of it, has nothing to do with Intentional Software. So I am writing now as a private person, but I would like to come back later to some interesting connections with intentionality.

It is a lot of fun to Google “Hungarian naming”. I get a particular kick out of this site (scroll to #29 and #30), but mostly the criticism is quite mild. Many people recognize that not all dialects preserve the original intent, and when the focus moved over to the implementation types – like short, int, or long – the conventions became less useful or even counterproductive. As we will see, the idea was really to distinguish types that are meaningful in the particular application, what we call application types.

I still have the listings of the program that started the whole thing – it was written in 1972 and 1973 while I was employed part-time at NASA Ames in Mountain View, CA (my other part-time positions in those years were at Xerox PARC and Stanford). The program, code-named Peso, was a translator that took wiring diagrams as input and compiled special-purpose simulators that could simulate the complex circuits so described, including the parallel simulation of simple failure modes of the circuits – this was at a time when complex circuits were indeed prone to failure and finding the failures was a practical problem.

Peso was written in PDP-10 machine code, which could be organized, with the help of macros and some discipline, as a simple programming language. As you might imagine, the program had to be quite complex. Among other things, we had:

  • card numbers
  • card types
  • card pin numbers
  • DIP (meaning microchip) positions on the card
  • DIP types
  • DIP pins
  • DIP failure modes
  • DIP type simulating methods
  • signal names
  • signal simulation variables

and their connections to each other. The machine had its own implementation types: words (36 bits BTW), half words, and general byte pointers, among others, with specific operations being valid only for specific data types – however, there was no automatic checking of these rules whatsoever.

After only a few days of work I realized that I had to create order in this potential chaos, and that is how the naming conventions were first used. All the above quantities and formats got their “tags”, and then all the related structures could be easily named and the code practically wrote itself.

Here are some comment lines from somewhere in the 1000-odd pages of the code (yes, it was all upper case):


; IF LC=(CP), SCAN BACKPANEL WIRELIST AND FOR EVERY
; (CN, BWPN) FIND LC ON THAT CARD AND CALL PRPLC(LC, CN)

I can see that here a local signal (LC) on a card reaches the connector at card pin number CP, which is connected through backplane wiring that was defined in terms of card numbers (CN) and BWPNs (pin numbers on the backplane were numbered differently than on the card – ask the engineers why…), which allowed the signal to propagate to other cards, and so on. And yes, the code ran very well, finding lots of faults first in the wirelists and then in the 1970-vintage supercomputer itself.

The next project using the conventions was Bravo, the first WYSIWYG editor, written at PARC for the Alto, which was the first personal computer. Bravo was written in BCPL, which was just one step above machine code, since there was only one data type: the 16-bit machine word. I remember many tags from Bravo; for example, the easy-to-say letter combination CP from Peso was reused for “character pointer”.

In the early days of Application Software Development at Microsoft, we did not legislate the use of the conventions, but as the software got more sophisticated, their use spread rapidly. Now we were using C, where the pointer arithmetic, in particular, had to be done very carefully and with full understanding of what the application types (such as character pointers in an editor) and the implementation types were. Mac Excel, in particular, was a very demanding application, initially half interpreted, half machine coded, with a GUI and a complex spreadsheet recalculation engine. The hundreds of types would have been very difficult to keep track of without the conventions. This was also the time that Doug Klunder published his summary of the Hungarian naming convention.

It was probably Doug, ever fond of irony, who first called the conventions “Hungarian” – a combined reminder of my Hungarian origins and a twist on the phrase “it’s Greek to me”, ironically implying that code using the conventions would not be very readable. The term stuck, maybe because it followed the pattern of “Reverse Polish Notation”. Many years later I was approached by a researcher for the Oxford English Dictionary for documentation on the usage and origins. I have not seen the term in the OED yet, but it would fill me with immense pride if this idea appeared there, even as only the third sense of “Hungarian”.

With the diffusion of GUI-experienced application programmers into the Windows development group, the naming conventions also spread, and dialects inevitably appeared. Unfortunately, one of the less attractive dialects got the largest circulation through Petzold’s excellent book on Windows programming. Since then a lot has happened: the Web made it even easier for Hungarian to spread and mutate, and OOP became universal, which created counter-pressure from other conventions and also made some of the types part of the language (for example, the Hungarian enumerated element coRed would be written in OOP as Co.Red, where Co could be the enumerated type).
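To make that parenthetical concrete, here is a minimal sketch (the identifiers are hypothetical, chosen only for illustration):

    // What a Hungarian code base wrote as a flat constant coRed becomes,
    // with enumerations in an OOP language such as C#, a member of a named
    // type: the tag "co" survives as the name of the type.
    class ColorSketch
    {
        enum Co { Red, Green, Blue }

        static void Main()
        {
            Co coBackground = Co.Red;           // written Co.Red rather than coRed
            System.Console.WriteLine(coBackground);
        }
    }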

So much for the history. Now let’s look at some of the controversies.

You can probably guess how I feel about Hungarian: I am completely sold on it, all the code that my company now produces uses it internally, and I couldn’t live (OK, I could not program) without it. At the same time, we recognize the resistance that it generates – one might say the “high activation energy” that is required – so we make sure that our APIs and publications do not depend on it.

If you read some of the above references, you are aware of the main goals of Hungarian, namely that names should be carriers of application type (as opposed to implementation type) information, which then serves as an excellent mnemonic and thinking aid.
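A two-line contrast may help; the names below are hypothetical, chosen only to show the difference between carrying implementation type information and carrying application type information:

    class SelectionSketch
    {
        // Implementation type only: some integer, counting who knows what.
        int start;

        // Application type as well: cp marks this as a character position in
        // the document, here the one where the selection starts.
        int cpSelStart;
    }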

The most frequent issue with Hungarian is that it goes counter to some well-established and common-sense ideas on naming, which are roughly the following:

a. Names should be descriptive, and as long as needed: for example Font, Color, Window, Table, Red, CommonCacheForAuxiliaryFilePageTable, etc.

b. Words should follow in the same order as in grammatical English: MyFont, LastElement, GetItem, InvalidateDependencyStructure.

c. Abbreviations are to be avoided – words should be pronounceable without spelling out the letters.

d. In general, program readability is very important.

But to me the question was always less “what should the names be?” than “what are the things that should be named?” When I look at problems this way, I see the need to name more and more things: for example, in the Peso program, the way the pins were numbered on the backplane (two columns of staggered wire-twist posts per board) and the way the pins were numbered on the card that plugs into the connector (one column of connectors on one side of the printed circuit board). I would be dissatisfied with all of the obvious alternatives:

“they are all integers; call them i, j, k; or pinInput, myPin, cardsPin, etc.”

“describe them until you are satisfied: outputPinOnTheBackplane, pinNumberForTheBackplane, pinNumberOnTheCard”

Before we start choosing, we must realize that this is just the tip of this particular “pin number distinction” iceberg. We not only have to name a few fields or method parameters of this kind; we will also have a whole host of related entities to name. For example, the various maps: the map from the backplane pin number to the backplane connections, or from the board pin to the printed circuit connections. If we have a variant record – a subclass, if you will – we will have to name the variant having to do with the kind of pin. Again, the obvious alternatives can be very frustrating:

“these are special cases to this problem, to this domain, to this solution. You need to solve them one by one using your engineering education and talents.”

“they are all maps, call them myMap, mainMap, backplaneHashTable”

“just say what you want” – I guess that would be “map from the pin number for the backplane to a handle to a set of wirelist connections”, but that is not a name.

The reason for the frustration is that this iceberg is just a small corner of a veritable Antarctic ice shelf with thousands of similar icebergs related in complex ways. If the problem had to do with just one rather simple distinction, there would be no need to worry about naming; once we have talked about it long enough, even iCard and jBack would work just fine. So it is in the context of larger problems, and especially in the context of wishing to make finer distinctions, that the classical naming rules can and should be questioned.

Many comments on the web point out that such distinctions can be made by types, and particularly by classes that are effectively programmer-created types. That is true – with some reservations – but even then, the types or classes have to be named, and the naming issue remains.

I found that the only way to create a map of the giant ice shelf with its thousands of icebergs is to choose rather arbitrary short tags and combine them in consistent ways – in other words, by using Hungarian.
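To make this concrete, here is roughly what the pin-number iceberg looks like once it gets tags. The tags cn, cp, lc, and bwpn come from the Peso discussion above; the composite names are hypothetical reconstructions in modern syntax, not the actual Peso identifiers:

    class PesoTagSketch
    {
        // The basic tags, as discussed above:
        int cn;                 // card number
        int cp;                 // pin number as counted on the card edge connector
        int bwpn;               // pin number as counted on the backplane posts
        string lc;              // local signal name on a card

        // Once the basic tags exist, the related structures practically name
        // themselves by composition:
        int[] mpcpBwpn = new int[64];        // map: card pin -> backplane pin
        string[] mpbwpnLc = new string[64];  // map: backplane pin -> local signal
    }

The point is not the particular spellings, but that every related entity gets a name that is derived rather than invented, so a reader can decode it the same way the writer encoded it.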

Even today, I start each problem or subproblem by enumerating all the types, quantities, states, origins, units, meta-levels (meaning an entity vs. the description of an entity), and many other distinctions of interest, and by giving them names. I do not mind changing names – I often do – but I can’t keep things straight unless I have some precise handle for them at all times. At this point I have not written a line of code yet, but I have already made more distinctions and introduced more names than most programmers would at this stage.

Are these distinctions real? Are they useful? They are certainly real, in that any correct implementation of the same problem will have implicitly incorporated them in some form: in part encoded in the execution trace, and in part encoded somehow in the fully qualified names that are used. I maintain that they are useful because in my case they are already explicitly encoded in the initial, possibly incorrect implementation as well! This means that the distance from the initial implementation to the correct implementation can be very small. One can get to the correct results faster.

We all notice how modern type safety simplifies our work. I am saying that by making more and finer distinctions and making them explicit, we can go even further.

Would it help to enrich the type systems? It would not hurt, but again, the new distinctions would require new names, so unless we are careful, we would take on new burdens in order to get new benefits. It is better to get the benefits of finer type distinction without the burden of having to name and remember the names of new entities.

So what do people who do not use Hungarian do? Probably the most common thing in production code is to simply muddle through and treat each name as if it were one of the few in a small help-file example or homework problem, where it would be defensible as such. The next level in engineering and in costs is to keep wider consistency within the project – which is not easy to do – but still to stop at the edge of the distinctions that the implementation language happens to support.

My favorite recent software horror story has to do with the lack of such a distinction: the Mars Climate Orbiter that was lost because the measurement units were not indicated for a quantity. This is widely understood to be a “documentation error”, but I think that if it were more common for type systems (whether automatic or manual) to take units into account, problems like this would not happen. Another peculiar conclusion was to connect the problem to the “cheaper, faster” design philosophy that the project was developed under. This might be literally correct – if more money had been thrown at the project, the problem might have been found in time. But I think that better programming techniques are good not only for saving the rocket from blowing up; they are also good for saving the next test run from blowing up, and therefore they make the software project “cheaper and faster”.
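As a hedged sketch of what “taking units into account” could look like (the types and numbers below are purely illustrative, not the actual flight software): if the two impulse units are distinct types, mixing them no longer compiles, and the conversion has to be spelled out.

    struct PoundForceSeconds
    {
        public readonly double Value;
        public PoundForceSeconds(double value) { Value = value; }
    }

    struct NewtonSeconds
    {
        public readonly double Value;
        public NewtonSeconds(double value) { Value = value; }

        // The only way to turn pound-force-seconds into newton-seconds is
        // through this named, visible conversion (1 lbf = 4.44822 N).
        public static NewtonSeconds From(PoundForceSeconds lbfs)
            => new NewtonSeconds(lbfs.Value * 4.44822);
    }

    class UnitSketch
    {
        static void ApplyImpulse(NewtonSeconds impulse)
            => System.Console.WriteLine($"Applying {impulse.Value} N*s");

        static void Main()
        {
            var reported = new PoundForceSeconds(12.5);
            // ApplyImpulse(reported);                    // would not compile: wrong unit
            ApplyImpulse(NewtonSeconds.From(reported));   // conversion is explicit
        }
    }

The same distinction can of course be kept manually, Hungarian style, by tagging plain doubles; the point is only that the distinction exists and is visible at every use.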

Why go beyond the language type system for type checking?

Programmers seem to loathe making distinctions that are not sanctioned by the implementation language. I believe there are two reasons for this. The first one is practical: in most languages, the step from what is sanctioned to what is not is a pretty expensive one. It is easy to go from int to float to double, but it is a much bigger step to go from double to a class just to make a small distinction of coordinate origin or measurement unit, for example. Here the implementation type distinction is sanctioned, the coordinate origin distinction is not, so a general-purpose abstraction would have to be invoked for the purpose, with relatively higher implementation costs – design, typing, debugging, and run-time costs. The second reason may be more psychological. Programmers trust the language designers (with a lot of encouragement from language designers and vendors) to choose the level of distinction that is proper. So what is not supported must be somehow beyond the pale; it would not be cricket to make a distinction in a way that is not in the “style” of the language.

The other side of the coin is when the implementation forces the programmers – quite legitimately – to make some distinction they did not think of first. For example, the same quantity may have to be kept in different formats for interfacing with fixed APIs, or when different code bases are merged. Not being used to naming distinctions, the programmers now have to – the compiler demands it, and there is no arguing with the compiler. This is where we start seeing some strange names even in the better-engineered code bases: names get distorted with variant spellings and with abbreviations to refer to the alternatives.

Here my example is the use of “delegates” in C#. These are type-safe procedure pointers (with class capabilities) and they perform a very important task. Unfortunately, for each use one needs to name a number of related quantities: the delegate type, the instance of the type, and the definition of a procedure that matches the delegate type. The language does not consider this a problem – after all, in the examples these can be called just A, B, and C (or myDelegateType, myDelegateInstance, etc.) – but that will hardly scale. Yet the programmer does not really care about any of this. The programmer cares about passing a comparison operation to a method, and the name distinctions should not be necessary.
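A minimal sketch of the three names that a single delegate use can force on the programmer (the identifiers here are hypothetical):

    using System;

    class DelegateNamingSketch
    {
        // 1. The delegate type itself needs a name.
        delegate int LengthComparison(string a, string b);

        // 2. The method that matches the delegate type needs a name.
        static int CompareByLength(string a, string b)
        {
            return a.Length - b.Length;
        }

        static void Main()
        {
            // 3. The instance of the delegate type needs a name as well.
            LengthComparison lengthComparison = CompareByLength;

            // All the programmer really cared about was handing a comparison
            // operation to some method and invoking it.
            Console.WriteLine(lengthComparison("alpha", "be"));    // prints 3
        }
    }

Three names, where the programmer’s intention contained only one idea: compare by length.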

We are not normally conscious of the fact that we do not use a different name for the definition of a method and for a call of the same method: the syntax of the language makes the minute distinction between the meta-levels – the call or the definition – automatically. This distinction is sanctioned by the language and does not create demands for extra names (some IDEs actually make the distinction visible by varying the color of the name: the definition may be shown in a different color than the name at the call). On the other hand, the distinction between the delegate type, instance, and definition is not sanctioned by the language, so we are forced to give them different names – and we are not normally conscious of the fact that we are unnecessarily forced to do so.

So what is the deal with being arbitrary about the basic tags, like cp, cn, lc in the Peso program?

Making up such terms is not only cheap, it is actually a very good tradeoff:

First, there are just not enough short and meaningful words to use, and tags have to be short to be composable. For the linguists out there: the Danish language, with 4 more written vowels in its alphabet than English, is much richer in two-letter words, which makes Danish great for crossword puzzles or for Scrabble.

Even more interesting is the fact (or experience) that meaningful words can actually hinder communication about programs, because it is frequently unclear whether the conventional or the technical meaning is intended.

It is claimed that the learning process for new programmers is shorter if the terms are already familiar to them. This, too, is contrary to experience. Of course they will have a superficial “aha” reaction right away, but with the typical complexity of software the details may partially clash with their prior expectations, so they still have to make the mental distinction between the term’s familiar prior sense and its current technical sense. This does not seem that simple, and, more importantly, it will not scale to thousands of terms – these simply cannot all be “already familiar”. Making just a few of them familiar is also bad: it creates an unnatural aura around the particular aspects with the familiar names – why are they distinguished? Now we have to remember which ones are distinguished and which ones have other names (as in: which of the 50 hash tables is actually called “hashTable”?).

Of course the use of familiar terms in explanations, discussions, metaphors, comments, or as the basis of the mnemonics is essential. On the other hand, a phrase such as “character pointer” should not be a name; it should be only a part of the mnemonic legend and the explanation – the name or tag should be “cp”. In fact, it might be interesting to poll people on what they think a “character pointer” really was in Bravo, beyond the fact that it helped with pointing at characters in an editor. I will give the answer to the curious at the end of this blog entry so you can test your expectation or ESP. My point is that using the “familiar” or “descriptive” term CharacterPointer would not have been superior to cp, and the short tag cp made hundreds of name combinations practical, so on balance cp is much better than CharacterPointer.

Why the strange word order: type first, discriminator second?

This is a peculiar rule of the Hungarian language: the family name comes first and the given name second, as on government forms – for example, Bartok Bela or Liszt Ferenc. Seriously, this order is important in part for the same reason the government has: to sort related names close to each other. I was recently in a museum looking at the original ID plate on an artifact that said “Engine, Rocket, Liquid Propellant”, which is an extreme example of this usage.

But the main reason for the Hungarian word order is to be able to parse the name quickly, at a glance, into a type tag and the discriminator. If the type is kept at the end of the word it is much harder to parse for informal type checking:

Compare for instance:

    xdLeft += dxdNext;     (good) vs.
    leftXd += nextDxd;     (bad).

    (BTW: dxd is a constructed type, the difference between xd’s, so it is legitimate to add them to an xd.)

Lately we have started to separate the type tags in some compositions to make them more parsable; for example, we write mappings as mpcn_xd, where the _ signals the start of a new type tag for the scanning eye.
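A small sketch of the parse-at-a-glance point, with hypothetical tags: xd for an x coordinate (say, in the document), dxd for a difference between two xd’s, cn for a card number, and mpcn_xd read here as a map from cn to xd, the underscore marking where the result tag begins:

    class WordOrderSketch
    {
        int xdLeft;                       // an xd: the type tag comes first
        int dxdNext;                      // a dxd: the difference between two xd's
        int[] mpcn_xd = new int[16];      // map: cn -> xd

        void Advance(int cn)
        {
            xdLeft += dxdNext;            // the tags line up on the left: an xd gets a dxd added to it
            xdLeft = mpcn_xd[cn];         // the eye splits the name into mp + cn + _xd
        }
    }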

What about readability? Isn’t that the most important thing?

Readability is one of those opaque concepts that are always praised but seldom defined. In my experience one seldom “reads” even just a part of a program aloud. Of course it is very important to be able to speak about the program and its concepts in a precise and comfortable way, most frequently without reference to the actual text on paper or the screen, as in a team discussion. For this, the existence of names is the key – and, as I mentioned, in standard practice many concepts, states, meta-levels, or types remain nameless, so they always have to be circumscribed informally, which is far from an easy, quick, or reliable way of communicating. Also, the use of common terms for local (“parochial”) entities can be very confusing – as when something is called a “hashTable”.

When programs are read for understanding, the quality that is needed should be called “transparency” rather than readability. That is, can one perceive the structure and the underlying intentions? Comments, by themselves, are seldom the answer. If we do not have sufficient precision in names, or rather in the resolution among our concepts, we cannot write precise comments either. With names we get resolution, we get transparency of meaning and structure, and the comments become less significant.

Short names are the key to showing structure. This is one reason that mathematicians and physicists almost always use very short names, including Greek, Hebrew, and even made-up letters like the ħ (h-bar) of quantum mechanics. Extraneous detail at the leaves hides the shape of the tree. Einstein, for example, created a special notation for tensor calculus (the Einstein summation convention) just so that the large sigma summation signs could be omitted from the tensor equations that comprise the General Theory of Relativity. This way the relationships are easier to recognize and the formulas are easier to manipulate; the theory is complex enough without having to worry about the sigmas everywhere. This is also true for programming, and there is no shame in using similar tricks for our more modest application problems. Of course, by “short” names here we mean tags of 2 to 5 characters that compose into fully distinguished names of 10 or more characters, since we need hundreds of basic tags and thousands of names, and we have to balance the need to show structure against the need to show the intention (the application type) as well.
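For the curious, the convention simply drops the summation sign whenever an index appears once as a superscript and once as a subscript; lowering an index, for example, goes from the explicit sum to the compact form:

    % With the summation sign spelled out:
    A_\mu = \sum_{\nu=0}^{3} g_{\mu\nu} A^\nu

    % With the Einstein convention, the repeated index \nu implies the sum:
    A_\mu = g_{\mu\nu} A^\nu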

Aren’t Hungarian names hard to learn?

My experience has been that, on average, they are easier to recall than “descriptive names” – and by average I mean that we have to test against the precise recall of hundreds of names. For the approximate recall of a few names, familiar terms would win handily. The problem with familiar descriptive names is that there is no single way to describe something, so when trying to remember a name, it is not very helpful to know that the name was “descriptive”; we would also have to remember which particular quality was described. In Hungarian, the quality is given: it is the type – the intentional type, the domain type, the application type. If I know that – meaning that I know it by name – I will have the name, or at least the family name, of the quantity.

I may not remember all the type tags in the 30-year-old Peso program, but with a few minutes of inspection I will be familiar again with cn, cp, lc, and the others, and then I will be ready to recognize and recall hundreds of variables, fields, and methods with a precision that is sufficient to make repairs or additions.

 

Here is the solution to the “puzzle” above:

cp was a 16-bit unsigned integer, an abstract pointer to the nth character in the current state of a document doc. Given doc and cp, there was a rather expensive operation to get a real pointer pch to the character in memory, plus many additional things that made the system flexible and efficient.

There are many more things to discuss about Hungarian, for example the use of the discriminators and the type modifiers. These can also be standardized to a large extent and can be of great help in writing and maintaining software. So I will return to this topic at a later time.