I am a potato – heading towards caramelisation. (Programming Language Threshold Concepts Part II)

Following up on yesterday’s discussion of some of the chapters in “Threshold Concepts Within the Disciplines”, I finished by talking about Flanagan and Smith’s thoughts on the linguistic issues in learning computer programming. This led me to the theory of markedness, a useful way to think about some of the syntactic structures that we see in computer programs. Let me introduce the concept of markedness with an example. Consider the pair of opposing concepts big/small. If you ask how ‘big’ something is, then you’re not actually assuming that the thing you’re asking about is ‘big’, you’re asking about its size. However, ask someone how ‘small’ something is and there’s a presumption that it’s actually small (most of the time). The same thing happens for old/young. Asking someone how old they are, bad jokes aside, is not implying that they are old – the word “old” here is standing in for the concept of age. This is an example of markedness in the relationship between lexical opposites: the assumed meaning (the default) is referred to as the unmarked form, where the marked form is more restrictive (in that it doesn’t subsume both concepts) and it is generally not the default. You see this in gender and plural forms too. In Lions/LionessesLions is an unmarked form because it’s the default and it doesn’t exclude the Lionesses, whereas Lionesses would not be the general form used (for whatever reasons, good or bad) and excludes the male lions.

Why is this important for programming languages? Because we often have syntactic elements (the structures and the tokens that we type) that take the form of opposing concepts where one is the default, and hence unmarked, form. Many modern languages employ object-oriented programming practices (itself a threshold concept) that allow programmers to specify how the data that they define inside their programs is going to be used, even within that program. These practices include the ability to set access controls, that strictly define how you can use your code, how other pieces of code that you write can use your code, and how other people’s code can use it, as well. The fundamental access control pairs are public and private, one of which says anyone can use this piece of code to calculate things or can change this value, the other restricts such use or change to the owner. In the Java programming language, public dominates, by far, and can be considered unmarked. Private, however, changes the way that you can work with your own code and it’s easy for students to get this wrong.  (To make it more confusing, there is another type of access control that sits effectively between public and private, which is an even more cognitively complex concept and is probably the least well understood of the lot!) One of the issues with any programming language is that deviating from the default requires you to understand what you are doing because you are having to type more, think more and understand more of the implications of your actions.

However, it gets harder, because we sometimes have marked/unmarked pairs where the unmarked element is completely invisible. If we didn’t have the need to describe how people could use our code then we wouldn’t need the access modifiers – the absence of public, private or protected wouldn’t signify anything. There are some implicit modes of operation in programming languages that can be overridden with keywords but the introduction of these keywords just doesn’t illustrate a positive/negative asymmetry (as with big/small or private/public), these illustrate an asymmetry between “something” and “nothing”. Now, the presence of a specific and marked keyword makes it glaringly obvious that there has been an invisible assumption sitting in that spot the whole time.

One of these troublesome word/nothing pairs is found in several languages and consists of the keyword static, with no matching keyword. What do you think the opposite (and pair) of static is? If you’re like most humans, you’d think dynamic. However, not only is this not what this keyword actually means but there is no dynamic keyword that balances it. Let’s look at this in Java:

public static void main(String [] args) {...}
public static int numberOfObjects(int theFirst) {...}
public int getValues() {...}

You’ll see that static keyword twice.Where static isn’t used, however, there’s nothing at all, and this (by its absence) also has a definite meaning and this defines what the default expectation is of behaviour in the Java programming language. From a teaching perspective, this means that we now have a default context, with a separation between those tokens and concepts that are marked and unmarked, and it becomes easier to see why students will struggle with instance methods and fields (which is what we call things without static) if we start with static, and struggle with the concept of static if we start the other way around! What further complicates is this is that every single program we write must contain at least one static method, because it is the starting point for the program’s execution. Even if you don’t want to talk about static yet, you must use it anyway (unless you want to provide the students with some skeleton code or a harness that removes this – but now we’ve put the wizard behind the curtain even more).

One other point I found very interesting in Flanagan and Smith’s chapter was the discussion of barriers and traps in programming languages, from Thimbleby’s critique of Java (1999). Barriers are the limitations on expressiveness that mean that what you want to say in a programming language can only be said in a certain way or in a certain place – which limits how we can explain the language and therefore affects learnability. As students tend to write their lines of code as and when they think of them, at least initially, these barriers will lead the students to make errors because they haven’t developed the locally valid computational idiom. I could ask for food in German as “please two pieces ham thick tasty” and, while I’ll get some looks, I’ll also get ham. Students hitting a barrier get confusing error messages that are given back to them at a time when they barely have enough framework to understand what these messages mean, let alone how to fix them. No ham for them!

THIS IS AN IMPORTANT QUESTION!

Traps are unknown and unexpected problems, such as those caused by not using the right way to compare two things in a program. In short, it is possible in many programming languages to ask “does this equal that” and return an answer of true or false that does not depend upon the values of this or that, but where they are being stored in memory. This is a trap. It is confusing for the novice to try to work out why the program is telling her that two containers that have the value “3” in them are not the same because they are duplicates rather than aliases for the same entity. These traps can seriously trip someone up as they attempt to form a correct mental model and, in the worst case, can lead to magical or cargo-cult thinking once again. (This is not helped by languages that, despite saying that they will take such-and-such an action, take actions that further undermine consistent mental models without being obvious about it. Sekrit Java String munging, I’m looking at you.)

This way of thinking about languages is of great interest to me because, instead of talking about usability in an abstract sense, we are now discussing concrete benefits and deficiencies in the language. Is it heavily restrictive on what goes where, such as Pascal’s pre-declaration of variables or Java’s package import restrictions? Does the language have a large number on unbalanced marked/unmarked pairs where one of them is invisible and possibly counterintuitive, such as static? Is it easy to turn a simple English statement into a programmatic equivalent that does not do what was expected?

The authors suggested ways to dealing with this, including teaching students about formal grammars for programming languages – effectively treating this as learning a new language because the grammar, syntax and semantics are very, very different from English.(Suggestions included Wittgenstein’s Sprachspiel, language game, which will be a post for another time.) Another approach is to start from logic and then work forwards, turning this into forms that will then match the programming languages and giving us a Rosetta stone between English speakers and program speakers.

I have found the whole book very interesting so far and, obviously, so too this chapter. Identifying the problems and their locations, regrettably, is only the starting point. Now I have to think about ways to overcome this, building on what these and other authors have already written.



Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s