Today’s keynote was given by Alan Noble, Engineering Director for Google Australia and long-term adjunct at the University of Adelaide, who was mildly delayed by Sydney traffic but this is hardly surprising. (Sorry, Sydney!) When asked to talk about Google’s Software Engineering (SE) processes, Alan thought “Wow, where do I begin?” Alan describes Google’s processes as “organic” and “changing over time” but no one label can describe an organisation that has over 30,000 employees.
So what does Alan mean by “organic”? Each team in Google is empowered to use the tools and processes that work best for them – there is no one true way (with some caveats). The process encouraged is “launch and iterate” and “release early, release often”, which many of us have seen in practice! You launch a bit, you iterate a bit, so you’re growing it piece by piece. As Alan noted, you might think that sounds random, so how does it work? There are some very important underlying commonalities. In the context of SE, you have an underlying platform and underlying common principles.
Everything is built on Google Three (Edit: actually it’s google3, from Alan’s comment below so I’ll change that from here on) – Google’s third iteration of their production codebase, which also enforces certain approaches to the codebase. At the heart of google3 is something called a package, which encapsulates a group of source files, and this is associated with a build file. Not exciting, but standard. Open Source projects are often outside: Chrome and Android are not in google3. Coming to grips with google3 takes months, and can be frustrating for new hires, who can spend weeks doing code labs to get a feeling for the codebase. It can take months before an engineer can navigate google3 easily. There are common tools that operate on this, but not that many of them and for a loose definition of “common”. There’s more than one source code control system, for example. (As a note, any third party packages used inside Google have the heck audited out of them for security purposes, unsurprisingly.) The source code system used to be Perforce by itself but it’s a highly centralised server architecture that hasn’t scaled for how Google is now. Google has a lot of employees spread around the world and this presents problems. (As a note, Sydney is the 10th largest engineering centre for Google outside of Mountain View.) In response to this scaling problem, Google have tried working with the vendor (which didn’t pan out) and have now started to produce their own source control system. Currently, the two source control systems co-exist while migration takes place – but there’s no mandated move. Teams will move based on their needs.
Another tool is a tracking tool called Buganizer which does more than track bugs. What’s interesting is that there are tools that Google use internally that we will never see, to go along with their tools that are developed for public release.
There’s a really strong emphasis on making sure that the tools have well-defined, well-documented and robust APIs. They want to support customisation, which means documentation is really important if sound extensions and new front ends can be built. By providing a strong API, engineering teams can build a sensible front end for their team – although complete reinvention of the wheel is frowned upon and controlled. Some of the front ends get adopted by other teams, such as the Mondrian UI front-end for Buganizer. Another front end for Google Spreadsheets is Maestro. The API philosophy is carried from the internal tools to the external products.
Google makes heavy use of their own external products, such as Docs, Spreadsheets and Analytics. (See, dog food, the eating thereof.) This also allows the internal testing of pre-release and just-released products. Google Engineers are slightly allergic to Gantt charts but you can support them by writing an extension to Spreadsheets. There is a spreadsheet called Smartsheet that has been approved for internal use but is not widely used. Scripting over existing tools is far more common.
And now we move onto programming languages. Or should I say that we Go onto programming languages. There are four major languages in use at Google: Java, C++, Python, and Go (the Google language). Alan’s a big fan of Go and recommends it for distributed and concurrent systems. (I’ve used it a bit and it’s quite interesting but I haven’t written enough in it to make much comment.) There are some custom languages as well, including scripting languages for production tasks. Teams can use their own language of choice, although it’s unlikely to be Ruby on Rails anytime soon.
Is letting engineers pick their language the key to Google’s success? Is it the common platform? The common tools? No. The platforms, tools and languages won’t matter if your organisational culture isn’t right. If the soil is toxic, the tree won’t grow. Google is in a highly competitive space and have to be continually innovating and improving or users will go elsewhere. The drive for innovation is the need to keep the users insanely happy. Getting the organisational settings right is essential: how do you foster innovation?
Well, how do they do it? First and foremost, it’s about producing a culture of innovation. The wrong culture and you won’t get interesting or exciting software. Hiring matters a LOT. Try to hire people that are smarter than you, are passionate, are quick learners – look for this when you’re interviewing. Senior people at Google need to have technical skills, yes, but they have to be a cultural fit. Will this person be a great addition to the team? (Culture Fit is actually something they assess for – it’s on the form.) Passion is essential: not just for software but for other things as well. If people are passionate about one thing, something, then you’d expect that this passion would flow over into other things in their lives.
Second ingredient: instead of managing, you’re unmanaging. This is why Alan is able to talk today – he’s hired great people and can leave the office without things falling apart. You need to hire technical managers as well: people who have forgotten their technical skills won’t work at Google, because managers have to provide a sounding board and be able to mentor members of the team.
The third aspect is being open to sharing information: share, share, share. The free exchange of information is essential in a collaborative environment, based on trust.
“Info sharing is power, info hoarding is impotence.” (Alan Noble)
The fourth thing is to recognise merit. It’s cool to do geeky things. Success is celebrated generously.
Finally, it’s important to empower teams to be agile and to break big projects into smaller, more manageable things. The unit of work at Google is about 3-4 engineers. Have 8 engineers? That’s two 4 person teams. What about meetings? Is face-to-face still important? Yes, despite all the tech. (I spoke about this recently.) Having a rich conversation is very high bandwidth and when you’re in the same room, body language will tell you if things aren’t going across. The 15 minute “stand up” meeting is a common form of meeting: stand up in the workplace and have a quick discussion, then break. There’s also often a more regular weekly meeting which is held in a “fun” space. Google wants you to be within 150m of coffee, food and fuel at all times to allow you to get what you need to keep going, so weekly meetings will be there. There’s also the project kick-off meeting, where the whole team of 20-30 will come together in order to break it down to autonomous smaller units.
People matter and people drive innovation. Googlers are supposed to adapt to fast-paced change and are encouraged to pursue their passions: taking their interests and applying them in new ways to get products that may excite other people. Another thing that happens is TGIF – which is now on Thursday, rather than Friday, where there is an open Q and A session with the senior people at Google. But you also need strong principles underlying all of this people power.
The common guiding principles that bring it all together need to be well understood and communicated. Here’s Alan’s list of guiding principles (the number varies by speaker, apparently).
- Focus on the user. This keeps you honest and provides you with a source of innovation. Users may not be able to articulate what they want but this, of course, is one of our jobs: working out what the user actually wants and working out how many users want a particular feature.
- Start with problems. Problems are a fantastic source of innovation. We want to be solving real, important and big problems. There are problems everywhere!
- Experiment Often. Try things, try a lot of things, work out what works, detect your failures and don’t expose your users to any more failures than you have to.
- Fail Fast. You need to be able to tolerate failure: it’s the flip side of experimentation. (A brief mention of Google Wave, *sniff*)
- Paying Attention to the Data. Listen to the data to find out what is and what is not working. Don’t survey, don’t hire marketing people, look at the data to find out what people are actually doing!
- Passion. Let engineers find their passion – people are always more productive when they can follow their passion. Google engineers can self-initiate a transfer to encourage them to follow their passion, and there is always the famous Google 20% time.
- Dogfood. Eat your own dogfood! Testing your own product in house and making sure that you want to use it is an essential step.
The Google approach to failure has benefited from the Silicon Valley origins of the company, with the approach to entrepreneurship and failure tolerance. Being associated with a failed start-up is not a bad thing: failure doesn’t have to be permanent. As long as you didn’t lie, cheat or steal, then you’ve gained experience. It’s not making the mistake, it’s how you recover from it and how you carry yourself through that process (hence being ethical even as the company is winding down).
To wind it all up, Google doesn’t have standard SE processes across the company: they focus on getting their organisation culture right with common principles that foster innovation. People want to do exciting things and follow new ideas so every team is empowered to make their own choices, select their own tools and processes. Launch, iterate, get it out, and don’t hold it back. Grow your software like a tree rather than dropping a monolith. Did it work? No? Wind it back. Yes? Build on it! Take the big bets sometimes because some big problems need big leaps forward: the moon shot is a part of the Google culture.
Embrace failure, learn from your mistakes and then move on.
I’m writing this on Monday (and Thursday night), after being on the road for teaching, and I’ve been picking up the pieces of a hard drive replacement (under warranty) compounded by the subsequent discovery that at least one of my backups is corrupted. This has taken what should have been a catch-up day and turned it into a “juggle recovery/repair disk/work on secondary machine” day but, hey, I’m not complaining too much – at least I have two machines and took the trouble to keep them synchronised with each other. The worst outcome of today’s little backup issue is that I have a relatively long reinstallation process ahead of me, because I haven’t actually lost anything yet except the convenient arrangement of all of my stuff.
It does, however, reinforce one of the lessons that it took me years to learn. If you have an hour, you can do an hour’s worth of work. I know, that sounds a little ‘aw shucks’ but some things just take time to do and you have to have the time to do them. My machine recovery was scheduled to take about four hours. When it had gone for five, I clicked on it to discover that it had stopped on detecting the bad backup. I couldn’t have done that at the 30 minute mark. Maybe I could have tried to wake it up at the 2 hour mark, and maybe I would have hit the error earlier, but, in reality that wasn’t going to happen because I was doing other work.
Why is this important? Because I am going to get 1, maybe 2, attempts per day to restore this machine until it finally works. It takes hours to do it and there’s nothing I can do to make it faster. (You’ll see down the bottom that this particular prediction came true because the backup restoration has now turned out to have some fundamental problems).
When students first learn about computers, they don’t really have an idea about how long things take and how important it is to make their programs work quickly. Computational complexity describes how we expect programs to behave when we change the amount of data that they’re working on, either in terms of how much space they take up or how long they take to compute. The choice of approach can lead to massive differences in performance. Something that takes 60 seconds on one approach can take an hour on another. Scale up the size of data you’re looking at and the difference is between ‘will complete this week’ and ‘I am not going to live that long’.
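To make that concrete, here is a small sketch of my own (it is not taken from any particular course). It checks a list for duplicates in two ways and counts the basic steps each one takes, so the numbers don't depend on how fast your machine is. The function names and the list sizes are just illustrative choices.

```python
def has_duplicate_pairwise(items):
    """Compare every pair of items: roughly n*n/2 comparisons."""
    comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                return True, comparisons
    return False, comparisons


def has_duplicate_with_set(items):
    """Remember what has been seen so far: about n set lookups."""
    seen = set()
    lookups = 0
    for item in items:
        lookups += 1
        if item in seen:
            return True, lookups
        seen.add(item)
    return False, lookups


for n in (500, 1_000, 2_000):
    data = list(range(n))  # no duplicates, so both approaches do all their work
    _, slow = has_duplicate_pairwise(data)
    _, fast = has_duplicate_with_set(data)
    print(f"n={n}: pairwise={slow} comparisons, set-based={fast} lookups")
```

Doubling the size of the list doubles the work for the set-based version but roughly quadruples it for the pairwise version, and that gap is what turns into "this week" versus "not in my lifetime" once the data gets big.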
When you look at a computing problem, and the resources that you have, a back of an envelope calculation will very rapidly tell you how long it will take (with a bit of testing and trial and error in some cases). If you don’t allow this much time for the solution, you probably won’t get it. Worst case is that you start something running and then you stop it, thinking it’s not going to finish, but you actually stopped it just before it was going to finish. Time estimation is important. A lot of students won’t really learn this, however, until it comes back and bites them when they overshoot. With any luck, and let’s devote some effort so it’s not just luck, they learn what to look for when they’re estimating how long things actually take.
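The back of the envelope can be written down as a few lines of code. This is only a sketch under some loud assumptions: you have timed a small trial run, you have a rough idea of how the work grows with the amount of data, and nothing else (memory, disk, network) gets in the way. The function name and the sample figures are invented for illustration.

```python
import math


def estimate_seconds(sample_n, sample_seconds, target_n, growth):
    """Scale a measured small run up to a larger input size.

    `growth` is a function describing how the work grows with n,
    for example linear or quadratic growth.
    """
    return sample_seconds * growth(target_n) / growth(sample_n)


def linear(n):
    return n


def n_log_n(n):
    return n * math.log2(n)


def quadratic(n):
    return n * n


# Hypothetical measurement: 1,000 records took half a second.
for name, growth in (("linear", linear), ("n log n", n_log_n), ("quadratic", quadratic)):
    seconds = estimate_seconds(1_000, 0.5, 1_000_000, growth)
    print(f"{name}: about {seconds / 3600:.2f} hours for a million records")
```

With these made-up figures the linear estimate is a few minutes and the quadratic one is the better part of a week, which is exactly the kind of thing you want to know before you press Enter.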
I wasn’t expecting to have my main machine back up in time to do any work on it today, because I’ve done this dance before, but I was hoping to have it ready for tomorrow. Now, I have to plan around not having it for tomorrow either (and, as it turns out, it won’t be back before the weekend). Worst case is that I will have to put enough time aside to do a complete rebuild. However, to rebuild it will take some serious time. There’s no point setting aside the rebuild as something that I devote my time or weekend to, because it doesn’t require that much attention and I can happily work around the major copies in hour-long blocks to get useful work done.
When you know how long something takes and you plan around that, even those long boring blocks of time become something that can be done in parallel, around the work that also must happen. I see a lot of students who sit around doing something that’s not actually work while they wait for computation or big software builds to finish. Hey, if you’ve got nothing else to do then feel free to do nothing or surf the web. The only problem is that very few of us ever have nothing else to do but, by realising that something that takes a long time will take a long time, we can use filler tasks to drag down the number of things that we still have left to do.
This is being challenged at the moment because the restoration is resolutely failing and, regrettably, I am now having to get actively involved because the ‘fix the backup’ regime requires me to try things, and then try other things, in order to get it working. The good news is I still have large blocks of time – the bad news is that I’m doing all of this on a secondary machine that doesn’t have the same screen real estate. (What a first world problem!)
What a fantastic opportunity to eat my own dog food. 🙂 Tonight, I’m sitting down to plan out how I can recover from this and be back up to date on Monday, with at least one fully working system and access to all of my files. I still need to allow for the occasional ‘try this on the backup’ and then wait several hours, but I need to make sure that this becomes a low-priority task that I schedule, rather than one that interrupts me and becomes a primary focus. We’ll see how well that goes.
When I was younger, I used to play a science fiction role-playing game that was based in a near-ish future, where humans had widely adopted the use of electronic implants and computers were everywhere in a corporate-dominated world. The game was called “Cyberpunk 2013” and was heavily influenced by the work of William Gibson (“Neuromancer” and many other works), Bruce Sterling (“Mirrorshades” anthology and far too many to list), Walter Jon Williams (“Hardwired” among others) and many others who had written of a grim, depressing, and above all stylish near future. It was a product of the 80s and, much like other fashion crime of the time, some of the ideas that emerged were conceits rather than concepts, styles rather than structures. But, of course, back in the 1980s, setting it in 2013 made it far away and yet close enough. This was not a far future setting like Star Trek but it was just around the corner.
And now it is here. My plans for the near future, the imminent and the inevitable, now include planning calendars for a year that was once a science fiction dream. In that dark dream, 2013 was a world of human/machine synthesis, of unfeeling and mercenary corporate control, of mindless pleasure and stylish control of a population that seeks to float as lotus eaters rather than continue to exist in the dirty and poor reality of their actual world.
Well, we haven’t yet got the cybernetics working… and, joking aside, the future is not perfect but it is far less gloomy and dramatic in the main than the authors envisioned. Yes, there are lots of places to fix but the majority of our culture is still working to the extent that it can be developed and bettered. The catastrophic failures and disasters of the world of 2013 have not yet occurred. We can’t relax, of course, and some things are looking bleak, but this is not the world of Night City.
In the middle of all of this musing on having caught up to the future that I envisioned as a boy, I am now faced with the mundane questions such as:
- What do I want to be doing in 2020 (the next Cyberpunk release was set in this year, incidentally)
- Therefore, what do I want to be doing in 2013 that will lead me towards 2020?
- What is the place of this blog in 2013?
I won’t bore you with the details of my career musings (if my boss is reading this, I’m planning to stay at work, okay?) but I had always planned that the beginning of October would be a good time to muse about the blog and work out what would happen once 2012 ended. I committed to writing the blog every day, focussed on learning and teaching to some extent, but it was always going to be for one year and then see what happened.
I encourage my students to reflect on what they’ve done, not in a ‘nostalgic’ manner (ah, what a great assignment) but in a way that lets them identify what worked, what didn’t work and how they could improve. So let me once again trot out the dog food and the can opener and give it a try.
What has worked
I think my blog has been most successful when I’ve had a single point to make, I’ve covered it in depth and then I’ve ducked out. Presenting it with humour, humility, and an accurate assessment of the time that people have to read makes it better. I think some of my best blogs present information and then let people make up their own minds. The goal was always to present my thought processes, not harangue people.
What hasn’t worked
I’m very prone to being opinionated and, sometimes, I think I’ve blogged too much opinion and too little fact. I also think that there are tangents I’ve taken when I’ve become more editorial and I’m not sure that this is the blog for that. Any blog over about 1,100 words is probably too long for people to read and that’s why I strive to keep the blog at or under 1,000 words.
Having to blog every day has also been a real challenge. While it keeps a flow of information going, the requirement to come up with something every, single, day regardless of how I’m feeling or what is going on is always going to have an impact on quality. For example, I recently had a medical condition that required my doctor to prescribe some serious anti-inflammatory drugs and painkillers for weeks and this had a severe impact on me. I have spent the last 10 days shaking off the effects of these drugs that, among other effects, make me about half as fast at writing and reduce my ability to concentrate. The load of the blog on top of this has been pretty severe and I’m open about some of the mistakes that I’ve made during this time. Today is the first day that I feel pretty reasonable and, by my own standards, fit for fair, complex marking of large student submissions (which is my true gauge of my mental agility).
How to improve
Wow, good question. This is where the thinking process starts, not stops, after such an inventory. The assessment above indicates that I am mostly happy with what came out (and my readership/like figures indicate this as well) but that I really want to focus on quality over quantity and to give myself the ability to take a day off if I need to. But I should also be focused on solid, single issue, posts that address something useful and important in learning and teaching – and this requires more in-depth reading and work than I can often muster on a day-to-day basis.
In short, I’m looking to change my blog style for next year to a shorter and punchier version that gives more important depth, maintains an overall high standard, but allows me to get sick or put my feet up occasionally. What is the advice that I would give a student? Make a plan that includes space for the real world and that still allows you to do your best work. Content matters more than frequency, as long as you meet your real deadline. So, early notice for 2013, expect a little less regularity but a much more consistent output.
It’s a work in progress. More as I think of it.
I am currently being simultaneously beaten in four games of Words with Friends. This amuses me far more than it bugs me because it appears that, despite having a large vocabulary, a (I’m told) quick wit and being relatively skilled in the right word in the right place – I am rather bad at a game that should reward at least some of these skills.
One of the things that I dislike, and I know that my students dislike, is when someone stands up and says “To solve problem X, you need to take set of actions S.” Then, when you come to X, or you find that person’s version of a solution to X, it’s not actually S that is used. It’s “S-like” or “S-lite” or “Z, which looks like an S backwards and sounds like it if you’re an American with a lisp.”
There’s a term I love called “eating your own dog food” (Wikipedia link) that means that a company uses the products that it creates in order to solve the problems for which a customer would buy their products. It’s a fairly simple mantra: if you’re making the best thing to solve Problem X, then you should be using it yourself when you run across Problem X. Now, of course, a company can do this by banning or proscribing any other products but this misses the point. At its heart, dogfooding means that, in a situation where you are free to choose, you make a product so good that you would choose it anyway.
It speaks to authenticity when you talk about your product and it provides both goals and a thinking framework. The same thing works for education – if I tell someone to take a certain approach to solve a problem, then it should be one that I would use as well.
So, if a student said to me “I am bad at this type of problem,” I’d start talking to them to find out exactly what they’re good and bad at, get them to analyse their own process, get them to identify some improvement strategies (with my guidance and suggestions) and then put something together to get it going. Then we’d follow up, discuss what happened, and (with some careful scaffolding) we’d iteratively improve this as far as we could. I’d also be open to the student working out whether the problem is actually one that they need to solve – although it’s a given that I’ll have a strong opinion if it’s something important.
So, let me eat my own dog food for this post, to help me get better at Words with Friends, to again expose my thinking processes but also to demonstrate the efficacy of doing this!
Step 1: What’s the problem?
So, I can get reasonable scores at Words with Friends but I don’t seem to be winning. Words with Friends is a game that rewards you for playing words with “high value” tiles on key positions that add score multipliers. The word QATS can be worth 13 or 99 depending on where it is placed. You have 7 randomly selected tiles with different letters, and a range of values for letters in a 1:1 association, but must follow strict placement and connection rules. In summary, a Words with Friends game is a connected set of tiles, where each set of tiles placed must form a valid word once set placement is complete, and points are calculated from the composition and placement of the tiles, but bonus spaces on the board only count once. The random allocation of letters means that you have to have a set of strategies to minimise the negative impact of a bad draw and to maximise the benefit of a good draw. So you need a way of determining the possible moves and then picking the best one.
Some simple guidelines that help you to choose words can be formed along the lines of the number of base points by letter (so words featuring Q, X or J will be worth more because these are high value letters), the values of words will tend to increase as the word length increases as there are more letters with values to count (although certain high value letters cannot be juxtaposed – QXJJXWY is not a word, sadly), but both of these metrics are overshadowed by the strategic placement of letters to either extend existing words (allowing you to recount existing tiles and extending point 2) or to access the bonus spaces. Given that QATS can be worth 99 points as a four-letter word if played in the right place, it might be worth ignoring QUEUES earlier if you think you can reach that spot.
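For the curious, here is a rough sketch of how the same four tiles get from 13 to 99. The letter values are my assumption (Q worth 10, the others worth 1 each, which matches the 13 quoted above), and the bonus placement is a hypothetical one: Q on a triple-letter space with the word also covering a triple-word space. It only scores a single freshly placed word and ignores cross-words and any other bonuses.

```python
# Assumed tile values for the letters in QATS; they add up to the 13 quoted above.
LETTER_VALUES = {"Q": 10, "A": 1, "T": 1, "S": 1}

LETTER_BONUS = {"DL": 2, "TL": 3}  # double / triple letter
WORD_BONUS = {"DW": 2, "TW": 3}    # double / triple word


def score_word(word, bonuses=None):
    """Score one newly placed word.

    `bonuses` maps a position in the word (0-based) to the bonus
    space under that tile, e.g. {0: "TL", 3: "TW"}.
    """
    bonuses = bonuses or {}
    total = 0
    word_multiplier = 1
    for index, letter in enumerate(word):
        bonus = bonuses.get(index)
        total += LETTER_VALUES[letter] * LETTER_BONUS.get(bonus, 1)
        word_multiplier *= WORD_BONUS.get(bonus, 1)
    return total * word_multiplier


print(score_word("QATS"))                      # plain placement
print(score_word("QATS", {0: "TL", 3: "TW"}))  # hypothetical bonus placement
```

The first call gives 13 and the second gives (30 + 1 + 1 + 1) × 3 = 99, which is why the spot matters so much more than the word.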
Step 2: So where is my problem?
After thinking about my game, I realised that I wasn’t playing Words with Friends properly, because I wasn’t giving enough thought to the adversarial nature of the occupancy of the bonus spaces. My original game was more along the lines of “look at letters, look at board, find a good word, play it.” As a result, any occupancy of the bonus spaces was a nice-to-have, rather than a must-have. I also didn’t target placement that allowed me to count tiles already on the board and, looking at other games, my game is a loose grid compared to the tight mesh that can earn very large points.
I also wasn’t thinking about the problem space correctly. There are a fixed number of tiles in the game, with known distribution. As tiles are played, I know how many tiles are left and that up to 14 of them are in my and my opponent’s hand. If I know how many tiles there are of each letter, I can play with a reasonable idea of the likelihood of my opponent’s best move. Early on, this is hard, but that’s ok, because we can both play in a way that doesn’t give a bonus tile advantage. Later on, it’s probably more useful.
Finally, I was trying to use words that I knew, rather than words that are legal in Words with Friends. I had no idea that the following were acceptable until (at least once out of desperation) I tried them. Here are some you might (or might not) know: AA, QAT, ZEE, ZAS, SCARP, DYNE. The last one is interesting, because it’s a unit of force, but BRIX, a unit used to measure concentration (often of sugar) isn’t a legal word.
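Tile tracking is easy enough to sketch in code, even if doing it by hand is tedious. The toy bag below is made up for illustration and is not the real Words with Friends distribution, and the probability assumes my opponent's rack is a uniformly random draw from whatever I can't see.

```python
from collections import Counter
from math import comb


def unseen_tiles(full_bag, board_tiles, my_rack):
    """Tiles that are either still in the bag or on the opponent's rack."""
    remaining = Counter(full_bag)
    remaining.subtract(Counter(board_tiles))
    remaining.subtract(Counter(my_rack))
    if any(count < 0 for count in remaining.values()):
        raise ValueError("more tiles accounted for than the bag contains")
    return +remaining  # unary plus drops letters with a zero count


def chance_opponent_holds(letter, unseen, rack_size=7):
    """Probability the opponent holds at least one copy of `letter`."""
    total = sum(unseen.values())
    rack = min(rack_size, total)
    copies = unseen[letter]
    # P(no copies) = C(total - copies, rack) / C(total, rack)
    return 1 - comb(total - copies, rack) / comb(total, rack)


# Hypothetical 20-tile bag, NOT the real distribution.
toy_bag = Counter({"A": 5, "E": 6, "S": 3, "T": 4, "Q": 1, "Z": 1})
unseen = unseen_tiles(toy_bag, board_tiles="QATS", my_rack="EEAT")
print(unseen)
print(f"Chance the opponent has the Z: {chance_opponent_holds('Z', unseen):.2f}")
```

Late in a game the unseen pile is small, so an estimate like this gets sharp quite quickly; early on it tells you very little, which matches my experience.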
So, I had three problems, most of which relate to the fact that I’m more used to playing “Take 2” (a game played with Scrabble tiles but no bonus spaces) than “Scrabble” itself, where the bonus spaces are crucial.
Step 3: What are the strategies for improvements?
The first, and most obvious, strategy is to get used to playing in the adversarial space and pay much closer attention to which bonus spaces I leave open in my play and to increase my recounting of existing tiles. The second is to start keeping track of tiles that are out and play to the more likely outcome. Finally, I need to get a list of which words are legal in Words with Friends and, basically, learn them.
Step 4: Early outcomes
After getting thrashed in my first games, I started applying the first strategy. I have since achieved words worth over 100 points and, despite not winning, the gap is diminishing. So this appears to be working.
The second and the third… look, it’s going to sound funny but this seems like a lot of work for a game. I quite like playing the best word I can think of without having to constrain myself to play some word I’m never going to actually use (when we’re up to our elbows in aa, I will accept your criticism then) or sit there eliminating tiles one-by-one (or using an assistant to do it). Given that I’m not even sure that this is the way people actually play, I’m probably better off playing a lot of games and naturally picking up words that occur, rather than trying to learn them all in one go.
Of course, if a student said something along the last lines to me, then they’re saying that they don’t mind not succeeding. In this case, it’s perfectly true. I enjoy playing and, right now, I don’t need to win to enjoy the game.
Just as well really, I think I’m about to lose four games within a minute of each other. That’s four in a row – pity, if there were three of them I could do a syzygy joke.
Step 5: Discussion and Iteration
So, here’s the discussion and my chance to think about whether my strategies need modification to achieve my original goal. Now, if I keep that goal at winning, then I do need to keep iterating but I have noticed that with a simple change of aiming more at the bonuses, I get a good “Yeah” from a high points word that probably won’t be matched by winning a game.
To wrap up, having looked at the problem, thought through it and made some constructive suggestions regarding improvement, I’ve not only improved my game but I’ve improved my understanding and enjoyment of the activity. I feel far more in control of my hideous performance and can now talk to more people about other ways to improve that maintain that enjoyment.
Now, of course, I imagine that a million WwF players are going to jump in and say “nooooo! here’s how you do it.” Please do so! Right now I’m talking to myself but I’d love some guidance for iterative improvement.