Today’s keynote was given by Alan Noble, Engineering Director for Google Australia and long-term adjunct at the University of Adelaide, who was mildly delayed by Sydney traffic, which is hardly surprising. (Sorry, Sydney!) When asked to talk about Google’s Software Engineering (SE) processes, Alan thought “Wow, where do I begin?” Alan describes Google’s processes as “organic” and “changing over time” but no one label can describe an organisation that has over 30,000 employees.
So what does Alan mean by “organic”? Each team in Google is empowered to use the tools and processes that work best for them – there is no one true way (with some caveats). The process encouraged is “launch and iterate” and “release early, release often”, which many of us have seen in practice! You launch a bit, you iterate a bit, so you’re growing it piece by piece. As Alan noted, you might think that sounds random, so how does it work? There are some very important underlying commonalities. In the context of SE, you have an underlying platform and underlying common principles.
Everything is built on Google Three (Edit: actually it’s google3, from Alan’s comment below so I’ll change that from here on) – Google’s third iteration of their production codebase, which also enforces certain approaches to the codebase. At the heart of google3 is something called a package, which encapsulates a group of source files, and this is associated with a build file. Not exciting, but standard. Open Source projects are often outside: Chrome and Android are not in google3. Coming to grips with google3 takes months, and can be frustrating for new hires, who can spend weeks doing code labs to get a feeling for the codebase. It can take months before an engineer can navigate google3 easily. There are common tools that operate on this, but not that many of them and for a loose definition of “common”. There’s more than one source code control system, for example. (As a note, any third party packages used inside Google have the heck audited out of them for security purposes, unsurprisingly.) The source code system used to be Perforce by itself but it’s a highly centralised server architecture that hasn’t scaled for how Google is now. Google has a lot of employees spread around the world and this presents problems. (As a note, Sydney is the 10th largest engineering centre for Google outside of Mountain View.) In response to this scaling problem, Google have tried working with the vendor (which didn’t pan out) and have now started to produce their own source control system. Currently, the two source control systems co-exist while migration takes place – but there’s no mandated move. Teams will move based on their needs.
Another tool is a tracking tool called Buganizer which does more than track bugs. What’s interesting is that there are tools that Google use internally that we will never see, to go along with their tools that are developed for public release.
There’s a really strong emphasis on making sure that the tools have well-defined, well-documented and robust APIs. They want to support customisation, which means documentation is really important if sound extensions and new front ends are to be built. By providing a strong API, engineering teams can build a sensible front end for their team – although complete reinvention of the wheel is frowned upon and controlled. Some of the front ends get adopted by other teams, such as the Mondrian UI front-end for Buganizer. Another front end for Google Spreadsheets is Maestro. The API philosophy is carried from the internal tools to the external products.
Google makes heavy use of their own external products, such as Docs, Spreadsheets and Analytics. (See, dog food, the eating thereof.) This also allows the internal testing of pre-release and just-released products. Google Engineers are slightly allergic to Gantt charts but you can support them by writing an extension to Spreadsheets. There is a spreadsheet called Smartsheet that has been approved for internal use but is not widely used. Scripting over existing tools is far more common.
And now we move onto programming languages. Or should I say that we Go onto programming languages. There are four major languages in use at Google: Java, C++, Python, and Go (the Google language). Alan’s a big fan of Go and recommends it for distributed and concurrent systems. (I’ve used it a bit and it’s quite interesting but I haven’t written enough in it to make much comment.) There are some custom languages as well, including scripting languages for production tasks. Teams can use their own language of choice, although it’s unlikely to be Ruby on Rails anytime soon.
Is letting engineers pick their language the key to Google’s success? Is it the common platform? The common tools? No. The platforms, tools and languages won’t matter if your organisational culture isn’t right. If the soil is toxic, the tree won’t grow. Google is in a highly competitive space and have to be continually innovating and improving or users will go elsewhere. The drive for innovation is the need to keep the users insanely happy. Getting the organisational settings right is essential: how do you foster innovation?
Well, how do they do it? First and foremost, it’s about producing a culture of innovation. The wrong culture and you won’t get interesting or exciting software. Hiring matters a LOT. Try to hire people that are smarter than you, are passionate, are quick learners – look for this when you’re interviewing. Senior people at Google need to have technical skills, yes, but they have to be a cultural fit. Will this person be a great addition to the team? (Culture Fit is actually something they assess for – it’s on the form.) Passion is essential: not just for software but for other things as well. If people are passionate about one thing, something, then you’d expect that this passion would flow over into other things in their lives.
Second ingredient: instead of managing, you’re unmanaging. This is why Alan is able to talk today – he’s hired great people and can leave the office without things falling apart. You need to hire technical managers as well: people who have forgotten their technical skills won’t work at Google, because managers have to provide a sounding board and be able to mentor members of the team.
The third aspect is being open to sharing information: share, share, share. The free exchange of information is essential in a collaborative environment, based on trust.
“Info sharing is power, info hoarding is impotence.” (Alan Noble)
The fourth thing is to recognise merit. It’s cool to do geeky things. Success is celebrated generously.
Finally, it’s important to empower teams to be agile and to break big projects into smaller, more manageable things. The unit of work at Google is about 3-4 engineers. Have 8 engineers? That’s two 4 person teams. What about meetings? Is face-to-face still important? Yes, despite all the tech. (I spoke about this recently.) Having a rich conversation is very high bandwidth and when you’re in the same room, body language will tell you if things aren’t getting across. The 15 minute “stand up” meeting is a common form of meeting: stand up in the workplace and have a quick discussion, then break. There’s also often a more regular weekly meeting which is held in a “fun” space. Google wants you to be within 150m of coffee, food and fuel at all times to allow you to get what you need to keep going, so weekly meetings will be held there. There’s also the project kick-off meeting, where the whole team of 20-30 will come together in order to break the project down into autonomous smaller units.
People matter and people drive innovation. Googlers are supposed to adapt to fast-paced change and are encouraged to pursue their passions: taking their interests and applying them in new ways to get products that may excite other people. Another thing that happens is TGIF – which is now on Thursday, rather than Friday, where there is an open Q and A session with the senior people at Google. But you also need strong principles underlying all of this people power.
The common guiding principles that bring it all together need to be well understood and communicated. Here’s Alan’s list of guiding principles (the number varies by speaker, apparently.)
- Focus on the user. This keeps you honest and provides you with a source of innovation. Users may not be able to articulate what they want but this, of course, is one of our jobs: working out what the user actually wants and working out how many users want a particular feature.
- Start with problems. Problems are a fantastic source of innovation. We want to be solving real, important and big problems. There are problems everywhere!
- Experiment Often. Try things, try a lot of things, work out what works, detect your failures and don’t expose your users to any more failures than you have to.
- Fail Fast. You need to be able to tolerate failure: it’s the flip side of innovation. (A brief mention of Google Wave, *sniff*)
- Paying Attention to the Data. Listen to the data to find out what is and what is not working. Don’t survey, don’t hire marketing people, look at the data to find out what people are actually doing!
- Passion. Let engineers find their passion – people are always more productive when they can follow their passion. Google engineers can self-initiate a transfer to encourage them to follow their passion, and there is always the famous Google 20% time.
- Dogfood. Eat your own dogfood! Testing your own product in house and making sure that you want to use it is an essential step.
The Google approach to failure has benefited from the Silicon Valley origins of the company, with the approach to entrepreneurship and failure tolerance. Being associated with a failed start-up is not a bad thing: failure doesn’t have to be permanent. As long as you didn’t lie, cheat or steal, then you’ve gained experience. It’s not making the mistake, it’s how you recover from it and how you carry yourself through that process (hence being ethical even as the company is winding down).
To wind it all up, Google doesn’t have standard SE processes across the company: they focus on getting their organisation culture right with common principles that foster innovation. People want to do exciting things and follow new ideas so every team is empowered to make their own choices, select their own tools and processes. Launch, iterate, get it out, and don’t hold it back. Grow your software like a tree rather than dropping a monolith. Did it work? No? Wind it back. Yes? Build on it! Take the big bets sometimes because some big problems need big leaps forward: the moon shot is a part of the Google culture.
Embrace failure, learn from your mistakes and then move on.
Mark’s 1000th post (congratulations again!) and my own data analysis reminded me of something that I’ve been meaning to do for some time, which is work out how much I’ve written over the 151 published posts that I’ve managed this year. Now, foolish me, given that I can see the per-post word count, I started looking around to see how I could get an entire blog count.
And, while I’m sure it’s obvious to someone else who will immediately write in and say “Click here, Nick, sheesh!”, I couldn’t find anything that actually did what I wanted to do. So, being me, I decided to do it ye olde fashioned way – exporting the blog and analysing it manually. (Seriously, I know that it must be here somewhere but my brain decided that this would be a good time to try some analysis practice.)
Now, before I go on, here are the figures (not including this post!):
- Since January 1st, I have published 151 posts. (Eek!)
- The total number of words, including typed hyperlinks and image tags, is 102,136. (See previous eek.)
- That’s an average of just over 676 words per post.
Is there a pattern to this? Have I increased the length of my posts over time as I gained confidence? Have they decreased over time as I got busier? Can I learn from this to make my posting more efficient?
The process was, unsurprisingly, not that simple because I took it as an opportunity to work on the design of an assignment for my Grand Challenges students. I deliberately started from scratch and assumed no installed software or programming knowledge above fundamentals on my part (this is harder than it sounds). Here are the steps:
1. Double check for mechanisms to do this automatically.
2. Realise that scraping 150 page counts by hand would be slow so I needed an alternative.
3. Dump my WordPress site to an Export XML file.
4. Stare at XML and slowly shake head. This would be hard to extract from without a good knowledge of Regular Expressions (which I was pretending not to have) or Python/Perl-fu (which I could pretend to have, in order to then pretend not to have it, but my Fu is weak these days).
5. Drag Nathan Yau’s Visualize This down from the shelf of Design and Visualisation books in my study.
6. Read Chapter 2, Handling Data.
7. Download and install Beautiful Soup, an HTML and XML parsing package that does most of the hard work for you. (Instructions in Visualize This)
8. Start Python.
9. Read the XML file into Python.
10. Load up the Beautiful Soup package. (The version mentioned in the book is loaded up in a different way to mine so I had to re-engage my full programming brain to find the solution and make notes.)
11. Muck around until I had extracted what I wanted while using Python in interpreter mode (very, very cool and one of my favourite Python features).
12. Write an 11 line program to do the extraction of the words, counting them and adding them (First year programming level, nothing fancy).
A number of you seasoned coders and educators out there will be staring at points 11 and 12, with a wavering finger, about to say “Hang on… have you just smoothed over about an hour plus of student activity?” Yes, I did. What took me a couple of minutes could easily be a 1-2 hour job for a student. Which is, of course, why it’s useful to do this, because you find out things like the fact that Beautiful Soup is called bs4 when it’s a locally installed module on OS X – which has obviously changed since Nathan wrote his book.
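For anyone who wants to try the extraction step, here is a minimal sketch of the idea. It is not the original 11 line program and it deliberately swaps Beautiful Soup for Python's standard-library `xml.etree.ElementTree`, so that it runs with nothing installed. The tiny `SAMPLE_EXPORT` string, the function name `word_counts` and the assumption that each post body sits in a `content:encoded` element inside an `item` are all illustrative; a real export file has many more fields. Words are counted by splitting on whitespace, so typed hyperlinks and image tags are included in the count.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the "content:encoded" element (an assumption
# about how the export file stores each post body).
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

# A tiny, made-up stand-in for the real export file.
SAMPLE_EXPORT = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>First post</title>
      <content:encoded><![CDATA[Hello world, this is a post.]]></content:encoded>
    </item>
    <item>
      <title>Second post</title>
      <content:encoded><![CDATA[Three more words]]></content:encoded>
    </item>
  </channel>
</rss>
"""


def word_counts(xml_text):
    """Return one whitespace-separated word count per <item> in the export."""
    root = ET.fromstring(xml_text)
    counts = []
    for item in root.iter("item"):
        # An item with no body counts as zero words.
        body = item.findtext("content:encoded", default="", namespaces=NS)
        counts.append(len(body.split()))
    return counts


counts = word_counts(SAMPLE_EXPORT)
total = sum(counts)
print("posts:", len(counts))
print("total words:", total)
print("average words per post:", total / len(counts))
```

To use it on a real export, read the file's text and pass it to `word_counts` in place of `SAMPLE_EXPORT`.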
Now, a good play with data would be incomplete without a side trip into the tasty world of R. I dumped out the values that I obtained from word counting into a Comma Separated Value (CSV) file and, digging around in the R manual, Visualize This, and Data Analysis with Open Source Tools by Philipp Janert (O’Reilly), I did some really simple plotting. I wanted to see if there was any rhyme or reason to my posting, as a first cut. Here’s the first graph of words per post. The vertical axis is the number of words and the horizontal axis is the post number. So, reading left to right, you’ll see my development over time.
Sadly, there’s no pattern there at all – not only can’t we see one by eye, the correlation tests of R also give a big fat NO CORRELATION.
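The check itself was run in R and the post does not say which test was used. As a rough illustration only, here is a Python sketch that assumes a plain Pearson correlation between post number and word count. The word counts are made up for the example and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up word counts, in posting order (not the real blog data).
words_per_post = [650, 720, 480, 910, 530, 700, 610, 840]
post_numbers = list(range(1, len(words_per_post) + 1))

r = pearson_r(post_numbers, words_per_post)
print("correlation between post number and length:", round(r, 3))
```

A value near 0 means post length is not drifting steadily up or down as the post number increases, while values near 1 or -1 would indicate a steady trend.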
Now, here’s a graph of the moving average over a 5 day window, to see if there is another trend we can see. Maybe I do have trends, but they occur over a larger time?
Uh, no. In fact, this one is worse for overall correlation. So there’s no real pattern here at all but there might be something lurking in the fine detail, because you can just about make out some peaks and troughs. (In fact, mucking around with the moving average window does show a pattern that I’ll talk about later.)
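For reference, a moving average with a window of 5 takes the mean of each run of five consecutive values. A minimal Python sketch, assuming a simple unweighted average over consecutive entries in the list (the original was done in R, and the function name here is hypothetical):

```python
def moving_average(values, window=5):
    """Return the mean of each run of `window` consecutive values.

    The result has len(values) - window + 1 entries, or none at all if the
    list is shorter than the window.
    """
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up word counts, in posting order (not the real blog data).
words_per_post = [650, 720, 480, 910, 530, 700, 610, 840]
print(moving_average(words_per_post, window=5))
```

Changing `window` is the "mucking around" knob: a wider window smooths out more of the post-to-post noise at the cost of fewer points.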
However, those of you who are used to reading graphs will have noticed something about the axis label for the x-axis. It’s labelled as wp$day. This would imply that I was plotting post day versus average or count and, of course, I’m not. There have not been 151 days since January the 1st, but there have been days when I have posted multiple times. At the moment, for a number of reasons, this isn’t clear to the reader. More importantly, the day on which I post is probably going to have a greater influence on me as I will have different access to the Internet and time available. During SIGCSE, I think I posted up to 6 times a day. Somewhere, this is lost in the structure of the data that considers each post as an independent entity. Posts consume time and, as a result, a longer post on the same day will reduce the chances of another long post on the same day – unless something unusual is going on.
There is a lot more analysis left to do here and it will take more time than I have today, unfortunately. But I’ll finish it off next week and get back to you, in case you’re interested.
What do I need to do next?
- Relabel my graphs so that it is much clearer what I am doing.
- If I am looking for structure, then I need to start looking at more obvious influences and, in this case, given there’s no other structure we can see, this probably means time-based grouping.
- I need to think what else I should include in determining a pattern to my posts. Weekday/weekend? Maybe my own calendar will tell me if I was travelling or really busy?
- Establish if there’s any reason for a pattern at all!
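As a starting point for the time-based grouping in the list above, here is a small Python sketch that groups posts by publication date, so that days with several posts show up as one entry, and flags weekends. The dates and word counts are invented for the example, and the names `summarise_by_day` and `is_weekend` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Made-up (publication date, word count) pairs, not real data from the blog.
posts = [
    (date(2000, 1, 1), 400),
    (date(2000, 1, 1), 600),
    (date(2000, 1, 3), 750),
]


def summarise_by_day(posts):
    """Return {day: {"posts": n, "words": total}} for each posting day."""
    per_day = defaultdict(lambda: {"posts": 0, "words": 0})
    for day, words in posts:
        per_day[day]["posts"] += 1
        per_day[day]["words"] += words
    return dict(per_day)


def is_weekend(day):
    """True for Saturday and Sunday (weekday() is 5 or 6)."""
    return day.weekday() >= 5


summary = summarise_by_day(posts)
for day in sorted(summary):
    info = summary[day]
    label = "weekend" if is_weekend(day) else "weekday"
    print(day.isoformat(), label, info["posts"], "posts,", info["words"], "words")
```

Plotting words per day, or posts per day, against the actual date would also make the x-axis label honest.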
As a final note, novels ‘officially’ start at a count of 40,000 words, although they tend to fall into the 80-100,000 range. So, not only have I written a novel in the past 4 months, I am most likely on track to write two more by the end of the year, because I will produce roughly 160-180,000 more words this year. This is not the year of blogging, this is the year of a trilogy!
Next year, my blog posts will all be part of a rich saga involving a family of boy wizards who live on the wrong side of an Ice Wall next to a land that you just don’t walk into. On Mars. Look for it on Amazon. Thanks for reading!