SIGCSE Day 2, Keynote 2, “Transforming US Education with Computer Science”, (#SIGCSE2014)
Posted: March 8, 2014 Filed under: Education | Tags: advocacy, blockly, Code.org, computer science, CSEdWeek, education, flappy bird, Hadi Partovi, higher education, Hour of Code, Kansas City, learning, Partovi, programming, sigcse, SIGCSE Special Projects, SIGCSE2014, teaching, US education Leave a commentToday’s keynote, “Transforming US Education with Computer Science”, is being given by Hadi Partovi from Code.org. (Claudia and I already have our Code.org swag stickers.)
There are 1257 registered attendees so far, which gives you some idea of the scale of SIGCSE. This room is pretty full and it’s got a great vibe. (Yeah, yeah, I know, ‘vibe’. If that’s the worst phrase I use today, consider yourself lucky, D00dz.) The introductory talk included a discussion of the SIGCSE Special Projects small grant program (to US$5,000). They have two rounds a year so go to SIGCSE’s website and follow the links to see more. (Someone remind me that it’s daylight saving time on Saunday morning, the dreaded Spring forward, so that I don’t miss my flight!)
SIGCSE 2015 is going to be in Kansas City, by the way, and I’ve heard great things about KC BBQ – and they have a replica of the Arch de Triomphe so… yes. (For those who don’t know, Kansas City is in Missouri. It’s name after the river which flows through it, which is named after the local Kansa tribe. Or that’s what this page says. I say it’s just contrariness.) I’ve never been to Missouri, or Kansas for that matter, so I could tick off two states in the one trip… of course, then I’d have to go to Topeka, well just because, but you know that I love driving.
We started the actual keynote with the Hour of Code advertising movie. I did some of the Hour of Code stuff from the iOS app and found it interesting (I’m probably being a little over-critical in that half-hearted endorsement. It’s a great idea. Chill out, Nick!)
Hadi started off referring to last year’s keynote, which questioned the value of code.org, which started as a hobby. He decided to build a larger organisation to try and realise the potential of transforming the untapped resource into a large crop of new computer scientists.
Who.what is Code.org?
- A marketing organisation to make videos with celebrities?
- A coalition of tech companies looking for employees?
- A political advocacy group of educations and technologies?
- Hour of code organisers?
- An SE house that makes tutorials
- Curriculum organisers?
- PD organisation?
- Grass roots movement?
It’s all of the above. Their vision is that every school should teach it to every student or at least give them the opportunity. Why CS? Three reasons: job gap, under-represented students and CS is foundational for every student in the 21st Century. Every job uses it.
Some common myths about code.org:
- It’s all hype and Hour of Code – actually, there are many employees and 15 of them are here today.
- They want to go it alone – they have about 100 partners who are working with the,
- They are only about coding and learning to code – (well, the name doesn’t help) they’re actually about teaching fundamentals of Computer Science
- This is about the software industry coming in to tell schools how to do their jobs – no, software firms fund it but they don’t run the org, which is focused on education, down to the pre-school level
Hmm, the word “disrupt” has now been used. I don’t regard myself as a disruptive innovator, I’m more of a seductive innovator – make something awesome and you’ll seduce people across to it, without having to set fire to anything. (That’s just me, though.)
Principle goals of Code.org start with “Educate K-12 students in CS throughout the US”. That’s their biggest job. (No surprise!) Next one is to Advocate to remove legislative barriers and the final pillar is to Celebrate CS and change perceptions.
Summary of first year – hour of code, 28 million students in 35,000 classrooms with 48% girls (applause form the audience), in 30 languages over 170 countries. 97% positive ratings of the teacher experience versus 0.2% negative. In their 20 hour K-8 Intro Course, 800,000 students in 13,000 students, 40% girls. In school district partnerships they have 23 districts with PD workshops for about 500 teachers for K-12. In their state advocacy role, they’ve changed policy in 5 states. Their team is still pretty lean with only 20 people but they’re working pretty hard with partnerships across industry, nonprofit and government. Hadi also greatly appreciated the efforts of the teachers who had put in the extra work to make this all happen in the classroom.
They’re working on a full curriculum with 20 hour modules all the way up to middle school, aligned with common core. From high school up, they go into semester courses. These course are Computer Science or leverage CS to teach other things, like maths. (Obviously, my ears pricked up because of our project with the Digital Technologies National Curriculum project in Australia.)
The models of growth include an online model, direct to teachers, students and parents (crucial), fuelled by viral marketing, word-of-mouth, volunteers, some A/B testing, best fit for elementary school and cost effectiveness. (On the A/B testing side, there was a huge difference in responses between a button labelled “Start” and a button labelled “Get started”. Start is much more successful! Who knew?) Attacking the problem earlier, it’s easy to get more stuff into the earlier years because they are less constrained in requirements to teach specific content.
The second model of growth is in district partnerships, where the district provides teachers, classrooms and computers. Code.org provide stipends, curriculum, marketing. Managing costs for scale requires then to aim for US$5-10K per High School, which isn’t 5c but is manageable.
The final option for growth is about certification exams, incentives, scholarships and schools of Ed.
Hadi went on to discuss the Curriculum, based on blockly, modified and extended. His thoughts on blended learning were that they achieved making learning feel like a game with blended learning (The ability to code Angry Birds is one of the extensions they developed for blackly) On-line and blended learning also makes a positive difference to teachers. On-line resources most definitely don’t have to remove teachers, instead, done properly, they support teachers in their ongoing job. Another good thing is to make everything web-based, cross-browser, which reduces the local IT hassle for CS teachers. Rather than having to install everything locally, you can just run it over the web. (Anyone who has ever had to run a lab knows the problem I’m talking about. If you don’t know, go and hug your sys admin.) But they still have a lot to learn: about birding game design and traditional curriculum, however they have a lot of collaborations going on. Evaluation is, as always, tricky and may combine traditional evaluation and large-scale web analytics. But there are amazing new opportunities because of the wealth of data and the usage patterns available.
He then showed three demos, which are available on-line, “Building New Tutorial Levels”, new tutorials that show you how to create puzzles rather than just levels through the addition of event handing (with Flappy Bird as the example), and the final tutorial is on giving hints to students. (Shout outs to all of the clear labelling of subgoals and step achievement…) That last point is great because you can say “You’re using all the pieces but in the wrong way” but with enough detail to guide a student, adding a hint for a specific error. There are about 11,000,000 submissions for providing feedback on code – 2,000,000 for correct, 9,000,000 for erroneous. (Code.org/hints)
So how can you help Code.org?
If tour in a Uni, bring a CS principles course to the Uni, partner with your school of Ed to bring more CS into the Ed program (ideally a teaching methods course). Finally, help code. org scale by offering K-5 workshops for them. You can e-mail univ@code.org if you’re interested. (Don’t know if this applies in Australia. Will check.) This idea is about 5 weeks old so write in but don’t expect immediate action, they’re still working it out.
If you’re just anyone, Uni or not? Convince your school district to teach CS. Code.org will move to your region in if 30+ high schools are on board. Plus you can leap into and give feedback on the curriculum or add hints to their database. There are roughly a million students a week doing Hour of Code stuff so there’s a big resource out there.
Hadi moved on to the Advocate pillar. Their overall vision is that CS is foundational – a core offering one very school rather than a vocational specialisation for a small community. The broad approach is to change state policy. (A colleague near me muttered “Be careful what you wish for” because that kind of widespread success would swamp us if we weren’t prepared. Always prepare for outrageous success!)
At the national level, there is a CS Education Act with bi-partisan sponsors in both house, to support STEM funding to be used as CS, currently before the house. In the NCAA, there’s a new policy published from an idea spawned at SIGCSE, apparently by Mark! CS can now count as an NCAA scholarship, which is great progress. At the state level, Allowing CS to satisfy existing high school math/science graduation requirements but this has to be finalised with the new requirement for Universities to allow CS to meet their math/science requirements as well! In states where CS counts, CS enrolment is 50% higher (Calc numbers are unchanged), with 37% more minority representation. The states with recent policy changed are are small but growing. Basically, you can help. Contact Code.org if your state or district has issues recognising CS. There’s also a petition on the code.org site which is state specific for the US, which you can check out if you want to help. (The petition is to seek recognition that everyone in the US should have the opportunity to learn Computer Science.)
Finally, on the Celebrate pillar, they’ve come a long way from one cool video, to Hour of Code. Tumblr took 3.5 years to reach 15,000,000, Facebook took 3 years, Hour of Code took 5 days, which is very rapid adoption. More girls participated in CS in US schools in one week than in the previous 70 years. (Hooray!) And they’re doing it again in CSEd Week from December 8-14. Their goal is to get 100 million students to try the Hour of Code. See if you can get it on the Calendar now – and advertise with swag. 🙂
In closing, Hadi believes that CS is at an incredible inflection pint, with lots of opportunities, so now is the time to try stuff or, if it didn’t work before, try it again because there’s a lot of momentum and it’s a lot easier to do now. We have growing and large numbers. When we work together towards a shared goal, anything is possible.
Great talk, thanks, Hadi!
SIGCSE 2014 Day 2, About to start, (#SIGCSE2014)
Posted: March 7, 2014 Filed under: Education | Tags: education, higher education, SIGCSE2014, teaching Leave a commentThe hall’s starting to fill up as everyone gets ready for the second keynote. Good to see so many people are still here, although how many people will be here for my workshop on Saturday afternoon is probably another matter!
SIGCSE 2014: Collecting and Analysing Student Data 1, Paper 3, Thursday 3:15 – 5:00pm (#SIGCSE2014)
Posted: March 7, 2014 Filed under: Education | Tags: codebrowser, education, education research, higher education, learning, novices, SIGCSE2014, teaching, thinking Leave a commentOk, this is the last paper, with any luck I can summarise it in four words so you don’t die of work poisoning. The final talk was “Using CodeBrowser to Seek Difference Between Novice Programmers” by Kenny Heinonen, Kasper Hirvikoski, Matti Luukkainen, and Arto Vihavainen, from the University of Helsinki. I regret to say that due to some battery issues, this blog is probably going to be cut short. My apologies to the speakers!
The takeaway from the talk is that CodeBrowser is a fancy tool for identifying challenges that students face as they are learning to program. it sues your snapshot data and, if you have lots of students, course outcomes and another measures should be used to find a small number of students to analyse first. (Oh, and penguins are cool.)
Helsinki has hundreds of locals and thousands of MOOC participants learning to program, recording student progress as they learn to program. The system is built on top of NetBeans and provides scaffolding for students as they learn to program. Ok, so were recording the students’ progress but so what? Well, we have snapshots with time and source and we can use this to identify students at risk of dropping CS1 and a parallel maths course. (Retention and early drop-out? Subjects close to my heart!) It can also be used to seek insight into the problems that students are facing. There are not a great many systems that allow you to analyse and visualise code snapshots, apparently.
Looks interesting, I’ll have to go and check it out!
Sorry, battery is going, committing before this all goes away!
SIGCSE 2014: Collecting and Analysing Student Data 1, Paper 2, Thursday 3:15 – 5:00pm (#SIGCSE2014)
Posted: March 7, 2014 Filed under: Education | Tags: blackbox, BlueJ, education, education research, higher education, learning, novices, programming, sigcse, SIGCSE2014 Leave a commentWhoo! I nearly burnt out a digit writing up the first talk but it’s a subject close to my heart. I’ll try to be a little more terse for these next two talks.
The second talk in this session was “Blackbox: A Large Scale Repository of Novice Programmers’ Activity” by the amazing Blackbox team at Kent, Neil Brown, Michael Kölling, Davin McCall, and Ian Utting. The Blackbox data is the anonymised student data from students coding into the BlueJ Java programming environment. It’s a rich source of information on how students code and Mark and I have been scheming to do something with the Blackbox data for some time. With Ian and Neil here, it’s a good opportunity to steal their brains. I tried to get Ian to agree to doing all the work but it turns out that he’s been in the game long enough to not say “yes” when someone asks him to without context. (Maybe it’s just me.)
Michael was presenting, with some help from Neil, and reviewed the relationship between Blackbox and BlueJ. BlueJ is an educational programming environment for CS education using Java, dating back to the original Blue in 1996. (For those who don’t know, that’s old for this kind of thing. We should throw it a party.) BlueJ is a graphically operated development environment so novice programmers can drag things out to build programs. It’s a well-established and widely used environment.
(Hey, that means BlueJ is 18. Someone buy BlueJ a beer.)
BlueJ has about 2,000,000 users in 2013, who use it for about three months and then move on (it’s not a production tool, it’s a learning environment). The idea of Blackbox came out of SIGCSE sessions about three years ago where some research questions were raised, nice set-up, good design and really small student groups. One of our common problems is having enough students to actually do a big study and, frankly, all of us are curious about how students code. (It’s really hard to tell this from the final program, trust me.) So BlueJ has lots of users, can we look at their data and then share this with people?
Of course, the first question is “what do we collect?” Normally, we’d collect what we need to answer a research question but this data was going to be used to support lots of different (and currently unasked) research questions. The community was consulted at SIGCSE in 2012 but there has been an evolution of this over time. There are a lot of things collected – go and look at them in the paper because Michael flicked past that slide! 🙂
From an ethical standpoint, participation is an explicit decision made by the student to have their data collected or not. (This does raise the spectre of bias, especially as all the students must be over 16 for legal reasons.) So it’s opt in and THEN anonymised just to make it totally tasty from an ethical perspective.
Session data is collected for each session: start time, end time, project, path and userID (centrally anonymised for tracking).
So much for keeping it short, hey? Here’s a quick picture to give you a break.

Other things that can be captured are object creation and invocation among many other useful measures. For me, the fact that you can see how and when students are testing is fascinating, as it allows us to evaluate the whole expectation, observation and reflection scientific cycle in action.
The Blackbox project has already been running for 9 months. The opt-in rate is 40% (higher than I or anyone else expected). This means that there’s data from 250,000 users, recording roughly 11 events per second, over more than 1,000,000 projects and 20,000,000 compilations. What a fantastic resource! Michael then handed over to Neil to talk about the challenges.
Neil talked about tracking users, starting front he problem that one machine profile does not necessarily correspond to one user. Another problem is anonymisation, stripping project paths and the code where possible. You can’t guarantee anonymisation, because people sometimes use their own names as variable or class names, but they do what they can. It’s made clear on opt-in what’s going to happen. Data integrity is another challenge. Is it complete? No – it’s client side and there’s no guarantee of completeness, or even connectivity. But the Data that you do have for each session you is consistent. So the data is consistent but not complete. If you want, locally, you can tie your local data to the Blackbox data that they have on your students but then ethics becomes your problem. This can be done by Experiment and Participant Identifiers as part of the set-up so your students can be grouped. More example mini analyses are in the paper.
Looking at Error Frequency, Neil talked about certain errors and how their frequency changes over the weeks of 2013 (Semicolon expected, unknown variable). Over time, the syntax errors decreased (suggesting a learning effect) but others stay more constant.
The data is not completely open, and you need to request access as a researcher, sign a privacy and access restriction agreement. Students need not apply! There’s a SIGCSE workshop on this Saturday but I can’t go as my Puzzle Based Learning workshop is on at the same time. Great resource, go and check it out!
The final talk was “Using CodeBrowser to Seek Difference Between Novice Programmers” by Kenny Heinonen, Kasper Hirvikoski, Matti Luukkainen, and Arto Vihavainen, University of Helsinki,
SIGCSE 2014: Collecting and Analysing Student Data 1, “AP CS Data”, Thursday 3:15 – 5:00pm (#SIGCSE2014)
Posted: March 7, 2014 Filed under: Education | Tags: AP Calculus, AP CS, AP exams, computer science education, demographics, education, educational research, higher education, learning, mark guzdial, sigcse, SIGCSE2014, teaching 2 CommentsThe first paper, “Measuring Demographics and Performance in Computer Science Education at a Nationwide Scale using AP CS Data”, from Barbara Ericson and Mark Guzdial, has been mentioned in these hallowed pages before, as well as Mark’s blog (understandably). Barb’s media commitments have (fortunately) slowed down btu it was great to see so many people and the media taking the issue of under-representation in AP CS seriously for a change. Mark presented and introduced the Advanced Placement CS program which is the only nationwide measure of CS education in the US. This allows us tto use the AP CS to compare with other AP exams, find who is taking the AP CS exams and how well do they perform. Looking longitudinally, how has this changed and what influences exam-taking? (There’s been an injection of funds into some states – did this work?)
The AP are exams you can take while in secondary school that gives you college credit or placement in college (similar to the A levels, as Mark put it). There’s an audit process of the materials before a school can get accreditation. The AP exam is scored 1-5, where 3 is passing. The overall stats are a bit worrying, when Back, Hispanic and female students are grossly under-represented. Look at AP Calculus and this really isn’t as true for female students and there is better representation for Black and Hispanic students. (CS has 4% Black and 7.7% Hispanic students when America’s population is 13.1% Black students and 16.9% Hispanic) The pass rates for AP Calculus are about the same as for AP CS so what’s happening?
Looking at a (very cool) diagram, you see that AP overall is female heavy – CS is a teeny, tiny dot and is the most male dominated area, and 1/10th the size of calculus. Comparing AP CS to the others. there has been steady growth since 1997 in Calculus, Biology, Stats, Physics, Chem and Env Science AP exams – but CS is a flat, sunken pancake that hasn’t grown much at all. Mark then analysed the data by states, counting the number of states in each category along features such as ‘schools passing audit/10K pop’, #exams/pop and % passing exams. Mark then moved onto diversity data: female, Black and Hispanic test takers. It’s worth noting that Jill Pala made the difference to the entire state she taught in, raising the number of women. Go, Jill! (And she asked a really good question in my talk, thanks again, Jill!)
How has this changed overtime? California and Maryland have really rapid growth in exam takers over the last 6 years, with NSF involvement. But Michigan and Indiana have seen much less improvement. In Georgia, there’s overall improvement, but mostly women and Hispanic students, but not as much for Black students. The NSF funding appears to have paid off, GA and MA have improved over the last 6 years but female test takers have still not exceeded 25% in the last 6 years.
Why? What influences exam taking?
- The wealth in the state influences number of schools passing audit
- Most of the variance in the states comes from under-representation in certain groups.
It’s hard to add wealth but if you want more exam takers, increase your under-represtentation group representation! That’s the difference between the states.
Conclusions? It’s hard to compare things most of the time and the AP CS is the best national pulse we have right now. Efforts to improve are having an effect but wealth matters, as in the rest of education.
All delivered at a VERY high speed but completely comprehensible – I think Mark was trying to see how fast I can blog!
SIGCSE 2014: Research: Concept Inventories and Neo-Piagetian Theory, Thursday 1:45-3:00pm (#SIGCSE2014)
Posted: March 7, 2014 Filed under: Education | Tags: concept inventory, education, education research, higher education, learning, peer instruction, SIGCSE2014, teaching, thinking Leave a commentThe first talk was “Developing a Pre- and Post- Course Concept Inventory to Gauge Operating Systems Learning” presented by Kevin Webb.
Kevin opened by talking about the difficulties we have in sharing our comparison of student learning behaviour and performance. Assessment should be practical, technical, comprehensive, and, most critically, comparable so you can compare these results across instructors, courses and institutions. It is, as we know, difficult to compare homework and lab assignments, student surveys and exam results, for a wide range of reasons. Concept inventories, according to Kevin, give us a mechanism for combining the technical and comparable aspects.
Concept inventories are short, standardised exempts to deal with high-levbe conceptual take-awaks to reveal systematic misconceptions, MCQ format, deployed before and after courses. You can supplement your courses with the small exam to see how student learning is progressing and you can use this to compare performance and learning between classes. The one you’ve probably heard of is the Physics Force Concept Inventory, which Mazur talks about a lot as it was the big motivator for Peer Instruction to address shallow conceptual learning.
There are two Concept Inventories for CS but they’re not publicly available or even maintained anymore but, when they were run, students were less successful than expected – 40-60% of the course was concepts were successfully learned AFTER the course. If your students were struggling with 40% of the key concepts, wouldn’t you like to know?
This work hopes to democratise CI development, using open source principles. (There is an ITiCSE paper coming soon, apparently.) This work has some preliminary development of a CI for Operating Systems.
Goals and challenges included dealing with the diversity of OS courses and trading off which aspects would best fit into the CI. The researchers also wanted it to be transparent and flexible to make questions available immediately and provide a path (via GitHub) for collaboration and iteration. From an accessibility perspective, developing questions for a universal pre-test is hard, and the work is based in the real world where possible.
An example of this is paging/caching replacement, because of the limited capacity of some of these storage mechanism, so the key concept is locality, with an “evict oldest” policy. What happens if the students don’t have the vocabulary of a page table or staleness yet? How about an example of books on your desk, via books on a shelf, via books in the library? (We used similar examples in our new course to explain memory structures in C++ with a supermarket and the various shelves.)
Results so far indicate that taking the OS course improved performance (good) but not all concepts showed an equal increase – some concepts appear to be less intuitive than others. Student confidence increased, even where they weren’t getting the right answers. Scenario “word problems” appear to be challenging to students and opted for similar, less efficient solutions. (This may be related to the “long document hard to read” problem that we’ve observed locally.)
The next example was on indirection with pointers where simplifying the pointer chain was something students intuitively did, even where the resulting solution was sub-optimal. This was tested by asking two similar questions on the exam, where the first was neutrally stated as a “should we” and the second asked them to justify the complexity of something, which gave them a tip as to where the correct answer lay.
Another example, using input/output and polling, presenting the device without a name deprived the students of the ability to use a common pattern. When, in an exam, the device was named (as a disk) then the correct answer was chosen, but the reasoning behind the answer was still lacking – so they appear to be pattern matching, rather than thinking to the answer. From some more discussion, students unsurprisingly appear to choose solutions that match what they have already seen – so they will apply mutexes even in applications where it’s not needed because we drown them in locks. Presenting the same problem without “constricting” names as a code examples, the students could then solve the problem correctly, without locks, despite almost all of them wanting to use locks earlier.
Interesting talk with a fair bit to think about. I need to read the paper! The concept inventory can be found at https://github.com/osconceptinventory” and the group welcome collaboration so go and … what’s the verb for “concept inventory” – inventorise? Anyway, go and do it! (There was a good reminder in question time to mine your TAs for knowledge about what students come to talk to them about – those areas of uncertainty might be ripe for redevelopment!)
The next talk was “Misconceptions and Concept Inventory Questions for Hash Tables and Binary Search Trees” presented by Kuba Karpierz ( a senior Computer Science student at the University of British Columbia). Kuba reviewed the concept inventory concept for newcomers to the room. (Poor Kuba was slightly interrupted by a machine shutdown that nearly broke his presentation but carried on with little evidence of problem and recovered it well.) The core properties of concept inventories are that they must be brief and multiple choice at least.
Students found hash table resizing to be difficult so this was nominated as a CI question. Students would sketch the wrong graph for resizing, ignoring the resize cost and exaggerating the curve shape of what should be a linear increase.The team used think aloud exercises to explain why students picked the wrong solution. Regrettably, the technical problems continued and made it harder to follow the presentation.
A large number of students had no idea how to resize the hash table (for reasons I won’t explain) but this was immediately obvious after the concept inventory exam, rather than having to dig it out of the exams. The next example was on Binary Search Trees and the misconception that they are are always balanced. (It turns out that students are conflating them with heaps.) Looking at the CI MCQs for this, it’s apparent that we were teaching with these exemplars in lectures, but not as an MCQ short exam. Food for thought. The example shown did make me think because it was deliberately ambiguous. I wondered if it would be better if it were slightly less challenging and the students could pick the right answer. Apparently they are looking at this in a different question.
The final talk was “Neo-Piagetian Theory as a Guide to Curriculum Analysis”, presented by Claudia Szabo, from our Computer Science Education Research group. This is the work that we’re using as the basis for the course redesign of our local Object Oriented Programming course so I know this work quite well! (It’s nice to see theory being put into practice, though, isn’t it?)
Claudia started with a discussion of curriculum analyse – the systematic processes that we use to guide teachers to identify instructional goals and learning objectives. We develop, we teach, we observe and we refine, but this refinement may lead to diversion from the originally stated goals. The course loses focus and structure, and possibly even lose its scaffolding. Claudia’s paper has lots of good references for the various theory areas so I won’t reproduce it here but, to get back to the talk, Claudia covered the Piagetian stages of cognitive development in the child: sensorimotor, pre-operational, concrete operational and formal operational. In short, you can handle concepts in pre-, can perform logic and solve for specific situations in concrete but only get to abstract thought and true problem-sovling in the formal operational mode. (Pre-operations is ages 2-7, concrete is 7-11 and formal is 11-15 by the time it is achieved. This is not a short process but also explains why we teach things differently at different age groups.)
Fundamentally, Neo-Piagetian theory starts from the premise that the cognitive developmental stages that humans go through during childhood are seen again as we learn very new and different concepts in new contexts, including mathematics and computer science, exhibited in the same stages. Ultimately, this means places limitations on the amount of abstraction versus concrete reasoning that students can apply. (Without trying to start an Internet battle, neo-Piagetian theory is one of the theories in this space, with the other two that I generally associate being Threshold Concepts and Learning Edge Momentum – we’re going to hold a workshop in Australia shortly to talk about how these intersect, conflict and agree, but I digress.)
So this peer is looking to analyse learning and teaching activities to determine the level at which we are teaching it and the level at which we are assessing it – this should allow us to determine prerequisite concepts (concept is tested before being taught) and assessment leaps (concept is assessed at a level higher than we taught it). The approach uses an ACM CS curriculum basis, combined with course-secific materials, and a neo-Piaget taxonomy to classify teaching activities to work out if we have not provided the correct pre-requisite material or whether we are assessing at a higher level than we taught students (or we provided a learning environment for them to reach that level, if we’re being precise). There’s a really good write-up in the paper to show you how conceptual handling and abstraction changes over the developmental stages.
For example, in representational systems a concrete explanation of memory allocation is “memory allocation is when you use the keyword new to create a variable”. In a familiar Single Abstraction, we could rely upon knowledge of the programming language and the framework to build upon the memory allocation knowledge to explain how memory allocation dynamically requests memory from the free store, initialises it and returns a pointer to the allocated space. If the student was able to carry out Single Abstraction on the global level, they would be able to map their knowledge of memory allocation in C++ into a new language such as Java. As the student developed, they can map abstractions to a global level, so class hierarchies in C++ can be mapped into similar understanding in Java, for example.
The course that was analysed, Object Oriented Programming, had a high failure rate, and students were struggling in the downstream course with fundamental concepts that we thought we had covered in OOP. So a concept definition document was produced to give a laundry list of concepts (Pro tip: concept inventories get big quickly. Be ruthless in your trimming.) For the selected concepts, the authors looked to see where it was taught, how it was taught and then how it was assessed. This quickly identified problems that needed to be fixed. One example is that the important C++ concept of Strings, assessment had been carried out before the concrete operational teaching had taken place! We start to see why the failure rate had been creeping up over time.
As the developer, in association with the speaker, of the new OOP course, this framework is REALLY handy because you are aways thinking “How am I teaching this? Can I assess it at this level yet?” If you do this up front then you can design a much better course, in my opinion, as you can move around the course to get things in the right order at the right time and have enough time to rewrite materials to match the levels. It doesn’t actually take that long to run over the course and it clearly visualises where our pitfalls are.
Next on the table is looking at second and third year courses and improving the visualisation – but I suspect I may have to get involved in that one, personally.
Good session! Lots of great information. Seriously, if you’re not at SIGCSE, why aren’t you here?
SIGCSE 2014: Automated Assessment Session, Thursday 10:45-12:00
Posted: March 7, 2014 Filed under: Education | Tags: automated assessment, education, education research, higher education, learning, SIGCSE2014, teaching Leave a commentThis session was the one I spoke in and I think it went well. Lots of good questions, which is always handy, and I can only hope that the answers made sense! The next talk was “Adaptively Identifying Non-Terminating Code when Testing Student Programs” presented by Stephen Edwards.
How do we handle infinite loops in student testing? Killing the process works but what happens to later tests if we use a timeout-based termination? What happens to the data from earlier tests? What we’re doing is wasting time up to the timeout. Stephen put the wasted time at 99.2 hours of cumulative delay in the 2012-2013 academic year, over nearly 9,000 loop cases. Coarse timeout would have resulted in the loss of any results from these programs.
(This is a problem close to my heart, so I was listening intently!) Stephen talked about using JUnit 4 rules, where you can add timeouts to a given rule, but these have to be added to every test class, it’s only in 4 not JUnit 3 and a single flat timeout can still cause delays. So, sadly, we can’t use this solution to address our key concerns. So they built off the JUnit 4 rules but wanted to:
- create adaptive timeout rules
- extend Junit to run Junit3-style tests under JUnit4
- Automatically inject the timeout rule in every test class transparently
The adaptive rule starts with a fixed timeout and then adapt it. I didn’t quite follow some of this so I’ll have to read the paper. There are hard upper and lower bounds on the time limits and are customisable, with the time taken being roughly equivalent to that of the slowest terminating code. They’ve now developed the unit and integrated it with their existing code.
To evaluate it, they deleted a single data structures programming assignment with 4,214 program submissions and regraded them using the new approach. 82 instructor-written references tests (!!!) resulting in 345,456 test executions (that’s a very funny number!). A very small number of tests caused very large problems for students – 2 students had previously received no feedback at all because everything that they did had an infinite loop in it!
One of the questions asked how you bootstrap the initial timeout periods – data driven would be ideal but, without any data, there’s a problem. Stephen wants to do this experiment ut hasn’t had a chance to do it yet.
The next talk was “Can Computers Compare Student Code Solutions as Well as Teachers?” presented by Matheus Gaudencio, from the Software Practices Laboratory. They use a lot of automatic tests and code comparison so their first question was whether they, as teachers, had a similar way of examining and comparing code (the old “how many different marks can you get for the same essay” chestnut). They evaluated 11 teachers and generate a reference solution which the teachers had to compare to two sample solutions, based on which was the best approximation to the reference code. Results varied to a low of 62% agreement. From eyeballing his data, it looks like 75-80% agreement is the average.
Matheus then looked at other strategies, including token-based and tree-based approaches (out of 7 different strategies), for computational comparison of code. There has to be a threshold (which the paper refers to as Delta) which allows some rubberiness in the similarity equations. The produced a hierarchal clustering tool, which can be found at http://relatedecode.appsot.com. If you’re interested in this you can contact Matheus at matheusgr@gmail.com
SIGCSE Keynote #1 – Computational Thinking For All, Robert M. Panoff, Shodor Education Foundation
Posted: March 7, 2014 Filed under: Education | Tags: bob panoff, education, higher education, keynote, learning, robert panoff, sigcse, SIGCSE2014, teaching, teaching approaches, thinking 2 CommentsBob Panoff is the wonder of the 2014 SIGCSE Award for Outstanding Contribution to Computer Science Education and so he gets to give a keynote, which is a really good way to do it rather than delaying the award winners to the next year.
Bob kicked off with good humour, most of which I won’t be able to capture, but the subtext of hits talk is “The Power and the Peril”, which is a good start to the tricky problem of Comp thinking for all. What do we mean by computational thinking? Well, it’s not teaching programming, we can teach programming to enhance computational thinking but thinking is the key word here. (You can find his slides here: http://shodor.org/talks/cta/)
Bob has faced the same problem we all have: that of being able to work on education when your institution’s focus is research. So he went to start an independent foundation where CS Ed where such activities could be supported. Bob then started talking about expectation management, noting that satisfaction is reality divided by expectations – so if you lower your expectations. (I like that and will steal it.)
Where did the name Shodor come from? Bob asked us if we knew and then moved to put us through a story, which would answer this question. As it turns out, he name came from a student’s ungenerous pattern characterisation of Bob, whose name he couldn’t remember, as “short and kinda dorky looking”.
I need to go and look at the Shodor program in detail because they have a lawyered apprenticeship model, teaching useful thinking and applied skills, to high schoolers, which fills in the missing math and 21st century skills that might prevent them from going further in education. Many of the Shodor apprentices end up going on as first-in-family to college, which is a great achievement.
Now, when we say Computational Science Education, is it Computational (Science Education) or (Computational Science) Education? (This is the second slide in the pack). The latter talks about solving the right problem, getting the problem solved in the right way and actually being right.
Right Answer = Wrong Answer + Corrections
This is one of the key issues in modelling over finite resources, because we have to take shortcuts in most systems to produce a model that will fit. Computationally, if we have a slightly wrong answer (because of digital approximations or so on), then many iterations will make it more and more wrong. If we remember to adjust for the corrections, we can still be right. How helpful is it to have an exact integral that you can’t evaluate, especially when approximations make that exact integral exceedingly unreliable? (The size of the Universe is not 17cm, for example.)
Elegant view of science: Expectation, Observation and Reflection. What do you expect to see? What do you see? What does it actually mean Programming is a useful thought amplifier because we can get a computer to do something BUT before you get to the computer, what do you expect the code to work and how will you now what it’s doing? Verification and validation are important job skills, along with testing, QA and being able to read design documents. Why? Because then you have to be able to Expect, Observe and Reflect. Keyboard skills do not teach you any of this and some programming ‘tests’ are more keyboard skills than anything else.
(If you ever have a chance to see Bob talk, get there. He’s a great speaker and very clever and funny at the same time.)
Can we reformable the scientific method and change the way that we explain science to people? What CAN I observe? What DO I observe? How do I know that it’s right? How am I sure? Why should I care? A lot of early work was driven by wonder (Hey, that’s cool) rather than hypothesis driven (which is generally what we’re supposed to be doing.) (As a very bad grounded theorist, this appeals.)
How do we produce and evaluate models? Well, we can have an exact solution to an exact model, an exact solution to an approximate model (not real but assessable), an approximate solution to an exact model and an approximate solution to an approximate model. Some of the approximation in the model is the computing itself, with human frailty thrown into the mix.
What does Computational Thinking allow you to? To build and explore a new world where new things are true and other things are false, because this new universe is interesting to us. “The purpose of computing is insight, not numbers” — R. Hamming, “If you can’t trust the numbers, you won’t get much insight” — R. Panoff. Because the computer is dumb, we have to do more work and more thinking to make up for the fast and accurate moron that does what we order it to do.
“Killing off the big lie” – every Math class you have, you see something on page 17 showing a graph and an equation which has “as you can see from the graph” starting it. Bob’s lament is that he CAN’T see from the graph and not many other people can either. We just say that but, many times, it’s a big lie. Pattern recognition and characterisation are more important than purely manipulating numbers. (All of this is on the Shodor website) Make something dynamic and interactive and student can explore, which allows them to think about what happens when they change things – set an expectation, observe and reflect, change conditions and do it again.
Going to teachers, they know that teaching mathematics is frequently teaching information repetitively with false rules so that simple assessment can be carried out. (Every histogram must have 10 bars and so many around the mean, etc) Using computing to host these sorts of problems allows us to change the world and then see what happens. Rather than worry about how long it takes students to produce one histogram on paper, they can make one in an on-line environment and play with it. There are better and worse ways to represent data so let’s use computational resources to allow everyone to do this, even when they’re learning. This all comes down to different models as well as different representations. (There is value to making kids work up a histogram by hand but there are many ways to do this and we can change the question and the support and remove the tedium of having to use paper and pen to do one, when we could use computing to do the dull stuff.)
Bob emphasised the importance of drawing pictures and telling stories, they hand-waving that communicates site complicated concepts to people. “What’s this?” “I don’t know but here comes a whole herd of them!”
The four things we need for computational thinking are: Quantitative Reasoning, Algorithm Thinking, Analogic Thinking, and Multi-scale Modelling. Bob showed an interesting example of calculating a known result when you don’t know the elements by calculating the relative masses of the Earth and Pluto using Google and just typing “mass of the earth / mass of pluto” Is this right? What is our reason for believing it? You would EXPECT things to be well-know but what do you OBSERVE? Hmm, time to REFLECT. (As the example, the earth mass value varies dramatically between sources – Google tells you where it gets the information but a little digging reveals that things don’t align AND the values may change over time. The answer varies depends upon the model you use and how you measure it. All of the small differences add up.)
The next example is the boiling point of Radium, given as 1,140C by Google, but the matching source doesn’t even agree with this! If you can’t trust the numbers then this is yet another source of uncertainty and error in our equations.
Even “=” has different interpretations – F = ma is the statement that force occurs as mass accelerates. In nRT = PV, we are saying that energy is conserved in these reactions. dR/dT = bR – the number of rabbits having bunnies will affect the rate of change of rabbits. No wonder students have trouble with what “s=3” means, on occasion. Speaking of meaning, Bob played this as an audio clip, but I attach the text here:
The missile knows where it is at all times. It knows this because it knows where it isn’t. By subtracting where it is from where it isn’t, or where it isn’t from where it is (whichever is greater), it obtains a difference, or deviation. The guidance subsystem uses deviations to generate corrective commands to drive the missile from a position where it is to a position where it isn’t, and arriving at a position where it wasn’t, it now is. Consequently, the position where it is, is now the position that it wasn’t, and it follows that the position that it was, is now the position that it isn’t.
In the event that the position that it is in is not the position that it wasn’t, the system has acquired a variation, the variation being the difference between where the missile is, and where it wasn’t. If variation is considered to be a significant factor, it too may be corrected by the GEA. However, the missile must also know where it was.
The missile guidance computer scenario works as follows. Because a variation has modified some of the information the missile has obtained, it is not sure just where it is. However, it is sure where it isn’t, within reason, and it knows where it was. It now subtracts where it should be from where it wasn’t, or vice-versa, and by differentiating this from the algebraic sum of where it shouldn’t be, and where it was, it is able to obtain the deviation and its variation, which is called error.
Try reading that out loud! Bob then went on to show us some more models to see how we can experiment with factors (parameters) in a dynamic visualisations in a way that allows us to problem solve. So schoolkids can reduce differential equations to simple statements relating change and then experiment – without having to know HOW to solve differential equations (what you have now is what you had then, modified by change). This is model building without starting with programming, it’s starting with modelling, showing what they can do and then exposing how this approach can be limited – which provides a motivation to learn how to program so you can fix the problems in this model.
Overall, an excellent talk about an interesting project attacking the core issue of getting students to think in the right way, instead of just getting them to conform to some dry mechanistic programming approaches. The National Computer Science Institute is doing work across the US (if they come and do a workshop, you have to give them a mug and they have a lot of mugs). NCSI are looking for summer workshop hosts so, if you’re interested, you should contact them (not me!) Here’s one of the quotes from the end:
“It was once conjectured that a million monkeys typing on a million typewriters could eventually produce all of the works of Shakespeare. Now, thanks to the Internet, we know that this is not true” (Bob Willinsky (possible attribution, spelling may be wrong))
What would happen if the Internet went away? That’s a big question and, sadly, Bob started to run out of time. Our world runs in parallel so we need to have be able to think in parallel as well. Distributed computation requires us to think in different ways and that gets hard, quickly.
Bob wrapped it up by saying that Shodor was a village, a lot of fun and was built upon a lot of funding. Great talk!
SIGCSE Best Paper Award Winner – Dr Claudia Szabo, University of Adelaide
Posted: March 7, 2014 Filed under: Education | Tags: education, education research, higher education, neopiaget, sigcse, SIGCSE2014, teaching, teaching approaches 1 CommentCongratulations to my colleague, friend and running partner, Dr Claudia Szabo on winning the SIGCSE Best Paper Award for a paper entitled “Student Projects are Not Throwaways: Teaching Practical Software Maintenance in a Software Engineer Course”. Claudia has three papers here because overachievement, but more seriously this is a fantastic achievement, especially for her first SIGCSE. This is also really useful research that has direct practical applications for people who are teaching Software Engineering AND we’re working together to build courses based on some of her earlier work on Neo-Piagetian analysis of existing courses.
Here’s a picture! Yay, Claudia!
SIGCSE 2014
Posted: March 7, 2014 Filed under: Education | Tags: blogging, community, education, higher education, sigcse, teaching, teaching approaches Leave a commentWell, another year, another SIGCSE. I’ll try to produce more short posts rather than infrequent brain dumps. (That was lucky, I caught the word bran before I posted…) I’ll also be tweeting so short thoughts will go over there and, with any luck, small essays will be here.
If you’re here at SIGCSE and want to meet, drop me a line.
Tweet me! @nickfalkner out in the Twitterverse.



