@macshonle@c.cim - Mastodon

14 September 2011

Interview on Keyvan.TV

I was recently interviewed on Keyvan.TV, where I talked about some of my thoughts on software engineering.

18 May 2011

Why I Will Randomly Assign Students in Group Projects

There's a disturbing pattern in the group projects in my software engineering classes. In the average group-- I'm not talking about the exceptional groups, I'm talking about the ones right in the middle-- I've noticed that there are generally only five types of members.

This is surprising because, at first thought, when you let a random group of intelligent and creative people self-organize, the result should be as interesting and varied as the people in the group. But instead of bringing out everyone's best qualities, it amplifies only a few specific qualities:

1. The Visionary. Giving students the freedom to pick their own projects is a huge burden! What would you expect when you come up to a student and say: "Quick, come up with a great idea right now, because your grade depends on it!" Out of five students, just one would love the burden. That student is The Visionary.

The Visionary never has problems coming up with great ideas. They tend to think big thoughts often, and kick around various ideas for years. When they see a course project as an opportunity to pursue one of these ideas, they jump at the chance. To get there, they'll enlist the help of...

2. The Code Monkey. The Visionary is already good friends with a Code Monkey, and respects how many languages the Code Monkey knows, and how many different graphics and networking libraries they've used. The Visionary doesn't want to do all of this work on their own, so they pick a competent peer they can trust. The Code Monkey always gets an A on programming assignments and the two quickly work out a deal: I'll do the write-ups and the presentation, and you do some coding spikes to see whether this idea is feasible.

Wait, but that's just two people, and the group still needs more. The next person to join is...

3. The Leech. The Leech is actually a great person. They respect people and the course, and they want to get something out of it. Specifically? They want to get an A. The Leech seeks out groups as they are forming and finds the group that they think the professor is most interested in. The Leech doesn't exactly want to gain at another's loss; they want to coast on another's gain.

The Leech typically knows The Visionary or The Code Monkey and is the "second pick" to join the group. The Visionary didn't ask The Leech first, because The Code Monkey was in higher demand. But the group needs to grow, so The Leech is accepted. The Leech's acceptance solidifies the group's mission, and already their roles are set in stone. Based on this solidifying service alone, The Visionary might be the one to approve of The Leech.

In a nice group, The Leech isn't even much of a Leech, and is more just an Understudy Code Monkey.

4. The Slacker always comes late. Groups by this point have already started to form into twos and threes, and time is running out before those left become "that group." You know, the group of people who are randomly assigned, because they just don't know enough people and so the only ones left are assigned to a group by default? Who wants to risk their grade with that!

The Slacker might've been a Visionary-in-Waiting, unable to convince anyone to follow their lead. Being too much of a leader to be a follower, The Slacker only reluctantly follows. The Slacker joins the group based not on what their different ideas are, but based on the path of least resistance. The Slacker is different from The Leech, because they aren't as engaged. Even though The Leech wants to coast on the work of others, The Leech knows how key it is for the group to be strong. The Leech is engaged by giving The Visionary all of the social support they need. The Slacker can pull the group the other way. The Slacker might suggest ideas and changes only because it would be easier for their particular circumstances, not because it would lead to the best project.

Thus, groups sometimes end up with a project with a key component dedicated to some technology The Slacker is comfortable with. However, given that they are The Slacker, that key component will never be ready until "next week." As the deadline approaches, that key component is only half done-- if that-- and everyone needs to save face explaining why they just didn't get there. The Slacker is a drag on the group not because they don't do things, but because they've actively pulled focus away from where the group could have gone.

5. The Watertreader rounds out the group. The Watertreader could join the group at any stage, before or after either The Leech, The Slacker, or even The Code Monkey. A Watertreader might have even been brought on by The Visionary as The Code Monkey. Yet, when it comes to either the coding or the write-ups, The Watertreader is simply in over their head. This could be due to personal issues or inexperience. The Watertreader works hard, but no matter what only seems to be getting by.

Yet, The Watertreader might be a key part of the group, too. They don't make promises like The Slacker, so they don't drag down morale. If they were part of the project early on, they might have recruited the best people of all categories. There's an opportunity for The Watertreader to help gel the team, and even use their organizational skills to keep the group on track: "Come on guys, we need to set up a meeting for next time right now!"

And that's how it unfolds. Most of the time. Obviously, if the group is only 2-3 people, or 5-6, some people will play several roles at once, or change roles over time. The labeling isn't as important as the group dynamic that emerges.

As a result, I will no longer let students self-assign into groups. Even though in some cases it works out perfectly, in the average case it does not.

Random assignment for group projects is worth the risk to me, even though it means I'm picking a policy that is less popular.

And remember being in "that group"? The one that ends up being randomly assigned by default? They always end up being the more interesting groups. Why? Because it brings out the visionary in everyone, so everyone is engaged, and no one has a choice but to be their best.

This post was inspired by @mattmight's post on "Classroom Fortress: The Nine Kinds of Students".

24 February 2011

If fonts were programming languages

If fonts were programming languages, this is what they'd be...

Helvetica - The C Programming Language. It's old, completely overused, and yet also the best solution to many problems.

Times - C++. This is also grossly overused, and is the second best choice for any large task.

Courier - Fortran. This font is old and reliable and we're going to be stuck with it for a long time.

Chicago - Lisp. It's quirky, old, and used in many surprising places.

Computer Modern - Fortress. A mathematical and beautiful font, but pedantic.

Garamond - Java. This font sure seems a lot like Times. Not used quite as much, and has some new flaws and quirks of its own.

Palatino - C#. This font is like Garamond, but some things like that uppercase-P just aren't connected. This makes it attractive for some users, and appalling to others.

New Century Schoolbook - Smalltalk. Initially, this font looks like it's for kids, but it's both serious and playful.

Comic Sans - Python. You wouldn't think that this font was serious, but it's used in a surprisingly large number of contexts. And it's a safer choice than it would appear at first.

Zapf Dingbats - Perl. This font is useful for patching things together. If you need that special symbol to make your sub-sub-bulleted list, this is your ad hoc solution.

LED Marquee - JavaScript. This font is the unsung hero. Many times it's used improperly, making things overly flashy and distracting, but it's also sometimes the only venue for transferring very important information.

08 January 2011

How do you know if your software tests are any good?

Project management by check boxes gives you a nice, but false, sense of security that everything is going smoothly. Although three decades have passed since Glenford Myers wrote the classic The Art of Software Testing, many practitioners’ approach to testing is to simply bang out some buzzwords and be done with it.

You can say that you've passed 100% of your unit tests, but that isn't meaningful if most of the tests are trivial or redundant with one another. You might’ve achieved 95% code coverage, but that won’t matter if important edge cases haven’t been covered. So, how do you know if your tests are any good? If the purpose of testing is to find bugs, then your tests aren’t good unless they’ve found bugs. If a test does not find a bug, it fails as a test.

While that’s simple to state, it can still be daunting if you’re not familiar with testing. There are three main techniques you can use to improve your test design: (1) whitebox techniques; (2) blackbox techniques; and (3) mutation testing.

Whitebox techniques are used with specific source code in mind. One important aspect of whitebox testing is code coverage. For example:
  • Is every function called? [Function coverage]
  • Is every statement executed? [Statement coverage-- Both function coverage and statement coverage are very basic, but better than nothing]
  • For every decision (e.g., if, while, ...), do you have a test that forces it to be true, and another that forces it to be false? [Decision coverage]
  • For every condition that is a conjunction (uses &&) or disjunction (uses ||), does each subexpression have a test where it is true/false? [Condition coverage]
  • Loop coverage: Do you have a test that forces 0 iterations, 1 iteration, 2 iterations?
  • Is each break from a loop covered?
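
As a minimal sketch of the checklist above, here is a small Python function with one loop, one break, one simple decision, and one compound condition, followed by test inputs chosen to hit each item. The function first_negative_index and its limit parameter are made up for illustration; they don't come from any real library.

```python
def first_negative_index(values, limit=None):
    """Return the index of the first negative value, or -1 if none is found.

    If limit is given, only the first `limit` items are examined.
    (Hypothetical function, used only to illustrate coverage goals.)
    """
    for i, v in enumerate(values):
        if limit is not None and i >= limit:  # compound condition (uses `and`)
            break                             # early exit from the loop
        if v < 0:                             # simple decision
            return i
    return -1


# Each case: (which coverage goal it serves, args, kwargs, expected result)
COVERAGE_CASES = [
    ("loop runs 0 times", ([],), {}, -1),
    ("loop runs 1 time, decision false", ([5],), {}, -1),
    ("loop runs 1 time, decision true", ([-5],), {}, 0),
    ("loop runs 2 times", ([5, -1],), {}, 1),
    ("limit set, i >= limit false", ([1, -2],), {"limit": 5}, 1),
    ("limit set, i >= limit true, break taken", ([1, -2],), {"limit": 1}, -1),
]


def failing_cases():
    """Return the descriptions of cases whose result differs from the expectation."""
    return [desc for desc, args, kwargs, expected in COVERAGE_CASES
            if first_negative_index(*args, **kwargs) != expected]


if __name__ == "__main__":
    print(failing_cases())
```

An empty list from failing_cases() means every case passed. Notice that the cases are labeled by the coverage goal they serve, which makes it easier to spot a goal with no test at all.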

Blackbox techniques are used with specific requirements in mind. Blackbox testing follows the principle that a test should not test a single program, but the full class of possible programs. The following blackbox techniques can lead to high-quality tests:
  • Do your blackbox tests cover multiple testing goals? You want your tests to be “fat”: Not only do they test feature X, but they also test Y and Z. The interaction of different features is a great way to find bugs.
  • But you don't want fat tests when you are testing an error condition, such as invalid user input. If you tried to achieve multiple invalid input testing goals (for example, a test to cover an invalid zip code and an invalid street address) it’s likely that one would just mask the other.
  • Consider the input types and form an equivalence class for the types of inputs. For example, if your code tests to see if a triangle is equilateral, the test that uses a triangle with sides (1, 1, 1) will probably find the same kinds of errors that the test data (2, 2, 2) and (3, 3, 3) will find. It’s better to spend your time thinking of other classes of input. For example, if your program handles taxes, you'll want a test for each tax bracket. [This is called equivalence partitioning.]
  • Special cases are often associated with defects. Your test data should also have boundary values, such as those on, above, or below the edges of an equivalence class. For example, in testing a sorting algorithm, you’ll want to test with an empty array, a single element array, an array with two elements, and then a very large array. You should consider boundary cases not just for input, but for output as well. [This is called boundary-value analysis.]
  • Another technique is error guessing. Do you have a feeling that if you try some special combination you can get your program to break? Then just try it! Remember: Your goal is to find bugs, not to “confirm” that the program is valid. Some people have the knack for error guessing.
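
To make equivalence partitioning and boundary-value analysis a bit more concrete, here is a rough sketch in Python built on the triangle example. The function classify_triangle and its four result labels are assumptions made for illustration; a real specification might divide the inputs differently.

```python
def classify_triangle(a, b, c):
    """Hypothetical function under test: classify a triangle by its side lengths."""
    if a <= 0 or b <= 0 or c <= 0:
        return "invalid"
    x, y, z = sorted((a, b, c))
    if x + y <= z:              # degenerate or impossible triangle
        return "invalid"
    if x == z:                  # smallest equals largest, so all three are equal
        return "equilateral"
    if x == y or y == z:
        return "isosceles"
    return "scalene"


# One representative per equivalence class. Adding (2, 2, 2) next to (1, 1, 1)
# would be unlikely to find anything new.
EQUIVALENCE_CASES = [
    ((1, 1, 1), "equilateral"),
    ((2, 2, 3), "isosceles"),
    ((3, 4, 5), "scalene"),
    ((1, 2, 8), "invalid"),
]

# Values on or next to the edges of those classes. Each invalid case
# contains a single fault, so one error cannot mask another.
BOUNDARY_CASES = [
    ((1, 2, 3), "invalid"),     # x + y == z exactly
    ((2, 3, 4), "scalene"),     # just inside the boundary
    ((0, 1, 1), "invalid"),     # a side of length zero
    ((1, 1, 2), "invalid"),     # two equal sides, but degenerate
]


def run_cases(cases):
    """Return (sides, expected, actual) for every case that does not match."""
    failures = []
    for sides, expected in cases:
        actual = classify_triangle(*sides)
        if actual != expected:
            failures.append((sides, expected, actual))
    return failures
```

Calling run_cases(EQUIVALENCE_CASES + BOUNDARY_CASES) returns the mismatches, if any. The boundary cases are where I would expect an off-by-one mistake (say, writing < instead of <=) to show up first.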

Finally, suppose you already have lots of nice tests for whitebox coverage, and applied blackbox techniques. What else can you do? It’s time to test your tests. One technique you can use is mutation testing. Under mutation testing, you make a modification to (a copy of) your program, in the hopes of creating a bug. A mutation might be:
  • Change a reference of one variable to another variable
  • Insert the abs() function
  • Change less-than to greater-than
  • Delete a statement
  • Replace a variable with a constant
  • Delete an overriding method
  • Delete a reference to a super method
  • Change argument order

Create several dozen mutants, in various places in your program [the program will still need to compile in order to test]. If your tests do not find these bugs, then you know you need to write a test that can find the bug in the mutated version of your program. Once a test finds the bug, you have killed the mutant and can try another.
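
Here is a hand-rolled sketch of that loop in Python, using the "change less-than to greater-than" mutation. All of the names (smaller, smaller_mutant, weak_suite, strong_suite, mutant_killed) are invented for this example, and in practice a mutation testing tool would generate the mutants for you rather than having you write them by hand.

```python
def smaller(a, b):
    """Original program: return the smaller of two values."""
    return a if a < b else b


def smaller_mutant(a, b):
    """Mutant: the less-than has been changed to greater-than."""
    return a if a > b else b


def weak_suite(fn):
    """Only uses equal arguments, so it cannot tell < from >."""
    return fn(4, 4) == 4 and fn(7, 7) == 7


def strong_suite(fn):
    """Adds cases where the two arguments differ, in both orders."""
    return weak_suite(fn) and fn(1, 2) == 1 and fn(2, 1) == 1


def mutant_killed(suite, original, mutant):
    """A mutant is killed when the suite passes on the original but fails on the mutant."""
    return suite(original) and not suite(mutant)


if __name__ == "__main__":
    print("weak suite kills mutant:", mutant_killed(weak_suite, smaller, smaller_mutant))
    print("strong suite kills mutant:", mutant_killed(strong_suite, smaller, smaller_mutant))
```

The weak suite passes on both versions, so the mutant survives. That is the signal that a test is missing. Once the unequal-argument cases are added, the suite fails on the mutant and the mutant is killed.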

Testing is complete when you have stopped finding bugs. Or, more practically, when the rate at which you find new bugs slows down and you see diminishing returns.

Bugs tend to “cluster” in certain modules and features: The moment you find a bug in one, you know that you should look in it further for more bugs. (For example, why does Apple keep on having troubles with the iPhone alarm? It’s a perfect candidate for increased testing efforts.) To find bugs, you can use the techniques of blackbox testing, whitebox testing, and mutation testing. As long as you are finding bugs, you know that your testing process is working!

This post is a revision to two of my answers on the Programmers StackExchange.