Thursday, May 26, 2011

ToC in Software

I always seem to find myself writing in the middle of the night.  I get an idea in my head and can't sleep.  I don't want to forget, and so I can't let it go and just rest... so here I am again, letting the blog remember so I don't have to.

I just got done listening to The Goal on audiobook during my morning commute.  Besides being quite entertaining... I got a lot of good stuff out of it.  Over the last week or so, I've been thinking about how to use the new insights to better understand the process of producing software.  And while I was lying in bed, I discovered something new.

I haven't thought all this through yet, but the patterns make sense to me.

So in The Goal, they have a furnace as one of their bottlenecks.  There's a pile of parts sitting in front of it that need processing, some of which aren't going to be used in current customer orders and are just to build stock.  Some of the parts in the pile have quality issues.  A couple of things they do to increase throughput: move quality inspection to before the bottleneck, and prioritize the bottleneck work that makes money over the work that doesn't.

My first thought was that this sounds a lot like backlog management and planning.  Prioritize and prepare the work ahead of time so that when it's time to start executing on work, we don't clog up the capacity with unimportant work, and we maximize the time spent in development.  So I started thinking, does this process just assume that development is the bottleneck?  If it isn't, and we are upstream of the real bottleneck, won't we just cause chaos downstream?

Software has some major differences that drastically affect the complexity of the system.  In terms of constraints... all of the historical outputs of the system are also inputs to the system and greatly affect capacity.  You can consume capacity in order to increase capacity.  The work itself flowing through the system is highly variable, uncertain, and linked with several dependent events.  The work flowing through the system is interdependent with other work flowing through the system.  Work is highly negotiable - you can cut corners, optimize for write over read (throughput over future capacity), reduce the scope of the work, and assume more risk to save time.  You can bend the work to your will to an extent...

It's quite feasible that your bottlenecks could change, given all of these variables... but I wonder how much they really do...? We often bend the work in the system to make it fit.  We often keep bending it the same ways too.  In the book, they talk about a couple cases where it seems like the bottlenecks are drifting, but in fact they really aren't.  The problem was actually the flow of material.

Made me think of a common problem with QA.  QA is starved for work, waiting until developers complete something so that they can test.  Then all the work gets done at about the same time, and QA suddenly doesn't have enough capacity and we miss our deadline.  Did the bottleneck just change from development to QA? It sure feels like it... but maybe this is just the same kind of flow problem?  A bottleneck is defined as any resource whose total capacity is equal to or less than the demand placed on it.  In the overall system, there is usually a constant queue in front of development, and the work demanded typically exceeds available capacity.

In an iteration, if we consider just the smaller subsystem, development capacity is typically filled and then it flows downhill from there.  Development is no longer the bottleneck in the subsystem; they aren't (intentionally anyway) given more work than their capacity allows.  Development only becomes the bottleneck if they unintentionally exceed their own capacity, which is quite easy to do as well.

But assuming development happens smoothly, we still slam a bunch of work through at the same time, and QA is suddenly buried.  If QA's total capacity is greater than the total amount of work that needs to be done, it is by definition not a bottleneck... if that's the case, and we could fix the flow problem, the system could actually operate smoothly.  When I started to think about why all the work gets done at once... the obvious answer occurred to me... because we typically start it all at once.  It's built into the iterative process.  I wonder what would happen if you made no other changes than just staggered the work starts?
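Here's a toy sketch of that thought experiment (all the numbers are made up: ten items, a day of QA each, QA capacity of one item per day; it's just to look at the shape of the flow).  In "batch" mode everything lands in QA's queue on day 10 at the end of the iteration; in "staggered" mode one item lands each day.

    import java.util.Arrays;

    // Toy model: QA tests one item per day, FIFO. Compare all items
    // arriving at once (iteration-end handoff) vs. one arriving per day.
    public class QaFlowSim {

        public static void main(String[] args) {
            System.out.println("batch:     " + simulate(batchArrivals(10, 10)));
            System.out.println("staggered: " + simulate(staggeredArrivals(10)));
        }

        // All items hit QA's queue on the same day.
        static int[] batchArrivals(int items, int day) {
            int[] a = new int[items];
            Arrays.fill(a, day);
            return a;
        }

        // One item hits QA's queue per day, starting on day 1.
        static int[] staggeredArrivals(int items) {
            int[] a = new int[items];
            for (int i = 0; i < items; i++) a[i] = i + 1;
            return a;
        }

        // Returns the deepest the queue ever got and the day the last item
        // finished testing.
        static String simulate(int[] arrivals) {
            Arrays.sort(arrivals);
            int queue = 0, maxQueue = 0, day = 0, done = 0, next = 0;
            while (done < arrivals.length) {
                day++;
                while (next < arrivals.length && arrivals[next] <= day) {
                    queue++;
                    next++;
                }
                maxQueue = Math.max(maxQueue, queue);
                if (queue > 0) { queue--; done++; }
            }
            return "maxQueue=" + maxQueue + ", lastItemDone=day " + day;
        }
    }

Same QA capacity, same total work: batch mode (maxQueue=10, last item done on day 19) leaves QA idle for nine days and then buried, while staggered mode (maxQueue=1, done on day 10) finishes inside the iteration.  The capacity didn't change, just the flow.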

There are definitely reasons to have synchronization points, but as we get closer to continuous, couldn't we synchronize only when we actually need to synchronize, and otherwise just maintain a sustainable pace?

When any part of the system is slammed we tend to bend the work... and we almost always pay dearly for it too.  There's nothing like a constant sense of urgency to trigger massive capacity sacrifice.

Wednesday, May 4, 2011

What's in a variable name?


Good variable naming is more than giving things good names... it requires decomposing your problem into thought-sized chunks so that you can give things good names.  If you can't think of a good name for something, it should again lead you back to evaluating your design.  Sometimes though, you really do need to invent a new term to communicate a concept in your system.
 
You should be able to read code that calls out to several different methods and have a decent idea of what's going on. If you look at the implementation and are surprised at what it does, that's a problem. Surprises lead to misunderstandings -> mistakes -> defects.  There are general rules of thumb, e.g. only doing one thing, or not unexpectedly mutating input arguments.  But other kinds of surprises are much harder to just list.  We attach complex meanings to words as metaphors for concepts in everyday life, ideas in a specific domain, or common patterns and conventions that we use in code.  Words don't always mean the same thing to different people, or can have multiple meanings in different contexts.  On most projects, we end up making a glossary at some point of agreed-on terms and meanings so that we can communicate better in speech, but also in code.  It's a massive help to have a common set of building-block ideas with the people you have to 'group-think' with in code.
 
Learning design patterns will help you pick up some general vocab, but other conventions are established more by tools.  For example, a method named getXXX() should generally return a property; if it does something other than that, it will generally surprise people.  Even if the method does essentially 'get' something, it's still surprising because the convention is so established, so sometimes you need to use a different word that means the same thing to convey a different meaning. :)  Sometimes a popular tool will pick unfortunate names for things, and they'll be adopted into the general vocabulary and cause confusion.
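A minimal sketch of what I mean (the names are all invented):

    // A getter-named method that quietly does remote I/O will surprise
    // readers; a name that says what it really does removes the surprise.
    class AccountService {
        private final AccountClient client;

        AccountService(AccountClient client) { this.client = client; }

        // Surprising: reads like a cheap property accessor, but hits the network.
        Account getAccount(String id) { return client.fetchById(id); }

        // Clearer: the different word warns the caller this is a remote call.
        Account fetchAccount(String id) { return client.fetchById(id); }
    }

    interface AccountClient { Account fetchById(String id); }
    class Account { }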
 
Most of the code out there wasn't written to be understood.  And yet we spend way more time trying to understand existing code than we ever spend writing it.  But it's not easy... even when people try to write understandable code, they often still fail miserably.  Per Scott, "thinking is hard."  :)  Breaking down a problem into thought-sized chunks so that you can think about one chunk and know you don't need to worry about the other chunks... is hard.  That's the kind of stuff you spend your career working to master, and when I say a developer is 'really good' that's usually what I'm referring to.  It's the ability that crosses languages, tools, and frameworks that's responsible for massive increases in productivity.

What's your unit testing philosophy?

A member of the Austin Software Mentorship group asked this question on the mailing list.  The kinds of questions being asked are remarkable to me, and from college students! Just wow.  I had recently been a part of a great discussion at Lean Software Austin, and had been thinking about this stuff a lot.  So I took some time to reply, and it helped me capture my thoughts too...


When you are writing tests, one concern is to know if the code you wrote or are writing works or not.  But by automating the tests as a permanent addition to the code base, you have another set of responsibilities.
 
What's going to happen when one of your tests breaks and someone else on your team (or someone who will be on your team later, or even you in a year or two) has to understand what your tests were intending, and whether that concern is still valid or needs to be modified or moved elsewhere?  If someone were to modify your code, what might they be likely to misunderstand and get wrong? What conditions are not obvious from the design that might trip someone up? Can you change the code/design to make it obvious and clear? If you are delaying or unable to make the design easier to understand, can you document the non-obviousness in a test that will fail in a way that informs the developer of the mistake they made?
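For instance, a sketch with a made-up domain and rule, just to show the shape: the non-obvious bit lives in the test name and the failure message, so when the test fails it teaches the rule instead of just turning red.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class InvoiceTotalTest {

        @Test
        public void discountAppliesBeforeTaxNotAfter() {
            // amount, discount rate, tax rate
            Invoice invoice = new Invoice(100.00, 0.10, 0.08);

            // If someone reorders the calculation, the message explains why it matters.
            assertEquals("Tax must be computed on the discounted amount, not the full amount",
                    97.20, invoice.total(), 0.001);
        }
    }

    class Invoice {
        private final double amount, discount, taxRate;

        Invoice(double amount, double discount, double taxRate) {
            this.amount = amount;
            this.discount = discount;
            this.taxRate = taxRate;
        }

        double total() { return amount * (1 - discount) * (1 + taxRate); }
    }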
 
More often than not, incomprehensible automated tests will end up getting dragged along on a project for years.  When they break and are either not understood or misunderstood, the bars just get turned green by flipping asserts.  The original intent and knowledge that went with the tests get lost, and the tests just eat up time.  Even if the intent is clear, tests that break frequently when the app isn't broken, especially from non-functional changes, are really annoying.  How can you write tests that will generally break when the code is actually broken, and not just because the code changed? How can you change the code design so that your tests will be less brittle?  How can you simplify your design so that the things you are trying to test can no longer break?
 
For example, if you end up with a massive amount of test setup to control all of the dependencies for your test, something is wrong with your design.  It should trigger your spidey senses.  How can you restructure your design so that you don't have to be concerned about so many things at the same time?  Let your tests help you improve your design.
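A sketch of the kind of restructuring I mean (all the names here are invented): if testing one pricing rule forces you to wire up a repository, a tax service, and a mailer, the rule is buried in the wrong place.  Pull it out into a plain object and the setup disappears.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // Before: the discount rule lived inside an order processor that needed a
    // repository, a tax service, and a mailer mocked out just to test one branch.
    // After: the rule is a pure function of its inputs, so there is no setup.
    class BulkDiscountPolicy {
        // The whole rule: 5% off orders of 10 or more units.
        double discountFor(int units) { return units >= 10 ? 0.05 : 0.0; }
    }

    public class BulkDiscountPolicyTest {

        @Test
        public void ordersOfTenOrMoreUnitsGetFivePercentOff() {
            BulkDiscountPolicy policy = new BulkDiscountPolicy();

            assertEquals(0.05, policy.discountFor(10), 0.0);
            assertEquals(0.0, policy.discountFor(9), 0.0);
        }
    }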
 
If all you care about is "seeing if it works now", there's no point in checking the tests into source control.  Just delete them. Once you check them in, your tests begin a whole new life, and if you aren't aware of the implications of that, you can create quite an expensive long-term burden. 

Friday, April 8, 2011

Assumptions, Predictions, and Plans

Another old post, and I really liked this one.  Although now I think there are more options to keep steering the right way that don't necessarily involve rumble strips/feedback mechanisms.  The metaphor kinda breaks down... but you can also steer the right way by making sure you can only steer the right way. :)

Poppendieck has a beautiful quote that I really loved on this topic; I'll have to find the whole thing at some point.  But basically: "predictions don't create predictability."

--- Feb 11, 2008

One of the hardest habits to break may be the trust and reliance on our assumptions, predictions, and plans. The idea that making a prediction will somehow rein in the unpredictable and lead us to reliable results. Despite repeated failures caused by the guidance of a false sense of knowledge, we continue to walk down the same path. Our reaction to the failure is to reinforce the same habit: assuming our plan was just faulty, and that we should put more effort into creating better plans.  Why don't we just stop?

Trying to imagine structure and order around something that is by nature unstructured and disorderly only provides an illusion of control. And that illusion will only lull us into more false assumptions, bad decisions, and unreliable results. Instead of trying to plan your way to success, just throw your fear and discomfort out the window. Let go of any illusion that you have control, that you know what your customers want, that your solution will solve their problem. Because you know what? You don't know that. You just want to think you do.

So how do we get to reliable results, despite an unreliable world? By creating knowledge. So your customers have a problem to solve. How do you KNOW that your solution will solve their problem? When you’ve solved it. That’s right, not any sooner. At that point, you will be able to grasp onto a tangible reality that you have created value for your customer. This is knowledge.

Suppose your code needs to find the latest edition of a book. How do you KNOW that your code does that? You test it. At that point, you know that under some specific conditions, the code behaves as you expect. It's concrete feedback on reality... it's knowledge. How do you know that in creating this new feature, you didn't just break the one you wrote last month? Again, through feedback. If there are tests in place to protect all existing functionality, you already know.
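Something like this is all it takes (a minimal sketch; Edition and findLatest are invented names for the book example):

    import static org.junit.Assert.assertEquals;

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    import org.junit.Test;

    public class LatestEditionTest {

        static class Edition {
            final int number;
            Edition(int number) { this.number = number; }
        }

        // The code under test: pick the highest-numbered edition.
        static Edition findLatest(List<Edition> editions) {
            return editions.stream()
                    .max(Comparator.comparingInt(e -> e.number))
                    .orElseThrow(IllegalArgumentException::new);
        }

        @Test
        public void returnsTheHighestNumberedEdition() {
            List<Edition> editions = Arrays.asList(
                    new Edition(1), new Edition(3), new Edition(2));

            // Under these specific conditions, we now KNOW the code behaves as expected.
            assertEquals(3, findLatest(editions).number);
        }
    }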

Even if you don't know where you are headed, you can still steer in the right direction to reliable value. At every possible opportunity, instrument your process with feedback that builds knowledge, and let that knowledge guide you to delivering reliable value. Get your software in your customer's hands as often as possible, so you know if you've added value. And if not, adjust. Find out if that code really does what you think; test your assumptions. The longer you go without feedback, the further you are likely to be off track.
Create knowledge. And let it do the steering.

A Rough Day in Training...

Another old post, this was one of my first adventures trying to do team-help consulting education type of stuff.  I thought I could just open up their skulls, dump in some knowledge, they would all say 'aha!' and we would be on our way to making things better.  Needless to say, it didn't quite work out that way.  This experience really stayed with me though, and I spent a lot of time reflecting and trying out new ideas for how to effect change.  But if people don't care to learn, don't want to try, or don't think they need to improve, then it's better to go find a problem worth my time.

--- Feb 23, 2008

So yesterday I just had my first presentation on Lean Software Development. I did my best to make the presentation interactive to engage folks, but wow, I had no idea what I was up against.
So right before this meeting, earlier in the morning, the team decided that this would be a waste of their time because they already knew all this stuff.

Mind you, the reason I was asked to help out on this project to begin with was because the project lead, who used to be the tech lead on my team, could see the headlights of the software train wreck coming his way... but didn't know how to fix it, and asked me for help. But these guys knew it all, right? They had it all down. Walking down those train tracks, completely oblivious to the fact that they had fundamentally changed nothing, other than that they were on a new project and had started over. Every project always starts with the best of intentions.

So my presentation went through a reality check of why waterfall is broken. Why it's so expensive. Why you create this big hole of technical debt that is next to impossible to reverse. On my team we managed to get some hold on the problem and start to decrease the debt, but we are so far down the cost curve at this point that EVERYTHING is expensive. It's a beastly legacy project.
So the team continued with a 'ya, ya, speed it up, we know this stuff' attitude... and then we started talking about Lean.
So here's what I was up against, with their defensive guard up, unwilling to look at the problem another way (in pain order):
  1. Inspecting for defects and preventing them is the same thing
  2. Agile is the same thing as waterfall in a smaller box
  3. We must understand a solution to a problem to be able to make progress in solving it.
    They freaked when I said this: With an empirical control system (feedback and quick response, aka agility), you don't have to know where you are going, you just need to know enough to steer in the right direction. (When I said this, it was like I said the aliens landed, or the world wasn't flat. I was just dismissed as crazy.)
  4. A good bug tracking system is critical to a successful project. (What if you have no bugs? This was still unfathomable.)
  5. Customer sign offs on requirements have no impact on what the customer decides to put in the requirements.
  6. A stream of features is the same as a stream of value (major team risk of feature bloat)
The part that bothered me the most was that they left the meeting with the same sense that I was crazy and they knew it all already.

How do I get these folks to question their thoughts? Is it just a waste of time to try and help them at all, especially when some people actually want my help? Does anyone have any thoughts on strategies for poking some holes in their reality that they'll have to reconcile?

Bleeding Away Our Knowledge

Kerry Kimbrough wrote this a while back (again, killing my other blog), and today, I think it mostly rings true for me...  except that I've always found it relatively easy to come 'up to speed' on a project quickly by reading the codebase and learning the domain.  Even with a project I'm working on now, which has an extremely confused architecture, I was able to figure out what I needed to understand to be productive on the project, and to understand how it might be rearranged to make way more 'sense'.  The tests don't help much at all in this case at expressing intent, but I still think they can.  And I have found working at trying to make them express intent quite valuable when I have to look at them again.  What I'm still missing is all the why's... especially for the decisions that aren't intuitive.

Even still, seeing the impact of no clear architectural direction and no common understanding of the design, I see what a mess it causes.  And while I may be able to speed-read through a code base, not everyone can do that as easily.  And I have to work with those people. :) Whatever it takes to maintain that common knowledge and keep folks on the same page has gotta be much more cost-effective than the impact of not having it.

--- Feb 27, 2008, Kerry Kimbrough

Janelle observed that a sign of a troubled project is an exponentially increasing cost per change over time. My response was "No, this is virtually an invariant for any software system." You can quibble over whether the $/change/year is really exponential, what the coefficient is, etc. But it's always upward.
I think this issue is much deeper and more intractable than agility vs. waterfall. Agility per se doesn’t help and may even hurt.

Here’s the problem: dissipation of the knowledge behind the system design. The system design is the result of a large number of design choices. There is a huge amount of knowledge embedded in which choices were made and, more importantly, why. In most teams, this knowledge is completely tacit, and never really exists outside the skulls of a few people. What’s left is the code itself, which typically does not capture design knowledge (only the effect of the design), much less the design rationale. The result is that every individual has to guess for himself. Eventually, none of the skulls with the original design knowledge remain on the project. The code gradually becomes the vector sum of several different design concepts, some of them confused and faulty. This increasing incoherency and resulting complexity eventually outstrips anyone’s ability to fully comprehend. $/change increases because of the increasing effort required to know how things work and why things break.

Agility hurts to the extent that it refuses to create and maintain the non-code artifacts required to capture the design knowledge and rationale. Most teams fail in this difficult job, but XPers actually stand there and say you shouldn't even try!

But the benefit of actually making the design an explicit real-world artifact is not just holding down $/change/year. The real benefit is that the discipline of creating and maintaining this artifact leads to a better, simpler, higher-quality design. The real benefit of taking it out of the skulls and into the real world is that the whole team can learn it, share it, and improve it.

It needn’t conflict with agility. You don’t need BUFD, but you do need a design that evolves incrementally. As the tests drive the code, so the design drives the tests.

Is Automated Testing Waste?

I wrote this several years ago, and I have mixed feelings about it now.  Wanted to hold onto it since I'm killing my other blog.  I think there is a blurry line between knowledge management and unnecessary work with automated testing.  I've mostly settled on this: keeping the knowledge around is the value-add part, and the rest is a mechanism for trying to accomplish that, so if you could reduce that mechanism to zero without impacting the knowledge part, then you should.  After the value stream mapping process with our software project and discussing ways to do less automation work, I was struggling with how this made sense.

---- Feb 28, 2008

When you are designing something, the knowledge you build from the discoveries along the way and the mistakes you make are important inputs into designing the code that solves a user's problem. If you have to relearn this stuff, it's waste. Maintaining knowledge is a crucial part of lean design. When we automate a test, we codify various things that we learn and preserve that knowledge as part of the system.

Once we learn something about what the software needs to do to work, or what it shouldn't do, or what we are intending it to do… we can build protectors into the software, so that whenever one of these important things we've learned is violated, the system tells us and we can prevent it from harming the software. And when we are working on changing that software, we have examples that tell us the intent of the software, so that we don't have to figure all of this out again and again.
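One sketch of what a 'protector' might look like (a hypothetical scenario): a lesson learned the hard way, codified as a check that fails loudly the moment it's violated.

    import java.time.ZoneOffset;
    import java.time.ZonedDateTime;

    class ReportScheduler {

        // Lesson learned: downstream billing assumes UTC timestamps. We once
        // scheduled in local time, and the knowledge of why lived in one skull.
        // Now the system itself tells anyone who violates it.
        void schedule(ZonedDateTime runAt) {
            if (!runAt.getOffset().equals(ZoneOffset.UTC)) {
                throw new IllegalArgumentException(
                        "Report runs must be scheduled in UTC; downstream billing assumes it");
            }
            // ... enqueue the report run (omitted) ...
        }
    }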

QA resources have a unique and valuable skillset that is critical to building knowledge about a system. If we can use these knowledge building abilities to prevent bugs rather than detect them, we can accelerate our ability to solve customer problems, which adds value.