Todd Hoff's blog

Todd Hoff's picture

Handling Infinite Work Streams

Infinite work streams are the new reality of
most systems. Web servers and application servers
serve very large user populations where it is
realistic to expect infinite streams of new work.
The work never ends. Requests come in 24 hours a day,
7 days a week. Work could easily saturate
servers at 100% CPU usage.

Traditionally we have considered 100% CPU usage a bad sign.
As compensation we create complicated infrastructures
to load balance work, replicate state, and cluster
machines.

CPUs don't get tired so you might think we would
try to use the CPU as much as possible.

In other fields we try to increase productivity by
using a resource to the greatest extent possible.

In the server world we try to guarantee a certain
level of responsiveness by forcing an artificially
low CPU usage. The idea is if we don't have CPU
availability then we can't respond to new work with a
reasonable latency or complete existing work.

Is there really a problem with the CPU being used
100% of the time? Isn't the real problem that we use CPU
availability and task priority as a simple cognitive
shorthand for architecting a system rather than having
to understand our system's low level work streams and using
that information to make specific scheduling decisions?

We simply don't have the tools to do anything other
than make clumsy architecture decisions based on
load balancing servers and making guesses at the
number of threads to use and the priorities for
those threads.

We could use 100% of CPU time if we could:

0. Schedule work so that explicit locking is unnecessary (though possible). This
will help prevent deadlock and priority inversion.
1. Control how much of the CPU work items can have.
2. Decide on the relative priority of work and schedule work by
that priority.
3. Have a fairness algorithm for giving a particular level of service
to each work priority.
4. Schedule work CPU allowance across tasks.
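
As a rough illustration of the list above, here is a minimal sketch of a single-threaded scheduler. It is not an existing tool: the name WorkScheduler, the per-level "shares", and the idea of counting work items as a stand-in for a CPU allowance are all assumptions made for the example. One thread owns every queue, so work items need no explicit locks; levels are visited in priority order, and each level is capped per cycle so lower priorities still get a guaranteed level of service.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: a single scheduler thread owns all queues, so work
// items never need explicit locks against each other.
class WorkScheduler {
public:
    using Work = std::function<void()>;

    // sharesPerLevel[i] is how many items level i may run per cycle. Shares
    // stand in for a CPU allowance (an assumption; a real scheduler would
    // measure time rather than count items).
    explicit WorkScheduler(std::vector<std::size_t> sharesPerLevel)
        : shares_(std::move(sharesPerLevel)), queues_(shares_.size()) {}

    // Level 0 is the highest priority. Returns false for an unknown level.
    bool submit(std::size_t level, Work work) {
        assert(work && "work item must be callable");
        if (level >= queues_.size()) return false;
        queues_[level].push_back(std::move(work));
        return true;
    }

    // Runs one cycle: visits levels from highest to lowest priority and runs
    // at most that level's share. Every level is visited each cycle, which is
    // the fairness guarantee. Returns the number of items run.
    std::size_t runCycle() {
        std::size_t ran = 0;
        for (std::size_t level = 0; level < queues_.size(); ++level) {
            std::deque<Work>& queue = queues_[level];
            for (std::size_t n = 0; n < shares_[level] && !queue.empty(); ++n) {
                Work work = std::move(queue.front());
                queue.pop_front();
                work();
                ++ran;
            }
        }
        return ran;
    }

    // Total number of items still waiting across all levels.
    std::size_t pending() const {
        std::size_t total = 0;
        for (const auto& queue : queues_) total += queue.size();
        return total;
    }

private:
    std::vector<std::size_t> shares_;
    std::vector<std::deque<Work>> queues_;
};
```

With shares of 2 and 1, a backlog at both levels is served two high priority items for every low priority item, so the CPU can stay busy without the low priority work starving.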

Words are Rooted in Physical Metaphor

It's interesting that Emerson, in his essay Nature, talks about
how words are rooted in physical metaphor. In this he anticipates Lakoff and Johnson by a few years.

http://en.wikipedia.org/wiki/Conceptual_metaphor.

Thoughts On Interview Questions, the Process, and Resumes

Given that i and a few other people i know will be interviewing a bit more now :-) I've
put together an interview related wiki page at http://www.possibility.com/epowiki/Wiki.jsp?page=InterviewQuestions.
It covers C++ and general programming interview questions. It also has some thoughts on issues
companies should consider when interviewing and issues interview candidates
should think about during the interview and when writing their resume.

Here's a bit of it.

Thoughts For The Company Doing The Interviewing

* Know the kind of person you want, the skills they should have, and design your interview process accordingly.
* Do pre-interview phone interviews. This can save a lot of time for both parties if there is an obvious lack of a match.
* Do you really have an open slot with money for it? Interviewing is a ton of work. It sucks to go through the entire process and then find out there really wasn't any money.
* Decide who gets to decide if a person is hired. Is the manager going to hire who they want no matter what? Then don't bother with interviews. Does it have to be unanimous? Is it majority rules?
* Every person in an interview should have a defined subject area. Have people know what they are supposed to ask and don't overlap questions.
* Is someone a friend of the interviewee? If so don't have them interview the person. Make sure that the friendship doesn't influence others when making the decision.
* People lie. Make people answer a wide variety of questions. Have them read code. Have them write code. Have them demonstrate specific knowledge. Have them demonstrate detailed knowledge. Do not accept generalities or diversions.
* People lie. Have someone verify that what is on the resume is true. People will say they know C++ but can't describe a destructor!
* Check references.
* Be able to tell a candidate what job they are being hired for.

Software Is Really a Community

Software is far more a community than it is a well ordered bag of bits. This feeling struck me hard during a "transfer of knowledge" session for software i've worked on for over 6 years. The transfer of knowledge is due to an unfortunate plant closing.
Just how do you transfer knowledge of a huge piece of software that you have so lovingly worked on for 6 years? It is a daunting task. There's no real place to start and there's no real place to end. The stories are fractally infinite.

You hope you are transferring knowledge so that the software might live and even prosper. But in the back of my mind i know that this is not the case.

I can talk about the software for days. I can demo it. I can document it better and better. But that's not the software.

The software is really all the people and circumstances that gave rise to it, along with the culture that sustained it.

The meaning of the software isn't in the code. It comes from the society of people who used it. From the traditions and culture that were built around it. The exciting moments when you were able to add something that made someone else's life easier.

Software is its community. Without a community software cannot be said to live.

Anything complex does not stay alive by the written word. Software lives through continual use; through old people handing down knowledge to new people, sharing tips, tricks, and workarounds; through steady continual improvement based on the feedback of actual caring users so that the software fits its niche so well nobody can imagine it working any other way.

When going over each feature i can remember when it was added, who wanted it added, and why they wanted it. I can remember when the feature was completed and their thanks when it worked. Without that person or their living descendants, any explanation of the feature makes no sense. Inside, I know it will never be used again. It will just die.

Swing, Threading, and Application Architectures

Here's an interesting thread on writing efficient Swing
code (http://www.javalobby.org/thread.jspa?forumID=61&threadID=13166).
It's interesting to me because it talks about improving Swing performance
by not doing work in the UI thread. I would say this is obvious,
but i've noticed in general threads are not talked about much
in Java.

As threads are built into java you might expect a more
energetic discussion.

But unfortunately threads in java make it so easy to screw
things up.

By not at least defaulting to having work done in another
thread, the UI has caused Swing years and years of bad press.

Observers not requiring notifications to be processed in a separate thread,
for example, is a disaster waiting to happen. Even with Java's naive implementation
you can handle notifications safely by providing a bridge to an Actor
type architecture, but few people know or will think to do it.
Instead you get tangles of recursive code with entirely unpredictable
latencies and deadlock characteristics.
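
To make the bridge idea concrete, here is a minimal sketch. It is written in C++ rather than Java so that it matches the other code on this page, and every name in it (Actor, Subject, post, attach, notify) is hypothetical rather than taken from any framework. The subject never calls an observer inline. It posts each notification to the observer's mailbox, and a single thread owned by the actor drains that mailbox, so a handler cannot recurse back into the notifier's call stack.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical actor: owns a mailbox and the single thread that drains it.
class Actor {
public:
    using Message = std::function<void()>;

    Actor() : worker_([this] { run(); }) {}
    ~Actor() { stop(); }

    // Callable from any thread; queues the message and returns immediately.
    // Messages posted after stop() has begun are dropped.
    void post(Message message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;
            mailbox_.push_back(std::move(message));
        }
        ready_.notify_one();
    }

    // Lets the worker drain what is already queued, then joins it.
    // Must not be called from inside a handler.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

private:
    void run() {
        for (;;) {
            Message message;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !mailbox_.empty(); });
                if (mailbox_.empty()) return;  // stopping and fully drained
                message = std::move(mailbox_.front());
                mailbox_.pop_front();
            }
            message();  // run outside the lock so handlers may call post()
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Message> mailbox_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Hypothetical subject: bridges each notification onto the observer's actor
// instead of invoking the observer on the notifying thread.
class Subject {
public:
    void attach(Actor& actor, std::function<void(int)> handler) {
        assert(handler && "observer handler must be callable");
        observers_.push_back({&actor, std::move(handler)});
    }

    void notify(int value) {
        for (const Observer& observer : observers_) {
            std::function<void(int)> handler = observer.handler;
            observer.actor->post([handler, value] { handler(value); });
        }
    }

private:
    struct Observer {
        Actor* actor;
        std::function<void(int)> handler;
    };
    std::vector<Observer> observers_;
};
```

The notifying thread's latency is now just the cost of a queue insert, and each observer sees its notifications one at a time and in order on its own thread.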

High performance applications consider threading architectures very carefully,
as they do in SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/),
for example.

These issues are not related to swing only, they exist in every
application, every jvm, every system. Inputs like databases,
tcp/ip, rmi, soap, jms, servlets, etc all have the same problems
of dispatching work, getting work done, and dealing with
notifications from all the work performed which in turn causes more work,
more notifications, etc.

Container frameworks like Spring generally assume work is
processed in a single thread. Thread local variables are used
to transparently store transaction information or AOP is used to declaratively
support transactions.

This approach doesn't support moving work to different threads for different
processing steps. Nor does it allow you to condition your total work load by limiting

Frameworks Encourage Poor Threading Models

A thread on The Server Side (http://www.theserverside.com/news/thread.tss?thread_id=29012)
turned to talking about a topic of special interest to me, namely application architectures
for high performance high load situations.

Here are some thoughts on the subject of
architecture: http://www.possibility.com/epowiki/Wiki.jsp?page=AppBackplane.

They are the result of years of conversations with some very smart people
working in one of the most difficult environments possible, a Class 5 core telecom
switch.

Coming to the Java application server world of servlets, Hibernate, Struts, Spring,
etc, i was confused at first by how these frameworks dictated the threading
architecture of applications by using ThreadLocal and a single threaded approach
for all requests.

I am curious if people are interested in other approaches to application architectures?

Anyway...

From the thread:

> don't you break with the common one thread per request
> scenario that us developers have come to depend on?

It needs to be broken. These frameworks force
an application architecture. Your application architecture
shouldn't be determined by a servlet or a database or
anything but the needs of your application.

Sure, a single threaded approach may work fine for
a stateless web back end.

But what if you are doing a real application on the
backend like handling air traffic control or a
manufacturing process?

In these cases a single threaded approach makes
no sense because a web page is just one of a thousand
different events an application will be handling.

All events are not created equal. Threads, queues,
priorities, CPU limits, batching, etc are all tools
you can use to handle it all.

It took me a while to figure out why i was having problems
with certain frameworks. It is because they hard code a
threading architecture into your apps.

If i want an object to participate in transactions from

XP != Extreme Systems

A recent thread in comp.object has helped
me realize that what has bugged me about extreme programming
is not actually XP itself. In my mind i kept thinking
XP should be about building systems. It isn't. XP is actually
about just what it says: programming.

XP addresses the programming part of any project and
that's it.

I am largely in agreement with the primary XP practices.
Some of the secondary practices, like using a separate
integration machine, are just, well, kind of silly.

But most of the other XP practices are sound. And I won't
say they are all just stuff i already did. That's not
true. I have learned a lot from XP.

Yet XP doesn't address developing a system and it
never said it did. But that's always the context
in which i evaluated XP and always found it wanting.

A major part of systems work is things like creating a
product requirements definition (PRD); complex hardware and
software codependencies; stringent high availability
requirements; stringent performance requirements; stringent
interop requirements; specifying hardware, much of which
has to be built; being compliant with many complex standards;
buy or build decisions; predicting staffing, budgets, and
costs; and so on.

None of this is programming. It certainly impacts programming.
And you'll only find out some of it when you start programming. But
much of it must happen before any programming happens
because it is the kind of information that needs to be fed into the
planning game.

This is why the customer role in XP is by far the hardest
role. Much harder than programming because the system has
largely been figured out by the time the programmers
see it.

And figuring out the system is the semi mystical act of creation
that seems to defy systematization. Techniques like JAD seem inadequate,
but they are probably the best you can do.

Just-In-Time-Requirements are good for many things, but when you

The Assumption Life Cycle

  • Assumptions begin as the easiest fit for available facts.
  • Assumptions become dogma when they fit with existing orthodoxy, are commonsensical, and are not immediately testable.
  • New observations are forced to fit the dogma.
  • As evidence piles up against the dogma, orthodoxy must collapse before new assumptions can take root.

    New Prisoner's Dilemma Winner Sheds Light on US Winners and Losers

    There's an interesting new winner for the iterated Prisoner's Dilemma game
    described at http://www.wired.com/news/culture/0,1284,65317,00.html :

    > The Southampton group, whose primary research area is software agents,
    > said its strategy involved a series of moves allowing players to recognize
    > each other and act cooperatively.
    > ...
    > The result is that Southampton had the top three
    > performers -- but also a load of utter failures at
    > bottom of the table who sacrificed themselves for
    > the good of the team.
    > ...
    > What was interesting was to see how many colluders you need in a
    > population. It turns out we had far too many -- we would have won
    > with around 20.

    What interests me is this question: if we see the same result in another game
    can we assume a similar process has occurred?

    Consider the game that is the US economy.

    In the US: The top one percent are now estimated to own between
    forty and fifty percent of the nation's wealth, more than the combined
    wealth of the bottom 95%.

    Can we now ask if the winners of wealth in the US are playing
    a cooperative game to win at the expense of individual US
    citizens?

    Really, release as soon as possible?

    It is recommended to always "release as soon as it is possible"
    where this is taken to mean early and often.

    My reply is to release ASAP, but no sooner.

    To be my usual tiresome self i would like to interject a little "it
    depends" here.

    Define an ideal and come up with a rubric for pattern variation.

    These absolute rules always frustrate me because it does depend.

    The ideal is release as often and soon as possible.

    What that means in each context is something different.

    For example, in one of my favorite projects i was working on a large
    internal web site that had over 100 simultaneous active heavy users. It used
    Perl and CGI so i made live changes continually. There was never
    a real release of anything. This worked 99% of the time and it
    was exciting.

    Training was an issue and i tried not to break features, but that's
    not always possible or even desirable. You can't make stuff
    better if you can't break it.

    On another project each release cost millions of dollars because an
    entire nationwide network had to be upgraded and we risked cutting all
    data traffic in large regions of North America. This customer treated
    each release like a nuclear attack so releases were infrequent to
    say the least. Yet other customers in a similar situation
    didn't care and wanted releases much faster.

    On another project the software was more your traditional enterprise
    software that was installed using InstallShield or whatever. Typically
    everyone was always very busy so releases were more of
    an annoyance to them.

    Code Lies as Much as Comments Do

    > Comments lie. Code doesn't.

    This sentiment is used as justification for having very few if any comments in your code. I just don't buy it for many reasons.

    Code lies like a dog under a shade tree in summer. Code lies because the variable names, class names, and method names don't match what the code does. People will use lame names or they will insert new code into a method or new methods into a class that change the nature of the thing. That is a lie in my book. And it happens all the time.

    You may say use good names and you won't have a problem. I agree to a large extent. But i can't make people program well any more than i can make people comment well. If you can accept that people must use good names then you can also accept that people must make good comments. Trust cuts both ways.

    And the lie continues because a name is flat. It relates only to one aspect of a thing. Things are multidimensional and can't be mapped meaningfully to a single name in all their contexts. To the government i am a social security number. To my dog i am a pat and a meal. To those i disagree with i am an idiot. To my doctor i am a series of stats. What is my name?

    I have been misled by comments, but have been helped far more than i have been misled, so that's a win in my book.

    I have yet to see any of this code that is self-documenting so i am unwilling to do away with comments on that assumption.

    XP assumes a continuous chain of oral tradition to make the use of comments less necessary. Perhaps on an XP project this makes sense. But much of my experience is in large distributed teams with lots of churn so i don't think this is a generally applicable rule to do away with comments. No more than i would get rid of jails everywhere just because there is almost no crime in my house.

    Roads Gone Wild

    There is an article titled Roads Gone Wild by Tom McNichol that reminds
    me a lot of the spirit of agile software development.

    The article is about a new kind of traffic engineering
    advocated by Holland's Hans Monderman. And by traffic
    engineering we are talking about roads, sidewalks,
    intersections, etc, not TCP/IP.

    The article lead in starts:
    No street signs. No crosswalks. No accidents. Surprise:
    Making driving seem more dangerous could make it safer.

    Another graphic has the title:
    How to Build a Better Intersection: Chaos = Cooperation

    Step 1: Remove Signs - The architecture of the road, not signs and
    signals dictates traffic flow.

    Step 2: Install Art - The height of the fountain indicates how
    congested the intersection is.

    Step 3: Share the Spotlight - Lights illuminate not only the roadbed,
    but also the pedestrian areas.

    Step 4: Do it in the Road - Cafes extend to the edge of the street,
    further emphasizing the idea of shared space.

    Step 5: See Eye to Eye - Right-of-way is negotiated by human interaction
    rather than commonly ignored signs.

    Step 6: Eliminate Curbs - Instead of a raised curb, sidewalks are denoted
    by texture and color.

    Some interesting quotes:
    * Hans Monderman is a traffic engineer who hates traffic signs. ...
    To him, they are an admission of failure, a sign - literally -
    that a road designer somewhere hasn't done his job. The trouble
    with traffic engineers is that when there's a problem with a
    road, they always try to add something. To my mind it's much better
    to remove things.
    * Monderman ripped out all the traditional instruments used by traffic
    engineers to influence driver behaviour - traffic lights, road markings,
    and some pedestrian crossings - and in their place created a traffic
    circle. The circle is remarkable for what it doesn't contain: signs
    or signals telling drivers how fast to go, or curbs separating the street and

    Friendly C++ Unit Tests

    In C++ the friend keyword makes writing unit test code easy and clean.

    The question is how do you keep your test code separate from your "real" code while having a minimal public interface and allowing separate test classes access to the internals of the classes being tested?

    I want code separation so my code is clean and the final image doesn't include test code. As the test code is usually larger than the code being tested this is important. Please, no separate compilation using macros.

    I believe in testing everything that can break so i don't just test public interfaces. Public interfaces often use a common private interface that i want to be able to test directly so it doesn't have to be retested for each public interface. This requires another class to have non-public access to the innards of another class.

    In C++ this is what friend does for you, rather cleanly. Test classes can be put in another package. And with a forward declaration and the friend keyword, a class can be put under test by any number of other test classes.

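    class TestClass;  // forward declaration; no test header is needed here

    class ClassTested
    {
    private:
        // "friend class" is accepted by every C++ standard;
        // the bare "friend TestClass;" form needs C++11 or later.
        friend class TestClass;
    };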

    The implementation source file would include the path to the full TestClass.
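
Here is a slightly fuller sketch of the same idea, small enough to compile as one file. The Account and AccountTest names and their members are made up for illustration. In a real layout AccountTest would sit in its own test source file that never gets linked into the final image.

```cpp
#include <cassert>

class AccountTest;  // forward declaration; the class under test needs nothing more

// Hypothetical class under test with a deliberately small public interface.
class Account {
public:
    void deposit(int cents) {
        if (isValid(cents)) balance_ += cents;
    }
    int balance() const { return balance_; }

private:
    friend class AccountTest;  // the only trace of testing in the production class

    // Private helper shared by the public interface.
    static bool isValid(int cents) { return cents > 0; }

    int balance_ = 0;
};

// Hypothetical test class: as a friend it can exercise the private helper
// once, directly, instead of retesting it through every public method.
class AccountTest {
public:
    static void testPrivateValidation() {
        assert(!Account::isValid(0));
        assert(!Account::isValid(-5));
        assert(Account::isValid(1));
    }

    static void testDepositTouchesPrivateState() {
        Account account;
        account.deposit(250);
        account.deposit(-100);  // rejected by the private validator
        assert(account.balance_ == 250);
    }

    static void runAll() {
        testPrivateValidation();
        testDepositTouchesPrivateState();
    }
};
```

Calling AccountTest::runAll() from a test driver runs both checks. The production class pays for this with one friend line and no test code of its own.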

    I allow all the code in a package to touch the privates of other classes in the same package. This makes for a minimum public display of behaviour. I assume all code in the same package goes together somehow so there's no need for a class to protect itself from code in the same package.

    Scale Kills: Comair System Crash

    An interesting article on Slashdot (http://it.slashdot.org/article.pl?sid=04/12/26/052212):
    30,000 people have had their flights cancelled by Comair this weekend thanks to
    a computer system shutdown.

    A couple of posters said they didn't think it could be the software or shouldn't
    be the software. This post was a good example:
    > Computers don't freak out or get depressed
    > when work piles up. Backlogs mean nothing;
    > they just keep processing one piece at a
    > time until the pieces run out. I think
    > someone was speaking imprecisely.

    In my experience, it's just the opposite. Systems usually only seriously break when scale increases. That's why unit testing is never even close to good enough coverage. To find scale problems you need to test at scale, and few people want to pay for that. So all hell breaks loose when scale starts happening.

    Increases in backlogs may make queue sizes too small, which causes drops, which causes retransmissions, which makes the problem spiral worse. Maybe an OS network stack queue gets full, a queue which you can't control, and you are in a downward spiral.

    Or the queues may not be flow protected and your memory use skyrockets, which causes a cascade of failures including out-of-memory conditions that may reassert themselves even after a reboot, which causes continuous failure.
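
As an illustration of what a flow protected queue might look like, here is a minimal single-threaded sketch. The BoundedQueue name and its members are assumptions made for the example, not part of any system mentioned in the article. The point is that a full queue says no, which turns a backlog into an explicit signal the producer can act on instead of unbounded memory growth.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical flow protected queue. Not thread safe; a real system would
// add locking or confine the queue to one thread.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0 && "a zero capacity queue rejects everything");
    }

    // Returns false when full instead of growing. The caller decides whether
    // to slow down, shed the work, or retry later.
    bool tryPush(T item) {
        if (items_.size() >= capacity_) {
            ++rejected_;
            return false;
        }
        items_.push_back(std::move(item));
        return true;
    }

    // Moves the oldest item into out; returns false when the queue is empty.
    bool tryPop(T& out) {
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    std::size_t size() const { return items_.size(); }

    // Count of refused pushes; a cheap signal that load exceeds capacity.
    std::size_t rejected() const { return rejected_; }

private:
    std::size_t capacity_;
    std::size_t rejected_ = 0;
    std::deque<T> items_;
};
```

Rejecting at the edge is only one policy. The same check could instead block the producer or drop the oldest item, but in each case memory use stays bounded and the overload becomes visible in a counter rather than in an out-of-memory failure.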

    Any algorithms based on size X are now way too slow for 10X which can cause scaling problems everywhere else or pathologically slow times for certain algorithms.

    CPU time is sucked up which again causes push back and scaling problems everywhere else. Priorities that worked with a certain workload may now cause too much work to be done which kills responsiveness and starves other parts of the system which spirals into more problems.

    XP and the Bowling Game

    There was an interesting thread on comp.object related to how XP would code up scoring for a bowling game (http://www.xprogramming.com/xpmag/acsBowling.htm). One of the thread participants was trying to say something interesting, but didn't do a very good job at it, so I think I'll take a crack at it. They were trying to talk about Fact Based Architectures, which I think is a better up front design for the bowling game solution.

    For the complete article please take a look at: http://www.possibility.com/epowiki/Wiki.jsp?page=FactBasedArchitectures