Why all the hating on OO?

I think I finally figured out why there is so much disdain, if not outright hate, for Object-Oriented techniques these days. TL;DR: OO is not well suited to the prevailing need for client-server (i.e. web) applications.

OO is essentially an extension of the idea of abstract data types (ADTs). Originally ADTs were implemented with a combination of a static structure definition and a set of procedures/functions that could manipulate it, but the data and algorithms were separate things and had no syntactic or semantic link. OO came along and bundled the two into one syntactic/semantic unit: the class. (It also added polymorphism, inheritance, et al., but these features aren’t critical to the issue at hand.)

An OO class combines both data and functionality. But here is the killer issue in today’s world: functionality cannot be expressed in a language-neutral way; it can’t cross the programming language barrier. If I have a C# or C++ class definition, I cannot use that class in JavaScript or Ruby. Sure, there are sometimes awkward binding techniques available, but there is a fundamental mismatch between the way things are expressed in different languages that always interferes. The data part of a class definition can be expressed in a limited, standards-based way (e.g. XML or JSON), but the algorithms cannot.

But the most common client-server scenario in our day is a web application, where the server code is written in one language and the UI in another. It’s no wonder that OO doesn’t work well in that case — the same class definitions can’t be used on both the client and server side! Trying to shoe-horn OO into a web app design leads to duplication of algorithmic code and the use of dumb DTO “objects” in between. Not much of a win from OO…
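As a rough illustration of the point about data crossing the language barrier while behavior does not, here is a minimal Python sketch. The `Account` class, its `withdraw` rule, and the `to_wire`/`from_wire` helpers are hypothetical names invented for this example; the dictionary returned by `from_wire` stands in for what a client written in another language would actually receive.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Account:
    """Hypothetical server-side class: data plus behavior."""
    owner: str
    balance: float

    def withdraw(self, amount: float) -> None:
        # The business rule lives next to the data it guards.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def to_wire(account: Account) -> str:
    # Only the fields survive serialization; the withdraw rule does not.
    return json.dumps(asdict(account))


def from_wire(payload: str) -> dict:
    # What the other side effectively gets: a plain record
    # (a "dumb DTO") with no methods attached.
    return json.loads(payload)


account = Account(owner="pat", balance=100.0)
dto = from_wire(to_wire(account))
print(dto)                       # the data made it across
print(hasattr(dto, "withdraw"))  # the behavior did not
```

Any client that wants the `withdraw` rule has to re-implement it in its own language, which is exactly the duplication described above.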

But even supposing you could use the same class definitions (say, with the UI in JavaScript and Node.js on the server), you still have a mismatch between what the algorithmic part of the class needs to do on the client vs. the server. On the client, you may need to make a call to the server, but on the server you’d need to persist or fetch data from a store. So even in this case much of your code would have an “if client do this, else if server do that” flavor to it. (This mismatch is not unique to web apps; it affects all client-server applications.)
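A small sketch of that “if client do this, else if server do that” flavor, written in Python only to keep the example short. Everything here is hypothetical: the `Order` class, the `running_on` flag, and the `store` and `outbox` stand-ins for a database and for outgoing HTTP requests.

```python
class Order:
    """Hypothetical class shared by client and server code."""

    def __init__(self, order_id: int, running_on: str, store: dict, outbox: list):
        self.order_id = order_id
        self.running_on = running_on  # "client" or "server"
        self.store = store            # stand-in for a database
        self.outbox = outbox          # stand-in for requests sent to the server

    def save(self) -> None:
        # One method, two unrelated jobs depending on where it runs.
        if self.running_on == "client":
            self.outbox.append(("POST", f"/orders/{self.order_id}"))
        elif self.running_on == "server":
            self.store[self.order_id] = {"id": self.order_id}
        else:
            raise ValueError(f"unknown environment: {self.running_on}")


store, outbox = {}, []
Order(1, "client", store, outbox).save()  # queues a request
Order(2, "server", store, outbox).save()  # writes to the store
```

The shared class definition buys very little here: the two branches of `save` have nothing in common beyond the method name.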

So unsurprisingly, web developers have never been fond of OO, and for good reason. I would also suggest that, due to this mismatch, the web community may never have developed a real appreciation for the substantial strengths of OO when applied to the right problem, or even a comprehensive understanding of the underlying notion of ADTs. Add in the fact that some ostensibly OO languages (JavaScript) really are not, and the waters are muddied enough to impede a developer’s ability to really learn and understand OO.

Now, all that said, you may still decide to use a class definition on the client, the server, or both, in the appropriate language. But the client-server barrier limits how much of that work you can leverage across a vertical slice of the application.


Posted in design, software engineering

A Framework for Evaluating Software Engineering Decisions

People of good will may disagree. This is just as true in software construction as in other areas of life. In life, you can agree to disagree and move on with your day, but in software construction a decision must be reached. Usually, this involves choosing between two or more competing ideas or solutions. In this post I’m going to outline a way of thinking about these evaluations.

A “decision” in this context can be very small or very large, ranging from the design of a class or method to the selection of a language or methodology. Obviously, the larger the decision the more complex the evaluation will become, but here I try to provide a way of thinking about the decision that will work at any granularity.

Since engineers — and software engineers in particular — are opinionated buggers, we need to find a way to focus everyone on the same basic set of evaluative criteria. True objectivity may not be entirely possible, but we can at least try leaning in that direction…

The criteria I try to apply are simple: cost and benefit. It clearly makes no sense to do something that has no benefit but a high cost, and it clearly makes no sense not to do something with lots of benefit and no cost. But the fuzzy areas in between are where most of our decisions need to be made, and where we often get mired in competing notions of what constitutes a benefit and what constitutes a cost.

Clearly, when we are talking about “engineering”, or “software design” or “cost vs benefit” we are talking about an actual, pragmatic, real-world problem we are trying to solve. Therefore, we should be pragmatic in the way we resolve the fuzzy areas of decision evaluation.

I propose that both benefits and costs for a given alternative need to be evaluated and placed along a continuum:

  • Theoretical: The benefit is generally agreed to hold some value, but it is difficult to pinpoint its overall effect on the efficacy of the alternative. The cost is generally agreed to be non-zero, but it is difficult to gauge its magnitude.
  • Characterizable: Either a cost or benefit can be clearly characterized in real-world terms — e.g. a performance benefit, or a cost in increased programming errors due to complexity — but it is still an indefinite value either due to the nature of the alternative or some unknown or unknowable environmental influence.
  • Quantifiable: A benefit can be quantified in terms of engineering effort saved, performance increased, etc. A cost can be quantified in terms of engineering effort, error rate, etc.

The key thing we are looking for is to break down the simple idea of cost and benefit in some way that will help us compare alternatives in a systematic way.

A few clarifications and examples:

  • For something to be quantifiable it requires some form of measurement. What is meaningful here will depend on the organization and the team, and two alternatives may use different units: e.g. one alternative’s benefit is performance and the other’s is development velocity.
  • When characterizing a cost or benefit it is valid to characterize it relative to another alternative, e.g. alternative A will be more expensive to implement than alternative B.
  • Things like “best practices” are of theoretical benefit unless there is some attribute of the problem or product that ties directly to them.

Correlating the costs and benefits by these categories:

| | Theoretical Benefit | Characterizable Benefit | Quantifiable Benefit |
| --- | --- | --- | --- |
| Theoretical Cost | Meh. May be some benefit, may be some cost. Nothing compelling or distinguishing. | Cheap, but with some benefit. | This is ideal, especially if the magnitude of the benefit is large. |
| Characterizable Cost | Unless the cost is extremely small, why would you use this alternative. | Compare the magnitude of the cost and benefit. | Compare the magnitude of the cost and benefit. |
| Quantifiable Cost | Unless this cost is extremely low, don’t use this alternative. | Compare the magnitude of the cost and benefit. | Compare the magnitude of the cost and benefit. |
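The grid above can also be read as a small lookup function. The following Python sketch is only one possible encoding: the `Certainty` enum and the `guidance` function are names made up for this example, and the returned strings paraphrase the cells (the two “theoretical benefit, non-theoretical cost” cells are merged into a single piece of advice).

```python
from enum import IntEnum


class Certainty(IntEnum):
    """How firmly a cost or benefit has been pinned down."""
    THEORETICAL = 1
    CHARACTERIZABLE = 2
    QUANTIFIABLE = 3


def guidance(cost: Certainty, benefit: Certainty) -> str:
    """Return the grid's advice for one alternative."""
    if benefit is Certainty.THEORETICAL:
        if cost is Certainty.THEORETICAL:
            return "Meh: nothing compelling or distinguishing."
        # Known cost, merely theoretical benefit.
        return "Avoid unless the cost is extremely small."
    if cost is Certainty.THEORETICAL:
        if benefit is Certainty.CHARACTERIZABLE:
            return "Cheap, but with some benefit."
        return "Ideal, especially if the benefit is large."
    # Both sides are at least characterizable.
    return "Compare the magnitude of the cost and benefit."


print(guidance(Certainty.THEORETICAL, Certainty.QUANTIFIABLE))
print(guidance(Certainty.QUANTIFIABLE, Certainty.THEORETICAL))
```

Note that four of the nine cells collapse into “compare the magnitudes”, which is where the real evaluation work still has to be done.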

Generally speaking, the closer the costs and benefits sit to the quantifiable end of the spectrum, the more reliable the evaluation of the alternative is.

This doesn’t solve all the problems, of course. In the end you still need to pick an alternative, and choosing between an alternative with theoretical cost and quantifiable benefit and one with characterizable cost and benefit could be quite difficult. It’s here that the magnitude of the benefit and cost comes into play.

Posted in Uncategorized

Stop the Madness!

I just watched an interesting talk about using data values as natural program boundaries. Some valuable ideas, but the main thing that it made me think about was unrelated to the talk itself. I learned about this talk when I was at the recent QCon conference; it was recommended by a speaker who was enraptured and completely convinced that functional programming is The Correct Way to Do It (CWDI pronounced “cow-dee”).

Programming style advocacy — whether for/against OO, imperative, declarative, functional, or whatever — has a problem. And I’m not referring (at the moment) to the language debates that they usually get mired in. The problem is that no single programming style is The Correct Way to Do It.

The proof of this is trivial:

  1. “Correct” must be evaluated first in terms of the functionality needed.
  2. All general-purpose programming languages are Turing complete.
  3. All solutions are functionally equivalent.
  4. All solutions are “correct”.

That was stated in a bit of a light-hearted way, but don’t just gloss over it. If you disagree with the proof, it probably has to do with the way the word “correct” has been constrained. When we are talking about programming styles the word “correct” typically has a bunch of other implications, including maintainability, testability, aesthetics, expressiveness, and so on.

And this is all fine when we are having theoretical conversations and discussions. But it becomes harmful when we enter the realm of engineering non-trivial systems. (By non-trivial here I’m talking about systems whose complexity easily surpasses the capabilities of a single engineer — no “toy” problems, please!) Engineering decisions are always about trade-offs, and programming styles are no exception.

What is required is to match the programming style to the problem being solved. For a machine-control system functional programming is often considered the best solution; for complex data-modeling environments OO may be better suited. Real-world complex systems are often composed of multiple subsystems, each of which may call for a different style of programming. Then add the constraints of the environment to the constraints of the problem: Does the team know how to use a particular style? Would we need to pick a new language? Does the benefit outweigh the cost?

But it is just lazy thinking to say that OO or functional programming is always the right solution. (I don’t care about your generic arguments that “mutability == badness” either; just because immutable values are easier to test isn’t enough to trump all other factors.) (And I don’t care about your theoretical strong-typing either; sometimes type integrity enforced at run-time is just what is needed.)

Not everything is a web-page backed by MongoDB; not everything is a bespoke enterprise desktop backed by SQL; not everything is a machine control application. Tune your solutions to the problem!

But I guess the first, most important thing is to at least learn about and understand all the options available. You may not have the opportunity during your career to actually do serious projects with them, but at least be aware of the costs and benefits of all the major programming styles and don’t confuse those issues with programming language advocacy.

Stop the madness…

Posted in design, Programming languages, software engineering, Uncategorized

Can we just say “Advice” instead of “Best Practices”?

Yeah, yeah, I know — what difference do the words make? A lot actually.

“Best Practices” implies many things, including:

  • There was some evaluation of multiple practices, and only these were deemed to be “best” by some unspecified but no doubt impressive criteria.
  • Many places/people/organizations do these same practices.
  • Following these practices will make you successful.
  • These practices are somehow scientifical…
  • The makers of the list really knew what they were doing.

I have yet to see any of these implications hold for the “Best Practices” I’ve encountered.

What is true is that there are practices that have been shown by individuals (and sometimes groups of individuals) to be helpful to them in their situation. Which is perfectly valid and interesting in its own right, but there is no need (or reason) to overstate the case.

The humbler word “advice” is, IMHO, much closer to the mark. Maybe if we dial down the hyperbole truly great ideas will be able to stand out against the noise.

Posted in Programming languages, Rant, software engineering

Code Bootcamps Considered Harmful

To quote Charlie Brown, “AUGH!” This article is very depressing to me…

I don’t think it is depressing that these people are getting good jobs. I think that’s just businesses being stupid and not understanding what they need or what they are getting. But what else is new on that front?

What is depressing to me is the implicit assumption that these short courses (24-week, or 30-hour, or whatever) can actually be compared in any meaningful way with a college education. Now I realize I may currently be in the minority in believing that college is worthwhile, but how can a coding bootcamp or academy be viewed as equivalent to a structured course of study over two years (if we factor out the non-major portion of a degree program)? This is just rank obtuseness.

It’s not that I think bootcamps or code academies are bad, per se — I find coding interesting and engaging so I’m not surprised others do as well. However, there is a big difference between dabbling in a complex topic and training to do it!

My prediction is that within a few years the bootcamp bubble will pop and leave behind a mess of a) failed startups and b) failed projects at well-established, short-sighted companies.

And then they’ll have to call in the professionals to clean it up.

yay for us professionals…

Posted in Rant, software engineering

Metrics: When the Numbers become the Goal

In general, I’m in favor of development teams gathering metrics. Metrics about your codebase or test execution times can tell you a lot, especially if you track them over time for trending. Metrics about your development process can help you plan more realistically. So metrics, properly used, are a good thing.

But to ensure that metrics are “properly used” it is necessary that they not be disseminated outside the development team. Managers like numbers — they are specific, they are concrete, they look good in reports (especially when they are trending up). But managers just can’t seem to resist using them improperly.

The trouble with metrics is that numbers have no context. A number doesn’t mean anything except in relation to other numbers; metrics don’t mean anything unless you understand how they are derived. Even then, a single metric rarely has much informational value — you usually need several to characterize anything. And once you’ve got multiple metrics, you’ll now have disagreements about their collective interpretation. Which is cause and which is effect?

But even in the rare cases that the development team fully understands the meaning and nuances of a particular metric value, their managers most likely will not.

Inevitably, the numbers will take on a life of their own when they escape the context that created them. The numbers will become the goals of the team, rather than a tool for the team. They become benchmarks and lose all their informative and predictive power.

So use metrics to learn about your code and processes, but hide the numbers!

Posted in Process, Rant, Software

The Limits of Specifications

Most descriptions of software development processes begin with the idea that you have some notion of what you want to build. The degree of specificity varies. At the least precise is the very loose, one-sentence story point or “product idea”, and at the other end of the spectrum is the Specification. Modern, agile approaches tend to the former and more traditional approaches tend to the latter.

But both stress the wrong thing.

I suppose I’ll need to justify that remark…

Let’s start at the Specification end of the spectrum. Specifications take on many forms: requirements elicitation and enumeration, use case analysis, UX mockups. All of these can be very useful, especially during the early stages of work where you are trying to (roughly) scope out the work and effort involved.

The Specification is handed to the Developer, who then Develops the Specification into Product. Depending on the company and the process, the developer may not even have had a hand in the creation of the Specification, but in any case must interpret the specification to actually create the code. Developers make hundreds of decisions in the course of an implementation, and many (if not most) of them have to do with deciding the minutiae of how the product or feature will behave. In essence, all of these are product design decisions.

Interpretation == Product Design

This is where the emphasis on initial specification is misguided. Well, not completely misguided, but it is at best unbalanced. If a company pays close attention to the specification process, but then does not train the developers to understand the domain of the product, then it is compromising the product design inherent in the hundreds of behavioral decisions made by developers every day.

Agile methodologies do better than Big Up-Front Design (BUFD) methodologies in this respect because they explicitly have a user advocate on the team. (At least officially; my understanding is that the user representative piece of agile is often compromised in practice.) This gives developers someone to go to for a decision when they are in doubt. While inefficient, this is fine. However, a user advocate isn’t thinking in terms of the code because that is the developer’s job, and over time that can lead to unnecessarily complex code because each decision is considered in isolation and not as part of an overall implementation strategy.

So what I’m advocating here is that developers on a team must have a deep understanding of the customer needs and domain of the product, and that this understanding is at least as important as having a specification for the feature/product at the outset. A company must spend the money and effort to get their developers this understanding as well — it won’t just happen magically through osmosis or something.

Sadly, too often I think the industry views developers as simple technicians taking people-speak and turning it into code, and I think that a lot of developers look at things that way! Too often we hear that the requirements or the specifications were incorrect or incomplete, as if that were not, at least in part, the developer’s responsibility (whether they are a full-time employee or a contractor).

The software development process is too large and complex (at least for non-trivial problems) to make it practical to isolate domain knowledge with business analysts or systems engineers. Domain knowledge must permeate the entire organization.

Posted in design, Process, requirements, Software

Complexity and Third-party Tools

Great article here from Allen Holub.

The only thing I would add is that I don’t think his experiences with Jersey are, in any way, an exception. I think they are the rule.

Which is why it is vitally important to always studiously and thoroughly examine, test, and vet any third-party components you choose to use:

  1. Start with a simple list of the basic capabilities you need. If you don’t have this, you shouldn’t even be looking at third-party components, even if they are the current hot tech in the field.
  2. Does a candidate provide the capabilities you need? Most evaluations seem to stop here. Don’t do that…
  3. How much does the candidate provide beyond what you need? Whether you might someday take advantage of a feature you don’t need now isn’t really relevant to the evaluation…
  4. How many other things does the package depend on? Are you going to end up with 3 or 4 logging packages because each third-party component you incorporate uses a different one?
  5. How easy is it to use? If its usage is more complex than the basic capabilities you need warrant, you’ll probably want to create wrappers, which will reduce the ROI on using the component.

Also, I think it’s important to realize that selecting a framework (in the sense of a full-featured development platform like .NET) that you are going to build your system on top of requires different considerations than selecting a component to fill a specialized niche of functionality in your system. The biggest difference is that with a framework you really aren’t capable of rolling your own, but with a component that is always an option…

Posted in design, Software

Interviewing Candidates

I was reading an interesting post on interviewing today, and came across this:

Whiteboard and coding interviews also routinely ask people to re-implement some common solution or low-level data structure. This is the opposite of good engineering. A good engineer will try to maximize productivity by using libraries to solve all but the most novel problems.

While I can’t disagree with the general point, I think it glosses over an important issue: Being able to reuse libraries does not imply being able to reuse them correctly.

I’ve been shocked and saddened at times to come across candidates who lack a basic understanding of fundamental data structures, algorithms, and ideas of computer science (or software engineering) as well as bedrock knowledge about how computers work.

One candidate I interviewed had just graduated with a B.S. degree and didn’t know how many bits were in a byte. She very quickly moved out of programming and into management. Another candidate didn’t know what a linked list was.

A linked list. Really.

Now I’m not saying they should write a linked list class in an interview; if they’ve been exposed to it at all, it was probably in Data Structures 101, where they also implemented it. However, understanding the characteristics of a singly- versus doubly-linked list, versus an array, versus a heap is important, both for understanding the performance profiles of the data structures and for knowing what they are and are not suited for.
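To make those characteristics concrete, here is a short Python sketch using only the standard library. It is not a benchmark, just the same five numbers held three ways, with the costs noted in comments. One assumption to flag: `collections.deque` is used as a stand-in for a doubly-linked list (it supports constant-time operations at both ends but not fast access in the middle), since Python has no built-in linked list type.

```python
import heapq
from collections import deque

items = [5, 1, 4, 2, 3]

# Array-backed list: O(1) indexing, but inserting at the front
# shifts every existing element, so it is O(n).
array = list(items)
array.insert(0, 0)

# deque: O(1) appends and pops at either end,
# but reaching into the middle is O(n).
linked = deque(items)
linked.appendleft(0)

# Binary heap: O(1) peek at the smallest item, O(log n) push and pop,
# and no useful ordering among the remaining items.
heap = list(items)
heapq.heapify(heap)
smallest = heapq.heappop(heap)

print(array[0], linked[0], smallest, heap[0])
```

A candidate who can explain why each comment holds, and which structure to reach for in a queue, a lookup table, or a priority scheduler, has shown the understanding that matters here.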

Same with sorting. Bubble sort, insertion sort, quicksort. From an interviewing standpoint what is important is not that a candidate can implement any particular sorting algorithm but that they understand Big O notation and how the performance of sorting or merging is dictated by the data structures chosen as well as the nature of the data being sorted.
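As one small illustration of how the nature of the data dictates performance, the sketch below counts the comparisons a textbook insertion sort makes on already-sorted versus reverse-sorted input. The function name `insertion_sort_comparisons` is made up for this example, and comparison count is used as a rough proxy for running time.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values; return (sorted_list, number_of_comparisons)."""
    data = list(values)
    comparisons = 0
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        # Walk left until we find an element that is not larger than current.
        while j >= 0:
            comparisons += 1
            if data[j] <= current:
                break
            data[j + 1] = data[j]  # shift the larger element one slot right
            j -= 1
        data[j + 1] = current
    return data, comparisons


n = 100
_, already_sorted = insertion_sort_comparisons(range(n))
_, reversed_input = insertion_sort_comparisons(range(n - 1, -1, -1))
print(already_sorted, reversed_input)  # prints "99 4950"
```

The same algorithm is linear on sorted input (n - 1 comparisons) and quadratic on reversed input (n(n - 1)/2 comparisons), which is the kind of reasoning an interview should probe rather than the ability to recite the code.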

Now, there are lots of ways to assess whether or not a candidate is well-versed in these topics, and one of those ways is to ask them to sketch out an implementation of one of them. I don’t think that’s the most efficient or reliable way so I, like Laurie Voss, wouldn’t use that in an interview.

That’s the kind of evaluation I’m trying to make in an interview though: how well does the candidate understand the relevant fundamentals (those I mention above, but also many others), and do they know how to apply them?

Posted in Software

What Do Students Know?

There is a lot of excitement about MOOCs these days, as well as conjecture about how higher education might look in the future. MIT is considering a more à la carte approach to structuring their programs.

I think the increasing access to sources of learning is fabulous and has no drawbacks of any significance whatsoever. However…

There’s a huge difference in what you learn in self-directed study as opposed to studying a complex topic under the tutelage of a professional or a curriculum designed by professionals. As a student coming in cold to a topic like computer science there are hundreds of directions your interests might take you, and there is also probably an online course or source of information that will help you along.

But I would contend that the learning-by-browsing model is not an effective approach when applied to an entire field of knowledge. It works wonderfully for very directed, contained topics. But in my view the student doesn’t know enough to direct their own studies, at least until they’ve achieved a certain level of fluency with the basic material of the area.

And that, I think, is the difference between those who are self-educated and those who are educated via a curriculum; the self-educated have no way of being assured that they have a mastery of the breadth of the important material. Can the self-educated achieve such a mastery? Sure. But the mastery of a field like computer science involves training your mind how to think in abstractions while at the same time being aware of the minutest detail of the technology being used, and I don’t think it is easy, natural, or guaranteed that a course of self-study will get you there. You can be an absolute expert at Ruby and a horrible software engineer. (Replace “Ruby” by your technology of choice…)

So is college necessary in computer science/programming/hacking? Not to master any particular bit of technology. But it is certainly the best path to ensuring a base level of knowledge from which to master such a huge and complex field of study. It may not even be necessary to secure a lucrative job or to start a company delivering software, but don’t be fooled. Monetary success does not mean you know what you’re doing.

Posted in Rant