I completely agree with the prevailing wisdom that optimization should only follow a demonstrated need for it. All smug and self-assured I code away merrily until I run into a situation that makes things all messy and complicated…
Here’s the situation: I want to output a logging message containing a snippet of results from a JSON payload returned from a REST call:
string toLog = json.Substring(0, 30);
Nice and simple. However, it turns out that the JSON data contains newlines which kind of mess up the flow of my logging statements — I’d prefer it if the string was all on one line. Easy enough to accomplish:
string toLog = json.Substring(0, 30).Replace("\n", "");
Hm. But this is kind of wrong. Now instead of a 30-character snippet of the JSON data, I’ll have 30 characters less however many newlines there were within the first 30 characters. So to fix this:
string toLog = json.Replace("\n", "").Substring(0, 30);
Nice! But wait… These JSON payloads can be fairly large — multiple thousands of characters. That means I’ll essentially be copying a very large string in the course of doing the Replace(), only to extract a very small part of it.
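For what it’s worth, the full copy can be avoided by scanning the string only as far as needed. What follows is a minimal sketch, not what I actually shipped: the class `SnippetHelper` and the method `FirstCharsWithoutNewlines` are hypothetical names made up for illustration, and skipping `\r` as well as `\n` is my own assumption about what "all on one line" should mean.

```csharp
using System;
using System.Text;

public static class SnippetHelper
{
    // Hypothetical helper (name and signature are assumptions for illustration).
    // Collects up to maxLength characters from the start of source, skipping
    // '\n' and '\r', and stops scanning as soon as the snippet is full, so the
    // rest of a large payload is never touched or copied.
    public static string FirstCharsWithoutNewlines(string source, int maxLength)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));

        var snippet = new StringBuilder(Math.Min(maxLength, source.Length));
        foreach (char c in source)
        {
            if (snippet.Length == maxLength) break;   // snippet is full: stop early
            if (c == '\n' || c == '\r') continue;     // drop line breaks
            snippet.Append(c);
        }
        return snippet.ToString();
    }
}
```

With this, `SnippetHelper.FirstCharsWithoutNewlines(json, 30)` would stand in for the one-liner. The trade-off is plain to see: a dozen lines of code to maintain in place of a single chained call, which is exactly the kind of cost the measure-first rule warns about.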
Now, in principle, I really shouldn’t worry about it until I’ve seen some evidence that the cost of doing the Replace() call is prohibitive. And depending on the context, I would do just that. And indeed, in this particular case, the performance overhead will probably not be noticeable.
But it chafes me. It steams my beans! I shouldn’t be creating a copy of a multi-K string just to grab 30 characters! That’s just plain wasteful and I don’t like it.
I’m prepared at this point to entertain the theory that I’m making a big deal out of a small thing. I do that sometimes. However, as engineers we face this kind of decision over and over. In the course of building a million-line codebase, how much inefficiency and overall gunk does this kind of thing add? How do slightly inefficient operations add up to influence efficiency in the large? In the past, Moore’s Law has protected us, but some are arguing that that protection is soon to lapse.
I don’t have a general solution. In this case I went with the more efficient version (take the substring first, then strip the newlines) because a) it only required a slight modification to the desired behavior, in that I won’t always get a full 30 characters, and b) it felt like the right solution.
No one likes to chafe…
(PS: I know that the Substring() method isn’t safe if you ask for more characters than the string has: it throws an ArgumentOutOfRangeException. Rest assured that in the production code I’m using an extension method that takes care of that little detail…)
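For the curious, such an extension method might look something like the sketch below. This is an illustration under my own assumptions, not the production code itself: the class name `StringExtensions` and the method name `Truncate` are hypothetical.

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical extension method (name is an assumption for illustration).
    // Returns at most maxLength characters from the start of value, and simply
    // returns the whole string when it is shorter than maxLength, where a bare
    // Substring(0, maxLength) call would throw.
    public static string Truncate(this string value, int maxLength)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));

        return value.Length <= maxLength ? value : value.Substring(0, maxLength);
    }
}
```

With that in place, the logging line would read `json.Truncate(30).Replace("\n", "")` and would no longer blow up on a short payload.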