Wednesday, March 15, 2006

The Good, Old Times

or, More Cogs Decrease Efficiency


Just when I thought I had run fresh out of rants, all is great and why the hell did I start a blog, I brushed again against the ugly spectre of bureaucracy. A few years ago, working on a new OOB XP technology, my duties included a constant dialogue with IHVs (mainly driver developers). We were a bit understaffed on the PM side, the devs were loaned to MCE to hold their devs by the hand through the most difficult process of writing working software and so we, the test, were wearing the polymorphic hat we sometimes do: test/dev/PM. (And doing none really well.) Driver devs had questions about the spec, and we'd get them the clarifications (as well as updating the spec with more examples). Then they would need test tools, and we'd give them a bucketful of test cases, along with the reference output. Lastly, we'd help them with debugging/unpublished error codes and code up small fixes that our devs would put in the product. It really was great - fast paced development of both our tests and their drivers, and you could see the quality of the overall experience going up every week. What more could you, as a professional, expect other than the instant gratification of "it's fixed/better"?

Well, too effin bad, 'cause those days are gone and they ain't coming back. We now have almost as many PMs as we have devs, pretty fresh in MS years, high on talk and full of process. To be fair, the process part is not of their conception, and it's by far the most damaging aspect of this "new" life. Need to speak with an IHV? Sure, talk to the PM, presenting the problem until they understand it, send them an email summarizing the issue and the expected outcome, ping them again in two days and try to maintain composure. Then that email gets to the addressee (hopefully) or to their counterpart in the IHV company (realistically) and, in due time, I'll get back a response. Need a quick and dirty update to the spec being published? Likewise, we have to go through LCA, lest our treasure map is leaked outside and we end up begging for food in high, affluent foot-traffic areas. And you just know those helpful LCA people, as well as your PM, have assigned exactly the same priority to this task as you and your IHV counterpart did. (Speaking of LCA, I'm not accusing them of not working hard during their regular program, but did you ever happen to be in one of those buildings at 4pm? The elevators burst at the seams and they will run you over en route to the nearest exit. Be warned. I think "long hours" in that building means working till 4:05pm. The offices are deserted and you just know that if you see one of them in during the weekend, they're up to no good. Just kidding.)

Try to circumvent this pointless detour and watch hell emerge; after all, what do you know, it's not as if you participate in all the high level meetings that the PMs do, and you lack the clear overview to see that this tiny, 30 min technical issue must follow the rigorous chain of communicators in order to be properly solved, without presenting a risk to the company. That's how it is to be done, so get your issues back in line, Bucky.

I postulate that the number of bugs is the same in any generation of drivers/components. People don't learn from their more obscure mistakes, and when they do, the new version adds complications that compensate, with bugs, for the learned lessons. It's just that what took 2 days in SP2 to identify, solve and fix now takes a week, at best. To that, I add the subjective cost of losing the giddiness that stems from seeing issues actively resolved. Why? Who decided it'd be a great idea to channel technical issues through non-technical people? Or that specs need to go through a week of LCA brewing before going out to partners already under NDA? What massive losses will the company incur if a simpleton of a test app goes out to its intended audience directly from its source? What, do we risk not being able to file yet another pathetic patent on grounds of disclosure?

We need to step back and look at what we need to accomplish, and not at what the "safest" way to do it is. If we delivered the software faster, we could make further progress and achieve more over the same time, versus locking up all the doors and making sure nobody is peeking over our shoulders at the laughable contraption we've put together.

Want an example from an extremely fast paced (figuratively and literally) field of economic activity? In Formula 1, there are no patents. The time and expense required to obtain a patent are time and funds taken away from development and/or improving on the last race. The patent would essentially give your competition the blueprints to your clever finding, and the meager royalties you collect from such a small audience are absolutely nothing in contrast with the possibility of them bettering your form by adding your finding to theirs. And that really is high tech. Who cares about aping a silly API to dethrone the mighty Windows?

But until the good old days return (or until I find another company that believes in agile development), I'll go back to quarreling with my PM about each one of us not including the other in important discussions.

Sunday, March 05, 2006

Who needs my software errata?

or, Updating Anonymous Subscribers

This week I've been bitten twice by the same problem, and had to live with the infamy of being the offending party in one of the situations.

Say you're providing a library (and associated headers), or a generic service, to an interested party, without necessarily adding them to a subscription list. Such a list would be the most reliable mechanism, certainly, but one doesn't always have the luxury of (pick your choice) being able to collect this data, or having the time to update/consult it etc.

Versioning would also help, provided it is accompanied by broadcasting the current "safe" version, and provided that the interested parties poll for updates. In one of the cases I mentioned above, a DLL I wrote, which is used by another group, took on a new dependency. The DLL was consumed by their own harness/component and, as you may have guessed, the setup of their component was not updated to copy the new modules. One or even two days of runs were lost (with detection, investigation and what not). Now they are on my list, but I'm not sure who else I can add to the list. So, what's the solution?
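One partial mitigation for the missing-dependency case is a check, run by the consumer's setup or harness, that fails loudly when a required module was not copied. The following is only a minimal sketch under assumptions of my own: the module names `mylib.dll` and `newdep.dll` and both function names are hypothetical, and the provider would have to publish the list of required modules alongside the library.

```python
from pathlib import Path

# Hypothetical module list; the library provider would ship and maintain it.
REQUIRED_MODULES = ["mylib.dll", "newdep.dll"]


def missing_modules(install_dir, required=REQUIRED_MODULES):
    """Return the required module names that are absent from install_dir."""
    root = Path(install_dir)
    return [name for name in required if not (root / name).is_file()]


def verify_install(install_dir, required=REQUIRED_MODULES):
    """Raise immediately, naming the missing modules, instead of failing mid-run."""
    missing = missing_modules(install_dir, required)
    if missing:
        raise FileNotFoundError(
            "Missing modules in %s: %s" % (install_dir, ", ".join(missing))
        )
```

Calling `verify_install` before the first run would turn a day or two of lost runs into an immediate, self-explanatory failure, though it does nothing for consumers who never adopt the check.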

The second of the two was a more classical example of broken backward compatibility. A widely used logging component (part of a larger system, let's call it UTT) revised the behavior of a method it exports. But instead of a more straightforward crash/fail when the old-style params are passed (which would instantly attract attention), it quietly went about working, in an invalid state, crashing only upon exit. The investigating parties had no idea such a change occurred, and were exchanging pleasantries ("I think it's in your component", "Actually, I think it's in yours", "Mine hasn't changed", "Neither has mine"). Presumably, this is where versioning might have helped. But the detection of a version newer than expected would probably prevent situations like these in less than half of cases. (And the consumers would incur the cost of investigating whether the new version of the service provider should translate into updated client code.) (Well, the second example was a simple case of dumbassness, but there's got to be a protection against that also.)
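The two ideas in this paragraph, failing immediately on old-style params and detecting a provider version newer than expected, can be sketched roughly as follows. This is an illustration only, not the actual logging component: `log_line`, `check_provider`, the severity names and the dotted version format are all assumptions.

```python
class IncompatibleProviderError(RuntimeError):
    """Raised when the provider's major version differs from the tested one."""


def parse_version(text):
    # Assumes a plain dotted numeric version such as "2.1.0".
    return tuple(int(part) for part in text.split("."))


def check_provider(found, tested_against):
    """Compare the provider version found at run time with the one tested.

    Returns "same", "newer" or "older" within a major version, and raises
    when the major version differs, which is treated here as a breaking change.
    """
    found_v = parse_version(found)
    tested_v = parse_version(tested_against)
    if found_v[0] != tested_v[0]:
        raise IncompatibleProviderError(
            "provider %s was never tested; expected %s.x" % (found, tested_v[0])
        )
    if found_v == tested_v:
        return "same"
    return "newer" if found_v > tested_v else "older"


def log_line(message, *, severity):
    """New-style call: severity is keyword-only and validated up front.

    An old-style positional call such as log_line("msg", 2) raises TypeError
    at the call site instead of running on in an invalid state.
    """
    if severity not in ("info", "warning", "error"):
        raise ValueError("unknown severity: %r" % (severity,))
    return "[%s] %s" % (severity.upper(), message)
```

A "newer" result only tells the consumer that something changed; deciding whether client code must change too is still the manual investigation described above.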

I'd love to hear your thoughts, words of advice (or condescending snickers, as long as you have a simple solution to this).

Tevio Destre
Media Center quirks

Auto updates. Nothing more annoying than losing the signal while watching a show. An asteroid taking out a satellite is certainly an acceptable reason. A power outage is also palatable. Heck, even a bug check is okay - it's either an unstable configuration (unsupported drivers or what have you) or I (or someone else) haven't done my/our job properly. (Not that it's ok not to do my job, but we can't all be perfect.) However, when the rude interruption occurs due to a silly applet, the behavior becomes malign - by design. The dreaded auto system update comes with a kink - the restart is not left to the user's convenience, but is rather an ultimatum. The Win update service decided the latest patch, applied to prevent an obscure Photo editor exploit, is so crucial that without it your entire house will collapse after your computer has sustained spontaneous combustion. You have been warned (in the background) and have 10 minutes to comply. But alas, you're watching Whatever: Wherever in full screen mode in your Media Center session (or you're recording it), and the bloody applet restarts the system. I'm so lucky my TiVo doesn't run IExplore or Media Player. One more hurdle in getting the PC in the living room.


Another quirk - let's say you're behind the current time in the 30 min buffer in live TV. You paused it and now you want to record the entire show. The silly thing starts the recording at the current live time, so you've lost all the content between current playback time and current live time. Who.. what.. Why?? You've even lost the 30 min buffer, and that definitely counts as a loss of data. I can't imagine the expected behavior would have been so difficult to implement, and I'd like to meet the person who decided this is "by design". Unbelievable. You'd think the first order of business when a new MCE is kicked off is to go out, buy a TiVo and spend 5 days with it in an office until you know exactly where its flaws lie and what its strengths are. (I hear this might be fixed in Vista, though.)
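To put numbers on the complaint, here is a toy model of the two behaviors. It is a sketch of the arithmetic only, not of how Media Center is implemented; the `LiveBuffer` type, its field names and the use of seconds are assumptions.

```python
from dataclasses import dataclass


@dataclass
class LiveBuffer:
    buffer_start: int  # oldest second still held in the pause buffer
    playback_pos: int  # where the viewer currently is
    live_pos: int      # the live broadcast edge


def recording_start_observed(buf):
    """The behavior described above: recording begins at the live edge."""
    return buf.live_pos


def recording_start_expected(buf, show_start):
    """The behavior I'd expect: keep everything of the show still buffered."""
    return max(buf.buffer_start, show_start)


def seconds_lost(buf, show_start):
    """Buffered show content thrown away by the observed behavior."""
    return recording_start_observed(buf) - recording_start_expected(buf, show_start)
```

For instance, with a buffer starting at second 0, a show that began at second 300, playback paused at 600 and the live edge at 1500, `seconds_lost` comes to 1200: twenty minutes of the show that were sitting in the buffer and never made it into the recording.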


Lastly, you can't modify an ongoing recording. For instance, the show being recorded is running longer - such a simple scenario. The only way to accommodate the overtime is (apparently) to cancel the current recording and set up a new one, with advanced settings. If you know of an alternate way, do let me know. If this, too, was "by design", I'd love to hear the reasoning behind this dimwitted decision.

Tevio Destre