May 11, 2016

Technical Debt is misnamed

Technical debt. Everyone knows what it is - things left undone or poorly done which get worse and worse and slow you down until you fix them. Like debt, accumulating interest. But... not really.

Technical debt is not like regular debt, because regular debt has a set payment schedule, and eventually you either default on it in an understood way or pay it off, also in an understood way. Technical debt often does have a regular payment schedule of slow development. But unlike regular debt, you default on technical debt at a random time and in a random way. Suddenly the system explodes.

The way your technical debt defaults makes it more like mob debt than a mortgage. On some random day a guy comes around and breaks your legs. But there are several unpleasant differences that make it worse than mob debt. First, technical debt does not care if you can pay the loan. The mob wants you to keep working and pay them; being roughed up is just incentive. Technical debt doesn't care if it destroys you completely. Second, the mob lets you know it's displeased ahead of time. The first guy comes around and makes comments about how unfortunate it is you are behind on your payments. The next guy roughs you up. Technical debt might give you warning signs or it might be completely silent until the day it comes around and murders you and everyone you've ever talked to.

On the good side, sometimes you're lucky and technical debt just goes away. The thing you did wrong you don't even want to do any more, so all that code gets deleted and the problem is solved. But if you're not lucky, it might make more sense to call it technical risk or technical roulette (Russian subtype). It might slow you down a bit, and it may or may not murder you at any time.

February 24, 2016

Collapse of the Interview Calendar?

Throughout my career the college interview calendar has been pushing back. In the mists of time before the Y2K bug, new grads started interviewing in the spring and, if they were on the ball, locked down a job in March for a June start date, three months lead time. As big companies developed an insatiable hunger for new grads, starting I think with Microsoft but exemplified by Google, they were willing to make offers earlier and earlier. The fall career fairs, originally for those graduating in January, became a place to make full time offers for next June, and the spring fairs were mostly for picking up interns. Then the insatiable hunger spread to internships and now most hiring happens in the fall, nine months before the start date.

This does everyone a vast disservice. Small companies can't hire new grads effectively because they cannot wait nine months for a new employee. College students are forced to make a major life decision unnaturally early. Big companies hire students into general new grad piles and then sort them out later, eliminating one of the most important steps of hiring - meet your new boss before you decide.

There might be a light at the end of the tunnel though - in the past year questions on advice sites about reneging on job or internship offers have gone from basically zero to a pretty regular occurrence. Personally I think a significant amount of reneging would be a good thing so that companies pull back their offer windows to something merely extended as opposed to downright ridiculous.

August 14, 2015

Papertrail vs. Logentries - One great tool beats a bag of OK tools

Recently I've been looking into a centralized logging solution for us so that we can lock down the servers but still see what is going on. Our analytics needs are not huge - we have a low number of valuable customers so we don't need to analyze behavior, we can ask them. But we do need to be able to solve problems for them immediately.

That means live tail and search of current and recent logs is a necessary feature, alerts are important, and graphs could be nice to have. And we want to hook right to the unix standard logging facilities rather than have another piece of code to wonder about.
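For a sense of what hooking to the standard logging facilities can look like from the application side, here is a minimal sketch. It assumes a Python application and a syslog daemon listening on UDP, neither of which is stated above; the function name, logger name, host, and port are all placeholders. Forwarding from the daemon to a hosted service would be daemon configuration and is not shown.

```python
import logging
import logging.handlers
import socket


def make_syslog_logger(name, host="localhost", port=514):
    """Return a logger that sends each record to a syslog daemon over UDP.

    host and port are assumptions for illustration; use whatever address
    your local daemon (or log service) actually listens on.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Keep records out of the root logger so they are only sent once.
    logger.propagate = False

    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_DGRAM,  # plain UDP syslog
    )
    # Put the logger name up front so a search on it finds the line.
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With this in place, something like `make_syslog_logger("billing").info("invoice 42 failed")` hands the line to whatever is listening at that address, and the application carries no vendor-specific logging code.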

While there are a lot of ways to run a centralized log yourself, then we would have to monitor and maintain that server as well, and we're not log experts. It would be better to throw a reasonable amount of money at the problem. Looking around, most hosted log solutions are heavy and analytics focused without live tailing. Only two really have that as a key feature: Logentries and Papertrail.

After an initial evaluation based on reading and asking questions, Logentries was the clear winner. It has live tail, it has ways to highlight and tag log lines, search them, and graphically interact with them, and it has dashboards that hook to these things and a bunch of integrations. Papertrail has a live tail you can search, and it can trigger alerts off the searches, and that's it. Papertrail also gives you about half the log volume and log retention per dollar that Logentries does.

Once the rubber meets the road, though, Logentries falls down. Live tail, as I was evaluating, lost some of the messages. Not many, and they were in the non-live log, but they were not in the live tail. The UI worked, but it could be fiddly, the sort of "if you click A and then B it gets a little confused" thing that shows up in so many JS heavy websites. The support staff was not able to diagnose that my initial connection problem might be because selinux was preventing the connection. The docs are sometimes a bit out of date (though generally pretty good). And the killer is that the live tail issues were known by support, but not in the incident log, so how many times has this sort of thing happened recently? Who can say.

Papertrail didn't grow any more features, but it performed flawlessly. Live tail chugged along, completely searchable. All of the UI features work smoothly and intuitively and have help links embedded right there. When I had the selinux problem I had solved it before they were awake but they knew all about how that happens and sent me the commands to fix it, which are also in their docs. Papertrail status includes (not many) slowdowns and multiple updates and goes back for years.

Papertrail is only one tool but it's one sharp well kept tool. Logentries is trying to do a lot of things, but they're not sharp. We're going to pay more, and get fewer things, but I think we're going to get more in the end.

July 30, 2015

Everywhere I've Worked that Worked was agile, mostly unrelated to Agile

Agile is at this point a tired buzzword. Too often it means "we run around a lot and generate Agile artifacts at a terrific rate" without actually producing useful product.

Thinking back I realize that everywhere I've ever worked that actually shipped things was using a process that was agile in the small-a sense - cycles of customer feedback, planning, and work. Back before Agile, when BBN was trying to sell it to a government suspicious of any lifecycle other than plan / do / realize it all sucks and throw it out / repeat, we called it spiral, where you spiraled out through successive cycles.

The next company didn't have the feedback cycle and just did stuff. That was fast, but it failed. After that I went to another which had the planning cycle, but no feedback. Work got done but it was crushed by the market. Forum Systems had both customer input and a planning cycle but again no Agile, just all the useful pieces, and that product worked (though the company had a rough ride for other reasons).

After that my next company had Agile with a capital A, but could not prioritize worth a damn or believe schedule estimates they didn't like. That failed (big surprise). Then Tripadvisor was a super agile cycle with weekly releases, prioritization, and planning, with great feedback via analytics, and that shipped an amazing amount of product updates that really worked. Again, TA has been working like that since before Agile got its A.

The current company is the first I've been at that uses actual Agile methods in an actually agile way. I suspect that compared to industry the ratio of functional to nonfunctional processes in companies I've been at is fairly high, and the ratio of capital A Agile to actually agile is fairly low.

December 19, 2014

Interesting paper on the system issues in Machine Learning

Not just a snappy title. Glue code, pipeline jungles, crazy problems, some familiar to any big data processing system and some unique to ones where you're stuffing the data into a model.

Machine Learning: The High Interest Credit Card of Technical Debt