Sunday, April 3, 2011

What I've learned from consulting (so far)

A few years back, I moved from a financial industry IT job to the consulting world. The agency I moved to was local to the area I already worked in. The pitch to get me to move was that they were growing and needed manpower. I didn't know what I would be doing or what the expectations were. I didn't have a role or an assignment. All I knew was that it was a young, fresh environment, very different from my internal banking position. You know times are good and a company is doing well when it needs help without a specific role in mind.

Regardless, I learned a lot in my first few years. I want to dump and share what nobody told me about consulting and what I learned. Everything I mention is mostly limited to web and commerce driven sites, which have been my primary focus for a few years now. The points below applied directly in my environment, and I don't know how well the concepts and rules apply elsewhere. It's also my view and opinion. I am likely wrong in some cases, and I will learn better later, or never.

1. Software Projects Don't Fail

I always heard in school, and at my first job, that software projects fail. They start up, they get funding, there is design, build starts, and somewhere along the way the project dies and never launches. This doesn't happen in consulting, at least where I worked. You succeed in degrees, meaning you never complete anything, it's always going, and therefore you never fail. You launch, otherwise you are out of a job. There are no other options.

2. Launching with a bug list

I always thought software went through a very refined and rigorous testing and launch phase. We are talking about something that millions of people will use and millions of dollars will be made from. I thought a launch had to be perfect. I soon realized that if this were true, projects would fail to launch, which number 1 says can't happen. Products are launched with a bug list and are never perfect. They are good enough and are fixed later.

3. Formal technical documentation is in general useless

Similar to number 2, I assumed documentation was a huge necessity for software. It's not. Smart people are good at making sense out of nothing and nonsense. I think code, scripts, configurations, etc. should be self documenting. Initial design and approach are important to spawn ideas and think out where you might go with a solution. Writing it all down and formalizing it before implementation is a waste. Why? Your design is likely wrong, not in the sense that you are stupid, but because more is understood as you go. As you go, going back to update formal documentation is slow and painful. Plus, nobody is looking at the documentation anyway.

Like I said, your technical stuff should be self documenting. Use good naming, comment inline, be kind to developers that are going to have to read your work.

4. Formal functional documentation really matters

Number 3 does not apply to functional documentation, functional in the sense of use cases, stories and business rules. These need to be hammered out and finalized per sprint, build cycle or release. Changing functional details midstream is the death of completion. I understand things change, but you cannot allow changes to functional level specs to impact the sprint agreement. It's the only way a sprint is a sprint.

5. Gut check work breakdown is good enough

When planning a large project, you would think that once functional level design is completed, technical implementation design gets completed, the work is then well understood, and proper estimation can occur. Since technical design is, again, in general worthless, you can't estimate properly this way. If you spend too much time doing technical design without implementation along the way, you've likely designed down a path that is wrong and will change, and your specifics on the work breakdown are already incorrect. The work breakdown can be a gut check after functional agreement. How do you gut check?

6. How to estimate work (or gut check it)

Nobody in school tells you how to estimate work given vague details, which is what number 5 is all about. It's as simple as problem solving. Break the problem down into minimal meaningful chunks. Each chunk should be somewhere in the range of 2, 4, 6, 8 or 10 hours. If a chunk is more than that, break it down further.

Your estimate might be wrong, it might be right. Regardless, the exercise helps you break the problem down into small pieces and tackle the solution in chunks. Even if you aren't responsible for estimation, it's good practice to take an assignment yourself and break it down from an approach perspective.
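As a rough illustration of the gut check, here is a minimal Python sketch. The chunk names and hour values are made up for the example, and the 10 hour ceiling is simply the top of the range mentioned above. It adds up the chunks and flags any that are still too big to trust.

```python
from dataclasses import dataclass

# Upper bound taken from the 2 to 10 hour range described above.
MAX_CHUNK_HOURS = 10


@dataclass
class Chunk:
    name: str
    hours: int


def gut_check(chunks):
    """Return (total_hours, names of chunks that should be broken down further)."""
    too_big = [c.name for c in chunks if c.hours > MAX_CHUNK_HOURS]
    total = sum(c.hours for c in chunks)
    return total, too_big


# Hypothetical breakdown of a checkout feature, for illustration only.
breakdown = [
    Chunk("cart page markup", 4),
    Chunk("price calculation", 8),
    Chunk("payment integration", 24),
    Chunk("order confirmation email", 2),
]

total, too_big = gut_check(breakdown)
print(f"total: {total}h, break down further: {too_big}")
```

Anything that lands in the second list is a sign the problem is not understood well enough yet, so it gets split and estimated again.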

7. How to go live

Going live on the web is tough. Nobody told me this. Nobody prepared me for it either. Plan for 6 months of hell. Nights and weekends are gone. You won't see much of your family or friends. You need to pack in with your team, work close and break the rules. There are likely processes and procedures around releases. There is likely a date that everything is due. I am not sure why, even if everything is planned and the work is well understood, projects fall behind. But, as I said in number 1, they don't fail. Therefore, rules are broken. The team will naturally form its own productive habits and procedures to help get to the finish line. This should not be messed with. If the project needs to go, then whatever it takes needs to be allowed. You will likely find true and useful patterns fall out of a team going live. Compare them to what your organization really does. I bet the comparison will show what is wrong in your organization's operating procedures and demonstrate opportunities for improvement.

You also need to manage your client for going live. Shit is going to be wrong, and as in number 2, you are going live with a bug list already. Text alignment, images and styling are not show stoppers. Get the damn site up, start making money and fix it later. The technical team will not manage this. You will need a seasoned PM or client manager. Without one, you will fail purely because the client escalates and puts the brakes on.

8. The go live date will move

Despite your best efforts, your launch date will move. Again, manage your client, put your head down and go. There is no worth in assigning blame in the process. Get live, then figure out what went wrong and who pays for it later. Stick to your guns on your functional documentation, number 4. These are the agreed upon site functions. If they change, you need proof of the scope change and you need to hold your client responsible for the impact of the miss or change.

9. Logging really matters

You are going live with a bug list, remember? Those are your known issues. On top of that, a bunch of shit you never thought of is going to fail. This is when your logging really matters. You need to be able to enable and disable logging on all your logic, both client side and server side. Client side, use some JS console logger. You can also log client activity to the server with fancy solutions. Log data transformations, before and after. Log arithmetic. Log counters. Log array indexing. Log method entry and exit. Catch exceptions and log as much data around the exception as possible. Formalize key failure logging strings for monitoring and alerts. Doing this before a problem occurs is incredibly valuable. Having to push code changes during a problem just to add logging means you didn't do your job. If someone tells you your logging impacts performance, tell them to shut up.
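For the server side, a minimal Python sketch of this kind of logging could look like the following. It uses the standard `logging` module. The logger name `checkout`, the `CHECKOUT_FAILURE` alert string, and the `line_total` function are all hypothetical names made up for the example, not part of any real project.

```python
import functools
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("checkout")  # hypothetical logger name

# Hypothetical fixed string that monitoring and alerts can match on.
ALERT_KEY = "CHECKOUT_FAILURE"


def set_verbose(enabled):
    """Turn detailed logging on or off at runtime without a code change."""
    log.setLevel(logging.DEBUG if enabled else logging.INFO)


def traced(func):
    """Log entry, exit, and any exception together with the call's inputs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # log.exception records the traceback along with the message.
            log.exception(
                "%s in %s args=%r kwargs=%r",
                ALERT_KEY, func.__name__, args, kwargs,
            )
            raise
        log.debug("exit %s result=%r", func.__name__, result)
        return result
    return wrapper


@traced
def line_total(unit_price_cents, quantity):
    """Example arithmetic worth logging: inputs on entry, result on exit."""
    if quantity < 0:
        raise ValueError("quantity must not be negative")
    return unit_price_cents * quantity


set_verbose(True)
line_total(250, 3)
```

The entry and exit lines only show up when verbose logging is switched on, while the failure line with the alert string is always written.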

10. Code review everything

I've learned so much from reviewing other people's code. I've caught so many of my own bugs walking someone else through my code. I've seen so many bugs when looking at a diff. There is no way a change should go live without two sets of eyes seeing the change and working through the problem. Even good technical people make mistakes. We are going live, remember? We are working nights and weekends, so we are going to mess up. The review doesn't have to be formal, and it should be desired. You need to want people to beat up your code. It only makes you better and stronger. Build it into your process, with explicit options for passing and failing a review.

I keep saying code, but review everything. Configuration and scripts count.

11. Performance and load testing is not optional

I've done launches without proper load and performance testing. You cannot assume anything. Unless it's measured, you don't know anything. Think your existing infrastructure and configuration can handle additional traffic? How do you know? You need to measure it.
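To make "measure it" concrete, here is a minimal Python sketch using only the standard library. It fires a batch of concurrent calls and reports latency numbers. `fake_request` is a stand-in that just sleeps. In a real test it would be replaced by an actual call against your own site, and a dedicated load testing tool would do a far better job than this.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request):
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def measure(request, total_requests, concurrency):
    """Issue total_requests calls, `concurrency` at a time, and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(
            pool.map(lambda _: timed_call(request), range(total_requests))
        )
    # Approximate 95th percentile by position in the sorted list.
    p95_index = min(len(latencies) - 1, int(len(latencies) * 0.95))
    return {
        "requests": len(latencies),
        "median_s": latencies[len(latencies) // 2],
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


def fake_request():
    """Stand-in for a real HTTP call (an assumption for this example)."""
    time.sleep(0.01)


report = measure(fake_request, total_requests=50, concurrency=10)
print(report)
```

Run it at increasing concurrency and watch where the 95th percentile starts to climb. That is a number, not an assumption.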

12. Failure is good, do it fast

Shit is going to fail, I mean fail. Site will go down, users will complain, money will be lost. These are all very bad. How do you prevent it again? The failure needs to be well understood. It needs to be embraced like a success. It should be documented, presented and well explained. Tell your friends, tell your mom, tell your customers. You will talk about the failures of your work much more than the successes. You'll learn so much more from screwing it up, doing it wrong and fixing it.

Just remember to fail fast. Again, say goodbye to nights and weekends, because you need to get the site back up and fixed. Put your game face on, and do it. These are your war stories, the death march. You can be a hero. It's the IT football field, it's overtime, your team is down and they need you.

13. Be ready to rewrite an implementation

Scaling for growth is inevitable, but doing it upfront for extreme situations doesn't make sense, because you don't know how extreme things will get. Expect your implementations to have a breaking point. They will stop working at a certain load level and will need to be rewritten. This is a hard place to fail fast in, but if you use proper measurements, you should know when your current solution will stop working and when to start planning to get ahead of it. If not, you will likely need to patch it with something dirty in the meantime to get to an actual fix.
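One crude way to turn measurements into "when will this stop working" is a straight-line projection. The Python sketch below assumes linear growth, which real traffic rarely follows exactly, and every number in it is invented for the example.

```python
def months_until_capacity(monthly_peaks, capacity):
    """Estimate months left before peak load exceeds measured capacity.

    monthly_peaks: peak requests per second for each month, oldest first.
    capacity: the load level at which the current solution was measured to break.
    Returns None when traffic is flat or shrinking.
    """
    if len(monthly_peaks) < 2:
        raise ValueError("need at least two months of measurements")
    # Average growth per month between the first and last measurement.
    growth = (monthly_peaks[-1] - monthly_peaks[0]) / (len(monthly_peaks) - 1)
    if growth <= 0:
        return None
    return max(0.0, (capacity - monthly_peaks[-1]) / growth)


# Hypothetical numbers: four months of peaks against a measured limit of 300.
peaks = [120, 150, 180, 210]
remaining = months_until_capacity(peaks, capacity=300)
print(f"about {remaining:.1f} months of headroom")
```

If the answer is shorter than the time a rewrite would take, planning for it is already late.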

14. Dirty is just fine most of the time

Well crafted algorithms and processes are really cool, but usually some duct tape, smoke and mirrors will do just fine. Why? Because the dirty work will likely solve the problem with no negative effects for a while, or forever. Think about the time you saved if a solution you designed in a few hours stands up forever. Plus, once it breaks, you need to rewrite it anyway. We aren't building a house or a bridge, we're building software, web software. Technology doesn't need to stand as a structure for hundreds of years. Platforms change within 5 years and you need to keep up anyway. Don't kill yourself over-engineering a solution. Break out of your framework or stack, get the tape and make it happen.

And that's it... I likely forgot something, but this is a good start. I am still doing consulting work. I like being on the front lines, I like the thrill, and I like fixing disasters for some reason. I don't like giving up my whole life, so it's tough sometimes. I've learned that in order to put up with such a demanding role, you need to take back what you put in. If you can't, you likely need to move on before you lose your personal life.
