Saturday, July 2, 2011

Is caching your crutch? Fix your code first, then cache.

I read a lot about performance, specifically on the web. There is a lot of hype right now about page speed and the appearance of speed to the user of a site. A lot of these discussions talk about caching strategies. While I feel caching is a vital layer in web performance, I feel more and more that it's used as an excuse not to fix poorly performing base code.

"Site is slow, oh we need to cache more". No, your code sucks. It can't stand on its own under load, and it doesn't degrade gracefully either.

There are so many layers of cache. You can have internal application caching with simple in-memory data structures. There are also object caches like memcache. There are content delivery networks like Akamai. You also have browser storage (HTTP cache, cookies, local storage, etc.). Then there are warming processes, prefetching, and background loading.

Being good at all of these is hard, and writing correct, performant code is hard too. I think writing better code is the first fix for performance gains. I've seen so much time spent on caching in order to cover up bad code. If the base problem were simply fixed, you wouldn't need that much cache anyway.

There are some simple first steps to be performant before you move to caching to keep your site up.

Seriously, look at your loops. You might be looping over file records to construct a data structure, and then looping over that data structure again. This makes sense in object-oriented programming for organization, but not for performance. Do the logic on the data as it's read.
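Here is a minimal sketch of that loop point in Python. The record format ("name,amount" per line) and the function names are made up for illustration. The first version builds an intermediate list and loops again; the second does the work as each record is read.

```python
import io

# Hypothetical input: one "name,amount" record per line.
RECORDS = "alice,10\nbob,5\nalice,7\n"

def totals_two_pass(lines):
    """Build an intermediate list of records, then loop over it again."""
    records = [line.strip().split(",") for line in lines]  # first loop
    totals = {}
    for name, amount in records:                           # second loop
        totals[name] = totals.get(name, 0) + int(amount)
    return totals

def totals_single_pass(lines):
    """Apply the logic to each record as it is read; no intermediate list."""
    totals = {}
    for line in lines:
        name, amount = line.strip().split(",")
        totals[name] = totals.get(name, 0) + int(amount)
    return totals

result = totals_single_pass(io.StringIO(RECORDS))
```

Both functions return the same totals. The single-pass one never holds the full record list in memory, which matters once the file is large.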

Define your variables where they are used and clean them up when done. Sometimes container objects are constructed and hold data that is never accessed. Lazy load this information. When your container constructs, you don't need to initialize or execute logic that isn't used in a given execution path. Have a teardown strategy. Delete references, close connections, clean up. Look at your code's memory usage.
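A rough sketch of lazy loading plus explicit cleanup, assuming Python 3.8+ for `functools.cached_property`. The `Report` class and its `load_count` counter are hypothetical, and the string parsing stands in for an expensive read or query.

```python
from functools import cached_property

class Report:
    """Hypothetical container whose expensive field is built only on first access."""

    load_count = 0  # counts how often the expensive load actually ran

    def __init__(self, source):
        self.source = source  # cheap: just remember where the data comes from

    @cached_property
    def rows(self):
        # Stand-in for an expensive read or query; runs once per cached value.
        Report.load_count += 1
        return [line.split(",") for line in self.source.splitlines()]

    def close(self):
        # Drop the cached reference so the data can be garbage collected.
        self.__dict__.pop("rows", None)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
```

An execution path that never touches `rows` never pays for it, and using the object in a `with` block releases the data as soon as the block ends.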

Remove resources from your test environment. Test your code on minimal hardware. Reduce memory, reduce CPU, make disk I/O slow. In these conditions, how does your code perform? Badly? If you can get it to run fast on less, it's going to run better on more. You'll start looking at your code's overall resource consumption in a way you didn't before. You'll want to run performance tests on this environment to know where the site falls down. I hate hearing, "this is my test environment, so the site might be slow".

Have you thought about threads yet? Break up your logic, distribute it. How about asynchronous logic? How about queues? All of these let you defer work until later when there is too much going on.
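As one possible shape for the queue idea, here is a sketch using only Python's standard `queue` and `threading` modules. The `handle_request` function and the doubling "work" are placeholders. The request path only enqueues, and a background worker does the slow part later.

```python
import queue
import threading

jobs = queue.Queue(maxsize=100)  # bounded, so a backlog cannot grow without limit
results = []

def worker():
    """Drain the queue in the background until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:           # sentinel: shut down
            jobs.task_done()
            break
        results.append(job * 2)   # stand-in for slow work (email, resize, report)
        jobs.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()

def handle_request(value):
    """Hypothetical request handler: enqueue the slow part and return at once."""
    jobs.put(value)
    return "accepted"

statuses = [handle_request(n) for n in (1, 2, 3)]
jobs.put(None)   # tell the worker to stop after the queued jobs
jobs.join()      # wait until every queued item has been processed
thread.join()
```

The bounded queue is a deliberate choice here: when it fills up, `put` blocks, which gives you a natural place to push back on callers instead of letting the backlog eat memory.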

When there is too much going on, does your site degrade? Can you turn off sections one at a time, versus putting up a complete outage page? If not, your users are likely getting upset. Add this capability.
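One way to picture turning off sections under load, as a sketch only. The section names, the load thresholds, and the idea of a single 0.0 to 1.0 load number are all assumptions made for the example.

```python
# Hypothetical section names and load thresholds; tune these for a real site.
SECTION_LOAD_LIMITS = {
    "recommendations": 0.70,  # shed first
    "comments": 0.85,
    "article": 1.01,          # core content: never shed
}

def enabled_sections(load):
    """Return the sections to render at the given load (0.0 to 1.0)."""
    return [name for name, limit in SECTION_LOAD_LIMITS.items() if load < limit]

def render_page(load):
    """Render each section, or a placeholder comment if it is switched off."""
    active = enabled_sections(load)
    parts = []
    for name in ("article", "comments", "recommendations"):
        if name in active:
            parts.append(f"<{name}>")
        else:
            parts.append(f"<!-- {name} temporarily unavailable -->")
    return "\n".join(parts)
```

At a load of 0.9 this renders the article and replaces the other two sections with placeholders, so the user still gets the core page instead of an outage screen.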

What is your site download size? Getting it over the wire without cache is a huge part of performance. How large are your pages, scripts, images, styling, media, etc.? Can they be compressed?
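To get a quick feel for what compression buys you, a small sketch with Python's standard `gzip` module. The `compressed_size` helper and the sample page body are invented for the example, and real savings depend entirely on your content.

```python
import gzip

def compressed_size(payload):
    """Return (original_bytes, gzipped_bytes) for a response body."""
    return len(payload), len(gzip.compress(payload))

# Hypothetical page body; repetitive markup like this compresses well.
page = b"<li class='item'>hello</li>\n" * 500
original, compressed = compressed_size(page)
```

Running the same measurement over your real pages, scripts, and stylesheets shows where the bytes go. Already-compressed media such as JPEG images will gain little from gzip.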

Now tune your platform software. Tune your web server, tune your app server, tune your database. Tune your OS. These likely have performance boosting settings built right in that can be adjusted for your software.

Now, once all this is done, you can start thinking about cache (if you even need it). If you put caching first, on top of code that doesn't know what it's doing anyway, then it's a crutch. As soon as the cache is removed from the equation, your code will fall over.
