Caches: an unpopular opinion, explained...


  1. Caches: an unpopular opinion, explained... (6 messages)

    In his blog, Joseph Ottinger sets the record straight on why caches are not always the right solution to a performance problem, especially when it comes to live data.

    "The use of cache on live data, where the data is read-mostly to write-only, is what I find distasteful. There are circumstances which justify their use, as I’ve already said, but in general the existence of a cache indicates an opportunity for improvement."

    Check out Joe's full discussion on the right and the wrong way to use a caching strategy:

    Caches: an unpopular opinion, explained...

  2. Ted Neward said essentially this at the NFJS expert panel. When confronted with a performance problem, everyone's first instinct is to say, "Oo! We'll cache.", when it is really more or less a band-aid. (Note I'm paraphrasing Ted here.)

  3. Yep, I can see that.

    I didn't see Ted at NFJS, but I can certainly see him saying this: caches for live data are a band-aid worse than the original problem. Fix the problem, not the symptom! This stuff is actually pretty well known in the high-performance arena; I just catch flak for it anyway. :)

  4. I guess we should stop using Intel CPUs then, since they cache live data -- even in multi-CPU systems! If they didn't cache live data, their performance would be horrible -- and you can force them into such a mode by turning off all of the CPU caches (assuming your BIOS settings expose these options).

    At any rate, if you read Joe's article, you'll note that he is simply saying "instead of calling a coherent cache a cache, let's just call it a data grid", which is OK I guess. OTOH, most of the article was just an advertisement for his employer ;-)


    Cameron Purdy | Oracle

  5. Band aids are excellent solutions to bloody problems.

    One such bloody problem is latency, for which caches are a cheap, simple, and fantastically efficient solution.

    Be pragmatic, dudes, not stupid.

  6. Disagree

    Caching live data can be an absolutely correct decision; it largely depends on the overall system design. Broad statements like that are simply misleading. Cache vs. data grid vs. in-memory data grid is just hair-splitting; these terms are becoming interchangeable.


    Nikita Ivanov.

    GridGain Systems

    High Performance Cloud Computing

  7. Another kind of temporary data

    The article was interesting, but there are other types of temporary data. For example, say I'm building a trading application that does "what-if" analysis on a potential transaction. The trader may want to create several "what-if" scenarios and run them through compliance validation to make sure each one is good. In a traditional system, all of that "what-if" data is saved to the database. The compliance report for each "what-if" transaction is also saved to the database. The problem is that the compliance report could be huge, with thousands of rows of data. A simple transaction with a dozen trades could result in a report with 5K rows of data. Say a trader creates 10 "what-if" orders to explore his options. Once the trader is done, the 9 "what-if" transactions that weren't used need to be flushed from the system. Most systems do that during off hours.

    If the system used a data grid to keep that temporary data, we could avoid hitting the database and doing lots of cleanup during off hours. I'm sure other people have seen other "temporary data" use cases that could benefit from a data grid.
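
    To make the idea concrete, here is a rough sketch of the "what-if" flow, using a plain in-memory dictionary as a stand-in for a data grid. Everything in it (ScratchGrid, WhatIfScenario, commit, the session and order names) is hypothetical and made up for illustration; a real grid product would have its own API, and this ignores replication, eviction, and failover entirely.

```python
from dataclasses import dataclass, field


@dataclass
class WhatIfScenario:
    """One hypothetical order plus its (potentially large) compliance report."""
    scenario_id: str
    trades: list
    compliance_rows: list = field(default_factory=list)


class ScratchGrid:
    """Hypothetical in-memory stand-in for a data grid, keyed by trader session."""

    def __init__(self):
        self._sessions = {}

    def put(self, session_id, scenario):
        # Scratch data lives only in memory; nothing is written to the database.
        self._sessions.setdefault(session_id, {})[scenario.scenario_id] = scenario

    def commit(self, session_id, chosen_id, save_to_database):
        """Persist only the chosen scenario, then drop the whole session."""
        scenarios = self._sessions.get(session_id, {})
        chosen = scenarios[chosen_id]  # KeyError if unknown; session stays intact
        save_to_database(chosen)
        del self._sessions[session_id]
        return len(scenarios) - 1  # how many scenarios were discarded


# Ten what-if orders, each with a dozen trades and a 5K-row compliance report.
grid = ScratchGrid()
database = []  # assumption: a list stands in for the real database
for i in range(10):
    grid.put(
        "trader-1",
        WhatIfScenario(
            scenario_id=f"order-{i}",
            trades=[f"trade-{n}" for n in range(12)],
            compliance_rows=[f"check-{n}" for n in range(5000)],
        ),
    )

discarded = grid.commit("trader-1", "order-3", database.append)
```

    Only the chosen order ever reaches the database; the other nine scenarios and their reports disappear with the session, so there is nothing left for an off-hours cleanup job to delete.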