The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Does this function pattern exist...

Let's say we have a function f(x,y);

Now, let's say that we have a mechanism for this (or any function) that intercedes and examines the parameters (x and y) and puts them in a hashmap.

So, the function runs, but before exiting, the return value is associated with the hash.

Then, on subsequent runs, if the hash is found in the map, the function can instantly return the value. Thus, functions of indeterminate runtimes all become O(1), within memory constraints.

It seems like this has _got_ to exist already, but I've never heard of it. It's most similar to the trigonometry lookup tables that exist, but what I describe 1) Work for anything and 2) Are created on the fly.
mynameishere
Wednesday, April 11, 2007
 
 
It's called caching results... typically used for database queries but works for anything...
Mike S Send private email
Wednesday, April 11, 2007
 
 
"Caching results", perhaps?

In general, result caches are not that useful.  This is  because if your inputs were limited enough that you had so few possible results that caching would save you anything, then pre-calculating a table to do the same thing would save you MUCH more time and take less code.

Also, a 'cache' can make your code harder to test -- for each set of inputs, you need two tests -- one where the desired result was IN the cache, and one where it was not.

The problem also is not how fast you can find something IN the cache, but how long it takes you to find out it ISN'T in the cache.  Whereupon you'll have to do the original calculation anyway.

Having said all that, I'm sure there are circumstances where your inputs can give you 90% cache hits, yet aren't so predictable a pre-calculated table makes sense.  In these very limited circumstances, a "result cache" can help.
AllanL5
Wednesday, April 11, 2007
 
 
Welcome to functional programming.

It only works in cases where your function has no side effects.

If your rolling your own instead of using a language which provides result caching, it makes more sense to seperate the function from the caching.

Wednesday, April 11, 2007
 
 
I guess the sort of application I'm thinking of would be similar to re-sizing fonts. There are billions of possibilities that occur when you combine font type+size+(italic|bold|etc)+user-specified spacing+word processor specific sizing+coloring, etc. But, for the most part, a user will set something up, and change very little for long periods of time. Thus, why calculate the images for "A" and "B" over and over?

So, there are languages/tools that support this already. I'm thinking I _need_ to do this for some Java code, and don't want to re-invent any wheels...
mynameishere
Wednesday, April 11, 2007
 
 
I vaguely remember the flyweight pattern which could be a solution to your specific problem.  Try wiki.

Wednesday, April 11, 2007
 
 
Perl Solution
Wednesday, April 11, 2007
 
 
The term "memoization" was coined by Donald Michie in 1968

Sigh. Not quite 40 years out of date.
mynameishere
Wednesday, April 11, 2007
 
 
Here's the link for the pattern which may help with your problem without the need for memorization.

http://en.wikipedia.org/wiki/Flyweight_pattern

Wednesday, April 11, 2007
 
 
Wow!  Thanks for the link on "memo-ization", I had no idea.  I'd still prefer to call it "result caching" though.
AllanL5
Wednesday, April 11, 2007
 
 
If you're using Java-5 or above, the ConcurrentHashMap.putIfAbsent() allows for easily adding memoization to your logic.
Manes
Wednesday, April 11, 2007
 
 
Perhaps this is stating the obvious but you need to be careful that x & y are the ONLY things which affect the outcome of f(x,y)

If f(x,y) also uses some config info or datasource which may change between subsequent calls - don't forget to expire your cache entry when such changes occur.
JJ Send private email
Wednesday, April 11, 2007
 
 
_be careful that x & y are the ONLY things_

Well, the f(x,y), and in fact the font example (which, yes, I got out of the Design patterns book) are simplified compared to what I really have to cope with.

I'm not sure anything I'm doing can be optimized. The side-effects are all over the place, and I may have to compromise by extracting what functionality I can (sans side-effects) and going from there.
mynameishere
Wednesday, April 11, 2007
 
 
An alternate way of implementing this is as follows. You call the cache in exactly the same way that you'd call the original function. The cache sees if it can answer the question, and if it can, it does. Otherwise it calls the original function, caches the answer, and returns it.

From the point of view of the code outside, it looks just the same as it always did.
Duncan Sharpe
Wednesday, April 11, 2007
 
 
Microsoft's Policy Injection Application Block lets you intercept calls to - and access the return values from - methods in .NET:

http://msdn2.microsoft.com/en-us/library/bb410104.aspx

I believe it comes with a handler to cache return values from methods.
Thom Lawrence Send private email
Thursday, April 12, 2007
 
 
Yes, Entlb 3.0 does come with a caching handler. It might be worth a look as an example of how to do memoization, even if you don't want to use the handler itself.
Chris Tavares Send private email
Thursday, April 12, 2007
 
 
bmm6o Send private email
Thursday, April 12, 2007
 
 
This is often refered to as "memoization".

The beginning of the function looks at the hash which has "memos" for different parameter values, and returns the stored value if an entry exists, as others have pointed out.

Not sure there are any current languages that will do this automatically.

In more dynamic languages you can create wrappers around exisiting functions to do this. Look in the book "Higher Order Perl" for examples for Perl programs.
dot for this one
Thursday, April 12, 2007
 
 
mynameishere,

Be sure your cache doesn't get overloaded with many no-longer-useful items.

In fact, a font cache is the classical example of how to get a memory leak even in a garbage collected language.
dot for this one
Thursday, April 12, 2007
 
 
The problem is that f(x,y) leads you to think that x and y are the only inputs and the return value the only outputs. That is very rarely true (except in cases such as trigonometric functions that you mention.) Other inputs and outputs would be the file system, global variables, databases, the software environment, connected devices, external inputs, the time of day and so forth.

If you are working in a pure functional language like Haskell then the inputs and outputs are absolutely constrained as above, and these automatic functions are often generated as part of the compile process (for example, partial evaluation tools can evaluate a substantial amount of a functional program before execution, kind of like constant folding on steroids.)

Monday, April 23, 2007
 
 
Yes, but often these can be dealt with.  For example, Microsoft's ASP.Net cache object allows item-specific cache dependencies.  So, you can add something to the cache and state that it is valid until file x is modified.  You can add a cache dependency based on a file, a database item, or another cache item.

As long as you know your function's dependencies, you can often build an effective caching model to speed it up.  This is really handy in framework based or code-generated apps.  These types of apps often have of repeated code that would be inefficient if not cached well.
JSmith Send private email
Friday, April 27, 2007
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz