The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Encapsulation and Information Hiding

People seem to apply information hiding at a design level, then let it drop into an abyss when it comes to implementation. If what I'm about to say twiddles your mind in the slightest, feel free to add your thoughts.

It begins with the following definitions:

* Encapsulation is the public interface that defines how an object can be used, and how its data is derived.
* Information Hiding prevents external objects from (ab)using the derived data altogether.

A gas-guzzling vehicle helps illustrate the difference:

 class Automobile extends Vehicle {
  public final static int EMPTY = 0;
  public final static int FULL = 1;
  public int tank = EMPTY;
 }

The snippet above shows a class with neither Encapsulation nor Information Hiding. This is bad for numerous reasons, not the least of which is extensibility. Instead ...

 class Automobile extends Vehicle {
  public final static int EMPTY = 0;
  public final static int FULL = 1;
  private int tank = EMPTY;

  public int getTankStatus() {
    return this.tank;
  }

  public void setTankStatus( int status ) {
    this.tank = status;
  }
 }

This is slightly better. The status of the tank is now encapsulated, but not hidden from the rest of the system. Changing the tank's type from "int" to "Tank" will break the next compile. No other object in the system should have knowledge of how the gas tank works ...

 class Automobile extends Vehicle {
  private final static int EMPTY = 0;
  private final static int FULL = 1;
  private int tank = EMPTY;

  private int tank() {
    return this.tank;
  }

  private void tank( int status ) {
    this.tank = status;
  }

  public void fillUp() {
    tank( FULL );
  }

  public void depleteOzone() throws GasExhaustedException {
    if( tank() == FULL ) {
      tank( EMPTY );
    }
    else {
      throw new GasExhaustedException();
    }
  }
 }

A simple interface fully Encapsulates and Hides Information: fillUp() and depleteOzone(). No other object in the system can use, or know the state of, the gas tank. This can be taken a step further by exclusively using object instances bound to an interface that defines these methods. But that's an entirely different ball of wax.
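As a rough sketch of that "step further" (the interface name Drivable and the boolean representation are assumptions made for illustration; only fillUp(), depleteOzone(), and GasExhaustedException come from the example above), callers could hold nothing but the interface type:

```java
// Exception named in the example above; declared here so the sketch compiles.
class GasExhaustedException extends Exception {
}

// Hypothetical interface: the only thing callers are allowed to know about.
interface Drivable {
    void fillUp();
    void depleteOzone() throws GasExhaustedException;
}

class Automobile implements Drivable {
    private boolean full; // representation is invisible to callers

    public void fillUp() {
        full = true;
    }

    public void depleteOzone() throws GasExhaustedException {
        if (!full) {
            throw new GasExhaustedException();
        }
        full = false;
    }
}
```

Code written against Drivable never learns whether the tank is an int, a boolean, or a Tank object.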

Perhaps I'm missing something, though, as I've seen code similar to the following that declares itself as using Information Hiding techniques:

 class EncapsulationAndInformationHiding {
    private ArrayList widths = new ArrayList();
    public List getWidths() {
        return widths;
    }
 }

For the life of me, I cannot see what information is hidden. Yes, the abstraction of what *type* of list is being used (ArrayList vs. Vector vs. LinkedList) is concealed via the "List" interface, but it doesn't preclude the information in the list from being modified. Nor does it readily allow drastic changes to how widths work. Modifying the list introduces dependency coupling, and breaks the seal of autonomy. Instead:

 class EncapsulationAndInformationHiding {
    private ArrayList widths = new ArrayList();
    public Iterator getWidths() {
        return widths.iterator();
    }
 }

At least now the list of widths cannot be replaced or added to, even though it remains exposed for the outside world to see (Iterator.remove() can still delete elements unless the list is wrapped in an unmodifiable view). The potential for dependency coupling has been nearly eliminated and the private data remains under control of the declaring class itself.

The reason I say "nearly eliminated" is because the Iterator interface is still bound to the internal representation of the widths (ints vs. floats vs. objects), which introduces a level of coupling.
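One caveat worth sketching: the Iterator handed out by an ArrayList supports remove(), so a caller can still delete elements through it. A minimal sketch, assuming it is acceptable to wrap the list with Collections.unmodifiableList (the class name WidthHolder and the addWidth method are hypothetical):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical holder class; only the owner can change the widths.
class WidthHolder {
    private final List<Integer> widths = new ArrayList<Integer>();

    // Mutation goes through the owning class.
    void addWidth(int width) {
        widths.add(Integer.valueOf(width));
    }

    // The iterator of an unmodifiable view rejects remove().
    Iterator<Integer> getWidths() {
        return Collections.unmodifiableList(widths).iterator();
    }
}
```

Calling remove() on the returned iterator throws UnsupportedOperationException, so the list can only change through WidthHolder itself.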

So my question: why are "get" accessor methods springing up like bad weeds, when more often than not they introduce tight coupling, transparent-box classes, and inextensible source code?
Dave Jarvis
Wednesday, June 15, 2005
 
 
"why are "get" accessor methods springing up like bad weeds, "

The simple answer: Because you are using Java which does not have built-in support for properties. The get/set naming scheme is a poor attempt to gain "some" of the benefits of properties in languages like VB and C#.

Whether or not properties are really useful in data hiding is your real question. Some will argue that they are really just another layer that doesn't really solve the problem. Others will disagree. There have been multiple posts on this out here already.
squidward
Wednesday, June 15, 2005
 
 
"So my question: why are "get" accessor methods springing up like bad weeds, when more often than not they introduce tight coupling, transparent-box classes, and inextensible source code?"

If you're simply returning/setting a value without any calculation, there's generally not a need to use them and it's unneeded effort.  In fact, Martin Fowler recommends that if your get/set methods are that boring, you shouldn't bother with them.

Now it makes sense to have them when you want to completely hide HOW they're being retrieved/stored/calculated and just work with the resulting value.

For example, if you're getting the population of Chicago, you may want to hide it because:
a) this could be the 2000 census number,
b) it may be a SUM() of values from a database,
c) it could be the 2000 census number adjusted for growth over the past five years.
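
As a rough sketch of that idea (the class name, the growth-rate parameter, and the compounding formula are all illustrative assumptions, not real census methodology), callers see only getPopulation() while the derivation stays private:

```java
// Hypothetical city class: callers cannot tell how the number is derived.
class City {
    private final long censusCount;     // e.g. a census figure
    private final double annualGrowth;  // assumed yearly growth rate
    private final int yearsSinceCensus;

    City(long censusCount, double annualGrowth, int yearsSinceCensus) {
        this.censusCount = censusCount;
        this.annualGrowth = annualGrowth;
        this.yearsSinceCensus = yearsSinceCensus;
    }

    // Today: census figure compounded for growth. Later this could become
    // a database SUM() without any caller having to change.
    long getPopulation() {
        return Math.round(censusCount * Math.pow(1.0 + annualGrowth, yearsSinceCensus));
    }
}
```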
KC
Wednesday, June 15, 2005
 
 
"If you're simply returning/setting a value without any calculation, there's generally not a need to use them and it's unneeded effort."

I disagree very strongly. You may not need to do any calculation or checking NOW, but it can happen down the road. That's exactly what getters and setters give you: The freedom to change the implementation down the road, even if you don't need it right now. If you have 100 files accessing your variable directly, you have to go back and change everything.

I've been burned enough times that I no longer expose variables, no matter how boring getters/setters are.
sloop
Wednesday, June 15, 2005
 
 
"That's exactly what getters and setters give you: The freedom to change the implementation down the road, even if you don't need it right now."

I guess my point got missed. Let's try this:

class Car {
  private static final int TANK_EMPTY = 0;
  private static final int TANK_HALF_FULL = 1;
  private static final int TANK_FULL = 2;
  private int tank;

  public int getTank() {
    return this.tank;
  }

  public void setTank( int tank ) {
    this.tank = tank;
  }
}

Now I want to change "tank" from an "int" to a real class "Tank".

It no longer makes sense to return an "int" representing the value of the tank. Thus, I cannot change the implementation down the road without changing the 100s of classes that call getTank(). And even if the language allowed multiple return types, TANK_HALF_FULL is now (at best) an approximation that has to be returned. Thus begins "code smell", and legacy system restraints.
Dave Jarvis
Wednesday, June 15, 2005
 
 
But that issue exists no matter how you decide to code it. That's not really an issue with encapsulation or information hiding.
squidward
Wednesday, June 15, 2005
 
 
I am not sure I'd model the fuel for a car in your manner.

interface MotorVehicle {

  public int tankCapacityInCC();
  public int tankContentsInCC();
  public int tankRecommendedReserveInCC();

}

Maybe this would be better than FULL and EMPTY constants etc?

It would be crazy to go measure CC of the tank and change it to return a Tank instead.

One way (in C++) around the re-typing of the return is to have a typedef for the return-type.  When you change from a POD to a class, you can further ease migration by having a cast operator from the class to a POD.  But this gets tricky and introduces other risks, so the benefit has to be weighed.  Oh well.
new nick, new rep
Thursday, June 16, 2005
 
 
If you want to preserve encapsulation and information hiding, then the ideal
solution to this problem is the separate Tank class.  It's always better to
define your own domain by adding unambiguous types than it is to use ambiguous
integral types such as 'int' and 'double' etc...

Remembering to spread responsibility evenly throughout the system, I'd make the
Tank class responsible for knowing its capacity, maintaining the current fuel
level and notifying its owner Vehicle object whenever it nears empty.

interface Vehicle
{
    Tank getTank();
}

interface Tank
{
    int getCapacity();
    int getCurrentFuelLevel();
    bool isEmpty();
    bool isFull();
    event EventHandler NearlyEmpty;
}
Edward James
Thursday, June 16, 2005
 
 
"I disagree very strongly. You may not need to do any calculation or checking NOW, but it can happen down the road."

Actually, I've had the reqs change on me enough that I make get/set for everything.  I just said that it's considered unnecessary.

I never keep any public variables because - as you said - it can be a nightmare.
KC
Thursday, June 16, 2005
 
 
<< Now I want to change "tank" from an "int" to a real class "Tank". <<

Changing the return type or parameter types is a change in interface, not implementation.  This type of change is taboo in OO (much more so than not hiding data), so it's best to work around it with an extension or new interface.  Your class can implement as many new interfaces as it wants without breaking old code.
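
A minimal sketch of that workaround, assuming hypothetical names (FuelTank, PreciseFuelTank, SteelTank) and litres as the unit: the old interface is left untouched and a new one is added beside it.

```java
// Original contract: existing callers depend on this and keep compiling.
interface FuelTank {
    int getCapacity();
}

// Hypothetical extension added later for callers that need more precision.
interface PreciseFuelTank extends FuelTank {
    double getCapacityInLitres();
}

class SteelTank implements PreciseFuelTank {
    private final double litres;

    SteelTank(double litres) {
        this.litres = litres;
    }

    // Old method kept for old callers; the cast truncates toward zero.
    public int getCapacity() {
        return (int) litres;
    }

    public double getCapacityInLitres() {
        return litres;
    }
}
```

The cost is visible in getCapacity(): old callers keep working, but they receive a truncated value.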
Matt Brown
Thursday, June 16, 2005
 
 
Hey, Edward.

The "getCapacity()" method in the Tank interface returns an int. What happens when you want to move this to a "float"? What I'm reading from your example is that the place where the coupling occurs has moved from the Vehicle interface to the Tank interface, without resolving the issue of exposing information. In this case, two pieces of information are being exposed.

1. The fact that a Vehicle has a tank. Bicycles, to my knowledge, don't have tanks. ;-) If you change "getTank()" to throw an exception if a tank doesn't exist, then you still have to go back and change 100s of lines of code that call getTank() to handle the exception. Unless you're using Smalltalk?

2. The capacity of the tank and its fuel level are still being exposed for all the world to see and abuse. Granted they cannot change (no corresponding set accessor defined in the interface), however I've now been limited to returning an int when I really want it as a float. (At least, some languages have this single-return-type restriction.)

Perhaps the following is a viable solution?

interface Tank {
  boolean isCapacity( int capacity );
  boolean isCapacity( float capacity );
  boolean isFull();
  boolean isEmpty();
}

No matter what type is used for the internal representation of the Tank's capacity, it can always answer the question defined by the isCapacity() method.
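
To make that concrete, here is one possible implementation, sketched under assumptions of my own choosing: the class name MillilitreTank is hypothetical, the internal unit is millilitres, and both isCapacity() overloads are taken to mean litres.

```java
// Interface from above, repeated so the sketch compiles on its own.
interface Tank {
    boolean isCapacity(int capacity);
    boolean isCapacity(float capacity);
    boolean isFull();
    boolean isEmpty();
}

// Hypothetical implementation that stores millilitres internally.
class MillilitreTank implements Tank {
    private final long capacityMl;
    private final long levelMl;

    MillilitreTank(long capacityMl, long levelMl) {
        this.capacityMl = capacityMl;
        this.levelMl = levelMl;
    }

    // Both overloads take litres (an assumed unit) and compare in millilitres.
    public boolean isCapacity(int litres) {
        return capacityMl == litres * 1000L;
    }

    public boolean isCapacity(float litres) {
        return capacityMl == Math.round(litres * 1000.0);
    }

    public boolean isFull() {
        return levelMl == capacityMl;
    }

    public boolean isEmpty() {
        return levelMl == 0;
    }
}
```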

Lastly, if the problem is more about wanting to know the exact capacity, and when it changes, that's a good candidate for an observable pattern.
Dave Jarvis
Thursday, June 16, 2005
 
 
Dave,

To me it sounds like you're abstracting just for the sake of abstractness. Taking your example, if we initially model the tank capacity as an int in the interface and much later (like after the product is completed, otherwise refactor it during development) it needs to be modelled as a float, create a new Tank interface, say Tank2 or some such, that inherits from the original interface and adds a new method that returns a float for capacity. Code depending on the original int method remains unchanged.

IMHO, you're designing for a problem that does not yet exist. By all means have a plan to be flexible to change as per my example above, but don't waste time on gold plating when you don't know if you need it.
Gerald
Thursday, June 16, 2005
 
 
"What happens when you want to move this to a "float". "

Again, you are talking about an issue relating to changing an "interface" not an "implementation". Information hiding and encapsulation are designed for "implementation hiding". Interfaces are by definition public contracts. Any time you change an interface you can expect rework on the client side regardless of the programming methodology you use (setters/getters, public fields, anything else you can think of).

I'm not sure whether your post has migrated toward a discussion of the best way to implement a gas tank class or if you still have issues with getters/setters. Anyway, remember that just because getters/setters can't fix the "interface change" problem (because nothing can), that doesn't mean that you should ignore the many other benefits they provide when it comes to REAL information and implementation hiding.

public void setTankStatus(int status)
{
    this.tank = status;
   
    // Here is the real benefit and implementation hiding.
    this.updateGasGauge();
}
squidward
Thursday, June 16, 2005
 
 
If you think of your objects as aggregating some data, there aren't a whole lot of options here.  There's a very clear (and difficult) tradeoff between crippling your callers and crippling your future self.

Instead, think of your objects as providing some behaviors.  Construct the interface to your objects by thinking in terms of verbs, not nouns:  fillTank(), emptyTank(), etc.  What is the automobile object responsible for?  Internally, you'll need some data representation in order to fulfill these requests, but there's no need to expose that data representation to your callers.

This is a different way to think about your objects - it's less natural for me, but it leads to far more powerful and nicely-abstracted objects.  Wherever possible, use methods that actually do something, rather than just return some data.

(This comes from an article I read awhile ago, I think on artima.com, but I don't remember the specifics.  They had far better advice and examples than I can think of here - it'd probably be worthwhile for someone to find it...)

Thursday, June 16, 2005
 
 
This question springs from the idea that all information elements should be hidden, and only revealed by specific 'getters/setters' calls.

It is possible to have an 'Interface' (no, not the keyword) defined which contains both certain data elements, as well as certain methods.  The problem you then run into is that the object 'state' (which is what you want to read, I assume) can only be updated when you call one of its methods -- and you're back to a 'getter/setter' mentality.

So, if the 'state' does NOT need to be updated every time it is read, having a public data element which holds that 'state' which is then readable by class users is not necessarily a bad thing.  Yes, they then have to know about an internal detail of the class, which is now 'published' and will break other code if changed.  But sometimes that's not a problem. 

And as you point out, if you assign 'getters/setters' for the state variable, then class users HAVE to know about the getters/setters.  You do get a little more control in that case, as you don't have to have a 'setter' -- you can reserve changing the 'state' data element for other methods in the class.
AllanL5
Thursday, June 16, 2005
 
 
Oh, and if you DO have public data elements, then class users have to know to NOT write to them.  To some designers, this 'loosey-goosey' approach is way too unsafe, and justifies implementing a 'getter' only.
AllanL5
Thursday, June 16, 2005
 
 
To Gerald,

What I'm trying to do is reduce the amount of tight coupling I see happening in systems. The extended Tank2 interface idea won't work in Java, due to the single-return-type problem:

B.java:4: getA() in B clashes with getA() in A; attempting to use incompatible return type
found  : float
required: int
  public float getA();

Also, Tank2 feels like it has "code smell". An outsider looking at the code will immediately wonder why there are two tank interfaces, rather than an interface that decoupled the dependency from the start. Eventually the original Tank would become deprecated, and then Tank2 would replace Tank, but that's a lot of work that could have been avoided in the first place if you simply did not let other objects in the system know the tank's current state from the initial development.

I tend to dislike the number 2 at the end of a class because, to me, it implies something was done wrong the first time. :-)

To Squidward,

I believe we have some definition differences. Originally I wrote the definition of Information Hiding as preventing the information that encapsulation abstracts from being exposed.

What I've read from your comment is that the word "information" is a synonym for "implementation". Does this mean "information hiding" and "implementation hiding" are the same thing?

I see them differently. To me, encapsulation looks closer to "implementation hiding" (whereby you don't let classes know how the data is manipulated), than information hiding (whereby you don't let the classes see the data at all).

Perhaps someone could be kind enough to offer some alternate definitions for "information hiding", "encapsulation", and "implementation hiding" that can be illustrated with pseudo-code to clearly demonstrate their differences?
Dave Jarvis
Thursday, June 16, 2005
 
 
Dave, I'm not sure that the definitions matter. It's all semantics to me. In a couple of points you seemed to be saying that using getters/setters doesn't solve the problem that occurs when you change an int to a float. I agree. But they never claimed to solve that problem since it is unsolvable. That is not a strike against using getters/setters.

Beyond that, your other points are very valid but are typical for those that argue against getters/setters in favor of public fields. All in all, I think that the potential for what you call information hiding is only available via getters/setters. Using public fields gives you absolutely no ability to hide information at all. Of course, whether or not you use good design principles when implementing your getters/setters is an entirely different matter.
squidward
Thursday, June 16, 2005
 
 
Hi, Squidward.

As a side issue, how can we communicate if we haven't even agreed on the meaning of the words we are using, semantically? If I'm talking about apples and I say they are bright, tangy, soft citrus fruits, and you keep telling me how much you just love the crunch to an apple, we'll never understand each other. ;-)

The point I'm trying to make is to question the logistics of accessors:

Accessors, by their very nature, break the principle of "Information Hiding" (as per my original definition). Since Object-Orientation provides tools for Encapsulation and Information Hiding, why are accessor methods so prevalent (in certain languages)?

There was a point made that they are only used in Java (C++, etc.) because those languages lack viable constructs to avoid them. While I agree that certainly there are better provisions for data exchange not present in those languages, I disagree that they are unavoidable (in Java).

>> "But they never claimed to solve that problem since it is unsolvable."

Although this isn't my point, I would like to see the proof.

>> "Beyond that, your other points are very valid but are typical for those that argue against getters/setters in favor of public fields."

Oh, I would never, ever, argue in favour of public fields. Quite the contrary, especially because that breaks the principle of Information Hiding even more directly than accessor methods! In the last seven years I've never written a single line of code with public or protected-level variables.

This is what I mean about understanding each other's semantics. :-) I need to know what you mean by "encapsulation" and "information hiding".

"All in all, I think that the potential for what you call information hiding is only available via getters/setters. Using public fields"

Again, I feel we need some common understanding here of what is meant by "information hiding". When I use the term I mean to imply the complete and total autonomy of objects such that type-based information can not be tightly coupled; boolean verbs (isRunning, isStopped, etc.) and observer patterns notwithstanding.
Dave Jarvis
Thursday, June 16, 2005
 
 
Eiffel.

Sorry to break your concentration. :)

C# is as bad as Java really, with its alternate getters and setters.

Eiffel on the other hand says you can look at a variable or a function with 0 parameters in the same way.  Decide the type, and you can change from a variable to a function without breaking others' code if the way you return that information changes; otherwise you don't have to write "getters" as such.

Then there is modification of an object.  With most objects you won't just set a value; you will ask the object to change in some way.  Further, even if the value being set is a plain value to be stored, the class should be doing something with it, and if it does something with it, then there are likely to be limitations, so now you have a function with pre- and postconditions.  One might claim a 'function call' is onerous for setting a value, but what is the difference between typing (x) and :=x?  Not much.  You could claim it performs poorly, but that is a matter for the compiler's optimiser rather than something you choose up front, only for the compiler to do it anyway and perhaps tell you that you're wrong.


I guess the other point really is that if you start by deciding how to implement it, and build an interface to match the data, you are likely to break things as the data changes and the interface doesn't support what you need to do.
Whereas if you start by designing the interface to support how you wish to investigate & manipulate it, you can then build whatever you need under that interface.
Gorf
Friday, June 17, 2005
 
 
Dave,

If you do adequate design upfront you should minimize the need to create new interfaces. As to whether or not Tank2 has code smell, if you prefer you can give it a more descriptive name such as TankCapacity and give it methods that return the tank capacity in float, double, BigDecimal, etc. Personally though I have no issue with versioning interfaces in this way, Microsoft and others have been doing it for ages. There's been some interesting discussion on numbered interfaces on Cedric Beust's blog here:

http://beust.com/weblog/

As to incompatible methods, well just give them different names as follows:

int getTankCapacityAsInt();
float getTankCapacityAsFloat();
double getTankCapacityAsDouble();
etc..

I actually do agree with you that if you can use action verb methods like emptyTank() and fillTank() to modify state without showing that state then that is the way to go. However, at its heart software is about manipulating information and thus getters are unavoidable. If the system needs one, I'd invest my time in ensuring it is using the right return type from the get-go rather than coming up with some overly complex system to handle it changing.
Gerald
Friday, June 17, 2005
 
 
If you want to decrease the coupling between the Tank class and its clients (and thereby the Vehicle class and its clients), here are a couple of options:

1. Specify your objects in terms of behavior, rather than as a collection of properties. An anonymous respondent already mentioned this, but to expand on it a bit:
class GasTank {
    public:
        bool IsNearlyFull();
        bool IsNearlyEmpty();
        void AddGas(double litersToAdd);
        void RemoveGas(double litersToBurn);
    private:
        int tankCapacityInMilliLiters;
        int currentLevelInMilliLiters;
};
So, a client of the tank can find out whether it's nearly empty and turn on the "get gas soon" light, find out whether it's nearly full and stop pumping gas at the gas station, and burn up some gas.

Because there's no way to directly access the internal representation of the tank level and capacity, you can re-implement them in whatever fashion, using whatever units, you like, without disturbing the clients.

2. If you really do need to pass a value around, you can lower the coupling by returning an abstract value type. This is the Smalltalk (and Java) solution, and requires that either you or your programming environment provide a hierarchy of classes that includes a "Number" class that your other numeric types inherit from.

3. As an alternative to making your own "Integer" and "Float" classes as in the previous suggestion, you could create a FuelQuantity class, which you'd use everywhere that you do calculations with amounts of fuel. Provided that you supply conversions to a numeric type, you can do anything you like with the internal representation.

4. From a practical standpoint, you can minimize the need to rework your original interfaces by choosing the types carefully. Choosing the "widest" natural type for the interface allows you to be as flexible as possible.

For example, if the property you're providing is a counting number, then make the getter return a long integer. Internally, you might "know" that you only need 8 bits to store all the possible counts, but the clients don't need to know that, and they won't need to change later if you change the internal representation.
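
Option 3 might look something like the following in Java. This is only a sketch: the FuelQuantity name comes from the suggestion above, while the method names, the litre-based factory, and the millilitre storage are assumptions.

```java
// Hypothetical value type for amounts of fuel.
final class FuelQuantity implements Comparable<FuelQuantity> {
    private final long millilitres; // internal unit; free to change later

    private FuelQuantity(long millilitres) {
        this.millilitres = millilitres;
    }

    static FuelQuantity ofLitres(double litres) {
        return new FuelQuantity(Math.round(litres * 1000.0));
    }

    FuelQuantity plus(FuelQuantity other) {
        return new FuelQuantity(millilitres + other.millilitres);
    }

    // Conversion to a plain numeric type, for callers that need a number.
    double toLitres() {
        return millilitres / 1000.0;
    }

    public int compareTo(FuelQuantity other) {
        return Long.compare(millilitres, other.millilitres);
    }
}
```

Clients add and compare quantities without ever learning that the value is stored as a long.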

-Mark
Mark Bessey
Friday, June 17, 2005
 
 
Hi,

"Microsoft and others have been doing it for ages."

That's an appeal to authority. Besides, I've seen what Microsoft does with software, and it frightens me. Or need I mention the development schedule for Longhorn? (Or was that Longhorn2?) ;-)

"As to incompatible methods, well just give them different names as follows:"

How does this address exposing information? How does this adhere to the principle of "Information Hiding"? And how does it avoid the inaccuracy of rounding errors that callers will experience if they use "getTankCapacityAsInt()"?

I certainly couldn't rely on that API to ascertain the core temperature for a nuclear reactor. I'd prefer to see something like:

class NuclearCore {
  public void addTemperatureObserver( Observer observer );
  public void setMinimumTemperatureNotification( int temperature );
}

This same mechanism can be used to avoid the get accessor methods in the Tank class. It's a bit more work, yet completely avoids: 1) exposing information; 2) having to come up with awkward (?) naming conventions; 3) the possibility of returning inaccurate data; 4) tight-coupling of objects.
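
For anyone who wants to see the mechanism spelled out, here is a minimal sketch. The TemperatureObserver callback and the record() method are hypothetical additions (the snippet above only names a generic Observer), and the at-or-above threshold rule is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface.
interface TemperatureObserver {
    void temperatureReached(int threshold);
}

class NuclearCore {
    private final List<TemperatureObserver> observers = new ArrayList<TemperatureObserver>();
    private int threshold = Integer.MAX_VALUE;
    private double temperature; // internal representation, never returned

    public void addTemperatureObserver(TemperatureObserver observer) {
        observers.add(observer);
    }

    public void setMinimumTemperatureNotification(int temperature) {
        this.threshold = temperature;
    }

    // Assumed entry point for sensor readings; not part of the snippet above.
    void record(double reading) {
        temperature = reading;
        if (temperature >= threshold) {
            for (TemperatureObserver observer : observers) {
                observer.temperatureReached(threshold);
            }
        }
    }
}
```

Observers only ever receive the threshold they asked about, so the core can keep its reading as a double, a BigDecimal, or anything else.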

Dave
Dave Jarvis
Friday, June 17, 2005
 
 
Dave,

You're making an argument based on the assumption that no one needs to get the actual value of the level of gas in the tank. If that is the case then I agree, don't bother having a getter. However my initial reply is based on the assumption that access to the amount of gas in the tank is required. This was the context of my reply and I was also specifically answering your post about what if you need to change the return type.

Thus if you do need to get the actual level of gas in the tank, your observer pattern is just pushing the problem out one level further but not actually solving the issue you originally posited, what if we need to change the return value? You haven't shown how your proposal solves this, what you have shown IMHO is added complexity that in no way solves the problem that was posed and that I was replying to, a possible change in return type.

Now if you're saying you should use the observer pattern instead of a getter then that is absolutely wrong in my opinion. The question of whether to provide an observer or not is entirely dependent on whether you think other systems need to be notified as the gas level changes, not whether or not they might need to retrieve the level.

In this case, I would say both a getter and an observer is required. I could see a GasGauge type class needing to use an observer to monitor the tank. I also could see the need for a getter for a Driver class that wants to check the gas level at a specific time to see if a stop at an upcoming gas station is required. (And yes, the driver could check the GasGauge class instead, this simply moves the argument about variance in return types and getters to a different class)

Gerald
Gerald
Friday, June 17, 2005
 
 
I don't see why get/set methods are being considered so evil, at least in the context of information hiding.  Encapsulating access to a field in get/set methods *is* information hiding, because the object is in control of the internal representation of the data.  Even if the internal and external representations are the same, it doesn't make the information any less "hidden."  The internal can change at any time, which is rather the point.

As far as OO is concerned, the principle is about ownership and access control, not secrecy.  The point is not that nobody should see the data; it's that the owner (i.e. the enclosing object) decides who can see the data and in what format.  Maintainability of public interfaces is a separate design issue, having more to do with communication protocols than access.
Matt Brown
Friday, June 17, 2005
 
 
Hi,

"I don't see why get/set methods are being considered so evil".

Neither do I. What I don't see is why they are used so often in ways that introduce tight-coupling thereby prematurely limiting the system.

"The internal can change at any time, which is rather the point."

How is this true in single-return-type languages like Java when the accuracy of the interface definition no longer corresponds to the underlying implementation? A good resolution to this problem was posted previously: avoid returning primitive types from accessors.

If "int Vehicle.getTank()" changes to "float Vehicle.getTank()", the compile (in Java) will break, and possibly introduce logic errors in other objects. This is a classic ripple effect. An alternate suggestion was to introduce extra methods, or interfaces, which reeks of implementation duplication and maintenance hassles.

The other example I like is the "hair colour" problem:

class Person {
  Hair hair;

  public Hair getHair() {
    return this.hair;
  }
}

I can now write:

Hair hair = bob.getHair();
hair.setColour( Colour.PINK );

Bob's hair colour is now pink, without Bob knowing, or even being asked. Bob will be very unhappy when he looks in the mirror. (Or very pleased, but that's a different issue.)

Unfortunately, this can lead to an insane proliferation of delegation methods, which are likely incorrect for the object's domain. Certainly there must be a better way?

When it comes right down to it, I guess I'm distraught at seeing things like the following:

String name = object.getName();
String subclass = object.getSubclass();

if( name.equals( SOME_NAME ) && subclass.equals( SOME_SUBCLASS ) ) {
  // Code goes here.
}

To reduce duplication, the above can be rewritten:

if( object.isClassification( SOME_NAME, SOME_SUBCLASS ) ) {
}

Or, in an ideal world, to eliminate duplication, encapsulate the behaviour, and completely hide the internal constructs:

if( object.isNameSubclass() ) {
}

The latter code is far more extensible than the former example, yet in the code I review, "get" accessors often split the code that does the work away from the object that owns the data.
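
A minimal sketch of where isClassification() could live, assuming a hypothetical Specimen class that holds the two strings:

```java
// Hypothetical domain object; the fields mirror the getters shown earlier.
class Specimen {
    private final String name;
    private final String subclass;

    Specimen(String name, String subclass) {
        this.name = name;
        this.subclass = subclass;
    }

    // The comparison lives with the data, so callers never see the strings.
    boolean isClassification(String expectedName, String expectedSubclass) {
        return name.equals(expectedName) && subclass.equals(expectedSubclass);
    }
}
```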
Dave Jarvis
Friday, June 17, 2005
 
 
It still sounds to me like the code you review suffers more from a poorly-defined public interface (which implies poorly-defined project scope) than any sort of principle violation.

Why would something which once returned an int be changed to return a float after a bunch of components have already been coded to use it?  The public interfaces should be the most stable part of the project.  It's not a problem of returning primitive types vs. Objects, it's a problem of not deciding up front what the proper thing to return is.  (Note that I'm not saying accidents can't happen.  I'm saying they shouldn't happen as much as other types of accidents.)

I agree with the rest of what you said.  An object shouldn't return a reference to an internal object when said object shouldn't be modified externally. It's better to return a copy or immutable type.  My point is: the existence of a get method isn't really a problem if the underlying implementation is sound.  And of course, an unsound implementation can be changed to a sound one with zero impact to other classes.  Assuming the interface doesn't change.  ;)
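A minimal sketch of the "return a copy" option, applied to the hair example. The Colour enum, the copy constructor, and the default colour are assumptions made so the sketch compiles on its own.

```java
// Hypothetical stand-in for the Colour type used in the hair example.
enum Colour { BROWN, PINK }

class Hair {
  private Colour colour;

  Hair( Colour colour ) {
    this.colour = colour;
  }

  // Copy constructor, used for the defensive copy below.
  Hair( Hair other ) {
    this.colour = other.colour;
  }

  Colour getColour() {
    return this.colour;
  }

  void setColour( Colour colour ) {
    this.colour = colour;
  }
}

class Person {
  private final Hair hair = new Hair( Colour.BROWN );

  // The signature is unchanged, but callers now receive a copy,
  // so changes made to it never reach this.hair.
  public Hair getHair() {
    return new Hair( this.hair );
  }
}
```

With this in place, bob.getHair().setColour( Colour.PINK ) changes only the copy, and no caller has to be recompiled.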
Matt Brown
Friday, June 17, 2005
 
 
For context, you might want to read Allen Holub's article "Why getter and setter methods are evil":
http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html

While I don't agree with that claim, I try to limit the number of getters and setters to improve encapsulation. My code still ends up with many of them, since that's often the simplest approach I can devise. However, since I have a good refactoring browser (Eclipse) and don't publish my interfaces, I can alter my method signatures whenever it's convenient.
Julian
Monday, June 20, 2005
 
 
You might want to do a quick google on "The Law of Demeter" for some discussion of when to use get-set methods.

For example, the hair color example is an instance where the Law of Demeter has been violated, and therefore causes future problems.

From "The Pragmatic Programmer", page 141:

The Law of Demeter for functions states that any method of an object should call only methods belonging to:

1) Itself
2) any parameters that were passed in to the method
3) any objects it created
4) any directly held component objects

In the hair example, the law was violated because the caller went around Phil (or whatever the name was) and set the color property of the hair directly.

If this was important, the Person class should have had a "setHairColor" method that delegated.
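Roughly, the delegating version might look like the sketch below. The Colour enum and the hasHairColor() query are assumptions added so the example is self-contained; only setHairColor comes from the suggestion above.

```java
// Hypothetical stand-in for the Colour type used in the hair example.
enum Colour { BROWN, PINK }

class Hair {
  private Colour colour = Colour.BROWN;

  void setColour( Colour colour ) {
    this.colour = colour;
  }

  Colour getColour() {
    return this.colour;
  }
}

class Person {
  // A directly held component object (rule 4), so Person may call it.
  private final Hair hair = new Hair();

  // Callers ask the Person, who delegates to the Hair it owns.
  public void setHairColor( Colour colour ) {
    this.hair.setColour( colour );
  }

  // Hypothetical query so the result is visible without exposing Hair.
  public boolean hasHairColor( Colour colour ) {
    return this.hair.getColour() == colour;
  }
}
```

Person now has a place to refuse or veto the change, which the getHair() version never offered.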
Chris Tavares
Monday, June 20, 2005
 
 
> The reason I say "nearly eliminated" is because
> the Iterator interface is still bound to the
> internal representation of the widths

Iterators are only good when they are iterating over a copy of the data (or copy on write). Otherwise multithreading behavior isn't knowable.

> So my question: why are "get" accessor methods springing up like bad weeds

Because you don't have behaviors, you have data. That's fine. Lots of stuff is mostly data.

> private final static int EMPTY = 0;
> private final static int FULL = 1;

Make that a separate enum.

> private int tank = EMPTY;

Use the enum instead. That gets rid of int related type errors. If a value is seen outside a class then make it a class.
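Something like the following sketch, where TankStatus and needsFuel() are made-up names and "extends Vehicle" is left out so it compiles on its own:

```java
// Hypothetical enum replacing the EMPTY and FULL int constants.
enum TankStatus { EMPTY, FULL }

// "extends Vehicle" is omitted so the sketch is self-contained.
class Automobile {
  private TankStatus tank = TankStatus.EMPTY;

  public void fillUp() {
    this.tank = TankStatus.FULL;
  }

  // Hypothetical query; callers never see the field or its type.
  public boolean needsFuel() {
    return this.tank == TankStatus.EMPTY;
  }
}
```

A stray int such as 2 can no longer be assigned to the tank, because the compiler only accepts TankStatus values.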

> when more often than not they introduce tight
> coupling, transparent-box classes, and
> inextensible source code?

It can, but it doesn't have to. I would argue that getters reduce coupling because they allow you to move policy into another class. You wouldn't want your automobile to contain every possible object that can do something with your tank status, would you? The issue is when other objects update automobile after reading automobile state. That's not always avoidable, but should be minimized.
son of parnas
Wednesday, June 22, 2005
 
 
I think the principle of hiding implementation and providing access with getters/setters is OK. The problem is that, like so many development principles, it gets abused by the minions who don't understand the principle but slavishly follow some aspect of it.

For example, in a lot of VB code especially, I see classes where every member variable is private and EVERY variable also has a property get and property set. I've seen whole apps where every class does this. And not one of those getters and setters actually does anything other than get or set the encapsulated value.

You could argue that in this case, you're gaining zilch yet adding in a load of tedious typing and that you may as well make the member variables public. But the real problem is that there is an underlying poor design. Nobody has bothered (or is able...) to think about exactly what a class should be hiding and revealing to client classes. So they just effectively make the whole lot public.

Instead of:
R = CreateRect(x0, y0, x1, y1)

they would do:
R = CreateRect();
R.X0 = x0
R.Y0 = y0
R.X1 = x1
R.Y1 = y1

Because "you never know when you'll need access to a class's internals". Call me radical, but I think you really SHOULD know exactly what access you need to each of a class's internals.

Having said that, I am beginning to suspect that information hiding is overrated. Of the dozens of terrible problems I've encountered on projects, I can't think of any that was caused by revealing what should have been hidden. I can't think of one problem where the solution would have been to hide a variable. I've had many occasions where a variable has had to change its type, but in those cases, we've still had to change every call to that property's getter/setter anyway, so there is no gain, just pain.
Mantissa
Wednesday, June 22, 2005
 
 
> I can't think of one problem where the solution would have been to hide a variable.

I don't know if the problems are of the broken arm variety.

One problem you'll have is code duplication because multiple attribute consumers code the same logic instead of placing it in the class.

You'll find it difficult to change the implementation of code because someone is using a stupid attribute you want to get rid of when they shouldn't have been using it in the first place. I have had this happen many times.

> I think you really SHOULD know exactly what access you need to each of a classes internals.

I like C++'s use of friends so you can explicitly grant access to internals.
son of parnas
Wednesday, June 22, 2005
 
 
"someone is using a stupid attribute you want to get rid of when they shouldn't have been using it in the first place"

Yes, this does happen, but this also happens with the 'empty' getters that I see so many of, too. By empty I mean those that do absolutely nothing other than get/set a private variable.

To persevere with my example, instead of having a Rect.CalculateArea(), I see a sequence of Gets of class attributes, followed by the calculation. And this sequence will be repeated several times in calling code, which is clearly wrong-headed.

If you didn't have such a low granularity of Gets, then somebody would just HAVE to write the CalculateArea() method in its correct class.
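To make that concrete, here is a sketch in Java rather than VB. The int coordinates and the corner-order handling are assumptions; the point is only that there are no coordinate getters for callers to misuse.

```java
class Rect {
  private final int x0, y0, x1, y1;

  // All four coordinates are supplied at construction, as in CreateRect(x0, y0, x1, y1).
  Rect( int x0, int y0, int x1, int y1 ) {
    this.x0 = x0;
    this.y0 = y0;
    this.x1 = x1;
    this.y1 = y1;
  }

  // The calculation lives in the class that owns the data, so calling
  // code has nothing to fetch and nothing to duplicate.
  int calculateArea() {
    return Math.abs( this.x1 - this.x0 ) * Math.abs( this.y1 - this.y0 );
  }
}
```

Calling code shrinks to new Rect( 0, 0, 4, 3 ).calculateArea().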

I think this is what the OP is getting at, that we have a principle of information hiding, but when a certain style is adopted, nothing is actually getting hidden, so what's the point?

It's straying into another oft-discussed topic, but I'll stress that I see the 'information hiding that isn't hiding anything' mainly in VB code. In C++, I've seen it much less often.

What really scares me is the hidden class variable called "UniqueID", that has a public Set function.
Mantissa
Thursday, June 23, 2005
 
 
"You wouldn't want your automobile to contain every possible object that can do something with your tank status, do you?"

No. However, I see a bigger issue: someone wants to perform a task with my data. And I agree that you don't want all possible tasks to be part of the object that contains the data.

The Observer pattern can be used so that other objects in the system are notified of changes to data. Now objects receive data they seek and, at that time, use it to perform their task.

In other words, *why* get the gas tank level? A real-time display makes for the perfect observer (event-driven, no inefficient polling). Database storage can use serialization, saving the entire containment object. Printing can use an XML/XSL/FOP architecture whereby objects render themselves as XML documents, then are pretty-printed via XSL/FOP (and in turn sent to a printer as PostScript, or a browser as HTML).

"I think this is what the OP is getting at, that we have a principle of information hiding, but when a certain style is adopted, nothing is actually getting hidden, so what's the point?"

Yes, that is more to my point.
Dave Jarvis
Thursday, June 23, 2005
 
 
As mentioned previously, though, the observer pattern doesn't fit every situation. Sometimes a getter, which lets something ask for the level when it needs it, is ideal and most efficient. With an observer-only approach, you're forcing everyone who might have an interest in the gas tank level to track the state itself.

As an example, when you're driving a car, what would you prefer to do: poll the gas tank by briefly looking at the gas gauge, or have the info pushed to you by a voice saying "Gas now half full" and having to remember what the last state was?
Gerald
Friday, June 24, 2005
 
 
"when driving a car, what would you prefer to do: poll the gas tank by briefly looking at the gas guage or have the info pushed to you by a voice ...?"

I feel that the analogy breaks here because there is no software-controlled mechanism that sits between a human's eyes and the physical machine. In other words, a human polling a gas level has absolutely nothing to do with how the software hides its internal member variables. Not only that, but it doesn't affect the efficiency of the software. ;-) Kind of like a free operation, if you will. I can look at the gas level for as long as I like and the software will be blissfully unaware that I am eyeballing its output.

How the gas level gets displayed, however, is perfectly within the realm of software development (provided there isn't a pure mechanical connection between the floating object sitting in the tank and the needle on the dashboard). And this fits quite nicely into the observable model, IMHO.

"With an observer only approach, your forcing everyone who might have an interest in the gas tank level to track the state itself."

Is this not a false dichotomy? I suggested a few ways of manipulating the gas tank level without resorting to an observer model that keeps the information hidden from the system. These are: serialization and XML/XSD/XSL/FOP.

Also, I do not see how it follows that all other objects in the system have to track the state of the tank. If an object wants to know the instant the gas tank is empty, it can become an observer of the tank. With polling, there is an inherent delay; too much polling makes for a sluggish system (since there is no time for the CPU to do anything other than poll).

Even in this hypothetical situation, if you only had the ability to observe the gas tank level through an observable interface, you could change the API without a ripple effect. Adding the ability for a new observer to know when the gas tank is empty can be implemented without having to change the existing observer classes.

Furthermore, there's nothing which says you cannot add another call to the GasTank class which asks it to notify its observers of the level. So if you wanted to know it "right now", you make a request for it to tell its observers "right now".
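A minimal sketch of what I have in mind. TankObserver, Gauge, fill( int ) and burn( int ) are hypothetical names invented for the example, not an existing API, and the level is assumed to be an int.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical observer interface; only the pattern is given, not an API.
interface TankObserver {
  void levelChanged( int level );
}

class GasTank {
  private final List<TankObserver> observers = new ArrayList<>();
  private int level;  // hidden: there is no getLevel()

  public void addObserver( TankObserver observer ) {
    this.observers.add( observer );
  }

  // Changes are pushed to observers as they happen.
  public void fill( int amount ) {
    this.level += amount;
    notifyObservers();
  }

  public void burn( int amount ) {
    this.level = Math.max( 0, this.level - amount );
    notifyObservers();
  }

  // "Right now": re-send the current level to all observers on request.
  public void notifyObservers() {
    for( TankObserver observer : this.observers ) {
      observer.levelChanged( this.level );
    }
  }
}

// Example observer: a dashboard gauge that repaints on every update.
class Gauge implements TankObserver {
  public void levelChanged( int level ) {
    System.out.println( "Gauge shows " + level );
  }
}

class ObserverDemo {
  public static void main( String[] args ) {
    GasTank tank = new GasTank();
    tank.addObserver( new Gauge() );
    tank.fill( 40 );
    tank.burn( 15 );
    tank.notifyObservers();
  }
}
```

A new kind of observer, such as a low-fuel light, can be added without touching Gauge or any other existing observer.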

Anyway, it's likely we'll have to agree to disagree. Although I'm open to the possibility of needing an accessor to get the gas tank's level, I have yet to see a case where it cannot be gracefully circumvented (so as to maintain information hiding).
Dave Jarvis
Friday, June 24, 2005
 
 
One last observation:

"I also could see the need for a getter for a Driver class that wants to check the gas level at a specific time to see if a stop at an upcoming gas station is required."

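public class GasTank {
  private static final int FULL = 100;  // assumed capacity, for illustration
  private int level;                    // assumed field; never exposed
  private int refillAt;                 // level at or below which a refill is due

  public void setRefillAt( int level ) {
    this.refillAt = level;
  }

  public boolean needsRefill() {
    return this.level <= this.refillAt;
  }

  public void fill() {
    this.level = FULL;
  }
}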

Now not only can the driver know when the gas tank needs to be refilled (and consequently stop for gas), but other classes in the system can reuse this API, rather than duplicating code. If your problem domain demands that you need to know when the gas tank needs to be refilled, to me it seems perfectly acceptable to add that behaviour to the gas tank itself. This promotes reusability and maintains information hiding.

Alternately, you could subclass:

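public class RefillGasTank extends GasTank {
  private int refillAt;  // assumed field: the driver's "comfort level"

  public void setRefillAt( int level ) {
    this.refillAt = level;
  }

  public boolean needsRefill() {
    // Assumes GasTank offers subclasses a protected isAtOrBelow( int ) check.
    return isAtOrBelow( this.refillAt );
  }
}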

This then adheres to the Open-Closed Principle, as well.

Now when I want to check the gas tank to see if it needs a refill, I can. Furthermore, I can change my "comfort level" using the API, still without knowing how the gas level variable is actually stored.

Also, I can still apply the observable pattern in this situation to notify a class when a refill is needed. For example, both the Driver and the LowGasIndicatorLight should be notified when the tank needs refilling. (This feature would have saved me some embarrassment a few years back, when I forgot to "poll" the gas level ...)

Reduce, reuse, recycle! ;-)
Dave Jarvis
Friday, June 24, 2005
 
 
Maybe the use case here is too artificial, but sometimes data is data and you might as well model it that way.

For example, take your needs-refill case. When a refill is needed varies widely, from driving around town (refill when the tank is bone dry) to driving through Northern Ontario (refill when the tank drops below half; there's not much civilization up there). Are you going to encapsulate every behavioral requirement in that one class? That would be silly IMHO. Instead, provide the data if there is a need for it and let other classes model the behavior and usage of that data if they need it. Otherwise you end up with this huge base class that has to cover all the bases because it stubbornly refuses to expose any data whatsoever.
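A rough sketch of what I mean. Everything here is hypothetical (the RefillPolicy interface, the two policy classes, getFractionFull(), and the thresholds); it only shows the tank providing data while other classes own the behavior.

```java
class GasTank {
  private final int capacity;
  private int level;

  GasTank( int capacity, int level ) {
    this.capacity = capacity;
    this.level = level;
  }

  // Plain data: the fraction of the tank that is full, from 0.0 to 1.0.
  public double getFractionFull() {
    return (double) this.level / this.capacity;
  }
}

// Hypothetical policy interface; each driving situation supplies its own rule.
interface RefillPolicy {
  boolean shouldRefill( GasTank tank );
}

// Around town: wait until the tank is bone dry.
class CityDriving implements RefillPolicy {
  public boolean shouldRefill( GasTank tank ) {
    return tank.getFractionFull() <= 0.0;
  }
}

// Remote roads: refill as soon as the tank drops below half.
class RemoteDriving implements RefillPolicy {
  public boolean shouldRefill( GasTank tank ) {
    return tank.getFractionFull() < 0.5;
  }
}
```

New situations become new policy classes, and GasTank stays small.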

BTW, I don't think we necessarily disagree by a lot; you're just a little more towards Allen Holub's position of getters being evil than I am.
Gerald
Friday, June 24, 2005
 
 

This topic is archived. No further replies will be accepted.
