Saturday, February 19, 2005

Thinking: Runtime Geography of a Java VM

Alex Miller read about my notion of runtime geography and posted some excellent analysis. As I've thought about it more, I agree that the notion of runtime geography is more intuitive and feels more general than the notion of an ontology, so let's run with it and see where we end up.

If we think about locating things on a map, there are numerous types of things, and their locations can be specified in a variety of ways. For example, we have towns and cities, which have a name (not always unique) and occupy a particular area centered on a point that we can specify with a longitude and latitude. We can also talk about things in a relative manner ("Chicago is about 350 miles north-northeast of St. Louis"). We sometimes want to find things based upon nearness ("Where's the nearest pizza place that delivers?"). We can also talk about roads, rivers, rights-of-way, etc., all of which follow a route (usually specified by a width and a series of points describing its path).

So, if we're going to have a runtime geography for the objects in the VM, we have to solve several problems and answer a number of questions:

  • What are the various dimensions used to create the space? Anything that might be a dimension must be something for which we can directly assign a value to an object, or one for which we can derive a value for an object based upon some affiliated/associated object.

  • How do we make the geography available to Java code? Once we've come up with some definition of the runtime location of objects, we need to be able to assign a location to an object and we then need to be able to specify a location or region to locate one or more objects.

  • How do we avoid conflicts with other ways of organizing objects in the VM? There are at least two ways to define a 'location' for an object at runtime today: the ClassLoader that loaded its class and the package it is defined in. We might be tempted to include the Thread(Group) that is executing this object's code, but this has the problem that (a) it is transient and (b) it can result in multiple locations for the same object. We should leverage the existing mechanisms if possible, or at least not clash with them.

  • How does our solution fit into a distributed system? There are many situations where multiple Java VMs communicate with each other. Should the runtime geography be a purely internal thing which is never exposed outside a VM, or should we expose it and use it across VMs?
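As a rough illustration of the two 'locations' the VM already records, here is a minimal sketch that reads an object's package and ClassLoader through reflection. The class name `ExistingLocations` and the `describe` helper are made up for this example, and `getPackageName()` assumes Java 9 or later.

```java
import java.util.StringTokenizer;

// Hypothetical helper: reports the two "locations" the JVM already tracks.
final class ExistingLocations {

    /** Describes an object by the package and class loader of its class. */
    static String describe(Object o) {
        Class<?> c = o.getClass();
        String pkg = c.getPackageName();          // Java 9+
        ClassLoader loader = c.getClassLoader();  // null means the bootstrap loader
        String loaderName = (loader == null) ? "bootstrap" : loader.getClass().getName();
        return pkg + " @ " + loaderName;
    }

    public static void main(String[] args) {
        // Both tokenizers report the same location, no matter who created them.
        System.out.println(describe(new StringTokenizer("a b")));
        System.out.println(describe(new StringTokenizer("c d")));
    }
}
```

Both values hang off the class rather than the instance, so two objects of the same class always land in the same place. That is the limitation a runtime geography would have to get past.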

To make the whole discussion concrete, I'm going to focus on two particular uses of a runtime geography that are fairly different from each other. While I hope a well-conceived runtime geography will be applied to many problems, these two seem like a good start:

  • Dependency injection at object construction time - finding objects required by the constructed object based upon where in the program the object is created

  • Location-dependent logging - specifying different logging levels for the same kind of object used in different locations in the program. For example, assume StringTokenizer produced logging messages for your favorite logging API. If a JDBC driver and some user-written piece of code in the same program both use a StringTokenizer, and you want to crank up the logging messages for StringTokenizers 'inside' the JDBC driver but not 'inside' the rest of the program, how can we use runtime geography to do this? What if we want to crank up all logging inside the JDBC driver?
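To make the logging case a little more tangible, here is one possible sketch, not a worked-out design. It assumes a location can be written as a slash-separated region path such as "app/jdbc", and that a lookup walks outward from the most specific region to the enclosing ones. The class `RegionLogLevels`, its methods, and the region names are all invented for illustration; only `java.util.logging.Level` is a real API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;

// Hypothetical table of logging levels keyed by (region path, class name).
final class RegionLogLevels {
    private static final String ANY_CLASS = "*";

    private final Map<String, Level> levels = new HashMap<>();
    private final Level defaultLevel;

    RegionLogLevels(Level defaultLevel) {
        this.defaultLevel = defaultLevel;
    }

    /** Sets the level for one class (or "*" for every class) inside a region. */
    void setLevel(String region, String className, Level level) {
        levels.put(key(region, className), level);
    }

    /** Finds the level from the nearest enclosing region that has a setting. */
    Level levelFor(String region, String className) {
        String r = region;
        while (true) {
            Level level = levels.get(key(r, className));
            if (level == null) {
                level = levels.get(key(r, ANY_CLASS));
            }
            if (level != null) {
                return level;
            }
            int slash = r.lastIndexOf('/');
            if (slash < 0) {
                return defaultLevel;   // ran out of enclosing regions
            }
            r = r.substring(0, slash); // step outward to the parent region
        }
    }

    private static String key(String region, String className) {
        return region + "#" + className;
    }
}
```

With something like this, `setLevel("app/jdbc", "java.util.StringTokenizer", Level.FINEST)` would turn up only the tokenizers inside the driver, while `setLevel("app/jdbc", "*", Level.FINE)` would turn up everything there. How an object learns which region it is in is exactly the open question.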

In addition, we should meet some design goals:

  • Make things as simple to use as possible.

  • Make the description and use of geography easy to understand - in particular, draw upon the analogy of real-world geography as much as possible.

  • Put information in as few places as possible - avoid multiple configuration files that all depend upon each other and which depend upon magic strings or magic numbers. Of course, this can compromise the utility of things if we overly restrict the facility, so we need to 'right-size' it.

  • Geography locations should be defined at compile-time, just as in real-world geography. This is essentially saying that objects won't change their location during runtime. This might conflict with use in a distributed manner, but seems like a good simplifying assumption, at least for now.

  • It should be possible to specify the 'query' used to navigate the geography both at compile-time and at runtime.

  • Don't conflict with other ways of defining object location.

This feels like enough to start mashing ideas together. Now it's time to do the serious thinking... :-)

    Thursday, February 17, 2005

    Thinking: Dependency Injection and Runtime Geography (2)

    A friend of mine responded and asked how I envisioned solving the problem of runtime geography. I don't have an answer yet. To me, identifying the 'true' problem is valuable because now I can search for solutions to the problem and know what I'm looking for, instead of just having a vague sense that something is wrong.

    I don't think annotations are the right thing to do, because they are also compile-time entities. That might be somewhat helpful compared to the Java package/name hierarchy, but probably not by much. Instead, I think we need a way to categorize objects and group them at runtime. ThreadGroup is an example of doing this for threads, and that might be useful in some manner here (perhaps using annotations to instantiate thread(group)s which are marked differently). But that is focusing on the geography of the execution threads. I think we also need the means to map the geography of the objects and their state and references.

    Annotations in 1.5 don't let you assign annotation values at runtime. That's essentially what we need. We need the means to annotate information on objects independently of their class at runtime.
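    One crude way to attach information to individual objects, independently of their class, is an external side table keyed on object identity. The sketch below only illustrates that idea and is not a proposed solution. `RuntimeTags`, `tag`, and `locationOf` are invented names, and the table holds strong references, so tagged objects would never be garbage collected unless they were removed.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical side table: per-object "location" tags, independent of class.
final class RuntimeTags {
    // Keyed on identity (==), not equals(), so each instance gets its own tag.
    // Caveat: strong references here keep every tagged object alive.
    private static final Map<Object, String> TAGS =
            Collections.synchronizedMap(new IdentityHashMap<>());

    static void tag(Object o, String location) {
        TAGS.put(o, location);
    }

    static String locationOf(Object o) {
        return TAGS.getOrDefault(o, "unknown");
    }

    public static void main(String[] args) {
        StringTokenizer inDriver = new StringTokenizer("a b");
        StringTokenizer inUserCode = new StringTokenizer("a b");
        tag(inDriver, "jdbc-driver");
        tag(inUserCode, "user-code");
        System.out.println(locationOf(inDriver));   // jdbc-driver
        System.out.println(locationOf(inUserCode)); // user-code
    }
}
```

    This covers only the "annotate at runtime" half. Someone still has to call `tag` at the right moment, and that is the part the language gives us no help with.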

    I haven't found a means to do what I want yet. I'm looking for it. It may not be possible in Java as it exists today (though I'm going to keep looking in hopes of finding a means to do this). I am not a fan of Aspect-oriented programming, but AOP might provide a means to augment normal objects with this runtime capability.

    Of course, there are two parts to the problem: (1) how to annotate objects in a manner that creates a runtime geography and (2) how to utilize that runtime geography to solve a particular problem, like dependency resolution.

    Thinking: Dependency Injection and Runtime Geography

    I recently attended a St. Louis Java User's Group meeting where Alex Miller discussed the various forms of dependency injection. It was a great talk, and it got me thinking.

    I believe the problems we have with dependency resolution (which dependency injection attempts to solve) are not the real problem; they are the symptom. The real problem is that we have no ontology* for a running Java program except for the static package/class hierarchy. There is no practical mechanism for identifying objects within a running JVM, such that we could then describe how those objects should find the things they depend upon (for an example of a failed attempt to do this, check out the JavaBeans Activation Framework).

    When I instantiate a "Widget" object, there is no notion of geography in the JVM. As a result, there is no way for me to say that the Widget instantiated "over here" should use the "Cog" object "over here", while the Widget instantiated "over there" should use the Cog object "over there". The problem is that there is no notion of "over here" or "over there" within the JVM. The general problem of no geography leads to specific problems like dependency resolution, when you want to reuse the Widget class and need to resolve the dependencies the Widget has when instantiated. It's particularly difficult when you might have more than one Widget in the same running JVM and need them to use different dependent objects.
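    To show what "over here" and "over there" might look like if the JVM did have such a notion, here is a purely hypothetical sketch. It invents a `Region` class with a parent link and a `nearest` lookup that walks outward, and the Widget is told its region explicitly. That explicit argument is the very thing a real runtime geography would have to supply on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical region: a named place that can hold objects and has a parent.
final class Region {
    private final String name;
    private final Region parent;
    private final Map<Class<?>, Object> residents = new HashMap<>();

    Region(String name, Region parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Places an instance in this region, registered under the given type. */
    <T> void place(Class<T> type, T instance) {
        residents.put(type, instance);
    }

    /** Returns the closest instance of the type, searching outward. */
    <T> T nearest(Class<T> type) {
        for (Region r = this; r != null; r = r.parent) {
            Object found = r.residents.get(type);
            if (found != null) {
                return type.cast(found);
            }
        }
        throw new IllegalStateException(
                "No " + type.getSimpleName() + " reachable from " + name);
    }
}

final class Cog {
    final String label;

    Cog(String label) {
        this.label = label;
    }
}

final class Widget {
    final Cog cog;

    // The Widget resolves its dependency from where it is created.
    Widget(Region where) {
        this.cog = where.nearest(Cog.class);
    }
}
```

    Given a root region with two children "here" and "there", each holding its own Cog, `new Widget(here)` and `new Widget(there)` would pick up different Cogs from the same Widget class.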

    Essentially, we need a good ontology for objects in a JVM, and the one we have, the fully qualified class name, is completely inadequate since it is a compile-time characteristic.

    We don't need dependency injection solutions. We need ontologies to describe the objects in a running JVM. This will make solutions to dependency injection trivial. It will also facilitate lots of other useful things, because it will give us a runtime geography that allows us to describe (and therefore group) objects based upon their 'location' in the JVM at runtime.

    Now all we have to do is figure out the right runtime geography/geographies to support.

    *The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities. --The Free On-line Dictionary of Computing, © 1993-2004 Denis Howe