July 2008
Introduction
It may seem strange to start a series of articles about REST systems and the
RESTful design approach with such a low-level issue as binding. After all,
REST is a "high level" idea that originated from a post hoc analysis
of why the World Wide Web works so well [1]. However, if you'll allow us to
ground the discussion at this level, we think you'll see that binding is an
essential aspect of RESTful systems.
What is binding? Wikipedia [2] defines it as "the association of values with
identifiers" and continues "Binding is intimately connected with scoping, as
scope determines when binding occurs". In other words, binding is the association
of a logical "name" with something physical in code, such as the memory location
for a variable or the entry point of a method.
As developers, we are so accustomed to the idea of binding that we probably
don't think about it very much. We write code such as
for (i = 0; i < SIZE; i++) {...}
without thinking about the physical memory location used for the for-loop's
increment value. Who cares if it's at address B65A435 or CD55D43D, as long
as the programming language abstraction is preserved and we can always refer
to that location by the name "i"?
In this article, we will briefly explore some of the ways in which bindings
are done and the impact that the choice of binding method has on system flexibility
and performance.
Binding in Java
One of the main functions of a Java Virtual Machine is to find and load only
those classes which are actually used by a running program. The JVM does this
at runtime in a process known as dynamic
linking. When you compile Java source files, the Java compiler
creates class files that contain symbolic references to each other and to the
class files of the Java runtime libraries. During the process of dynamic linking,
the JVM must resolve the symbolic references in the class files and replace
them with direct references to the data structures in memory which represent
the loaded classes.
While there is some flexibility in when it can occur, every JVM classloader
uses a Resolution Phase, during which
it determines the mapping between each symbolic reference (a logical address)
and a direct reference (a physical address). After resolution, the symbolic
reference has been replaced with a direct reference, creating a direct and
permanent link between the code that uses the reference and a memory location.
It is important to note that in Java the resolution of symbolic references
occurs only once. Subsequent references to symbols are not re-resolved;
they simply use the established direct reference.
For example, in the following code, the instantiation of a new instance of
the Date class may cause the classloader to locate the class and load
it into memory (if this is the first use of the Date class), resolve the
symbolic reference to the Date class to the appropriate location in memory,
run the class's static initialization, and create an instance of the class
somewhere in the heap before its constructor is called:
Date date = new Date();
Subsequent references to date, however, do not initiate another resolution
process by the classloader:
long time = date.getTime();
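The one-time nature of this process can be observed from ordinary Java code. The following is a minimal sketch rather than a description of JVM internals: the class names LazyService and OneTimeInitDemo are hypothetical, and a static initializer is used as a visible stand-in for the load, link, and initialize work that happens on first use.

```java
// Hypothetical class used only to make first-use initialization visible.
class LazyService {
    // Counts how many times the static initializer has run.
    static int initCount = 0;

    static {
        // Runs once, when the JVM initializes the class on its first active use.
        initCount++;
    }

    long now() {
        return System.currentTimeMillis();
    }
}

class OneTimeInitDemo {
    // Creates n instances and reports how often the class was initialized.
    static int createInstances(int n) {
        for (int i = 0; i < n; i++) {
            LazyService service = new LazyService(); // only the first use triggers initialization
            service.now();                           // later uses go through the established link
        }
        return LazyService.initCount;
    }

    public static void main(String[] args) {
        // prints "initializations: 1"
        System.out.println("initializations: " + createInstances(3));
    }
}
```

However many instances are created, the initializer runs a single time, which mirrors the way a resolved reference is reused rather than re-resolved.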
This is all fine until one starts to consider issues of system malleability.
One way to think about this code is from the perspective of a client and a
service. The program being written assumes the role of the client and the instance
of the Date class is the service. This may not seem like a problem
until you ask a question such as "How do I replace the Date service with an
updated implementation while the system is running?". If you need to do this
extrinsically (for example, you are managing a system and you need to replace
a portion of the system with updated code) you have a problem. With a traditional
Java application this is difficult since, as in our example above, the class
has been loaded into memory and all symbolic references have been replaced
with direct references. Adding this flexibility is possible, but comes at the cost of adding
complexity (such as class-loading and management code) to the program.
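To give a sense of the extra machinery involved, here is a minimal sketch of one common workaround: routing every call through a holder so that the implementation can be swapped while the program runs. The names TimeService, ServiceHolder, and swap are hypothetical, and the sketch leaves out the class-loading code a real system would need in order to bring new classes into the running JVM.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical service interface; clients depend on this, not on a concrete class.
interface TimeService {
    long currentTime();
}

// Holder that adds a level of indirection between clients and the implementation.
class ServiceHolder {
    private final AtomicReference<TimeService> current;

    ServiceHolder(TimeService initial) {
        current = new AtomicReference<>(initial);
    }

    // Clients call through the holder, so each call looks up the current implementation.
    long currentTime() {
        return current.get().currentTime();
    }

    // Management code can replace the implementation while clients keep running.
    void swap(TimeService replacement) {
        current.set(replacement);
    }
}

class SwapDemo {
    public static void main(String[] args) {
        ServiceHolder holder = new ServiceHolder(() -> 1L);
        System.out.println(holder.currentTime()); // value from the first implementation
        holder.swap(() -> 2L);
        System.out.println(holder.currentTime()); // value from the replacement
    }
}
```

Even in this small form, the program has to be written around the holder from the start, which is the kind of added complexity described above.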
Binding in the World Wide Web
For contrast, let's look at binding in an incredibly flexible system, the
largest and most robust information system yet created: the World Wide Web.
The Web has succeeded in part because of a fundamental economic property: one
can add value at a low "marginal cost" [3]. Whole web sites are added every day,
existing web sites change their content, and web sites disappear, all without
disrupting the Web as a whole and at a very low cost per unit. This is a fundamentally
different computing proposition from the one found in Java applications, where a
change in a class can result in extensive work (propagating API changes across
the code base, integration testing, deployment, restarting the application,
etc.).
For example, if the Web worked the way Java does, one could add new web sites
at any time (just as Java programs cause the loading of new classes on demand).
Any changes or updates to a web site, however, would require the Web to be
restarted. Yes, this is a slight exaggeration - but hopefully it raises the
question "Why can't software applications have the same desirable characteristics
as the Web?".
From another perspective, the World Wide Web can be thought of as continuously
responding to requirements which are constantly changing. Like the Web, most
modern software systems also need to respond to frequently changing requirements.
So, what is needed to make software applications that are as flexible as the
Web in this respect?
Web-like Binding in Applications
Since the Web has interesting economic properties and great flexibility, one
might wonder if we can incorporate these same properties within our applications
by making our applications more "web-like". Is it possible to use web-like
binding inside of application software? If so, what technology can do this?
Before we explore these questions, let's review how the Web works. A client
program (in most cases, a browser) is given a logical address by its user and
the browser makes a request for a representation of the resource located at
that address [4]. When the resource representation is returned to the browser,
the browser "interprets" the representation and (usually) creates a visual
display for the user. For example, entering the logical address "http://www.theserverside.com" causes
the browser to request the resource representation at that address. The representation
returned is an XHTML document with embedded JavaScript and links to other resources,
which the browser must also request and then assemble into a final visual presentation.
From the user's perspective, they have entered an address that is available
within a global context. The user does not know that a temporary binding occurred
between their browser and the web server. In fact, the user does not know,
and need not care, where the resource representation physically comes from.
Technologically, what occurs is the following:
- The browser submits a query to the Domain Name System (DNS) for
"www.theserverside.com" and gets back the IP address of a server registered
as the endpoint for the domain address.
- The browser uses this information to make a TCP/IP connection to the IP
address at port 80.
- The browser uses the connection to send an HTTP request to "GET" the
resource with the address "/" relative to "www.theserverside.com".
- The server endpoint receives the HTTP request and does whatever processing
it deems necessary to obtain or create and then return a resource
representation for the given logical address ("/").
- The browser receives the representation, processes it and creates a visual
display for the user.
- The browser drops the TCP/IP connection and "un-binds" from the endpoint [5].
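The steps above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: it speaks plain HTTP on port 80, ignores redirects and response parsing, needs Java 9 or later for readAllBytes, and the class and method names (WebRequestSketch, buildGetRequest, fetch) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class WebRequestSketch {
    // Builds a minimal HTTP/1.1 GET request for a path relative to a host.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"   // ask the server to close, so the binding is transient
             + "\r\n";
    }

    // Resolves the name, connects, sends the request, reads the reply, and un-binds.
    static String fetch(String host, String path) throws IOException {
        InetAddress address = InetAddress.getByName(host);      // DNS lookup
        try (Socket socket = new Socket(address, 80)) {          // TCP/IP connection to port 80
            OutputStream out = socket.getOutputStream();
            out.write(buildGetRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            byte[] response = in.readAllBytes();                 // headers plus representation
            return new String(response, StandardCharsets.ISO_8859_1);
        } // leaving the try block closes the socket and drops the connection
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "www.theserverside.com";
        System.out.println(fetch(host, "/"));
    }
}
```

Nothing in the client survives the request: the address lookup, the connection, and the endpoint they led to are all discarded once the representation has been read.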
Getting back to our question, how can we incorporate the desirable properties
of this proven architecture within a software application? When examining the
architecture of the Web, we note several key ideas:
- Client code is separate from server endpoints.
- An address resolution mechanism locates endpoints
for a given logical resource address.
- Endpoints return resource representations.
- The binding between clients and endpoints is
transient; only valid for the duration of the request.
- Endpoints have knowledge of the address they are processing.
One way to implement these characteristics in application software is to introduce
the idea of an Intermediary. Client
code can then send requests to the Intermediary which can resolve the logical
address to an endpoint, send the request to the endpoint, accept the returned
resource representation and pass that back to the client code. At the end of
this cycle, the binding between the client and the endpoint can be dropped.
We also need logical addresses that clients can use to request resources.
Uniform Resource Identifiers (URIs) [6] work well for the Web and they can
serve us too. The URI specification supports the creation of new schemes. Because
existing schemes, such as http:, ftp:, etc., all have associated
protocols and are used outside of and between applications, we should define
our own scheme. Let's pick a somewhat arbitrary scheme, such as "resource:".
With this URI scheme, if a client wants a representation of all customers,
it can create and send a request to the Intermediary for
resource:/customers/
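As a quick check that such an address is a well-formed URI, the standard java.net.URI class can split it into its scheme and path. The helper class ResourceUriDemo below is hypothetical and exists only for illustration; the "resource:" scheme has no built-in meaning to Java.

```java
import java.net.URI;

class ResourceUriDemo {
    // Returns the scheme part of a logical address.
    static String schemeOf(String address) {
        return URI.create(address).getScheme();
    }

    // Returns the path part of a logical address.
    static String pathOf(String address) {
        return URI.create(address).getPath();
    }

    public static void main(String[] args) {
        String address = "resource:/customers/";
        System.out.println(schemeOf(address)); // prints "resource"
        System.out.println(pathOf(address));   // prints "/customers/"
    }
}
```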
It is then the responsibility of the Intermediary to:
- Resolve the logical address to an endpoint.
- Send the request to the endpoint.
- Accept the representation returned by the service
endpoint and send it back to the requesting client.
- Drop any connection it establishes between the client and the endpoint.
If the clients and services were written in Java, what might they look like?
The client will need access to the Intermediary so, for now, let's say it is
available by reference in a variable named "context". The client code might
then look like this:
Request req;
Representation rep;
req = context.createRequest("resource:/customers/");
rep = context.issueRequest(req);
and the service endpoint code might look like this:
public void processRequest(Context context) throws Exception
{
    Request request = context.getRequest();
    ...
    Representation rep = // some code to create the representation
    context.setResponse(rep);
}
Interesting, but does this really work like the Web? It seems to:
- Clients make a logical request for a resource
and don't know the location of the service endpoint that will process their
request.
- The binding between the client and the resolved
endpoint exists only for the duration of the request processing.
- Service endpoints return a resource representation
to the client.
- Additional services can be added at any time, for example an endpoint
for the address "resource:/accounts/...".
- Service endpoints can be updated at any time, as long as the Intermediary
can suspend the forwarding of requests to a service endpoint while the
update is handled.
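To make these properties concrete, here is a deliberately small sketch of what an Intermediary might do internally. It is not the Context and Request API used in the listings above: the names Endpoint, SimpleIntermediary, and register are hypothetical, addresses are matched by simple prefix, and representations are plain Objects.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical endpoint contract: given a logical address, return a representation.
interface Endpoint {
    Object process(String address);
}

// Hypothetical Intermediary: resolves a logical address to an endpoint on every request.
class SimpleIntermediary {
    private final Map<String, Endpoint> endpoints = new ConcurrentHashMap<>();

    // Registers a new endpoint, or replaces an existing one, at any time.
    void register(String prefix, Endpoint endpoint) {
        endpoints.put(prefix, endpoint);
    }

    // Resolves the address, forwards the request, and returns the representation.
    // No reference to the endpoint is handed to the caller or kept afterwards.
    Object issueRequest(String address) {
        for (Map.Entry<String, Endpoint> entry : endpoints.entrySet()) {
            if (address.startsWith(entry.getKey())) {
                return entry.getValue().process(address);
            }
        }
        throw new IllegalArgumentException("No endpoint for " + address);
    }
}

class IntermediaryDemo {
    public static void main(String[] args) {
        SimpleIntermediary intermediary = new SimpleIntermediary();
        intermediary.register("resource:/customers/", address -> "customers v1");
        System.out.println(intermediary.issueRequest("resource:/customers/"));

        // Updating the service is just a new registration; the client code is unchanged.
        intermediary.register("resource:/customers/", address -> "customers v2");
        System.out.println(intermediary.issueRequest("resource:/customers/"));
    }
}
```

Because resolution happens on each request, the second call reaches the updated endpoint without the client being rebuilt or restarted.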
This is a good start, but Web requests also include a "verb" that specifies
the type of operation which is being requested of the service endpoint. The
HTTP protocol, for instance, defines a well-known set of "methods" (aka verbs)
such as GET, POST, PUT, and so on. For our system,
let's introduce a slightly modified set of verbs: SOURCE, SINK, NEW, EXISTS,
and DELETE.
We should also think more deeply about the nature of the resource representations
that will be returned. Most browsers are designed to accept and process a number
of different physical representation types (HTML, XHTML, PNG, GIF, CSS, JavaScript,
etc.) but presuming that application software clients can process this variety
of types may not make sense. To overcome the problems of strong type binding,
we will allow our system much more freedom in the handling of resource representations.
The key idea is that the client and the service endpoint will not need to
agree on the type of the returned representation. If the client does not specify
a return type, then the endpoint is free to return any type it chooses. If
the client does specify a return type but the endpoint is not able to return
that type, it will still return some representation. In this case, the Intermediary
is free to determine if it can transform the representation from the type provided
to the type requested (for example, transforming an XML fragment to a JSON
representation). Finally, we allow the result of a request to be a failure
return if the requested type is not available or cannot be created from the
returned representation.
Let's see how these additional factors play out in the client and service
endpoint code:
Request req;
Representation rep;
req = context.createRequest("resource:/customers/");
req.setVerb(Request.SOURCE);
req.setType(List.class);
rep = context.issueRequest(req);
and the service endpoint code might look like this:
public void processRequest(Context context) throws Exception
{
    Request request = context.getRequest();
    ...
    List customers = new LinkedList();
    // add customers to the list
    context.setResponse(customers);
}
While this may look like traditional linking between a consumer and a provider
in a Java application, the ability of the Intermediary to transform resource
representations enables a much looser notion of binding between the client
and server. For example, if the client code in the last example were the following:
req = context.createRequest("resource:/customers/");
req.setVerb(Request.SOURCE);
req.setType(DOM.class);
rep = context.issueRequest(req);
The List type returned by the service would not match the requested DOM XML
format. The Intermediary in this case would search for a service that could
transform the List representation into a DOM representation and request this
transformation automatically and transparently to both the client and the service.
This transformation of a representation or "transrepresentation" gives our
system increased decoupling between clients and services and increases the
ease with which applications can be composed.
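One way to picture transrepresentation is as a registry of transformations that the Intermediary consults when the returned type does not match the requested type. The sketch below is an assumption-laden illustration, not the mechanism of any particular product: the class TransrepresentationSketch and its methods are hypothetical, and a String of XML text stands in for a DOM representation to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class TransrepresentationSketch {
    // One registered transformation from a source type to a target type.
    private static final class Transformer {
        final Class<?> from;
        final Class<?> to;
        final Function<Object, Object> fn;

        Transformer(Class<?> from, Class<?> to, Function<Object, Object> fn) {
            this.from = from;
            this.to = to;
            this.fn = fn;
        }
    }

    private final List<Transformer> transformers = new ArrayList<>();

    // Registers a transformation service for a pair of representation types.
    <A, B> void register(Class<A> from, Class<B> to, Function<A, B> fn) {
        transformers.add(new Transformer(from, to, value -> fn.apply(from.cast(value))));
    }

    // Returns the representation as the requested type, transforming it if needed.
    <T> T deliver(Object representation, Class<T> requested) {
        if (requested.isInstance(representation)) {
            return requested.cast(representation);            // types already match
        }
        for (Transformer t : transformers) {
            if (t.from.isInstance(representation) && requested.isAssignableFrom(t.to)) {
                return requested.cast(t.fn.apply(representation));
            }
        }
        // The failure return: the requested type cannot be produced.
        throw new IllegalStateException("Cannot produce " + requested.getName());
    }

    // Example transformation: a list of customers rendered as XML text.
    static String toXml(List<?> customers) {
        StringBuilder xml = new StringBuilder("<customers>");
        for (Object customer : customers) {
            xml.append("<customer>").append(customer).append("</customer>");
        }
        return xml.append("</customers>").toString();
    }
}
```

With a List-to-String transformation registered, a client that requests String.class receives XML text while the service continues to return its List, and neither side is aware of the conversion.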
Note, also, that the Intermediary provides complete physical isolation between
clients and services: the client never has access to the physical location
of the service, an isolation that is not present in a typical, physically-bound
Java application. This isolation, achieved through logical binding, is one
of the ways that we can provide Web-like characteristics in application software.
Future Directions
Even though our Intermediary provides great power to our architecture, it has,
so far, been fairly simple. As you may have guessed, the Intermediary itself
can be enhanced to provide additional capabilities, such as controlling threads
of execution, transforming representation types, loading and managing Java
classes, supporting other programming languages, and managing the specification
and mapping of logical address namespaces in which addresses are resolved.
In future articles, we will discuss some of these enhancements and the additional
power that they provide.
We have started to show how one can build a system for developing application
software that captures the economic benefits of the World Wide Web. We began
at a low level - with binding. We discussed an approach that uses URI logical
addresses and an Intermediary that resolves the addresses to service endpoints.
We then explored prototypical Java code for clients and services in such an
approach. Finally, we discussed the issue of matching the representation types
returned by a service to those requested by clients.
As interest in REST and RESTful design increases, we will see creative technologies
aimed at supporting this approach. Some will be RESTful only at the edges and
will miss the full benefits found by taking REST to the core of new applications.
In subsequent articles we will explore an approach to pushing REST all the
way to the core of new applications and how the sought-after economic benefits
are, in fact, realized.
References
[1] Roy Fielding, "Architectural Styles and the Design of Network-based Software Architectures", PhD dissertation, University of California, Irvine, 2000, http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
[2] Wikipedia, "Name binding", http://en.wikipedia.org/wiki/Name_binding
[3] In economics the term "marginal cost" refers to the cost of adding a unit
of production, which in the case of the Web is a new page on a web site or
even a whole new web site.
[4] The browser will get an immutable copy of the
information located at the requested address.
[5] Optionally the browser may
keep the connection open to efficiently request other resources from the same
endpoint. This is just a performance enhancement and does not change the logical
model. (It is as if the browser "caches" the
resolution to the endpoint for a short period of time.)
[6] RFC 2396, "Uniform Resource Identifiers (URI): Generic Syntax", http://www.ietf.org/rfc/rfc2396.txt
About the Authors
Tom Hicks is a software consultant at Tohono Consulting in Tucson, Arizona.
He specializes in enterprise architectures and technologies which make the "plumbing" of
the Internet more capable and easier to use. Tom holds graduate degrees in
Computer Science and Cognitive Science and is interested in the confluence
of these fields with Linguistics.
Randy Kahle is a director of 1060 Research, Ltd., a company dedicated to researching
and realizing the benefits of RESTful systems. Previously he worked for Hewlett-Packard,
Microsoft, MageLang Institute and Variantia. He holds a BA from Rice University
in CS and EE and an MBA from the Tuck School at Dartmouth.