Java Development News:
1060 NetKernel - A new Abstraction for Web-systems
By Peter Rodgers
01 Mar 2004 | TheServerSide.com
This paper presents a discussion of the 1060 NetKernel system. 1060 NetKernel is an open-source URI Request Scheduling Microkernel which provides the foundation for the 1060 NetKernel Standard Edition XML Application Server.
NetKernel started life as the Dexter research project at Hewlett-Packard Labs. In order to understand our motivation for developing NetKernel, and to appreciate why it is relevant to the next generation of Web-systems, we need to establish some context.
Whatever your position on Web-Services - from the cynical: “WS-* is an incomprehensible and unproven stack; the product of a software industry desperate to create new revenue during the deepest industry recession ever”, to the pragmatic: “Prove it and I’ll think about using it”, the act of placing the two words “Web” and “Service” next to each other bears some examination.
So, setting aside the pre-conceived specifics of SOAP-services, what is really motivating Web-Services?
Some believe that the fundamental objective is to take the proven robust properties of Web-systems and apply them to general information systems. Why? Because the Web is the first and only distributed system which has scaled and generally works despite huge mismatches in both the capabilities and generations of clients and servers.
This observation is certainly one of the motivations behind NetKernel, but it has also motivated many other systems. What is different about NetKernel? Part of the answer is in our assertion that it is an abstraction of current Web-systems, not another framework. For most of these, the end product is a Web page viewed in a browser by a human. NetKernel views XHTML as just another dialect of XML. Even this feature doesn’t really mark NetKernel out from the crowd, but the abstraction goes further: to NetKernel the presentational Web-site is just one application in a continuum of XML-based services and, indeed, the HTTP application protocol is just one of many transport protocols.
Getting back to first principles then and to avoid the preconceptions associated with the term “Web-Service”, I suggest the following simple definition:
An XML-service is a URI addressable interface which consumes and/or generates XML.
That’s it – no transport or application protocols, just XML-in and XML-out of a URI-addressable interface – this is the core idea behind the NetKernel abstraction. It encompasses services using SOAP, but more importantly it allows that the presentational (XHTML) Web-site is, in fact, the original Web-service application – it just happens to be a distributed publishing application.
Why is XML important? After all, XML is just a simple syntax for expressing data in a tree-structure. It’s not predominantly because of the XML technology set, which has both good and bad technologies. It is important because the Web has proven that adherence to a universal mark-up language is a good system property. XML is important in the same way that 50Hz (or 60Hz!) alternating current is important.
It is XML’s communicability that makes it powerful – simply put, data expressed in XML can be communicated more broadly. I can hear you say, “Yes, but XML is tedious and a pain to code.” We’ll come back to this later.
Why are URIs important? Again, not as a technology, but because they are a proven common syntax for a universal address space. A service with a URI is addressable in the Internet. No other distributed system has a proven addressing mechanism that has scaled, is popular and is universally adhered to – URIs (URLs) are so well known that it no longer seems strange to find them on your cornflakes packet.
So the basis for the adoption of XML-Services is that XML is more communicable, URIs are universally understood, and the presentational Web has proven to be stable and maintainable across an enormously heterogeneous set of participants. In essence the Web has positive technical and economic properties which deserve to be re-used for more general information systems.
Let’s get to some real technology after which we can return to the examination of some of the less tangible Web-system properties, such as loose typing and loose coupling.
The NetKernel Abstraction
NetKernel Standard Edition is an XML Application Server based on the NetKernel microkernel. In order to understand the implementation of the XML Application Server we need to describe the NetKernel abstraction in more detail. We will then discuss how the NetKernel infrastructure realizes some of the desirable properties of a Web-system and how this helps in writing XML processes and services.
Bear in mind that this discussion is all about the internal architecture and abstract model of NetKernel, and not a description of how NetKernel replicates a current Web-system.
NetKernel is a microkernel for managing URI address spaces and scheduling URI requests. It was created around a set of simple concepts.
- Software components are addressed by URI.
- A component's URI is published in an internal URI Address Space.
- URI Address Spaces are encapsulated in modules which can be layered into Virtual Private Address Spaces.
- Software components may issue URI requests against the Virtual Private Address Space.
A NetKernel URI address space is an internal abstraction that has no relationship with HTTP, or the outside world in general (that is, until we get to the section on Transports) - think of NetKernel's URI address space management as like the virtual memory management in a modern operating system.
In many ways NetKernel’s software components are a generalization of Servlets.
A Servlet is bound to a URL address by the Servlet container. It receives HttpRequests on the HttpServlet interface that correspond with the HTTP verbs (POST, GET, PUT etc). The Servlet generates a resource by interacting with EJBs, or other Java-components, in the Java-domain (procedural Java classes).
NetKernel’s analogous components are called Accessors.
A URI address space may be bound to an Accessor class. A request for a URI which matches the URI address space will cause the Accessor class to be instantiated and a method on the IURAccessor interface to be called. Like Servlets, Accessors can generate a resource internally by interacting with Java components. However, in order to fulfil a request, an Accessor may issue further URI requests to the NetKernel system.
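The relationship between an address space and its Accessors can be sketched in a few lines of Java. This is only an illustration of the idea, not the NetKernel API: `SketchAccessor`, `AddressSpaceSketch`, the prefix matching and the `data:` and `app:` URIs are all assumptions made for the example. It shows an Accessor being instantiated per request, and one Accessor fulfilling its request by issuing a further URI request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for an Accessor; not the real IURAccessor interface. */
interface SketchAccessor {
    String source(String uri, AddressSpaceSketch space);
}

/** Hypothetical address space: binds URI prefixes to Accessor factories. */
class AddressSpaceSketch {
    private final List<String> prefixes = new ArrayList<>();
    private final List<Supplier<SketchAccessor>> factories = new ArrayList<>();

    void bind(String uriPrefix, Supplier<SketchAccessor> factory) {
        prefixes.add(uriPrefix);
        factories.add(factory);
    }

    /** Finds the first matching binding, instantiates its Accessor and invokes it. */
    String request(String uri) {
        for (int i = 0; i < prefixes.size(); i++) {
            if (uri.startsWith(prefixes.get(i))) {
                SketchAccessor accessor = factories.get(i).get(); // new instance per request
                return accessor.source(uri, this);
            }
        }
        throw new IllegalArgumentException("No Accessor bound for " + uri);
    }
}

class AddressSpaceDemo {
    static AddressSpaceSketch build() {
        AddressSpaceSketch space = new AddressSpaceSketch();
        // Leaf Accessor: generates its resource internally.
        space.bind("data:name", () -> (uri, s) -> "World");
        // Composite Accessor: fulfils its request by issuing a further URI request.
        space.bind("app:greeting", () -> (uri, s) -> "Hello, " + s.request("data:name"));
        return space;
    }

    public static void main(String[] args) {
        System.out.println(build().request("app:greeting")); // prints "Hello, World"
    }
}
```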
Figure 1: HttpServlet and Accessor approaches to the URI Address Space.
The use of the term “Accessor” is very deliberate. An Accessor accesses a resource, either by generating it internally or by issuing further requests. An Accessor is a client/server node located in the internal NetKernel URI address space. As well as hosting business logic and services, Accessors are the components used to house external client gateways to the outside world, as can be shown with an example.
Example file: request
A resource on the local file system can be accessed with the conventional file: URI scheme. NetKernel provides a File Accessor which services all URI requests in the file: scheme – so, for example, a request to SOURCE file:///C:/path/resource.xml will cause the kernel to find and instantiate the File Accessor class and invoke the asyncRequest() method on it. The File Accessor will locate the file on the host file system and return a byte-stream representation of the resource.
NetKernel uses a generalized set of request types – SOURCE, SINK, DELETE, EXISTS and NEW – which are internally consistent and independent of any application protocol. Those familiar with REST will recognise this as an abstraction of the HTTP REST verbs. Commonly an Accessor will handle SOURCE requests, as shown above. The File Accessor also handles the other request types so it can be used, for example, to test for the existence of a file.
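As a rough illustration of how one component might service several request types for the file: scheme, here is a minimal sketch built on the standard java.nio.file API. `FileAccessorSketch` and its `handle` method are hypothetical names, not NetKernel's File Accessor, and the behaviour chosen for NEW (create an empty file) is an assumption, since the text does not define it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** The five request types named in the text. */
enum RequestType { SOURCE, SINK, DELETE, EXISTS, NEW }

/** Hypothetical file: scheme handler; not NetKernel's File Accessor. */
class FileAccessorSketch {
    /** Returns a byte[] for SOURCE, a Boolean for EXISTS and DELETE, and null otherwise. */
    Object handle(RequestType type, URI uri, byte[] payload) {
        Path path = Paths.get(uri); // accepts file: URIs
        try {
            switch (type) {
                case SOURCE: return Files.readAllBytes(path);
                case SINK:   Files.write(path, payload); return null;
                case DELETE: return Files.deleteIfExists(path);
                case EXISTS: return Files.exists(path);
                case NEW:    Files.createFile(path); return null; // assumed meaning of NEW
                default:     throw new IllegalArgumentException("Unknown type " + type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class FileAccessorDemo {
    /** Writes, reads, tests and deletes a temporary file through the sketch. */
    static String roundTrip() {
        FileAccessorSketch files = new FileAccessorSketch();
        try {
            URI uri = Files.createTempFile("nk-sketch", ".xml").toUri();
            files.handle(RequestType.SINK, uri, "<a/>".getBytes(StandardCharsets.UTF_8));
            byte[] bytes = (byte[]) files.handle(RequestType.SOURCE, uri, null);
            Object before = files.handle(RequestType.EXISTS, uri, null);
            files.handle(RequestType.DELETE, uri, null);
            Object after = files.handle(RequestType.EXISTS, uri, null);
            return new String(bytes, StandardCharsets.UTF_8) + " " + before + " " + after;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```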
The file: example shows how an entire URI scheme can be handled by an Accessor. However this does not demonstrate how NetKernel performs dynamic invocation of URI service interfaces.
To invoke Accessors as URI addressable services, NetKernel implements the active: URI. An Active URI may contain multiple named arguments and, following our basic principles, each argument is a URI.
Here’s an example of an Active URI to invoke an XSLT service which applies an XSL transform, specified by the operator URI, to the XML resource, specified by the operand URI:
active:xslt+operand@file:///C:/path/resource.xml+operator@file:///C:/path/transform.xsl
The '+' is a delimiter between arguments; the '@' is a delimiter between argument names and argument URIs. The example active URI should be parsed as
active: xslt + operand@file:///C:/path/resource.xml + operator@file:///C:/path/transform.xsl
Where xslt is the URI of the service, operand is the name of a URI to a resource that is the subject of an operation, operator is the name of a URI to a resource that is the operation to be applied. An active URI can use any appropriate names for arguments – “operand/operator” is a convention we have adopted for many of the standard services supplied with NetKernel.
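The parsing rule just described can be sketched as follows. This is a simplified illustration rather than the kernel's parser: `ActiveUriSketch` is a hypothetical class, the file paths in the usage note are invented, and the naive split on '+' only handles flat URIs whose arguments are not themselves active URIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for flat active: URIs; not the kernel's own implementation. */
class ActiveUriSketch {
    final String service;
    final Map<String, String> args;

    private ActiveUriSketch(String service, Map<String, String> args) {
        this.service = service;
        this.args = args;
    }

    static ActiveUriSketch parse(String uri) {
        String scheme = "active:";
        if (!uri.startsWith(scheme)) {
            throw new IllegalArgumentException("Not an active URI: " + uri);
        }
        // '+' separates the service name and each argument.
        String[] parts = uri.substring(scheme.length()).split("\\+");
        Map<String, String> args = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            // '@' separates an argument name from its argument URI.
            int at = parts[i].indexOf('@');
            if (at < 0) {
                throw new IllegalArgumentException("Argument without a name: " + parts[i]);
            }
            args.put(parts[i].substring(0, at).trim(), parts[i].substring(at + 1).trim());
        }
        return new ActiveUriSketch(parts[0].trim(), args);
    }
}
```

For instance, parsing the hypothetical URI active:xslt+operand@file:///data/resource.xml+operator@file:///data/transform.xsl yields the service name xslt and two named arguments, operand and operator, in the order they appear.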
In the Java-domain, the kernel implements an active: URI request with a URRequest object.
The active URI is an important construct. Since a named argument is a URI, then it can also be another active URI. This allows a sequence of requests to be composed into a deep active URI call-stack which need only be evaluated when a resource must be interrogated to determine process flow. So the resolution of an active URI can be thought of as lazy evaluation – in the Java world we’re probably better off calling this Just-in-time evaluation.
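One way to picture this just-in-time evaluation is with memoized suppliers, as in the sketch below. It is an analogy only: `LazyResource` and `LazyDemo` are hypothetical names, and the kernel's actual evaluation strategy is not described here. Composing the nested requests does no work; the whole chain is evaluated once, when the outermost resource is first read.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical lazily evaluated resource: computed on first read, then remembered. */
class LazyResource {
    private Supplier<String> compute;
    private String value;

    LazyResource(Supplier<String> compute) {
        this.compute = compute;
    }

    String get() {
        if (compute != null) {
            value = compute.get();
            compute = null; // evaluate at most once
        }
        return value;
    }
}

class LazyDemo {
    /** Counts how many service invocations have actually been evaluated. */
    static final AtomicInteger evaluations = new AtomicInteger();

    /** Builds a deferred service call whose arguments are themselves deferred calls. */
    static LazyResource service(String name, LazyResource... args) {
        return new LazyResource(() -> {
            evaluations.incrementAndGet();
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < args.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(args[i].get()); // arguments are only resolved here
            }
            return sb.append(')').toString();
        });
    }
}
```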
In showing the examples above (with their self-contained address spaces and resource sets) we’ve implied the existence of Modules. A NetKernel Module is like a WAR in that it contains resources, Java classes and libraries. However, unlike a WAR, a Module hosts an encapsulated private URI address space. It may also export a public URI address space – in the examples above, the file: and active:xslt Accessors are contained within modules and are exported on the public interface of their host module.
A module may also import the public address spaces of another module into its internal private address space. In this way a module can be used as a shared library, or equally an application can be partitioned into cleanly separated layers. Importing a module's public address space does not affect the exported public address space of the importing module – the public and private address spaces are decoupled.
Figure 2: An example of module encapsulation.
The Kernel takes care of management and isolation of all modules and their virtual URI address spaces. A request made in the address space of a module will be resolved against the module’s internal private address space.
A request may match the public interface of an imported module and enter that module. In order that a library module need not have pre-knowledge of all address spaces, a module may pass a URI request, which is not matched by its private internal address space, back up the module call-stack. It may then be resolved against the private address space of the calling module. This process may apply recursively.
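The resolution order described above (private space first, then the public interfaces of imported modules, then back up the call-stack) can be sketched as below. This is a toy model under stated assumptions: `ModuleSketch`, `Endpoint`, `Requestor` and the `lib:` and `app:` URIs are hypothetical, matching is by exact URI, and the real kernel's resolution rules are certainly richer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Lets an endpoint issue further requests in the context it was invoked from. */
interface Requestor { String request(String uri); }

/** Hypothetical component bound to a URI inside a module. */
interface Endpoint { String serve(Requestor context); }

class ModuleSketch {
    private final Map<String, Endpoint> privateSpace = new HashMap<>();
    private final Set<String> exported = new HashSet<>();
    private final List<ModuleSketch> imports = new ArrayList<>();

    ModuleSketch bind(String uri, boolean export, Endpoint endpoint) {
        privateSpace.put(uri, endpoint);
        if (export) exported.add(uri);
        return this;
    }

    ModuleSketch importModule(ModuleSketch library) {
        imports.add(library);
        return this;
    }

    String request(String uri) {
        return resolve(uri, new ArrayDeque<ModuleSketch>());
    }

    private String resolve(String uri, Deque<ModuleSketch> callers) {
        // 1. The module's own private address space.
        Endpoint local = privateSpace.get(uri);
        if (local != null) return local.serve(u -> resolve(u, callers));
        // 2. The public interface of an imported module: enter that module.
        for (ModuleSketch library : imports) {
            if (library.exported.contains(uri)) {
                Deque<ModuleSketch> deeper = new ArrayDeque<>(callers);
                deeper.push(this);
                return library.resolve(uri, deeper);
            }
        }
        // 3. Unmatched: pass the request back up the module call-stack.
        if (!callers.isEmpty()) {
            Deque<ModuleSketch> outer = new ArrayDeque<>(callers);
            ModuleSketch caller = outer.pop();
            return caller.resolve(uri, outer);
        }
        throw new IllegalArgumentException("Unresolved URI: " + uri);
    }
}

class ModuleDemo {
    static String run() {
        // The library knows nothing about app:title; it is resolved in the caller.
        ModuleSketch library = new ModuleSketch()
            .bind("lib:render", true, ctx -> "<p>" + ctx.request("app:title") + "</p>");
        ModuleSketch application = new ModuleSketch()
            .bind("app:title", false, ctx -> "Orders")
            .importModule(library);
        return application.request("lib:render");
    }
}
```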
All modules are identified by URI and have a version number. Modules may specify the minimum and maximum versions of a module that they will accept for import. By being tightly encapsulated and versionable the NetKernel abstraction ensures long-term system stability and even enables concurrent execution of different generations of the same application.
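A minimum and maximum version check of this kind might look like the following sketch. The dotted numeric version format and the `VersionSketch` class are assumptions for illustration; the text does not say how NetKernel writes or compares version numbers.

```java
/** Hypothetical version-range check, assuming dotted numeric versions such as 1.2.0. */
class VersionSketch {
    /** Compares two dotted versions numerically, treating missing parts as zero. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) return Integer.compare(xi, yi);
        }
        return 0;
    }

    /** True if candidate lies within the inclusive range [min, max]. */
    static boolean accepts(String min, String max, String candidate) {
        return compare(candidate, min) >= 0 && compare(candidate, max) <= 0;
    }
}
```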
Each module has an associated ClassLoader which provides the same level of isolation for the Java class space as for the private URI address space, including respect for module version numbers.
In general an application on NetKernel will be composed from more than one module, each of which will be responsible for a particular functional unit of the application.
The management of the module address space is a key role of the kernel. The other key role is the URI request scheduler.
The scheduler is responsible for receiving, resolving and issuing requests against URI interfaces. Requests may be synchronous or asynchronous. The scheduler assigns Java threads to execute a request in an Accessor. It minimizes the total thread count necessary to perform complex processes. To minimize context-switches, where possible, it allocates jobs in a process to a common thread. The scheduler also monitors for thread-starvation and deadlock situations and attempts to resolve these either by boosting thread count or interrupting a job, respectively. It is possible to run some classes of concurrently executing processes on NetKernel with just a single Java thread.
Every resource has a URI which can be used transparently as the key for an associated cached resource. Requests for a frequently used URI will be served immediately from cache. Remember – all NetKernel components, including Accessors and the results of invoking Accessors, have a URI and so are intrinsically cacheable.
The default NetKernel cache is a dependency cache: all resources accumulate dependency metadata which binds them to the resources used in their generation. If a resource on which others depend expires or is changed, for example a file is edited, the dependency chain is automatically and transparently invalidated and all cached dependents voided. Resources may also be time dependent and can be given an expected lifetime – this can be very efficient for distributed networked resources.
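The invalidation behaviour can be illustrated with a small sketch. It is a simplification, not NetKernel's cache: `DependencyCacheSketch` is a hypothetical class, values are plain strings, and expiry by lifetime is left out. Each entry records the URIs it was generated from, and invalidating a URI voids everything derived from it, transitively.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical dependency cache keyed by URI. */
class DependencyCacheSketch {
    private final Map<String, String> values = new HashMap<>();
    /** Maps a URI to the URIs of cached resources generated from it. */
    private final Map<String, Set<String>> dependents = new HashMap<>();

    void put(String uri, String value, String... dependsOn) {
        values.put(uri, value);
        for (String source : dependsOn) {
            dependents.computeIfAbsent(source, k -> new HashSet<>()).add(uri);
        }
    }

    /** Returns the cached value, or null if absent or invalidated. */
    String get(String uri) {
        return values.get(uri);
    }

    /** Removes the entry and, recursively, every entry that depends on it. */
    void invalidate(String uri) {
        values.remove(uri);
        Set<String> derived = dependents.remove(uri); // removed first, so cycles terminate
        if (derived != null) {
            for (String d : derived) invalidate(d);
        }
    }
}
```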
The opportunity to create layered applications enabled by the module infrastructure combines very effectively with the dependency cache and can mean that large parts of an application can offer equivalent performance to serving a static resource and yet be dynamically generated.
By default there is a single system-wide cache, though each module may implement its own cache. If caching is not important for an application set, the system functions without any cache.
A system is not much use unless it can receive requests from the real world. The final high-level component in the NetKernel abstraction is the Transport.
Where Accessors are the client-side of NetKernel, Transports are the server-side. Here is a point-by-point description of the operation of a NetKernel Transport.
- A Transport receives an application or application-protocol-specific event
- It maps the event, using its application (or application-protocol) specific context, to an internal NetKernel URI
- It builds a URRequest object holding the URI and any resources received with the event
- It issues the request to the kernel scheduler
- The kernel scheduler executes the request against the URI address space of the module which hosts the transport.
- The kernel returns the result of the request to the transport
- If necessary for the application protocol, the transport interprets the result, and issues a response to the triggering application-specific event.
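The steps above can be sketched for an imaginary line-based protocol. Everything in this example is an assumption made for illustration: the `get`/`put` commands, the `app:` URI prefix, the `SketchRequest` class (a stand-in for the request object, here without attached resources) and the use of a plain function in place of the kernel scheduler.

```java
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical request object holding a verb and an internal URI. */
class SketchRequest {
    final String verb;
    final String uri;

    SketchRequest(String verb, String uri) {
        this.verb = verb;
        this.uri = uri;
    }
}

/** Hypothetical transport for a protocol whose events are lines like "get /orders". */
class LineTransportSketch {
    private final Function<SketchRequest, String> scheduler; // stands in for the kernel

    LineTransportSketch(Function<SketchRequest, String> scheduler) {
        this.scheduler = scheduler;
    }

    /** Handles one protocol event and returns the protocol-level response. */
    String onEvent(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length < 2) return "ERROR malformed event";
        // Map the protocol-specific command to an internal verb.
        String verb;
        switch (parts[0].toLowerCase(Locale.ROOT)) {
            case "get": verb = "SOURCE"; break;
            case "put": verb = "SINK"; break;
            default: return "ERROR unknown command";
        }
        // Map the event to an internal URI and build the request.
        SketchRequest request = new SketchRequest(verb, "app:" + parts[1]);
        try {
            // Issue the request and translate the result into a response.
            return "OK " + scheduler.apply(request);
        } catch (RuntimeException e) {
            return "ERROR " + e.getMessage();
        }
    }
}
```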
All NetKernel processes are initiated by a Transport initiated request – Transports kick-start processes on NetKernel. A module may host any number of transports. Multiple modules or versions of modules can host application specific transports.
HTTP is a very important application protocol and NetKernel can be a full Web-application server. The NetKernel HTTP Transport is conceptually similar to the HttpServlet interface – it handles the HTTP verbs. Unlike an HttpServlet, the HTTP Transport is simply a bridge to the internal URI address space. It maps the HTTP verb to NetKernel's internal verbs and issues an internal URI request. In general the internal URI is the same as the external URL address but this is now issued against the internal address space of the module which hosts the HTTP Transport. Note that HTTP's auxiliary technologies, such as Cookies, file uploads and multi-part encoded forms, all fit into the Transport abstraction.
The Transport abstraction can be applied to any application protocol. Transports can be written to integrate non-Web-like systems into NetKernel – for example we’ve written polling SMTP transports to create email processing systems, an Intray transport, a Telnet transport, SOAP-messaging transports ... we’ve even experimented with GUI client transports (Longhorn’s catching on)!
What can you do with the NetKernel Abstraction?
The NetKernel microkernel can be configured as an Application Server – it could be used simply as a next-generation Servlet container with the advantages of versioned modules, layered application structures and pluggable transports. Its microkernel footprint is small so it can be used for embedded applications. However these deployments do not utilize the real potential of the infrastructure.
XML Technologies as Services
NetKernel’s URI addressing architecture corresponds directly to the definition of XML-service we gave earlier. Taking a service-based approach to processes, we have built an XML Application Server on top of the NetKernel core. The high-level design pattern is to encapsulate standard XML technologies as URI addressable services exposed on the public interfaces of a set of library modules – hence the XSLT example given earlier.
Some advantages of making the standard XML technologies available as services are:
- Inherit the NetKernel URI infrastructure – the standard XML technologies assume that there is an implicit URI infrastructure and resolution mechanism (e.g. xsl:include, XInclude, the XQuery doc() function, etc.). NetKernel presents a managed URI address space and a URI request resolver to all XML technologies – so, for example, another NetKernel XML service can be invoked from inside an XQuery by using its active: URI.
- Exception Handling. Some standard XML technologies may have poor/patchy error handling. As a service on NetKernel they acquire a full exception handling infrastructure.
- Concurrency / Thread Safety - XML libraries have patchy thread safety. The NetKernel scheduler guarantees thread safe access to an Accessor – Accessors must declare that they are thread-safe before they will be concurrently scheduled.
- Cacheability – the results of a URI service request are always cacheable.
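The thread-safety point in the list above can be pictured with a small sketch in which a dispatcher serializes calls to any component that has not declared itself thread-safe. This is an assumption about one possible mechanism, not a description of the NetKernel scheduler; `GuardedAccessor`, `GuardingScheduler` and the per-object lock are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical accessor that declares whether it may be called concurrently. */
interface GuardedAccessor {
    boolean isThreadSafe();
    String source(String uri);
}

/** Hypothetical dispatcher: serializes calls unless thread safety was declared. */
class GuardingScheduler {
    String dispatch(GuardedAccessor accessor, String uri) {
        if (accessor.isThreadSafe()) {
            return accessor.source(uri);
        }
        synchronized (accessor) { // one request at a time for undeclared accessors
            return accessor.source(uri);
        }
    }
}

/** Records the highest number of threads ever inside source() at once. */
class UnsafeCounterAccessor implements GuardedAccessor {
    private final AtomicInteger inside = new AtomicInteger();
    final AtomicInteger maxInside = new AtomicInteger();

    public boolean isThreadSafe() { return false; }

    public String source(String uri) {
        maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
        try {
            Thread.sleep(1); // widen the window in which overlap could occur
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        inside.decrementAndGet();
        return uri;
    }
}

class GuardDemo {
    /** Issues 20 requests from 4 threads and reports the observed peak concurrency. */
    static int maxConcurrency() {
        UnsafeCounterAccessor accessor = new UnsafeCounterAccessor();
        GuardingScheduler scheduler = new GuardingScheduler();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 20; i++) {
            pool.execute(() -> { scheduler.dispatch(accessor, "active:xslt"); });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accessor.maxInside.get();
    }
}
```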
Which XML technologies are integrated as services with NetKernel Standard Edition? Listing some of them risks sounding like a marketing brochure but it does give an idea of the extent to which the NetKernel abstraction actually works for existing standard technologies. NKSE includes the standard transform technologies: XSLT, XQuery; a range of runtime evaluated schema languages: RelaxNG, XML Schema, DTD, Schematron as well as simple XPath assertion. But also higher level technologies such as: server-side XForms, XSLFO renderer, HTML to XHTML conversion; security services, XML Signature, XACML, GUID generator; RDBMS services, SQLQuery, SQLUpdate, SQLBatch; XML index and search; XML spell checker; WebDAV clients; image processing services, SVG2PNG, charts... In all, a growing set of more than 70 services.
Having a large set of service-based XML technologies is interesting, even novel, but things become far more interesting when we recall that an Accessor is a client as well as a server. That is, an Accessor can issue requests as well as receive them. From this premise we have built two XML runtime Accessors which execute declarative languages to create XML processes.
- Declarative Process Markup-Language (DPML) is a simple language for composing XML services. It can be used to create pipelines or more complex processes with conditions, iterations etc. DPML is good for coordinating workflows.
- XML Resource Linker (XRL) is a pull runtime which recursively traverses a set of linked XML resources to generate a final XML result. XRL is good for generating Web-sites.
Both DPML and XRL issue URI requests to invoke the encapsulated standard XML services – this pattern is extensible; new services and custom logic can be added as new Accessors, exported to the URI address space and invoked from the high-level runtimes. Equally, new language runtimes can be added as services.
Earlier I mentioned that XML is painful to process with procedural code – it’s not just a matter of the design of the procedural APIs. Many feel that there is a mismatch between the declarative XML-domain and the procedural-domain; it is analogous to the historical mismatch between relational databases and procedural code – something that declarative SQL ameliorated. Under NetKernel, creating XML processes declaratively seems to be quick, robust and adaptive to change.
Here’s a simple example of an XML process written in DPML to source two separate database result sets as XML and combine them using XSLT into an XHTML table. For a quick reference to the DPML syntax look at
<idoc>
  <seq>
    <instr>
      <type>sqlQuery</type>
      <operator>
        <sql>SELECT * FROM orders WHERE total > 1000 ;</sql>
      </operator>
      <target>var:orders</target>
    </instr>
    <instr>
      <type>sqlQuery</type>
      <operator>
        <sql>
          SELECT addresses.* FROM addresses, orders
          WHERE orders.total > 1000
          AND addresses.customerid=orders.customerid
          GROUP BY addresses.customerid ;
        </sql>
      </operator>
      <target>var:addresses</target>
    </instr>
    <instr>
      <type>xslt</type>
      <operand>var:addresses</operand>
      <orders>var:orders</orders>
      <operator>file:///C:/process/style_customer_orders.xsl</operator>
      <target>this:response</target>
    </instr>
    <exception>
      <instr>
        <type>dpml</type>
        <operand>call_tech_support.idoc</operand>
        <exception>this:exception</exception>
        <target>this:response</target>
      </instr>
    </exception>
  </seq>
</idoc>

The exception block executes a sub-process, call_tech_support.idoc:

<idoc>
  <comment> Exception Handler Process </comment>
  <seq>
    <instr>
      <type>log</type>
      <operand>this:param:exception</operand>
    </instr>
    <instr>
      <type>copy</type>
      <operand>
        <div>
          There was an error processing your request - we could not generate the customer view - please notify
          <a href="http://support.bigcorp.com/poor/old/dave_sysop">Dave Sys-Op</a>
          and ask him to examine the log file to resolve the problem.
        </div>
      </operand>
      <target>this:response</target>
    </instr>
  </seq>
</idoc>
Note that there are no active URIs - under the hood the DPML runtime generates and manages requests for active: URIs issued against the kernel scheduler. The developer is offered a high-level runtime which, if they wish, can be peeled back to reveal the underlying NetKernel infrastructure.
In the example, the sqlQuery is an SQL utility service provided by NetKernel in the mod_db module. It performs a JDBC query operation and returns an XMLized Result Set consisting of a <results> root with <row> elements containing the result set rows as elements.
The example shows some of the efficiency of combining XML and SQL in the declarative domain without the need to go to procedural code. In order to keep things simple we haven’t shown how the SQL query could have been dynamically generated in the process.
For clarity we have used inline literal XML documents for the SQL queries but these could have been URIs to resources (e.g. file:///c:/process/customer_query.xml). Also we've not shown the XSLT that generates a presentational view of the result sets. The XSLT instruction applies a transform to the variable var:addresses and supplies var:orders as a parameter to the stylesheet.
This example also shows the exception handling of DPML. If an exception occurs in any of the DPML instructions (NetKernel requests) a DPML error handling process is executed. The user is told to contact tech support and the exception is logged – this is a simple example but illustrates what could have been a full diagnostic sub-process.
This is a very simple example of DPML. The full set of XML services and runtimes readily combine to build XML workflows or to process Web-services messages from any transport technology.
In the discussion of the Scheduler we did not mention that it supports breakpoints in the URI address space. This allows any request to be intercepted and inspected at runtime. NetKernel Standard Edition provides a high-level Debugger application that enables the debugging of application URI requests. A stopped request may be examined, any resources inspected and the state of all arguments viewed. It is also possible to traverse the request call-stack and examine the state of parent requests.
Due to their size and structure XML documents and processes are quite difficult to debug in procedural code. The NetKernel Debugger allows XML processes to be debugged with relative ease.
What are the properties of systems built on NetKernel?
Web-services on NetKernel are backwards compatible to the Web and indeed NetKernel can be deployed as a caching Web-application server. Having developed a number of production XML systems on NetKernel, including the XML application server's administrative services and tool set, we have observed some common properties.
An XML-service need not understand everything in a message, so long as it receives what it needs to do its job – compare this with the tight binding RPC approach of CORBA or SOAP-RPC.
NetKernel performs runtime validation – frequently it is more desirable to use simple XPath assertions to control process flow or perform service conformity checks than to use a tightly constrained schema. However, schemas, including a choice of multiple schema languages, can be used anywhere and are evaluated at runtime as services.
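To make the idea of an XPath assertion concrete, here is a minimal sketch using the standard javax.xml.xpath API from the JDK rather than a NetKernel service. `XPathAssertSketch` and the sample order message are invented for the example. Note how the check ignores elements it does not care about, which is the loose binding described above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper: evaluates an XPath expression as a boolean assertion. */
class XPathAssertSketch {
    static boolean holds(String xml, String assertion) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (Boolean) xpath.evaluate(assertion, source, XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + assertion, e);
        }
    }

    /** Chooses a process branch from the message content alone. */
    static String route(String orderXml) {
        return holds(orderXml, "/order/total > 1000") ? "review" : "auto-approve";
    }
}
```

A message such as `<order><total>1500</total><note>gift</note></order>` would take the review branch; the unrecognised note element has no effect on the outcome.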
NetKernel has no intrinsic types. A requestor may specify the representation of a resource when they issue a request. NetKernel will dynamically convert from the received form to the requested form – an example might be byte-stream converted to DOM, both aspects of the same XML resource. Unfortunately another article would be needed to explain in full NetKernel’s Representation Aspect Transrepresentation (RAT) model.
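The byte-stream to DOM case can be sketched with the JDK's own XML parser. This is only an analogy for the idea of requesting a representation, not the RAT model itself: `TransreptSketch` and its `as` method are hypothetical, and only three target forms are handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical converter: returns the same XML resource in the form the requestor asks for. */
class TransreptSketch {
    static <T> T as(byte[] xml, Class<T> wanted) {
        if (wanted.equals(byte[].class)) {
            return wanted.cast(xml); // already in the requested form
        }
        if (wanted.equals(String.class)) {
            return wanted.cast(new String(xml, StandardCharsets.UTF_8)); // assumes UTF-8
        }
        if (wanted.equals(Document.class)) {
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
                return wanted.cast(doc);
            } catch (Exception e) {
                throw new IllegalArgumentException("Could not parse XML", e);
            }
        }
        throw new IllegalArgumentException("No conversion to " + wanted.getName());
    }
}
```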
Address Space Protection
Encapsulating the components of an application behind modular public URI interfaces is beneficial from a development perspective but also has benefits for security/system integrity.
As an example, we wrote Blogxter, yet another blog application, but built from layered encapsulated modules. It extensively uses server-side XForms. We found that the final submission URI interface of a completed form can be hidden; not even exposed as an external URL on the HTTP transport. This is like having a form that POSTs to a hidden web-site – a spam agent cannot access the interface because it only exists internally and the form will only post on the internal interface when all its constraints are satisfied.
Systems built with layered modules exporting limited public interfaces are guaranteed to be protected, making the core parts of a system unreachable from any public Transport.
Loose coupling is a term that is often applied to a service based approach to systems. We’ve found that versionable modules and a flexible URI address space model provide the foundation for systems that are flexibly coupled and very tolerant of change.
In addition we run production systems with multiple generations of services, and have found it relatively easy to deal with differential rates of change in service evolution. Furthermore systems seem to be easy to extend whilst retaining backwards compatibility.
What are the compromises in using NetKernel?
- Raising the level of abstraction increases CPU and memory load
Any system abstraction has a cost relative to a ‘to-the-metal’ bespoke solution. The exact cost of using NetKernel is difficult to quantify but small, and indeed there are significant performance gains for well architected applications that take advantage of the dependency caching.
The dependency cache is very efficient at minimizing repeated computation or multiple resource requests. The dawn of 64-bit addressing and the falling cost of memory mean that it’s cost-effective to use more memory rather than incur network latencies or computational overhead. Nevertheless, the cache size is tunable, and in applications that are memory critical the cache can be omitted entirely.
Of course the point of any abstraction is to provide payback in improved productivity, robustness, maintainability and system scope.
- Paradigm shift / learning curve
NetKernel requires an investment in learning a new model, though the model is an extrapolation of the existing Web-model.
NetKernel Applications can require multi-disciplinary skills/teams – with at least a basic knowledge of XML standards and technologies. It is important to plan the URI address space of the system and to layer the application/process so that the interfaces can encapsulate services cleanly. As applications grow in complexity it is often the case that high-level services will call on lower-level services which in turn use even lower level services.
- Some proprietary knowledge needed
Where possible all high-level services emphasise and incorporate standard technologies but custom Accessors must be built on the NetKernel APIs.
The core abstractions of the system will be offered as standards – for example the active: URI has been published as an IETF RFC.
NetKernel is open source; it is commercial software released under a dual-license business model. Users are free to inspect, extend or replace any part of the NetKernel system.
- Emphasis on declarative systems
NetKernel applications are biased towards the Declarative-domain. Though applications can be written procedurally it seems to be far more efficient to create custom-services/business-logic as services in Accessors and integrate them into declarative processes.
Java developers can create encapsulated Accessor services that can be declaratively integrated into a process; if necessary the interface can be relocated in the URI address space. It is necessary to implement an Accessor with the IURAccessor interface; however, there is a higher-level API (XAccessor) that is suitable for 80% of cases, and the lower-level APIs and Kernel APIs can be progressively revealed if necessary.
- Not suitable for all application domains
NetKernel is not a magic bullet. It attempts to provide an infrastructure which offers the fundamental properties of Web-systems in a managed operating system. There are always alternative solutions to any problem and there are applications where the Web is not the appropriate model.
The Web architecture, which some are now calling REST, is relatively simple and pragmatic, yet it has proven to be the most successful distributed system ever. NetKernel is a natural evolution, although it can appear to be revolutionary since it takes several steps in one go. It provides a Web operating system upon which Web applications can be developed that are fully backwards compatible and which can in turn be used as the basis for truly Web-like XML-services, whatever your chosen application protocol.
About the Author
Peter Rodgers is a co-architect of 1060 NetKernel. Before founding 1060 Research he led HP Labs’ Dexter research programme. Before that he had a dark history as an industrial researcher. He holds a PhD in solid-state Quantum Physics from the University of Nottingham.
The NetKernel abstraction was discovered by standing on the shoulders of many giants. I am indebted to my former colleagues Dr Russell Perry and Dr Royston Sellman and most of all to Tony Butterfield, co-founder 1060 Research, co-architect of NetKernel and developer of the Java microkernel. Thanks to Robert Leftwich and Suhail Ahmed for reviewing this article. Finally, Tony and I thank our wives and kids, for going without food for a year and not complaining.
If you think WS-* is incomprehensible you are not alone. Here's what Michael Champion, Chair of the W3C Web-Services Architecture Working Group, recently said: “[The WSA WG final report] tries more to state what we learned from the process than to proclaim ex cathedra what the W3C Recommends about things that are beyond current human comprehension. ;-)” email to xml-dev mailing
 NetKernel Whitepaper, Rodgers,
 REST, R.T. Fielding
 Butterfield et al, Active URI IETF RFC.
 NetKernel Standard Edition XML Accessor Reference.
 Representation, Aspect, Transrepresentation, NetKernel Model
 Blogxter, a blog application for NetKernel built from layered XML-services.