Java Development News:
When is SOAP a good idea in a project
By Billy Newport
01 Jan 2000 | TheServerSide.com
SOAP is basically an XML marshalling mechanism for RPC calls. It doesn't specify a transport although it is most commonly used with HTTP. It could also be used to encode an RPC call over a simple TCP/IP socket or a JMS message.
People have a tendency to use the latest technology for no other reason than 'just because'. This article gives the reasons why SOAP may be appropriate in conjunction with session beans and when it makes little sense.
Basic overview of operation
You need a dispatcher on the server side. The dispatcher receives a SOAP envelope from a transport. A SOAP envelope is an XML document that encodes the service required, the method on that service and the input parameters for the method. The dispatcher then decodes the envelope and invokes the real Java method underlying the SOAP service. The server doesn't have to be Java but I'll use Java as the server language in this story. The decoding and invocation of the Java method is normally done using an off-the-shelf component such as IBM's SOAP4J package, available from their alphaWorks web site.
The underlying code then executes and returns to the SOAP runtime. The runtime catches any exceptions, returned objects and output parameters. It then encodes these into a response XML document which is transmitted back to the client using the transport, for example as an HTTP response when the transport is HTTP.
That's basically it. So, the SOAP runtime supplies all the code for demarshalling the request, invoking an adapter and then marshalling the response packet. We supply the adapter, which delegates the call to the real underlying objects. The IBM SOAP runtime comes with code for an HTTP transport, but you're free to use any transport you want; all the hooks for this are present. So, SOAP is basically giving us a simple RPC mechanism. It can be used as a replacement for RMI/IIOP, and indeed Microsoft's .NET is going down this path, but a couple of things have to be remembered and I think it's a mistake to use it blindly. The ability to support a pluggable protocol stack (allowing DCE or IIOP or anything else) is something Microsoft, or indeed J2EE vendors, should provide to give developers more performance options.
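To make the flow above concrete, here is a minimal sketch of a dispatcher written against the JDK's own XML parser and reflection. It is not the SOAP4J API. The class names, the urn:quote-service identifier, the QuoteService object and the simplified response layout are all assumptions made for illustration, and only string parameters are handled.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class SoapDispatcherSketch {
    // SOAP 1.1 envelope namespace.
    static final String ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Sample request: service "urn:quote-service" (hypothetical), method getQuote, one parameter.
    static final String SAMPLE_REQUEST =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + ENV_NS + "\">"
      + "<SOAP-ENV:Body>"
      + "<m:getQuote xmlns:m=\"urn:quote-service\"><symbol>IBM</symbol></m:getQuote>"
      + "</SOAP-ENV:Body></SOAP-ENV:Envelope>";

    /** Hypothetical business object standing in for the real underlying code. */
    public static class QuoteService {
        public String getQuote(String symbol) {
            return "IBM".equals(symbol) ? "112.50" : "0.00";
        }
    }

    // Maps a service identifier (the call element's namespace) to its target object.
    private final Map<String, Object> services = new HashMap<>();

    void register(String serviceUri, Object target) {
        services.put(serviceUri, target);
    }

    /** Decodes the envelope, invokes the Java method, encodes the result or a fault. */
    String dispatch(String envelopeXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(envelopeXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Element body = (Element) root.getElementsByTagNameNS(ENV_NS, "Body").item(0);
            Element call = childElements(body).get(0);

            Object target = services.get(call.getNamespaceURI());
            if (target == null) throw new IllegalArgumentException("unknown service");

            // Assumption: every parameter is passed as a String.
            List<Object> args = new ArrayList<>();
            for (Element param : childElements(call)) args.add(param.getTextContent());
            Class<?>[] types = new Class<?>[args.size()];
            Arrays.fill(types, String.class);

            Method method = target.getClass().getMethod(call.getLocalName(), types);
            Object result = method.invoke(target, args.toArray());
            return envelope("<return>" + escape(String.valueOf(result)) + "</return>");
        } catch (Exception e) {
            // Any failure, including an exception from the target, becomes a fault.
            return envelope("<SOAP-ENV:Fault><faultcode>SOAP-ENV:Server</faultcode><faultstring>"
                + escape(String.valueOf(e)) + "</faultstring></SOAP-ENV:Fault>");
        }
    }

    private static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) out.add((Element) n);
        }
        return out;
    }

    private static String envelope(String bodyContent) {
        return "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + ENV_NS + "\"><SOAP-ENV:Body>"
            + bodyContent + "</SOAP-ENV:Body></SOAP-ENV:Envelope>";
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        SoapDispatcherSketch dispatcher = new SoapDispatcherSketch();
        dispatcher.register("urn:quote-service", new QuoteService());
        System.out.println(dispatcher.dispatch(SAMPLE_REQUEST));
    }
}
```

A real runtime would also handle typed parameters, output parameters and the transport itself; the sketch only shows the decode, invoke and encode steps.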
I would say that the following statements apply to SOAP implementations in general:
- SOAP will consume more bandwidth on the pipe than RMI/IIOP.
This goes without saying. It is XML based so it is going to be much larger than a binary marshalling like CDR or XDR.
- SOAP is more expensive to marshal than RMI/IIOP.
It's bigger so it's more bytes to push onto the pipe. It's an ASCII protocol so data needs to be converted to strings rather than transmitted in its binary form. This all consumes precious server cycles.
- It requires more memory.
Building those strings and parsing them will use more memory and possibly leave more garbage than an IIOP ORB would use.
- It is not a native EJB protocol.
It requires more work on your side than simply using RMI/IIOP which is essentially free.
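The size argument in the list above can be illustrated with a rough comparison. The sketch below encodes the same five integers once as fixed-width binary and once as XML text. The binary form is only in the spirit of CDR or XDR, not an actual encoder for either, and the element names are made up. A real SOAP envelope adds namespaces and type attributes on top of this, so the gap in practice is wider.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class MarshalSizeSketch {
    /** Fixed-width binary: a 4 byte count followed by 4 bytes per value. */
    static byte[] binary(int[] values) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(values.length);
            for (int v : values) out.writeInt(v);
        } catch (IOException e) {
            throw new IllegalStateException(e); // cannot happen for an in-memory stream
        }
        return buf.toByteArray();
    }

    /** Text encoding: every value becomes a string wrapped in tags (hypothetical names). */
    static byte[] xml(int[] values) {
        StringBuilder sb = new StringBuilder("<values>");
        for (int v : values) sb.append("<item>").append(v).append("</item>");
        sb.append("</values>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int[] sample = {1, 20, 300, 4000, 50000};
        System.out.println("binary bytes: " + binary(sample).length);
        System.out.println("xml bytes:    " + xml(sample).length);
    }
}
```

For this sample the binary form is 24 bytes and the XML form is 97 bytes, and the XML side also pays for the integer-to-string conversions and the temporary strings they create.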
So, given all of the above, I hope it's clear that blindly using SOAP/HTTP instead of RMI/IIOP for your RPC mechanism would lead to a slower system that takes more server side resources to support and is more complex (how complex depends on tooling etc.) to develop. It also requires you to write code to bridge the SOAP runtime to your objects. Right now there is no tooling available to generate this code. While I expect this tooling to arrive, it isn't here yet so you'll need to write a generator or hand code the bridge. If you have a complex interface then this may be a significant amount of code.
The next sections try to provide reasons why we may be willing to tolerate this overhead. The intent is to show the scenarios in which the benefits outweigh these disadvantages.
Clients/applets via a WAN or the internet.
If you have a browser based application or fat client connected via the public internet to your application servers then SOAP can make a lot of sense. When built on top of HTTP, it's more internet friendly than IIOP is. Why? Firewalls. There may be zero or more firewalls between the applet/client and your server, for example:
Probably no firewalls.
- Client using ISP.
Your DMZ is using two firewalls, one between the net and your DMZ and another between the DMZ and your intranet. If your app server is in the DMZ then that's one, and if your app server is on the intranet (more likely) then that's two.
- Corporate client.
This is as above except the client is not directly connected to the net. It goes through its corporate firewall and possibly a DMZ (two firewalls). So, we may have 4 firewalls in this case.
Normally, any firewall should be configured to allow an applet to open an HTTP connection to the server that the applet was downloaded from. You can place your SOAP gateway on that server and therefore your applet can communicate with the application server even in the presence of firewalls. Firewalls and IIOP do not mix very well. Very few firewalls will have the IIOP ports open for communication. This means that IIOP is very likely to be blocked by a firewall between your client and your server. The worst case above would mean that you need to convince the administrators of 4 firewalls in two different companies to open a set of ports. There are two problems with that:
- First, the need to force your customers to change their existing security infrastructure in order to use your application will be a reason for them not to use it.
- Second, these people are rightly paranoid about opening any incoming ports on their firewalls.
This means IIOP is a bad choice as a protocol for your applet. Next, there is the code that you need to download with the applet in order to use IIOP. You would need to bundle your EJB server's runtime and the stubs for the beans into your applet. This could be a significant amount of code. You also need the RMI/IIOP add-on for the Java 1.1 JDK. But, there is a slight problem here. This needs a native code library, ioser. You won't be able to download this. Even if you could, you may have interoperability problems between Java 1.1 RMI/IIOP and Java 1.2 RMI/IIOP.
RMI is actually a more applet friendly technology than RMI/IIOP simply because it is 100% Java and can be downloaded to an applet easily, although there are still compatibility issues between 1.1 and 1.2 (serialization). This means that J2EE servers that give you a choice of RMI or RMI/IIOP have an advantage over ones that don't when using applets. Of course, if your J2EE server has a 100% Java runtime for RMI/IIOP then this point doesn't matter. It's also possible that your EJB vendor doesn't supply a 1.1 compatible runtime. This means that your clients need to install the 1.2 bridge in their browsers. This can also be problematic.
RMI still has the same problems as IIOP with regard to firewalls. It is very likely that the ports it needs will be blocked by the firewalls. The point is that getting RMI or RMI/IIOP communication between an applet and your server over anything except an intranet is hard.
Some products offer IIOP/HTTP tunneling. This puts the IIOP requests inside an HTTP request, sends the request using HTTP to the server and then converts it back to IIOP to dispatch it. But, you still need a runtime that will work in the applet and you still may need the ioser library. You also still need to download the client jars. Your particular EJB server may help with some of these problems; it may even make it easy. I'm just pointing out the issues so you can check whether your particular implementation deals with them.
RMI/IIOP is not built in to the JRE until V1.3
RMI/IIOP is an add-on for JRE 1.1 and JRE 1.2. It is not built in. This explains why it must be downloaded with your applet. JRE 1.3 will include this code as part of the JRE for the first time. But, given that most browsers currently only support JRE 1.1, until the browsers support 1.3 (unlikely any time soon given Microsoft's position on Java) you will need your clients to install Sun's 1.3 plugin for your applets to run. The only problem with this is that it complicates installation slightly; it's an extra step that would not be needed if we could just use JRE 1.1 for the applet. WebSphere does not support JRE 1.3 at the moment in any case but some other J2EE servers may.
So, how can SOAP help?
SOAP/HTTP uses HTTP as the transport. This means every JDK, even the 1.0.x ones, has all the code necessary to support this transport without any additional code. IIOP is also normally unencrypted. Some servers do support IIOP/SSL but this is even less likely than IIOP to pass through a firewall. HTTP/SSL, or HTTPS, on the other hand, will get through almost every firewall in existence, just like HTTP. Every J2EE server on the market supports HTTP/SSL and it's a commonly used feature so it's going to work. Again, if your EJB server supports IIOP over HTTP tunneling then maybe it also supports IIOP over HTTPS, but again the runtime is even larger than before. If you are writing an application that sends sensitive data then encrypting this may be a requirement. But, HTTP/SSL is not supported out of the box with JDKs. You need an add-on protocol handler to support this. Sun provides one for 1.2 JDKs and you can get 1.1 compatible SSL protocol handlers also.
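As a rough illustration of how little client code the HTTP transport needs, the sketch below posts an envelope using only classes that ship with the JDK. It is written against a current JDK rather than the 1.1 era runtimes discussed here, and the endpoint URL and SOAPAction value passed in are hypothetical. The text/xml content type and the quoted SOAPAction header follow the SOAP 1.1 HTTP binding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class SoapHttpClientSketch {
    /** Configures, but does not yet open, a POST connection for a SOAP call. */
    static HttpURLConnection prepare(String endpoint, String soapAction) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
            conn.setRequestProperty("SOAPAction", "\"" + soapAction + "\"");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends the envelope and returns the response body as text. */
    static String post(String endpoint, String soapAction, String envelopeXml) {
        HttpURLConnection conn = prepare(endpoint, soapAction);
        try {
            try (OutputStream out = conn.getOutputStream()) {
                out.write(envelopeXml.getBytes(StandardCharsets.UTF_8));
            }
            // Error responses carry their body on the error stream.
            InputStream in = conn.getResponseCode() < 400
                ? conn.getInputStream() : conn.getErrorStream();
            if (in == null) return "";
            try (InputStream body = in) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                for (int n; (n = body.read(chunk)) != -1; ) buf.write(chunk, 0, n);
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: SoapHttpClientSketch <endpoint-url> <envelope-xml>");
            return;
        }
        System.out.println(post(args[0], "", args[1]));
    }
}
```

Because this is an ordinary HTTP POST, it passes through the same firewalls and proxies as the page that served the applet.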
So, for me, if you can't use tunneling then IIOP is not an option between your client/applet and your server when firewalls are in the communications path. The only protocol you can rely on in the presence of firewalls is HTTP. But, even if you can use tunneling, I'd say wait and think about this. You are about to depend on what's probably one of the least used features of your EJB server. There are not a lot of people using it. This means you are more likely to have problems, and if you find a problem then you will probably be a low priority for getting it fixed by your vendor. Unless you're absolutely sure it is solid, I'd still go for HTTP just to reduce your project risk. HTTP also gives you a protocol that can be encrypted with low risk.
Fat clients on your intranet
Here I wouldn't use SOAP if you're sure that the client will not degenerate to the above case; use RMI/IIOP, it's easiest. Otherwise, use RMI/IIOP anyway but stick to a very simple request/reply interaction between your server and your client. That way, you should be able to switch to a SOAP approach later without needing to rewrite your application. Why a simple request/response interface? For me, there are 3 levels of interface:
- Fine grained.
A very fine grained interface is a bad idea in any case. You will incur a lot of overhead from the LAN, and on the server side you will spend more CPU cycles in the container stubs relative to your business logic, stealing cycles away from your application. If you profile your application in this scenario then you'll probably find a large percentage of the CPU time is spent in the EJB runtime rather than your code. You need to balance the coarseness of your methods against this cost.
- Medium grained.
A medium grained interface is usually recommended for an intranet application as it's still 'nice' to use but works well on an intranet. Here an intranet is a local network or a relatively speedy corporate WAN.
- Coarse grained.
You can build your server side code with this type of interface, or alternatively you can turn a medium grained interface into a coarse one relatively easily by using session beans as a facade to your existing medium grained interface. These session beans should have a very simple interface that could be wrapped using JMS or SOAP.
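The facade idea can be sketched in plain Java. The interfaces below are hypothetical and use no EJB APIs; AccountFacadeImpl stands in for the session bean. The client makes one request/reply call with simple types, and the facade makes the medium grained calls locally on the server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CoarseFacadeSketch {
    /** Medium grained interface: one call per piece of data (hypothetical). */
    interface AccountService {
        String ownerName(String accountId);
        long balanceCents(String accountId);
        String lastTransaction(String accountId);
    }

    /** Coarse grained facade: one request in, one reply out, simple types only. */
    interface AccountFacade {
        Map<String, String> accountSummary(String accountId);
    }

    /** Stub implementation that counts how many fine calls were made. */
    static class CountingAccountService implements AccountService {
        int calls;
        public String ownerName(String accountId) { calls++; return "Owner of " + accountId; }
        public long balanceCents(String accountId) { calls++; return 1050L; }
        public String lastTransaction(String accountId) { calls++; return "deposit"; }
    }

    /** Plays the role of the session bean facade in front of the existing interface. */
    static class AccountFacadeImpl implements AccountFacade {
        private final AccountService delegate;

        AccountFacadeImpl(AccountService delegate) { this.delegate = delegate; }

        public Map<String, String> accountSummary(String accountId) {
            // Three local calls on the server replace three round trips from the client.
            Map<String, String> summary = new LinkedHashMap<>();
            summary.put("owner", delegate.ownerName(accountId));
            summary.put("balanceCents", Long.toString(delegate.balanceCents(accountId)));
            summary.put("lastTransaction", delegate.lastTransaction(accountId));
            return summary;
        }
    }

    public static void main(String[] args) {
        CountingAccountService service = new CountingAccountService();
        AccountFacade facade = new AccountFacadeImpl(service);
        System.out.println(facade.accountSummary("42"));
        System.out.println("calls behind the facade: " + service.calls);
    }
}
```

Since the facade method takes a string and returns a flat map of strings, the same call could later be carried in a SOAP envelope or a JMS message without touching the code behind it.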
Non-Java clients
Whilst RMI/IIOP promises interoperability with non-Java languages, this comes with some restrictions.
- The client really needs a CORBA 2.3 ORB
Otherwise, you'll have severe limitations on the signatures for your interfaces. No objects by value, no vectors, no arrays etc.
- Limited security support.
You'll almost certainly lose security. This is a really serious problem. Usually, the only way you'll get security is by using an ORB from the same vendor as your EJB server. The problem here is that many EJB server vendors (BEA and IBM, for example) don't provide a standalone ORB. Iona/Visigenic do provide a standalone ORB but I'm not sure whether this is possible with even these vendors.
SOAP offers a good solution for these clients. HTTP is a very simple protocol to implement and you may be able to find a library you can use for this. The SOAP protocol isn't so hard to implement on top and you may be able to find a SOAP implementation off the shelf. Microsoft provides this on Windows platforms. The thing about HTTP is that security will work and be integrated with the security provided by your application server. The servlet that implements your request dispatcher on the server side just needs an ACL applied so that the EJB server's servlet container forces an HTTP authentication. Your client can easily handle this return code and bingo, it's authenticated.
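Handling that return code on the client is a small amount of code. The sketch below assumes the container challenges with HTTP Basic authentication, which is an assumption since the scheme depends on how the ACL is configured. The class and method names are illustrative.

```java
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthSketch {
    /** True when the server answered with a 401 challenge. */
    static boolean needsCredentials(int httpStatus) {
        return httpStatus == HttpURLConnection.HTTP_UNAUTHORIZED;
    }

    /** Builds the value of the Authorization header for HTTP Basic authentication. */
    static String authorizationHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
            .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // On a 401, the client repeats the request with this header added.
        if (needsCredentials(401)) {
            System.out.println("Authorization: " + authorizationHeader("Aladdin", "open sesame"));
        }
    }
}
```

Basic authentication only encodes the password, it does not encrypt it, so in practice it would be combined with HTTP/SSL.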
Certificate based authentication.
You could also use HTTP/SSL with certificates if you need certificate based authentication. You just need a client side SSL implementation that supports certificates. This is new in WebSphere V3.5; it did not work in WebSphere V3.02. However, certificate based authentication for fat clients is not supported with WebSphere V3.5, so if you want certificate authentication then HTTP or HTTPS are your only choices.
This is a big deal. You can now have strong security (authentication and encryption) with both Java and non-Java clients.
VB and COM clients.
You can try to use COM-CORBA or COM-EJB bridges or the Sun ActiveX/Bean bridge, but when you're using Microsoft type clients, given that you can get a very good SOAP/HTTP implementation from MS, why bother? Implement a SOAP adapter on the server side and use SOAP between the two. All of the Microsoft Office and VB type applications will support SOAP. Plus, you won't need to install any software on the client; it should all be built in. This means that if we were linking an Excel document to our server, anybody with Excel installed could just open the file and use it. I have done this using my own non-SOAP RPC mechanism built with HTTP and XML. It greatly simplified the deployment of the 'application' as no extra software needed to be installed. Once Microsoft adds SOAP support to Office, or if you implement a pure VB SOAP runtime that you can embed in the spreadsheet, you should be able to do the same.
Right now, I think EJB servers are poised to take advantage of SOAP but it may be another 6 months before it's going to be easy. This is because we're only just seeing the tail end of RMI/IIOP adoption by the vendors. If you want SOAP support then you will need to write or generate a bridge through which the requests will pass. This means you receive and demarshal the request only to remarshal it as IIOP, and vice versa on the return path. Hence, you may be doing the marshalling twice and this will lower performance.
One vendor, Iona, has thought ahead. Their EJB server is based on their excellent ART ORB which features a pluggable protocol stack. This means they may be able to receive a SOAP request and then directly dispatch it to the bean implementation, thus avoiding the IIOP step altogether. This may give them a performance advantage over the bridge approach. But, the other vendors may simply modify their containers to do the same for SOAP and gain the same efficiencies, i.e. their containers will listen for both IIOP and HTTP requests and dispatch these requests directly to your beans.
So, here are the scenarios where I would think about using SOAP with an EJB server today:
- Your clients are Windows type clients or are on the other side of a WAN or the internet.
- You have non-Java or Microsoft clients.
- You need security and/or encryption with non-Java clients.
I would not use SOAP as the method for communicating within my system, for the same reasons I would not use XML within the boundaries of a subsystem either. It's bulky and more expensive to process than RMI/IIOP or, in the case of XML, than serialized Java objects. Of course, in theory, you could do this and be buzzword compliant but I'm afraid in practice, as you should now see, it's not quite so black and white.