One of the reasons Java became popular was that the language itself included primitives for managing threads, such as the synchronized keyword.
Everyone knows that concurrent programming and thread management in general are necessary, because web programming is essentially concurrent: multiple clients can access the server at the same time. Necessary, but also complicated.
In addition to the inherent complexity of synchronizing shared resources, everyone knows that thread management is very costly and resource-intensive for the operating system, so servers rarely use many more than 100 threads. Tomcat 6, for example, limits the maximum number of threads to 200 by default.
The problem is that the standard Java servlet binds a thread to a single request, so the thread is not free to execute another request until the current one terminates. Therefore the maximum number of concurrent requests is determined by the maximum number of threads we can use for requests.
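To make that limit concrete, here is a minimal sketch in plain Java (8 or later), not a real servlet container. A fixed thread pool stands in for the container's request threads, and each simulated request holds its thread until it finishes. The names ThreadPerRequestDemo and peakConcurrency are hypothetical, invented only for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadPerRequestDemo {
    // Runs `requests` simulated requests on a pool of `maxThreads` threads and
    // returns the highest number of requests that were in flight at once.
    static int peakConcurrency(int maxThreads, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // Keeps the first requests open until every pool thread is occupied
        CountDownLatch allThreadsBusy = new CountDownLatch(Math.min(maxThreads, requests));
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                allThreadsBusy.countDown();
                try {
                    // The thread is held for as long as the request stays open
                    allThreadsBusy.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                active.decrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Five requests, but only two threads to serve them
        System.out.println("peak = " + peakConcurrency(2, 5)); // prints "peak = 2"
    }
}
```

However many requests are queued, the number being served at any moment never exceeds the number of threads, which is exactly the bound described above.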
To avoid this problem, application server vendors (Tomcat, Jetty, GlassFish ...) came up with proprietary extensions. All of them, in one way or another (though usually via Java NIO), break the "1 request - 1 thread" association, so the request thread can finish without ending the request. Because the request is no longer tied to a thread, it is said to be "non-blocking" or "asynchronous."
"Normal" web requests end as soon as possible. The problem arises with long-polling Comet, where a request may be held open for a long time. With the standard servlet the thread is held (stopped) too, because the request (and its associated objects) can only be used within the request thread. A thread that is held cannot be used for other requests, so you need to reserve as many threads as there are Comet users.
In the alternative Java NIO approach, one thread is capable of processing multiple requests, which avoids switching between thousands of threads and allows greater scalability and, consequently, a greater number of concurrent users.
The problem introduced by the "single-threaded" approach of NIO is that we must take care, for example in Comet programming, not to block the NIO thread with our own actions, because if the NIO thread blocks, many other requests will have to wait. A database operation is a typical example: it is clearly blocking and may take time, and meanwhile the system (the NIO thread) is "stopped" waiting for the response from the database, so other requests are stopped as well.
To avoid this problem we can always create a thread (or take one from a pool) and delegate the database query to it, unblocking the NIO thread. The problem with this practice is that it brings back multithreaded programming, with all the scalability drawbacks described earlier.
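This delegation can be sketched in a few lines of plain Java (8 or later). It is only an illustration under stated assumptions: slowQuery is a hypothetical stand-in for a blocking JDBC call (it just sleeps), and the class and method names are invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DelegateToPoolDemo {
    // Hypothetical stand-in for a blocking JDBC call: it just sleeps
    static String slowQuery(String sql) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "rows for: " + sql;
    }

    // Hands the blocking query to a pool thread and returns
    // { name of the thread that ran the query, query result }
    static String[] queryOnWorker(String sql) {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try {
            Future<String[]> pending = workers.submit(
                () -> new String[] { Thread.currentThread().getName(), slowQuery(sql) });
            // submit() returns at once, so the calling thread could serve other
            // requests here. The demo waits only to have something to print.
            return pending.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = queryOnWorker("SELECT 1");
        System.out.println(result[0] + " ran the query: " + result[1]);
    }
}
```

The query runs on a pool thread rather than on the caller, which is the whole point, and also the reason why we are back to managing several threads.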
To solve this problem there is an emerging "new" paradigm: "asynchronous programming." Asynchronous programming promotes using the same thread to process multiple requests sequentially, with no request ever blocking the thread; as we will see later, the operations performed by requests are executed "in pieces." To get there we must avoid running tasks that block the thread, so we need a non-blocking API. A database operation through a non-blocking API, for instance, "registers" the operation but does not run it immediately, so the method call returns at once without having completed the database task. At the same time we provide a callback to be called when the database operation ends, and the flow continues in that callback. This way of executing code is called "asynchronous," in contrast to the sequential way, in which the call does not return until the task finishes.
Through this registration the task is queued internally. The main thread checks whether the queued tasks have finished, using asynchronous APIs of the operating system or a thread pool, because the goal is never to block the main thread that runs the requests. Since the code of each request is executed little by little, the requests seem to run "in parallel," but on the same thread.
If we need to execute tasks in parallel, we can simply do more things right after calling the long asynchronous task. When that task finishes (which usually means waiting for input/output from a device), the flow continues in the callback, but always on the same main thread. Because the same main thread always executes the requests, there is no concurrent access to shared objects, so you are free of synchronization problems. This is obviously not possible with JDBC, for example, because that API is blocking (although nothing precludes building an asynchronous API on top of JDBC).
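The mechanism just described (register, return at once, run the callback later on the main thread) can be sketched as a toy event loop in plain Java (8 or later). This is an assumption-laden illustration, not how any real server is implemented: MiniEventLoop, readAsync and demo are hypothetical names, and the slow "read" is simulated with a sleep on a helper pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class MiniEventLoop {
    // Callbacks whose operation has finished, waiting for the loop thread
    private final BlockingQueue<Runnable> ready = new LinkedBlockingQueue<>();
    // Helper pool that does the waiting, so the loop thread never blocks on I/O
    private final ExecutorService io = Executors.newCachedThreadPool();
    private int pending = 0; // touched only by the loop thread

    // Registers a simulated slow read and returns at once;
    // the callback runs later, on the loop thread.
    void readAsync(String name, long millis, Consumer<String> callback) {
        pending++;
        io.execute(() -> {
            try {
                Thread.sleep(millis); // stands in for waiting on a device
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ready.add(() -> callback.accept("contents of " + name));
        });
    }

    // Runs finished callbacks one at a time on the calling thread
    // until no registered operation is left.
    void run() {
        try {
            while (pending > 0) {
                Runnable next = ready.take();
                pending--;
                next.run(); // may register new operations
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            io.shutdown();
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>(); // only the loop thread writes to it
        MiniEventLoop loop = new MiniEventLoop();
        loop.readAsync("slow.txt", 300, text -> log.add("callback: " + text));
        loop.readAsync("fast.txt", 50, text -> log.add("callback: " + text));
        log.add("both reads registered"); // reached before either read completes
        loop.run();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Both registrations return before either read has completed, and every callback runs on the thread that called run(), so the log list needs no synchronization. With these delays the callback for the faster read normally arrives first.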
One example of this emerging programming paradigm is Node.js, a web application server that you program in JavaScript. The paradigm fits JavaScript well because the language does not support threads.
Node.js code example (note the callback registration):
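    var fs = require('fs');

    // Register a callback; readFile returns immediately
    fs.readFile('treasure-chamber-report.txt', 'utf8', function (err, report)
    {
        if (err) throw err;
        console.log("oh, look at all my money: " + report);
    });

    // This call is issued without waiting for the read above to finish
    fs.writeFile('letter-to-princess.txt', '...', function (err)
    {
        if (err) throw err;
        console.log("can not wait to hear back from her!");
    });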
For more information about Node.js this link is useful
The API of Node.js is non-blocking, either because the task itself does not block or because, when it does, Node.js avoids the block by letting us register a callback. Every call to the Node.js API is an opportunity for the Node.js engine to switch to another request and execute any pending callback that was waiting for a blocking operation to finish. In this way Node.js SWITCHES between requests using the same thread, so that running requests are gradually executed "in parallel." Our code, instead of following a synchronous flow or sequence, is broken into small pieces registered as callbacks, and inside a callback we can again call asynchronous methods that register new callbacks (more pieces of code).
As you can see this strategy is JUST GREAT!
However, it has just one problem ...
IT WAS INVENTED AROUND 40 YEARS AGO AND IS A HANDMADE, INEFFICIENT AND TEDIOUS VERSION OF THE JOB OF A THREAD SCHEDULER!
And, as expected, the problems soon arise.
Software threads are a mere illusion. Every core of a processor is almost single-threaded (on common Intel processors); a thread scheduler switches the processor (core) registers to continue running a different area of code on a different stack for a very small time slice, giving us the illusion of parallelism, which is only real when the processor has multiple cores or each core can execute several hardware threads. This context switch is automatic, regardless of the code being executed, and follows a policy of effective processor usage. It is very difficult for a manual alternative to beat a thread scheduler (there was a time when this was possible on older Linuxes, hence SEDA, but it is no longer true), because a thread scheduler can take control of the processor/core away from a software thread at any time.
It is FALSE that thread management is costly, as was demonstrated in this article. I have recently found another link that corroborates this, although at first glance it seems to say the opposite (I recommend reading my comment).
It is FALSE that thread management is costly: CPU usage is ZERO, as you can see in this example with 3000 threads:
    public class WaitingThreads
    {
        public static void main(String[] args)
        {
            // Start 3000 threads; each one just waits on its own monitor for 40 seconds
            for (int i = 0; i < 3000; i++)
            {
                Thread thread = new Thread()
                {
                    final Object monitor = new Object();

                    public void run()
                    {
                        synchronized (monitor)
                        {
                            try { monitor.wait(40000); }
                            catch (InterruptedException ex) { }
                        }
                    }
                };
                thread.start();
            }
        }
    }
It is FALSE that we can effectively separate blocking and non-blocking tasks and make maximum use of the CPU with a single thread.
The following example executes many millions of iterations, each incrementing a simple integer variable: an extreme example of a non-blocking task. Interestingly, the higher the number of threads set in the THREADS variable (the total number of iterations and increments stays the same), the shorter the run time, despite the cost of creating and initializing the threads (with a pool that cost would not exist). Try setting THREADS to the number of cores of your computer and then to, for example, 1000: the time is lower with 1000 threads! Now imagine the performance difference with more complicated, blocking tasks.
    public class CountingThreads
    {
        public static void main(String[] args) throws Exception
        {
            final int THREADS = 8; // try the number of cores, then 1000
            // The total number of iterations is the same for any THREADS value
            final long LENGTH = 100000000000L / THREADS;
            long start = System.currentTimeMillis();
            Runnable task = new Runnable()
            {
                public void run()
                {
                    long j = 0;
                    for (long i = 0; i < LENGTH; i++)
                        j++;
                }
            };
            Thread[] threadList = new Thread[THREADS];
            for (int i = 0; i < threadList.length; i++)
            {
                threadList[i] = new Thread(task);
                threadList[i].start();
            }
            // Wait for every thread to finish before stopping the clock
            for (int i = 0; i < threadList.length; i++)
                threadList[i].join();
            long end = System.currentTimeMillis();
            System.out.println("END " + (end - start));
        }
    }
Fortunately, Servlet 3.0 lets us decide how many threads we think our Comet application needs. An interesting example.
My recent experience with ItsNat Comet tells me that it is not uncommon for a Comet application to need to notify ALL clients at the same time, and the most effective and performant way to do so is for each client to have an associated thread on the server.
Please stop this nonsense and if necessary add threads to JavaScript...