Scalability: Can you say that one language isn't scalable?

  1. J2EE developers often pride themselves on the scalability of the platform. We are always talking about multi-tier environments, and perhaps we seem to care more about these things on forums. Jack Herrington steps up to dispel the myth that "Java scales and PHP doesn't." Surely it's about more than a language?

    We all see J2EE applications that cannot scale, and I am sure that many scalable sites run PHP, Perl, etc.
    How much is about the language/platform... and how much is simply about the architecture?

    Has the PHP myth grown due to high profile cases like JBoss implementing Nukes in Java due to performance?

    Read: The PHP Scalability Myth

    NOTE: TheServerSide has an article coming out soon about scaling for performance and availability, discussing the use of distributed caches to help out the DB.

    Threaded Messages (23)

  2. First, let me say that PostNuke is an excellent framework. Very early this year, we ported the JBoss website to PHP PostNuke because we loved the functionality. Unfortunately, it didn't scale one bit, not even a little bit. The main culprit, I believe, was that PHP PostNuke made SQL queries for every part of the webpage on every single HTTP call. No caching whatsoever. It brought www.jboss.org to its knees, and our community was upset for a few days until we brought back the old JBoss-driven website.

    Since we liked PostNuke so much, we decided to start Nukes on JBoss, which is built upon J2EE and JBoss and has been running www.jboss.org since May, thanks to Nukes creator Julien Viet.

    Is Java more scalable than PHP? I don't know. But Nukes on JBoss is definitely orders of magnitude more scalable than PHP PostNuke. We know from experience.

    Bill
  3. > we ported the JBoss website to PHP PostNuke because we loved the functionality. Unfortunately, it didn't scale one bit, even a little bit. The main culprit, I believe, was that PHP PostNuke made SQL queries for every part of the webpage on every single HTTP call. No caching whatsoever.


    The problem lies in PostNuke's architecture, and _not_ in PHP. If you think the problem is the PHP language, I'd be interested to know which part of the language (and not the application architecture) caused it.

    > Since we liked Postnuke so much, we decided to start Nukes on JBoss. Which is built upon J2EE and JBoss and is now running www.jboss.org since

    Will the so-called Nukes on JBoss have the same scalability issue if it makes SQL queries for every part of the webpage on every single HTTP call, without any caching whatsoever?

    > Is Java more scalable than PHP? I don't know.

    I think Java is more scalable, but not by much; it depends more on how the applications are implemented.

    I've been doing server side Java for years, and in the last 6 months I started using PHP.

    What I found was:
    - Many Java applications have a much more advanced architecture, designed following MVC / 3-tier, while most PHP applications do not. This prompts many people to port Java-based tools to PHP: you can see PHP ports of log4j, Struts, and so on.
    - Why is this so? PHP is much easier to learn than Java, and many PHP applications are written by people who have never learned good programming practices. I still find lots of PHP applications using globals, which makes me shake my head in disbelief. Often I have to 'tweak' the source code of those PHP applications in order to improve configurability and reusability.
    - The PHP world seems to be moving in the right direction. I don't know if it will ever catch up with the Java community.

    > But Nukes on JBoss is definitely orders of magnitude more scalable than PHP PostNuke. We know from experience.

    Nukes on JBoss ~ improved PostNuke.
    Maybe if Nukes on JBoss were written in PHP and PostNuke were written in Java, we would be saying things differently.
  4. The article left a bad impression... but I simply wanted to ask: does anyone know of something like Coherence for PHP? One can set up several boxes running the same PHP application, put a load balancer in front, and distribute sessions between them (somehow). Of course, for now it's possible to store sessions in the database... but then we'd have a bottleneck there.

    In fact, the title is wrong. We can't apply the term 'scalability' to languages.
  5. This article has many flaws and does not make an apples-to-apples comparison. The views of the author are never backed by experiments or numbers.
    Another study compared PHP, Servlet and EJB with the TPC-W benchmark. It is available here.
    Distributing tiers across several machines increases latency but allows the system to scale further.

    Emmanuel
  6. I don't see how you draw the conclusion about scaling further from the article you are citing. They put Java on multiple machines and it outperformed PHP on a single machine. Why didn't they put PHP on multiple machines too? It's not hard. All you need is an HTTP load-balancer and one of the many shared session approaches.
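
    The "load balancer plus shared sessions" setup mentioned above can be sketched roughly as follows. This is a toy model, not any particular product: `SharedSessionStore`, `AppNode`, and `RoundRobinBalancer` are hypothetical names, and the in-memory dictionary stands in for whatever external store (a database, a cache server) all nodes can reach. The point it shows is that once session state lives outside the web nodes, any node can serve any request.

```python
import uuid


class SharedSessionStore:
    """Stand-in for an external store that every web node can reach."""

    def __init__(self):
        self._sessions = {}

    def create(self):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {}
        return session_id

    def load(self, session_id):
        return self._sessions[session_id]


class AppNode:
    """One web server. It keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id, item):
        # All state is read from and written to the shared store.
        cart = self.store.load(session_id).setdefault("cart", [])
        cart.append(item)
        return self.name, list(cart)


class RoundRobinBalancer:
    """Sends each request to the next node in turn, ignoring session affinity."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next = 0

    def route(self, session_id, item):
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node.handle(session_id, item)


store = SharedSessionStore()
balancer = RoundRobinBalancer([AppNode("web1", store), AppNode("web2", store)])
sid = store.create()
first = balancer.route(sid, "book")   # served by web1
second = balancer.route(sid, "pen")   # served by web2, sees web1's write
```

    As noted earlier in the thread, the shared store can itself become the bottleneck, so this moves the scaling problem rather than removing it.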
  7. As a long-time Java developer, I recently moved partially to Python for most of my development and found that I am really much more productive!
    I also found that being more productive and releasing good results faster pays off because, as I say in the title, who cares about Python being slower than Java? Just add more computers, processor power, memory, disks, load balancers – all of this is cheap compared to the development fees.

    Just my €0.1.

    PS: I really keep loving Java :)
  8. Just to say something that I forgot.
    Using Python instead of Java and being more productive doesn’t mean that I'm forgetting about the good practices that I used in Java.
    In Python I apply every Design Pattern that I used in Java. The fact that Python, like PHP, gives us the chance to do really ugly things doesn’t mean we have to do them :)

    Pedro Costa
  9. Python Use

    A question for you... how are you running your server-side Python applications? Are they CGI-driven or do you have an "engine" to execute them in?

    One of the irritating issues around CGI-based apps is the lack of a container with standardized APIs and capabilities. At least that used to be the case... I'm not sure what the landscape is these days. I just remember having to re-open database connections, or to build rather hacky persistent CGI applications in place of a container-based approach.
  10. Python Use

    I hate CGI :)

    I'm using Zope Application Server (http://www.zope.org) and/or Webware (http://webware.sf.net)

    Zope is a very mature open-source Python application server and has lots of features, documentation and a big and friendly community. It excels at things like CMSs with its CMF API (Content Management Framework). A very nice content management application based on Zope is Plone: http://www.plone.org
    Zope has some nice concepts and has an embedded object-oriented database (the ZODB).

    Webware is more 'Java-like' (it has Python servlets, Python Server Pages, etc.)

    There are many others, but I'm using these two. Zope is the more mature solution.

    I should say that I decided to take a look at Python and Zope after the comments of Bruce Eckel (my favourite Java author) in his 'Thinking in Java' books.
    Bruce is a big fan of Python and Zope and is currently writing his 'Thinking in Python' book, based on Design Patterns.
    Now I understand what Bruce used to say about Python, and I have become a Pythonista and Zopista.

    Pedro Costa
  11. Python Use

    Interesting... I remember Zope from some time ago, a few years I guess, but it was fairly recent at the time and didn't have a long track record. It's nice to know that it's still alive and kicking.

    I think scripting languages have a good place, and people should always consider the best course when building their sites. I prefer Java for more complex solutions, but if people need a quick and dirty site, it's much easier to find somebody (reasonably cheaply) to kick out a PHP-based site... or perhaps Python...
  12. Python Use

    We are using Zope extensively as our CMF and have a few elements that link from the Java to the Zope server(s).

    Steve
  13. Ebay uses websphere??

    Even if it is true that eBay was originally built in a kid's garage, the word is that eBay uses WebSphere on Windows 2000 boxes. If the latter is true, that says a lot about what scalability actually demands!
  14. Pedro I think you're on the wrong board then, Python hackers..stage left
  15. James,

    The title of this thread, "Scalability: Can you say that one language isn't scalable?", opens the discussion about scalability in general, not only with Java. That’s why, being also a Java programmer as I said, I tried to contribute to the discussion by talking about my experience with Python. After that I answered some questions.
    And you? What was your contribution to the discussion, apart from taking up space?
    Keep your angle open...

    Pedro Costa

    PS: Please don't bother to answer...
  16. > I also found that being more productive and releasing good results faster pays off because like I say in the title, who cares about Python being slower than Java? Just add more computers, processor power, memory, disks, load balancers – All this is cheap compared to the development fees.


    Which is probably why people don't really worry about whether PHP or Python or whatever is slower than Java or any other platform, but rather whether it is as scalable. The problem with a non-scalable platform is that it relatively quickly reaches the point where adding a new server, a new CPU, or more memory doesn't give any additional performance.

    Br - J
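
    The plateau described above can be illustrated with Amdahl's law. Note that this model is brought in here for illustration and is not something the thread itself cites: it assumes a fixed fraction of each request's work is serialized (for example, on a single shared database) and cannot be spread across servers. The function names and the 10% serial fraction used in the loop are assumptions made for the example.

```python
def amdahl_speedup(servers, serial_fraction):
    """Speedup over one server when `serial_fraction` of the work cannot be spread out."""
    if servers < 1:
        raise ValueError("servers must be >= 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / servers)


def marginal_gain(servers, serial_fraction):
    """Extra speedup obtained by adding one more server to `servers`."""
    return (amdahl_speedup(servers + 1, serial_fraction)
            - amdahl_speedup(servers, serial_fraction))


if __name__ == "__main__":
    # Assumed 10% serial share, purely for illustration.
    for n in (1, 2, 4, 8, 16, 64):
        print(n, round(amdahl_speedup(n, 0.10), 2), round(marginal_gain(n, 0.10), 3))
```

    Under this model, with a 10% serial share the speedup can never exceed 10x no matter how many servers are added, and each additional server contributes less than the one before it. Which platform has the smaller serial share is an architectural question, not a property of the language.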
  17. Johan,

    <quote>
    Which is probably why people don't really worry about if PHP or Python or whatever is slower than Java or any other platform but rather if it is as scalable. The problem of a non-scalable platform is that it relatively quickly reaches the point where adding a new server - or a new cpu - or more memory doesn't give any additional performance.
    </quote>

    I understand your point of view but I also think that:

    - Massive scalability problems that can't be solved with load balancing normally have to do (in my experience; I admit that may not be true for others) with the database and interoperability with legacy systems.

    - With a good design, any PHP or Python web application can scale by distributing the load across more servers. If that is not possible, it is not because of this tier. The very same applies to Servlets/JSP, and you know that there are plenty of “Do we need EJBs?” discussions and articles out there.
    Don’t get me wrong, I personally like EJBs. I only think that they are not for every project and not the first thing to reach for (we don’t need a cannon to kill an ant).

    - Even in this scenario you can use middleware to manage this. You can even use EJBs.
    I already had a system coded in Python that interfaced, using web services, with WebSphere (in this case not because of scalability problems, but because it was a requirement).

    - Another fundamental issue is the cache policy. Tuning a cache and/or controlling a cache front-end like Squid can make a world of difference.

    But all this comes from my experience… I admit that I may be wrong.

    Regards,

    Pedro Costa
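
    One way an application can cooperate with a caching front-end is to mark its responses as cacheable. The sketch below is an assumption-laden illustration rather than anything described in the post: it uses Python's standard WSGI interface, the page body and the 60-second lifetime are made up, and how a given proxy such as Squid treats the header still depends on that proxy's own configuration. `call_app` is a hypothetical helper that invokes the app directly, without a real server.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI app whose response a shared cache is allowed to reuse."""
    body = b"<html><body>front page</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # 'public' permits shared caches to store the response;
        # 'max-age=60' declares it fresh for 60 seconds (assumed value).
        ("Cache-Control", "public, max-age=60"),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(application):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)   # fills in a plausible GET request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(app)
```

    While a cached copy is fresh, requests for that page need not reach the application tier at all, which is independent of the language the application is written in.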
  18. Johan,

    >
    > <quote>
    > Which is probably why people don't really worry about if PHP or Python or whatever is slower than Java or any other platform but rather if it is as scalable. The problem of a non-scalable platform is that it relatively quickly reaches the point where adding a new server - or a new cpu - or more memory doesn't give any additional performance.
    > </quote>
    >
    > I understand your point of view but I also think that:
    >
    > -Massive scalability problems that cant be solved with load balancing normally have to do (In my experience, I admit that maybe not in others) with database and interoperability with legacy systems.
    >
    > -With a good design, any PHP or Python web application can scale by distributing the load by more servers. If it is not possible it’s not because this tier. The very same applies to Servlets/JSP and you know that there are plenty of “Do we need EJB’s” discussions and articles out there.
    > Don’t get me wrong, I personally like EJB’s. I only think that they are not for every project and not the first thing to think (like we don’t need cannon to kill an ant)
    >
    > -Even in this scenario you can use middleware to manage this. You can even use EJB’s.
    > I already had a system coded in Python that interfaced, using web services, with WebSphere (in this case not for scalability problems, but because it was a requisite).
    >  
    > -Another fundamental issue is the Cache policy. Tuning a cache and/or controlling a cache front-end like Squid can make a world of difference.
    >
    > But all this is taken by my experience… I admit that I maybe wrong.
    >
    > Regards,
    >
    > Pedro Costa

    Yes, I was just pointing out that the actual execution speed of a platform is of much less importance than scalability. I can't comment on the scalability of Python or PHP since I have no practical experience with them.

    Br - Johan
  19. > I understand your point of view but I also think that:

    >
    > -Massive scalability problems that cant be solved with load balancing normally have to do (In my experience, I admit that maybe not in others) with database and interoperability with legacy systems.

    Massive scalability can only be achieved, realistically, by no longer creating coarse-grained methods for remote calls and instead using agent-based, adaptive systems that migrate from system to system.
    When apps and classes move from one system to another according to their requirements, you get distributed computing at (not much more than) local execution cost.

    > -With a good design, any PHP or Python web application can scale by distributing the load by more servers. If it is not possible it’s not because this tier. The very same applies to Servlets/JSP and you know that there are plenty of “Do we need EJB’s” discussions and articles out there.

    I think that 'adding more servers' is 'missing a trick'. Surely the first port of call should be: how can I make my app more efficient on the systems I have? (Obviously you can't go to the nth degree of optimisation.) I am not disagreeing that, with a good design, a PHP or Python or Java app can scale, but there should be some thought before automatically saying 'let's throw more hardware at it'. For instance, recompiling Apache can give performance benefits.

    C.
  20. Not impressing article

    First (as also noted in the comments), it is rather silly to have
    the word scalability in the title.

    The entire article discusses single-server performance, which is actually
    the opposite of scalability (unless you believe that one solution can both
    scale better and perform better in a single-server environment).

    I have no doubt that a PHP-MySQL solution (or an ASP-SQLServer solution)
    can perform as well as or better than a J2EE solution in a single-server
    environment.

    The tighter the coupling, the better the performance.

    But scalability is about adding more boxes and seeing maximum
    throughput grow in proportion to the number of boxes.

    That is what EJB delivers, and what neither PHP nor ASP has.

    BTW, one gets a rather bad feeling about the author's Java knowledge when
    he uses the term RMI to describe the web-server <-> servlet-engine
    communication.
  21. re

    Hi Arnie,
    The flowchart does describe the webserver <> servlet engine communication as RMI, but it looks like a typo/proofing error rather than a reflection of the author's understanding.

    But his comparison of a 'c' middle tier with 'j2ee' makes........ me..... think........................
  22. ah ... the argument continues

    it would be like saying JSP vs J2EE?
  23. What a Jackass

    My favorite Jackass excerpt:

    "There's this obsession with Java in big IT departments now as if it's some kind of holy grail. All of the "breakthrough" sites like Yahoo, eBay, etc. were built in 2 weeks in some kids garage. Trust me, the next ones are going to be built the same way & it's not going to be with Java/J2EE sitting on $100k hardware."

    This guy has obviously never seen my garage.

    J2EE applications can be built quickly.

    Lastly, this guy needs to learn what scalability actually means.

    Scalability means the software is smart enough to accommodate pluggable hardware, such that adding hardware yields an approximately linear increase in the performance and maximum concurrency of the application.
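
    The "approximately linear" criterion above can be turned into a simple check. The sketch below is illustrative only: the throughput figures in `measured` are invented for the example, not taken from the article or any benchmark, and `scaling_efficiency` is a hypothetical helper. It compares the throughput observed with n servers against n times the single-server throughput; an efficiency of 1.0 means perfectly linear scaling.

```python
def scaling_efficiency(throughput_by_servers):
    """Map server count -> measured throughput / (count * single-server throughput)."""
    base = throughput_by_servers[1]
    return {
        n: throughput / (n * base)
        for n, throughput in sorted(throughput_by_servers.items())
    }


# Hypothetical measurements (requests per second), made up for illustration.
measured = {1: 100.0, 2: 190.0, 4: 340.0, 8: 480.0}

efficiency = scaling_efficiency(measured)
```

    In this made-up data set the system stays close to linear at two servers but falls well short at eight, which is the kind of evidence a scalability claim needs, whatever the language.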
  24. Hi.

    - What is a language? A BNF; in the end, only syntax! It's all about architectures. Architectures (which, by the way, YOU build, not a wizard and not a set of technologies alone) using J2SE will definitely turn out to be completely different from J2EE-based architectures, for example. PHP's architecture is mostly quite homogeneous, I suppose.

    - What does it mean that an architecture scales? It scales (or not) in a context, which again means requirements.

    - Most requirements for projects using PHP-based architectures are probably quite similar, and the PHP architecture, as I mentioned, is straightforward and homogeneous. So things can get optimized for a set of requirements and a set of hardware (COTS).

    Keeping it stupid --- err keeping it simple, stupid :-)
    Greets
    agill