clustering with weblogic 6.1


  1. clustering with weblogic 6.1 (4 messages)

    I'm sure the gurus on the list have a lot of experience with WebLogic clustering. I'm new to it and am running into a number of problems.

    1. There are heaps of information available about clustering, yet it takes days to find anything that reads like a how-to. Does anyone know of a better how-to?
    2. I'm trying to set up a basic cluster. Here are the details:
     Administration Server: weblogic 6.1, Windows XP, listen port: 7001, IP: 192.168.1.135
     ManagedServer 1: weblogic 6.1, Sun OS, listen port: 7001, IP: 192.168.1.131
     ManagedServer 2: weblogic 6.1, RHL 6.2, listen port: 7001, IP: 192.168.1.239
     Snippet of my config.xml on 135:
        <Server Cluster="MyCluster" ListenAddress="192.168.1.239"
            Machine="dummy239" Name="wls239">
            <Log Name="wls239"/>
            <SSL Name="wls239"/>
            <ServerDebug Name="wls239"/>
            <KernelDebug Name="wls239"/>
            <ServerStart Name="wls239"/>
            <WebServer Name="wls239"/>
        </Server>
    .......
    .........
        <Server Cluster="MyCluster" ListenAddress="192.168.1.131"
            Machine="dummy131" Name="wls131">
            <Log Name="wls131"/>
            <SSL Name="wls131"/>
            <ServerDebug Name="wls131"/>
            <KernelDebug Name="wls131"/>
            <ServerStart Name="wls131" OutputFile="C:\bea\wlserver6.1\.\config\NodeManagerClientLogs\wls131\startserver_1029509118738.log"/>
            <WebServer Name="wls131"/>
        </Server>
    ............
    ..........
     <Cluster ClusterAddress="192.168.1.131,192.168.1.239" Name="MyCluster"/>

     nodemanager.hosts on 131:
    # localhost, loopback
    192.168.1.131
    localhost
    127.0.0.1
    192.168.1.135

    This is what I get when I start NodeManager on 131
    # ./startNodeManager.sh
    LD_LIBRARY_PATH=/home/weblogic/bea/wlserver6.1/lib/linux/i686:/home/weblogic/bea/wlserver6.1/lib/linux/i686/oci817_8
    <Aug 16, 2002 8:56:40 PM IST> <Info> <NodeManager> <NodeManager: for information on command line options, try "java weblogic.nodemanager.NodeManager help">
    <Aug 16, 2002 8:56:40 PM IST> <Info> <NodeManager> <Starting NodeManager >
    <Aug 16, 2002 8:56:44 PM IST> <Info> <NodeManager@localhost:5555> <SecureSocketListener: listening on localhost:5555>

    But, when I try to start weblogic on this managed server, I get the following:
    <Aug 16, 2002 8:24:14 AM PDT> <Error> <NodeManager> <Could not start server 'wls131' via Node Manager - reason: '[SecureCommandInvoker: Could not create a socket to the NodeManager running on host '192.168.1.131:5555' to execute command 'online null', reason: Connection refused: connect. Ensure that the NodeManager on host '192.168.1.131' is configured to listen on port '5555' and that it is actively listening]'>

    Any ideas what could be wrong? Any help will be appreciated.
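
One cheap sanity check on a config like the one above is to confirm that every address listed in the Cluster's ClusterAddress matches the ListenAddress of a server assigned to that cluster. The following is only a sketch in Python: SAMPLE_CONFIG is a hypothetical, trimmed-down stand-in for the real config.xml (the Domain root element and the function name are assumptions), and it only handles a comma-separated list of IP addresses, not a DNS name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the config.xml fragment in the post.
SAMPLE_CONFIG = """
<Domain Name="domain">
  <Server Cluster="MyCluster" ListenAddress="192.168.1.239" Machine="dummy239" Name="wls239"/>
  <Server Cluster="MyCluster" ListenAddress="192.168.1.131" Machine="dummy131" Name="wls131"/>
  <Cluster ClusterAddress="192.168.1.131,192.168.1.239" Name="MyCluster"/>
</Domain>
"""


def cluster_address_mismatches(config_xml):
    """Map cluster name -> (member addresses missing from ClusterAddress,
    ClusterAddress entries that no member server listens on)."""
    root = ET.fromstring(config_xml)

    # Collect the ListenAddress of every server, grouped by its Cluster attribute.
    members = {}
    for server in root.iter("Server"):
        cluster = server.get("Cluster")
        if cluster:
            members.setdefault(cluster, set()).add(server.get("ListenAddress"))

    report = {}
    for cluster in root.iter("Cluster"):
        name = cluster.get("Name")
        raw = cluster.get("ClusterAddress", "")
        listed = {a.strip() for a in raw.split(",") if a.strip()}
        actual = members.get(name, set())
        missing = sorted(a for a in actual - listed if a)
        unknown = sorted(listed - actual)
        if missing or unknown:
            report[name] = (missing, unknown)
    return report
```

For the sample above the function returns an empty dict, meaning the cluster address and the member listen addresses agree. That suggests the cluster definition itself is not what causes the "Connection refused" error.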

    TIA


  2. clustering with weblogic 6.1

    Hi,

    Well, for one thing, your startNodeManager.sh run on Solaris has only Linux library directories in LD_LIBRARY_PATH. Did you run 'netstat -an | grep 5555' after the NodeManager started, to make sure it actually came up and is listening where you expect? Don't rely on the command-line spew.

    Here is the command you posted to start the node manager on Solaris:

    <quote>
    This is what I get when I start NodeManager on 131
    # ./startNodeManager.sh
    LD_LIBRARY_PATH=/home/weblogic/bea/wlserver6.1/lib/linux/i686:/home/weblogic/bea/wlserver6.1/lib/linux/i686/oci817_8
    <Aug 16, 2002 8:56:40 PM IST> <Info> <NodeManager> <NodeManager: for information on command line options, try "java weblogic.nodemanager.NodeManager help">
    <Aug 16, 2002 8:56:40 PM IST> <Info> <NodeManager> <Starting NodeManager >
    <Aug 16, 2002 8:56:44 PM IST> <Info> <NodeManager@localhost:5555> <SecureSocketListener: listening on localhost:5555>
    </quote>
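
Besides netstat on the NodeManager box, the same question can be asked from the admin server's side: can a TCP connection to 192.168.1.131:5555 be opened at all? A minimal sketch in Python, using only the standard library (the function name is my own; it is not part of WebLogic):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened, else False."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError or a
        # timeout) when nothing is accepting connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from the admin host, can_connect("192.168.1.131", 5555) returning False while the NodeManager log says "listening on localhost:5555" would be consistent with the listener being bound to the loopback interface only. The check only proves TCP reachability; it says nothing about the SSL handshake or the nodemanager.hosts trust list.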


    Bill
  3. clustering with weblogic 6.1

    I added the following to the start script of the NodeManager on 131:
    -Dweblogic.nodemanager.listenAddress=192.168.1.131 -Dweblogic.nodemanager.listenPort=5555

    # netstat -an | grep 5555
    tcp 0 0 192.168.1.131:5555 0.0.0.0:* LISTEN
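
For what it's worth, the thing to read off that netstat line is the local address: a listener on 192.168.1.131:5555 can be reached from the admin server, while one on 127.0.0.1:5555 cannot. A small sketch in Python that pulls this out of Linux-style netstat output (the function names are made up, and the column layout is an assumption; Solaris netstat prints addresses differently):

```python
def listeners_on_port(netstat_output, port):
    """Return local addresses in LISTEN state on the given port.

    Assumes Linux-style 'netstat -an' lines:
    proto recv-q send-q local-address foreign-address state
    """
    addresses = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[-1] != "LISTEN":
            continue
        # Split "192.168.1.131:5555" at the last colon into host and port.
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addresses.append(host)
    return addresses


def reachable_remotely(addresses):
    """True if at least one listener is bound to a non-loopback address."""
    loopback = {"127.0.0.1", "localhost"}
    return any(address not in loopback for address in addresses)
```

Feeding it the line above gives ['192.168.1.131'], which reachable_remotely treats as fine; a line showing 127.0.0.1:5555 would be flagged as loopback-only.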

    When I now try to start this managed server from the admin server, I get the following message:

     Connected to dummy135:7001 Active Domain: domain Aug 16, 2002 11:14:47 PM PDT

    <Aug 17, 2002 11:47:51 AM IST> <Info> <NodeManager@192.168.1.131:5555> <BaseProcessControl: saving process id of Weblogic Managed server 'wls131', pid: 17906>
    Starting WebLogic Server ....
    Connecting to http://localhost:7001...
    ***************************************************************************
    The WebLogic Server did not start up properly.
    Exception raised:
    java.net.ConnectException: Tried all: '1' addresses, but could not connect over HTTP to server: 'localhost', port: '7001'
    at weblogic.net.http.HttpClient.openServer(HttpClient.java:211)
    at weblogic.net.http.HttpClient.openServer(HttpClient.java:263)
    at weblogic.net.http.HttpClient.<init>(HttpClient.java:121)
    at weblogic.net.http.HttpClient.New(HttpClient.java:156)
    at weblogic.net.http.HttpURLConnection.connect(HttpURLConnection.java:111)
    at weblogic.net.http.HttpURLConnection.getInputStream(HttpURLConnection.java:283)
    at weblogic.net.http.HttpURLConnection.getInternalResponseCode(HttpURLConnection.java:661)
    at weblogic.net.http.HttpURLConnection.getResponseCode(HttpURLConnection.java:646)
    at weblogic.management.Admin.getBootstrapLocalServer(Admin.java:1073)
    at weblogic.management.Admin.initialize(Admin.java:340)
    at weblogic.t3.srvr.T3Srvr.initialize(T3Srvr.java:359)
    at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:206)
    at weblogic.Server.main(Server.java:35)
    --------------- nested within: ------------------
    weblogic.management.configuration.ConfigurationException: connecting to http://localhost:7001/wl_management_internal2/Bootstrap - with nested exception:
    [java.net.ConnectException: Tried all: '1' addresses, but could not connect over HTTP to server: 'localhost', port: '7001']
    at weblogic.management.Admin.getBootstrapLocalServer(Admin.java:1164)
    at weblogic.management.Admin.initialize(Admin.java:340)
    at weblogic.t3.srvr.T3Srvr.initialize(T3Srvr.java:359)
    at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:206)
    at weblogic.Server.main(Server.java:35)
    Reason: Fatal initialization exception
    ***************************************************************************

  4. This is what I found in the NodeManager logs on 131:
    #more config
    #Saved configuration for wls131
    #Sat Aug 17 11:47:51 IST 2002
    processId=17906
    savedLogsDirectory=/home/weblogic/bea/wlserver6.1/NodeManagerLogs
    classpath=NULL
    nodemanager.debugEnabled=false
    TimeStamp=1029565071811
    command=online
    java.security.policy=NULL
    bea.home=NULL
    weblogic.Domain=domain
    serverStartArgs=NULL
    weblogic.management.server=localhost\:7001
    RootDirectory=NULL
    nodemanager.sslEnabled=true
    weblogic.Name=wls131
    #
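
The line that stands out in that file is weblogic.management.server=localhost\:7001: the managed server on 131 is being told that its admin server lives on its own loopback address, which matches the "could not connect over HTTP to server: 'localhost'" error. A rough Python sketch that parses such a saved config and flags this case (the function names are my own, and the parser only handles the simple key=value lines and the "\:" escape seen here, not the full Java properties format):

```python
# Abbreviated copy of the saved NodeManager config shown above.
SAMPLE_SAVED_CONFIG = r"""
#Saved configuration for wls131
processId=17906
command=online
weblogic.Domain=domain
weblogic.management.server=localhost\:7001
weblogic.Name=wls131
"""


def parse_saved_config(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # The file escapes ':' as '\:'; undo that for readability.
        props[key.strip()] = value.strip().replace("\\:", ":")
    return props


def admin_url_is_loopback(props):
    """True if weblogic.management.server points at the local machine."""
    host = props.get("weblogic.management.server", "").rsplit(":", 1)[0]
    return host in ("localhost", "127.0.0.1")
```

With the sample above, admin_url_is_loopback returns True. After the admin server is given an explicit listen address, the same check on the new saved config should return False.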

    Anybody?
  5. I have made some progress:

    1. The environment settings on 131 were missing. I executed setEnv.sh to set up the required environment variables.
    2. nodemanager.hosts (on 131) had the following entries earlier:
    # more nodemanager.hosts
    127.0.0.1
    localhost
    192.168.1.135
    #
    I changed it to:
    #more nodemanager.hosts
    192.168.1.135
    3. The Administration Server (135) did not have a Listen Address defined (it was working without one). But since one of the errors thrown when starting wls131 through the NodeManager was "could not connect to localhost:7001 via HTTP", I set the Listen Address to 192.168.1.135 instead of leaving it empty.
    4. I deleted all the logs (NodeManagerInternal logs on 131) and all log files in NodeManagerClientLogs on 135.
    5. Restarted Admin Server. Restarted NodeManager on 131. NodeManagerInternalLogs on 131 has:
    [root@]# more NodeManagerInternal_1029567030003
    <Aug 17, 2002 12:20:30 PM IST> <Info> <NodeManager> <Setting listenAddress to '192.168.1.131'>
    <Aug 17, 2002 12:20:30 PM IST> <Info> <NodeManager> <Setting listenPort to '5555'>
    <Aug 17, 2002 12:20:30 PM IST> <Info> <NodeManager> <Setting WebLogic home to '/home/weblogic/bea/wlserver6.1'>
    <Aug 17, 2002 12:20:30 PM IST> <Info> <NodeManager> <Setting java home to '/home/weblogic/jdk1.3.1_03'>
    <Aug 17, 2002 12:20:33 PM IST> <Info> <NodeManager@192.168.1.131:5555> <SecureSocketListener: Enabled Ciphers >
    <Aug 17, 2002 12:20:33 PM IST> <Info> <NodeManager@192.168.1.131:5555> <TLS_RSA_EXPORT_WITH_RC4_40_MD5>
    <Aug 17, 2002 12:20:33 PM IST> <Info> <NodeManager@192.168.1.131:5555> <SecureSocketListener: listening on 192.168.1.131:5555>

    And the wls131 logs contain:
    [root@dummy131 wls131]# more config
    #Saved configuration for wls131
    #Sat Aug 17 12:24:42 IST 2002
    processId=18437
    savedLogsDirectory=/home/weblogic/bea/wlserver6.1/NodeManagerLogs
    classpath=NULL
    nodemanager.debugEnabled=false
    TimeStamp=1029567282621
    command=online
    java.security.policy=NULL
    bea.home=NULL
    weblogic.Domain=domain
    serverStartArgs=NULL
    weblogic.management.server=192.168.1.135\:7001
    RootDirectory=NULL
    nodemanager.sslEnabled=true
    weblogic.Name=wls131

    The error generated for the client (131) was:
    [root@dummy131 wls131]# more wls131_error.log
    ***************************************************************************
    The WebLogic Server did not start up properly.
    Exception raised:
    java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl
    < --------------- nested within: ------------------
    weblogic.management.configuration.ConfigurationException: weblogic.security.acl.DefaultUserInfoImpl - with nested exception:
    [java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl]
    at weblogic.management.Admin.initializeRemoteAdminHome(Admin.java:1042)
    at weblogic.management.Admin.start(Admin.java:381)
    at weblogic.t3.srvr.T3Srvr.initialize(T3Srvr.java:373)
    at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:206)
    at weblogic.Server.main(Server.java:35)
    Reason: Fatal initialization exception
    ***************************************************************************

    and the output on the admin server (135) is:
    <Aug 17, 2002 12:24:42 PM IST> <Info> <NodeManager@192.168.1.131:5555> <BaseProcessControl: saving process id of Weblogic Managed server 'wls131', pid: 18437>
    Starting WebLogic Server ....
    Connecting to http://192.168.1.135:7001...
    <Aug 17, 2002 12:24:50 PM IST> <Emergency> <Configuration Management> <Errors detected attempting to connect to admin server at 192.168.1.135:7001 during initialization of managed server ( 192.168.1.131:7001 ). The reported error was: < weblogic.security.acl.DefaultUserInfoImpl > This condition generally results when the managed and admin servers are using the same listen address and port.>
    <Aug 17, 2002 12:24:50 PM IST> <Emergency> <Server> <Unable to initialize the server: 'Fatal initialization exception
    Throwable: weblogic.management.configuration.ConfigurationException: weblogic.security.acl.DefaultUserInfoImpl - with nested exception:
    [java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl]
    java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl
    < --------------- nested within: ------------------
    weblogic.management.configuration.ConfigurationException: weblogic.security.acl.DefaultUserInfoImpl - with nested exception:
    [java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl]
    at weblogic.management.Admin.initializeRemoteAdminHome(Admin.java:1042)
    at weblogic.management.Admin.start(Admin.java:381)
    at weblogic.t3.srvr.T3Srvr.initialize(T3Srvr.java:373)
    at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:206)
    at weblogic.Server.main(Server.java:35)
    '>
    ***************************************************************************
    The WebLogic Server did not start up properly.
    Exception raised:
    java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl
    < --------------- nested within: ------------------
    weblogic.management.configuration.ConfigurationException: weblogic.security.acl.DefaultUserInfoImpl - with nested exception:
    [java.lang.ClassCastException: weblogic.security.acl.DefaultUserInfoImpl]
    at weblogic.management.Admin.initializeRemoteAdminHome(Admin.java:1042)
    at weblogic.management.Admin.start(Admin.java:381)
    at weblogic.t3.srvr.T3Srvr.initialize(T3Srvr.java:373)
    at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:206)
    at weblogic.Server.main(Server.java:35)
    Reason: Fatal initialization exception
    ***************************************************************************

    6. Judging from the error on the client (131), I thought it had something to do with security, so I tried to start WebLogic manually (logged in as the same user). Curiously enough, it does start (it threw some errors for some EJBs, but I got the final message):
    <Aug 17, 2002 12:30:39 PM IST> <Notice> <WebLogicServer> <ListenThread listening on port 7001>
    <Aug 17, 2002 12:30:39 PM IST> <Notice> <WebLogicServer> <SSLListenThread listening on port 7002>
    <Aug 17, 2002 12:30:40 PM IST> <Notice> <WebLogicServer> <Started WebLogic Admin Server "myserver" for domain "mydomain" running in Production Mode>
    7. As you can see, the domain on the client (131) is "mydomain". But shouldn't the Admin server be 192.168.1.135, since that is what I have configured for the NodeManager? Or does the error occur because the NodeManager is configured to work with the Admin server on 135, while the default start scripts make the server its own Admin server? I'm confused :-)
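
To make the mismatch in points 6 and 7 concrete, here is a rough Python sketch that pulls the role, server name, and domain out of the final "Started WebLogic ..." line and compares them with what the NodeManager's saved config asks for (weblogic.Name=wls131, weblogic.Domain=domain). The function names are made up, and the expected role "Managed" is my assumption; only "Admin" actually appears in the output above.

```python
import re

# Matches e.g.: <Started WebLogic Admin Server "myserver" for domain "mydomain" ...>
STARTED = re.compile(r'<Started WebLogic (\w+) Server "([^"]+)" for domain "([^"]+)"')


def started_identity(log_line):
    """Extract role, server name and domain from a startup line, or None."""
    match = STARTED.search(log_line)
    if match is None:
        return None
    return {"role": match.group(1), "name": match.group(2), "domain": match.group(3)}


def identity_differences(identity, expected):
    """Return the sorted keys whose values differ from the expected ones."""
    return sorted(key for key in expected if identity.get(key) != expected[key])
```

Applied to the last log line in point 6, with the expectation {"role": "Managed", "name": "wls131", "domain": "domain"}, all three fields differ. That would fit the idea that the manual start used the default script's own settings (a standalone admin server "myserver" in "mydomain") rather than the ones the NodeManager passes, so it is not really the same test as the NodeManager start.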


    Help, anyone?