Application resource is in w_offline state

pawan_n
Level 3

Hi All,

I have created an Application resource in a two-node Veritas cluster.

It does not have any dependency on any other resource.

On the primary node its state is correct (OFFLINE), but on the secondary node it is in the w_offline state. The logs show that the cluster is not able to bring the resource to the OFFLINE state on the secondary node.

I tried hastop -local on the secondary node followed by hastart, but no luck.

I even tried deleting my Application resource and re-creating it under another name, but the result is the same.

Any help would be appreciated!!

 

Thanks,

Pawan

11 REPLIES

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Review the application agent log in /var/VRTSvcs/log/
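As a rough sketch of that log review, the helper below prints the most recent lines that mention a given resource. The function name resource_log_tail and the LOG_DIR override are made up for illustration. The file names engine_A.log and Application_A.log are assumed to be the usual defaults and may differ on your installation.

```shell
#!/bin/sh
# Hypothetical helper: show the last N lines (default 20) that mention a
# resource in the VCS engine log and the Application agent log.
# LOG_DIR defaults to the directory named in the post above; the file names
# are assumptions based on the usual VCS defaults.
resource_log_tail() {
    res=$1
    count=${2:-20}
    dir=${LOG_DIR:-/var/VRTSvcs/log}
    for f in "$dir/engine_A.log" "$dir/Application_A.log"; do
        # Skip logs that are missing or unreadable on this node.
        [ -r "$f" ] || continue
        echo "== $f"
        # -F treats the resource name as a fixed string, not a regex.
        grep -F -- "$res" "$f" | tail -n "$count"
    done
}
```

For example, resource_log_tail BPPMAPP_RES 50, run on the node where the resource is in w_offline, should show what the agent reported while trying to take it offline.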

Sacheen_Birhade
Level 2

Hi Pawan,

 

Can you please post logs?

Marianne
Moderator
Partner    VIP    Accredited Certified

Please post Application config in main.cf along with VCS logs.

pawan_n
Level 3

Hi All,

 

The issue is resolved now.

I was bringing my application up directly, without bringing its resource ONLINE.

I first brought the resource ONLINE and then started my application, and the issue is resolved.

 

Thanks,

Pawan.

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

"I first brought the resource ONLINE & then started my Application & the issue is resolved." <<< That doesn't make sense.

Onlining the resource should start your application, unless there is some strange two-tiered process to start this application.

pawan_n
Level 3

Hi Riaan,

Sorry for the confusion from my end.

Actually, bringing the Application resource ONLINE does start my application automatically, thanks to the StartProgram attribute.

However, I had to write a monitor program to detect the OFFLINE state of my application and use the MonitorProgram attribute, because the MonitorProcesses attribute didn't work for me.

My application's process string is 28 lines long (as per the ps -ef output), and specifying a substring of it in MonitorProcesses doesn't help, as it requires the exact string.

My question is: can we specify such a huge string (28 lines in my case) in MonitorProcesses?

 

Thanks,

Pawan

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Can you post your main.cf and details about the application (ps -ef etc.)?

pawan_n
Level 3

Hi Riaan,

Here you go... we are concerned about the CLSGBPPM service group:

-----------------------------------------------------------------------------------------------------

main.cf

 

cluster CLMBPPM (
    UserNames = { admin = JijBidIfjEjjHrjDig }
    Administrators = { admin }
    HacliUserLevel = COMMANDROOT
    )

system ithvvm606 (
    )

system ithvvm607 (
    )

group CLSGBPPM (
    SystemList = { ithvvm606 = 0, ithvvm607 = 1 }
    AutoStartList = { ithvvm606, ithvvm607 }
    )

    Application BPPMAPP_RES (
        StartProgram = "/opt/bmc/ProactiveNet/pw/pronto/bin/pw system start"
        StopProgram = "/opt/bmc/ProactiveNet/pw/pronto/bin/pw system stop"
        MonitorProgram = "/opt/bmc/ProactiveNet/bppmprog.sh"
        )

    DiskGroup CLSG_appdg (
        Critical = 0
        DiskGroup = appdg
        StartVolumes = 0
        )

    IP CLSG_IP (
        Device = eth0
        Address = "10.119.155.208"
        NetMask = "255.255.255.0"
        )

    Mount CLSG_MNT (
        Critical = 0
        MountPoint = "/opt/bmc"
        BlockDevice = "/dev/vx/dsk/appdg/opt_bmc_vol"
        FSType = vxfs
        FsckOpt = "-y"
        )

    NIC CLSG_NIC (
        Device = "eth0:0"
        )

    NIC appNIC (
        Device = eth0
        )

    Volume CLSG_VOL (
        Critical = 0
        DiskGroup = appdg
        Volume = opt_bmc_vol
        )

    BPPMAPP_RES requires CLSG_MNT
    CLSG_MNT requires CLSG_VOL
    CLSG_VOL requires CLSG_appdg


    // resource dependency tree
    //
    //    group CLSGBPPM
    //    {
    //    Application BPPMAPP_RES
    //        {
    //        Mount CLSG_MNT
    //            {
    //            Volume CLSG_VOL
    //                {
    //                DiskGroup CLSG_appdg
    //                }
    //            }
    //        }
    //    IP CLSG_IP
    //    NIC CLSG_NIC
    //    NIC appNIC
    //    }


group cvm (
    SystemList = { ithvvm606 = 0, ithvvm607 = 1 }
    AutoFailOver = 0
    Parallel = 1
    AutoStartList = { ithvvm606, ithvvm607 }
    )

    CFSfsckd vxfsckd (
        )

    CVMCluster cvm_clus (
        CVMClustName = CLMBPPM
        CVMNodeId = { ithvvm606 = 0, ithvvm607 = 1 }
        CVMTransport = gab
        CVMTimeout = 200
        )

    CVMVxconfigd cvm_vxconfigd (
        Critical = 0
        CVMVxconfigdArgs = { syslog }
        )

    ProcessOnOnly vxattachd (
        Critical = 0
        PathName = "/bin/sh"
        Arguments = "- /usr/lib/vxvm/bin/vxattachd root"
        RestartLimit = 3
        )

    cvm_clus requires cvm_vxconfigd
    vxfsckd requires cvm_clus


    // resource dependency tree
    //
    //    group cvm
    //    {
    //    ProcessOnOnly vxattachd
    //    CFSfsckd vxfsckd
    //        {
    //        CVMCluster cvm_clus
    //            {
    //            CVMVxconfigd cvm_vxconfigd
    //            }
    //        }
    //    }


------------------------------------------------------------------------------------------------

We have to monitor 7-8 processes with huge strings. PID files are created for only 2 of them, hence the PidFiles attribute is of almost no use.

The ps -ef output of one of the processes to be monitored:

root     17692     1  0 04:37 ?        00:03:27 /usr/pw/jre/bin/java -server -Djserver=DUMMY -Djava.net.namelookup.cache=0 -Dsun.net.inetaddr.ttl=0 -Djava2d.font.usePlatformFont=false -Djava.awt.headless=true -Djava.awt.fonts=/usr/pw/jre/lib/fonts -Duser.home=/usr/pw/jre -Djava.security.policy=/usr/pw/pronto/conf/java.security.policy -Djava.system.class.loader=com.proactivenet.reports.util.ReportClassLoader -Dpronet.demonname=ProactiveNet_JServer -Dpronet.properties=/usr/pw/pronto/conf/pronet.conf -Dpronet.rate.cell.cellName=pncell_CLMBPPM -Dias.home=/usr/pw/pronto -Dcloud.adapter.home=/usr/pw/pronto -Dmarimba.cas.tool.dir=/usr/pw/pronto/conf -Djava.security.auth.login.config=/usr/pw/pronto/conf/jaas.config -Djava.util.logging.config.file=/usr/pw/pronto/conf/ias_logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Dlog4j.configuration=file:/usr/pw/tomcat/conf/log4j.properties -Dcatalina.base=/usr/pw/tomcat -Dcatalina.home=/usr/pw/tomcat -Djava.io.tmpdir=/usr/pw/tomcat/temp -Daxis2.repo=/usr/pw/tomcat/webapps/bppmws/WEB-INF -Daxis2.xml=/usr/pw/tomcat/webapps/bppmws/WEB-INF/conf/axis2.xml -Dexpression.lib.home=/usr/pw/pronto -Djavax.net.ssl.trustStore=/usr/pw/pronto/conf/pnserver.ks -Djavax.net.ssl.trustStorePassword=get2net -Dcellservice.impactmanager.connectionpoolsize=10 -Dcellservice.CacheEnabled=true -Dhttp.maxConnections=500 -Xms64m -Xmx2048m -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/pw/pronto/logs -XX:+UseCompressedOops -classpath 
/usr/pw/apps3rdparty/jdbc/ojdbc6.jar:/usr/pw/apps3rdparty/antlr-2.7.6.jar:/usr/pw/apps3rdparty/aopalliance.jar:/usr/pw/apps3rdparty/aspectjweaver.jar:/usr/pw/apps3rdparty/c3p0-0.9.1.2.jar:/usr/pw/apps3rdparty/commons-collections-3.2.1.jar:/usr/pw/apps3rdparty/commons-logging-1.1.1.jar:/usr/pw/apps3rdparty/dom4j-1.6.1.jar:/usr/pw/apps3rdparty/hibernate-annotations.jar:/usr/pw/apps3rdparty/hibernate-commons-annotations.jar:/usr/pw/apps3rdparty/hibernate-entitymanager.jar:/usr/pw/apps3rdparty/hibernate3.jar:/usr/pw/apps3rdparty/javassist-3.4.GA.jar:/usr/pw/apps3rdparty/jdbc/jconn4.jar:/usr/pw/apps3rdparty/jta.jar:/usr/pw/apps3rdparty/log4j/log4j.jar:/usr/pw/apps3rdparty/slf4j-api-1.5.0.jar:/usr/pw/apps3rdparty/slf4j-log4j12-1.5.0.jar:/usr/pw/apps3rdparty/persistence.jar:/usr/pw/apps3rdparty/jndi/jndi.jar:/usr/pw/apps3rdparty/jndi/ldap.jar:/usr/pw/apps3rdparty/jndi/providerutil.jar:/usr/pw/apps3rdparty/jndi/jboss-common.jar:/usr/pw/apps3rdparty/jndi/jnp-client.jar:/usr/pw/apps3rdparty/graphing/elegant/common.jar:/usr/pw/apps3rdparty/graphing/elegant/dialgauge.jar:/usr/pw/apps3rdparty/imcomm/imcomm.jar:/usr/pw/jre/lib/rt.jar:/usr/pw/jre/lib/tools.jar:/usr/pw/jboss/client/jnp-client.jar:/usr/pw/jboss/client/log4j.jar:/usr/pw/jboss/lib/jboss-common.jar:/usr/pw/sybase/asa/java/jlogon.jar:/usr/pw/sybase/asa/java/jodbc.jar:/usr/pw/tomcat/lib/servlet-api.jar:/usr/pw/tomcat/bin/tomcat-juli.jar:/usr/pw/tomcat/lib/tomcat-util.jar:/usr/pw/tomcat/lib/tomcat-coyote.jar:/usr/pw/tomcat/lib/tomcat-api.jar:/usr/pw/tomcat/bin/bootstrap.jar:/usr/pw/tomcat/webapps/pronto/WEB-INF/lib/spring-web.jar:/usr/pw/pronto/lib/pw_server.jar:/usr/pw/pronto/lib/multiserver/multiserver.jar:/usr/pw/pronto/lib/multiserver/dbmigration.jar:/usr/pw/pronto/lib/pw_client.jar:/usr/pw/pronto/lib/pw_util.jar:/usr/pw/pronto/lib/pw_agent.jar:/usr/pw/pronto/lib/domain.jar:/usr/pw/pronto/lib/cellservice.jar:/usr/pw/pronto/lib/psservice.jar:/usr/pw/pronto/lib/ngp_service.jar:/usr/pw/pronto/lib/genericdao.jar:/usr/pw/p
ronto/lib/ngp-persistence.jar:/usr/pw/pronto/lib/atrium-ws-client-7.5.jar:/usr/pw/pronto/lib/atrium-ws-client-2.1.jar:/usr/pw/pronto/lib/blwsclient.jar:/usr/pw/pronto/lib/bppmRoutingService.jar:/usr/pw/pronto/lib/configutil.jar:/usr/pw/pronto/lib/ptools.jar:/usr/pw/pronto/lib/i18n_service.jar:/usr/pw/apps3rdparty/ice/ice.jar:/usr/pw/apps3rdparty/xerces/xercesImpl.jar:/usr/pw/apps3rdparty/xerces/xercesSamples.jar:/usr/pw/apps3rdparty/xerces/xmlParserAPIs.jar:/usr/pw/apps3rdparty/graphing/sitraka/jcschart.jar:/usr/pw/colt/colt.jar:/usr/pw/pronto/lib/pw_applet.jar:/usr/pw/pronto/lib/pproxy_ui.jar:/usr/pw/pronto/li


----------------------------------------------------------------------------------------------------------------------------

Thanks ,

Pawan

 

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

That is quite a list :)

 

I would say stick with the monitor program you wrote. It would need to exit with 100 (offline) or 110 (online).
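For anyone landing here later, a minimal sketch of such a monitor program might look like the one below. This is not Pawan's actual script. The function name monitor_status and the PATTERN value are assumptions for illustration; the pattern is taken from the -Dpronet.demonname option visible in the ps -ef output above. Only the two exit codes mentioned here (100 and 110) are used.

```shell
#!/bin/sh
# Hypothetical MonitorProgram sketch for the VCS Application agent.
# PATTERN is an assumption: a short, unique piece of the process command line
# (here the -Dpronet.demonname value visible in the ps -ef output).
PATTERN="${PATTERN:-pronet.demonname=ProactiveNet_JServer}"

# Return 110 (online) if any process's full command line matches the
# extended regular expression in $1, otherwise 100 (offline).
monitor_status() {
    if pgrep -f -- "$1" >/dev/null 2>&1; then
        return 110
    fi
    return 100
}

# Act as the monitor only when invoked under the script name used in main.cf;
# otherwise just define the function so it can be sourced and tried by hand.
case "${0##*/}" in
    bppmprog.sh)
        monitor_status "$PATTERN"
        exit $?
        ;;
esac
```

Because pgrep -f matches against the whole command line, a short substring is enough and the 28-line string never has to be reproduced. One caveat: the pattern must not appear on the monitor script's own command line, or the script will match itself. To cover all 7-8 processes, the same function could be called once per pattern, returning 100 as soon as one is missing.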

mikebounds
Level 6
Partner Accredited

You can't use pattern matching with the Application agent, so your choices are:

  1. Use MonitorProgram
  2. Use PidFiles
  3. Use Agent builder where you can use pattern matching using the "MonitorProcessPatterns" attribute (see https://sort.symantec.com/agents/detail/5711)

Mike
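Since only two of the processes write PID files, options 1 and 2 can also be combined inside the monitor script itself. The sketch below is a shell approximation of a PID-file check, not a description of how the agent's PidFiles attribute works internally. The function names are made up, and the PID file paths are left for the caller to supply because the thread does not give them.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the PID file exists and names a live process.
pidfile_alive() {
    [ -r "$1" ] || return 1
    # Take the first line and strip whitespace around the number.
    pid=$(head -n 1 "$1" | tr -d '[:space:]')
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    # Signal 0 sends nothing; it only checks that the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Return 110 (online) only if every PID file passed as an argument is alive,
# otherwise 100 (offline). With no arguments the result is 100.
pidfiles_status() {
    [ "$#" -gt 0 ] || return 100
    for f in "$@"; do
        pidfile_alive "$f" || return 100
    done
    return 110
}
```

Note that kill -0 also fails when the caller lacks permission to signal the process, so this assumes the monitor runs as root, as VCS agents normally do. A stale PID file whose number has been reused by another process would be reported as alive, which is one reason a command-line check is often used alongside it.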

pawan_n
Level 3

Hi All,

Thanks for your inputs.

I have written a monitor program which detects the ONLINE or OFFLINE state of my application, and it's working fine for me. It's a simple shell script that monitors the running state of my application.

 

Thanks,

Pawan