
Thread: Problem with monitoring - frequent alerts

  1. #1
    Join Date
    Feb 2010
    Posts
    9

    Problem with monitoring - frequent alerts

    Hi,

    I'm running a single-instance CF configuration with one web application. I turned on monitoring notifications (Hyperic) for web app unavailability.

    Now I randomly receive alert emails (Subject "An alert has been triggered - Deployment myapp - context unavailable") saying that the application is not running, but it is obviously running fine.

    In the Apache access log I see two requests every 15 seconds:

    127.0.0.1 - - [17/Mar/2010:15:37:33 +0100] "GET /server-status?auto HTTP/1.1" 200 438 "-" "Jakarta Commons-HttpClient/3.1"
    127.0.0.1 - - [17/Mar/2010:15:37:33 +0100] "GET /myapp HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"

    At the times when I get the alert emails, everything in the log still looks fine: the same two requests.

    Do you have any idea what could be wrong? Has anybody else had this kind of problem?

    Thanks,
    P

  2. #2
    Join Date
    Jun 2005
    Posts
    102

    A few questions.

    1. Do you ever see a statusCode != 200 for GET /myapp in the Apache log?

    2. The management agent also does an HTTP GET /myapp on the Tomcat/tc Server port 8080. It is possible that this GET is failing instead (although the Apache one should fail similarly). Are there any errors in the Tomcat log?

    3. When you log into the CF.com console and go to the deployment details page - what do you see? Are there error icons for the app server tier, web server tier or both (or neither)?

  3. #3
    Join Date
    Feb 2010
    Posts
    9

    1) No
    2) No
    3) No errors, everything fine (Apache, TC, DB) - all green icons.

    A couple of times I was even using the app when the alert fired, and it worked without any problem.

    Every 15 seconds there is a check request in the Apache log, which means 4 cycles per minute. I checked all requests around the time of the alert (plus or minus some delta) and all of them returned 200, yet I still received the alert email.

    Do you know how exactly the monitoring engine decides that the app is not running? Thanks.
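
For anyone who wants to repeat this check without reading the log by eye, here is a minimal sketch that scans access log lines for health checks that did not return 200. It assumes the combined log format shown in the first post; the function name `non_200_checks` and the default path `/myapp` are only illustrative.

```python
import re

# Matches the request and status fields of a combined-format access log line,
# e.g. "GET /myapp HTTP/1.1" 200
LINE_RE = re.compile(r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def non_200_checks(lines, path="/myapp"):
    """Return the log lines for `path` whose status code is not 200."""
    bad = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("path") == path and m.group("status") != "200":
            bad.append(line)
    return bad


# The two lines quoted in the first post of this thread.
sample = [
    '127.0.0.1 - - [17/Mar/2010:15:37:33 +0100] "GET /server-status?auto HTTP/1.1" 200 438 "-" "Jakarta Commons-HttpClient/3.1"',
    '127.0.0.1 - - [17/Mar/2010:15:37:33 +0100] "GET /myapp HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1"',
]
print(non_200_checks(sample))
```

An empty result only rules out failures that Apache logged; a check that fails to connect on another port would not show up here.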

  4. #4
    Join Date
    Feb 2010
    Posts
    9

    Any help, please?

    Maybe the monitoring is doing something wrong with my configuration?

  5. #5
    Join Date
    Aug 2009
    Posts
    61

    Hi Pavel,

    We're still looking into this. Hopefully, we'll have more information in a couple of days.

    Thank you!

  6. #6
    Join Date
    Feb 2010
    Posts
    9

    Ok, thanks.

    If you need any additional information from me, let me know.

  7. #7
    Join Date
    Jun 2005
    Posts
    102

    Pavel,

    To answer your previous question: a context is considered unavailable if GET /context fails to connect or returns a status other than 200 (after following redirects). This check is done on both Apache port 80 and Tomcat port 8080.
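
The rule described above could be sketched roughly as follows. This is not the actual Hyperic or CF.com agent code, only an illustration of the stated rule using the Python standard library; the function names, the 5-second timeout and the `localhost` base URLs are assumptions.

```python
import http.client
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the checks target local ports.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def context_available(base_url, context, timeout=5.0):
    """True only if GET <base_url>/<context> connects and ends in a 200.

    Redirects are followed by the opener; a failed connection, a timeout,
    or any final status other than 200 counts as unavailable.
    """
    url = base_url.rstrip("/") + "/" + context.lstrip("/")
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # HTTPError (4xx/5xx) is a subclass of URLError.
        return False


def deployment_available(
    context,
    base_urls=("http://localhost:80", "http://localhost:8080"),  # assumed hosts
):
    """The context must answer on every port (Apache and Tomcat)."""
    return all(context_available(base, context) for base in base_urls)
```

With this rule a single failed check on either port is enough to mark the context unavailable, even if the other port answers normally.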

    After looking through the logs, the most likely explanation is that GET /context to Tomcat port 8080 is sometimes failing. I'm looking into changing the monitoring code to only generate an alert after multiple failures.
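
As an illustration of alerting only after multiple failures, a small counter of consecutive failed checks is one possible approach. This is a hypothetical sketch, not the change that was made to the monitoring code; the class name `FailureDebouncer` and the threshold of 3 are assumptions.

```python
class FailureDebouncer:
    """Fire an alert only after `threshold` consecutive failed checks."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, ok):
        """Record one check result; return True when an alert should fire."""
        if ok:
            # Any successful check resets the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire once, at the moment the streak reaches the threshold.
        return self.consecutive_failures == self.threshold


debouncer = FailureDebouncer(threshold=3)
results = [True, False, True, False, False, False, False]
alerts = [debouncer.record(ok) for ok in results]
print(alerts)
```

With checks every 15 seconds, a threshold of 3 would delay a real alert by roughly 30 to 45 seconds while ignoring one-off failed requests.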

    Sorry for the inconvenience.

    Chris

  8. #8
    Join Date
    Feb 2010
    Posts
    9

    Hi Chris,

    Regarding my question: so that means that if I turned on notification of application unavailability (during deployment creation), I will get an email alert whenever anything goes wrong EITHER with the tc call on :8080 OR with the Apache call on :80, right?

    I checked the tc logs again (/var/log/tcserver-catalina.out) as well as my application's log4j output, and there are no errors at the times of the alert emails. If you have any idea how I can help with this issue, please let me know.

    Thanks,
    Pavel

  9. #9
    Join Date
    Aug 2009
    Posts
    20

    I've been getting similar false alerts from our two clusters.

    The logs show no errors, and the clusters do not appear to be under load when the alerts fire.

    Is there some kind of timeout parameter that needs to be increased?

  10. #10
    Join Date
    Jun 2005
    Posts
    102

    Hi,

    CF.com has been updated with some fixes that should eliminate spurious context unavailable alerts. Please let us know if you continue to have problems.

    Chris
