[FIXED] the (monitoring) message spooler not sending emails


This topic contains 7 replies, has 0 voices, and was last updated by  PatrickB 1 year ago.

Viewing 9 posts - 1 through 9 (of 9 total)
  • #44874

    PatrickB
    Member

    Hello.

    To put it simply: the alert messages exist, they are listed in the log and reported as queued, but no email is sent unless the spooler is restarted (or the device rebooted).

    The email config is correct: the test feature sends one, and the whole queue is sent at restart.

    Waiting more than a day or changing the “message max age” setting seems to have no effect. There is no evidence of a sending error anywhere.
    It just looks like periodic sending of queued messages is not scheduled at all.

    Context: Alix 2D13, ZS v.3.7.1, simple local mail server on another device on my LAN, no SSL, always available, SMS alerts disabled, only email enabled.

    Has anyone else experienced this? Any idea where to dig?

    Thanks, Best regards.

    #54556

    iulyb
    Member

    Hi
    This is kind of old, but anyway.
    I had the same hardware before and it worked out of the box with Gmail:
    smtp : smtp.gmail.com
    security: starttls
    port: 587

    What to look for:

    root@zs scripts> grep -r smtp ./
    ./sendmail:smtp-cli --subject="$SUBJECT" --from="$SMTPSENDER" --to="$RECIPIENT" --server="$SMTPSERVER" --port="$SMTPPORT" $SECURITYSTRING $AUTHSTRING --body-plain=$BODY
    ./alerts_start: echo smtp.gmail.com > $CONFIG/EMAIL/SMTPServer
    root@zs scripts> smtp-cli --version
    smtp-cli version 3.6
    root@zs scripts> man smtp-cli
    No manual entry for smtp-cli

    It seems that you need to google smtp-cli.

    On the other hand, I would suspect some firewall or routing issue.
    Try to telnet from your ZS to the local server.
    https://www.port25.com/how-to-check-an-smtp-connection-with-a-manual-telnet-session-2/

    #54557

    PatrickB
    Member

    Thanks but…

    As I explained, the sending of spooled messages works fine when restarting the spooler (at reboot notably).

    The problem is that those same spooled messages are never sent before that.

    Actually there is nothing related in the crontab. I had a look at crontabgen and commented out the removal of /tmp/crontab: there is code to generate the events, which means spooling them. OK, this part works…

    So I’m looking for clues about how the spooler is built and how to make it flush periodically.

    Thanks, Best regards.

    #54558

    iulyb
    Member

    You said it works on reboot. Everything restarts on reboot, and the order matters. The messages may get through then because the firewall only comes up later.

    The kerbynet scripts run in a continuous loop, so there is no need for crontab. All the relevant scripts are written that way. I am not sure, but this is what I noticed.

    Also try to isolate the problem: first send an email using telnet, and if that works, send one using smtp-cli.
    You also didn’t specify whether your server is internal or external, so try telnet and smtp-cli on both.
    If everything works, then the problem is somewhere else in the scripts. The scripts are in /root/kerbynet/scripts

    #54559

    PatrickB
    Member

    I can get all the currently spooled messages sent without rebooting, for instance by changing the monitoring parameter “max age” and saving. Doing so presumably restarts the spooler service specifically.

    The SMTP server in use runs full-time on another machine on the LAN, without SSL, and, as I said, it works fine whenever the spooler decides to flush its queue. The test message feature works too.

    Since everything is on the LAN side, there is no firewall restriction between the ZS and the SMTP server. Rest assured that I tested all of that. Otherwise it would never work.

    The issue is that the spooler keeps queueing indefinitely until it is restarted. This is why I’d like to find out how the spooler works and what makes it flush its queue.

    If I understand you correctly, the spooler would be an infinite sleep-and-do loop in a script? Why not… In that case it means that the condition for entering the message-sending branch is never met.

    OK, I will try to locate such a script among the others.

    Thanks, Best regards.

    #54560

    PatrickB
    Member

    The spooler is the script spoolerd, and it is not running: it actually aborts when I try to run it manually.

    The previously queued file was named:
    1_1508439602_Emergency_b031d51f_DISKFULL

    …and the email was properly sent

    Another one is named:

    7_949653234_Info_18c8f180_STARTED

    …and causes this error:

    ./spoolerd: line 28: 949653234_: value too great for base (error token is "949653234_")

    MESSAGES="`ls -d * 2>/dev/null`"
    for M in $MESSAGES ; do
      TS="${M:2:10}"
      SEVERITY="`echo $M | awk -F_ '{print $3}'`"
      SUBJECT="`cat $M/Subject 2>/dev/null`"
      TYPE="`cat $M/Type 2>/dev/null`"
      MRECIPIENT="`cat $M/Recipient 2>/dev/null`"
      if [ "$TYPE" = Recipient ] ; then
        TYPE="`cat $CONFIG/Recipients/$MRECIPIENT/Type`"
      fi
      ID="`echo $M | awk -F_ '{print $4}'`"
      EVENT="`echo $M | awk -F_ '{print $5}'`"
      NOW=`date +%s`
      if [ $((NOW-TS)) -gt $MAXAGE ] ; then   # <-- the line reported in the error
        $SCRIPTS/alerts_logger "$ID" "$EVENT ($SEVERITY): message expired."

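    I guess that the timestamp is assumed to have 10 digits, but here it has only 9 (949653234 is a date in early 2000, presumably because the clock was not yet set when the STARTED event was queued), so the 10-character slice also picks up the following underscore, unless the timestamp is left-padded with a zero.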
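A minimal sketch of what the fixed-width slice yields for the two entry names quoted above, assuming bash. The helper name extract_ts is made up for illustration; only the ${M:2:10} expression comes from spoolerd. Epoch timestamps have had 10 digits since 9 September 2001, so a 9-digit value such as 949653234 (early February 2000) suggests the clock was not yet set when that entry was created, which is an assumption and not something the thread confirms.

```bash
#!/bin/bash
# Hypothetical helper: applies the same fixed-width slice that spoolerd
# uses on an entry name (skip 2 characters, keep the next 10).
extract_ts() {
    local M="$1"
    printf '%s\n' "${M:2:10}"
}

extract_ts "1_1508439602_Emergency_b031d51f_DISKFULL"   # 10-digit timestamp
extract_ts "7_949653234_Info_18c8f180_STARTED"          # 9-digit timestamp
```

The first call prints 1508439602; the second prints 949653234_ including the underscore, which is the token named in the error message.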

    But in that case the test must force base 10, because a leading zero makes bash treat the number as octal, in which the digits 8 and 9 are invalid:
    if [ $(( $NOW - 10#$TS )) -gt $MAXAGE ] ; then
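The octal concern can be checked with a short sketch, assuming bash. The zero-padded value 0949653234 is hypothetical: spoolerd does not pad timestamps, this only illustrates what would happen if it did.

```bash
#!/bin/bash
TS="0949653234"    # hypothetical zero-padded value, for illustration only

# Without a base prefix, the leading zero selects octal, and 9 is not an
# octal digit, so the evaluation fails. The subshell contains the failure.
if ( : "$(( TS ))" ) 2>/dev/null; then
    PLAIN="ok"
else
    PLAIN="error"
fi

# With the 10# prefix the same digits are read as decimal.
FORCED=$(( 10#$TS ))

echo "plain: $PLAIN"
echo "forced base 10: $FORCED"
```

Note that 10# only changes how the digits are interpreted; it would not help with a value that still carries a trailing underscore.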

    Actually, in this case it can be fixed more simply by just removing the possible trailing underscore:

    MESSAGES="`ls -d * 2>/dev/null`"
    for M in $MESSAGES ; do
      TS="${M:2:10}"
      TS="${TS%%_}"   # <-- added line
      SEVERITY="`echo $M | awk -F_ '{print $3}'`"
      ...
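As a possible alternative to the fixed-width slice, the timestamp could be taken as the second underscore-separated field, much as the excerpt already extracts the severity, ID, and event by field. This is only a sketch and not what was integrated in spoolerd. The helper name ts_field is hypothetical, and it assumes that entry names always follow the number_timestamp_severity_id_event pattern seen in the two examples.

```bash
#!/bin/bash
# Hypothetical helper: take the timestamp as field 2 of the entry name,
# so that its length no longer matters.
ts_field() {
    local M="$1"
    local REST="${M#*_}"          # drop the first field and its underscore
    printf '%s\n' "${REST%%_*}"   # keep everything up to the next underscore
}

NOW=$(date +%s)
for M in 1_1508439602_Emergency_b031d51f_DISKFULL 7_949653234_Info_18c8f180_STARTED; do
    TS=$(ts_field "$M")
    echo "$M -> TS=$TS age=$(( NOW - TS ))s"
done
```

Both names yield a digits-only TS here, so the age computation runs without any extra cleanup step.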

    Until this is integrated, it is necessary to replace spoolerd with a fixed copy in the PostBoot script, then run alerts_start again because the spooler has already crashed by then.

    This fixes the issue: ps -A shows spoolerd running and the emails are sent. 8)

    Hope it helps, Best regards.

    #54561

    imported_fulvio
    Participant

    Hi,
    many thanks for the contribution. I am going to integrate the fix in the 3.8.1 release.
    Please let me know if I only have to add

    TS="${TS%%_}"

    without the base 10 conversion.

    Regards
    Fulvio

    #54562

    PatrickB
    Member

    Hello.

    I asked several people; nobody could explain to me why the trailing underscore causes an error saying “value too great for base”.

    Anyway, removing it leaves only digits, which does the job.

    Forcing base10 in the arithmetic operation $(( )) would be required only if the first digit could be a zero, because a leading zero means octal and there are digits 8 and 9 in the numbers.

    If there cannot be a leading zero, then no need to force base10.
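A likely explanation for the “value too great for base” message, going by the bash manual's description of arithmetic constants: bash accepts numbers written as base#n with bases up to 64, where the digits above 9 are the lowercase letters, the uppercase letters, "@" and "_", in that order. The underscore is therefore a legal digit worth 63, and a token such as 949653234_ is read as a decimal number whose last digit is too large for base 10. A minimal check, assuming bash:

```bash
#!/bin/bash
# In base 64, "@" and "_" are the two highest digits.
AT=$(( 64#@ ))
UNDERSCORE=$(( 64#_ ))
echo "64#@ = $AT, 64#_ = $UNDERSCORE"

# The same "_" at the end of a decimal number is a digit too large for base 10.
# The subshell keeps the failed evaluation from affecting the rest of the script.
MSG=$( ( TS="949653234_"; : "$(( TS ))" ) 2>&1 ) || true
echo "$MSG"

# Stripping the trailing underscore, as in the fix, leaves a plain decimal number.
TS="949653234_"
TS="${TS%%_}"
echo "cleaned: $(( TS ))"
```

This also suggests that forcing base 10 alone would not have cured the original error, since the underscore would still be parsed as a digit.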

    Actually, fixing this bug has a little side effect: now my pair of Alix boards spams me with DISK FULL alerts because the profile is crowded with downloaded modules (several megs each), and I’m looking for a way to purge them safely:

    /DB/_DB.002/var/register/system/AutoUpdate/pkgs/…

    I also noticed that the partitions don’t use the whole CompactFlash card, so I plan to enlarge this one.

    😈

    Thank you for your work.
    Have a nice day.

    #54563

    PatrickB
    Member

    I confirm that the fix is integrated in release 3.8.1, so I could remove my mod from the PostBoot script.

    The messages are sent as needed.

    In addition, I did some surgery to enlarge the “Profiles” partition from 1.1 GB to 2 GB, and as a result it is only 52% in use now.

    I did it before upgrading ZS, and the 52% ratio stayed stable across the upgrade. So I hope the AutoUpdate will not colonize the new space…

    Actually, I’d like to find some documentation about how space is managed there and what triggers the purging of items. I only have 3 addons and almost no logs, so having the partition fill up until Samba crashed was very annoying 👿

    Thank you in advance.
    Best regards.
