Monitor and restart Tomcat with PowerShell

One of our clients was having an issue with a Windows process that kept needing a restart, and a scheduled task that restarted the service every night wasn’t quite cutting it. We decided that having something watch the process and restart it as required was a little more elegant than restarting every few hours. It ended up working really well.

At another job, we have been dealing with a server issue on and off for the past few months where Tomcat becomes unresponsive, but still appears to be running. Traditional service monitoring wasn’t detecting the crash, but our users were. This looks bad!

We eventually determined that the most likely cause is a network hiccup between Tomcat and MSSQL that locks Tomcat up, since a simple restart is enough to get it back up and running. We started using UptimeRobot to notify us when it is down while we thought through the best way to “fix” the issue without having to “fix” the issue. The servers and the network are both old and are going to be replaced in the fairly near future, so we don’t want to spend too much time fixing old hardware before it is gone.

These servers run Windows, so some of the more common *nix monitoring tools (wget, curl, etc.) don’t work natively, and we can’t just go installing stuff willy-nilly into the environment without the security team getting a bit difficult. So I went looking for PowerShell ideas. I realized that I could probably take the code used for the other client and modify it to solve this problem as well. This is what I came up with:

#Function to write logging information to the log files.
#A separate PowerShell task deletes logs older than 30 days.
function WriteLog {
    param ([String]$Msg)
    $TimeStamp = Get-Date -Format "dd/MM/yy HH:mm:ss"
    if ($Global:SaveLog) {
        if (-not (Test-Path "$Global:LogPath")) {
            New-Item -ItemType "directory" -Path "$Global:LogPath" -Force -Confirm:$false | Out-Null
        }
        $FileTimeStamp = Get-Date -Format "yyyyMMdd"
        $LogFileWithDate = ($Global:LogFileName.Replace(".log", "_$($FileTimeStamp).log"))
        Write-Output "[$TimeStamp] $Msg" | Out-File -FilePath "$($Global:LogPath)\$($LogFileWithDate)"  -Append
    }
    else {
        Write-Host "[$TimeStamp] $Msg"
    }
}

#Variables
$Global:SaveLog = $true
$Global:LogPath = "C:\Tools\TestTomcat\"
$Global:LogFileName = "TestTomcat.log"  # must end with .log
$Global:LogRetentionInDays = 30
$ServiceName = 'Apache Tomcat 8.5'
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
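$HTTP_Request = [System.Net.WebRequest]::Create('https://known-good-url')
#GetResponse() throws on timeouts, refused connections, and 4xx/5xx statuses,
#so catch the error, log it, and treat it as "not OK" (status 0).
$HTTP_Response = $null
$HTTP_Status = 0
try {
    $HTTP_Response = $HTTP_Request.GetResponse()
    $HTTP_Status = [int]$HTTP_Response.StatusCode
}
catch {
    WriteLog "Request failed: $($_.Exception.Message)"
}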

#Check URL above to see if it responds OK or not. If Response is 200, it is OK.
If ($HTTP_Status -eq 200) {
    #Write-Host "Site is OK!"
    WriteLog "OK"
}
#Any other response suggests it is down and needs a boot to the head.
#Logs the restart and restarts Tomcat. Uncomment the Send-MailMessage line
#for this node to also send an email to help.
Else {
    #Write-Host "The Site may be down, please check!"
    #Send-MailMessage -From 'Tomcat Node A <tomcat_a@techrescue.mn>' -To 'Help <help@techrescue.mn>' -Subject 'Tomcat Node A Restarted' -SmtpServer 'smtp.techrescue.mn'
    #Send-MailMessage -From 'Tomcat Node B <tomcat_b@techrescue.mn>' -To 'Help <help@techrescue.mn>' -Subject 'Tomcat Node B Restarted' -SmtpServer 'smtp.techrescue.mn'
    WriteLog "Restarting Tomcat"
    Restart-Service $ServiceName
}

#Closes the web response, if there is one, and exits the script.
If ($null -ne $HTTP_Response) { $HTTP_Response.Close() }
#Exit with 0 (success), since the check itself completed.
Exit 0
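
One thing to watch with the script above: HttpWebRequest waits up to 100 seconds by default before giving up, which matters if the script runs on a short schedule and Tomcat is hung rather than refusing connections. Here is a minimal sketch of the same check with an explicit timeout, assuming PowerShell 3.0 or later for Invoke-WebRequest. The function name Test-UrlHealthy and the 15-second timeout are placeholders for illustration, not part of the script above.

```powershell
#Same TLS 1.2 setting as the script above.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

#Hypothetical helper: returns $true only if the URL answers 200 within the timeout.
function Test-UrlHealthy {
    param (
        [Parameter(Mandatory = $true)][string]$Url,
        [int]$TimeoutSec = 15   #assumed value, adjust for your environment
    )
    try {
        #Invoke-WebRequest throws on timeouts, refused connections, and 4xx/5xx statuses.
        $Response = Invoke-WebRequest -Uri $Url -UseBasicParsing -TimeoutSec $TimeoutSec
        return ([int]$Response.StatusCode -eq 200)
    }
    catch {
        return $false
    }
}

#Usage, with the same placeholder URL as the script above.
if (Test-UrlHealthy -Url 'https://known-good-url') {
    Write-Host "OK"
}
else {
    Write-Host "Would restart Tomcat here"
}
```

Returning a plain true/false keeps the restart decision in one place, and the timeout puts an upper bound on how long a single check can run.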

I then ran the command below to create a local PowerShell scheduled job that runs the script every minute, checking the URL and restarting Tomcat if necessary. One minute is the shortest interval I could get, but it seems to be working okay, as this application has two machines behind a load balancer. If Tomcat is down on one node, the LB redirects traffic to the other node, and the likelihood of both nodes being down at the same time is small. The script seems to restart Tomcat within a minute and all is well. CPU use is pretty minimal too, even at one-minute intervals: the highest spike I have seen so far is 8%, lasting about a second.

Register-ScheduledJob -Name 'Test Tomcat and Restart If Needed' -FilePath 'C:\Tools\TestTomcat\TestTomcat.ps1' -Trigger (New-JobTrigger -Once -At "12/5/2023 0am" -RepetitionInterval (New-TimeSpan -Minutes 1) -RepetitionDuration ([TimeSpan]::MaxValue))

I also found out that scheduled jobs like this are located in the Task Scheduler in the “Task Scheduler Library\Microsoft\Windows\PowerShell\ScheduledJobs” folder. You can use Task Scheduler to view and edit the scheduled job.
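
The same information is available from PowerShell if you would rather not click through Task Scheduler. This is a quick sketch, assuming the job was registered under the name used above and that you are logged in as the same user who registered it (scheduled jobs are stored per user):

```powershell
Import-Module PSScheduledJob

$JobName = 'Test Tomcat and Restart If Needed'

#Shows the job definition and whether it is enabled.
Get-ScheduledJob -Name $JobName

#Shows the trigger attached to the job.
Get-JobTrigger -Name $JobName

#Lists the most recent runs and how each one ended.
Get-Job -Name $JobName | Sort-Object PSBeginTime | Select-Object -Last 5 |
    Format-Table Id, State, PSBeginTime, PSEndTime
```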

The log cleanup script (CleanLogs.ps1) is this:

function WriteLog {
    param ([String]$Msg)
    $TimeStamp = Get-Date -Format "dd/MM/yy HH:mm:ss"
    if ($Global:SaveLog) {
        if (-not (Test-Path "$Global:LogPath")) {
            New-Item -ItemType "directory" -Path "$Global:LogPath" -Force -Confirm:$false | Out-Null
        }
        $FileTimeStamp = Get-Date -Format "yyyyMMdd"
        $LogFileWithDate = ($Global:LogFileName.Replace(".log", "_$($FileTimeStamp).log"))
        Write-Output "[$TimeStamp] $Msg" | Out-File -FilePath "$($Global:LogPath)\$($LogFileWithDate)"  -Append
    }
    else {
        Write-Host "[$TimeStamp] $Msg"
    }
}

function CleanLogs {
    $Retention = (Get-Date).AddDays(-$($Global:LogRetentionInDays))
    Get-ChildItem -Path $Global:LogPath -Recurse -Force | Where-Object { $_.Extension -eq ".log" -and $_.CreationTime -lt $Retention } |
        Remove-Item -Force -Confirm:$false
}


$Global:SaveLog = $true
$Global:LogPath = "C:\Tools\TestTomcat\"
$Global:LogFileName = "TestLO.log"  # must end with .log
$Global:LogRetentionInDays = 30


if ($Global:SaveLog) {
    WriteLog -Msg "Cleaning logs older than $($Global:LogRetentionInDays) days"
    CleanLogs
}
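
Before letting a cleanup like this loose on a folder, it can be worth previewing what it would delete. This is a minimal sketch, assuming the same path and retention as above. Get-ExpiredLogs is a hypothetical helper name, and -WhatIf makes Remove-Item report what it would remove without deleting anything.

```powershell
#Hypothetical helper: same filter as CleanLogs, but returns the files instead of deleting them.
function Get-ExpiredLogs {
    param (
        [Parameter(Mandatory = $true)][string]$Path,
        [int]$RetentionInDays = 30
    )
    $Retention = (Get-Date).AddDays(-$RetentionInDays)
    Get-ChildItem -Path $Path -Recurse -Force |
        Where-Object { $_.Extension -eq ".log" -and $_.CreationTime -lt $Retention }
}

#Dry run: lists what would be deleted, removes nothing.
$LogPath = "C:\Tools\TestTomcat\"
if (Test-Path $LogPath) {
    Get-ExpiredLogs -Path $LogPath -RetentionInDays 30 | Remove-Item -Force -WhatIf
}
```

Once the list looks right, dropping -WhatIf gives the same behavior as CleanLogs.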

And this is the command that creates the task to run it every week:

Register-ScheduledJob -Name 'Purge TestLO Logs' -FilePath 'C:\Tools\TestTomcat\CleanLogs.ps1' -Trigger (New-JobTrigger -Once -At "12/5/2023 0am" -RepetitionInterval (New-TimeSpan -Days 7) -RepetitionDuration ([TimeSpan]::MaxValue))
