Linux Temperature Monitoring for Compaq DL380 G1’s, Dell PowerEdges & Asus A7N8X

With summer fast approaching, I’ve noticed that the temperatures in the server room have been increasing. I’ve yet to turn the AirCon units on, as that would use far more power than I’m prepared to use while the weather is still mild. That aside, I thought it best to start monitoring temperatures across some of the more critical servers in the rack.

I’ll start with the Generation 1 ProLiant DL380s. These have the ProLiant Support Pack installed, which provides a very nice web interface for monitoring and also adds an extra SNMP agent to the system, allowing you to grab all sorts of additional data, not least the temperatures. The OIDs I use are:

CPU1:       .1.3.6.1.4.1.232.6.2.6.8.1.4.1.2
CPU2:       .1.3.6.1.4.1.232.6.2.6.8.1.4.1.3
IO Board:  .1.3.6.1.4.1.232.6.2.6.8.1.4.1.3
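
Before handing these OIDs to MRTG it can be worth checking that the agent actually answers for them. The following is only a rough sketch, assuming net-snmp's snmpget is installed; the community string "read" and the host 127.0.0.1 are assumptions borrowed from the MRTG target used in this post, and read_temp is a made-up helper name.

```shell
#!/bin/bash
# Sketch: read one temperature over SNMP with net-snmp's snmpget.
# HOST and COMMUNITY are assumptions; override them in the environment.
HOST="${HOST:-127.0.0.1}"
COMMUNITY="${COMMUNITY:-read}"

# read_temp OID -> prints the bare value returned by the agent
#   -v 1  : SNMP version 1
#   -c    : community string
#   -Oqv  : quick print, value only (no OID or type prefix)
read_temp() {
    snmpget -v 1 -c "$COMMUNITY" -Oqv "$HOST" "$1"
}

# Example (needs a reachable agent):
#   read_temp .1.3.6.1.4.1.232.6.2.6.8.1.4.1.2    # CPU1
#   read_temp .1.3.6.1.4.1.232.6.2.6.8.1.4.1.3    # CPU2
```

If a call returns a plain number, the same OID should be safe to graph.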

As always I am using MRTG to graph this lot so the following code should suffice:

Target[DL380.LAN_CPU]: .1.3.6.1.4.1.232.6.2.6.8.1.4.1.2&.1.3.6.1.4.1.232.6.2.6.8.1.4.1.3:read@127.0.0.1
MaxBytes[DL380.LAN_CPU]: 80
AbsMax[DL380.LAN_CPU]: 100
Title[DL380.LAN_CPU]: CPU Temperatures — DL380
Options[DL380.LAN_CPU]: gauge, nopercent, growright
YLegend[DL380.LAN_CPU]: Deg Celsius
ShortLegend[DL380.LAN_CPU]:°C
LegendI[DL380.LAN_CPU]: CPU#1:
LegendO[DL380.LAN_CPU]: CPU#2:
PageTop[DL380.LAN_CPU]: CPU Temperatures — DL380

Here are the results:
Temperature Graphs for the DL380
(Could you tell we just had a ‘real’ British Bank Holiday this weekend?)

The Dell PowerEdges can be queried in the same way once the OMSA (OpenManage Server Administrator) software is installed. The OIDs that I use for them are as follows:
CPU1:       .1.3.6.1.4.1.674.10892.1.700.20.1.6.1.1
CPU2:       .1.3.6.1.4.1.674.10892.1.700.20.1.6.1.2

That’s great, and made extremely easy by the software the manufacturers provide. However, the S411 servers are generic rack-mount cases with Asus motherboards, so no such luck with vendor software for monitoring temperatures.

The obvious choice here is the lm_sensors project (especially since they support the motherboards I’ve used).

After configuring lm_sensors I was presented with the following output:

Adapter: SMBus nForce2 adapter at 5500
VCore 1: +1.71 V (min = +1.31 V, max = +1.97 V)
+3.3V: +3.31 V (min = +2.96 V, max = +3.63 V)
+5V: +4.81 V (min = +4.49 V, max = +5.51 V)
+12V: +11.80 V (min = +9.55 V, max = +14.41 V)
CPU Fan: 8132 RPM (min = 4687 RPM, div = 2)
Chassis Fan: 0 RPM (min = 337500 RPM, div = 2)
Power Fan: 0 RPM (min = -1 RPM, div = 2)
M/B Temp: +51°C (high = +80°C, hyst = +75°C)
CPU Temp: +24°C (high = +80°C, hyst = +75°C)
Power Temp: +126°C (high = +80°C, hyst = +75°C)
CPU Temp: +25°C (high = +80°C, hyst = +75°C)
vid: +1.650 V (VRM Version 9.0)

Extremes aside (I know that my PSU is not sitting at 126°C, but never mind), all I really want out of this output is my CPU temperature. In the past, when this machine ran Windows Server 2003 with the AsusMonitor software installed, I knew that this CPU sat at around 50°C (this particular machine does a LOT of work!), which means that lm_sensors’ “M/B Temp” is actually the CPU, and the two “CPU Temp” readings are most likely the RAM and ambient temperatures. Time for some shell scripting.

What I came out with in the end is this:

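#!/bin/bash
###################################
# Gets the CPU Temp from lm_sensors
###################################

# On this board "M/B Temp" is the reading that tracks the CPU.
# The third whitespace-separated field looks like "+52°C".
CPU=$(sensors | grep "M/B Temp" | awk '{print $3}')

# Skip the leading "+" and print the next two characters, e.g. "52".
echo "${CPU:1:2}"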

For those of you not too hot on Bash scripting: all I have done is grab the sensors output, search for the line that contains the text “M/B Temp” and assign only the third column to a variable called CPU. This variable would actually contain the text “+52°C”, which MRTG would not like, so I output two characters starting at the second character of the string. (I could do some maths and check whether the temperature is three digits or more, but since the BIOS will shut the machine down when it hits 65, I really can’t see the point.)
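
For anyone who does want the three-digit case covered, here is one possible sketch that avoids fixed character positions. It strips the leading “+” and keeps only the leading run of digits, so “+126°C” and a decimal reading such as “+52.0°C” both come out as whole degrees. extract_temp is a hypothetical helper name, and negative readings are deliberately not handled.

```shell
#!/bin/bash
# Sketch: pull the whole-degree part out of a reading such as "+52°C".
# extract_temp is a hypothetical helper, not part of lm_sensors or MRTG.
extract_temp() {
    local raw="$1"
    raw=${raw#+}                 # drop a leading "+", if present
    echo "${raw%%[!0-9]*}"       # cut everything from the first non-digit on
}

# Example:
#   extract_temp "$(sensors | grep "M/B Temp" | awk '{print $3}')"
```

It only uses shell parameter expansion, so no extra processes are started beyond the sensors pipeline itself.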

MRTG can use external scripts to get its data, and providing the full path to a derivative of the script above would work just fine.

Directory[S411-1-cpu]: cpu
Target[S411-1-cpu]: `/etc/mrtg/scripts/S411-1.sh`
MaxBytes[S411-1-cpu]: 100
Options[S411-1-cpu]: gauge,growright,nopercent,absolute
ShortLegend[S411-1-cpu]: CPU
WithPeak[S411-1-cpu]: wmy
LegendO[S411-1-cpu]:CPU Temp
Title[S411-1-cpu]: S411 CPU Temp
PageTop[S411-1-cpu]: S411 CPU Temp
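
One detail worth spelling out about the “derivative”: MRTG reads four lines from an external script (the first value, the second value, an uptime string and the target name), so a one-line script needs padding out. A minimal sketch of that wrapping, where mrtg_report is a made-up helper name and printing the same reading on both value lines is my assumption:

```shell
#!/bin/bash
# Sketch: print a reading in the four-line form MRTG reads from external scripts.
# mrtg_report is a hypothetical helper name.
#   $1 = value, $2 = uptime string, $3 = target name
mrtg_report() {
    echo "$1"    # line 1: first variable
    echo "$1"    # line 2: second variable (same reading repeated)
    echo "$2"    # line 3: uptime string
    echo "$3"    # line 4: name of the target
}

# Example, reusing the sensors pipeline from the script above:
#   CPU=$(sensors | grep "M/B Temp" | awk '{print $3}')
#   mrtg_report "${CPU:1:2}" "$(uptime)" "$(hostname)"
```

Since the config above only sets LegendO, it is the second value that ends up labelled “CPU Temp”, which is why the reading is repeated rather than padded with a zero.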

Eventually the AirCon will be needed, but at least I can see what’s been happening, and maybe with a bit of RSS and whatnot I can also graph the ambient air temperature outside too, just ’cos.
