I’ve spent the past few days building a little SNMP management pack to discover and monitor a bunch of PowerWare UPSs, which took quite a lot more energy and time than expected. That was mostly because I am really bad with SNMP and how it works, because I had never really looked into the inner workings of building an SNMP management pack, and because we ran into a couple of errors that prevented the discovery process from working.
To make it clear right away, this is not going to be a “Building an SNMP Management Pack Tutorial”, since there are plenty of good ones out there already, and to be extra helpful I’m gonna include a few links right away:
It’s the second one, the NetApp one, that I’ve used as a guide to building the UPS management pack, since it goes through the process of building your own filtered discovery using SystemOID to identify your hardware classes and then building the monitors on top of those.
Let’s get to it
When building the discovery of my hardware classes I ran into problems. The discovery simply did not work. At first I got some strange errors about “invalid queries”, which turned out to be caused by me reading two guides and mixing up the XPathQuery variables (seriously though, pick the one guide that is closest to what you want to achieve and stick to it). Silly me.
I got those errors to go away and was able to get a few objects into my base class, but none of the hardware classes that were populated from the return value of an SNMP OID got discovered.
The only error I got this time was the following:
Log Name: Operations Manager
Source: Health Service Modules
Date: 2010-09-02 11:19:12
Event ID: 11001
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: CENSORED
Description:
Error sending an SNMP GET message to IP Address XX.XX.XX.XX, Community String:=CENSORED, Status 0x6c.
One or more workflows were affected by this.
Workflow name: CENSORED.MP.CLASS.DISCOVERY
Instance name: CENSORED_DEVICENAME
Instance ID: {5C7EFB30-D885-8843-0DD7-EA86B4FD2311}
Management group: CENSORED
I went through all the other logical steps of troubleshooting an error like that, which include double-checking firewall settings, OIDs, IP addresses, allowed hosts and so forth. It wasn’t until I loaded the PowerMIB into a MIB browser installed on the proxy machine (in this case a Management Server) that I realized there was no problem sending an SNMP GET to the UPS from that server. I launched Wireshark and had it listen to SNMP traffic between the UPS and the Management Server. The thing that struck me right away was that I could see a bunch of “SNMP Get-Request” packets but no “SNMP Get-Response”, which means that Operations Manager did send an SNMP GET but never got a response.
After a bit of intense staring I noticed what you see in the screenshot.

For some reason Operations Manager does not care about what SNMP version you configure when you do the initial discovery of a network device. Even if you do specify SNMP v1, your probes may very well be using SNMP v2c instead, and in many cases that will result in these SNMP GET errors in the Operations Manager event log.
Since I am such a nice guy, here’s an example of the working probe. The added line is the <Version>1</Version> element.
<IsWriteAction>false</IsWriteAction>
<IP>$Config/IP$</IP>
<CommunityString>$Config/CommunityString$</CommunityString>
<Version>1</Version>
<SnmpVarBinds>
  <SnmpVarBind>
    <OID>1.3.6.1.4.1.534.1.1.1.0</OID>
    <Syntax>0</Syntax>
    <Value VariantType="8"></Value>
  </SnmpVarBind>
  <SnmpVarBind>
    <OID>1.3.6.1.4.1.534.1.1.2.0</OID>
    <Syntax>0</Syntax>
    <Value VariantType="8"></Value>
  </SnmpVarBind>
  <SnmpVarBind>
    <OID>1.3.6.1.4.1.534.1.1.3.0</OID>
    <Syntax>0</Syntax>
    <Value VariantType="8"></Value>
  </SnmpVarBind>
</SnmpVarBinds>
That’s it. Working perfectly now.
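If you want to check which SNMP version a device actually answers to before touching the management pack, something like the following could do. It is only a sketch: it assumes the net-snmp command line tools are installed on some machine that can reach the device, the host and community string are placeholders of mine, and the OID is simply the first one from the probe above.

```shell
#!/bin/sh
# Sketch: build the net-snmp commands for querying one OID with SNMP v1 and v2c.
# HOST and COMMUNITY are placeholders, not values from this post.

snmp_get_cmd() {
    # $1 = version (1 or 2c), $2 = community string, $3 = host, $4 = OID
    printf 'snmpget -v%s -c %s %s %s\n' "$1" "$2" "$3" "$4"
}

HOST="192.0.2.10"
COMMUNITY="public"
OID="1.3.6.1.4.1.534.1.1.1.0"

for version in 1 2c; do
    cmd=$(snmp_get_cmd "$version" "$COMMUNITY" "$HOST" "$OID")
    echo "$cmd"
    # Uncomment to really send the request (needs net-snmp and network access):
    # $cmd || echo "no answer over SNMP v$version"
done
```

A device that answers the v1 request but stays silent on v2c would match the Get-Request without Get-Response pattern described above.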
Best of luck to you too.
Ok, so I got the task of installing the Linux Integration Services for Hyper-V R2 on a RedHat Enterprise Server 5, something that turned out to be a bit more to handle than I would have thought. So here’s a little How-To.
Preparations
Read the documentation provided in the Linux Integration Services download. Much of the information in this article is in there, but some parts are not. Otherwise I would not have bothered writing about it.
I’m not going to go through the OS installation process here, but make sure to select the “Software Development” packages since you will be needing them. In case you missed that, you can install them later by running these commands.
# yum groupinstall "Development Tools"
# yum install kernel-headers
I’m not actually sure that you need to run the kernel-headers install manually or if it’s included in the “Development Tools” package.
The first gotcha I ran into was that the link to the Linux Integration Services (previously known as Linux Integration Components or LinuxIC) on RedHat’s information pages gave me a 404 and a redirect to a Bing search that returned the exact same 404. The page has simply been removed by Microsoft without any form of redirection to the new page. Anyway, a search on http://download.microsoft.com for “Linux Integration Components” does return the new page, and that’s where I learned about the new name.
Thank you for making it easy for us, Microsoft!
Here’s a direct link to the search on the current name: http://www.microsoft.com/downloads/en/results.aspx?freetext=linux+integration+services&displaylang=en&stype=s_basic
And here’s a direct link to the actual download page: http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=eee39325-898b-4522-9b4c-f4b5b9b64551
This download contains an ISO file that you can mount using the Hyper-V- or VMM-console, or you can do as I did and download the ISO to the virtual machine, mount it locally, copy the files and unmount it. Like this.
# mkdir /mnt/ISO
# mount -o loop /root/LinuxIC\ v21.iso /mnt/ISO
# mkdir /opt/linux_ic_v21_rtm
# cp /mnt/ISO/* -R /opt/linux_ic_v21_rtm/
# umount /mnt/ISO
You probably have to be root to do this by the way.
With that done, let’s get to the installation.
Installation
As root, do the following:
# export PATH=$PATH:/sbin
# cd /opt/linux_ic_v21_rtm/
# make
# make install
# reboot
Why the export PATH command? Apparently, on RHES5, /sbin is not in the PATH by default, and the make scripts are completely unaware of that. The “make install” will try to run “depmod”, which fails since it’s not in the default path. You could also add “export PATH=$PATH:/sbin” to the root user’s ~/.bashrc, which puts it back in the PATH permanently (but only for the root user), but I don’t know if that’s recommended.
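If you do go the ~/.bashrc route, a guard keeps /sbin from being appended over and over every time the file is sourced. This is a minimal sketch of my own, not something from the Linux Integration Services documentation, and the function name add_to_path is made up for the example.

```shell
# Sketch: append a directory to PATH only when it is not already in there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$1" ;;      # append at the end
    esac
}

add_to_path /sbin
export PATH
```

Wrapping PATH in colons before matching makes sure that /sbin is only treated as present when it is a whole entry, not when it is just part of something like /usr/sbin.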
And, yes. You DO have to reboot after the install.
If you are running RHES5 64bit you also have to install the “adjtimex” package. It is in the RHN repository but also on the RHES5 Installation CD in case you have no internet connection. Install it with yum like this:
# yum install adjtimex
And from the CD (mount it first) like this:
# rpm -ivh /mnt/cdrom/Server/adjtimex-1.20-2.1.x86_64.rpm
And that’s basically it for the installation.
Verification
How do you know that the drivers are installed?
After the reboot, try running “modinfo vmbus” which should return something like this:
# modinfo vmbus
filename: /lib/modules/2.6.18-194.11.1.el5/kernel/drivers/vmbus/vmbus.ko
version: 2.1.25
license: GPL
srcversion: 3C1899C419665CB2514F2D0
depends:
vermagic: 2.6.18-194.11.1.el5 SMP mod_unload gcc-4.1
parm: vmbus_irq:int
parm: vmbus_loglevel:int
Try that with netvsc, storvsc and blkvsc too (replace the vmbus part) and you should get something similar. If you don’t, the installation did not succeed.
The documentation also tells us to check that the components are running with “/sbin/lsmod | grep vsc” which should return:
# /sbin/lsmod | grep vsc
blkvsc 70184 3
storvsc 64264 0
netvsc 73504 0
vmbus 88304 3 blkvsc,storvsc,netvsc
scsi_mod 196953 6 scsi_dh,sg,blkvsc,storvsc,libata,sd_mod
The numbers (module size and use count) will probably differ from installation to installation.
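Checking four modules by eye gets old after a few machines, so the lsmod check can be scripted. The sketch below is my own and assumes nothing beyond the four module names mentioned above; it reads an lsmod-style listing on stdin and prints the modules that are missing.

```shell
# Sketch: report which of the Hyper-V modules are absent from lsmod-style output.
# The listing is read from stdin, so saved output can be checked as well.
missing_modules() {
    awk -v want="vmbus netvsc storvsc blkvsc" '
        { loaded[$1] = 1 }                      # first column is the module name
        END {
            n = split(want, mods, " ")
            for (i = 1; i <= n; i++)
                if (!(mods[i] in loaded)) print mods[i]
        }'
}

# On the guest: /sbin/lsmod | missing_modules
```

No output means all four modules are loaded; anything it prints is worth a closer look with modinfo.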
Configuration
Configuration is pretty straight-forward so I’ll keep this short.
When you install the drivers you will get a new network card called seth0, which I presume stands for Synthetic ETHernet. There’s nothing magic about configuring it, and “system-config-network” will work just fine.
The drivers will also give you a couple of SCSI-devices (if you have one attached) with the regular /dev/sd* naming. Simply configure these using fdisk or whatever GUI you might prefer.
There is also a note in the documentation about changing the grub configuration in the “Additional Information…” section. Do read that section.
Additional Comments
One thing I tend to do now that disk space is dirt cheap is to copy all ISO files I use locally instead of mounting them through Hyper-V when needed. Simply because you can bet your insert-shorter-word-for-buttocks that the day you need it again, someone has been kind enough to do some spring-cleaning, or it’s locked by another machine in the cluster. If you have it locally and followed my instructions in the “Preparations” section, you will already have a /mnt/ISO directory. The only thing you’ll have to do is
# mount -o loop /path/to/your.iso /mnt/ISO
And there you have it. Just remember to unmount it when you’re done.
I also almost never use the Hyper-V remote connection interface thingy since it will give you a GUI and the mouse just won’t work. If you haven’t configured a network card yet though, you could connect through Hyper-V and hit Ctrl+Alt+F1 to get a command prompt. Unfortunately cut/paste doesn’t work here, but you could run system-config-network, assign an IP address and then connect with an SSH client. I prefer PuTTY to a degree that I usually install the ported version on my Linux desktops as well.
And I never log on as root. People should know this, but it should be stressed anyway. Always log on as a regular user and su or sudo when needed. I can’t understand why RHES has root login enabled by default in the SSH server config.
Good luck!
I’ve been looking for a way to evenly distribute agents between Gateway Servers (or Management Servers for that matter, but I’ll stick to GWs this time) for some time, but haven’t really got around to fixing it myself until now.
The situation is basically that we’re monitoring customers through gateway servers connected to our central Operations Manager environment. To have a bit of redundancy we always put two (or more) gateway servers per site (or customer, really) and they, in turn, talk to a couple of central management servers. I guess a drawing would be nice, but I have no Visio on this computer. The gateways are manually configured (through PowerShell) to talk to different management servers and have the others configured for fail-over, and since we’re talking about no more than a few handfuls (say 20-ish) it’s not a problem handling it that way.
Agents, on the other hand, are a different matter. Even though we try to spread them out somewhat evenly at deployment between the gateway servers at each site we still end up looking at a 3:2 ratio after a while and since agents do not automatically fail-over between gateway servers we need a way to fix that too.
So I wrote a little powershell script that takes a bunch of gateway servers (or management servers) as parameters, gathers all connected agents, spreads the agents evenly between the servers and configures the others as fail-over servers while at it.
It’s all pretty crude, but it works and you can download it from here: DistributeAgents.ps1
Save it somewhere on disk and call it from the Operations Manager Shell like this:
C:\DistributeAgents.ps1 gateway01.customer.local,gateway02.customer.local,gateway03.customer.local
Yes, you should replace “C:\” with whatever path you decided to save the script to and “gatewayXX.customer.local” with a real servername.
Ok, I’m a powershell freshman and I’m pretty sure you could do this a prettier way, but here’s the script:
Param([array]$CSVServerList)

$arrServerObject = @()
$arrAgentObject = @()

# Look up the server object for every name passed on the command line
foreach($Server in $CSVServerList)
{
    $arrServerObject += Get-ManagementServer | where {$_.Name -eq $Server}
    echo "Looking for $Server"
}

$ServerCount = $arrServerObject.Count
if ($ServerCount -gt 1)
{
    echo "Found $ServerCount management servers"
} else {
    echo "Found only 1 (or less) management servers. Aborting..."
    Exit
}

# Collect every agent that currently reports to one of those servers
echo "Getting agents..."
foreach ($Server in $arrServerObject)
{
    $arrAgentObject += Get-Agent | where {$_.PrimaryManagementServerName -eq $Server.Name}
}

$AgentCount = $arrAgentObject.Count
if ($AgentCount -gt 1)
{
    echo "Found $AgentCount agents"
    Start-Sleep -m 200
} else {
    echo "Found only 1 (or less) agents. Aborting..."
    Exit
}

# Hand out the agents round-robin: server $i becomes primary, all others fail-over
$i = 0
foreach ($Agent in $arrAgentObject)
{
    if ($i -ge $ServerCount)
    {
        $i = 0
    }
    $arrTemp = @($arrServerObject | Where-Object {$_ -ne $arrServerObject[$i]})
    # $FailoverServers = $arrTemp -join ","
    Set-ManagementServer -AgentManagedComputer: $Agent -PrimaryManagementServer: $arrServerObject[$i] -FailoverServer: $arrTemp
    $arrTemp = $null
    $i++
}
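To get a feel for what the loop above ends up doing without touching Operations Manager at all, the same round-robin idea can be replayed as a dry run. The sketch below is plain shell rather than PowerShell and is my own illustration: the function name, the agent names and the output format are all made up, and it only prints a plan.

```shell
# Sketch: print a round-robin plan (agent, primary, fail-over servers).
# $1 = comma-separated server names; agent names arrive one per line on stdin.
plan_distribution() {
    awk -v servers="$1" '
        BEGIN { n = split(servers, srv, ",") }
        NF {
            p = (count++ % n) + 1               # index of the primary for this agent
            failover = ""
            for (i = 1; i <= n; i++)            # every other server is a fail-over
                if (i != p) failover = failover (failover == "" ? "" : ",") srv[i]
            printf "%s primary=%s failover=%s\n", $1, srv[p], failover
        }'
}

# Example: printf 'agent01\nagent02\nagent03\n' | plan_distribution gw01,gw02
```

With two gateways and three agents the first and third agent land on the first gateway, which is the same wrap-around the $i counter does in the script.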
I have used it on a couple of occasions now and have only run into one problem: an error when one of the servers doesn’t have any agents at all (probably a new one). The script still works though, so I haven’t really dived into it.
Now, as with all scripts you download on the ‘net it’s up to you to test it in a lab before shooting wildly among your in-production systems. I really can’t give any warranties that it won’t FSU royally at your place.
After a long wait (definitely more than 90 days) the management packs for MSMQ 4 (Windows 2008) and MSMQ 5 (Windows 2008 R2) are finally released.
Both seem to be fully cluster aware and pretty much hold the same monitoring as the latest MSMQ 3 MP.
Message Queuing 4.0 Management Pack for Operations Manager 2007
Quick Details
Version: 6.0.6700.83
Date Published: 4/5/2010
Language: English
Download here: http://www.microsoft.com/downloads/details.aspx?FamilyID=cfc103b8-7185-4721-8098-110885fe9e9e&displaylang=en
Message Queuing 5.0 Management Pack for Operations Manager 2007
Quick Details
Version: 6.0.6700.88
Date Published: 4/5/2010
Language: English
Download here: http://www.microsoft.com/downloads/details.aspx?FamilyID=28349b78-8329-44aa-8a1f-81f4e3f84d0c&displaylang=en
This script has pretty much already been covered in my previous post about Changing or Replacing an Operations Manager Gateway Server.
This time I’ve basically put parameter support in it to make it easier to use.
Here’s the script anyway.
Param($OldGW,$NewGW)

# Resolve both gateway names to management server objects
$OldMS = Get-ManagementServer | where {$_.Name -eq $OldGW}
$NewMS = Get-ManagementServer | where {$_.Name -eq $NewGW}

# Every agent that currently has the old gateway as its primary
$agents = Get-Agent | where {$_.PrimaryManagementServerName -eq $OldGW}

"Moving " + $agents.count + " agents from " + $OldMS.Name + " to " + $NewMS.Name
Start-Sleep -m 200

# Point the agents at the new gateway and keep the old one as fail-over
Set-ManagementServer -AgentManagedComputer: $agents -PrimaryManagementServer: $NewMS -FailoverServer: $OldMS
To set it up, create a text file called ChangeGW.ps1 and paste the code into it. Save the file somewhere neat (maybe C:\Scripts\) for easy access. If you don’t feel like copy/pasting, you can download the script here.
To run it, open the Operations Manager Command Shell and type:
C:\Scripts\ChangeGW.ps1 <old.gatewayserver.dns.name> <new.gatewayserver.dns.name>
For example:
C:\Scripts\ChangeGW.ps1 gwserver01.domainname.local gwserver02.domainname.local
Getting “ESENT Keys are required to install this application” when you are trying to modify/change an agent installation?

This seems to be most common on Windows 2008, and I guess it’s because of UAC and the fact that the Control Panel isn’t opened in administrative mode.
To work around this you need to run the msiexec command on the correct installation GUID from an administrative command prompt.
Besides running through the registry to find the GUID, one of the easier ways is this:
- Open an administrative command prompt.
- Run wmic product
- Locate your product by its name. The GUID directly after it (it looks a bit like {25097770-2B1F-49F6-AB9D-1C708B96262A}) is the one you want. Copy it.
- Run msiexec /i <PASTEYOURGUIDHERE>
- Modify the agent as you please.
That’s pretty much it. Good luck.
Microsoft has released an update to the MSMQ (version 3) management pack.
System Center Pack for: Message Queuing 3.0
Version: 6.0.6615.0
Released on: 12/14/2009
Message Queuing (also known as MSMQ) is a server application that enables applications to communicate across heterogeneous networks and systems that may be temporarily offline or otherwise inaccessible. Instead of an application communicating with a service on another computer, it sends its information to Message Queuing, which sends the information to a Message Queuing service on the target computer where it is made available to the other application. Message Queuing provides guaranteed delivery, efficient routing, security, and priority based messaging.
Now, what’s really interesting is what you will find in the MP Guide under “Supported Configurations”.
The Message Queuing Management Pack for Operations Manager 2007 is designed to monitor Message Queuing version 3 only.
The Message Queuing Management Pack supports the following platforms:
· Windows Server 2003
· Windows XP
The Message Queuing Management Pack also supports monitoring clustered MSMQ components.
Text coloration is obviously added by me to highlight the interesting part.
Finally MSMQ monitoring seems to be cluster aware, which might mean that the home-made pack I built to have those (numerous) queues covered can be passed on to the scrap heap. This is also confirmed under “Changes in This Update”.
The December 2009 update to this management pack includes the following change:
· Fixed a problem when working with an instance of MSMQ in a Cluster. The MP is now able to discover and monitor public and private queues in a cluster.
· Fixed a problem when discovering the local and cluster instance of MSMQ. The MP is now able to discover and monitor both instances.
The confusing double RunAs profiles seem to have been cleaned up too (you only have to worry about one now), some sloppy mistakes in the previous scripts have been fixed (no Option Explicit? C’mon Microsoft! You write the best practices, try to stick to them.) and display and documentation have generally been improved.
Gonna import this to our staging environment today and let it roll during the holidays.
Cheers! Oh, and happy holidays!
Download and documentation:
http://www.microsoft.com/downloads/details.aspx?FamilyId=1D2B4398-8BC2-4A43-850C-852EBB0D983B&displaylang=en&displaylang=en
Here’s a little trouble-shooting guide for discovering Linux systems from OpsMgr R2 when getting the following error from the wizard:
<stdout>Generating certificate with hostname="COMPUTERNAME"
[/home/serviceb/TfsCoreWrkSpcRedhat/source/code/tools/scx_ssl_config/scxsslcert.cpp:198]
Failed to allocate resource of type random data: Failed to get random data - not enough entropy
</stdout><stderr>error: %post(scx-1.0.4-248.i386) scriptlet failed, exit status 1
</stderr><returnCode>1</returnCode>
<DataItem type="Microsoft.SSH.SSHCommandData" time="2009-08-05T11:15:01.5800358-04:00" sourceHealthServiceId="0EB1D6DA-202C-7FC5-3D46-BDBB9208547D"><SSHCommandData><stdout>Generating certificate with hostname="COMPUTERNAME"
[/home/serviceb/TfsCoreWrkSpcRedhat/source/code/tools/scx_ssl_config/scxsslcert.cpp:198]
Failed to allocate resource of type random data: Failed to get random data - not enough entropy
</stdout><stderr>error: %post(scx-1.0.4-248.i386) scriptlet failed, exit status 1
</stderr><returnCode>1</returnCode></SSHCommandData></DataItem>
But first, a little background on the actual “problem”. To generate the certificate, the system entropy needs to be high enough to produce random data for the certificate creation. Without the certificate, the OpsMgr agent won’t be able to open up communications with the MS.
So, what creates this entropy we need? Bluntly put, a selection of hardware components that are likely to produce unpredictable data, like a keyboard, a mouse and a monitor or video card. Of course there’s a lot more to it, but we really don’t need to know that. What we need to know is that there has to be a “bit bucket” of more than 256 bytes of entropy for the certificate creation process to succeed. We also need to know that more enterprise-ish servers, like rack or blade servers, tend to be void of the directly attached keyboards, mice and monitors that the Linux kernel uses to generate entropy.
And herein lies the problem. A new server that is not in full service yet (likely, since we are trying to deploy the monitoring on it) doesn’t have much random data flowing through its hardware, and with no keyboard, mouse or monitor connected there is quite a risk that the system entropy is going to be very low. Of the Linux systems that I have been deploying OpsMgr agents to, about half have failed because of “Not enough entropy”.
So, here are the steps I usually take to ensure that discovery works. I use PuTTY to connect to the soon-to-be-monitored servers. This guide also assumes that you have su rights on the system, since all of these steps (except #1) need them.
- Check your current entropy
cat /proc/sys/kernel/random/entropy_avail
Is it less than, or close to, 256? It probably is. If you don’t feel like connecting a mouse and wiggling it around to see if the entropy increases (not really feasible in a data center), you can generate your own random data.
- Generate your own random data.
Be advised that this forced entropy will not be as random as the system-created one and thus not as secure. How much more insecure it is, I don’t know, and quite frankly I prefer to have my systems monitored yet slightly less secure than not monitored at all. Anyway, you can force your own random data by running:
dd if=/dev/urandom of=~/.rnd bs=1 count=1024
This creates a .rnd file with 1024B of random data that the certificate creation process will use instead of the system entropy if the file exists.
- Uninstall and re-discover
The first failed attempt of discovery will most likely leave a non-working agent installation that we have to remove. Otherwise we will just be stuck with an “Access Denied” error. Run:
rpm -e scx
Now, try to discover the system again.
- Failed again?
Try generating the certificate manually by running:
/opt/microsoft/scx/bin/tools/scxsslconfig -f -v
/opt/microsoft/scx/bin/tools/scxadmin -restart
Retry discovery again.
- Still fails?
Uninstall the agent once more as instructed in step 3.
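Steps 1 and 2 can be rolled into a couple of small shell functions if you have many servers to prepare. This is only a sketch under my own assumptions: the function names are made up, the cut-off of 256 is simply the number used in step 1, and ENTROPY_FILE is only a variable so that the check can be pointed at a test file.

```shell
# Sketch of steps 1 and 2: check the kernel entropy and seed a .rnd file.
# ENTROPY_FILE can be overridden, which is handy for testing.
ENTROPY_FILE="${ENTROPY_FILE:-/proc/sys/kernel/random/entropy_avail}"

entropy_is_low() {
    # Succeeds when the reported entropy is at or below the threshold ($1, default 256)
    [ "$(cat "$ENTROPY_FILE")" -le "${1:-256}" ]
}

seed_rnd_file() {
    # Write 1024 bytes from /dev/urandom to the given file (default: ~/.rnd)
    dd if=/dev/urandom of="${1:-$HOME/.rnd}" bs=1 count=1024 2>/dev/null
}

# Usage, as the user that runs the agent install:
#   if entropy_is_low; then seed_rnd_file; fi
```

Nothing happens until you call the functions, so the file can be sourced safely before deciding whether the seeding is needed.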
These steps have solved my problems 100% on both SUSE and RedHat, and hopefully they will help you too.
Interestingly enough, these problems seem to be connected to some changes in the 2.6 kernel, and basically everything that uses SSL-ish certificates will be affected, even though the symptoms may be a bit more subtle, like time-outs and disconnects. For “headless” servers like those I usually administer, where the random data tends to be much lower, there’s even specialised hardware whose sole purpose is to generate random data, like the Entropy Key. I have also been told that new servers are likely to be equipped with entropy chipsets to make sure that there’s chaos enough to avoid these new-found oddities.
Sources:
http://social.technet.microsoft.com/Forums/en-US/crossplatformsles/thread/f94ec905-23ac-4444-b9f8-644fec3ae357
http://www.askrenzo.com/oracle/SCOM/SCOM_discovering_nodes.html
In SQL Server 2005 and 2008 the local Administrators group is not sysadmin by default. This makes it even more important that whoever sets up the database server also remembers to add a SQL Server admins group to the sysadmin role. If this step is forgotten, the user who installed the database server is the only one who will ever be sysadmin.
In some extreme cases I’ve seen situations where no one except some dude on vacation is sysadmin and there’s a bunch of applications that need to be installed or upgraded. In these cases I have been assigned local administrator rights on the server, but since the local Administrators group isn’t sysadmin either I still cannot log in to the SQL Server.
What to do!?
Thanks to Raul Carcia’s blog post it’s not that big a deal. The instructions are written for SQL Server 2005 but work equally well on SQL Server 2008, and the only requirement is that you are a local server administrator.
Here’s what to do:
- Open the SQL Server Configuration Manager.
- In SQL Server Services, open the properties for the SQL Server instance you need access to.
- In the Advanced tab, locate Startup Parameters.
- Add “;-m” to the end of that line.
- Press OK and restart the SQL Server into “Maintenance Mode” or “Single User Mode” if you like (check that a restart is OK first).
- Open a command prompt (right-click, “Run as Administrator” in Windows 2008) and go to C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
(C:\Program Files\Microsoft SQL Server\90\Tools\Binn\ for SQL2005)
- Execute
sqlcmd /A /E /S<SERVER\INSTANCE>
(use . for local default instance and .\INSTANCE for local named instance)
- In the CLI, execute:
EXEC sp_addsrvrolemember 'DOMAIN\yourusername', 'sysadmin';
GO
- Return to the SQL Server Configuration Manager and restore the Startup Parameters to their previous settings.
- Restart the SQL Server instance to allow users to access it again.
Now you should be able to log in to the SQL Server with sysadmin rights using your current user. This would also be a good point in time to actually establish a SQL Server Admins group (local or domain) and add it to the sysadmin role, to avoid having others do the above steps when you, yourself, happen to be on vacation.
As Raul Carcia points out in his original post, this is really a disaster recovery procedure and there’s definitely nothing sneaky about it, since it leaves quite a lot of trails in the event logs.
All in all, a great article by Raul, and all credit should go his way.
Microsoft has released an updated MP for SCCM SP2 (v6.0.6000.2, released on 10/28/2009) for OpsMgr R2.
The update basically contains the support for x64 that was missing in the previous release.
The Configuration Manager 2007 SP2 Management Pack adds support for monitoring Configuration Manager 2007 SP2 in a 64-bit environment with Operations Manager 2007 R2 or Operations Manager 2007 SP1 with hotfix (KB971541) installed. This enables the Configuration Manager 2007 SP2 Management Pack to work with either the 32-bit or the 64-bit Operations Manager 2007 agent. Except for the 64-bit support, the other features and guidance for Configuration Manager 2007 Management Packs remain intact.
(coloration added by me)
Read more and download here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=a8443173-46c2-4581-b3b8-ce67160f627b