Techspeak for the socially diminished

Let’s say you have followed this guide: http://support.microsoft.com/kb/938245/

Still not working? The one thing I forgot, or rather did not find in any of the guides, was to change the website application pool to “Classic .NET AppPool”. It is actually noted in KB938245 but only after the installation, during the configuration. For some reason I have not been able to install Reporting Services 2005 on Windows 2008 without changing this prior to the installation.

Maybe I am doing it wrong but this seems to be working all right for me.

What do you do when you cannot delete a file or folder on a Windows server?

Check the file permissions! And if that doesn’t help?

Check the share permissions! Yes, if it is a shared folder. And if that doesn’t help?

Check the file ownership! Great! But then what?

Well, the file could be in use, and then you would have to shut the locking process down and perhaps kick a user out. In a really bad scenario it could also be a symptom of a broken filesystem, a reserved filename (like “lpt1” or “PRN”) or even an invalid name (silly things like a space at the beginning or the end of a filename).
Another possible reason could actually be that the path to the file or folder is too long. You won’t actually get an error telling you that the file path exceeds the maximum length Windows can handle (MAX_PATH, 260 characters including the drive letter and the terminating null), just a simple “Access Denied”.
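
The name-related causes can be screened for with a few lines of script. This is a minimal Python sketch and not part of the original troubleshooting: the function name `name_problems` is made up for illustration, and the reserved list is the usual set of DOS device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9).

```python
# DOS device names that Windows reserves, with or without a file extension.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)

def name_problems(name):
    """Return the reasons why one file or folder name may be troublesome."""
    problems = []
    # Only the part before the first dot decides whether a name is reserved.
    stem = name.split(".", 1)[0].strip().upper()
    if stem in RESERVED:
        problems.append("reserved device name")
    if name != name.strip(" "):
        problems.append("leading or trailing space")
    if name.endswith("."):
        problems.append("trailing dot")
    return problems
```

Feed it every name from a directory listing and look closer at anything that comes back with a non-empty list.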

There are some more or less tedious work-arounds for the problem, like renaming all the directories to shorter ones, starting from the root, or using the old DOS names (8.3, like “dokume~1.doc”) that Windows can auto-generate for you. Personally, I have two favourite ways of handling this.

  1. Map the parent directory of the file/folder you are trying to access/delete as a network drive and access your files that way.
    This is particularly useful if the folder you are trying to access is on a DFS share, or on a share on the central file server with file paths like “\\servername01\Central Projects\Central Services\IT Department\Develop Methods for Automatically Deploying New Central Servers\2.2.1 Auto-Deploying SQL-Server 2005 Cluster\Documents\Preparations\Whitepapers\SQL Server 2005 Failover Clustering White Paper.doc”
  2. Create a new share to a folder further down the hierarchy. This works locally too. If you are logged on to, say, SRV01, you create a new share on “D:\Fileshares\Central Projects\Central Services\IT Department\Develop Methods for Automatically Deploying New Central Servers\” called “Autodeploymethods” and access it from “\\SRV01\Autodeploymethods\”. That way the file path stays below the length limit.
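
To find out which paths are affected before anyone hits the wall, you can walk the tree and list everything that reaches the limit. A minimal Python sketch, assuming the classic MAX_PATH of 260 characters (which includes the terminating null); the function names are made up for this example.

```python
import os

MAX_PATH = 260  # classic Windows limit, counting the terminating null

def is_too_long(path, limit=MAX_PATH):
    """True if the path does not fit, since one slot is reserved for the null."""
    return len(path) >= limit

def find_long_paths(root, limit=MAX_PATH):
    """Yield every file or folder path under root that reaches the limit."""
    for folder, subfolders, files in os.walk(root):
        for name in subfolders + files:
            full = os.path.join(folder, name)
            if is_too_long(full, limit):
                yield full
```

Run it against the path your users actually use. The same file has a different total length through “D:\Fileshares\...” than through a UNC or DFS path, which is exactly why the two work-arounds above help.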

Now. When designing fileservers, you really should think about how deep the filepaths may get. This is especially true on DFS-shares since you might have to deal with the full FQDN too, and not only the actual folder structure. Many big corporations I know use “codes” for departments and assign a project ID (quite simply a number or maybe an abbreviation) to each project, and use these for the fileshares too. Another scenario that could lead to similar problems is intranet sites where users can create and manage their own subsites and where filenames and folders are not stored in a database.

I have only seen this phenomenon on Windows systems so far, and I’ve actually used a Linux Live-CD on occasion when admin access is denied.

Read More:
http://support.microsoft.com/kb/320081

This update hasn’t shown up in the MP Catalog yet, but the System Center Operations Manager 2007 R2 Cross Platform Update can be downloaded here.

Besides SUSE 11 support, here’s the short overview.

The System Center Operations Manager 2007 R2 Cross Platform Update fixes a defunct process issue on Unix/Linux Servers and adds support for SUSE Linux Enterprise Server 11 (both 32-bit and 64-bit versions) and Solaris Zones.
Feature Summary:
The System Center Operations Manager 2007 R2 Cross Platform Update supports the monitoring of Unix/Linux Servers including:

  • Monitoring of SUSE Linux Enterprise Server 11 servers (both 32-bit and 64-bit versions)
  • Support of Solaris Zones
  • Fix for defunct Process issue
  • The Cross Platform Agent may not discover soft partitions on Solaris systems. Therefore, the disk provider may be unloaded, and the Cross Platform Agent may stop collecting information from the system disks.
  • The Cross Platform Agent may not restart after the AIX server reboots.

The latest versions of all the Operations Manager 2007 R2 Unix/Linux agents are included in this update.

Perfect timing, I must say, since I really need this today. :D

Update:
This is no small MP-update, which probably is the reason that we do not find it in the MP Catalog, but a ~250MB OpsMgr R2 Software Update. You need to run this on all Operations Manager Servers (RMS/MS, GW?) since it actually updates many of the Cross Platform agent binaries. It does add a new MP for SUSE 11 that you have to import from disk if you need it.

So, the installation goes somewhat like this:

  1. Install the Software Update (pick the right Architecture) on all OpsMgr R2 Servers
  2. Import the SUSE 11 MP if necessary
  3. Re-discover your Unix/Linux machines.

Files updated in this update for R2:

  • .\Microsoft.Enterprisemanagement.UI.Administration.dll (Version 6.1.7043.1)
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.aix.5.ppc.lpp.gz
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.aix.6.ppc.lpp.gz
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.hpux.11iv2.ia64.depot.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.hpux.11iv2.parisc.depot.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.hpux.11iv3.ia64.depot.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.hpux.11iv3.parisc.depot.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.rhel.4.x64.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.rhel.4.x86.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.rhel.5.x64.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.rhel.5.x86.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.sles.10.x64.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.sles.10.x86.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.sles.9.x86.rpm
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.solaris.10.sparc.pkg.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.solaris.10.x86.pkg.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.solaris.8.sparc.pkg.Z
  • .\AgentManagement\UnixAgents\scx-1.0.4-248.solaris.9.sparc.pkg.Z

Files added:

  • Microsoft.Linux.SLES.11.MP

All in all, the update contains the following fixes:

  • KB969342
  • KB973583
  • Q954049
  • Q956240

I’ve wrestled a bit with a critical status on one of the Organization States at a client’s site that won’t go back to green even though all the underlying monitors have gone back to green. And apparently I am not alone on this one. Others, like me, have read and re-read the MP guide in search of a monitor, rule or discovery with a forgotten override, and I don’t know how many times I’ve made a small change and tried resetting the health once again. Anyhow.
Marius Sutara posted an answer on the TechNet forums last week with a “fix” (-ish), or rather the acknowledgement that the problem is not a 40c. The problem might be related to other MPs as well, but I’ve only seen it on the new Exchange MP so far. In that same post, Pete Zerger provided links to two nifty little tools that will help you reset the health of the monitor.

In case you wonder why on earth I post when there’s already a “solution” out there: Pagerank, baby!
Not for me, but for the forum post, to make it show up earlier on Google.

Microsoft released an updated MP (v6.1.7533.0, released on 10/8/2009) for monitoring the health of the Operations Manager components.

Most significant updates, according to me, would seem to be:

Fixed an issue that was previously preventing all rules related to agentless exception monitoring from generating alerts.

Added the rule “Collects Opsmgr SDK Service\Client Connections” to collect the number of connected clients for a given management group. This data is shown in the view “Console and SDK Connection Count” under the folder “Operations Manager\Management Server Performance”.

Updated a number of monitors and rules to ensure that data is reported to the correct management group for multihomed agents.

Fixed the configuration of the rule “IIS Discovery Probe Module Execution Failure” so that the parameter replacement will now work correctly for alert suppression and generating the details of the alert’s description.

The rest is mostly polishing, fine-tuning and complementary updates. Nothing really ground-breaking here, but still a welcome update.

Download at: http://www.microsoft.com/downloads/details.aspx?FamilyID=61365290-3c38-4004-b717-e90bb0f6c148

According to the OpsMgr Team blog, Microsoft wants to know what you think about their SQL Server MP. It’s really hard to come by a better opportunity to express your feelings and desires about monitoring SQL Server, so don’t miss out on this one.

http://blogs.technet.com/momteam/archive/2009/09/25/sql-management-pack-survey-live-on-connect.aspx

If you are looking into replacing an Operations Manager 2007 Gateway Server (or just switching agents to another primary) for any reason, there’s a little more to consider than just right-clicking the clients and selecting “Change Primary Management Server” in the Operations Console.
You could end up with agents not being able to connect to the Management Group at all, due to a small problem with the order in which Operations Manager does things.

Here’s basically what happens:

  • You tell Operations Manager to change Primary Management Server for AGENTX from GW1 to GW2.
  • The SDK Service (I guess) tells GW1 that “You’re no longer the Primary Management Server for AGENTX”
  • GW1 acknowledges this and stops talking to AGENTX. And I mean completely stops talking to AGENTX.
  • OpsMgr then tells GW2 to start accepting communication from AGENTX.
  • OpsMgr tries to tell AGENTX that it should talk to GW2 since GW1 won’t listen.

Spotted the problem?
This modus operandi probably works when agents are on the same network and in the same domain, where fail-over is sort of automatic. The problem we are facing now is that the server tells the Gateway to stop accepting communication to and from the agent before the agent is notified that there is a new Gateway server to talk to. The agent will continue to talk to GW1 but will be completely ignored, and you will probably start seeing events with EventID 20000 in the Operations Manager event log on GW1.

How do I get around this little feature then?

No matter if you found this article after running into the mentioned troubles or if you are googling ahead of time to be prepared, the fix is the same and consists of a few PowerShell scripts. These scripts are out there already, but in different contexts, hence this post.

First step: Install the new Gateway

Documentation on this from Microsoft is good enough, but here’s the short version.

  1. Verify name resolution to and from Gateway server and Management Server
  2. Create certificate for the Gateway server
  3. Approve the Gateway server
  4. Install Gateway server
  5. Import certificates on Windows system
  6. Run MOMCertImport.exe on Gateway server to add the certificate into Gateway server configuration
  7. Wait

The wait is for the gateway server to get all needed configuration from RMS and to download all necessary management packs, run all the discovery scripts and so on. When the Operations Manager event log has calmed down a bit, move to step two.

Second step: Configure Agent Failover

Connect to an Operations Manager Command Shell. Any will do, as long as it’s connected to the correct Management Group.
Then run the following script:

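# The new gateway becomes primary, the old gateway stays on as fail-over
$primaryGW = Get-ManagementServer | where {$_.Name -eq 'GW2.domain.local'}
$failoverGW = Get-ManagementServer | where {$_.Name -eq 'GW1.domain.local'}
# All agents that currently report to the old gateway
$agents = Get-Agent | where {$_.PrimaryManagementServerName -eq 'GW1.domain.local'}
# Set primary and fail-over for those agents in one operation
Set-ManagementServer -AgentManagedComputer: $agents -PrimaryManagementServer: $primaryGW -FailoverServer: $failoverGW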

Remember to change “GW1.domain.local” to your OLD Gateway servername and “GW2.domain.local” to your NEW Gateway servername.
If you don’t know PowerShell, this script basically configures all agents using the old Gateway to use the new one as primary, but keep the old one as a fail-over server. The Gateways will still get to know the changes before the agents, but since the old one is still listening to the agents (though as the fail-over host) it will be able to tell them to go to the new one, GW2.

Ok, so I reinstalled my Linux partition with Ubuntu 9.04 x64 and decided to try EXT4 on the root partition. Like, yesterday.
Managed to get the Citrix client running (way easier on Ubuntu than Fedora, I’ll get back to that) and all without too much fuss.

First reboot gave me a “let’s FSCK!”. So I FSCK-ed and booted up to the desktop.

Second reboot gave me a “let’s FSCK!”. And I did. Booted to the desktop.

Third boot went smoothly, but all of a sudden all the icons decided to go AWOL. Rebooted again.

Fourth boot gave me a “let’s FSCK!”. I replied with “Well FSCK You!”

Fifth boot gave me a “let’s FSCK!”. I rebooted back to Windows 7.

Tonight I am reinstalling Ubuntu 9.04 x64 with EXT3.

Just wanted to raise a word of caution about the TCP Port Check in Operations Manager 2007.

Some customers have noticed that the system logs on some Unix machines are completely swamped with “connection error”, “TCP Connect failed”, “TCP Session Lost” and similar messages. After a bit of research the problematic servers were narrowed down to those monitored by Operations Manager. Specifically, those that are targeted by a TCP Port Check.

It would seem like the TCP-connection never fully initializes on the target server. Kind of like knocking on your neighbour’s door and then hiding. Then when the door opens, no one is there.

Maybe there’s a setting somewhere to modify how “deep” a Port Check should go before closing. Perhaps fully initializing and then sending a proper “Close” instead of just cutting the connection. In a few extreme cases we have noticed that the target server even goes so far as to start a session that never ends, since there’s no closure, and finally has no sessions to spare for the real users. But on most servers it’s just an annoyance, since the “real” errors are very hard to find among all the connection-related log entries.
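
To show what a “polite” port check could look like, here is a minimal Python sketch. It is not how Operations Manager implements its TCP Port Check (I don’t know the internals), just an illustration of the idea: complete the connection, then shut it down in both directions before closing. The function name `check_tcp_port` is made up for the example.

```python
import socket

def check_tcp_port(host, port, timeout=3.0):
    """Return True if a full TCP connection to host:port can be established."""
    try:
        # create_connection completes the three-way handshake or raises OSError.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    try:
        # Announce that we are done sending and receiving, so the other
        # side sees an orderly close rather than a connection that vanishes.
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # the peer may already have closed; nothing more to do
    finally:
        sock.close()
    return True
```

Whether the target service logs this as a clean disconnect or still complains about a session without any data depends on the service, so keep watching the logs either way.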

Anyway. Just a good thing to keep in mind when running TCP Port Checks from Operations Manager 2007. Keep an eye on the logs when implementing the port checks.

I get this question every now and then and every time I find myself completely flabbergasted and having to look things up once again. To avoid wasting my time on the same question once again and perhaps help others doing the same, here’s a little guidance.

Don’t get me wrong now.
SQL Express has its applications, and for a free database server it’s not half-bad. Small development sites, minor and not that important systems with lower performance and feature demands, minor website databases et cetera could do well with SQL Express.

Here’s my list of questions you have to ask to find if SQL Express is the correct choice.

  1. Do your applications support SQL Express?
    If your application developers cannot say “Yes” to this, you’re out of luck. You could probably get their applications to run on SQL Express anyway, but application support if something goes bad will most likely be zilch.
  2. Do your applications fit the hardware limitations?
    SQL Express is limited to 1GB RAM, 1 CPU and 4GB of databases. 1GB of RAM seems a bit tight to me for any production data. Also, on SQL Express 2005, according to Microsoft, you cannot run parallel queries.
    “SQL Server Express can install and run on multiprocessor machines, but only a single CPU is used at any time. Internally, the engine limits the number of user scheduler threads to 1 so that only 1 CPU is used at a time. Features such as parallel query execution are not supported because of the single CPU limit.”
    If this is still true on SQL Express 2008, I don’t know and I haven’t found any information about it (yet).
    When answering this question, remember to calculate expected growth and possibly new databases/applications too.
  3. Do your applications use database replication?
    If so, does the new server need to act as a publisher? If yes, then you’re out of luck. SQL Express does handle database replication, but only as a subscriber. If you need to publish data, then you need a “bigger” SQL Edition.
  4. Do you need Database Mail?
    SQL Express does not have Database Mail. You have to find other ways to code your notifications. This question has raised counter-questions from customers, such as “What would I need Database Mail for?”. It is evidently a feature not used by many. Personally, I find it useful. Clay McDonald has a nice blog-post on how to make SQL-triggers send mail on, for instance, inserts into a table using Database Mail. You could of course have it send mail on deletions as well. In my mind, this might come in handy in user-databases in CRM- or HR-systems. Every time an employee gets deleted from the database, the HR-admin could receive a notification.
  5. Do you need the SQL Agent?
    Perhaps not. Maybe you feel comfortable with scheduling your database backups using the Windows scheduler and homebrew scripts. Just make sure your monitoring software (or IT-personnel) discovers when the script fails. An increasing number of applications, like Microsoft’s App-V, require the SQL Agent to schedule and monitor recurring tasks. Without the SQL Agent, the databases would grow ad infinitum. How about index maintenance? This can also be done using your own scripts and the Windows scheduler. SQL Express can do most maintenance tasks you would need using scripts and T-SQL. The SQL Agent just makes it simpler and more manageable. Once again, double-check this with the application developer.
  6. Does your application use SSIS/DTS-jobs?
    This is not included in SQL Express. Maybe there’s a work-around, but I haven’t found it and I doubt it is supported by anyone.
  7. Do you need to be able to troubleshoot performance problems?
    You can do this on SQL Express with a great deal of knowledge and timers. The SQL-profiler, the Performance Data Collection and the Database Tuning Advisor make it easier. Specifically, the SQL-profiler comes in really handy when you suspect the application (not the system) to be the bad guy, since you can trace the queries and pin-point where the performance-hit resides. Using the SQL-profiler I have been able to optimise indexes and thus make database servers go from a 98% CPU Load to 3% CPU Load. I have also been able to pin-point specific queries and use them as “evidence” that the problem is bad/sloppy code rather than problems with the database server.
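
If you get this question as often as I do, the list can be turned into a quick pre-check. A minimal Python sketch, assuming the limits quoted in the list (1GB RAM, 1 CPU, 4GB of databases); the class and function names are made up for the example, and question 7 is left out since it is a matter of comfort rather than a hard blocker.

```python
from dataclasses import dataclass

# Limits as quoted in the list above; check them against your own version.
MAX_RAM_GB = 1
MAX_CPUS = 1
MAX_DB_GB = 4

@dataclass
class Requirements:
    vendor_supports_express: bool
    ram_gb: float              # memory the database engine is expected to need
    cpus: int                  # CPUs the workload is expected to need
    db_size_gb: float          # database size including expected growth
    replication_publisher: bool = False
    database_mail: bool = False
    sql_agent: bool = False
    ssis_or_dts: bool = False

def express_blockers(req):
    """Return the reasons SQL Express does not fit; an empty list means none found."""
    checks = [
        (not req.vendor_supports_express, "application is not supported on SQL Express"),
        (req.ram_gb > MAX_RAM_GB, "needs more than 1GB RAM"),
        (req.cpus > MAX_CPUS, "needs more than 1 CPU"),
        (req.db_size_gb > MAX_DB_GB, "needs more than 4GB of databases"),
        (req.replication_publisher, "must act as replication publisher"),
        (req.database_mail, "needs Database Mail"),
        (req.sql_agent, "needs the SQL Agent"),
        (req.ssis_or_dts, "uses SSIS/DTS jobs"),
    ]
    return [reason for failed, reason in checks if failed]
```

An empty list does not mean SQL Express is a good idea, only that none of the usual pit-falls were hit.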

There’s more, of course, but these points are the most common pit-falls in my experience. As you can see, there are three “do you need”-questions and they are highly optional. Far from everyone uses these features, often because of lacking SQL Server knowledge. You don’t know what you can do. Still, the most important question is #1. Is SQL Express a supported database server for your applications? Hopefully, the developer knows the answer to this directly. No maybes. Yes or No.

Personally, I find that if you need a database server for production data, you shouldn’t go for SQL Express. Many customers have gone that way because “it’s free!” just to find themselves in the midst of a SQL Server upgrade and database migration a year later.