Archive for .NET
Gibraltar 2.1.1 Released
We’ve published Gibraltar 2.1.1, and you can download it right away. There are a number of great enhancements in this release. We’ve already covered a few of them before:
- Metric Analysis Enhancements: Tease out useful information from metrics even with large samples or odd values.
- New Agent Message Alert: Easy ways to monitor all of the messages flowing through the Gibraltar Agent and take actions.
Based on feedback on the beta release we’ve added some additional capabilities:
- Send Session Now: If you subscribe to the Message Alert event you can send the current session immediately to the Hub or Email based on the current configuration by setting one property. Check out the code sample to see how.
- Easy email notifications: You can leverage the same email configuration the Agent is configured with to send messages within your application for any reason.
- Anonymous Data Collection: Session data can be anonymized during collection so no personally-identifying information is sent to you. Just set one option and you’re good to go.
- Detailed .NET Memory Counters: You can now enable detailed memory performance counters that track the .NET CLR’s garbage collector and memory usage. Very useful for spotting memory leaks in production applications.
- PostSharp Enhancements: We’ve made argument tracking more sophisticated so you can do more without compromising the performance of your application.
We also fixed a number of defects (23) that mostly apply to edge cases, but no defect is minor when it affects you. In particular, our CEIP identified an error on first-time startup in several cultures that prompted users to restart Analyst. Ouch. Fortunately we were able to figure it out and fix it. We also addressed the reuse of Managed Thread Ids.
Hub Subscriptions Live Too
On February 15 the Gibraltar Hub Service is fully live. You can get a free 30 day trial and then, if you like what you see, you can subscribe for terms from one month to one year at a scale that works for you – from a single laptop up to your whole large team. There’s no long term commitment, and you can even easily migrate from the Hub Service to your own private hub down the road if you want to.
This Release Made Possible By People Like You
We say it all the time, but this release in particular was driven entirely by end-user requests. We’re working on the next major release of Gibraltar but we stepped back and wanted to address requests from our customers and a few prospects as well. When you read the list of everything we’ve done, other than a few defects we found internally and through the CEIP this is based on what our customers felt was most important. Are we missing something you need? Let us know: we’ve proven we listen again and again and again.
You can read a thorough list of the new features, defect fixes, and changes at What’s New in Gibraltar 2.1.1.
Managed Thread Ids – Unique Id’s that aren’t Unique
We had a customer quiz us about why one of our thread names was showing up on some of their log messages. We looked into the problem and were a bit baffled. We name all of the threads we create inside the Agent to ensure we can separate what they do from any client application. The name in question is used by a thread that the Gibraltar Agent creates and then destroys relatively early in the process. This thread isn’t taken from the thread pool or put back into one. We confirmed it gets created and released, so there just seemed to be no way that they could be processing on our thread.
We checked the data up and down and were confident that it wasn’t a data corruption problem – the only assumption made by the code was that Managed Thread Ids are unique. This seemed pretty reasonable: the documentation for the ManagedThreadId property reads:
Thread.ManagedThreadId: Gets a unique identifier for the current managed thread.
But, we kept digging and found another scenario on a long running ASP.NET application where a similar event occurred – a thread that was created and destroyed relatively early in the application was clearly now in the thread pool and handling events. Researching more, we found this gem in the documentation. Not on the MSDN documentation for ManagedThreadId but rather for Thread.GetHashCode:
The hash code is not guaranteed to be unique. Use the ManagedThreadId property if you need a unique identifier for a managed thread.
OK, that still points to ManagedThreadId as the right choice for our use. But then there’s this note on the Thread Class itself:
GetHashCode provides identification for managed threads. For the lifetime of your thread, it will not collide with the value from any other thread, regardless of the application domain from which you obtain the value.
This started to raise some concern. That little bit of weasel room in the second sentence is troubling: “For the lifetime of your thread”… Was .NET reusing thread Ids after a thread exits? The wiggle room in the statement above made that sound possible, even though there’s no reason the hash code and the thread Id are necessarily related. My first read of this was that the qualification applied to the second part of the sentence: uniqueness across application domains (which we never assumed).
So we created a few brutal tests – creating and destroying threads then ramping up the thread pool’s activity. Sure enough, the same Managed Thread Ids showed up in the thread pool. These weren’t the same threads – the thread static variables we were using for tests had been reset – but they had the same Managed Thread Id.
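A test along these lines can be sketched in a few lines of C#. This is a minimal illustration, not the test harness we actually used: ThreadIdReuseProbe and FindReusedIds are hypothetical names, and whether (and how soon) the runtime hands an Id out again depends on the .NET version and on garbage collection, so the result may be empty on some runs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical probe: creates short-lived threads one after another and
// reports any ManagedThreadId that was handed to more than one of them.
public static class ThreadIdReuseProbe
{
    public static HashSet<int> FindReusedIds(int threadCount)
    {
        var seen = new HashSet<int>();
        var reused = new HashSet<int>();

        for (int i = 0; i < threadCount; i++)
        {
            int id = 0;
            var worker = new Thread(() => id = Thread.CurrentThread.ManagedThreadId);
            worker.Start();
            worker.Join(); // this thread has fully exited before the next is created

            // Every iteration used a distinct thread, so a repeated Id
            // means the runtime recycled it.
            if (!seen.Add(id))
                reused.Add(id);
        }

        return reused;
    }
}
```

Calling ThreadIdReuseProbe.FindReusedIds with a large count in a long-running process is one way to check whether a given runtime recycles Ids under your own workload.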
Go Team
The fix for us is to not rely on Managed Thread Id for correlating events to threads. Instead, we’re using an internal thread static variable to track the relationship and identify it with our own unique identifier. Because we track the thread responsible for log messages and many other things we record, we had to represent this in the smallest amount of data feasible and remain backwards/forwards compatible with existing data.
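The general technique looks roughly like the sketch below. It is an illustration of the idea rather than the Agent’s actual code: ThreadTracker and CurrentThreadIndex are hypothetical names, and the sketch assumes a 32-bit counter is large enough for the life of the process.

```csharp
using System;
using System.Threading;

// Hypothetical helper: gives each thread its own index that is never
// handed out again, unlike ManagedThreadId.
public static class ThreadTracker
{
    // Shared across all threads; only ever incremented.
    private static int _lastIndex;

    // One copy per thread; 0 means "no index assigned yet on this thread".
    [ThreadStatic]
    private static int _threadIndex;

    public static int CurrentThreadIndex
    {
        get
        {
            if (_threadIndex == 0)
                _threadIndex = Interlocked.Increment(ref _lastIndex);
            return _threadIndex;
        }
    }
}
```

Because the thread static slot starts at zero on every new thread, a thread that happens to receive a recycled Managed Thread Id still gets a fresh index the first time it asks for one.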
We’ve updated the display to automatically generate unique display names to separate out threads with the same Ids and had to do a range of other adjustments to ensure we treat the Managed Thread Id as nothing more unique than a display name. That way you’ll be sure that if two events are ascribed to something called “Thread 14”, they really are the same thread. All of the changes for this are included in Gibraltar 2.1.1 which will ship within the next few days (this was the last issue we needed to resolve before shipping).
Incomplete is worse than Missing
The frustrating part is that if the documentation had never made any claim about the uniqueness of the thread Id we’d likely have gone through a set of proof and qualification testing. Like many people, when there isn’t documentation on something we have to create experiments to tease out the true behavior, review source code, and then decide what risks we want to take. This is one reason we are passionate about documentation, even at the expense of extra features. We want to make sure that you never have a doubt about what something on our API does. We also know that people don’t want to review documentation if they don’t have to – so we try hard to make the API understandable just from Intellisense.
Now, I don’t want to knock Microsoft too hard here – .NET is a massive framework even if you just look at the core .NET 2.0 API. But as we all rely more and more on ever increasing layers of abstraction over what’s really going on, it’s more important than ever to be precise in the documentation about what something is and what it isn’t. Being precise is more important than being comprehensive, because it sets the right expectation for people about what they can rely on and what they’ll have to verify for themselves.
First, Do No Harm – Designing Robust Infrastructure
Several customers have requested a notification mechanism to be alerted when errors are detected in their programs. Simply raising an event is straightforward, but our promise to our customers is that we’ll do the hard thinking that ensures Gibraltar is safe and robust in production systems. Our mantra is: first, do no harm.
In this case, we asked ourselves questions like:
- What if a customer’s error notification logic is slow? How do we ensure that it doesn’t slow down the application as a whole?
- What if the program starts screaming thousands of errors? How do we ensure that we don’t swamp the error notification handler?
- What if there are errors in the customer’s error notification handler? What if it throws an exception? What if it hangs?
This resulted in a design that ensures that the logging infrastructure (including Gibraltar itself AND customer logic that interfaces with it) will be robust and safe.
Our central Log object in Gibraltar Agent now has a MessageAlert event that is raised when warning, error, or critical messages are recorded. This event has a number of safety features such as:
- Asynchronous: The event is raised on a background thread that is not part of the logging path, ensuring that time spent handling the event will not slow down logging or affect other threads.
- Batching: When a burst of qualifying messages is recorded, they will typically be raised together to allow more efficient processing.
- Throttling: A minimum delay between events can be easily specified to ensure the event isn’t raised too frequently, particularly in error cascade scenarios. Messages are batched up until the next time the event can be raised.
- Hang Protection: If the event handler never returns, the Agent will continue to process messages and not queue them, allowing them to be released from memory.
- Loop Protection: Messages that are recorded by your event handler will not cause additional events to be raised. This prevents notification loops where an event handler records an error during notification which subsequently causes the message alert notification to be raised again.
- Low Overhead: We don’t spin up anything (the threading, queue, etc.) until someone subscribes to the event so if you don’t use this feature it doesn’t take up resources either.
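To make the shape of such a design concrete, here is a minimal sketch of a dispatcher that covers the asynchronous, batching, throttling, loop protection and low overhead points. It is not the Agent’s implementation: AlertDispatcher, Subscribe and Publish are hypothetical names, messages are plain strings, and hang protection is left out entirely (a handler that never returns would let the pending list grow).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical dispatcher illustrating the safety features listed above.
public sealed class AlertDispatcher
{
    private readonly object _lock = new object();
    private readonly List<string> _pending = new List<string>();
    private readonly TimeSpan _minimumDelay;
    private Action<IList<string>> _handler;
    private Thread _worker; // low overhead: not created until someone subscribes

    // True only on the dispatch thread while the handler is running.
    [ThreadStatic]
    private static bool _insideHandler;

    public AlertDispatcher(TimeSpan minimumDelay)
    {
        _minimumDelay = minimumDelay;
    }

    public void Subscribe(Action<IList<string>> handler)
    {
        lock (_lock)
        {
            _handler = handler;
            if (_worker == null)
            {
                _worker = new Thread(DispatchLoop) { IsBackground = true, Name = "Alert Dispatcher" };
                _worker.Start();
            }
        }
    }

    // Called on the logging path: appends and returns, never runs the handler.
    public void Publish(string message)
    {
        if (_insideHandler) return; // loop protection: ignore messages logged by the handler itself

        lock (_lock)
        {
            if (_handler == null) return; // nobody is listening, so keep nothing
            _pending.Add(message);
            Monitor.Pulse(_lock);
        }
    }

    private void DispatchLoop()
    {
        while (true)
        {
            List<string> batch;
            Action<IList<string>> handler;

            lock (_lock)
            {
                while (_pending.Count == 0)
                    Monitor.Wait(_lock);

                // Batching: everything recorded since the last event goes out together.
                batch = new List<string>(_pending);
                _pending.Clear();
                handler = _handler;
            }

            _insideHandler = true;
            try { handler(batch); }
            catch (Exception) { /* a faulty handler must not take down the dispatcher */ }
            finally { _insideHandler = false; }

            // Throttling: messages arriving during the delay accumulate into the next batch.
            Thread.Sleep(_minimumDelay);
        }
    }
}
```

Note that the loop protection in this sketch only catches messages recorded on the dispatch thread itself; a handler that logs from another thread would need a different mechanism.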
The MessageAlert event is particularly useful for automatically triggering immediate data transmission in the case of an error and implementing your own error notification mechanism. The full detail of each log message is available in the event.
Check out our recent post on charting enhancements for more examples of how we are incorporating customer feedback to ensure that Gibraltar provides a robust logging infrastructure allowing you to build rock solid .NET software.
Upcoming Gibraltar Release
You may have noticed that we didn’t publish an update at the end of September, and here we are halfway through October. What’s up?
Well, we’re hard at work on a major update of Gibraltar. There’ll be a full announcement in the next week but separate from that we’ve been busy incorporating a lot of end-user feedback. Gibraltar’s been in enough users’ hands that now we’re getting a good stream of detailed feedback on places where we can make it better, and frankly we want to accommodate all of it. Short of that, we’re trying to hit as much as we can.
What’s in the next update?
Plenty! Here are some of the new features:
Analyst Repository Viewer
- Make bulk delete & update operations on sessions faster.
- Enable more keyboard-friendly shortcuts like Ctrl-A for select all in any grid that supports multi-select.
- Open only one copy of Analyst so users can repeatedly open packages and have them open into the same copy of Analyst.
Analyst Session Viewer
- Intelligently hide columns that don’t contain interesting data.
- Hide Log Message Details tabs that are never used for the current session, and make the others highlight better when they contain interesting data.
- Make it easier to see the full command line that was used to execute a session when it’s really long.
- Add a full screen view for a log message to make viewing really long log messages easier (we have sample data from customers where a single log message is multiple pages of text when printed)
- Intelligently resize the log message column for a more balanced display.
Agent
- Packages are about 30% smaller than before (but still backwards compatible).
- Have Packager automatically split up large packages into smaller chunks when sending via Email to make sure attachments don’t get rejected because of size.
We’ve also fixed every customer defect that has been reported as well as a set that we’ve found internally. There are some performance enhancements for logs that have interesting data cases such as very long captions without white space or badly structured XML data in log details as well. As always, full information will be in the release notes once we ship.
When Will This Be Available?
We’re going to announce a preview version in the next week in concert with the major feature announcement. The preview version will be available to Gibraltar licensed users immediately. It’ll be available for trial download at a later date once the preview program has concluded.
Want to get in on the action? Every license of Gibraltar includes 12 months of maintenance – which means you get every update we ship in that timeframe and access to priority support. Priority support isn’t just help when things go wrong, it’s also our ear listening carefully to how we can make the product more effective for you.
Everything we listed above for new functionality – every last one – came from requests from real customers. Your voice really counts with Gibraltar Software.


