Archive for .NET
Gibraltar 2.2 Beta 1 Now Available
The first beta of Gibraltar 2.2 is now available to download. The big picture items in this release are:
- New add-in integration API: You can do some fun stuff with this, like exporting data to your own data warehouse or integrating with your favorite defect tracker. We’ve provided some samples to get you started and make sure it’s easy.
- System tray and run at login integration: Particularly if you’re using Gibraltar Hub this is a killer feature that you have to experience. It makes integrating Gibraltar into your personal workflow much more fluid.
- View errors without opening sessions: You can preview the errors in a session without having to open it up to decide if it’s worth drilling into. This saves a lot of time when reviewing large numbers of sessions.
- Add metric groups to charts in one step: You can add any number of metrics to a chart at once just by dragging the folder that contains them onto the chart. It’ll sort out the compatibility for you in one step. If you’ve been using metrics, particularly using our aspects for PostSharp, this feature will save you a lot of time.
A hidden feature for existing Gibraltar users is that we’ve dramatically reduced the memory used for packaging and sending sessions via email or writing them to files. Basically, we’ve optimized the case where a session isn’t fragmented so the data is never buffered in memory, resulting in a small and predictable memory footprint even for very large sessions.
Safe for production use
This is considered a Beta release because of the new Add-In integration API, which affects Analyst only. Otherwise it has passed all of our production tests and is safe to deploy and use broadly. We are likely to make changes to the Add-In API for the final release based on your feedback and our own experience.
We will provide our normal full support for this release at least through shipping the final release of 2.2.
Path to final release
We expect to wrap up the beta of 2.2 and ship the production version in the next 6-8 weeks depending on your feedback. We want to make sure there are several add-ins available on initial release and get in some other exciting capabilities we can’t talk about just yet before we close the books on Gibraltar 2 and move on to our next major release.
This is a great chance to influence the design of a major feature of Gibraltar – play around with creating add-ins and let us know what you think. What are we missing? What type of add-ins do you want to see (perhaps there’s someone else that wants the same thing)? Drop us a line and let us know.
Trial users
We’d encourage you to try out this new release as well, particularly if you think you’ll get more out of the new features we’ve added. All you need to do is register for an account (free and fully automated) and you can get it. Your existing 14 day anonymous trial or 30 day trial key will continue just fine with this beta release.
Gibraltar 2.1.1 Released
We’ve published Gibraltar 2.1.1, and you can download it right away. There are a number of great enhancements in this release. We’ve already covered a few of them before:
- Metric Analysis Enhancements: Tease out useful information from metrics even with large samples or odd values.
- New Agent Message Alert: Easy ways to monitor all of the messages flowing through the Gibraltar Agent and take actions.
Based on feedback on the beta release we’ve added some additional capabilities:
- Send Session Now: If you subscribe to the Message Alert event you can send the current session immediately to the Hub or Email based on the current configuration by setting one property. Check out the code sample to see how.
- Easy email notifications: You can leverage the same email configuration the Agent is configured with to send messages within your application for any reason.
- Anonymous Data Collection: Session data can be anonymized during collection so no personally-identifying information is sent to you. Just set one option and you’re good to go.
- Detailed .NET Memory Counters: You can now enable detailed memory performance counters that monitor the .NET CLR’s garbage collector and memory usage. Very useful for watching for memory leaks in production applications.
- PostSharp Enhancements: We’ve made argument tracking more sophisticated so you can do more without compromising the performance of your application.
We also fixed a number of defects (23) that mostly apply to edge cases, but no defect is minor when it affects you. In particular, our CEIP identified an error on first-time startup in several cultures that prompted users to restart Analyst. Ouch. Fortunately we were able to figure it out and fix it. We addressed Thread Ids too.
Hub Subscriptions Live Too
On February 15 the Gibraltar Hub Service is fully live. You can get a free 30 day trial and then, if you like what you see, you can subscribe for terms from one month to one year at a scale that works for you – from a single laptop up to your whole large team. There’s no long term commitment, and you can even easily migrate from the Hub Service to your own private hub down the road if you want to.
This Release Made Possible By People Like You
We say it all the time, but this release in particular was driven entirely by end-user requests. We’re working on the next major release of Gibraltar but we stepped back and wanted to address requests from our customers and a few prospects as well. When you read the list of everything we’ve done, other than a few defects we found internally and through the CEIP this is based on what our customers felt was most important. Are we missing something you need? Let us know: we’ve proven we listen again and again and again.
You can read a thorough list of the new features, defect fixes, and changes at What’s New in Gibraltar 2.1.1.
Managed Thread Ids – Unique Id’s that aren’t Unique
We had a customer quiz us about why one of our thread names was showing up on some of their log messages. We looked into the problem and were a bit baffled. We name all of the threads we create inside the Agent to ensure we can separate what they do from any client application. The name in question is used by a thread that the Gibraltar Agent creates and then destroys relatively early in the process. This thread isn’t taken from the thread pool or put back into one; we confirmed it gets created and released, so there seemed to be no way that they could be processing on our thread.
We checked the data up and down and were confident that it wasn’t a data corruption problem – the only assumption made by the code was that Managed Thread Ids are unique. This seemed pretty reasonable: the documentation for the ManagedThreadId property reads:
Thread.ManagedThreadId: Gets a unique identifier for the current managed thread.
But, we kept digging and found another scenario on a long running ASP.NET application where a similar event occurred – a thread that was created and destroyed relatively early in the application was clearly now in the thread pool and handling events. Researching more, we found this gem in the documentation. Not on the MSDN documentation for ManagedThreadId but rather for Thread.GetHashCode:
The hash code is not guaranteed to be unique. Use the ManagedThreadId property if you need a unique identifier for a managed thread.
OK, that still points us to ManagedThreadId as the right choice for our use. But then there’s this note on the Thread Class itself:
GetHashCode provides identification for managed threads. For the lifetime of your thread, it will not collide with the value from any other thread, regardless of the application domain from which you obtain the value.
This started to raise some concern. That little bit of weasel room in the first part of the second sentence is troubling: “For the lifetime of your thread”… Was .NET reusing thread Ids after a thread exits? The wiggle room in the statement above made that sound possible, even though there’s no necessary reason for the hash code and the thread Id to be related. My first read of this was that the hedge was about the second part of the sentence – uniqueness across application domains (which we never assumed).
So we created a few brutal tests – creating and destroying threads then ramping up the thread pool’s activity. Sure enough, the same Managed Thread Ids showed up in the thread pool. These weren’t the same threads – the thread static variables we were using for tests had been reset – but they had the same Managed Thread Id.
Go Team
The fix for us is to not rely on Managed Thread Id for correlating events to threads. Instead, we’re using an internal thread static variable to track the relationship and identify it with our own unique identifier. Because we track the thread responsible for log messages and many other things we record, we had to represent this in the smallest amount of data feasible and remain backwards/forwards compatible with existing data.
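To make the idea concrete, here is a rough sketch in Python rather than .NET. It is not the Agent's code: `own_thread_id` and `run_short_lived_threads` are names made up for this illustration, `threading.local()` stands in for a thread static variable, and `threading.get_ident()` stands in for the Managed Thread Id (Python likewise documents that its thread identifiers may be recycled once a thread exits).

```python
import threading

_next_id = 0
_next_id_lock = threading.Lock()
_local = threading.local()  # per-thread storage, the analogue of a thread static


def own_thread_id() -> int:
    """Return an identifier assigned once per thread and never handed out again."""
    global _next_id
    uid = getattr(_local, "uid", None)
    if uid is None:
        # First call on this thread: take the next value from a shared counter.
        with _next_id_lock:
            _next_id += 1
            uid = _next_id
        _local.uid = uid
    return uid


def run_short_lived_threads(count: int):
    """Start and join threads one at a time, recording both kinds of identifier."""
    runtime_ids, own_ids = [], []

    def work():
        runtime_ids.append(threading.get_ident())  # assigned by the runtime, may repeat
        own_ids.append(own_thread_id())            # assigned by us, never repeats

    for _ in range(count):
        t = threading.Thread(target=work)
        t.start()
        t.join()  # the thread has exited before the next one starts
    return runtime_ids, own_ids


if __name__ == "__main__":
    runtime_ids, own_ids = run_short_lived_threads(20)
    print("distinct runtime ids:", len(set(runtime_ids)))  # may be fewer than 20
    print("distinct own ids:    ", len(set(own_ids)))      # always 20
```

Because the per-thread storage is discarded when a thread exits, a new thread that happens to receive a recycled runtime identifier still starts with no stored value and is given a fresh number from the counter.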
We’ve updated the display to automatically generate unique display names to separate out threads with the same Ids, and made a range of other adjustments to ensure we treat the Managed Thread Id as nothing more unique than a display name. That way you’ll be sure that if two events are ascribed to something called “Thread 14”, they really are the same thread. All of the changes for this are included in Gibraltar 2.1.1 which will ship within the next few days (this was the last issue we needed to resolve before shipping).
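One way to picture the display-name step is the small Python sketch below. It is an illustration only: the function name `display_names` and the "Thread 14 (2)" suffix format are assumptions made for this example, not what Analyst actually shows.

```python
from collections import defaultdict


def display_names(threads):
    """Give every distinct thread a display name that no other thread shares.

    `threads` is an iterable of (unique_key, managed_id) pairs in order of
    first appearance, where unique_key is our own per-thread identifier.
    """
    seen = defaultdict(int)  # managed_id -> number of distinct threads seen with it
    names = {}
    for unique_key, managed_id in threads:
        if unique_key in names:
            continue  # same thread seen again, keep the name it already has
        seen[managed_id] += 1
        occurrence = seen[managed_id]
        if occurrence == 1:
            names[unique_key] = f"Thread {managed_id}"
        else:
            # A later thread reusing the same managed Id gets a numbered suffix.
            names[unique_key] = f"Thread {managed_id} ({occurrence})"
    return names
```

The managed Id only contributes to the label here; the lookup key is always the separate unique identifier, so two events with the same label really do belong to the same thread.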
Incomplete is worse than Missing
The frustrating part is that if the documentation had never made any claim about the uniqueness of the thread Id we’d likely have gone through a set of proof and qualification testing. Like many people, when there isn’t documentation on something we have to create experiments to tease out the true behavior, review source code, and then decide what risks we want to take. This is one reason we are passionate about documentation, even at the expense of extra features. We want to make sure that you never have a doubt about what something in our API does. We also know that people don’t want to review documentation if they don’t have to – so we try hard to make the API understandable just from Intellisense.
Now, I don’t want to knock Microsoft too hard here – .NET is a massive framework even if you just look at the core .NET 2.0 API. But as we all rely more and more on ever increasing layers of abstraction over what’s really going on, it’s more important than ever to be precise in the documentation – about what something is and what it isn’t. Being precise is more important than being comprehensive, because it sets the right expectation for people about what they can rely on and what they’ll have to verify for themselves.
First, Do No Harm – Designing Robust Infrastructure
Several customers have requested a notification mechanism to be alerted when errors are detected in their programs. Simply raising an event is straightforward, but our promise to our customers is that we’ll do the hard thinking that ensures Gibraltar is safe and robust in production systems. Our mantra is: first, do no harm.
In this case, we asked ourselves questions like:
- What if a customer’s error notification logic is slow? How do we ensure that it doesn’t slow down the application as a whole?
- What if the program starts screaming thousands of errors? How do we ensure that we don’t swamp the error notification handler?
- What if there are errors in the customer’s error notification handler? What if it throws an exception? What if it hangs?
This resulted in a design that ensures that the logging infrastructure (including Gibraltar itself AND customer logic that interfaces with it) will be robust and safe.
Our central Log object in Gibraltar Agent now has a MessageAlert event that is raised when warning, error, or critical messages are recorded. This event has a number of safety features such as:
- Asynchronous: The event is raised on a background thread that is not part of the logging path, ensuring that time spent handling the event will not slow down logging or affect other threads.
- Batching: When a burst of qualifying messages is recorded, they will typically be raised together in a single event to allow more efficient processing.
- Throttling: A minimum delay between events can be easily specified to ensure the event isn’t raised too frequently, particularly in error cascade scenarios. Messages are batched up until the next time the event can be raised.
- Hang Protection: If the event handler never returns, the Agent will continue to process messages but will not queue them for the event, allowing them to be released from memory.
- Loop Protection: Messages that are recorded by your event handler will not cause additional events to be raised. This prevents notification loops where an event handler records an error during notification which subsequently causes the message alert notification to be raised again.
- Low Overhead: We don’t spin up anything (the threading, queue, etc.) until someone subscribes to the event so if you don’t use this feature it doesn’t take up resources either.
The MessageAlert event is particularly useful for automatically triggering immediate data transmission in the case of an error and implementing your own error notification mechanism. The full detail of each log message is available in the event.
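The safety features listed above can be sketched in a few dozen lines. This is a simplified Python illustration, not the Agent's implementation or its API: the class `AlertDispatcher`, its methods, and the severity strings are all invented for the example, and a bounded queue that drops messages when full is used as a crude stand-in for hang protection.

```python
import queue
import threading
import time


class AlertDispatcher:
    """Raises alerts on a background thread, batched and throttled."""

    def __init__(self, min_delay: float = 0.0, max_pending: int = 1000):
        self._min_delay = min_delay                # throttling: seconds between raises
        self._queue = queue.Queue(maxsize=max_pending)
        self._handlers = []
        self._worker = None
        self._lock = threading.Lock()
        self._in_handler = threading.local()       # loop protection flag, per thread
        self._last_raise = float("-inf")

    def subscribe(self, handler):
        with self._lock:
            self._handlers.append(handler)
            if self._worker is None:               # low overhead: start on first subscriber
                self._worker = threading.Thread(target=self._run, daemon=True)
                self._worker.start()

    def record(self, severity: str, text: str):
        """Called on the logging path; never waits for handlers."""
        if severity not in ("warning", "error", "critical"):
            return
        if getattr(self._in_handler, "active", False):
            return                                 # loop protection: sent by a handler
        if self._worker is None:
            return                                 # nobody subscribed, nothing to do
        try:
            self._queue.put_nowait((severity, text))
        except queue.Full:
            pass                                   # handler is stuck or slow: drop, don't grow

    def _run(self):
        while True:
            batch = [self._queue.get()]            # wait for the first qualifying message
            wait = self._last_raise + self._min_delay - time.monotonic()
            if wait > 0:
                time.sleep(wait)                   # throttle; later messages pile up meanwhile
            while True:                            # batching: take everything queued so far
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            self._in_handler.active = True
            try:
                for handler in list(self._handlers):
                    try:
                        handler(batch)
                    except Exception:
                        pass                       # a faulty handler must not break logging
            finally:
                self._in_handler.active = False
                self._last_raise = time.monotonic()
```

In this sketch `record` does nothing more than a non-blocking enqueue on the calling thread, so a slow handler costs the logging path nothing, and a handler that never returns only causes further messages to be dropped once the queue is full.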
Check out our recent post on charting enhancements for more examples of how we are incorporating customer feedback to ensure that Gibraltar provides a robust logging infrastructure allowing you to build rock solid .NET software.


