Archive for Logging

Following up on our announcement of Gibraltar Hub, I wanted to talk a bit more about what it does, how it works and our design goals in creating it.

We’ve always had an eye to making it as easy as possible to get diagnostic data from where your application is run back to the developers that can make good use of it.  We built in a packager that could gather up all of the data and send it to you via email or write it to a single, highly compressed file.  But we always knew we also wanted a way to transmit logs over basic web protocols.

What is Gibraltar Hub?

With the Gibraltar Hub we’ve created a service that listens for Gibraltar data sent to it via email or our RESTful web API.  We’ve updated the Gibraltar Agent to be able to send data to the Hub just like it can send via email or write to a file.  On the other side we’ve updated the Gibraltar Analyst to be able to retrieve all of the data sent to a Hub automatically in the background.

The three main Gibraltar components: Agent, Analyst and Hub

It sounds like a small thing – we put a web service between the Agent and the Analyst – but the effect on using Gibraltar is remarkable.  First, it means that diagnostic data automatically flows to you without any extra work on your part.  When you have hundreds or thousands of installed copies of your software that’s key.  It makes it practical to gather and analyze all that data.

Second, multiple users can subscribe to the same Hub, meaning each gets a copy of the data.  This keeps your whole team in the loop on what’s happening in the field.  You can have your laptop and desktop both set up to get data so you can provide great customer support in the field and in the office.

Third, it still works with email.  Based on suggestions from our beta users we added the ability for the Hub to connect to a mailbox and automatically pull in all of the Gibraltar data sent to it.  This provides a few neat capabilities:

  • End users can email in data collected from computers that can’t talk to the Hub directly.
  • You can start off using email and easily migrate to the Hub without having to change the configuration of the Agents.
  • You can use email as a fallback if for some reason a computer can’t access the web but can send email.
  • You can selectively forward sets of log data to your whole team by just forwarding the email.  This may fit well with your existing customer service system if it’s receiving the mail first.

It Just Works

Mobile User Support

Like many of our customers, we’re all over the place from day to day so it’s essential that our support infrastructure works wherever we are.  We designed the Hub for this exact scenario because only the Hub has to be in a well known, accessible location:  The Agents send to it, and the Analysts get data from wherever they are.

Large Sessions?  Bad Connections?  Firewalls?

We wanted to be absolutely sure that the Hub would Just Work.  That meant even if you created really large log files or had a very dodgy network connection it still had to function without complaint.   To support this, the Hub uses a chunked encoding method for uploading and downloading data to ensure that it’s efficient over bad or slow connections and won’t get halfway through sending a file only to discover that the web server won’t accept it because it’s too large.
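To make the idea of chunked transfer concrete, here is a minimal sketch of splitting a session file into fixed-size pieces before sending. It is an illustration only: the Hub's actual wire format isn't described in this post, and the class name, method name and 4096-byte chunk size are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: not the Hub's real protocol, just the general technique.
public static class ChunkedTransferSketch
{
    // Reads the source stream in pieces of up to chunkSize bytes.
    // Each piece is returned as its own array so it could be sent separately.
    public static IEnumerable<byte[]> ReadChunks(Stream source, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize];
        int bytesRead;
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            byte[] chunk = new byte[bytesRead];
            Array.Copy(buffer, chunk, bytesRead);
            yield return chunk;
        }
    }

    public static void Main()
    {
        // Stand-in for a session file: 10,000 bytes held in memory.
        using (MemoryStream session = new MemoryStream(new byte[10000]))
        {
            long offset = 0;
            int index = 0;
            foreach (byte[] chunk in ReadChunks(session, 4096))
            {
                // A real sender would upload the chunk here and retry only this piece on failure.
                Console.WriteLine("Chunk {0}: offset {1}, {2} bytes", index, offset, chunk.Length);
                offset += chunk.Length;
                index++;
            }
        }
    }
}
```

With pieces this small, no single request is large enough to be rejected by a server size limit, and a dropped connection costs at most one piece rather than the whole file.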

We chose to use a RESTful API for the Hub with an eye to making it as compatible as possible with even the most strict firewall configurations.  We work with customers that categorically block all SOAP web services and do other interesting content restrictions.  Without being downright subversive, we wanted to make sure the Agent could always get data through.

On the Analyst side, it works in the background to stream the data from the Hub to your local repository so it’s available to you even if you lose your network connection.

Want to get started?

Upon release you can either purchase your own Private Hub to run on your own server or you can use the Gibraltar Hub Service that we host.

If you’re a registered user of Gibraltar you can try out the Gibraltar Hub Service right now.  Just contact us to get your preview account set up.  Depending on feedback we receive, we expect the general release to be in late December or early January.

If you hadn’t seen it in our previous post, here’s a short video tour (3 min) of Gibraltar Hub:

Categories : Development, Logging
Oct
26

Announcing Gibraltar Hub for Easy CEIP

Posted by: Jay | Comments (2)

Gibraltar Hub is our new server-based product that works with Gibraltar Agent and Gibraltar Analyst to deliver an end-to-end solution for creating customer experience improvement programs (CEIP) as well as remote debugging for customer support. We have been quietly developing and testing Hub for months and are thrilled with how well it’s working. We’ll be releasing it commercially later this Fall and are now inviting existing Gibraltar customers to participate in our beta testing program.

We’ve created a short video tour (3 min, below) to give you a sense of Gibraltar Hub as well as a podcast (8 min) of a conversation between Kendall and me talking about Gibraltar Hub and the problems it solves. You can read an abridged version of the interview below and we’ll be posting more technical details later this week.

If you like what you see and want to participate in our beta program, please shoot me an email.

More about Gibraltar Hub

What problems does Gibraltar Hub solve?

Gibraltar Hub is designed to address a couple of key scenarios:

  • Collecting data from many application instances, such as installed copies of a commercial software product, even past firewalls.
  • Customer Experience Improvement Programs (CEIP) for proactively gathering feedback on application performance in the field through continuous data collection and analysis.

How does Gibraltar Hub complement Gibraltar Agent and Gibraltar Analyst?

Gibraltar Hub sits between Gibraltar Agent and Gibraltar Analyst making it easier to get data from users to the development team. It’s a web service providing two interfaces: one for Agents to submit logs, the other allowing logs to stream down into Analysts. With Gibraltar Hub you can collect, manage and analyze thousands of logs and provide every member of your development team with a consistent, near real-time view of all that data as well as simple, powerful tools to analyze the data and gain new insights into the areas of your applications most needing improvement.

Is Hub required to use Gibraltar?

No, Hub is totally optional. The existing email and file transfer mechanisms in Agent will continue to be supported. However, we believe Hub will provide the best user experience because both Analyst and Agent have been enhanced to support secure, reliable, background data transfers with Hub. This means applications can be configured to silently stream logs in the background and the development team sees new data automagically appear like new mail popping into your inbox.

How does Hub help development teams?

Gibraltar Analyst has always made it easy to import and export packages containing logs. But some of our customers with large user communities or development teams with multiple members found it challenging to ensure that everyone had a consistent view of all the relevant data. With Gibraltar Hub each team member can subscribe to a shared feed and have all the data available and continuously updated.

What is a Customer Experience Improvement Program (CEIP) and how does Gibraltar Hub help?

Microsoft coined the term Customer Experience Improvement Program (CEIP) and describes it like this:

CEIP collects information about how our customers use Microsoft programs and about some of the problems they encounter. Microsoft uses this information to improve the products and features customers use most often and to help solve problems. Participation in the program is voluntary, and the end results are software improvements to better meet the needs of our customers.

The three components of Gibraltar correspond directly with the three key challenges for development teams wishing to create their own CEIP:

  • Agent efficiently collects data about how customers use programs and records details on problems they encounter.
  • Hub provides reliable, secure transmission of log data from end-users to each member of the development team. Data is highly compressed and the transfer protocol is firewall-friendly and reliable even when network connectivity is limited and intermittent.
  • Analyst indexes all that data and provides powerful visualization tools that help team members identify broad patterns spanning many logs as well as the ability to drill into each log to pinpoint the root cause of a single issue.

Does automatic transmission of logging data raise any privacy concerns?

Yes, dealing responsibly with information privacy is extremely important. At the same time, the considerations vary widely between different applications so we think that it’s important for Gibraltar to provide the flexibility to fit within a broad range of usage scenarios. For example, having a dialog for CEIP opt-in is a recommended best practice for a commercial software product, but for an in-house corporate application that is only used by employees, their employment agreement or computer login screen may already require informed consent to certain information being monitored. With this in mind, the default configuration settings for Gibraltar only transmit data on demand with explicit user consent. And we also make it easy to enable automatic background log transmission when appropriate.

UPDATE: Kendall has written a nice follow-up post on what Gibraltar Hub is, how it works and why we created it.

Want to get started?

Upon release you can either purchase your own private Hub to run on your own server or subscribe to the Gibraltar Hub Service we host. If you’re a registered user of Gibraltar you can try out the Gibraltar Hub Service right now. Just contact us to get your preview account set up. Depending on the final user feedback from this preview we expect the general release to be in the next 4 to 8 weeks.

Categories : Development, Logging

One key requirement of the Gibraltar Agent is to be able to manage the data files it creates on disk to ensure that they can’t grow out of control.  After looking at a lot of options, we determined that we need a central index to track the locally generated files.  The problem was that it had to be absolutely safe to use from multiple processes without risk.  We had some simple xml-based ideas to solve the problem, but early prototypes were not encouraging.

Fortunately, before we launched into getting more and more aggressive with solving the problem someone on the team stumbled over VistaDB, a fully managed database that we could merge into our agent.  We wrote a quick technical prototype and were impressed:  Not only could we safely throw all the data we needed to track into it even in an extended run of our torture test, but we were able to do the evaluation of whether we needed to prune files or not (and what files to prune) within it which made for a fast, clean, maintainable index.

Even with the success of this prototype, we were very reluctant to go down this road.  Our previous generation solution had used MS Jet for a data store, and it had been a source of problems.  We had ported that to SQL Server, and SQL had become a source of problems for some of our clients.  We’d internalized the lesson that databases and logging do not mix if you want an easy to deploy, foolproof system.  We decided to cast a wider net and look at a range of options:

  • SQL Express / SQL Server Embedded: Microsoft’s free offerings.  SQL Express was right out because it was a Windows service and would make our deployment complicated and huge.  SQL Server Embedded couldn’t solve our problem because only one process can access the database files at a time.  And oh yeah, it would make our deployment somewhat complicated and large.
  • Other third parties: Without getting into an exhaustive list, we decided it really had to be a completely .NET native, managed implementation that we could merge with our assembly because we were only going to ship one assembly for the agent.  Furthermore, it needed to support syntax at least largely similar to SQL Server so we wouldn’t have to master multiple environments.

A Dashboard of sessions run locally

In the back of our minds was another consideration:  We’ve always intended to grow Gibraltar into offering a larger version for enterprises with centralized log storage and management for many computers.  It was a lead pipe cinch that this solution was going to use SQL Server, so the closer we could be to that on the client the better off we’d be.  The more we worked with VistaDB, the more we started to wonder:  Did we really need SQL Server even for our future larger version?  Could we perhaps just use VistaDB?  We did more prototyping and came to the conclusion that it could work very well technically.

At this point, we knew we had a winner in VistaDB.  A quick prototype showed that we could easily target both it and SQL Server with exactly the same code at each level.  Nothing else could do that:  We could use the same schema, stored procedure code and database access code and switch between SQL Server and VistaDB with none the wiser.

If I see so far, it’s because I stand on the shoulders of giants

The best was yet to come.  With VistaDB it’s easy to create a new instance of a database anytime we needed our index data structure.  Opening and closing databases is relatively fast, so we were able to have a common relational database structure available everywhere.  This meant that we could have just one data model everywhere instead of separate ones for data collection and session data management.  Better yet, we could have one set of database access code for both cases, even if we ultimately supported SQL Server.

This opened a lot of opportunities:

  • The same data bound charts and views could be generated against any repository, allowing immediate analysis without having to copy data first into a central store.
  • Very large repositories could be supported because the index data didn’t need to be loaded into memory for processing.
  • We were able to redirect the time saved not developing our own XML persistence format and structure into customer value-add features.
  • We could add a reporting system that would require rich, hierarchical data structures.

In short, if it wasn’t for VistaDB most of the features in the repository view and the reporting system of Gibraltar would only be available in a future release that used a central server.  In a small way VistaDB did cause us to spend more development time than we would have – we got so excited about the potential for some of the features that we’d written off as being too expensive to implement that we held the release until we could get them done.

Now, we could have written our own thing for a lot of these pieces, but it would have taken a lot more time (particularly since we target .NET 2.0, not 3.5, so no LINQ for us) and there are some parts that we’d have always been worried about – namely the fundamentally hard problem of having many processes accessing the same file doing reads and writes.  That’s just a hard problem to get right, period, and it was great to hand it off to someone who worried as deeply about that as we do about logging and metrics.  In the end, it’s a great example of not reinventing the wheel unless you want to learn a lot about wheels.  We already knew a lot about databases and shared files, enough to know that we didn’t want to learn any more.  We were much more interested in digging into the areas that added unique value to Gibraltar.

Beware the gift Trojan horse

One last concern we had was that adopting anything into our agent would create a long term obligation for us to support it.  Customers would expect that multiple agent versions would safely interoperate on the same computer.  Furthermore, because we would use the same format in our Package files used to send session data between computers we’d be required to support it for years.  We’d have to get comfortable that this was feasible.  To this end, we needed the solution to either be available completely through source code we had or through a company that we were absolutely sure would support it long term.

A great incentive for long term support is revenue.  People in general and for-profit companies in particular are motivated most directly by the idea that people will pay them to do something.   This was another place where we were a bit concerned with using one of the options from Microsoft because they were all free.   Microsoft’s motivation for creating these products is primarily defensive – prevent people from using other free options in the hope that they’d eventually upgrade to one of the nicely expensive server options.  That’s fine and good if you’re targeting one of those options and just want a free scale down option, but this was central to our product.

We could have gone open source, but there are a few issues there for us:

  1. No open source option even came close to offering what VistaDB did – namely the ability to support stored procedures compatible with SQL Server.
  2. We’d then largely be on the hook for our own source code support if the community wasn’t doing what we needed, and the whole point here was to not have to write something.
  3. We’d have to very carefully scrutinize the open source license to make sure we didn’t get GPL’d into oblivion.

Part of our concerns were mitigated by VistaDB offering a source code license, so we could get source code to make our own changes if we had to.  But really, I wasn’t looking to ever need to write this code, that’s why I wanted to get someone else’s solution.  With VistaDB, we found not only a strong community of folks that have built products around it, just like we were, but also a small company owned by a guy that believed fundamentally that folks like us had to succeed with his product for him to stay in business.  His goals tightly aligned with our needs, and that’s a great precondition for success.  That’s good, but there’s another big condition for success:  Was this company focused on listening to its customers?  Would it respond to our concerns?

As is our tradition, we sent an unsolicited block of product feedback to VistaDB.  There were some good things, but we had some concerns as well.  We got a point by point response from Jason Short, the CEO of VistaDB.  Better yet, we got an invitation to a conference call.  Now, at this point VistaDB had gotten a total of $300 from us.  We spent over 7 times that on the fancy graphing component we use, and compared to the criticality of it to our system I’d have spent a lot more to solve the problem we had.   I was impressed by a series of conversations we’ve had with Jason and his team, and how dedicated they are to doing the right thing in solving the challenges they’re up to.

When you think about how much money it costs to create software of any sophistication, it’s great to be able to pull off a piece of your complexity and hand it to someone who will care more about it than you do.  We’re so happy with how much VistaDB added to what we could offer our customers that you’ll notice in our About Gibraltar page that we’re proudly Powered by VistaDB.

Categories : .NET, Data, Development, Logging
May
07

How Rapid is Rapid? How Quick is Quick?

Posted by: Kendall | Comments (2)

Over the past year, we’ve been looking at a lot of logging systems:  Free systems, expensive systems, big systems…  anything that seemed related to logging or .NET.  One thing that stuck out was how much most of these systems centered their design around being fast.  The approach to defining fastest varies a bit:

  • Fastest to commit to disk.
  • Fastest by filtering what messages get logged.
  • Fastest by doing everything on another thread.

We didn’t question this much until we were interacting with TheObjectGuy about his logging system and questioned what seemed like a very silly block of code:

    /// <summary>
    /// Write the String to the file.
    /// </summary>
    /// <param name="s">The String representing the LogEntry being logged.</param>
    /// <returns>true upon success, false upon failure.</returns>
    public bool WriteToLog(String s)
    {
        StreamWriter writer = null;
        try
        {
            writer = GetStreamWriter();
            writer.WriteLine(s);
        }
        catch
        {
            return false;
        }
        finally
        {
            try
            {
                writer.Close();
            }
            catch
            {
            }
        }
        return true;
    }

This is about as non-optimal as you get:  Every log statement is causing the file to be opened, appended to, flushed, and closed.  That’s just crazy! It could be made so much faster!  When we asked about it, the response gave us pause:

…one thing I can say with certainty is that for every application that requires file logging performance greater than, say, 500 log entries per second (the performance I get on my old clunker of opening/closing files on each write), there are hundreds of applications where a user would like to be able to delete or rename a log file while the application is running.

So we set up a unit test to see just how bad it was.  It’s pretty simple code: the first case writes 1,000 messages with the file held open (calling WriteLine each time) and the second case opens and closes the file for every message.  And the results?  Indeed, the first method is much faster:

Keep File Open: 7.0014ms.  Average duration of 0.0070014ms per message.
Example Code Above: 161.0322ms.  Average duration of 0.1610322ms per message.
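For anyone who wants to try this themselves, a comparison along these lines can be sketched as follows. This is not our original unit test: the class and method names, the message text and the use of temporary files are assumptions made for illustration, and the timings you get will depend on your disk and operating system.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical sketch of the comparison; names and message text are illustrative.
public static class LogWriteBenchmark
{
    // Writes every message through one StreamWriter that stays open for the whole run.
    public static TimeSpan KeepFileOpen(string path, int messageCount)
    {
        Stopwatch timer = Stopwatch.StartNew();
        using (StreamWriter writer = new StreamWriter(path, true))
        {
            for (int i = 0; i < messageCount; i++)
            {
                writer.WriteLine("Message " + i);
            }
        }
        timer.Stop();
        return timer.Elapsed;
    }

    // Opens the file in append mode, writes one line, and closes it again for each message.
    public static TimeSpan OpenAndCloseEachTime(string path, int messageCount)
    {
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < messageCount; i++)
        {
            using (StreamWriter writer = new StreamWriter(path, true))
            {
                writer.WriteLine("Message " + i);
            }
        }
        timer.Stop();
        return timer.Elapsed;
    }

    // Prints total and per-message time in the same shape as the results quoted in this post.
    public static void Report(string label, TimeSpan elapsed, int messageCount)
    {
        double totalMs = elapsed.TotalMilliseconds;
        Console.WriteLine("{0}: {1}ms.  Average duration of {2}ms per message.",
            label, totalMs, totalMs / messageCount);
    }

    public static void Main()
    {
        const int messageCount = 1000;
        string keepOpenPath = Path.GetTempFileName();
        string reopenPath = Path.GetTempFileName();
        try
        {
            Report("Keep File Open", KeepFileOpen(keepOpenPath, messageCount), messageCount);
            Report("Open and Close Each Time", OpenAndCloseEachTime(reopenPath, messageCount), messageCount);
        }
        finally
        {
            File.Delete(keepOpenPath);
            File.Delete(reopenPath);
        }
    }
}
```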

That’s an incredible improvement!  23 times faster!  And Faster is always better…. Right?

Well, let’s go back and consider what the point of the system is: It’s a logging system. In the test above it took 161ms  (0.16 seconds) to write 1,000 log messages to disk.  Extrapolating, that would mean this code could write around 6,000 messages per second, whereas the optimized code could go up to 142,000 messages per second.  But think about it:  Under what circumstances should a log system be storing 6,000 messages per second?

  • At what level of detail in your code would you have to be logging to generate that many messages?  Practically at the debugger level of detail.
  • If you’re logging at that level of detail, how easy would it be to dig out any useful information?  You’ve created a needle in the haystack problem.

The most common reason for generating that volume of messages would be that the code is stuck in a loop generating errors.  In that case, how important is it to be able to go faster?

  • In the error loop, faster logging just produces larger log files with more records (because it’s stuck in a loop)
  • Eventually the error loop will overwhelm something else:  Memory, disk performance, or disk capacity.

In short, nothing good is happening by being faster.  In the case of the code we were questioning, the author’s point was dead on:  It wasn’t worth losing the functionality users wanted for an unnecessary performance gain.

Speed is Seductive

When you think about it, it’s pretty easy to create a logging solution that can handle 100 messages per second all day long, and if you’re logging more than that under normal circumstances you probably have a different problem.  Assuming you want to keep the overhead of logging to less than 10% of the total runtime of the process, those 100 messages have to be written in no more than 100ms of each second, or 1ms per message, which means you only need about 1000 messages/second of capacity.

If performance of around 1000 messages/second is all that’s required, why do so many systems pursue performance dramatically in excess of this?  The obvious answer is that it’s an easy metric to make a yardstick from.

  • Everyone knows faster is better.
  • People rarely put numbers into context.

The problem is that making code faster usually requires trading off something else:

  1. It does less: This is the most common way to make something faster – have it do less because the fastest line of code is the one that isn’t there.
  2. It’s a lot more complicated: Every tricky optimization adds complexity, whether it’s fancy threading or buffering, it’s all complexity to maintain.

In this case, there’s a more subtle problem at work as well:  As a developer, if you believe that the log system is extraordinarily fast then why worry about what you log?  After all, it isn’t going to slow down the application.   That leads to logging strategies like this where the log is practically a substitute for a debugger.   On the surface, that doesn’t seem all that bad, but what are you really logging for?

Shotgun Logging = Deferred Design

If you’ve got a shotgun logging strategy that has messages for every object you construct or every method you call or a similar level of detail, you’re filling up your log with a lot of information but not much value.  In many cases, any logging pattern that could be autogenerated adds only a minor amount of value because it leaves out the reasons (the why) and consequences (and therefore…).  It’s precisely the effort you put into considering what messages to log and where to log them that adds the value to your logging you’ll need in support.  It’s possible that the shotgun messages can complement your hand-crafted messages to give you a better picture, but without the hand crafted messages when it comes time to actually use the log data to support your application you’ll still feel lost.

Instead of waiting until you’ve got a real problem on your hands to figure out what your log data means, use it throughout your testing and certification process to make sure that you’ve got enough of the right, high value messages to support your application.  Doing a little design work for your log messages as a feature of your application instead of just blanketing your code with generic log statements will pay off in spades.
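To illustrate the difference, here is a small sketch contrasting a message that could be autogenerated with one that has been designed. Everything in it is hypothetical: the class, the method names and the order-processing scenario are made up for the example and aren't part of Gibraltar or any other logging library.

```csharp
using System;

// Hypothetical sketch: builds two styles of log message for comparison.
public static class LogMessageSketch
{
    // Shotgun style: could be generated for every method, and only says that a call happened.
    public static string ShotgunMessage(string methodName)
    {
        return "Entering " + methodName;
    }

    // Designed style: states what happened, the reason (the why) and the consequence (and therefore...).
    public static string DesignedMessage(string what, string why, string consequence)
    {
        return string.Format("{0} because {1}; {2}.", what, why, consequence);
    }

    public static void Main()
    {
        Console.WriteLine(ShotgunMessage("OrderService.Submit"));
        Console.WriteLine(DesignedMessage(
            "Order 1042 was not submitted",
            "the payment gateway timed out after 30 seconds",
            "it stays in the queue and will be retried"));
    }
}
```

The first message tells a support engineer that some code ran.  The second tells them what the user experienced and what the application will do next, which is usually the question they were actually asked.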

Categories : .NET, Logging