Author Archive

Jan
15

CodeMash Rocks

Let’s Rock the House at the CodeMash 2013 Jam Session!

CodeMash 2012 was a “10”

I just got back from CodeMash 2012 – what a blast! This amazing developer-organized conference took place over three days in the brand new convention center at the Kalahari Resort in Sandusky, Ohio — a.k.a. America’s largest indoor waterpark. Inspired talks, awesome people, and meticulous planning by a few with generous support from many combined to make for a truly fantastic event.

On a personal note, there were two little details that touched and inspired me.

  1. Leon Gersing gave a heartfelt zen-infused pecha kucha talk on Love that reminded us that work/play/family/friends are synergistically intertwined aspects of a life passionately lived – not competing commitments in a zero-sum game.
  2. Designing Interactive sponsored a frickin’ Bacon Bar!
     

The Jam Session at CodeMash 2013 will turn it up to “11”

It got me thinking about what I could do to help make some aspect of next year’s CodeMash a bit more awesome. And it hit me: the Jam Session!

This is near and dear to my heart because I was beating on pianos, organs and various synthesizers for years before I ever touched a computer keyboard. Thanks to Carl Franklin and others, CodeMash has provided space for an open jam session for several years. But pulling off a great jam session comes with a few challenges that present opportunities for improvement:

  1. Some musicians are a little shy and need a bit of coaxing to feel comfortable jumping in
  2. Every once in a while, someone needs a gentle nudge to take a break
  3. Negotiating what to play next can be tedious and time-consuming
  4. Having the right gear available to play and be heard is easier said than done

Creativity flourishes in a bit of structure. I’ve gotten approval from CodeMash to organize the 2013 Jam Session. I want to arrange for a house sound system and gear (drum kit, percussion, keys, amps) to be shared, as well as work out some simple ways to maximize the fun for players and listeners. For instance, chord charts and lyric sheets can eliminate a lot of sputtering dead time trying to pick the next song and agree on a key. Having a sound guy to get instruments miked, blend levels and kill feedback can be a godsend. Having a couple of professional (i.e. full-time) musicians to facilitate could help keep things flowing the way a good talk show host makes all the guests sound interesting.

If you’d like to jam at CodeMash 2013 or have any ideas on how we can amp up the awesome, please share in the comments section below.

Categories : Events
Dec
07

VistaDB 4.3 Performance Optimization

Download VistaDB 4.3

We are happy to announce the biggest update to the VistaDB engine since Gibraltar Software took over the product last year. It’s actually our fifth update, though our earlier releases were more limited in scope, consisting of a new streamlined licensing system and a number of bug fixes, many of which provide closer compatibility with SQL Server scripts.

The main focus of VistaDB 4.3 is query performance, particularly with multiple JOINs. We’re pleased to report 2.5x improvements in many cases and discuss below what’s happening inside the VistaDB engine to achieve these results. Let’s start with an overview of how VistaDB produces query results.

How Does My Query Work?

VistaDB queries are executed in three general phases:

  • Parse: The SQL query text is parsed into a tree of objects representing each language element.
  • Prepare: The statement/expression tree is recursively processed to identify table and column references and to determine the data types of results.
  • Execute: The statement/expression tree is recursively processed to execute statements and provide results in order (returning to the calling application as each row is ready).

If there were no optimizations at all, the engine would walk every row of the parent table and, for each of those rows, walk every row of the first joined table looking for matches to the ON clause. For each matching combination it would then repeat the process for every row of every additional table referenced in the query. Obviously, this would be ridiculously slow when joining multiple tables of any real size, or even when querying a single table with a large number of rows when you don’t actually want most of them.
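
To make that cost concrete, here is a minimal Python sketch of the unoptimized approach for a single join. It is an illustration only, not VistaDB code: the function name nested_loop_join and the sample rows are assumptions made up for this example.

```python
def nested_loop_join(parent_rows, joined_rows, on):
    """Unoptimized join: test every parent row against every joined row.

    Returns the matching row pairs plus the number of ON-clause
    evaluations, which is always len(parent_rows) * len(joined_rows).
    """
    matches = []
    comparisons = 0
    for parent in parent_rows:
        for joined in joined_rows:
            comparisons += 1
            if on(parent, joined):
                matches.append((parent, joined))
    return matches, comparisons


# Hypothetical data: 1,000 sessions, each pointing at one of 5 boot modes.
sessions = [{"id": i, "boot_mode_id": i % 5} for i in range(1000)]
boot_modes = [{"id": i, "caption": f"Mode {i}"} for i in range(5)]

pairs, comparisons = nested_loop_join(
    sessions, boot_modes, lambda s, b: s["boot_mode_id"] == b["id"]
)
print(len(pairs), comparisons)  # prints: 1000 5000
```

Every additional joined table multiplies the comparison count again, which is why a plan with no optimizations falls over so quickly.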

To be more efficient, VistaDB performs an additional optimization step at the start of the Execute phase. This step recursively walks the WHERE clause and ON clauses and converts the comparison expressions (and special functions such as BETWEEN) into a more efficient representation as constraints, in which each constraint specifies a range of values for a particular column based on constants, parameter values, or the value of a column from an earlier table in the parse tree.

In optimizing the parse tree, VistaDB simplifies the execution plan into a series of constrained tables you could imagine as being evaluated left-to-right. Constraints that can’t be resolved are declared non-optimizable and must be handled by testing the WHERE clause. Similarly, target columns for which there isn’t an available index are also non-optimizable and must be tested row-by-row. These optimized conditions are then processed for logical ANDs and ORs to calculate an overall optimized filter for each table.
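
As a rough illustration of the idea of range constraints combined by AND, consider the sketch below. It assumes integer columns and inclusive bounds, and the RangeConstraint class is hypothetical; the engine’s internal representation is not described in this post and is surely richer than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RangeConstraint:
    """Inclusive range of allowed values for one column (None = unbounded)."""
    column: str
    low: Optional[int] = None
    high: Optional[int] = None

    def and_with(self, other):
        """AND of two constraints on the same column: intersect the ranges."""
        if self.column != other.column:
            raise ValueError("constraints apply to different columns")
        lows = [v for v in (self.low, other.low) if v is not None]
        highs = [v for v in (self.high, other.high) if v is not None]
        return RangeConstraint(
            self.column,
            max(lows) if lows else None,
            min(highs) if highs else None,
        )

    def is_empty(self):
        """True when no value can satisfy the constraint."""
        return (
            self.low is not None
            and self.high is not None
            and self.low > self.high
        )


# WHERE ColA >= 10 AND ColA BETWEEN 5 AND 20
at_least_10 = RangeConstraint("ColA", low=10)
between_5_and_20 = RangeConstraint("ColA", low=5, high=20)
combined = at_least_10.and_with(between_5_and_20)
print(combined.low, combined.high)  # prints: 10 20
```

An OR of two ranges generally yields a set of ranges rather than a single one, which this sketch deliberately leaves out.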

Building On What Already Works Well

VistaDB has always done well with queries in which a range of rows can be retrieved on a single-column index with an identifiable starting and ending value based on the current rows of tables “to the left” of it. For example, if the parent (FROM) table is restricted by a single comparison in the WHERE clause such as: WHERE ParentTable.ColA = @ChosenValue; (with an index on ColA), then the engine doesn’t need to walk every row of ParentTable. It can start with the first row in the index with a value of @ChosenValue for ColA and walk each row in the index until it passes the last row with that value. If another table is then joined in, the engine doesn’t need to consider any combinations outside of that range on ParentTable; they’re already certain to be excluded by that condition in the WHERE clause.
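
The index walk described above can be sketched in a few lines of Python, using a sorted list as a stand-in for the index on ColA. The function name index_range_scan and the sample rows are assumptions for illustration; a real index is a disk-based structure, not an in-memory list.

```python
from bisect import bisect_left, bisect_right


def index_range_scan(index_keys, index_row_ids, value):
    """Return the row ids whose indexed key equals `value`.

    `index_keys` is sorted, and `index_row_ids[i]` is the row id that
    belongs to `index_keys[i]`. Only the matching slice is touched.
    """
    start = bisect_left(index_keys, value)   # first entry equal to value
    end = bisect_right(index_keys, value)    # one past the last such entry
    return index_row_ids[start:end]


# Hypothetical ParentTable rows as (row_id, ColA) pairs.
rows = [(0, "b"), (1, "a"), (2, "c"), (3, "b"), (4, "a"), (5, "b")]

# Build a sorted single-column index on ColA.
index = sorted((col_a, row_id) for row_id, col_a in rows)
index_keys = [key for key, _ in index]
index_row_ids = [row_id for _, row_id in index]

matching = index_range_scan(index_keys, index_row_ids, "b")
print(matching)  # prints: [0, 3, 5]
```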

Improvements in VistaDB 4.3

We use VistaDB extensively in our Gibraltar application monitoring system and noticed that VistaDB performance left something to be desired for some of our more complex queries. For example, it is common to use a placeholder ID (perhaps a UNIQUEIDENTIFIER) as a foreign key into a small lookup table which can contain additional information fields universal to that value. In Gibraltar we have tables such as Application_Type and Boot_Mode which provide caption and description labels for display purposes. They can be joined directly into a query about one or more sessions, like so:

SELECT * FROM Session_Details SD
	JOIN Processor_Architecture OSA
		ON OSA.PK_Architecture_Id = SD.FK_OS_Architecture_Id
	JOIN Boot_Mode BM ON BM.PK_Boot_Mode_Id = SD.FK_OS_Boot_Mode_Id
	JOIN Processor_Architecture RA
		ON RA.PK_Architecture_Id = SD.FK_Runtime_Architecture_Id

The joined tables are tiny, only 5 or so rows each, so (in theory) this should be very efficient. Each joined table can have its unique matching row directly looked up based on the corresponding column value in the parent table. Perfect, right? But this query was taking several seconds. The base query (SELECT * FROM Session_Details) took less than half a second, and that’s querying the entire table! What we found in VistaDB 4.2 was that as each JOIN was added to this query, the overall query time nearly doubled! Something was clearly less efficient than it should be.


As we analyzed the engine internals, we found a lot of opportunities to improve performance which we’ll be implementing over the coming year. As a first step, we decided to focus VistaDB 4.3 on reducing the overhead for multiple joins and optimizing for the most common cases.

We expect that the majority of joins will be on a single equality between a single column from each table with a foreign key relationship between them. Since this should by definition identify a single value (and often a single row), it should be the most efficient filter to narrow down the joined table based on those “to the left”. So the optimization logic will now catch these top-priority conditions early on and bypass the rest of the expensive reduction pass.

And in queries like our example, we integrated column value caching to eliminate the need to search for the same rows over and over again. When a table is joined on a UNIQUE single column index, the table can cache the row in a Dictionary keyed by the column value from the other table, and each time it comes back to that value, it grabs the row from cache instead of searching the index and reading it from disk again.
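
The effect of that cache can be sketched as follows. This is a simplified Python stand-in, not the engine’s code: the CachedLookup class is hypothetical, and the fetch_row callable represents the index search and disk read that the cache avoids.

```python
class CachedLookup:
    """Caches rows of a small joined table by their unique key value."""

    def __init__(self, fetch_row):
        self._fetch_row = fetch_row  # stands in for index search + disk read
        self._cache = {}
        self.fetches = 0             # how many times we really had to fetch

    def get(self, key):
        if key not in self._cache:
            self.fetches += 1
            self._cache[key] = self._fetch_row(key)
        return self._cache[key]


# Hypothetical lookup table and the foreign-key values seen while
# walking the parent table.
boot_modes = {0: "Normal", 1: "Safe Mode", 2: "Safe Mode with Networking"}
lookup = CachedLookup(boot_modes.get)

session_boot_mode_ids = [0, 1, 0, 0, 2, 1, 0]
captions = [lookup.get(key) for key in session_boot_mode_ids]
print(lookup.fetches)  # prints: 3
```

Seven parent rows need only three real fetches here; with a five-row lookup table and thousands of sessions the savings grow accordingly.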

We also coded our caching to ensure that it doesn’t consume excessive memory when processing large tables. The cache only holds hard references to the most recently accessed rows. By using weak references to less recently used rows, they stick around when memory is plentiful but can be garbage-collected if necessary. For more info on weak references, check out Kendall’s Code Project article and sample code on creating a single instance string store.
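
Here is one way the hard/weak split could look, sketched in Python with the standard weakref module. The Row and RowCache classes and the hard_limit parameter are assumptions for this example; VistaDB itself is .NET code and its cache is not shown in this post.

```python
import weakref
from collections import OrderedDict


class Row:
    """Instances of ordinary classes support weak references."""

    def __init__(self, key, data):
        self.key = key
        self.data = data


class RowCache:
    def __init__(self, hard_limit=2):
        self._recent = OrderedDict()               # hard refs, newest last
        self._all = weakref.WeakValueDictionary()  # weak refs to cached rows
        self._hard_limit = hard_limit

    def put(self, row):
        self._all[row.key] = row
        self._touch(row)

    def get(self, key):
        row = self._all.get(key)  # None if never cached or already collected
        if row is not None:
            self._touch(row)
        return row

    def _touch(self, row):
        """Mark a row as most recently used and demote the oldest if needed."""
        self._recent[row.key] = row
        self._recent.move_to_end(row.key)
        while len(self._recent) > self._hard_limit:
            self._recent.popitem(last=False)  # now held only by a weak ref


cache = RowCache(hard_limit=2)
for key in ("a", "b", "c"):
    cache.put(Row(key, f"data for {key}"))

print(cache.get("c").data)  # prints: data for c
```

Rows "b" and "c" are guaranteed to be available because the cache holds hard references to them. Row "a" is only weakly referenced, so a later get may return None once the garbage collector has reclaimed it, at which point the caller would read it from the table again.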

As shown in the graph above, VistaDB 4.3 is several times faster for many common queries. More importantly, in queries such as the one above, performance degrades linearly as more tables are joined, rather than exponentially as before.

Stay Tuned for More to Come

The query optimizations we’ve introduced in VistaDB 4.3 are just a start. Subsequent releases will have additional query optimizations as well as other performance improvements such as support for bulk insert and enhanced multi-user scalability. We will also be adding new features such as enhanced support for Entity Framework, enhanced compatibility with SQL Server and improvements to our development tools (Data Builder, Data Migration Wizard, etc.).

We’ll be writing additional blog posts about our adventures taking VistaDB to the next level, so check back often or leave a question/comment below or in our support forums – we’d love to hear from you!

To understand why I’m so passionate about Gibraltar you have to first appreciate that writing software is really, really hard.  As Edsger Dijkstra wrote nearly 40 years ago when dinosaurs roamed the earth typing punch cards as even DOS programs and 80×25 VT100 terminals had yet to be invented:  

The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility…
Edsger W. Dijkstra, The Humble Programmer  

Yet awaiting Kurzweil’s technological singularity, most computer programs are still written by imperfect human beings like me who can screw-up something as simple as making a pot of coffee.  Case in point…  

My coffee machine

I'm not worthy of my coffee maker

Nothing is so simple you can’t mess it up

My wife and I drink our coffee black, so you’d think the only way to screw it up is to forget to add either the water or the coffee beans.  I’ve done both, as well as variants involving too much or too little of either.  I’ve also forgotten both, as when Cindy comes downstairs in the morning asking “Jay, did you make the coffee yet?” Oops!  

You see, when you’re married with kids, you need a division of labor.  In ours, Cindy gets the kids off to school, manages our social calendar, pays the bills, plans the vacations, does the laundry, cooks the meals and does all the shopping.  I program a little.  And make coffee.  Badly.  

We have a semi-fancy Cuisinart coffee maker with a hopper on top for whole beans, a water reservoir and a built-in burr grinder. I mistakenly thought I’d explored the full range of coffee errors some time ago when I learned that the coffee path between the grinder and the filter basket needs to be cleaned periodically lest backed-up grinds prevent beans from entering from the hopper. I still forget to clean the mill, but can now distinguish pitch differences in the sound of the grinding that alert me to this oversight.  

This morning the mill was clear, hopper loaded, water reservoir full, filter basket clean and all properly positioned.  I clicked the start button as I dashed to stop our cat from using my stereo speakers as a scratching post and heard the pitch-perfect grinding and water gurgling happily behind me.  Speaker saved, I sat down to enjoy the wafting aroma of French Roast brewing while enjoying a Sudoku.  

Precisely six minutes later, feeling clever at how quickly I’d solved the puzzle, I went to pour two cups of Morning Joe to sip in bed with Cindy as sun and breeze caressed us through the open bedroom window on this glorious Sunday morning in those precious quiet moments before our two boys awoke with breaking news of massive school projects and major tests due tomorrow.  

Our morning coffee was particularly fragrant, benefiting from an unprecedented abundance of surface area.  Coffee was pooled all over the counter and the floor – behind the stove, under the vitamins, and coating the bottoms of the cups and plates waiting patiently beside the sink in the aftermath of last night’s birthday party for my father.  

Mocking me from the counter (rather than below the filter basket where it belongs) was the empty coffee carafe.  

So, what’s this have to do with software engineering?

I suppose I could apply Scrum to my coffee making, but there’s not always someone around to collaborate in Paired Brewing.  Or maybe introduce a stage-gate Coffee Preparedness Review in a quest for ISO-9000 certifiable Brewing Process Maturity.  Or maybe use value stream mapping to achieve Lean Brewing.  I think my best bet is to pause before pressing the start button for a moment’s contemplation of how I suck at making coffee as I double-check that all is ready to go.  

Likewise, in software development, the humble programmer should build defense-in-depth against human fallibility.  Pick a methodology that works for you and stick with it.  Get feedback early and often.  Introduce as much automated testing and quality assurance as you can.  And close the loop in your software development process by measuring how your apps perform in the field so you can be more responsive in the short term and keep improving in the long term.  

We wrote Gibraltar because we’re passionate about rock-solid software.  Like any asymptotic goal, the destination is ultimately impossible, but you can get closer and closer.  The journey is the fun part.  Gibraltar isn’t perfect either, but it’s very good and getting better and better.  Try it and see if we can help you move faster and have more fun on your software engineering journey.  

Join the conversation!

Have a similar story to share?  As professionals entrusted to create rock-solid software, what are we to do about accommodating our unlimited human potential for error?  What are your thoughts?  

Let’s continue the discussion in the comments.

Hippocrates - 460-377 BC

Hippocrates’ Primum non nocere, “First do no harm”

Several customers have requested a notification mechanism to be alerted when errors are detected in their programs.  Simply raising an event is straightforward, but our promise to our customers is that we’ll do the hard thinking that ensures Gibraltar is safe and robust in production systems.  Our mantra is: first, do no harm.

In this case, we asked ourselves questions like:

  • What if a customer’s error notification logic is slow?  How do we ensure that it doesn’t slow down the application as a whole?
  • What if the program starts screaming thousands of errors?  How do we ensure that we don’t swamp the error notification handler?
  • What if there are errors in the customer’s error notification handler?  What if it throws an exception?  What if it hangs?

This resulted in a design that ensures that the logging infrastructure (including Gibraltar itself AND customer logic that interfaces with it) will be robust and safe.

Our central Log object in Gibraltar Agent now has a MessageAlert event that is raised when warning, error, or critical messages are recorded.  This event has a number of safety features such as:

  • Asynchronous: The event is raised on a background thread that is not part of the logging path, ensuring that time spent handling the event will not slow down logging or affect other threads.
  • Batching: When a burst of qualifying messages is recorded, they will typically be raised together to allow more efficient processing.
  • Throttling: A minimum delay between events can be easily specified to ensure the event isn’t raised too frequently, particularly in error cascade scenarios.  Messages are batched up until the next time the event can be raised.
  • Hang Protection: If the event handler never returns, the Agent will continue to process messages but will not queue them, allowing them to be released from memory.
  • Loop Protection: Messages that are recorded by your event handler will not cause additional events to be raised.  This prevents notification loops where an event handler records an error during notification which subsequently causes the message alert notification to be raised again.
  • Low Overhead: We don’t spin up anything (the threading, queue, etc.) until someone subscribes to the event, so if you don’t use this feature it doesn’t take up resources.
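
To show how several of these safeguards fit together, here is a small Python sketch of an asynchronous, batched, throttled and loop-protected notifier. It is not the Gibraltar Agent API (which is .NET): the AlertNotifier class, its method names and the min_interval parameter are all hypothetical, and hang protection is left out to keep the example short.

```python
import queue
import threading
import time

_STOP = object()  # sentinel that tells the worker thread to exit


class AlertNotifier:
    """Hypothetical notifier: async, batched, throttled, loop-protected."""

    def __init__(self, min_interval=0.0):
        self._min_interval = min_interval
        self._queue = queue.Queue()
        self._handler = None
        self._worker = None
        self._state = threading.local()  # per-thread "inside handler" flag

    def subscribe(self, handler):
        """Attach the handler; the worker thread is only created here."""
        self._handler = handler
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, message):
        """Called on the logging path; only enqueues, never runs the handler."""
        if self._worker is None:
            return  # low overhead: nobody subscribed, nothing to do
        if getattr(self._state, "in_handler", False):
            return  # loop protection: message came from the handler itself
        self._queue.put(message)

    def close(self):
        """Deliver anything still queued, then stop the worker."""
        if self._worker is not None:
            self._queue.put(_STOP)
            self._worker.join()
            self._worker = None

    def _run(self):
        stopping = False
        while not stopping:
            item = self._queue.get()  # block until a message arrives
            if item is _STOP:
                return
            batch = [item]
            time.sleep(self._min_interval)  # throttle: let a burst pile up
            while True:
                try:
                    item = self._queue.get_nowait()
                except queue.Empty:
                    break
                if item is _STOP:
                    stopping = True
                    break
                batch.append(item)
            self._state.in_handler = True
            try:
                self._handler(batch)
            except Exception:
                pass  # a faulty handler must not take down the worker
            finally:
                self._state.in_handler = False


received = []


def on_alert(batch):
    received.extend(batch)
    notifier.record("logged by handler")  # ignored by loop protection
    raise RuntimeError("handler bug")     # swallowed by the worker


notifier = AlertNotifier(min_interval=0.05)
notifier.record("ignored: no subscriber yet")
notifier.subscribe(on_alert)
for text in ("error 1", "error 2", "error 3"):
    notifier.record(text)
notifier.close()
print(received)  # prints: ['error 1', 'error 2', 'error 3']
```

The three errors usually arrive in one batch because the worker waits for min_interval before draining the queue, but the number of batches depends on thread timing. The order and the content of what the handler receives do not.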

The MessageAlert event is particularly useful for automatically triggering immediate data transmission in the case of an error and implementing your own error notification mechanism.  The full detail of each log message is available in the event.

Check out our recent post on charting enhancements for more examples of how we are incorporating customer feedback to ensure that Gibraltar provides a robust logging infrastructure allowing you to build rock solid .NET software.

Categories : .NET, Development, Logging