Minutes to Milliseconds
We just published a second preview release of VistaDB 5.0 that includes support for paging result sets using OFFSET and FETCH as is done in SQL Server 2012 and other databases. It also slashes the execution time of some queries from minutes to milliseconds.
The release notes include comprehensive information on everything included in VistaDB 5.0 Preview 2. For this post, I want to focus on paging and performance. Before getting in OFFSET and FETCH, let’s first talk about TOP and how it relates with the ORDER BY clause.
Optimizing TOP with ORDER BY
Consider this query:
The idea is to find the ten largest orders and list them in descending order along with the name of the customer placing the order. On my machine, Preview 2 executed this query against a database containing a million orders and ten thousand customers in 193 milliseconds. The same query took over three minutes with VistaDB 4.3. Yes, Preview 2 is about 1000x faster than VistaDB 4.3 for this particular query!
Smarter Execution Plan = Faster Results
Previous versions of VistaDB used an execution strategy like this:
- JOIN the tables
- Sort the results based on the ORDER BY clause
- Take the TOP 10 rows from the sorted results
In contrast, VistaDB 5.0 considers the implications of the TOP and ORDER BY clauses in designing a query plan more like this:
- Analyze the full query and select indexes tuned to the ORDER BY clause
- JOIN the tables and find results in pre-sorted order using the selected indexes
- Once we’ve found 10 rows, we’re done
In other words, Preview 2 can take advantage of indexes such that the results it emits are pre-sorted, whereas previous versions of VistaDB would internally create and sort a temp table to process any query containing an ORDER BY clause.
TOP Speed Results
Here is some data from running a few variants of the query that simply vary the number of rows requested in the TOP clause.
The performance of Preview 2 varies linearly with the number of rows returned. The execution time for the same queries in VistaDB 4.3 is nearly constant at 3-4 minutes because execution is dominated by the time spent creating and sorting a million row temp table only to discard all but the few rows requested in the TOP clause.
OFFSET and FETCH Paging Support
Our implementation of OFFSET and FETCH builds on the fundamental improvements we’ve made to the engine. Consider this query:
This is functionally equivalent to selecting TOP(100), and has the same measured performance. As we increase the OFFSET, the execution time increases linearly because the engine still has to walk over those offset rows.
If you’d like to try our VistaDB 5.0 Preview 2, I encourage you to get the bits. If you’d like to replicate my results on your own machine, I have posted my sample database along with the queries I ran and my results.