SQL query parser and binding code generator for C++, Java, OCaml.
Writing database layer code is usually tedious and error-prone, due to the mix of different languages. SQL queries constructed dynamically need to bind external data (from application), and the resulting rowset must be decomposed into application native data. Data crossing these application-to-database boundaries is what causes troubles. One can factor out all common database communication code, hide the database under some application-specific abstraction, but one always needs to manually specify correspondence between SQL query binding slots (or resulting rowset columns) and code variables. This mapping should be updated manually every time SQL query is modified.
SQL query parser and code generator which ensures that application code and database queries are in sync. It analyzes SQL query and determines the set of input parameters (values for INSERT, run-time substitution parameters) and the set of resulting columns (for SELECT). Then it generates the code in host language, matching query input and output to function parameters and return values with corresponding native data types. So basically you provide an SQL query and generator creates a function which takes the set of typed parameters as required to fill slots in a query. Generated code binds provided parameters into query and executes it. SELECT statements additionally return the collection of structures with fields representing columns of resulting rowset (or pass those structures to callback-function). The most fruitful consequence of such approach is that the host language compiler will itself check that functions generated from SQL queries will be called correctly (i.e. all parameters bound with correct types). So if you modify the query and forget to update the code -- the compiler will point on erroneous parts.
Suppose we have the database of some money relationships for the group of people. We use the following SQL tables and queries:
-- @create_person
CREATE TABLE person (id INTEGER PRIMARY KEY AUTOINCREMENT,name TEXT,surname TEXT);
-- @add_person
INSERT INTO person (name,surname) VALUES;
-- @create_money
CREATE TABLE money (src INTEGER, dst INTEGER, amount INTEGER);
-- @add_money
INSERT INTO money VALUES;
-- @calc_total
SELECT name || ' ' || surname AS fullname, SUM(amount) as total FROM person JOIN money ON src = id GROUP BY id;
-- @list_donors
SELECT DISTINCT surname FROM person JOIN money ON src = id AND dst = (SELECT id FROM person WHERE surname LIKE @surname) LIMIT @limit;
Generate the binding C++ code (boilerplate omitted, only function prototypes shown):
// DO NOT EDIT MANUALLY
// generated by sqlgg 0.2.0 (735e2815)
#pragma once
template <class Traits>
struct sqlgg
{
// CREATE TABLE person (id INTEGER PRIMARY KEY AUTOINCREMENT,name TEXT,surname TEXT)
static bool create_person(typename Traits::connection db);
// INSERT INTO person (name,surname) VALUES
static bool add_person(typename Traits::connection db, typename Traits::Text const& name, typename Traits::Text const& surname);
// CREATE TABLE money (src INTEGER, dst INTEGER, amount INTEGER)
static bool create_money(typename Traits::connection db);
// INSERT INTO money VALUES
static bool add_money(typename Traits::connection db, typename Traits::Int const& src, typename Traits::Int const& dst, typename Traits::Int const& amount);
// SELECT name || ' ' || surname AS fullname, SUM(amount) as total FROM person JOIN money ON src = id GROUP BY id
struct data_4
{
typename Traits::Text fullname;
typename Traits::Int total;
}; // struct data_4
template<class T>
static bool calc_total(typename Traits::connection db, T& result);
// SELECT DISTINCT surname FROM person JOIN money ON src = id AND dst = (SELECT id FROM person WHERE surname LIKE @surname)
struct data_5
{
typename Traits::Text surname;
}; // struct data_5
template<class T>
static bool list_donors(typename Traits::connection db, T& result, typename Traits::Text const& surname, typename Traits::Int const& limit);
}; // struct sqlgg
Things to note above:
[sqlgg] name=function_name
(or simply @function_name
) before the query specifies the name
of the generated function (this annotation optional but very convenient).Traits
. It specifies the
correspondence between SQL and native types, provides types for database connection and
implements actual code to execute statements. Traits
should be implemented
once for every specific database API.add_money()
function takes three data parameters -- the values to INSERT into table (note the
names and types).calc_total()
returns data via result
parameter. Under the scenes it will
bind columns of resulting rowset to the fields of T::value_type
, which should provide fields
fullname
of type Traits::Text
and
total
of type Traits::Int
(otherwise it will fail to compile). For convenience a structure
satisfying the requirements for output type is generated alongside the function, data_4
in
this particular case, so std::vector<data_4>
for result
is fine.list_donors
were inferred correctly (limit
is Int
and surname
is Text
. SQL is not a statically-typed language so the inferred types are based on some
reasonable assumptions.Then manually-written C++ code boils down to (without error-checking):
#include "../sqlite3_helper.hpp" // sqlite3 traits
#include "demo_cxx_gen.hpp" // generated
#include <iostream>
#include <vector>
using namespace std;
typedef sqlgg<sqlite3_traits> gen;
typedef long long int64;
int main()
{
sqlite3* db = NULL;
sqlite3_open(":memory:", &db);
// create tables
gen::create_person(db);
gen::create_money(db);
// add all person records
gen::add_person(db,"John","Black");
int64 john = sqlite3_last_insert_rowid(db);
gen::add_person(db,"Ivan","Petrov");
int64 ivan = sqlite3_last_insert_rowid(db);
gen::add_person(db,"Sancho","Alvares");
int64 sancho = sqlite3_last_insert_rowid(db);
// add money relations
gen::add_money(db,john,ivan,200);
gen::add_money(db,john,sancho,100);
gen::add_money(db,john,sancho,250);
gen::add_money(db,sancho,ivan,300);
// summarize by person
typedef vector<gen::data_4> collection;
collection all;
gen::calc_total(db,all);
// output
cout << "Total transfers:" << endl;
for (collection::const_iterator i = all.begin(), end = all.end(); i != end; ++i)
{
cout << i->fullname << " = " << i->total << endl;
}
// list donors
typedef vector<gen::data_5> people;
people p;
gen::list_donors(db,p,"petrov",100);
cout << "Donors:" << endl;
for (people::const_iterator i = p.begin(), end = p.end(); i != end; ++i)
{
cout << i->surname << endl;
}
// properly close database
sqlite3_close(db);
return 0;
}
The code is straightforward and free of SQL-specific details. More importantly it is statically
checked by the compiler -- suppose we change the database schema and add one more field to money
table (e.g. timestamp of transfer). Compilation rightfully fails:
demo_cxx.cpp: In function `int main()':
demo_cxx.cpp:29: error: no matching function for call to
`sqlgg<sqlite3_traits>::add_money(sqlite3*&, int64&, int64&, int)'
demo_cxx_gen.hpp:75: note: candidates are: static bool sqlgg<Traits>::add_money(typename
Traits::connection, const typename Traits::Int&, const typename Traits::Int&, const typename
Traits::Int&, const typename Traits::Int&) [with Traits = sqlite3_traits]
The main idea is that the generator should take care only of semantic binding between SQL and code sides, being as unobtrusive as possible. So the choice of the specific database and API is a programmer's choice. Similarly, queries to the database are expressed in plain SQL, so that the generator can be easily plugged in any existing project -- just move all SQL statements used in the code to separate file and feed it to generator.
Distinguishing feature of sqlgg is that it starts off with actual SQL queries, not object models or SQL table descriptions. Ideally, it generates {meta,template,abstract,generic} code which is later specialized for actual database/environment (if the target language provides such ability).
This is work in progress and there is plenty of room for improvement. For now the status of this project is works for me . I use it for some simple database-access code with sqlite3 engine (using suitable sqlite3_traits.hpp helper). This project was started when I found myself editing existing code with tons of C++ wrappers for SQL queries, each binding several parameters and decomposing results.
Sqlgg understands the subset of the standard SQL language with arbitrary database-specific extensions. The goal is to cover as much SQL as possible in order to accept most of the real-world SQL statements unmodified.
Available output languages:
2009-05-25