SQL query parser and binding code generator for C#, C++, Java, OCaml.
Writing database layer code is usually tedious and error-prone, due to the mix of different languages. SQL queries constructed dynamically need to bind external data (from application), and the resulting rowset must be decomposed into application native data. The impedance mismatch between database layer and application code, as well as the absence of static (compile-time) checking, frequently leads to runtime failures (and sometimes even security holes). One can factor out all common database communication code, hide the database under some application-specific abstraction, but one always needs to manually specify correspondence between SQL query binding slots (or resulting rowset columns) and code variables. This mapping should be updated manually every time SQL query is modified.
sqlgg is an SQL query parser and code generator which ensures that application code and database queries are in sync. It analyzes SQL query and determines the set of input parameters (values for INSERT, run-time substitution parameters) and the set of resulting columns (for SELECT). Then it generates the code in host language, mapping query parameters on function arguments with corresponding native data types. So basically you provide an SQL query and generator creates a function which takes the set of typed parameters as required to fill slots in a query. Generated code binds provided parameters into query and executes it. The code for SELECT statements additionally invokes a strongly-typed callback function for every row of the resulting rowset. The most fruitful consequence of such approach is that the host language compiler will itself check that functions generated from SQL queries will be called correctly (i.e. all parameters bound with correct types). So if you modify the query and forget to update the code – the compiler will point on erroneous parts.
Suppose we have the database of some money relationships for the group of people. We use the following SQL tables and queries:
-- @create_person
CREATE TABLE person (id INTEGER PRIMARY KEY AUTOINCREMENT,name TEXT,surname TEXT);
-- @add_person
INSERT INTO person (name,surname) VALUES;
-- @create_money
CREATE TABLE money (src INTEGER, dst INTEGER, amount INTEGER);
-- @add_money
INSERT INTO money VALUES;
-- @calc_total
SELECT name || ' ' || surname AS fullname, SUM(amount) as total FROM person JOIN money ON src = id GROUP BY id;
-- @list_donors
SELECT DISTINCT surname FROM person JOIN money ON src = id AND dst = (SELECT id FROM person WHERE surname LIKE ?) LIMIT ?;
DROP TABLE IF EXISTS person;
DROP TABLE IF EXISTS money;
Generate the binding C++ code (boilerplate omitted, only function prototypes shown):
// DO NOT EDIT MANUALLY
//
// generated by sqlgg 0.2.2+ (f5b76ac7) on 2009-05-30T19:22Z
// visit http://ygrek.org/p/sqlgg/
#pragma once
template <class Traits>
struct sqlgg
{
struct create_person
{
create_person(typename Traits::connection db);
bool operator()();
}; // struct create_person
struct add_person
{
add_person(typename Traits::connection db);
bool operator()(typename Traits::Text const& name, typename Traits::Text const& surname);
}; // struct add_person
struct create_money
{
create_money(typename Traits::connection db);
bool operator()();
}; // struct create_money
struct add_money
{
add_money(typename Traits::connection db);
bool operator()(typename Traits::Int const& src, typename Traits::Int const& dst, typename Traits::Int const& amount);
}; // struct add_money
struct calc_total
{
calc_total(typename Traits::connection db);
template<class T>
bool operator()(T result);
}; // struct calc_total
struct list_donors
{
list_donors(typename Traits::connection db);
template<class T>
bool operator()(typename Traits::Text const& _0, typename Traits::Int const& _1, T result);
}; // struct list_donors
struct drop_person
{
drop_person(typename Traits::connection db);
bool operator()();
}; // struct drop_person
struct drop_money
{
drop_money(typename Traits::connection db);
bool operator()();
}; // struct drop_money
create_person create_person;
add_person add_person;
create_money create_money;
add_money add_money;
calc_total calc_total;
list_donors list_donors;
drop_person drop_person;
drop_money drop_money;
sqlgg(typename Traits::connection db) : create_person(db), add_person(db), create_money(db), add_money(db), calc_total(db), list_donors(db), drop_person(db), drop_money(db)
{
}
}; // struct sqlgg
Things to note above:
[sqlgg] name=function_name
(or simply @function_name
) before the query specifies the name of the generated function (this annotation is optional but very convenient).Traits
. It specifies the correspondence between SQL and native types, provides types for database connection and implements actual code to execute statements. Traits
should be implemented once for every specific database API.struct
which takes connection object as a constructor argument. This allows to reuse prepared statements several times (if database supports this feature).add_money()
function call operator takes three data parameters – the values to INSERT into table (note the names and types).calc_total()
returns data via result
parameter, which is a callback that gets executed for each row of the resulting rowset, with statically typed parameters binding each column of a row. In this case the parameters are fullname
of type Traits::Text
and total
of type Traits::Int
.list_donors
were inferred correctly (limit
is Int
and surname
is Text
.#include "../impl/sqlite3_traits.hpp" // sqlite3 traits
#include "demo_cxx_gen.hpp" // generated
#include <iostream>
#include <vector>
using namespace std;
typedef sqlgg<sqlite3_traits> gen_t;
typedef long long int64;
struct output_transfers
{
void operator()(std::string const& fullname, int total)
{
cout << fullname << " = " << total << endl;
}
};
struct output_donors
{
void operator()(std::string const& surname)
{
cout << surname << endl;
}
};
int main()
{
sqlite3* db = NULL;
sqlite3_open(":memory:", &db);
gen_t gen(db);
// create tables
gen.create_person();
gen.create_money();
// add all person records
gen.add_person("John","Black");
int64 john = sqlite3_last_insert_rowid(db);
gen.add_person("Ivan","Petrov");
int64 ivan = sqlite3_last_insert_rowid(db);
gen.add_person("Sancho","Alvares");
int64 sancho = sqlite3_last_insert_rowid(db);
// add money relations
gen.add_money(john,ivan,200);
gen.add_money(john,sancho,100);
gen.add_money(john,sancho,250);
gen.add_money(sancho,ivan,300);
// summarize by person
cout << "Total transfers:" << endl;
gen.calc_total(output_transfers());
// list donors
cout << "Donors:" << endl;
gen.list_donors("petrov",100,output_donors());
// properly close database
sqlite3_close(db);
return 0;
}
The code is straightforward and free of most database-specific details. More importantly it is statically checked by the compiler – suppose we change the database schema and add one more field to money
table (e.g. timestamp of transfer). Compilation rightfully fails:
demo_cxx.cpp: In function `int main()':
demo_cxx.cpp:29: error: no matching function for call to
`sqlgg<sqlite3_traits>::add_money(sqlite3*&, int64&, int64&, int)'
demo_cxx_gen.hpp:75: note: candidates are: static bool sqlgg<Traits>::add_money(typename
Traits::connection, const typename Traits::Int&, const typename Traits::Int&, const typename
Traits::Int&, const typename Traits::Int&) [with Traits = sqlite3_traits]
Distinguishing feature of sqlgg is that it starts off with actual SQL queries, not object models or SQL table descriptions. Ideally, it generates {meta,template,abstract,generic} code which is later specialized for the actual database/environment by the means of the target language. The generated code doesn’t require any special runtime support and doesn’t impose any restrictions on the application.
The main idea is that the generator should take care only of semantic binding between SQL and code sides, being as unobtrusive as possible. So the choice of the specific database and API is a programmer’s choice. Similarly, queries to the database are expressed in plain SQL, so that the generator can be easily plugged in any existing project – just move all SQL statements used in the code to a separate file and feed it to generator. The generated code can be easily inspected just as any other code in the project, and you can even edit it manually if such need arises (though it is highly not recommended). In order to keep code and database layer in sync the generator must be invoked every time SQL queries are modified.
For now the status of this project is works for me . It is evolving and there is plenty of room for improvement. The generated code may occasionally change in incompatible way so it is recommended to keep generated code in the project tree, to prevent unexpected breakage. NB ocaml sqlite3 and mysql db traits are available as sqlgg.sqlite3
and sqlgg.mysql
packages respectively. For other languages will need to keep db traits in the project. Feedback/suggestions on output code structure/naming best fitting the target language/platform are very welcome.
This project was started when I found myself editing existing code with tons of C++ wrappers for SQL queries, each binding several parameters and decomposing results. I used it successfully for C++ desktop applications with sqlite3 engine and C#/Mono server-side component with mysql database and OCaml projects with sqlite3 and mysql too.
Sqlgg understands the subset of the standard SQL language with arbitrary database-specific extensions. The goal is to cover as much SQL as possible in order to accept most of the real-world SQL statements unmodified. It will not (currently) warn on incompatible syntax (e.g. from two different sql dialects) used in the same project (but it may do so in the future).
Available output languages:
Install from OPAM with:
opam install sqlgg
Or download latest release and build manually from ygrek.org or github.
See changes history.
Source repository at github or repo.or.cz.
2020-05-22
ygrek at autistici dot org