Rosetta 3.5
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Pages
Interface for SQLite3 Database Library

Metadata

Last edited 01/14/11. Matthew O'Meara mattj.nosp@m.omea.nosp@m.ra@gm.nosp@m.ail..nosp@m.com.

overview Overview of Relational Databases

Relational Databases are standard datastructures for persistent managment of large quantities of data. Compared with flat file formats, relational databases offer the following advantages:

Database Support In Rosetta

While standard implementations such as SQLite, PostgreSQ, MySQL and Oracle are in many ways quite similar, they each have unique details. To be useful in different contexts, Rosetta uses an abstraction library, cppdb, to present a common interface within the code supporting multiple databases backends.

Why the cppdb library?

When choosing to use cppdb as the database abstraction layer we evaluated several approaches and projects. The general strategy of an abstraction layer is to have backend drivers that each connect with a different library–in this case database engine–and frontend drivers that present an interface for a different programming environment. Effective abstraction layers should simplfy software design by avoiding having to implement interfaces for all combinations of programming environments and libraries.

Usage of Databases

Overview

The general process of using a database involves the following three basic tasks steps:

Session Management

Currently only SQLite3 databases are supported, though if adding more database types will be doable. To establish a session, resquest a session from the DatabaseSessionManager. #include <utility/sql_database/DatabaseSessionManager.hh>

...

using utility::sql_database::DatabaseSessionManager; using utility::sql_database::sessionOP;

DatabaseSessionManager * dsm = DatatabaseSessionManager::get_instance(); sessionOP db_session = dsm->get_session(database_filename); The DatabaseSessionManager is singleton mananaged on non-mpi builds and boost::auto_ptr managed for mpi builds. One thing to note: In order use owning pointers with a session object use the utility::sql_database::session. It derives from both cppdb::session and utility::pointer::ReferenceCount.

Usually your application will want to use the database filename the user specifies in the option system:

  -inout:database_filename

by looking it up like this:

  #include <basic/options/option.hh>
  #include <basic/options/keys/inout.OptionKeys.gen.hh>
  #include <string>

  // ...

  using namespace basic::options;
  using OptionKeys::inout;

  std::string database_filename(option[database_filename].value());

Once a database session has been established. One of the main tasks it to execute statements. For example to create a table:

  #include <cppdb/frontend.h> // for 'statement' and 'result' classes

  // ...

  using cppdb::statement;

  statement create_table_stmt = (*db_session) <<
    "CREATE TABLE table1 ("
    " column1 INTEGER PRIMARY KEY,"
    " value1 TEXT);";
  create_table_stmt.exec();

  statement insert_row_stmt = (*db_session) <<
    "INSERT INTO table1 (null,?)" << "hi";
  insert_row_stmt.exec();

Another main task is to execute queries over a database. For example:

  #include <cppdb/frontend.h> // for 'statement' and 'result' classes

  // ...

  using cppdb::statement;

  result res = (*db_session) <<
    "SELECT * FROM table1;";
  while(res.next()){
    int col1;
    string val;
    res >> col1 >> val;
    // use col1 and val
  }

For a working expanded example see test/utility/sql_database/DatabaseSessionManagerTests.cxxtest.hh

For more documentation on the cppdb API see the CppDB Online Documentation and external/dbio/cppdb/frontend.h.

Metadata

Last edited 10/18/10. Code and documentation by Matthew O'Meara mattj.nosp@m.omea.nosp@m.ra@gm.nosp@m.ail..nosp@m.com.

Overview

The SQLite3 library is simple SQL database engine. The Sqlite3Interface class makes SQLite databases from within Rosetta.

Why Use a database? Why use SQLite?

Do your csv tables have way to many columns? Do they not fit into memory anymore? Are you protocols turning into multiple phases and you are having trouble keeping track of which paramters go with which structure predictions? Are your runs so long they sometimes die and your data may be getting corrupted? Have you hacked together code to read several tables of parameters? Do you wish you could do "SELECT ... FROM ... WHERE ..." queries over all your data tables?

It may be time to think about using a database.

Data management tasks of different scales require different levels of sophistication. Simetimes using a spreadsheet is the right way to go. Sometimes having several comma separated tables in various directories is the right way to. And sometimes using a full fledged SQL database is the right way to go.

Don't be afraid, SQLite3 is pretty easy to use as far as databases go and the Sqlite3Interface makes about as easy as writing to a Tracer.

Databases like PostgreSQL or MySQL run as in a client-server server architecture. In contrast, SQLite is library linked directly into the program. Since each SQLite database is stored in memory or as a single, they are quite portable and easy to configure.

Setup for SQLite

Although SQLite is in the public domain it is not (currently) distributed with Rosetta. To prevent it from being global dependency, Developers using the Sqlite3Interface must specify they intend to use SQLite at compile time by adding 'sqlite' to the list of 'extra' options passed to scons.

./scons.py extras=sqlite  <other commands>

This will instruct gcc to define DB_SQLITE3 as a preprocessing variable which is used in #ifdef blocks in the code. Here is simple example which is fleshed out in the unit test test/utility/sql_database/Sqlite3Interface.cxxtest.hh.

#ifdef DB_SQLITE
// This requires the external dependency of the sqlite3 library
// To use compile with $scons.py extras=sqlite

#include <utility/sql_database/sqlite3_interface.hh>
using utility::sql_database;


Sqlite3Interface sqlite3_interface("/tmp/test_db.db3");

// A transaction groups together multiple statements to be more
// efficient.
sqlite3_interface.begin_transaction();

sqlite3_interface.execute_sql( "\
TABLE table1 (\
  id INTEGER PRIMARY KEY AUTOINCREMENT,\
  value REAL,\
  math_const TEXT,\
  awesome BOOL );");


sqlite3_interface
  << Sqlite3Interface::begin_row("table1")
  << Sqlite3Interface::sqlite3_null
  << 3.14
  << "pi"
  << true
  << Sqlite3Interface::end_row;

sqlite3_interface.end_transaction();


#endif // DB_SQLITE