MicroDB

Today I am announcing the first Beta release of MicroDB: a super simple NoSQL embedded database.

MicroDB is an embedded database, meaning that there is no server to configure or connect to. Your application links to MicroDB through its API and the database is stored on the filesystem. MicroDB follows in the footsteps of several other NoSQL databases in which data is stored as documents instead of in tables. Data is modeled as POJOs and feels very natural to use. No more overhead from SQL queries and no more writing serialization code. MicroDB takes care of all the details of persisting data to disk.

MicroDB is essentially a serialization engine built on top of LevelDB, which is itself an embedded key-value database. MicroDB stores data documents in UBJSON format, a type of binary JSON. I chose UBJSON over regular JSON because it is more efficient and a little easier to parse. MicroDB always assigns a UUID primary key to each data document, and a document committed to MicroDB can always be recalled using this id. MicroDB also features a very simple 'map-reduce' framework that is used to create arbitrary additional indices on data.
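To make the idea of a derived index concrete, here is a minimal in-memory sketch of the "map" half of such a scheme: a function emits a key for each document, and a sorted map from emitted key to document id serves range queries. Every name here (IndexSketch, insert, query) is hypothetical and chosen for illustration. This is not MicroDB's API, it leaves out the reduce step entirely, and it uses plain Java maps where the real thing would sit on LevelDB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical illustration only: none of these names come from MicroDB.
class IndexSketch<D, K extends Comparable<K>> {
  // Primary storage: document id -> document.
  private final Map<String, D> primary = new TreeMap<>();
  // Secondary index: emitted key -> ids of the documents that emitted it.
  private final TreeMap<K, List<String>> index = new TreeMap<>();
  // The "map" step: derives an index key from a document.
  private final Function<D, K> mapFn;

  IndexSketch(Function<D, K> mapFn) {
    this.mapFn = mapFn;
  }

  // Stores the document under a fresh UUID and records it in the index.
  String insert(D doc) {
    String id = UUID.randomUUID().toString();
    primary.put(id, doc);
    index.computeIfAbsent(mapFn.apply(doc), k -> new ArrayList<>()).add(id);
    return id;
  }

  // Primary-key lookup.
  D get(String id) {
    return primary.get(id);
  }

  // Returns all documents whose emitted key lies in [from, to], in key order.
  List<D> query(K from, K to) {
    List<D> out = new ArrayList<>();
    for (List<String> ids : index.subMap(from, true, to, true).values()) {
      for (String id : ids) {
        out.add(primary.get(id));
      }
    }
    return out;
  }
}
```

For example, new IndexSketch<String, Integer>(String::length) indexes strings by their length, so query(3, 5) returns every stored string that is three to five characters long.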

Here is how data is modeled and stored in MicroDB using the Java bindings:

@DBObj
public class Person extends DBObject {

  private String firstName;
  private String lastName;
  private int age;

  public Link<Address> address;

  //transient fields are ignored
  private transient String notSaved;

  // + Standard setters and getters here
}

MicroDB uses a Java annotation processor to generate code that handles the serialization and deserialization to/from UBJSON format. This removes a ton of tedious and error-prone work for the programmer.

mDatabase = DBBuilder.builder(new File("path/to/dbfile"))
              .build();

//...

Person newPerson = mDatabase.create(Person.class);
newPerson.setFirstName("Santa");

System.out.println("Created new person object. Database id is: " + newPerson.getId());

//...

Person samePerson = mDatabase.get(someid, Person.class);  

MicroDB emphasizes simplicity for the programmer. Ideally, the programmer is not even aware that data is being stored on disk, because MicroDB marshals data objects back and forth behind the scenes. Let's take a look at the create and get functions to get an idea of how MicroDB manages this:

public synchronized <T extends DBObject> T create(Class<T> classType) {  
  T retval = classType.newInstance();
  UBObject data = new UBObject();
  UBValue key = UBValueFactory.createString(UUID.randomUUID().toString());
  data.put("id", key);
  retval.init(data, this);

  mLiveObjects.put(key, new SoftReference<DBObject>(retval));
  return retval;
}

Notice that create acts as a factory method: it creates a new instance of the object type, assigns it a unique id, and returns it.

public <T extends DBObject> T get(UBValue id, Class<T> classType) {  
  T retval;
  DBObject cached;
  SoftReference<DBObject> ref = mLiveObjects.get(id);
  if(ref != null && (cached = ref.get()) != null){
    retval = (T)cached;
  } else {
    UBObject data = mDriver.get(id);
    T newObj = classType.newInstance();
    newObj.init(data, this);
    retval = newObj;
    mLiveObjects.put(id, new SoftReference<DBObject>(retval));

  }

  return retval;
}

The get method is also pretty simple. It first checks whether the object already exists in the mLiveObjects cache; if it does not, the data is loaded from disk. Every MicroDB object has an automatically generated init function that handles the deserialization from UBObject.
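The lookup pattern in get can be reduced to a small standalone sketch. SoftCache and its loader parameter are hypothetical names used only for illustration and are not part of MicroDB; the loader stands in for reading from the driver and calling init. The point is that callers asking for the same id get the same live instance for as long as the garbage collector has not cleared the soft reference.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of a soft-reference object cache.
class SoftCache<K, V> {
  // Live objects, held softly so the GC may reclaim them under memory pressure.
  private final Map<K, SoftReference<V>> live = new HashMap<>();

  // Returns the cached instance for key, or loads and caches a new one.
  synchronized V get(K key, Function<K, V> loader) {
    SoftReference<V> ref = live.get(key);
    V value = (ref != null) ? ref.get() : null;
    if (value == null) {
      // Either never cached, or the referent has been collected.
      value = loader.apply(key);
      live.put(key, new SoftReference<>(value));
    }
    return value;
  }
}
```

Unlike the get method shown above, this sketch is synchronized, which is one simple way to keep two threads from loading the same id twice; whether MicroDB needs that depends on details not shown here.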

When DBObjects are about to be garbage collected, MicroDB first checks whether they have been marked dirty (an object is considered dirty if the application has modified any of its fields via its setters). If an object is found to be dirty, its data is written back to disk on a dedicated write thread. In this way, the programmer never has to be concerned with an object's persistent state. Objects just 'exist', and when their data changes, MicroDB writes those changes to disk.
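The dirty-flag idea can be sketched in isolation as follows. TrackedPerson, WriteBackSketch, and release are hypothetical names, not MicroDB classes. The explicit release call stands in for the notification that an object is about to be garbage collected, and a concurrent map stands in for the disk. Treat it as a sketch of the technique under those assumptions, not as how MicroDB implements it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical data object whose setter marks it dirty.
class TrackedPerson {
  final String id;
  private String firstName;
  private boolean dirty;

  TrackedPerson(String id) {
    this.id = id;
  }

  synchronized void setFirstName(String firstName) {
    this.firstName = firstName;
    this.dirty = true; // any modification through a setter marks the object dirty
  }

  synchronized boolean isDirty() {
    return dirty;
  }

  // Returns the current state and clears the dirty flag in one step.
  synchronized String snapshotAndClean() {
    dirty = false;
    return firstName;
  }
}

// Hypothetical write-back component with a single dedicated writer thread.
class WriteBackSketch {
  // Stand-in for on-disk storage: id -> serialized data.
  final Map<String, String> disk = new ConcurrentHashMap<>();
  private final ExecutorService writer = Executors.newSingleThreadExecutor();

  // Stand-in for the "about to be garbage collected" hook.
  void release(TrackedPerson p) {
    if (!p.isDirty()) {
      return; // clean objects need no write
    }
    String data = p.snapshotAndClean();
    writer.execute(() -> disk.put(p.id, data));
  }

  // Waits for pending writes to finish, then stops the writer thread.
  void close() {
    writer.shutdown();
    try {
      writer.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Taking the snapshot on the calling thread and handing only the copied data to the writer keeps the write consistent even if the object is modified again before the write runs.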

Paul Soucy
