What is an ORM and Why the Madness?





After discussing HarperDB with the developer community this past fall, one of the most commonly asked questions was “Do you have an ORM?” The simple answer is no, but the reasoning behind that answer is more nuanced than we typically have time to discuss. Our “no” is not because we haven’t had the development cycles to create or integrate with an ORM. Our “no” is because we believe an ORM is a convenient but flawed design pattern, both in how applications are built and in how some databases handle data.

What is an ORM?

ORM stands for Object-Relational Mapping. ORMs are plugins/APIs that abstract a database from application code by internalizing the connection, connection pooling and CRUD operations, and by mapping your application’s objects to your database tables. In a bare-bones implementation that talks to the database directly, a developer would have to write something like:


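// 'database_library' is a placeholder for the database driver of your choice.
const database = require('database_library');

// Connection details the application has to manage itself.
const con = database.createConnection({
  host: "localhost",
  user: "yourusername",
  password: "yourpassword",
  database: "mydb"
});

con.connect(function(err) {
  if (err) throw err;
  // Hand-written SQL; the raw result is logged as-is.
  con.query("SELECT * FROM dogs WHERE id = 10", function (err, result, fields) {
    if (err) throw err;
    console.log(result);
  });
});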

Compare that with something that aligns more closely with your Object structure and is more intuitive:

Dog d = Dog.get(10);

Some ORMs can also do basic code generation for you. They can analyze your database schema and build out your core Objects, which is typically a tedious task.

How does this help?

This sounds awesome! In theory, a developer doesn’t need to write or even know SQL, or understand how the database works. They can simply work with their objects and build amazing apps. If the organization needs to migrate from one database to another, the ORM is simply configured to point to the new database and no other code needs to change. The need to write and maintain extra code that parses database results into objects and converts objects into SQL statements is over. In a nutshell, the ORM is an amazing Rosetta Stone.
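To make that "extra code" concrete, here is a minimal sketch of the hand-written mapping an ORM promises to take off your plate. Everything in it is hypothetical: `fakeDb` stands in for a real driver, and the `dogs` columns (`dog_name`, `owner_name`) are invented for illustration.

```javascript
// Hypothetical stand-in for a database driver: it accepts SQL text plus
// parameters and returns rows as plain records with snake_case columns.
const fakeDb = {
  rows: [{ id: 10, dog_name: "Penny", owner_name: "Kyle" }],
  query(sql, params) {
    if (sql.startsWith("SELECT")) {
      return this.rows.filter((row) => row.id === params[0]);
    }
    return [];
  },
};

class Dog {
  constructor(id, name, ownerName) {
    this.id = id;
    this.name = name;
    this.ownerName = ownerName;
  }

  // Row -> object: rename columns to match the application's fields.
  static fromRow(row) {
    return new Dog(row.id, row.dog_name, row.owner_name);
  }

  // Object -> SQL statement: the reverse mapping, also written by hand.
  toInsert() {
    return {
      sql: "INSERT INTO dogs (id, dog_name, owner_name) VALUES (?, ?, ?)",
      params: [this.id, this.name, this.ownerName],
    };
  }

  // The one-liner an ORM exposes, backed by the two mappings above.
  static get(db, id) {
    const rows = db.query(
      "SELECT id, dog_name, owner_name FROM dogs WHERE id = ?",
      [id]
    );
    return rows.length > 0 ? Dog.fromRow(rows[0]) : null;
  }
}

const d = Dog.get(fakeDb, 10);
console.log(d.name, d.ownerName);
```

Every new column means touching `fromRow`, `toInsert` and the SELECT list, which is exactly the maintenance burden an ORM sets out to remove.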

Cool, but…

It’s just not that simple. ORMs are complicated systems because they solve a complicated problem and take a heavy lift off of developers. The first issue is configuration. In order to effectively tune your ORM, someone on your team needs to understand the database at a deep level, then understand how the ORM talks to the database, and finally set the minutiae of configuration options. That is just to get connected properly. After that, the Objects that are meant to be abstracted from the database need to be hooked to the ORM, and some form of query language, if not raw SQL, still needs to be written. In many instances you end up coding to the guidelines of the ORM and find yourself in a narrow box. If you need to interact with your data in a way the ORM did not intend, you will need to code significant workarounds.

As intended by the ORM, you are abstracted from your database, so you no longer know what is happening when things go wrong. You have a kick-ass IDE and elite debugging skills, but if the issue lies inside the murky code of the ORM you now need to crack open that black box and/or run a query analyzer on the database side. Once you do find the issue and determine that it comes from the ORM, there may not be a lot you can do other than submit an issue, or fork the code yourself if you feel really brave.

Performance is the final, and by no means the least significant, issue with ORMs. They have a great understanding of databases and of mapping them to Objects, but not necessarily of your database or your Objects. An ORM will get you about 80% of the way to data nirvana, but that final 20% is a damn slog, and if your app needs to scale or respond in milliseconds, the architecture choice of an ORM will make that 20% brutal to attain. ORMs need to perform reflection, or some form of iteration and comparison of your Object structure against the results returned, to convert those results into your Objects, and this is costly. Also, because your ORM is doing generalized work, it could be executing a loop of queries where a single query would have done, causing performance bottlenecks that just don’t need to happen.
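The "loop of queries" problem is easiest to see by counting round trips. The sketch below is not any particular ORM; it uses hypothetical in-memory tables (`owners`, `dogs`) and helper functions that only count how many times the "database" is hit.

```javascript
// Hypothetical in-memory tables standing in for a real database.
const owners = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bo" },
  { id: 3, name: "Cy" },
];
const dogs = [
  { id: 10, owner_id: 1, name: "Rex" },
  { id: 11, owner_id: 1, name: "Fido" },
  { id: 12, owner_id: 2, name: "Spot" },
  { id: 13, owner_id: 3, name: "Lady" },
];

// Each helper represents one round trip to the database.
let queryCount = 0;
function selectOwners() {
  queryCount += 1;
  return owners.map((o) => ({ ...o }));
}
function selectDogsByOwner(ownerId) {
  queryCount += 1;
  return dogs.filter((dog) => dog.owner_id === ownerId);
}
function selectOwnersJoinDogs() {
  queryCount += 1; // a single JOIN returns one row per (owner, dog) pair
  return owners.flatMap((o) =>
    dogs
      .filter((dog) => dog.owner_id === o.id)
      .map((dog) => ({ owner: o.name, dog: dog.name }))
  );
}

// Generalized, lazy-loading style: one query for owners, then one per owner.
function loadLazily() {
  queryCount = 0;
  const result = selectOwners().map((o) => ({
    owner: o.name,
    dogs: selectDogsByOwner(o.id).map((dog) => dog.name),
  }));
  return { result, queries: queryCount };
}

// Hand-tuned style: one joined query, grouped in application code.
function loadWithJoin() {
  queryCount = 0;
  const grouped = new Map();
  for (const row of selectOwnersJoinDogs()) {
    if (!grouped.has(row.owner)) grouped.set(row.owner, []);
    grouped.get(row.owner).push(row.dog);
  }
  const result = [...grouped].map(([owner, names]) => ({ owner, dogs: names }));
  return { result, queries: queryCount };
}

const lazy = loadLazily();
const joined = loadWithJoin();
console.log(lazy.queries, joined.queries);
```

Both functions return the same data, but the lazy version's query count grows with the number of owners while the joined version stays at one.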

Where do we go from here?

Be sure you are architecting a solution that makes sense for all of your needs, technical and business alike. Many times aspects of a project are chosen because they are the new hotness, because they are what the team is familiar with, or because they seem like they will get the project off the ground in no time. But the real question is: will it do what we want it to do, and where are the edges? Hand-coding a DAO (Data Access Object) sounds like a massive pain, and takes time your team could be spending on the core app. So maybe the problem isn’t necessarily with ORMs. It could be with our databases themselves and how they have not kept up with our programming languages. If you need to work with your data as Objects, shouldn’t your data store natively handle Objects?

NoSQL is your answer! These data stores were built to natively ingest and return Objects. You can throw away that bulky ORM and implement a simple SDK or, better yet, perform direct access with some code snippets. Access is simple, mapping between data and Object is intrinsic, and the ingest rate is high. Again, where are the edges? Typically the edge lies in deep analytics. You have some indices set up that cover your basic app use cases, and then the business asks you to create dashboards that require table joins, aggregates and conditions on non-indexed columns. Adding indices would be expensive, either financially, from a server resource perspective, or both. Essentially the business is now asking for things your database can’t handle, or can’t handle easily.
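As a rough illustration of that edge, consider a toy object store where only the primary key is indexed. `ToyStore`, its `examined` counter and the dog records are all hypothetical; real NoSQL engines are far more sophisticated, but the shape of the cost is similar.

```javascript
// Hypothetical toy document store: objects go in and come out unchanged,
// and only the primary key (id) is indexed.
class ToyStore {
  constructor() {
    this.byId = new Map();
    this.examined = 0; // how many records a request had to touch
  }

  insert(record) {
    this.byId.set(record.id, record);
  }

  // Indexed access: one record touched, regardless of table size.
  get(id) {
    this.examined += 1;
    return this.byId.get(id);
  }

  // Condition on a non-indexed attribute: every record must be checked.
  where(predicate) {
    const matches = [];
    for (const record of this.byId.values()) {
      this.examined += 1;
      if (predicate(record)) matches.push(record);
    }
    return matches;
  }
}

const store = new ToyStore();
for (let i = 1; i <= 1000; i += 1) {
  store.insert({
    id: i,
    breed: i % 4 === 0 ? "husky" : "beagle",
    weight: 20 + (i % 10),
  });
}

// The app's everyday use case: fetch one object by key.
store.examined = 0;
const one = store.get(10);
const lookupCost = store.examined;

// The dashboard request: an aggregate over a non-indexed attribute.
store.examined = 0;
const huskies = store.where((r) => r.breed === "husky");
const scanCost = store.examined;
const avgWeight =
  huskies.reduce((sum, r) => sum + r.weight, 0) / huskies.length;

console.log({ lookupCost, scanCost, huskies: huskies.length, avgWeight });
```

The keyed lookup touches one record while the dashboard query touches all of them, which is why such requests tend to force either new indices or a different tool.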

This is where a true SQL / NoSQL database like HarperDB shines. Our database handles Object ingest and retrieval natively, which keeps the need for an ORM out of the picture. HarperDB also handles full ANSI SQL, so when you need to write that dashboard you can execute SQL operations against the same API and get deep analytics, with no need to perform expensive scans or make multiple calls and transforms to turn raw data into something actionable.
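A sketch of what "the same API" can look like in practice: both requests below are JSON bodies POSTed to one endpoint. The field names follow the general shape of HarperDB's JSON operations API as we understand it, but treat them, the `dev.dog` table, the endpoint URL and the omitted authentication as assumptions to verify against the current documentation. No network call is made here; the code only builds the requests.

```javascript
// Assumption: operations are expressed as JSON bodies sent via HTTP POST.
// Field names and the dev.dog table are illustrative and should be checked
// against the HarperDB documentation before use.
function insertOperation(records) {
  return { operation: "insert", schema: "dev", table: "dog", records };
}

function sqlOperation(sql) {
  return { operation: "sql", sql };
}

// Wrap an operation as a request description (authentication omitted).
function toRequest(url, body) {
  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

const endpoint = "http://localhost:9925"; // hypothetical instance URL

// Object ingest: application objects are sent as-is, with no mapping layer.
const ingest = toRequest(
  endpoint,
  insertOperation([{ id: 10, dog_name: "Penny", breed: "husky", weight: 24 }])
);

// Dashboard query: plain SQL sent to the very same endpoint.
const dashboard = toRequest(
  endpoint,
  sqlOperation(
    "SELECT breed, AVG(weight) AS avg_weight FROM dev.dog GROUP BY breed"
  )
);

console.log(ingest.body);
console.log(dashboard.body);
```

The point of the sketch is the symmetry: the object write and the analytical query differ only in the body that is sent.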

Stephen wrote a post discussing Database as a Microservice. What comes along intrinsically with this model is that, since the database is returning objects to you, there is little need to wire in your own Objects. The database handles the ingest and generation of the Objects for you, offloading what has been a developer headache for decades. So rather than blame the ORM, I say we choose data stores that make sense for today’s developer: data stores that are agile, flexible across multiple use cases, and can be implemented with a few lines of code.