20
votes

Object-relational mapping has been well discussed, including on here. I have experience with a few approaches and the pitfalls and compromises. True resolution seems like it requires changes to the OO or relational models themselves.

If using a functional language, does the same problem present itself? It seems to me that these two paradigms should fit together better than OO and RDBMS. The idea of thinking in sets in an RDBMS seems to mesh with the automatic parallelism that functional approaches seem to promise.

Does anyone have any interesting opinions or insights? What's the state of play in the industry?

7
Actually, does this really make sense? Functional programming does not provide any standard way of data modeling which could be somewhat compared to relational or OO data modeling. So asking for a "mapping" IMHO is not a meaningful question. One could ask if it would make sense to add functional concepts to an RDBMS, in fact SQL has already some functional concepts.Doc Brown

7 Answers

17
votes

What's the purpose of an ORM?

The main purpose of using an ORM is to bridge between the networked model (object orientation, graphs, etc.) and the relational model. And the main difference between the two models is surprisingly simple. It's whether parents point to children (networked model) or children point to parents (relational model).

With this simplicity in mind, I believe there is no such thing as an "impedance mismatch" between the two models. The problems people usually run into are purely implementation specific, and should be solvable, if there were better data transfer protocols between clients and servers.

How can SQL address the problems we have with ORMs?

In particular, the third manifesto tries to address the shortcomings of the SQL language and relational algebra by allowing for nested collections, which have been implemented in a variety of databases, including:

  • Oracle (probably the most sophisticated implementation)
  • PostgreSQL (to some extent)
  • Informix
  • SQL Server, MySQL, etc. (through "emulation" via XML or JSON)

In my opinion, if all databases implemented the SQL standard MULTISET() operator (e.g. Oracle does), people would no longer use ORMs for mapping (perhaps still for object graph persistence), because they could materialise nested collections directly from within the databases, e.g. this query:

SELECT actor_id, first_name, last_name,
  MULTISET (
    SELECT film_id, title
    FROM film AS f
    JOIN film_actor AS fa USING (film_id)
    WHERE fa.actor_id = a.actor_id
  ) AS films
FROM actor AS a

Would yield all the actors and their films as a nested collection, rather than a denormalised join result (where actors are repeated for each film).

Functional paradigm at the client side

The question whether a functional programming language at the client side is better suited for database interactions is really orthogonal. ORMs help with object graph persistence, so if your client side model is a graph, and you want it to be a graph, you will need an ORM, regardless if you're manipulating that graph using a functional programming language.

However, because object orientation is less idiomatic in functional programming languages, you are less likely to shoehorn every data item into an object. For someone writing SQL, projecting arbitrary tuples is very natural. SQL embraces structural typing. Each SQL query defines its own row type without the need to previously assign a name to it. That resonates very well with functional programmers, especially when type inference is sophisticated, in case of which you won't ever think of mapping your SQL result to some previously defined object / class.

An example in Java using jOOQ from this blog post could be:

// Higher order, SQL query producing function:
public static ResultQuery<Record2<String, String>> actors(Function<Actor, Condition> p) {
    return ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
              .from(ACTOR)
              .where(p.apply(ACTOR)));
}

This approach leads to a much better compositionality of SQL statements than if the SQL language were abstracted by some ORM, or if SQL's natural "string based" nature were used. The above function can now be used e.g. like this:

// Get only actors whose first name starts with "A"
for (Record rec : actors(a -> a.FIRST_NAME.like("A%")))
    System.out.println(rec);

FRM abstraction over SQL

Some FRMs try to abstract over the SQL language, usually for these reasons:

  • They claim SQL is not composable enough (jOOQ disproves this, it's just very hard to get right).
  • They claim that API users are more used to "native" collection APIs, so e.g. JOIN is translated to flatMap() and WHERE is translated to filter(), etc.

To answer your question

FRM is not "easier" than ORM, it solves a different problem. In fact, FRM doesn't really solve any problem at all, because SQL, being a declarative programming language itself (which is not so different from functional programming), is a very good match for other functional client programming languages. So, if anything at all, an FRM simply bridges the gap between SQL, the external DSL, and your client language.

(I work for the company behind jOOQ, so this answer is biased)

10
votes

The hard problems of extending the relational database are extended transactions, data-type mismatches, automated query translation and things like N+1 Select that are fundamental problems of leaving the relational system and -- in my opinion -- do not change by changing the receiving programming paradigm.

7
votes

That depends on your needs

  1. If you want to focus on the data-structures, use an ORM like JPA/Hibernate
  2. If you want to shed light on treatments, take a look at FRM libraries: QueryDSL or Jooq
  3. If you need to tune your SQL requests to specific databases, use JDBC and native SQL requests

The strengh of various "Relational Mapping" technologies is portability: you ensure your application will run on most of the ACID databases. Otherwise, you will cope with differences between various SQL dialects when you write manually the SQL requests .

Of course you can restrain yourself to the SQL92 standard (and then do some Functional Programming) or you can reuse some concepts of functionnal programming with ORM frameworks

The ORM strenghs are built over a session object which can act as a bottleneck:

  1. it manages the lifecycle of the objects as long as the underlying database transaction is running.
  2. it maintains a one-to-one mapping between your java objects and your database rows (and use an internal cache to avoid duplicate objects).
  3. it automatically detects association updates and the orphan objects to delete
  4. it handles concurrenty issues with optimistic or pessimist lock.

Nevertheless, its strengths are also its weaknesses:

  1. The session must be able to compare objects so you need to implements equals/hashCode methods. But Objects equality must be rooted on "Business Keys" and not database id (new transient objects have no database ID!). However, some reified concepts have no business equality (an operation for instance). A common workaround relies on GUIDs which tend to upset database administrators.

  2. The session must spy relationship changes but its mapping rules push the use of collections unsuitable for the business algorithms. Sometime your would like to use an HashMap but the ORM will require the key to be another "Rich Domain Object" instead of another light one... Then you have to implement object equality on the rich domain object acting as a key... But you can't because this object has no counterpart on the business world. So you fall back to a simple list that you have to iterate on (and performance issues result from).

  3. The ORM API are sometimes unsuitable for real-world use. For instance, real world web applications try to enforce session isolation by adding some "WHERE" clauses when you fetch data... Then the "Session.get(id)" doesn't suffice and you need to turn to more complex DSL (HSQL, Criteria API) or go back to native SQL

  4. The database objects conflicts with other objects dedicated to other frameworks (like OXM frameworks = Object/XML Mapping). For instance, if your REST services use jackson library to serialize a business object. But this Jackson exactly maps to an Hibernate One. Then either you merge both and a strong coupling between your API and your database appears Or you must implement a translation and all the code you saved from the ORM is lost there...

On the other side, FRM is a trade-off between "Object Relational Mapping" (ORM) and native SQL queries (with JDBC)

The best way to explain differences between FRM and ORM consists into adopting a DDD approach.

  • Object Relational Mapping empowers the use of "Rich Domain Object" which are Java classes whose states are mutable during the database transaction
  • Functional Relational Mapping relies on "Poor Domain Objects" which are immutable (so much so you have to clone a new one each time you want to alter its content)

It releases the constraints put on the ORM session and relies most of time on a DSL over the SQL (so portability doesn't matter) But on the other hand, you have to look into the transaction details, the concurrency issues

List<Person> persons = queryFactory.selectFrom(person)
  .where(
    person.firstName.eq("John"),
    person.lastName.eq("Doe"))
  .fetch();
3
votes

I'd guess functional to relational mapping should be easier to create and use than OO to RDBMS. As long as you only query the database, that is. I don't really see (yet) how you could do database updates without side effects in a nice way.

The main problem I see is performance. Todays RDMS are not designed to be used with functional queries, and will probably behave poorly in quite a few cases.

3
votes

I haven't done functional-relational mapping, per se, but I have used functional programming techniques to speed up access to an RDBMS.

It's quite common to start with a dataset, do some complex computation on it, and store the results, where the results are a subset of the original with additional values, for example. The imperative approach dictates that you store your initial dataset with extra NULL columns, do your computation, then update the records with the computed values.

Seems reasonable. But the problem with that is it can get very slow. If your computation requires another SQL statement besides the update query itself, or even needs to be done in application code, you literally have to (re-)search for the records that you are changing after the computation to store your results in the right rows.

You can get around this by simply creating a new table for results. This way, you can just always insert instead of update. You end up having another table, duplicating the keys, but you no longer need to waste space on columns storing NULL – you only store what you have. You then join your results in your final select.

I (ab)used an RDBMS this way and ended up writing SQL statements that looked mostly like this...

create table temp_foo_1 as select ...;
create table temp_foo_2 as select ...;
...
create table foo_results as
  select * from temp_foo_n inner join temp_foo_1 ... inner join temp_foo_2 ...;

What this is essentially doing is creating a bunch of immutable bindings. The nice thing, though, is you can work on entire sets at once. Kind of reminds you of languages that let you work with matrices, like Matlab.

I imagine this would also allow for parallelism much easier.

An extra perk is that types of columns for tables created this way don't have to be specified because they are inferred from the columns they're selected from.

3
votes

I'd think that, as Sam mentioned, if the DB should be updated, the same concurrency issues have to be faced as with OO world. The functional nature of the program could maybe be even a little more problematic than the object nature because of the state of data, transactions etc of the RDBMS.

But for reading, the functional language could be more natural with some problem domains (as it seems to be regardless of the DB)

The functional<->RDBMS mapping should have no big differences to OO<->RDMBS mappings. But I think that that depends a lot on what kind of data types you want to use, if you want to develop a program with a brand new DB schema or to do something against a legacy DB schema, etc..

The lazy fetches etc for associations for example could probably be implemented quite nicely with some lazy evaluation -related concepts. (Even though they can be done quite nicely with OO also)

Edit : With some googling I found HaskellDB (SQL library for Haskell) - that could be worth trying?

1
votes

Databases and Functional Programming can be fused.

for example:

Clojure is a functional programming language based on relational database theory.

               Clojure -> DBMS, Super Foxpro
                   STM -> Transaction,MVCC
Persistent Collections -> db, table, col
              hash-map -> indexed data
                 Watch -> trigger, log
                  Spec -> constraint
              Core API -> SQL, Built-in function
              function -> Stored Procedure
             Meta Data -> System Table

Note: In the latest spec2, spec is more like RMDB. see: spec-alpha2 wiki: Schema-and-select

I advocate: Building a relational data model on top of hash-map to achieve a combination of NoSQL and RMDB advantages. This is actually a reverse implementation of posgtresql.

Duck Typing: If it looks like a duck and quacks like a duck, it must be a duck.

If clojure's data model like a RMDB, clojure's facilities like a RMDB and clojure's data manipulation like a RMDB, clojure must be a RMDB.

Clojure is a functional programming language based on relational database theory

Everything is RMDB

Implement relational data model and programming based on hash-map (NoSQL)