Blog: How Can I Write This Query Using An ORM?

The Propel Team – 02 February 2011

The Propel mailing lists often shows that typical question: How can I write this complicated query using the new Query syntax? The answer is not as simple as referring to the right section in the extensive Query documentation, because most of the times the Query object is not the solution. And actually, the true answer is complicated, because it implies a deep understanding of the Object Relational Mapper approach. Let’s see through a few examples how various answers can lead to a better usage of ORMs.

Answer #1: You Don’t Need An ORM

A recent post on the propel-users mailing list asked for the Propel version of the following query:

SELECT COUNT(t1.user) AS users, t1.choice AS lft, t2.choice AS rgt
FROM Choices t1 iNNER JOIN Choices t2 ON (t1.user = t2.user)
WHERE t1.choice IN (...) AND t2.choice IN (...)
GROUP BY t1.choice, t2.choice;

This query is not object-oriented, it’s purely relational, so it doesn’t need an Object-Relational Mapping. The best way to execute this query inside an ORM is to skip the ORM and use PDO directly:

<?php
$con = Propel::getConnection();
$query = 'SELECT COUNT(t1.user) AS users, t1.choice AS lft, t2.choice AS rgt
  FROM choice t1 iNNER JOIN choice t2 ON (t1.user = t2.user)
  WHERE t1.choice IN (?, ?) AND t2.choice IN (?, ?)
  GROUP BY t1.choice, t2.choice';
$stmt = $con->prepare($query);
$stmt->bindValue(1, 'foo');
$stmt->bindValue(2, 'bar');
$stmt->bindValue(3, 'baz');
$stmt->bindValue(4, 'foz');
$res = $stmt->execute();

Hints of a purely relational query are:

  • The SELECT part cherry-picks some columns of the main table
  • The SELECT part aggregates data from several tables
  • The selected columns use vendor-specific SQL functions
  • The query joins tables through columns that don’t share a foreign key
  • The query is long and makes several joins
  • The query uses GROUP BY or HAVING
  • The user posts the query, but has no idea of the corresponding object model

That’s the most common answer to the “How Can I Write…” question. It is not a bad thing to resort to a direct database query inside a project using an ORM when it’s the right tool for the job. If Propel makes the code much more complex to write, not reusable, or painfully slow, then don’t use it. Be pragmatic.

Answer #2: You Don’t Need a Query Object

Some queries appear closer to the object-oriented world but still very complex. The WHERE and JOIN parts still look very long, there may even be a subselect, but the user selects all the columns of the main table, and expects ActiveRecord objects as a result. For instance:

// find all the books not reviewed by :name
SELECT * FROM book
WHERE id NOT IN (SELECT book_review.book_id FROM book_review
 INNER JOIN author ON (book_review.author_id=author.ID)
 WHERE author.last_name = :name);

Crafting this query using Propel’s Query objects (whether Criteria or ModelCriteria) would take a long time, or might even be close to impossible. But the query is already there, so why use the Query objects? To get Model objects as a result? You don’t need Query objects for that, just use a formatter object:

<?php
// prepare and execute an arbitrary SQL statement
$con = Propel::getConnection(BookPeer::DATABASE_NAME);
$sql = "SELECT * FROM book WHERE id NOT IN "
    ."(SELECT book_review.book_id FROM book_review"
    ." INNER JOIN author ON (book_review.author_id=author.ID)"
    ." WHERE author.last_name = :name)";
$stmt = $con->prepare($sql);
$stmt->execute(array(':name' => 'Austen'));

// hydrate Book objects with the result
$formatter = new PropelObjectFormatter();
$formatter->setClass('Book');
$books = $formatter->format($stmt);

Once again, if you already have a working query and if there is no possible reuse, PDO can be the right tool for the job. Propel can hydrate Model objects based on a PDO resultset, so all you need is a PropelObjectFormatter. Yan can even hydrate objects from several tables in a row (in a similar fashion to what Propel does with with()) using a properly configured Formatter object.

That’s closer to the ORM philosophy, because you eventually deal with objects. But the Query itself is everything but object-oriented.

Answer #3: You don’t Need A Full Query

Sometimes the query is just long, and users find it tedious to use Query methods instead of plain SQL. The problem often reveals a bad usage of the Query objects prior to that. For instance, consider the following query:

SELECT * FROM book
LEFT JOIN author ON (book.AUTHOR_ID=author.ID)
WHERE book.TITLE like '%war%'
AND book.PRICE < 10
AND book.PUBLISHED_AT < now()
AND author.FAME > 10;

If someones asks for a Propel query version of the SQL query, it’s probably because the job of adding simple methods to the Query class wasn’t executed before. It’s very likely that a previous query in the same project looked like:

// find cheap books
SELECT * FROM book
WHERE book.PRICE < 10;

And another one looked like:

// find published books
SELECT * FROM book
WHERE book.PUBLISHED_AT < now();

And maybe even one like:

// find books by famous authors
SELECT * FROM book
LEFT JOIN author ON (book.AUTHOR_ID=author.ID)
WHERE author.FAME > 10;

You get the point: little pieces of the query are reusable, and may even have been written previously. The proper way to handle these cases would be to improve the BookQuery class little by little, as follows:

<?php
class BookQuery extends BaseBookQuery
{
  public function cheap($maxPrice = 10)
  {
    return $this->filterByPrice(array('max' => $maxPrice));
  }

  public function published()
  {
    return $this->filterByPublishedAt(array('max' => time()));
  }

  public function writtenByFamousAuthors($fameTreshold = 10)
  {
    return $this
      ->leftJoin('Book.Author')
      ->where('Author.Fame > ?', $fameTreshold);
  }
}

Now writing the query becomes trivial:

<?php
$books = BookQuery::create()
  ->filterByTitle('%war%')
  ->cheap()
  ->published()
  ->writtenByFamousAuthors();

And since filtering on a word in the book title may be a common need, this ability should be added to the BookQuery class:

<?php
class BookQuery extends BaseBookQuery
{
  // ...
  public function titleContainsWord($word)
  {
    return $this->filterByTitle('%' . $word . '%');
  }

Now the query is even easier to write, and more readable as well:

<?php
$books = BookQuery::create()
  ->titleContainsWord('war')
  ->cheap()
  ->published()
  ->writtenByFamousAuthors();

The idea is to add meaningful methods to the Query class piece by piece, so you never have to bake complex SQL. By doing so, you will realize that the Query classes contains more and more of your business logic, while the database only contains data. That’s a step further in the ORM paradigm.

Answer #4: You Need More Than One Query

Computer Science taught you to minimize queries, so if you were a good student, you might end up with queries looking like the following:

// find all books written by Alexandre Dumas, fils, and Alexandre Dumas, père
SELECT * FROM book
LEFT JOIN author ON book.AUTHOR_ID=author.ID
WHERE author.LAST_NAME = 'Dumas';

But in the Object-Oriented world, it’s not a Bad Thing to execute several queries in a row. It may even make your code a lot clearer:

<?php
$dumasAuthors = AuthorQuery::create()
  ->filterByLastName('Dumas');
  ->find();
$books = BookQuery::create()
  ->filterByAuthor($dumasAuthors) // ok, it's only possible in Propel 1.6 :)
  ->find();

By doing so, you move some logic away from the database (the join) and back to the PHP code (filtering by objects). You may pay the expense of an additional trip to the database, but in the end your model logic is more decoupled, and fully object-oriented. And depending on the indices present in the tables, some PHP logic and two SQL queries may be faster to execute than an single SQL query with all the logic.

Propel makes it even better: you can keep the single SQL query while actually using two query objects by embedding queries. That’s exactly what the useXXXQuery() methods allow:

<?php
$books = BookQuery::create()
  ->useAuthorQuery()
    ->filterByLastName('Dumas')
  ->endUse()
  ->find();

Combining several query objects allows for very complex queries in a very reusable way.

Answer #5: You Don’t Use The Right Query

While we are at separating queries, maybe part of the logic of a complex query can be moved to another write query. Let’s see an example:

// show all Dumas authors, together with the number of books they wrote
SELECT author.*, count(book.ID) as nb_books
FROM author LEFT JOIN book ON (author.ID = book.AUTHOR_ID)
WHERE author.LAST_NAME = 'Dumas'
GROUP BY author.ID;

The count() might be expensive, especially on a large book table. It may be a better idea to denormalize the author table to add a nb_books column, updated each time a book is added or removed for a given author. Once again, this might sound counterintuitive to serious Computer Science students, but it’s a very common technique in the ORM world.

Propel makes this kind of denormalization a piece of cake thanks to the aggregate_column behavior. In fact, you don’t even have to worry about keeping the column up to date. Just set it up in the schema, and you’re good to go:

<table name="author">
  <column name="id" type="INTEGER" required="true" primaryKey="true" autoIncrement="true" />
  <column name="first_name" type="VARCHAR" />
  <column name="last_name" type="VARCHAR" required="true" primaryString="true" />
  <behavior name="aggregate_column">
    <parameter name="name" value="nb_books" />
    <parameter name="foreign_table" value="book" />
    <parameter name="expression" value="COUNT(id)" />
  </behavior>
</table>
<table name="book">
  <column name="id" type="INTEGER" required="true" primaryKey="true" autoIncrement="true" />
  <column name="title" type="VARCHAR" required="true" primaryString="true" />
  <column name="author_id" type="INTEGER" />
  <foreign-key foreignTable="author" onDelete="cascade">
    <reference local="author_id" foreign="id" />
  </foreign-key>
</table>

Now use the Query object to retrieve Author objects, and the number of books comes free of charge:

<?php
$authors = AuthorQuery::create()
  ->filterByLastName('Dumas')
  ->find();
foreach ($authors as $author) {
  echo $author->getFirstName(), ': ', $author->getNbBooks(), "\n";
}

Web applications often execute much more read queries than write queries. If a read query is complex and expensive in execution time, then you might consider simplifying it by adding more data at write time.

Conclusion

Propel offers a lot of ways to deal with complex queries. But if there is one thing to remember, it’s that in an ORM world you should think about objects, not SQL. If you come up with a complex SQL query to translate, it means you’ve probably taken the problem upside down. Put your business logic in the right place (in ActiveRecord or Query classes), and you’ll quickly forget about the pain of complex SQL queries.