Monday, September 27, 2010

On repositories

I was writing code for a massive (well, actually not so massive) data export in a legacy application, and I needed to retrieve all objects of a certain type modified after a given date, so I started writing a repository.

The simplest test that came to mind was something like this:
@Test
public void queryingWithDateInTheFutureReturnsEmptyList() {
    DateTime tomorrow = new DateTime().plusDays(1);
    List<MyObject> result = instance.findModifiedAfter(tomorrow);
    assertTrue(result.isEmpty());
}
Some clickety-clack (SHIFT+CTRL+I, ALT+SHIFT+F, CTRL+S, CTRL+F6 for the curious) and NetBeans gives me a red bar. That's good, as there's no code yet :-)

The first implementation was almost too easy to write, but since very few people believe I work this way, and all the rest think I'm wasting time, I'll publish it anyway and save my opinion for another occasion:

public List<MyObject> findModifiedAfter(DateTime startDate) {
    return Collections.emptyList();
}
So far, so good. Now, in the framework used in this project all persistent objects inherit from MyPersistenceObject (names have been changed to protect the innocent), which is a sort of Active Record with a hint of Row Data Gateway. Persistent objects may also be managed by a MyPersistenceObjectController object, which is a sort of DAO that also manages transactions. In case you wonder, the class is not a controller at all, but the original developers thought that the name would fit. Before we go on, let me say that the framework works pretty well under many circumstances.

Since there is no dedicated database that gets populated for each test and cleaned afterwards (did I mention that this is a legacy application?), the next test tried to retrieve some data:

@Test
public void queryingSinceLastMonthReturnsAtLeast5kObjects() {
    DateTime lastMonth = new DateTime().minusMonths(1);
    List<MyObject> result = instance.findModifiedAfter(lastMonth);
    assertTrue(result.size() > 5000);
}

I know it's not great, and that it breaks if someone truncates the corresponding table, but we must start somewhere, right? Reading the tests also suggests that the method should probably be named findModifiedSince, but that is hardly the point now, even if it shows another useful feature of writing tests.

A quick look through the existing code easily gave me a first implementation based on the "glorious" copy-paste-fix pattern:

public List<MyObject> findModifiedSince(final DateTime date) {
    // Nothing can have been modified in the future.
    if (date == null || date.isAfterNow()) {
        return Collections.emptyList();
    }

    try {
        Object[] values = new Object[]{date.getMillis()};
        String[] orderBy = null;
        // Condition on tms: greater than or equal to the given instant (in millis).
        Criteria criteria = new Criteria("tms", "MyObjectImpl");
        SimpleCondition simple = new SimpleCondition("tms", SimpleCondition.GE);
        criteria.add(Criteria.NOP, simple);
        Vector objects = new MyObjectImpl().retrieveByAlternateKey(
                criteria,
                values,
                orderBy);
        List<MyObject> result = new ArrayList<MyObject>();
        result.addAll(objects);
        return result;
    } catch (Exception ex) {
        // Note: any failure is swallowed and reported as an empty result.
        return Collections.emptyList();
    }
}
Clickety-clack, CTRL+F6...

...
...
(yawn)
...
...
(wtf?)
...
...

Green bar. After an insane amount of time. A little logging informed me that the test took 16 seconds to run, excluding the time needed to start and stop the persistence container.

A little profiling confirmed that memory usage also grew abnormally. And all this for a little more than 6000 objects...

Now, to quote Eric Evans, a repository is

...an object that can provide the illusion of an in-memory collection of all objects of that type.

Well, this framework cannot provide that illusion, at least not without freezing everything else. I must admit that lately I had been quite in clover, since with Hibernate I could simply write the very same method like this:


public List<MyObject> findModifiedSince(DateTime date) {
    return HibernateUtil.getSession()
            .createCriteria(MyObject.class)
            .add(Restrictions.ge("tms", date))
            .setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)
            .list();
}
and get my list, which is actually a list of proxies, in next to no time. Now Evans says that repositories

...return fully instantiated objects or collections of objects whose attribute values meet the criteria, thereby encapsulating the actual storage and query technology.

Proxies are not exactly fully instantiated objects, but as they pretend to be I'm prepared to live with that :-)
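
For readers who have not met the idea, here is a minimal hand-written sketch of what "pretending to be a fully instantiated object" can look like. It is only an illustration: ExportItem and LazyExportItemProxy are made-up names, and Hibernate generates its proxies at runtime rather than having you write them by hand.

```java
import java.util.function.Supplier;

// Hypothetical domain interface, standing in for MyObject.
interface ExportItem {
    long getId();
    String getTitle();
}

// A virtual proxy: it knows the id up front and loads the real object
// only when a caller asks for something it cannot answer by itself.
final class LazyExportItemProxy implements ExportItem {
    private final long id;
    private final Supplier<ExportItem> loader; // hypothetical hook that hits the database
    private ExportItem target;                 // stays null until first real access

    LazyExportItemProxy(long id, Supplier<ExportItem> loader) {
        this.id = id;
        this.loader = loader;
    }

    @Override
    public long getId() {
        return id; // answered without loading anything
    }

    @Override
    public String getTitle() {
        return target().getTitle(); // triggers the load on first use
    }

    private ExportItem target() {
        if (target == null) {
            target = loader.get();
        }
        return target;
    }
}
```

A list of such proxies is cheap to build, and the cost of loading is paid only for the objects that are actually touched.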

At this point all I could do was revert to the old give-me-the-ids-and-I'll-get-the-objects-myself approach :-(
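
For the curious, an ids-first repository could look roughly like the sketch below. Everything in it is an assumption made for illustration: ObjectGateway, findIdsModifiedSince, loadById and IdFirstRepository are hypothetical names, not part of the framework discussed here. The idea is only to show the shape: one cheap query for the ids, then one object at a time, so that the whole result never has to sit in memory at once.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical gateway: stands in for whatever the persistence layer offers.
interface ObjectGateway<T> {
    List<Long> findIdsModifiedSince(long sinceMillis); // cheap: ids only
    T loadById(long id);                               // expensive: one full object
}

// Fetches the ids eagerly, but loads each object only when the caller
// actually reaches it during iteration.
final class IdFirstRepository<T> {
    private final ObjectGateway<T> gateway;

    IdFirstRepository(ObjectGateway<T> gateway) {
        this.gateway = gateway;
    }

    Iterable<T> findModifiedSince(long sinceMillis) {
        final List<Long> ids = gateway.findIdsModifiedSince(sinceMillis);
        return () -> new Iterator<T>() {
            private final Iterator<Long> remaining = ids.iterator();

            @Override
            public boolean hasNext() {
                return remaining.hasNext();
            }

            @Override
            public T next() {
                // Throws NoSuchElementException when the ids are exhausted.
                return gateway.loadById(remaining.next());
            }
        };
    }
}
```

The trade-off is one round trip per object instead of one big query, which is slower per item but keeps memory flat; an export that writes each object out and forgets it can live with that.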
