The Big Unit

Despite the soul-crushing complacency and widespread lack of a desire for alternative solutions, my company does employ large-scale patterns successfully, even if they don’t explicitly trumpet their employ, even to their own, er, employees.

You might recall how we do data updates: Character Large OBjects, or CLOBs, otherwise known as “Modification Requests”. In my rants against it and critiques of its design, I glossed over the one aspect that is the most beneficial: unit of work.


Heh, he pitches for the SF Midgets now. That’s funny to me.

Both many months ago and much more recently, we have found it necessary to create or update entities beyond our central concepts, and each time a debate ensues: do we chain the additional Modification Requests to our main one or send them separately?

Chaining is very appealing because not only do you gain transactional semantics, but you also make a single trip to the database for multiple updates. In my neck of the woods, performance is such a king that it happily influences the object design of most developers I deal with.

There is not a day that goes by that I don’t look at some section of code and daydream about how I would rewrite and redesign it. Gradually, these dreams have come to address performance concerns, like saving only what changed. None of these dreams, however, has involved the need for a unit of work.

Until now.

I know, I know, there are unit of work frameworks out there already. I know. Trust me, if I wanted to simply use something and forget it, burying my head in the sand, I would. However, on the EIP web-ring, we like to explore and bleed through the basics of enterprise development techniques first-hand. Only then are we comfortable off-shoring the details.

So my goal is to incorporate the unit of work with DDD repositories and data access objects. To keep things transparent, my backend will be an imagined MySQL database that exposes stored procedures for all data access tasks.

My first stop is to visit Martin Fowler’s timeless definition of the pattern. Believe it or not, I think Fowler’s definition is slightly too specific and could be better. It is tailored to creations, updates, and deletions of single objects, which is fine, but rarely is the world so OO. In fact, most of the development world is not OO, I believe.

I’m thinking I’ll go with something more like this:

interface DataTask
{
	String asString();
}

interface UnitOfWork
{
	void commit();
	void add(DataTask task);
}

DataTask can be any command you can send to the database. In our thought-example here, it will be a stored procedure. I know there is no rollback(). More on that later.
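To make asString() a little more concrete, here is a minimal sketch of a task that renders a MySQL CALL statement. The class name StoredProcedureTask, the quoting helper, and the argument values are assumptions of mine for illustration. CREATE_EQUIP_SP is the procedure name I use later in this post, but its parameter list here is invented.

```java
import java.util.ArrayList;
import java.util.List;

interface DataTask
{
	String asString();
}

// Hypothetical helper: renders "CALL name('arg1', 'arg2');" for a stored procedure.
class StoredProcedureTask implements DataTask
{
	private final String procedureName;
	private final List<String> arguments = new ArrayList<String>();

	StoredProcedureTask(String procedureName, String... arguments)
	{
		this.procedureName = procedureName;
		for (String argument : arguments)
			this.arguments.add(argument);
	}

	public String asString()
	{
		StringBuilder sql = new StringBuilder("CALL ").append(procedureName).append("(");
		for (int i = 0; i < arguments.size(); i++)
		{
			if (i > 0)
				sql.append(", ");
			sql.append(quote(arguments.get(i)));
		}
		return sql.append(");").toString();
	}

	// Naive quoting for the sketch only: doubles single quotes.
	// Real code should prefer parameter binding over string building.
	private static String quote(String value)
	{
		return "'" + value.replace("'", "''") + "'";
	}
}
```

Building SQL by string concatenation is only tolerable in a thought-example like this one; the point is just that a DataTask boils down to text the database can execute.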

Here are some DataTasks:

class AddNewEquipmentTask implements DataTask
{
	...
	stored procedure stuff here
	...
	String asString()
	{
		...
	}
}

class UpdateEquipmentTask implements DataTask
{
	...
	stored procedure stuff here
	...
	String asString()
	{
		...
	}
}

Yeah I wasn’t very creative with these two examples. Here are the abbreviated repositories:

class RoomRepository
{
	private UnitOfWork work;

	void save(Room r)
	{
		if(r.id().equals(""))
		{
			AddNewRoomTask command = new AddNewRoomTask(r);
			work.add(command);
		}
		else
		{
			UpdateRoomTask command = new UpdateRoomTask(r);
			work.add(command);
		}
	}
}

class EquipmentRepository
{
	private UnitOfWork work;

	void save(Equipment e)
	{
		if(e.id().equals(""))
		{
			AddNewEquipmentTask command = new AddNewEquipmentTask(e);
			work.add(command);
		}
		else
		{
			UpdateEquipmentTask command = new UpdateEquipmentTask(e);
			work.add(command);
		}
	}
}

Context is good. I want us to all be on the same page here.

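import java.util.ArrayList;
import java.util.List;

class UnitOfWorkImpl implements UnitOfWork
{
	// Tasks are kept in the order they were added.
	private List<DataTask> tasks = new ArrayList<DataTask>();
	// Connection is the imagined backend wrapper; java.sql.Connection
	// has no executeTransaction method.
	private Connection dbConnection;

	public void commit()
	{
		StringBuilder workString = new StringBuilder();

		for(DataTask task : tasks)
			workString.append(task.asString());

		dbConnection.executeTransaction(workString.toString());

		// error-handling omitted
	}

	public void add(DataTask task)
	{
		tasks.add(task);
	}
}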

Alright, some comments here: executeTransaction can surround workString with the BEGIN and COMMIT statements to make sure the commands sent to the database are transactional. As for the error-handling, there are options.

I can see dbConnection.executeTransaction(workString); surrounded by a try/catch block, and a ROLLBACK being sent to the database in the appropriate case.
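Here is a rough sketch of that first option, with the BEGIN/COMMIT wrapping and the ROLLBACK in one place. Everything named here is hypothetical: SqlChannel stands in for whatever actually talks to MySQL, TransactionRunner hosts my imagined executeTransaction, and RecordingChannel is an in-memory fake so the sketch can run without a database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for whatever actually talks to MySQL.
interface SqlChannel
{
	void send(String sql) throws Exception;
}

class TransactionRunner
{
	private final SqlChannel channel;

	TransactionRunner(SqlChannel channel)
	{
		this.channel = channel;
	}

	// Wraps the work in BEGIN/COMMIT and sends ROLLBACK if anything fails.
	// Returns true when the work was committed.
	boolean executeTransaction(String workString)
	{
		try
		{
			channel.send("BEGIN;");
			channel.send(workString);
			channel.send("COMMIT;");
			return true;
		}
		catch (Exception failure)
		{
			try
			{
				channel.send("ROLLBACK;");
			}
			catch (Exception ignored)
			{
				// the connection may already be gone; nothing more to do here
			}
			return false;
		}
	}
}

// In-memory fake that records what was sent and fails on one chosen statement.
class RecordingChannel implements SqlChannel
{
	final List<String> sent = new ArrayList<String>();
	private final String failOn;

	RecordingChannel(String failOn)
	{
		this.failOn = failOn;
	}

	public void send(String sql) throws Exception
	{
		sent.add(sql);
		if (sql.equals(failOn))
			throw new Exception("simulated failure on: " + sql);
	}
}
```

With this shape the caller never sees the exception, only a yes/no answer, which is convenient but hides the reason for the failure.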

I can also see commit() allowing such commit exceptions to bubble up to the caller. The caller can catch them and send a rollback() message, newly added to the UnitOfWork interface.
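The second option might look something like the sketch below. The rollback() method and the throws clause on commit() are the additions to the interface; FlakyUnitOfWork and SaveService are hypothetical names of mine, and the unit of work only writes to an in-memory log instead of a real database.

```java
import java.util.ArrayList;
import java.util.List;

interface DataTask
{
	String asString();
}

interface UnitOfWork
{
	void add(DataTask task);
	void commit() throws Exception;   // failures bubble up to the caller
	void rollback();                  // newly added
}

// Hypothetical in-memory implementation that can be told to fail on commit.
class FlakyUnitOfWork implements UnitOfWork
{
	final List<String> log = new ArrayList<String>();
	private final List<DataTask> tasks = new ArrayList<DataTask>();
	private final boolean failOnCommit;

	FlakyUnitOfWork(boolean failOnCommit)
	{
		this.failOnCommit = failOnCommit;
	}

	public void add(DataTask task)
	{
		tasks.add(task);
	}

	public void commit() throws Exception
	{
		log.add("BEGIN");
		for (DataTask task : tasks)
			log.add(task.asString());
		if (failOnCommit)
			throw new Exception("simulated database failure");
		log.add("COMMIT");
	}

	public void rollback()
	{
		log.add("ROLLBACK");
		tasks.clear();
	}
}

// The calling code decides what a failed commit means.
class SaveService
{
	// Returns true when the work was committed, false when it was rolled back.
	static boolean saveAll(UnitOfWork work)
	{
		try
		{
			work.commit();
			return true;
		}
		catch (Exception failure)
		{
			work.rollback();
			return false;
		}
	}
}
```

The trade-off, as far as I can tell, is that every caller now has to remember to call rollback(), in exchange for being able to see and react to the actual exception.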

Oh, wait — did I forget to show you the calling code? I did. Here’s how you’d use this design:

RoomRepository roomRepo = ...;
EquipmentRepository equipRepo = ...;

UnitOfWork work = ...;

Equipment laptop = equipRepo.retrieveById(52);
Equipment router = new RouterFactory().newInstance();

laptop.ownedBy(fred);

Room basement = roomRepo.retrieveByAddress("Hell");

basement.houses(router);
basement.supervisedBy(fred);

roomRepo.setUnitOfWork(work);
equipRepo.setUnitOfWork(work);

equipRepo.save(laptop);
equipRepo.save(router);

roomRepo.save(basement);

work.commit();

Side note: I imagine needing only equipment IDs when creating/updating rooms. Lazy loading would come in handy here, but that was the topic of another post.

Oh, hey, I see a problem here. This example shows us adding a new piece of equipment, updating an existing one, and updating an existing room. Let’s say that the stored procedure for AddNewEquipmentTask returns the ID of the newly-created equipment. How would we access that value, especially when we’d probably need it for the stored procedure that updates rooms?

These are separate issues, I think. First, there is the general problem of accessing the return values of the DataTasks in a UnitOfWork. I definitely believe it is technically possible to construct a result set in dbConnection.executeTransaction(workString);. Such a result set can be accessed, but by whom? In this example, it is EquipmentRepository that needs the ID.

Or is it? Maybe the calling code, likely a Service of some kind, should be responsible for this. A slight shift in thinking might be needed.

I need code. Too managerial.

...
equipRepo.save(laptop);
equipRepo.save(router);

roomRepo.save(basement);

Map<String, String> returnValues = work.commit();

router = router.copyWithNewID(returnValues.get("CREATE_EQUIP_SP"));
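For the curious, here is one way commit() might assemble that map. This is a sketch under several assumptions: NamedTask, SimpleTask, StatementRunner, and ReturningUnitOfWork are all hypothetical names, and the sketch runs statements one at a time with no transaction, which gives up the single round trip I was so fond of earlier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical variation on DataTask: each task also knows its procedure name.
interface NamedTask
{
	String procedureName();
	String asString();
}

class SimpleTask implements NamedTask
{
	private final String procedureName;
	private final String sql;

	SimpleTask(String procedureName, String sql)
	{
		this.procedureName = procedureName;
		this.sql = sql;
	}

	public String procedureName() { return procedureName; }
	public String asString() { return sql; }
}

// Hypothetical stand-in for the database: runs one statement, returns its output value.
interface StatementRunner
{
	String run(String sql);
}

class ReturningUnitOfWork
{
	private final List<NamedTask> tasks = new ArrayList<NamedTask>();
	private final StatementRunner runner;

	ReturningUnitOfWork(StatementRunner runner)
	{
		this.runner = runner;
	}

	void add(NamedTask task)
	{
		tasks.add(task);
	}

	// Returns one value per procedure name, in execution order.
	// A later call to the same procedure overwrites an earlier one.
	Map<String, String> commit()
	{
		Map<String, String> returnValues = new LinkedHashMap<String, String>();
		for (NamedTask task : tasks)
			returnValues.put(task.procedureName(), runner.run(task.asString()));
		tasks.clear();
		return returnValues;
	}
}
```

Keying by procedure name is the weak spot: save two new pieces of equipment in one unit of work and only the last ID survives, so a real design would probably need a key per task instead.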

As for the second issue of the room SP needing the output of the CREATE_EQUIP_SP… it might not be possible unless we say that Equipment objects are part of the Room aggregate and provide the room SP with the ability to take in Equipment data in order to create them and associate them with the room being updated.

Wait. No — I do understand that.

Failing that sort of a design, I don’t see how it is possible unless we again shift our way of thinking and abandon the whole “get the new object’s ID by creating it first” mentality and instead fetch the next available ID first, and then create the object.
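A tiny sketch of the “fetch the ID first” idea, so you can see why it removes the dependency. IdSource, CountingIdSource, PreassignedIdDemo, and the UPDATE_ROOM_SP procedure are all made up for this example, as are the parameter lists; in real life the ID might come from a sequence table or a dedicated stored procedure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: hands out IDs before the object exists in the database.
interface IdSource
{
	String nextId();
}

// In-memory stand-in for a sequence table or an ID-issuing stored procedure.
class CountingIdSource implements IdSource
{
	private int next;

	CountingIdSource(int first)
	{
		this.next = first;
	}

	public String nextId()
	{
		return String.valueOf(next++);
	}
}

class PreassignedIdDemo
{
	// Builds the statements for "create equipment, then house it in a room".
	// Both statements can go into the same unit of work because the ID is
	// known before anything is sent.
	static List<String> plan(IdSource ids, String equipmentName, String roomAddress)
	{
		String equipmentId = ids.nextId();
		List<String> statements = new ArrayList<String>();
		statements.add("CALL CREATE_EQUIP_SP('" + equipmentId + "', '" + equipmentName + "');");
		statements.add("CALL UPDATE_ROOM_SP('" + roomAddress + "', '" + equipmentId + "');");
		return statements;
	}
}
```

One consequence worth noting: if new objects arrive with an ID already assigned, the repositories above can no longer use an empty ID to tell “add” from “update” and would need some other signal.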

Finally, a last “gotcha” that may or may not come into play is the order in which you save objects. If one stored procedure is dependent on the side effects of another, you have to make sure your calling code is aware of that constraint.

Now that I think about it, would I even want to use a pre-packaged unit of work framework? I’m wary of these auto-magic persistence frameworks. I hear horror stories of developers trying to debug and optimize them. I don’t know — maybe my testicles are too large for this.

Announcer: You’re reading the EIP web-ring.
