Thursday, February 28, 2008

Domain-Driven Design? Now you can have it quickly

Eric Evans is the author of Domain-Driven Design, a text which explains the importance of keeping software closely aligned with the real domain it models; the book presents best practices, principles and techniques based on real experience.

I found it challenging but very interesting, though a friend of mine found it as thick as a brick. Luckily for M. (and I'm sure for many other people out there) InfoQ has released Domain Driven Design Quickly, which does not introduce new concepts but summarizes the essence of what DDD is, drawing mostly on Evans' book as well as other sources.

You can download a free online version, but you are invited to support the work and buy a print copy.

Monday, February 25, 2008

The Empire Strikes Back

Ok I know England is not an empire any more, but the temptation was too strong. England defeated France at the Stade de France for the second time in a row, in front of an astonished public. After a very enjoyable and quite balanced match Yachvili brought the score to 13-19, leaving France only six points behind with about five minutes to score a try and possibly win the match. At this point the English scrum, only a few inches from the French goal line, did a terrific job keeping possession of the ball for three endless minutes and threatening to kill the match. The final and deserved try by Wigglesworth sealed the score at 13-24. Brian Ashton declared himself "pretty pleased".

At Croke Park the home team overpowered Scotland 34-13. The Scots will likely contend for the unwanted Wooden Spoon with Italy.

We wish all the best to the Englishman James Haskell (nothing to do with the programming language) and the Scot Jim Hamilton, who both suffered ankle injuries.

If we say it won’t be closed, then it won’t be closed.

These are the words Mallett said after rain was forecast for the clash between Wales and Italy; they refer to the possibility of closing the roof of the beautiful Millennium Stadium in Cardiff (even if the Arms Park had a whole different charm which I really loved). Mallett believes (and so do I) that rugby should be a game for all weathers, and that players should adapt to any type of condition. Maybe he also believes he has a pack stronger than his backs, and that a muddy field would help their game, as the Welsh backs are far stronger than the Italians.

Well, the roof was not closed. Neither was the Italian defense: it rained tries and points and the Welsh simply ran over us, turning the half time 13-8 into an embarrassing 47-8: 34 unanswered points. At least we scored a try, but there were too many careless errors. And this Wales is hungry. A nice debut in the Six Nations for young Andrea Marcato, who played a good match.

Now Wales is heading for the Triple Crown and for the Grand Slam, with Ireland and France on the way. Are the Seventies back?

Friday, February 22, 2008

Severance pay part II

Right, you spotted me. I closed the last post with a joke. No, no, the scalping part was ok, I was talking about the estimates. You do have to estimate. And we did. And we went to talk to the management, reporting that the new feature would only require about a couple of days. If the code were reasonably clean and understandable, which it is not. At this point we tried to explain why we need to spend five days writing tests and refactoring (actually ten days, as two team members are pairing on it).

The discussion went on like this (M = Manager, T = team):

M: How can we ask our customer to pay twelve days...
T: ...er... fourteen, we already spent a couple of days trying to figure out...
M: Exactly! How can we bill fourteen days for a feature that would normally require two?
T: That's right... but if you want to bill two days, nobody will ever pay us for the remaining twelve...
M: Well... the domain logic is very complicated, we could ask for five days...
T: Yes, but we would still need a couple of days whenever the customer asks to fix or add something, just to figure out what's going on under the hood, and we can't always say we need more than we actually do... that would lead us to an unhappy customer... and...
M: OK we'll talk about it next week... I want to understand.

Now, what's the point? He is right. But we are right as well. And the customer too. How will our heroes sort that out? More on this will follow...

Thursday, February 21, 2008

Severance pay

We need to add a new feature to a severance pay application. Legacy software. Very legacy. Normally we would discuss the requirements, inspect the code, estimate, sign an agreement, develop, release and get the money. We do not even need more than one iteration, as the change is relatively simple and small. All is well until we reach the "inspect the code" part (yes, just the second step...).

The code is a nightmare in which side effects rule the world, and you're not even aware of it. As time goes by, and your understanding of the code increases ("understanding" is a real overstatement) you start to realize that democratically elected objects and methods are no longer in power: the dark times of side effects have come, good code long forgotten. And you are doomed to become insane. Of course, being already on my way to insanity, I have an undeserved advantage. But that's a whole different story.

Well, we definitely have to understand this code. After banging our heads on the wall for some time we decide that the first thing we'll do is to look for the side effects and turn them into methods whose names tell you the story, so if a method sets the foo variable it is called setFoo and not getTheValueOfThatFieldCalculatedForThatPeriod. Easier said than done, because side effects have side effects, which in turn have side effects and so on. The FIRST thing we have to do is to build a test harness (don't tell me you thought for a single moment the code had tests). Thus, as everything that happens deep in the dungeons of the code is a strange mixture of lore, smoke and magic - and an insane amount of luck - we decide to test the final output of the whole calculation. Ok let's do it.
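To give an idea of the renaming we have in mind, here is a minimal sketch. The real legacy code is not shown here, so the class SeveranceRecord, the method valueForPeriod and the arithmetic are all made up for illustration; only setFoo and the long getter name come from the text above.

```java
// Hypothetical class: every name and formula here is an assumption, not the real legacy code.
class SeveranceRecord {
    private double foo;

    // Before: the name promises a harmless query, but the method silently changes foo.
    double getTheValueOfThatFieldCalculatedForThatPeriod(double base, int months) {
        foo = base * months / 12.0; // hidden side effect
        return foo;
    }

    // After: the method that changes state says so in its name...
    void setFoo(double foo) {
        this.foo = foo;
    }

    // ...the query only reads...
    double getFoo() {
        return foo;
    }

    // ...and the calculation is a function with no side effects at all.
    static double valueForPeriod(double base, int months) {
        return base * months / 12.0;
    }
}
```

With this split the caller has to write setFoo(valueForPeriod(base, months)) explicitly, so the state change is visible at the call site instead of being buried in a getter.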

First: let's create some test data, decoupled from the database. After all, we don't want to waste unnecessary time hitting the database (and we should all refer to the same one). That's long but easy, as many values and properties are required. while (not done) {Clicketi clacketi tic tic clack click clack} ok we're done. Let's write a simple test and let's check the very last value that the output page would display. Clack click clack... Run test...

...Error: xxx.persistence.XxxPersistenceException: pkg1.pkg2.pkg3.CoefficientTable: no default connection found.

NO DEFAULT CONNECTION FOUND? What the... ok, ok, the application reads the coefficients used in the calculations from a database. We make a note to decouple it later: maybe the application will load them during startup. We do not care where they come from; what we really want is to inject the coefficient data when we actually need to make the calculations. For now let's use a TestSetup and connect to the database (and disconnect when we're done). Running JUnit tests...

1 test failed... junit.framework.ComparisonFailure

ComparisonFailure? But if I actually print the page I get the right result... and the property I check in the test is the one we use for the output... What...???

A-ha! That's where we discover that the software that calculates the final values does not really calculate the final values, despite the name of the methods, because the class that generates the HTML of the output page (why a class? why not a servlet? why not a simple JSP page and a bean?) makes even more calculations. Yep, you guessed, based on side effects as well. Undeclared and inconceivable side effects, of course.

I do not dare to ask my team to estimate the tasks for the new feature... I do not even dare to ask them to estimate the time they will need to be able to estimate the new feature...

Let's close our eyes, hold our noses and refuse to listen... Any free wall? We've got some heads to bang... BUT NOT OURS! My team members want the scalp of the original developers... and who am I to prevent them from being happy?
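A footnote on the coefficients we promised to decouple from the database: the idea could be sketched more or less like this. CoefficientSource, InMemoryCoefficients, SeveranceCalculator and the revalue formula are hypothetical names invented for this example; the real application has nothing like them (yet).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: the calculation asks for a coefficient without knowing where it comes from.
interface CoefficientSource {
    double coefficientFor(int year);
}

// Test double backed by a map: no database, no default connection to look for.
class InMemoryCoefficients implements CoefficientSource {
    private final Map<Integer, Double> byYear = new HashMap<>();

    InMemoryCoefficients with(int year, double coefficient) {
        byYear.put(year, coefficient);
        return this;
    }

    @Override
    public double coefficientFor(int year) {
        Double coefficient = byYear.get(year);
        if (coefficient == null) {
            throw new IllegalArgumentException("no coefficient for year " + year);
        }
        return coefficient;
    }
}

class SeveranceCalculator {
    private final CoefficientSource coefficients;

    // The source is injected, so production code can pass a database-backed one.
    SeveranceCalculator(CoefficientSource coefficients) {
        this.coefficients = coefficients;
    }

    // Made-up formula, just to have something to assert on.
    double revalue(double amount, int year) {
        return amount * coefficients.coefficientFor(year);
    }
}
```

A test would then build the calculator with new InMemoryCoefficients().with(2007, 1.5) and never touch the database, while the TestSetup that opens the connection could go away.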

Scholfield Huxley

On with the Spoon River Anthology... today's verses are quoted from Scholfield Huxley:

How would you like to create a sun
And the next day have the worms
Slipping in and out between your fingers?

Fills you with cheerfulness, doesn't it?

Tuesday, February 19, 2008

Hallway usability testing, TDD and pair programming

Hallway usability testing is one of the twelve steps to better code proposed by Joel Spolsky; it goes:

"A hallway usability test is where you grab the next person that passes by in the hallway and force them to try to use the code you just wrote. If you do this to five people, you will learn 95% of what there is to learn about usability problems in your code."

I had never heard of such a simple yet smart trick before. Maybe that is because hallway usability testing is a close relative of a combination of XP (and not only XP) practices: Test Driven Development, in which tests are the first users of your classes; pair programming with frequent pair rotation, in which you don't need to stalk the hallway to get somebody's attention; and collective code ownership, which causes everyone to shower insults at you (in a friendly manner, of course) when you deserve it.

That is another demonstration, should it be needed, that there are different effective paths leading to the same goal.

Samuel Gardner

This morning I read Samuel Gardner from the Spoon River Anthology by Edgar Lee Masters. It really struck me with its abrupt ending, quite unexpected after the description of "Samuel's" wonderful elm tree:

Now I, an under-tenant of the earth, can see
That the branches of a tree
Spread no wider than its roots.
And how shall the soul of a man
Be larger than the life he has lived?

Doesn't that give you something to think about?

Monday, February 18, 2008

Free Java Programming with Passion! Online Course

Mr. Sang Shin, a great Java Evangelist (and a really nice person), offers a free online Java programming course in which he presents all the basic features of the language; I attended it (and got my certificate!) as a review and to learn the new functionality of J2SE 5.0, and found it really well done. I highly recommend it to anyone interested in learning Java.

Mr. Shin offers many other online courses, each of them providing students with great value. I can only thank him for the great job he's doing for us.

Bold text in PDF Docs

One of my colleagues had some problems with a PDF document generated using iText, as she didn't know how to have different fonts in the same PdfPTable cell. The code was something like this:

table.addCell("fixed normal test: " + variableToBeRenderedInBold + " " + anotherVariableToBeRenderedInBold);

It was quite simple to modify it: you can use another signature of the same method which accepts a Phrase as a parameter, and build the Phrase with different Chunks:

// Build one Phrase out of several Chunks, each with its own font
Chunk ch1 = new Chunk("chunk 1 ");
Chunk ch2 = new Chunk("(bold) chunk 2 ", FontFactory.getFont(FontFactory.HELVETICA, 11, Font.BOLD));
Chunk ch3 = new Chunk("chunk 3 ");
Phrase phrase = new Phrase();
phrase.add(ch1);
phrase.add(ch2);
phrase.add(ch3);
// Pass the Phrase (not a String) to the cell
table.addCell(phrase);

That's a starting point; you can decide to refactor as much as you like, but that's "the simplest thing that could possibly work". The first refactoring could be the elimination of the duplicate code that occurs when you have to build another Phrase with normal and bold fonts, but I'll leave that to you.

Thursday, February 14, 2008

Yellow belt

I know, I know... it's not much (yet), but it is the first step...


If you don't know JavaBlackBelt let me quote from their site: it is "a community for Java & open source skills assessment. It is dedicated to technical quizzes about Java related technologies. This is the place where Java developers have their technology knowledge and development abilities recognized. Everybody is welcome to take existing and build new exams." If you're interested check out this link.

Evidence Based Scheduling

Today I read this interesting article about Evidence Based Scheduling (EBS), a scheduling system based on evidence emerging from historical timesheet data recorded by individual developers. I pretty much agree with Joel but I have two notes:
  1. Joel says that if you estimate tasks that take more than 16 hours to complete you're "officially doomed", and that you should break up every task to fit within the limit. As a matter of fact, in iterative development you accept the fact that initial estimates are not really estimates, but "kind and tentative answers" based on the experience of the developers. You can't break up all the tasks at the beginning of a project, and even in the middle of it you normally estimate precisely only for the next iteration or two. So the schedule gradually emerges, but it never reaches too far into the future. Doing otherwise would be unrealistic, and it would ignore the Uncertainty Principle: "uncertainty is inherent and inevitable in software projects and processes". That's the "bad" news for the management. The good news is that the schedule will become more and more precise, though I never thought of using the Monte Carlo method (which I'm going to try in the future). So I agree with Joel on designing short tasks, but only in the context of the iteration.
  2. Joel says that when you're interrupted you should "keep the clock running" and add the "wasted" time to the original task you were working on. I have been using the Pomodoro Technique for quite some time, and it takes a rather different approach: basically you have a 25-minute time slot called a pomodoro, which is your unit of measurement. You try to manage and postpone interruptions, but if you have to deal with them on the spot you simply cancel your pomodoro, which means that you don't count it as time spent on your original task. At the end of the day you track the number of pomodoros you have completed, and add them to your personal record. That can give you an idea of how many ideal engineering hours (IEH, a metric used in the XP method) you can work in a day. That's just another way to keep track of the wasted time: you work 8 hours a day, but you maybe have only 5 of them to spend on your project. That doesn't go against Joel, but simply gives another point of view which leads more or less to the same (good) result.
I liked the metaphor of the schedule seen as a box of wood blocks. To manage overscheduling you ideally talk to the Product Owner and have him choose which tasks should slip to the next iteration, which is another way of saying which blocks should go into the next box. To quote Craig Larman, "people remember slipped dates, not slipped features".
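Since I said I wanted to try the Monte Carlo part, here is a rough sketch of how I understand it: each task estimate is divided by a velocity (estimate divided by actual time) drawn at random from the developer's history, the results are summed, and the whole thing is repeated many times to get a distribution of possible totals. The class ScheduleSimulation, its methods and the numbers in main are my own assumptions for illustration, not Joel's code or data.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of a Monte Carlo schedule simulation; all names and data are assumptions.
class ScheduleSimulation {

    // velocities[i] = estimated hours / actual hours for a past task.
    // Returns the simulated total hours of every round, sorted ascending.
    static double[] simulateTotals(double[] estimates, double[] velocities, int rounds, long seed) {
        Random random = new Random(seed); // seeded, so a run can be repeated
        double[] totals = new double[rounds];
        for (int r = 0; r < rounds; r++) {
            double total = 0.0;
            for (double estimate : estimates) {
                double velocity = velocities[random.nextInt(velocities.length)];
                total += estimate / velocity; // a low velocity stretches the task
            }
            totals[r] = total;
        }
        Arrays.sort(totals);
        return totals;
    }

    // Value below which roughly a fraction p of the simulated totals fall.
    static double percentile(double[] sortedTotals, double p) {
        int index = (int) Math.ceil(p * sortedTotals.length) - 1;
        return sortedTotals[Math.max(0, Math.min(index, sortedTotals.length - 1))];
    }

    public static void main(String[] args) {
        double[] estimates = {4, 8, 16, 2};          // hours for the tasks of one iteration (made up)
        double[] velocities = {1.0, 0.8, 0.5, 1.2};  // historical estimate/actual ratios (made up)
        double[] totals = simulateTotals(estimates, velocities, 1000, 42L);
        System.out.printf("50%% of rounds finish within %.1f hours%n", percentile(totals, 0.5));
        System.out.printf("90%% of rounds finish within %.1f hours%n", percentile(totals, 0.9));
    }
}
```

The interesting output is not a single date but the spread between the two percentiles: the wider it is, the less the schedule deserves to be trusted.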

Monday, February 11, 2008

Blood Brothers in Arms: JR meets JPR

Rugby legend JPR Williams teamed up with Jamie Roberts to promote the Welsh Blood Service. You can find the whole story here.

Red light for the green shirts

Despite a remarkable comeback from 26-6 to a narrow 26-21, Ireland lost to France, who are heading toward a Grand Slam. If Wales lets it happen, of course. Ireland scored just one try while France scored four, three of them bearing Vincent Clerc's signature, showing a great ease of play.

Wales wins again

It seems Wales is determined to run for another Grand Slam; with a clear 30-15 win against Scotland they're marching at full points. Paterson, who did not miss a single penalty in Six Nations 2008, struggled to keep Scotland close to the Welsh, but the Scots' defence was not as good as usual, and Wales pierced it repeatedly; a spectacular try by Shane Williams put an end to the Scottish efforts. I'm pretty sure I would not have awarded that try, but that's what all those cameras are for. I can not imagine what would happen in a soccer match... We hope the Scots will get their confidence back, that they'll stop losing possession and that they'll start scoring some tries.

Almost got them...

Another defeat for Italy, who lost their home match with England with a final score of 19-23. Another good match for the forwards, and a long-awaited continuity from the backs, where Andrea Masi played almost every ball; unluckily it was not enough. Possession and territory were balanced in the first half, but Italy almost doubled their opponents in the second half, and only a couple of errors really cost Italy the match. Bortolussi improved his kicking, but Wilkinson showed us great skill once again. Maybe we could have him marry a velina (an Italian showgirl) and persuade him to play for Italy?

As a final note of satisfaction, Sergio Parisse, in his second match as a captain, was elected man of the match.

Mallett was satisfied with the Azzurri performance, and so was I :-) let's hope for the next match... we have too many wooden spoons by now! Wales will be a tough opponent, but I'm sure we shall fight to the last breath.

Thursday, February 7, 2008

Star schema

OLAP (On-Line Analytical Processing) tools are among the most common front end systems for data warehouses; they allow dynamic, multidimensional analysis to be performed against a huge amount of records in order to produce a small set of data which can be used as a dashboard for business process management by the so-called "knowledge workers".
There are two widespread approaches to OLAP implementation: ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP). Ok, there is also a hybrid solution, called... you guessed, HOLAP (Hybrid OLAP).

Why should people use the relational model to implement a multidimensional model? There are many reasons, the most important being the diffusion of advanced RDBMSs and the expertise of IT people. Moreover, ROLAP systems don't have a "sparse data" problem, thus being far more scalable than MOLAP implementations. Unluckily, the relational - and bidimensional - model, in which we find attributes, relations and integrity constraints, has a reduced expressivity when it comes to describing the multidimensional model, in which we find facts, measures, attributes, dimensions and hierarchies. That's why we have to find a workaround, which leads us to the (notorious) star schema.

The star schema consists of one (or more) fact table(s), which represent facts, referencing many dimension tables, which represent the dimensions of analysis. Fact tables typically have a lot of columns, and newbies almost always smell a rat, seeing a very bad use of the relational model where they should see a very good compromise instead.
One of the reasons behind the star schema is the very poor performance shown by RDBMSs when they have to aggregate a huge amount of records belonging to many different tables, thus involving many expensive join operations: denormalization can then improve performance at the cost of the increased disk space required. Another way to improve performance is redundancy: you materialize derived tables (views) based on the most used aggregations to speed up typical analysis. In addition, ROLAP implementations often use surrogate keys, another feature that makes newbies shrug.

The star schema can have some variations, such as the snowflake schema, obtained by decomposing one or more dimension tables to eliminate the transitive functional dependencies they contain. Dimension tables whose keys are imported in the fact table are called primary while the others are called... why, secondary, what else?
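To make the shape of the schema a bit more tangible, here is a toy sketch in Java rather than SQL: a fact row holds surrogate keys plus a measure, and a denormalized product dimension carries the category directly (a snowflake would move it to a secondary table). The sales domain and every name in it (StarSchemaSketch, Product, SaleFact, totalByCategory) are assumptions made up for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory star schema; the sales domain and all names are hypothetical.
class StarSchemaSketch {

    // Row of a denormalized dimension table: the category lives here, not in its own table.
    static final class Product {
        final String name;
        final String category;

        Product(String name, String category) {
            this.name = name;
            this.category = category;
        }
    }

    // Row of the fact table: surrogate keys pointing to the dimensions, plus one measure.
    static final class SaleFact {
        final int productKey;
        final int dateKey;
        final double amount;

        SaleFact(int productKey, int dateKey, double amount) {
            this.productKey = productKey;
            this.dateKey = dateKey;
            this.amount = amount;
        }
    }

    // Aggregates the measure along one level of the product hierarchy with a single lookup per fact.
    static Map<String, Double> totalByCategory(SaleFact[] facts, Map<Integer, Product> productDimension) {
        Map<String, Double> totals = new TreeMap<>();
        for (SaleFact fact : facts) {
            String category = productDimension.get(fact.productKey).category;
            totals.merge(category, fact.amount, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<Integer, Product> products = new TreeMap<>();
        products.put(1, new Product("Novel", "Books"));
        products.put(2, new Product("Dictionary", "Books"));
        products.put(3, new Product("Album", "Music"));

        SaleFact[] facts = {
            new SaleFact(1, 20080207, 100.0),
            new SaleFact(2, 20080207, 50.0),
            new SaleFact(3, 20080208, 25.0)
        };
        System.out.println(totalByCategory(facts, products));
    }
}
```

The point of the denormalized dimension is visible in totalByCategory: grouping by category needs one lookup per fact, where a fully normalized design would need a second hop from product to category.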

Wednesday, February 6, 2008

Star schema?

Today we started coaching two customers (two IT people) on our data warehousing tool. They have a lot of expertise in their software and a vast knowledge of their operational domain, not to mention their intensive use of SQL, so all went quite smoothly until we talked about the structure of the tables in the data mart. Or should I say... the structure of THE table (the fact table), as the dimension tables are quite simple to understand.
For a newbie it is quite a shock to see a star schema: all that redundancy... many unused columns... SO MANY columns... referring to unrelated data... Actually there are a lot of reasons behind this schema, and I shall talk about them... sooner or later ;-)

Tuesday, February 5, 2008

Hudson and Cobertura

I've finally managed to set up a Continuous Integration (CI) server (you waterfall people can find a wonderful explanation on CI here). My choice fell on Hudson, a simple but effective server which can be expanded with a bunch of plugins, which you can also write yourself. But, programmers being lazy guys, and not wanting to be an exception, I scanned the published plugin list and picked the Cobertura Plugin. After some head-banging on the wall (yes, programmers never read the instructions) I tweaked the right files and had the whole system running and configured for a pilot project: now the system periodically updates the local repository, builds the project, injects the Cobertura code ("instruments the classes' bytecode" would probably be a better description), runs all the tests and displays the report on test results and code coverage. And if someone has broken the build it sends her a warning mail. Thanks again to Fabrizio for the inspiration.

The next plugin I'm going to try is the Google Calendar Plugin (my most informed friends know why).

Monday, February 4, 2008

What a comeback!

Six Nations 2008 started with a bang: last Saturday an impressive Wales defeated England in their Twickenham temple, something that had not happened in 20 years. With only 25 minutes left and 13 points behind, the Dragons never gave up and scored 20 unanswered points that left England absolutely stunned with the final 19-26 score. Maybe the time has come for another golden age for the Welsh?

Italy lost to Ireland 16-11, showing no ideas from the backs but quite good work from the forwards; the World Cup ghosts still linger around, will they lead Italy to the wooden spoon? Come on Azzurri, let's give it a cut! Ireland was not the squad we saw in the last tournament but we hope things will improve in the next weeks.

Yesterday France ran over Scotland at Murrayfield; Paterson, introduced only 20 minutes before the end of the match, gave Scotland some fresh energy but that was not enough: a heavy 6-27 was the final result, reflecting the values shown on the field.