Monday, December 20, 2010

Meet Flatworm

I'm OK with flat files, but why would you use fixed-length records when every row is 1320 chars long and most of those chars are blank? Isn't that an enormous waste of resources?

Anyway, should you need to deal with flat files (it seems like no programmer can keep them at bay... does this ring a bell?), there is help. After some investigation I stumbled upon Flatworm, an interesting library that lets you read from a flat file, be it fixed-length or delimited by a separator character, and instantiate the appropriate beans. It also supports repeating segments and multi-line records. Nicely enough, it also works the other way around, which is what I was primarily interested in.

All you have to do is provide a descriptor in XML format, sit back and relax. Let's see how it works.

Let's suppose we need to produce a fixed-length file with the following format:
XXvalueOne  valueTwo  
i.e. a fixed record identifier and two fields of 10 chars each.
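Just to make the layout concrete, here is a minimal hand-rolled sketch of what one such record looks like when built without any library. RecordLayoutSketch and its methods are names I made up for illustration; they are not part of Flatworm, and the truncation of over-long values is my own assumption about what you would want.

```java
// Hypothetical helper, NOT part of Flatworm: builds one record by hand
// to show the layout described above (2-char identifier + two 10-char fields).
final class RecordLayoutSketch {

    static final String RECORD_ID = "XX"; // fixed record identifier
    static final int FIELD_LENGTH = 10;   // width of each data field

    // Left-justify a value in a field of the given width:
    // null becomes empty, long values are cut, short ones are space-padded.
    static String leftJustify(String value, int width) {
        String safe = (value == null) ? "" : value;
        if (safe.length() > width) {
            return safe.substring(0, width);
        }
        StringBuilder sb = new StringBuilder(safe);
        while (sb.length() < width) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Concatenate identifier and fields into one fixed-length line (22 chars).
    static String formatRecord(String propOne, String propTwo) {
        return RECORD_ID
                + leftJustify(propOne, FIELD_LENGTH)
                + leftJustify(propTwo, FIELD_LENGTH);
    }
}
```

Calling RecordLayoutSketch.formatRecord("valueOne", "valueTwo") gives back the sample line above, trailing blanks included. This is the kind of plumbing the descriptor is meant to spare you.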

First of all you have to provide the descriptor:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE file-format SYSTEM "http://www.blackbear.com/dtds/flatworm-data-description_1_0.dtd">
<file-format>
    <converter name="char" class="com.blackbear.flatworm.converters.CoreConverters" method="convertChar" return-type="java.lang.String"/>
    <record name="whatever-record">
        <record-ident>
            <field-ident field-start="0" field-length="2">
                <match-string>XX</match-string>
            </field-ident>
        </record-ident>
        <record-definition>
            <bean name="whatever" class="my.package.Whatever"/>
            <line>
                <record-element length="2"/>
                <record-element length="10" beanref="whatever.propOne" type="char">
                    <conversion-option name="justify" value="left"/>
                </record-element>
                <record-element length="10" beanref="whatever.propTwo" type="char">
                    <conversion-option name="justify" value="left"/>
                </record-element>
            </line>
        </record-definition>
    </record>
</file-format>
Then you can write a simple class that, given an iterator of whatever you have to export, creates a file and populates it:
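import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import java.util.logging.Level;
import java.util.logging.Logger;

// Flatworm classes (package names assumed here; check them against your Flatworm jar)
import com.blackbear.flatworm.FileCreator;
import com.blackbear.flatworm.errors.FlatwormCreatorException;

public class SimpleExporter {

    FileCreator fileCreator;
    Iterator<Whatever> iterator;

    public SimpleExporter(
            Iterator<Whatever> iterator,
            final String configFile,
            final String outputFile)
            throws FlatwormCreatorException {
        this.iterator = iterator;
        // load the XML descriptor from the classpath
        InputStream config = Thread.currentThread().
                getContextClassLoader().
                getResourceAsStream(configFile);
        fileCreator = new FileCreator(config, outputFile);
    }

    public void execute() {
        try {
            // end every record with CRLF
            fileCreator.setRecordSeperator("\r\n");
            fileCreator.open();
            while (iterator.hasNext()) {
                Whatever whatever = iterator.next();
                // "whatever" is the bean name declared in the descriptor
                fileCreator.setBean("whatever", whatever);
                // "whatever-record" is the record name declared in the descriptor
                fileCreator.write("whatever-record");
            }
            fileCreator.close();
        } catch (IOException ex) {
            Logger.getLogger(SimpleExporter.class.getName()).log(Level.SEVERE, null, ex);
        } catch (FlatwormCreatorException ex) {
            Logger.getLogger(SimpleExporter.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}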
In real code exceptions should be handled row by row, but bear with me for the sake of the example, OK?

For the sake of this example the Whatever class is just a POJO, so it's not worth showing here. So what have I done? I just created a FileCreator object, passing it an InputStream for the descriptor and the path of the output file; then, for each item, I registered the bean under the name used in the descriptor and wrote the record. That was not hard, was it?
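For readers who want something concrete anyway, a bean along these lines should do. This is only a sketch: the property names come from the beanref attributes in the descriptor (whatever.propOne, whatever.propTwo), while the constructors and the assumption that Flatworm reads the values through ordinary JavaBean getters are mine.

```java
// Hypothetical bean for illustration; the real class is not shown in this post.
// No package declaration on purpose: "my.package" in the descriptor is just a
// placeholder ("package" is a reserved word in Java, so use your own package name).
class Whatever {

    private String propOne;
    private String propTwo;

    // No-arg constructor, as is customary for JavaBeans.
    public Whatever() {
    }

    public Whatever(String propOne, String propTwo) {
        this.propOne = propOne;
        this.propTwo = propTwo;
    }

    // Matches beanref="whatever.propOne" in the descriptor.
    public String getPropOne() {
        return propOne;
    }

    public void setPropOne(String propOne) {
        this.propOne = propOne;
    }

    // Matches beanref="whatever.propTwo" in the descriptor.
    public String getPropTwo() {
        return propTwo;
    }

    public void setPropTwo(String propTwo) {
        this.propTwo = propTwo;
    }
}
```

An iterator over a list of such beans, e.g. Arrays.asList(new Whatever("valueOne", "valueTwo")).iterator(), is all the exporter above needs as input.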

If your bean has nested properties you can simply use dot notation:
<record-element length="10" beanref="whatever.outerProperty.innerProperty" type="char">
Playing around I had a few little tricks to learn. Maybe there's a better way, but they work :-) For example, I had to write several fields that are not present in my beans. For this I simply added a "filler" property of type String and used it in all such cases, adding a comment in the descriptor to specify what I was substituting.
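The filler trick can be sketched roughly like this. WhateverWithFiller is a made-up name, and I'm assuming that an empty string is enough because the fixed-length output pads each field up to the length declared on its record-element.

```java
// Hypothetical bean showing the "filler" trick; all names are illustrative only.
class WhateverWithFiller {

    private String propOne;

    // Always empty: meant to be referenced by every record-element that has
    // no real counterpart in the bean (e.g. beanref="whatever.filler").
    private String filler = "";

    public String getPropOne() {
        return propOne;
    }

    public void setPropOne(String propOne) {
        this.propOne = propOne;
    }

    public String getFiller() {
        return filler;
    }

    // Setter kept only for JavaBean symmetry; whether the library needs it
    // for writing is something I have not verified.
    public void setFiller(String filler) {
        this.filler = filler;
    }
}
```

The same filler property can be reused for any number of blank fields, since each record-element declares its own length.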

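Another problem emerged when the properties in my bean were null. To fix this once and for all I simply extended the CoreConverters class, adding null-safe operations (the converter element in the descriptor then has to name the subclass instead of CoreConverters):
    @Override
    public String convertChar(String str, Map<String, ConversionOption> options) {
        // treat a null property as an empty string before delegating
        return super.convertChar(str == null ? "" : str, options);
    }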
That said, I found the library really useful as it saved me a lot of time.

4 comments:

derekwrobel said...

I have been looking into trying this out and I'm having issues. The DTD seems to have been taken down. Do you know where another copy may be hosted? I assume this is the reason I kept getting 'premature end of file' errors from org.xml.sax.InputSource.

Unknown said...

Unfortunately I haven't been able to find it myself :-/

Steve said...

I ran into that same problem. I finally found the DTD in the source code, which you can download from SourceForge. I just put it in my project locally and referenced it from the XML file. HTH

Unknown said...

Thanks, I'll look for it :-)