Delimited flat-file parsing often leads to brittle index-based code. In this post, I show how enums make field positions easier to read and maintain.
In the examples below, we assume the input has already been split:
String[] split = delimitedData.split("\\|");
There are caveats to using the split method this way, but they are outside the scope of this post.
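For readers who want one concrete caveat, here is a minimal sketch of how the default `split` silently drops trailing empty fields, while a negative limit keeps them. The class name and the sample row are made up for illustration.

```java
// Hypothetical demo class; the sample row is made up for illustration.
class SplitCaveatDemo {

    // Default split: trailing empty strings are removed from the result.
    static String[] splitDefault(String row) {
        return row.split("\\|");
    }

    // A negative limit keeps trailing empty fields, so the column count stays stable.
    static String[] splitKeepingEmpties(String row) {
        return row.split("\\|", -1);
    }

    public static void main(String[] args) {
        String row = "a|b||";
        System.out.println(splitDefault(row).length);        // prints 2
        System.out.println(splitKeepingEmpties(row).length); // prints 4
    }
}
```

With positional access this matters, because a row that ends in empty fields can produce a shorter array than expected.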
Direct indexing
DbData existingData = dbHandler.getExistingData(
    split[2],
    split[7],
    split[8],
    split[9]
);
Direct indexing is compact, but brittle and hard to scan.
Local variables
String id = split[2];
String date = split[7];
String time = split[8];
String reason = split[9];
DbData existingData = dbHandler.getExistingData(
    id,
    date,
    time,
    reason
);
Local variables improve readability at the call site, but the field mappings are still scattered across the codebase.
Enum mapping
public enum FlatFileField {
    ID(2),
    DATE(7),
    TIME(8),
    REASON(9);

    private final int index;

    FlatFileField(int index) {
        this.index = index;
    }

    public int index() {
        return index;
    }
}

DbData existingData = dbHandler.getExistingData(
    split[FlatFileField.ID.index()],
    split[FlatFileField.DATE.index()],
    split[FlatFileField.TIME.index()],
    split[FlatFileField.REASON.index()]
);
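One possible refinement, sketched here as an assumption rather than as part of the original design, is to let the enum read its own field. The `from(...)` helper below is hypothetical; it removes the repeated `split[...index()]` at the call site and fails with a descriptive message when a row is too short. The enum is repeated so the example is self-contained, and the sample row is invented.

```java
// Sketch only: from(...) is a hypothetical helper added to the enum from this post.
enum FlatFileField {
    ID(2),
    DATE(7),
    TIME(8),
    REASON(9);

    private final int index;

    FlatFileField(int index) {
        this.index = index;
    }

    int index() {
        return index;
    }

    // Reads this field from an already split row, or throws if the row is too short.
    String from(String[] split) {
        if (index >= split.length) {
            throw new IllegalArgumentException(
                "Row has " + split.length + " fields, but " + name() + " needs index " + index);
        }
        return split[index];
    }
}

class FlatFileFieldDemo {
    public static void main(String[] args) {
        // Made-up sample row with ten pipe-delimited fields.
        String[] split = "x|x|42|x|x|x|x|2024-01-31|13:45|moved".split("\\|");
        System.out.println(FlatFileField.ID.from(split));     // prints 42
        System.out.println(FlatFileField.REASON.from(split)); // prints moved
    }
}
```

The call site then reads as `FlatFileField.ID.from(split)`, which keeps the array access in one place.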
Comparison
| Approach | Pros | Cons |
|---|---|---|
| Direct indexing | Concise | Uses magic numbers, hard to maintain, higher cognitive load |
| Local variables | Readable at call site | Field mapping still scattered |
| Enum mapping | Centralized field positions, clearer intent | Requires an additional enum |
Takeaway
Enums are a simple way to replace magic numbers with meaningful names when working with delimited data. They improve readability and centralize field positions. When parsing logic grows beyond simple positional access, a dedicated parser or DTO is usually a better choice.
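To make the last point concrete, here is a minimal sketch of what a small DTO with its own parsing method could look like. The `FlatFileRecord` class and its `parse` method are hypothetical names; the field positions mirror the examples in this post.

```java
// Hypothetical DTO; field names and positions mirror the examples in this post.
final class FlatFileRecord {

    // Field positions live in one place, next to the type that consumes them.
    private enum Field {
        ID(2), DATE(7), TIME(8), REASON(9);

        final int index;

        Field(int index) {
            this.index = index;
        }
    }

    final String id;
    final String date;
    final String time;
    final String reason;

    private FlatFileRecord(String id, String date, String time, String reason) {
        this.id = id;
        this.date = date;
        this.time = time;
        this.reason = reason;
    }

    // Splits one pipe-delimited line and copies the mapped fields into a record.
    static FlatFileRecord parse(String line) {
        // A negative limit keeps trailing empty fields, so the indexes stay valid.
        String[] split = line.split("\\|", -1);
        if (split.length <= Field.REASON.index) {
            throw new IllegalArgumentException(
                "Expected at least " + (Field.REASON.index + 1) + " fields, got " + split.length);
        }
        return new FlatFileRecord(
            split[Field.ID.index],
            split[Field.DATE.index],
            split[Field.TIME.index],
            split[Field.REASON.index]);
    }
}
```

Callers then work with `record.id` or `record.reason` and never see the array or its indexes.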
That's solid feedback, and if you're only doing a handful of field mappings, I'd agree that this pattern could be used. However, as mappings grow, you tend to end up with a whole bunch of constants, which can be hard to maintain, or you simply have an overeager developer who thinks it's a good idea to refactor the code into static final int TWO = 2, which of course leads you right back to the original problem. When you group the constants together, you make it clearer how they tie into your domain model, and you make it easier for maintainers to read and extend the code.
Personally, I wouldn't call an enum boilerplate, since it is quite small and efficient, and I'd rather take that over a Constants.java with a group of constants I cannot easily understand.
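As a small illustration of what grouping buys you, an enum can answer questions about the whole set of fields, which loose constants cannot do without extra bookkeeping. The sketch below uses a hypothetical `MappedField` enum and a hypothetical `requiredLength()` helper to derive the minimum number of columns a row must have.

```java
import java.util.Arrays;

// Hypothetical enum; positions mirror the examples in this post.
enum MappedField {
    ID(2), DATE(7), TIME(8), REASON(9);

    private final int index;

    MappedField(int index) {
        this.index = index;
    }

    int index() {
        return index;
    }

    // Smallest row length that still contains every mapped field (highest index + 1).
    static int requiredLength() {
        return Arrays.stream(values())
                .mapToInt(MappedField::index)
                .max()
                .orElse(-1) + 1;
    }
}

class MappedFieldDemo {
    public static void main(String[] args) {
        System.out.println(MappedField.requiredLength()); // prints 10
    }
}
```

Adding a new field to the enum updates the derived length automatically, so a row-length check never drifts out of sync with the mappings.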