As a diligent developer, you have installed, for several customers, a well-written and thoroughly tested application that gives them better access to large, complex data stores.
For each customer, the field-test phase goes off without a hitch. You are on your way to the bank, barely thinking about the six-month software review, when your pager goes off: one of your customers was running a report with your software, and the system crashed.
You rush to the scene of the accident and run a test at random. It works fine. You run another. No problem. You run hundreds more tests. Still no problem. You check with the other customers who have been running the application for six months. No complaints.
Then you run the report that triggered the problem. Crash! What's going on?
The saboteur data bug pattern
Many programs must frequently access and process internally stored data in order to carry out a variety of complex tasks. That data may come from large in-memory structures, from databases, or over a network.
Such programs are especially vulnerable to crashes caused by corrupted internal data. I call this bug pattern the saboteur data bug pattern, because the corrupted data can lie dormant in the system indefinitely (much like a sleeper agent in the Cold War), causing no trouble at all until that particular piece of data is finally accessed, at which point it goes off like a bomb.
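To make the pattern concrete, here is a minimal sketch (the class and its data are hypothetical, purely for illustration): one corrupted entry sits quietly in a map, every other lookup succeeds, and only the access that touches the bad entry fails.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of saboteur data: one entry was corrupted when
// the map was loaded, but nothing fails until that entry is actually read.
public class SaboteurDemo {
    static final Map<String, String> cache = new HashMap<>();
    static {
        cache.put("apples", "macintosh,gala");
        cache.put("trees", "elm,beech");
        cache.put("rocks", null); // corrupted entry: lies dormant until accessed
    }

    // Returns the first element of the stored list for a key.
    // Throws NullPointerException only when the corrupted key is accessed.
    static String firstElement(String key) {
        return cache.get(key).split(",")[0];
    }
}
```

Hundreds of calls on the healthy keys succeed; the single call on `"rocks"` blows up, which is exactly why random testing so rarely finds this kind of defect.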
Syntactic causes
Suppose we have a JDBC application that maintains a database table named Mapping, which maps String names to sets of elements. (See Resources for more information about the JDBC API.) Each element of each set acts as a key into another table, named Properties, which holds the various known properties of those elements.
Suppose further that the Mapping and Properties tables are initially read from a text file provided by an external source; that is, the data is arbitrary input from outside, not generated internally. In this file, each line begins with a name, followed by the corresponding set of elements, as follows:
Listing 1. Sample, external source text file
In the Mapping file:
apples {macintosh, gala, golden-delicious}
trees {elm, beech, maple, pine, birch}
rocks {quartz, limestone, marble, diamond}
...
In the Properties file:
macintosh {color: red, taste: sour}
gala {color: red, taste: sweet}
diamond {color: clear, rigidity: hard, value: high}
...
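Each line of such a file can be split into a name and a brace-delimited element set. The sketch below (the `LineParser` class and its `parseLine` method are illustrative, not part of the original application) parses one line and rejects malformed input up front, rather than letting it slip into the database:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parse a source line such as
//   "apples {macintosh, gala, golden-delicious}"
// into the name followed by its elements.
public class LineParser {
    // A name, optional whitespace, then a brace-delimited element list.
    private static final Pattern LINE =
        Pattern.compile("(\\S+)\\s*\\{([^}]*)\\}");

    // Returns [name, element1, element2, ...]; throws on malformed input
    // so that bad lines fail at parse time, not months later.
    static List<String> parseLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed line: " + line);
        }
        List<String> result = new ArrayList<>();
        result.add(m.group(1));
        result.addAll(Arrays.asList(m.group(2).split("\\s*,\\s*")));
        return result;
    }
}
```

Validating at the boundary like this is the standard defense against the saboteur data pattern: corrupted input is reported immediately, at the point where its origin is still obvious.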
Entries for the Mapping and Properties tables can be parsed out and passed to a method that inserts them into a database. But this approach has potential pitfalls. For example, suppose we have written a class that talks to a JDBC-compliant database. Following the JDBC API, we can define a PreparedStatement object and use it to pass information to the database as follows:
Listing 2. Inserting domain and range strings with a PreparedStatement
...
PreparedStatement insertionStmt =
    con.prepareStatement("INSERT INTO MAPPING VALUES(?,?)");
...
public void insertEntry(String domain, String range)
    throws SQLException {
    insertionStmt.setString(1, domain);
    insertionStmt.setString(2, range);
    insertionStmt.executeUpdate();
}
Whether inserting two strings in this way is appropriate depends on how the strings are extracted from the text file. For example, suppose a simple regular-expression matcher is used to split each line into two strings:
A string containing all the characters before the first space.
A string containing all the characters after the first space.
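A minimal sketch of that split (the `NaiveSplitter` class is hypothetical) shows how such parsing accepts malformed lines without complaint, planting saboteur data in the database instead of failing at read time:

```java
// Hypothetical sketch: split each line at the first space, with no
// validation. A malformed line is silently accepted, and whatever lands
// in the "range" string is stored as-is, becoming saboteur data.
public class NaiveSplitter {
    // domain = everything before the first space; range = everything after.
    static String[] split(String line) {
        int i = line.indexOf(' ');
        if (i < 0) {
            // No space at all: the whole line becomes the domain and an
            // empty range is stored without any complaint.
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, i), line.substring(i + 1) };
    }
}
```

A well-formed line such as `apples {macintosh, gala}` splits cleanly, but a garbage line splits just as quietly; the corrupted range string then waits in the database until some later report finally reads it.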