Read and Write CSV Data with jackson-dataformat-csv
In addition to item reader and writer based on super-csv library, jberet-support also contains jacksonCsvItemReader
and jacksonCsvItemWriter
, which implement reading from and writing to CSV using jackson-dataformat-csv library. This offers a convenient alternative especially for those applications that already depend on jackson family of libraries.
The following dependency is required for jacksonCsvItemReader
and jacksonCsvItemWriter
:
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-jaxb-annotations</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-csv</artifactId>
</dependency>
Configure jacksonCsvItemReader
and jacksonCsvItemWriter
in job xml
The following is a partial example job xml defining jacksonCsvItemReader
and jacksonCsvItemWriter
with selected batch properties:
<job id="MovieTestWithJacksonCsv" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
<step id="MovieTestWithJacksonCsv.step1">
<chunk>
<reader ref="jacksonCsvItemReader">
<properties>
<property name="resource" value="movies-2012.csv"/>
<property name="start" value="#{jobParameters['start']}"/>
<property name="end" value="#{jobParameters['end']}"/>
<property name="beanType" value="#{jobParameters['beanType']}"/>
<property name="columns" value="#{jobParameters['columns']}"/>
<property name="useHeader" value="#{jobParameters['useHeader']}?:false;"/>
</properties>
</reader>
<writer ref="jacksonCsvItemWriter">
<properties>
<property name="resource" value="#{jobParameters['writeResource']}"/>
<property name="beanType" value="#{jobParameters['beanType']}"/>
<property name="writeMode" value="overwrite"/>
<property name="columns" value="#{jobParameters['columns']}"/>
<property name="useHeader" value="#{jobParameters['useHeader']}?:false;"/>
</properties>
</writer>
</chunk>
</step>
</job>
Batch Configuration Properties for Both jacksonCsvItemReader
and jacksonCsvItemWriter
resource
The resource to read from (for batch readers), or write to (for batch writers).
beanType
java.lang.Class
Specifies a fully-qualified class or interface name that maps to a row of the source CSV file. For example,
- a custom java type that represents data item and serves as the CSV schema class
java.util.Map
java.util.List
java.lang.String[]
com.fasterxml.jackson.databind.JsonNode
When using java.util.List
or java.lang.String[]
for reading, it is deemed raw access, and CSV schema will not be configured and any schema-related properties are ignored. Specifically, CSV header and comment lines are read as raw access content.
columns
Specifies CSV schema in one of the 2 ways:
- columns = "<fully-qualified class name>":
CSV schema is defined in the named POJO class, which typically has class-level annotation com.fasterxml.jackson.annotation.JsonPropertyOrder to define property order corresponding to CSV column order.
- columns = "<comma-separated list of column names, each of which may be followed by a space and column type>":
use the value to manually build CSV schema. Valid column types are defined in com.fasterxml.jackson.dataformat.csv.CsvSchema.ColumnType
, including:
STRING
STRING_OR_LITERAL
NUMBER
NUMBER_OR_STRING
BOOLEAN
ARRAY
For complete list and descriptioin, see com.fasterxml.jackson.dataformat.csv.CsvSchema.ColumnType
javadoc.
For example,
columns = "org.jberet.support.io.StockTrade"
columns = "firstName STRING, lastName STRING, age NUMBER"
In jacksonCsvItemReader
, if this property is not defined and useHeader
is true (CSV input has a header), the header is used to create CSV schema. However, when beanType
is java.util.List
or java.lang.String[]
, the reader is considered raw access, and all schema-related properties are ignored.
This property is optional for reader and required for writer class.
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
useHeader
boolean
whether the first line of physical document defines column names (true) or not (false): if enabled, parser will take first-line values to define column names; and generator will output column names as the first line. Optional property.
For jacksonCsvItemReader
, if beanType
is java.util.List
or java.lang.String[]
, it is considered raw access, useHeader
property is ignored and no CSV schema is used.
valid values are true or false, and the default is false.
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
quoteChar
Character used for quoting values that contain quote characters or linefeeds. Optional property and defaults to " (double-quote character).
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
columnSeparator
Character used to separate values.
Optional property and defaults to , (comma character). Other commonly used values include tab (\t
) and pipe (|
)
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
Other Jackson Configuration
The following batch properties can be used to configure jackson json objects such as JsonFactory
, ObjectMapper
, serialization, deserialization, and custom jackson modules.
jsonFactoryFeatures
mapperFeatures
jsonFactoryLookup
serializationFeatures
customSerializers
deserializationFeatures
customDeserializers
customDataTypeModules
See Chapter JsonItemReader and JsonItemWriter for more details.
Batch Configuration Properties for jacksonCsvItemReader
Only
In addition to the common properties listed above, jacksonCsvItemReader
also supports the following batch properties:
skipBeanValidation
boolean
Indicates whether the current batch reader will invoke Bean Validation API to validate the incoming data POJO. Optional property and defaults to false, i.e., the reader will validate data POJO bean where appropriate.
start
int
Specifies the start position (a positive integer starting from 1
) to read the data. If reading from the beginning of the input CSV, there is no need to specify this property.
end
int
Specify the end position in the data set (inclusive). Optional property, and defaults to Integer.MAX_VALUE
. If reading till the end of the input CSV, there is no need to specify this property.
skipFirstDataRow
Whether the first data line (either first line of the document, if useHeader
=false
, or second, if useHeader
=true
) should be completely ignored by parser. Needed to support CSV-like file formats that include additional non-data content before real data begins (specifically some database dumps do this)
Optional property, valid values are true
and false
and defaults to false
.
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
escapeChar
Character, if any, used to escape values. Most commonly defined as backslash (\
). Only used by parser; generator only uses quoting, including doubling up of quotes to indicate quote char itself. Optional protected and defaults to null
.
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
jsonParserFeatures
java.util.Map<String, String>
A comma-separated list of key-value pairs that specify com.fasterxml.jackson.core.JsonParser
features. Optional property and defaults to null
. For example,
ALLOW_COMMENTS=true, ALLOW_YAML_COMMENTS=true, ALLOW_NUMERIC_LEADING_ZEROS=true, STRICT_DUPLICATE_DETECTION=true
See Also: com.fasterxml.jackson.core.JsonParser.Feature
csvParserFeatures
java.util.Map<String, String>
A comma-separated list of key-value pairs that specify com.fasterxml.jackson.dataformat.csv.CsvParser.Feature
. Optional property and defaults to null. For example,
TRIM_SPACES=false, WRAP_AS_ARRAY=false
See Also: com.fasterxml.jackson.dataformat.csv.CsvParser.Feature
deserializationProblemHandlers
A comma-separated list of fully-qualified names of classes that implement com.fasterxml.jackson.databind.deser.DeserializationProblemHandler
, which can be registered to get called when a potentially recoverable problem is encountered during deserialization process. Handlers can try to resolve the problem, throw an exception or do nothing. Optional property and defaults to null
. For example,
org.jberet.support.io.JsonItemReaderTest$UnknownHandler, org.jberet.support.io.JsonItemReaderTest$UnknownHandler2
See Also: com.fasterxml.jackson.databind.deser.DeserializationProblemHandler
, com.fasterxml.jackson.databind.ObjectMapper#addHandler(com.fasterxml.jackson.databind.deser.DeserializationProblemHandler)
inputDecorator
java.lang.Class
Fully-qualified name of a class that extends com.fasterxml.jackson.core.io.InputDecorator
, which can be used to decorate input sources. Typical use is to use a filter abstraction (filtered stream, reader) around original input source, and apply additional processing during read operations. Optional property and defaults to null. For example,
org.jberet.support.io.JsonItemReaderTest$NoopInputDecorator
See Also: com.fasterxml.jackson.core.JsonFactory#setInputDecorator(com.fasterxml.jackson.core.io.InputDecorator)
, com.fasterxml.jackson.core.io.InputDecorator
Batch Configuration Properties for jacksonCsvItemWriter
Only
In addition to the common properties listed above, jacksonCsvItemWriter
also supports the following batch properties:
nullValue
When asked to write Java null
, this String value will be used instead. Optional property and defaults to empty string.
See Also com.fasterxml.jackson.dataformat.csv.CsvSchema
writeMode
Instructs csvItemWriter
, when the target CSV resource already exists, whether to append to, or overwrite the existing resource, or fail. Valid values are:
append
(default)overwrite
failIfExists
lineSeparator
Character used to separate data rows. Only used by generator; parser accepts three standard linefeeds (\r
, \r\n
, \n
). Optional protected and defaults to \n
.
See Also: com.fasterxml.jackson.dataformat.csv.CsvSchema
jsonGeneratorFeatures
java.util.Map<String, String>
A comma-separated list of key-value pairs that specify com.fasterxml.jackson.core.JsonGenerator
features. Optional property and defaults to null. Keys and values must be defined in com.fasterxml.jackson.core.JsonGenerator.Feature
. For example,
WRITE_BIGDECIMAL_AS_PLAIN=true, WRITE_NUMBERS_AS_STRINGS=true, QUOTE_NON_NUMERIC_NUMBERS=false
See Also: com.fasterxml.jackson.core.JsonGenerator.Feature
csvGeneratorFeatures
java.util.Map<String, String>
A comma-separated list of key-value pairs that specify com.fasterxml.jackson.dataformat.csv.CsvGenerator.Feature
. Optional property and defaults to null
. For example,
STRICT_CHECK_FOR_QUOTING=false, OMIT_MISSING_TAIL_COLUMNS=false, ALWAYS_QUOTE_STRINGS=false
See Also: com.fasterxml.jackson.dataformat.csv.CsvGenerator.Feature
outputDecorator
java.lang.Class
Fully-qualified name of a class that implements com.fasterxml.jackson.core.io.OutputDecorator
, which can be used to decorate output destinations. Typical use is to use a filter abstraction (filtered output stream, writer) around original output destination, and apply additional processing during write operations. Optional property and defaults to null
. For example,
org.jberet.support.io.JsonItemReaderTest$NoopOutputDecorator
See Also: com.fasterxml.jackson.core.io.OutputDecorator
, com.fasterxml.jackson.core.JsonFactory#setOutputDecorator(com.fasterxml.jackson.core.io.OutputDecorator)