Converting XML to JSON + Raw Use in MongoDB + Spring Batch

Overview

Why convert XML to JSON for raw use in MongoDB?

Since MongoDB stores records as JSON documents, much as a relational database stores records in tables and rows, we naturally need to convert our XML to JSON.

Some applications may need to store raw (unmodified) JSON because there is uncertainty in how the data will be structured.

There are hundreds of XML-based standards. If an application is to process XML files that do not follow the same standard, there is uncertainty in how the data will be structured.

Why use Spring Batch?

Spring Batch provides reusable functions that are essential for processing large volumes of records, along with other features that enable high-volume, high-performance batch jobs. Spring Batch is well documented on the Spring website.

For another tutorial on Spring Batch, see my previous post on Processing CSVs with Spring Batch.

0 – Converting XML to JSON For Use In MongoDB With Spring Batch Example Application

The example application converts an XML document that is a “policy” for configuring a music playlist. The policy is intended to resemble real cyber security configuration documents. It is short, but it illustrates how you would search more complex XML documents.

The approach we take in this tutorial is geared toward handling XML files of varying structure. We want to be able to handle the unexpected, which is why we keep the data “raw.”

View and Download the code from Github

1 – Project Structure

It is a typical Maven structure. We have one package for this example application. The XML file is in src/main/resources.
Project structure of converting xml to json with spring batch mongodb

2 – Project Dependencies

Besides our typical Spring Boot dependencies, we include dependencies for an embedded MongoDB database and for processing JSON.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.michaelcgood</groupId>
	<artifactId>michaelcgood-spring-batch-mongodb</artifactId>
	<version>0.0.1</version>
	<packaging>jar</packaging>

	<name>michaelcgood-spring-batch-mongodb</name>
	<description>Michael C  Good - XML to JSON + MongoDB + Spring Batch Example</description>

	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>1.5.7.RELEASE</version>
		<relativePath /> <!-- lookup parent from repository -->
	</parent>

	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
		<java.version>1.8</java.version>
	</properties>

	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-batch</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
		<dependency>
			<groupId>de.flapdoodle.embed</groupId>
			<artifactId>de.flapdoodle.embed.mongo</artifactId>
			<version>1.50.5</version>
		</dependency>
		<dependency>
			<groupId>cz.jirutka.spring</groupId>
			<artifactId>embedmongo-spring</artifactId>
			<version>RELEASE</version>
		</dependency>
		<dependency>
			<groupId>org.json</groupId>
			<artifactId>json</artifactId>
			<version>20170516</version>
		</dependency>

		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-mongodb</artifactId>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
		</plugins>
	</build>


</project>

3 – XML Document

This is the example policy document created for this tutorial. Its structure is based on real cyber security policy documents.

  • Note that the root element of the document is the Policy tag.
  • Important information lies within the Group tag.
  • Look at the values that reside within the tags, such as the id in Policy or the date within status.

There is a lot of information condensed into this small document. For instance, note the XML namespace declarations (xmlns). We won’t touch on namespaces in the rest of the tutorial, but depending on your goals they could be something to add logic for.
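
One practical consequence worth noting: when org.json’s XML.toJSONObject converts this document (as step1 does below), XML attributes are carried over as ordinary JSON keys, so the xmlns:xsi declaration should simply show up as another key on the Policy object alongside id and style.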

<?xml version="1.0"?>
<Policy  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" style="STY_1.1" id="NRD-1">
  <status date="2017-10-18">draft</status>
  <title xmlns:xhtml="http://www.w3.org/1999/xhtml">Guide to the Configuration of Music Playlist</title>
   <description xmlns:xhtml="http://www.w3.org/1999/xhtml" >This guide presents a catalog of relevant
    configuration settings for a playlist that I listen to while I work on software development.
    <html:br xmlns:html="http://www.w3.org/1999/xhtml"/>
    <html:br xmlns:html="http://www.w3.org/1999/xhtml"/>
    Providing myself with such guidance reminds me how to efficiently
    configure my playlist.  Lorem ipsum <html:i xmlns:html="http://www.w3.org/1999/xhtml">Lorem ipsum,</html:i> 
    and Lorem ipsum.  Some example
    <html:i xmlns:html="http://www.w3.org/1999/xhtml">Lorem ipsum</html:i>, which are Lorem ipsum.
  </description>
  <Group id="remediation_functions">
    <title xmlns:xhtml="http://www.w3.org/1999/xhtml" >Remediation functions used by the SCAP Security Guide Project</title>
    <description xmlns:xhtml="http://www.w3.org/1999/xhtml" >XCCDF form of the various remediation functions as used by
      remediation scripts from the SCAP Security Guide Project</description>
    <Value id="is_the_music_good" prohibitChanges="true" >
      <title xmlns:xhtml="http://www.w3.org/1999/xhtml" >Remediation function to fix bad playlist</title>
      <description xmlns:xhtml="http://www.w3.org/1999/xhtml" >Function to fix bad playlist.
      
        
       Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum
       
       Lorem ipsum
       Lorem ipsum
       Lorem ipsum
       Lorem ipsum
      </description>
      <value>
        function fix_bad_playlist {
        
        # Load function arguments into local variables
       Lorem ipsum
       Lorem ipsum
       Lorem ipsum
        
        # Check sanity of the input
        if [ $# Lorem ipsum ]
        then
        echo "Usage: Lorem ipsum"
        echo "Aborting."
        exit 1
        fi
        
        }
      </value>
    </Value>
    </Group>
    </Policy>

4 – MongoDB Configuration

Below we specify that we are using an embedded MongoDB database, make it discoverable for a component scan that is bundled in the convenience annotation @SpringBootApplication, and specify that mongoTemplate will be a bean.

package com.michaelcgood;

import java.io.IOException;
import cz.jirutka.spring.embedmongo.EmbeddedMongoFactoryBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.*;
import com.mongodb.MongoClient;
 
 
@Configuration
public class MongoConfig {
 
    private static final String MONGO_DB_URL = "localhost";
    private static final String MONGO_DB_NAME = "embeded_db";
    @Bean
    public MongoTemplate mongoTemplate() throws IOException {
        EmbeddedMongoFactoryBean mongo = new EmbeddedMongoFactoryBean();
        mongo.setBindIp(MONGO_DB_URL);
        MongoClient mongoClient = mongo.getObject();
        MongoTemplate mongoTemplate = new MongoTemplate(mongoClient, MONGO_DB_NAME);
        return mongoTemplate;
    }
}
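
The mongoTemplate bean defined here is later @Autowired into the batch configuration class that declares our steps and job. A minimal sketch of that wiring, assuming a configuration class along these lines (the class name BatchConfiguration and the field layout are assumptions, not taken verbatim from the repository):

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.MongoTemplate;

@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    // builder factories provided by @EnableBatchProcessing
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // the MongoTemplate bean from MongoConfig above
    @Autowired
    private MongoTemplate mongoTemplate;

    // step1(), step2(), and the job definition shown in the following sections live here
}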



5 – Processing XML to JSON

step1() of our Spring Batch Job calls three methods that help process the XML into JSON. We will review each one individually.

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        
                        // get path of file in src/main/resources
                        Path xmlDocPath =  Paths.get(getFilePath());
                        
                        // process the file to json
                         String json = processXML2JSON(xmlDocPath);
                         
                         // insert json into mongodb
                         insertToMongo(json);
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }

5.1 – getFilePath()

This method simply builds the path to the XML file; the resulting string is then passed as a parameter to processXML2JSON.
Note:

  • ClassLoader is helping us locate the XML file in our resources folder.
 // no parameter method for creating the path to our xml file
    private String getFilePath(){
        
        String fileName = "FakePolicy.xml";
        ClassLoader classLoader = getClass().getClassLoader();
        File file = new File(classLoader.getResource(fileName).getFile());
        String xmlFilePath = file.getAbsolutePath();
        
        return xmlFilePath;
    }
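
One caveat worth mentioning: classLoader.getResource(fileName).getFile() resolves to a real file system path when running from an IDE or an exploded classpath, but it will not work for a resource packaged inside a jar. A minimal sketch of an alternative that reads the resource as a stream instead (readResourceAsString is a hypothetical helper, not part of the example code):

    // Hypothetical alternative: read the classpath resource directly into a String.
    // Uses java.io.BufferedReader/InputStream/InputStreamReader,
    // java.nio.charset.StandardCharsets and java.util.stream.Collectors.
    private String readResourceAsString(String fileName) throws IOException {
        try (InputStream in = getClass().getClassLoader().getResourceAsStream(fileName);
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }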

5.2 – processXML2JSON(xmlDocPath)

The string returned by getFilePath is converted to a Path and passed into this method as a parameter. A JSONObject is then created from a String containing the contents of the XML file.

   // takes a parameter of xml path and returns json as a string
    private String processXML2JSON(Path xmlDocPath) throws JSONException {
        
        
        String XML_STRING = null;
        try {
            XML_STRING = Files.lines(xmlDocPath).collect(Collectors.joining("\n"));
        } catch (IOException e) {
            e.printStackTrace();
        }
        
        JSONObject xmlJSONObj = XML.toJSONObject(XML_STRING);
        String jsonPrettyPrintString = xmlJSONObj.toString(PRETTY_PRINT_INDENT_FACTOR);
        System.out.println("PRINTING STRING :::::::::::::::::::::" + jsonPrettyPrintString);
        
        return jsonPrettyPrintString;
    }
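
The conversion relies on the org.json classes (JSONObject, XML, and JSONException) brought in by the json dependency in our pom. PRETTY_PRINT_INDENT_FACTOR is a class-level constant passed to JSONObject.toString(int) to control indentation; its exact value is an implementation detail, so the value shown here is an assumption:

    // Indentation factor handed to JSONObject.toString(int).
    // The value 4 is an assumption for illustration; any small positive integer works.
    private static final int PRETTY_PRINT_INDENT_FACTOR = 4;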

5.3 – insertToMongo(json)

We parse the JSON string into a MongoDB document. We then insert this document, with the help of the @Autowired mongoTemplate, into a collection named “foo”.

   // inserts to our mongodb
    private void insertToMongo(String jsonString){
        Document doc = Document.parse(jsonString);
        mongoTemplate.insert(doc, "foo");
    }
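
Note that Document here is org.bson.Document from the MongoDB Java driver. Document.parse keeps the structure of the JSON produced in the previous step, so the data lands in the “foo” collection unmodified.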

6 – Querying MongoDB

step2() of our Spring Batch Job contains our MongoDB queries.

  • mongoTemplate.collectionExists returns a Boolean value based on the existence of the collection.
  • mongoTemplate.getCollection(“foo”).find() returns all the documents within the collection.
  • alldocs.toArray() returns a list of DBObjects.
  • Then we call three methods that we will review individually below.
    @Bean
    public Step step2(){
        return stepBuilderFactory.get("step2")
            .tasklet(new Tasklet(){
            @Override
            public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception{
                // all printing out to console removed for post's brevity
                // checks if our collection exists
                Boolean doesexist = mongoTemplate.collectionExists("foo");
                
                // show all DBObjects in foo collection
                DBCursor alldocs = mongoTemplate.getCollection("foo").find();
                List<DBObject> dbarray = alldocs.toArray();
                
                // execute the three methods we defined for querying the foo collection
                String result = doCollect();
                String resultTwo = doCollectTwo();
                String resultThree = doCollectThree();
               
                return RepeatStatus.FINISHED;
            }
        }).build();
    }

6.1 – First query

The goal of this query is to find a document where style=”STY_1.1”. To accomplish this, we need to remember where style resides in the document. It is a child of Policy; therefore, we address it in the criteria as Policy.style.

The other goal of this query is to only return the id field of the Policy. It is also just a child of Policy.

The result is returned by calling this method: mongoTemplate.findOne(query, String.class, “foo”);. The output is a String, so the second parameter is String.class. The third parameter is our collection name.

    public String doCollect(){
        Query query = new Query();
        query.addCriteria(Criteria.where("Policy.style").is("STY_1.1")).fields().include("Policy.id");
        String result = mongoTemplate.findOne(query, String.class, "foo");
        return result;
    }

6.2 – Second query

The difference between the second query and the first is the fields returned. In the second query, we return Value, which is nested under Group, itself a child of Policy.

    public String doCollectTwo(){
        Query query = new Query();
        query.addCriteria(Criteria.where("Policy.style").is("STY_1.1")).fields().include("Policy.Group.Value");
        String result = mongoTemplate.findOne(query, String.class, "foo");
        
        return result;
    }

6.3 – Third query

The criteria for the third query are different. We want to match only the document with the id “NRD-1” and a status date of “2017-10-18”, and we return just two fields: title and description, which are both children of Value.

Refer to the XML document or the printed JSON in the demo below for further clarification on the queries.

    public String doCollectThree(){
        Query query = new Query();
        query.addCriteria(Criteria.where("Policy.id").is("NRD-1").and("Policy.status.date").is("2017-10-18")).fields().include("Policy.Group.Value.title").include("Policy.Group.Value.description");
        String result = mongoTemplate.findOne(query, String.class, "foo");
        
        return result;
    }

7 – Spring Batch Job

The Job begins with step1 and calls step2 next.

 @Bean
    public Job xmlToJsonToMongo() {
        return jobBuilderFactory.get("XML_Processor")
                .start(step1())
                .next(step2())
                .build();
    }

8 – @SpringBootApplication

This is a standard class with static void main and @SpringBootApplication.

package com.michaelcgood;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

@SpringBootApplication
@EnableAutoConfiguration(exclude={DataSourceAutoConfiguration.class})
public class SpringBatchMongodb {

	public static void main(String[] args) {
		SpringApplication.run(SpringBatchMongodb.class, args);
	}
}

9 – Demo

9.1 – step1

The JSON is printed as a String. I have cut the output after description because the full document is long.

Executing step: [step1]
PRINTING STRING :::::::::::::::::::::{"Policy": {
    "Group": {
        "Value": {
            "prohibitChanges": true,
            "description": {

9.2 – step2

I have cut the results to format the output for the blog post.

Executing step: [step2]

Checking if the collection exists

Status of collection returns :::::::::::::::::::::true

Show all objects

list of db objects returns:::::::::::::::::::::[{ "_id" : { "$oid" : "59e7c0324ad9510acf5773c0"} , [..]

Just return the id of Policy

RESULT:::::::::::::::::::::{ "_id" : { "$oid" : "59e7c0324ad9510acf5773c0"} , "Policy" : { "id" : "NRD-1"}}

To see the other results printed to the console, fork/download the code from Github and run the application.

10 – Conclusion

We have reviewed how to convert XML to JSON, store the raw JSON in MongoDB, and query the database for specific results.

Further reading:

The source code is on Github



Spring Batch CSV Processing

Overview

Topics we will discuss include the essential concepts of batch processing with Spring Batch and how to import data from a CSV file into a database.

0 – Spring Batch CSV Processing Example Application

We are building an application that demonstrates the basics of Spring Batch for processing CSV files. Our demo application will allow us to process a CSV file that contains hundreds of records of Japanese anime titles.

0.1 – The CSV

I have downloaded the CSV we will be using from this Github repository, and it provides a pretty comprehensive list of animes.

Here is a screenshot of the CSV open in Microsoft Excel

Animes CSV screenshot

View and Download the code from Github

1 – Project Structure

Project structure of spring batch application

2 – Project Dependencies

Besides typical Spring Boot dependencies, we include spring-boot-starter-batch, which is the dependency for Spring Batch as the name suggests, and hsqldb for an in-memory database. We also include commons-lang3 for ToStringBuilder.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.michaelcgood</groupId>
	<artifactId>michaelcgood-spring-batch-csv</artifactId>
	<version>0.0.1</version>
	<packaging>jar</packaging>

	<name>michaelcgood-spring-batch-csv</name>
	<description>Michael C  Good - Spring Batch CSV Example Application</description>

	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>1.5.7.RELEASE</version>
		<relativePath /> <!-- lookup parent from repository -->
	</parent>

	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
		<java.version>1.8</java.version>
	</properties>

	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-batch</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-jpa</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>

		<dependency>
			<groupId>org.hsqldb</groupId>
			<artifactId>hsqldb</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
		</dependency>
		<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-lang3 -->
		<dependency>
			<groupId>org.apache.commons</groupId>
			<artifactId>commons-lang3</artifactId>
			<version>3.6</version>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
		</plugins>
	</build>


</project>

3 – Model

This is a POJO that models the fields of an anime. The fields are:

  • ID. For the sake of simplicity, we treat the ID as a String. However, this could be changed to another data type such as an Integer or Long.
  • Title. This is the title of the anime and it is appropriate for it to be a String.
  • Description. This is the description of the anime, which is longer than the title, and it can also be treated as a String.

What is important to note is our class constructor for the three fields: public AnimeDTO(String id, String title, String description). This will be used in our application. We also need a no-argument default constructor; once a parameterized constructor is defined, Java no longer generates one automatically, and BeanWrapperFieldSetMapper needs it to instantiate the class.

package com.michaelcgood;

import org.apache.commons.lang3.builder.ToStringBuilder;
/**
 * Contains the information of a single anime
 *
 * @author Michael C Good michaelcgood.com
 */

public class AnimeDTO {
	
	public String getId() {
		return id;
	}

	public void setId(String id) {
		this.id = id;
	}

	public String getTitle() {
		return title;
	}

	public void setTitle(String title) {
		this.title = title;
	}

	public String getDescription() {
		return description;
	}

	public void setDescription(String description) {
		this.description = description;
	}



	private String id;
	private String title;
	private String description;
	
	public AnimeDTO(){
		
	}
	
	public AnimeDTO(String id, String title, String description){
		this.id = id;
		this.title = title;
		this.description = description;
	}
	

	
	   @Override
	    public String toString() {
		   return new ToStringBuilder(this)
				   .append("id", this.id)
				   .append("title", this.title)
				   .append("description", this.description)
				   .toString();
	   }


}

4 – CSV File to Database Configuration

There is a lot going on in this class and it is not all written at once, so we are going to go through the code in steps. Visit Github to see the code in its entirety.

4.1 – Reader

As the Spring Batch documentation states, FlatFileItemReader will “read lines of data from a flat file that typically describe records with fields of data defined by fixed positions in the file or delimited by some special character (e.g. Comma)”.

We are dealing with a CSV, so of course the data is delimited by a comma, making this reader a perfect fit for our file.

   @Bean
    public FlatFileItemReader<AnimeDTO> csvAnimeReader(){
        FlatFileItemReader<AnimeDTO> reader = new FlatFileItemReader<AnimeDTO>();
        reader.setResource(new ClassPathResource("animescsv.csv"));
        reader.setLineMapper(new DefaultLineMapper<AnimeDTO>() {{
            setLineTokenizer(new DelimitedLineTokenizer() {{
                setNames(new String[] { "id", "title", "description" });
            }});
            setFieldSetMapper(new BeanWrapperFieldSetMapper<AnimeDTO>() {{
                setTargetType(AnimeDTO.class);
            }});
        }});
        return reader;
    }

Important points:

  • FlatFileItemReader is parameterized with a model. In our case, this is AnimeDTO.
  • FlatFileItemReader must be given a resource via the setResource method. Here we set the resource to animescsv.csv.
  • The setLineMapper method converts Strings to objects representing the item. Our String is an anime record consisting of an id, title, and description, and it is made into an object. Note that DefaultLineMapper is parameterized with our model, AnimeDTO.
  • However, the LineMapper is given a raw line, which means there is work to be done to map the fields appropriately. The line must first be tokenized into a FieldSet, which DelimitedLineTokenizer takes care of.
  • Now that we have a FieldSet, we need to map it. setFieldSetMapper is used to take the FieldSet object and map its contents to a DTO, which is AnimeDTO in our case (a hand-written alternative is sketched below).
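
For comparison, here is a minimal sketch of a hand-written FieldSetMapper that performs the same mapping explicitly instead of relying on bean-property reflection (the class name AnimeFieldSetMapper is an assumption, not part of the example code):

    import org.springframework.batch.item.file.mapping.FieldSetMapper;
    import org.springframework.batch.item.file.transform.FieldSet;

    // Hypothetical explicit mapper: reads each named token from the FieldSet
    // and builds an AnimeDTO directly.
    public class AnimeFieldSetMapper implements FieldSetMapper<AnimeDTO> {

        @Override
        public AnimeDTO mapFieldSet(FieldSet fieldSet) {
            return new AnimeDTO(
                    fieldSet.readString("id"),
                    fieldSet.readString("title"),
                    fieldSet.readString("description"));
        }
    }

It could be plugged into the reader with setFieldSetMapper(new AnimeFieldSetMapper()) in place of the BeanWrapperFieldSetMapper shown above.
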
4.2 – Processor

If we want to transform the data before writing it to the database, an ItemProcessor is necessary. Our code does not actually apply any business logic to transform the data, but we allow for the capability to do so.

4.2.1 – Processor in CsvFileToDatabaseConfig.java

csvAnimeProcessor returns a new instance of the AnimeProcessor object, which we review below.

    	@Bean
    	ItemProcessor<AnimeDTO, AnimeDTO> csvAnimeProcessor() {
    		return new AnimeProcessor();
    	}
    

4.2.2 – AnimeProcessor.java

If we wanted to apply business logic before writing to the database, we could manipulate the Strings here. For instance, we could add toUpperCase() after getTitle to make the title upper case before it is written to the database. However, I decided not to do that or apply any other business logic for this example processor, so no manipulation is being done. The Processor is here simply for demonstration (a hypothetical variation is sketched after the class below).

    package com.michaelcgood;
    
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    
    import org.springframework.batch.item.ItemProcessor;
    
    public class AnimeProcessor implements ItemProcessor<AnimeDTO, AnimeDTO> {
    	
        private static final Logger log = LoggerFactory.getLogger(AnimeProcessor.class);
        
        @Override
        public AnimeDTO process(final AnimeDTO AnimeDTO) throws Exception {
        	
        	final String id = AnimeDTO.getId();
            final String title = AnimeDTO.getTitle();
            final String description = AnimeDTO.getDescription();
    
            final AnimeDTO transformedAnimeDTO = new AnimeDTO(id, title, description);
    
            log.info("Converting (" + AnimeDTO + ") into (" + transformedAnimeDTO + ")");
    
            return transformedAnimeDTO;
        }
    
    }
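
If we did want a transformation, the process method could, for example, upper-case the title before it is written. A hypothetical variation (not part of the example code):

    @Override
    public AnimeDTO process(final AnimeDTO animeDTO) throws Exception {
        // Hypothetical transformation: upper-case the title before writing it out
        return new AnimeDTO(
                animeDTO.getId(),
                animeDTO.getTitle().toUpperCase(),
                animeDTO.getDescription());
    }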
    
    

4.3 – Writer

The csvAnimeWriter method is responsible for actually writing the values into our database. Our database is an in-memory HSQLDB; however, this application allows us to easily swap out one database for another. The dataSource is autowired, and BeanPropertyItemSqlParameterSourceProvider binds the named parameters :id, :title, and :description in the SQL to the corresponding AnimeDTO properties.

    	@Bean
    	public JdbcBatchItemWriter<AnimeDTO> csvAnimeWriter() {
    		 JdbcBatchItemWriter<AnimeDTO> excelAnimeWriter = new JdbcBatchItemWriter<AnimeDTO>();
    		 excelAnimeWriter.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<AnimeDTO>());
    		 excelAnimeWriter.setSql("INSERT INTO animes (id, title, description) VALUES (:id, :title, :description)");
    		 excelAnimeWriter.setDataSource(dataSource);
    	        return excelAnimeWriter;
    	}
    

4.4 – Step

A Step is a domain object that encapsulates an independent, sequential phase of a batch job and contains all of the information needed to define and control the actual batch processing.

Now that we’ve created the reader and processor, we need to write the data. We’ve been using chunk-oriented processing, meaning items are read one at a time and collected into ‘chunks’ that are written out within a transaction boundary. You set a commit interval, and once the number of items read equals that commit interval, the entire chunk is written out via the ItemWriter and the transaction is committed. We set the chunk size to 1.

I suggest reading the Spring Batch documentation about chunk-oriented processing.

The step below wires together the reader, processor, and writer methods we wrote.

    	@Bean
    	public Step csvFileToDatabaseStep() {
    		return stepBuilderFactory.get("csvFileToDatabaseStep")
    				.<AnimeDTO, AnimeDTO>chunk(1)
    				.reader(csvAnimeReader())
    				.processor(csvAnimeProcessor())
    				.writer(csvAnimeWriter())
    				.build();
    	}
    

4.5 – Job

A Job consists of Steps. We pass the listener into the Job below because we want to track the completion of the Job.

    	@Bean
    	Job csvFileToDatabaseJob(JobCompletionNotificationListener listener) {
    		return jobBuilderFactory.get("csvFileToDatabaseJob")
    				.incrementer(new RunIdIncrementer())
    				.listener(listener)
    				.flow(csvFileToDatabaseStep())
    				.end()
    				.build();
    	}
    



5 – Job Completion Notification Listener

The class below autowires the JdbcTemplate because we’ve already set the dataSource and we want to easily run our query. The results of our query are a list of AnimeDTO objects. For each object returned, we print a message to the console to show that the item has been written to the database.

    package com.michaelcgood;
    
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.List;
    
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    
    import org.springframework.batch.core.BatchStatus;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.listener.JobExecutionListenerSupport;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.jdbc.core.RowMapper;
    import org.springframework.stereotype.Component;
    
    @Component
    public class JobCompletionNotificationListener extends JobExecutionListenerSupport {
    
    	private static final Logger log = LoggerFactory.getLogger(JobCompletionNotificationListener.class);
    
    	private final JdbcTemplate jdbcTemplate;
    
    	@Autowired
    	public JobCompletionNotificationListener(JdbcTemplate jdbcTemplate) {
    		this.jdbcTemplate = jdbcTemplate;
    	}
    
    	@Override
    	public void afterJob(JobExecution jobExecution) {
    		if(jobExecution.getStatus() == BatchStatus.COMPLETED) {
    			log.info("============ JOB FINISHED ============ Verifying the results....\n");
    
    			List<AnimeDTO> results = jdbcTemplate.query("SELECT id, title, description FROM animes", new RowMapper<AnimeDTO>() {
    				@Override
    				public AnimeDTO mapRow(ResultSet rs, int row) throws SQLException {
    					return new AnimeDTO(rs.getString(1), rs.getString(2), rs.getString(3));
    				}
    			});
    
    			for (AnimeDTO AnimeDTO : results) {
    				log.info("Discovered <" + AnimeDTO + "> in the database.");
    			}
    
    		}
    	}
    	
    }
    
    

6 – SQL

We need to create a schema for our database. As mentioned, we have made all fields Strings for ease of use, so we have made their data types VARCHAR.

    DROP TABLE animes IF EXISTS;
    CREATE TABLE animes  (
        id VARCHAR(10),
        title VARCHAR(400),
        description VARCHAR(999)
    );
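
Spring Boot automatically executes a schema.sql found on the classpath against an embedded datasource such as HSQLDB at startup, so no extra configuration is needed to create this table before the job writes to it.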
    
    

7 – Main

This is a standard class with main(). As the Spring documentation states, @SpringBootApplication is a convenience annotation that combines @Configuration, @EnableAutoConfiguration, and @ComponentScan.

    package com.michaelcgood;
    
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    
    @SpringBootApplication
    public class SpringBatchCsvApplication {
    
    	public static void main(String[] args) {
    		SpringApplication.run(SpringBatchCsvApplication.class, args);
    	}
    }
    
    

8 – Demo

8.1 – Converting

The FieldSet is fed through the processor and “Converting” is printed to the console.
Converting CSV to database in Spring Batch

8.2 – Discovering New Items In Database

When the Spring Batch Job is finished, we select all the records and print them out to the console individually.
Discovering newly imported items in database in Spring Batch application

8.3 – Batch Process Complete

When the batch process is complete, this is what is printed to the console.

    Job: [FlowJob: [name=csvFileToDatabaseJob]] completed with the following parameters: [{run.id=1, -spring.output.ansi.enabled=always}] and the following status: [COMPLETED]
    Started SpringBatchCsvApplication in 36.0 seconds (JVM running for 46.616)
    

9 – Conclusion

Spring Batch builds upon the POJO-based development approach and user-friendliness of the Spring Framework to make it easy for developers to create enterprise-grade batch processing applications.

The source code is on Github