| ---
 | |
| tags: []
 | |
| projects: [spring-batch]
 | |
| ---
 | |
| :spring_version: current
 | |
| :spring_boot_version: 1.2.4.RELEASE
 | |
| :Component: http://docs.spring.io/spring/docs/{spring_version}/javadoc-api/org/springframework/stereotype/Component.html
 | |
| :SpringApplication: http://docs.spring.io/spring-boot/docs/{spring_boot_version}/api/org/springframework/boot/SpringApplication.html
 | |
| :toc:
 | |
| :icons: font
 | |
| :source-highlighter: prettify
 | |
| :project_id: gs-batch-processing
 | |
| This guide walks you through the process of creating a basic batch-driven solution.
 | |
| 
 | |
| == What you'll build
 | |
| 
 | |
| You'll build a service that imports data from a CSV spreadsheet, transforms it with custom code, and stores the final results in a database.
 | |
| 
 | |
| 
 | |
| == What you'll need
 | |
| 
 | |
| :java_version: 1.8
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/prereq_editor_jdk_buildtools.adoc[]
 | |
| 
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/how_to_complete_this_guide.adoc[]
 | |
| 
 | |
| 
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/hide-show-gradle.adoc[]
 | |
| 
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/hide-show-maven.adoc[]
 | |
| 
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/hide-show-sts.adoc[]
 | |
| 
 | |
| 
 | |
| 
 | |
| Typically your customer or a business analyst supplies a spreadsheet. In this case, you make it up.
 | |
| 
 | |
| `src/main/resources/sample-data.csv`
 | |
| [source,csv]
 | |
| ----
 | |
| include::initial/src/main/resources/sample-data.csv[]
 | |
| ----
 | |
| 
 | |
| This spreadsheet contains a first name and a last name on each row, separated by a comma. This is a fairly common pattern that Spring handles out of the box, as you will see.
 | |
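To make the row format concrete, here is a minimal plain-Java sketch of turning one comma-separated line into a first name and a last name. It is only an illustration: the guide itself relies on Spring Batch's reader for this, and the `CsvNameParser` and `Name` types below are hypothetical names that do not appear in the guide's code.

```java
// Plain-Java illustration only; in the guide, Spring Batch's reader does this work.
class CsvNameParser {

    // Hypothetical holder for one row; it stands in for the guide's Person class.
    static final class Name {
        final String firstName;
        final String lastName;

        Name(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Splits "first,last" into its two fields and trims surrounding whitespace.
    static Name parseLine(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException(
                    "Expected 2 fields but got " + fields.length + ": " + line);
        }
        return new Name(fields[0].trim(), fields[1].trim());
    }

    public static void main(String[] args) {
        Name name = parseLine("Jill,Doe");
        System.out.println(name.firstName + " " + name.lastName); // prints "Jill Doe"
    }
}
```

The sketch rejects rows that do not have exactly two fields. It does not handle quoted values or embedded commas.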
| 
 | |
| 
 | |
| Next, you write a SQL script to create a table to store the data.
 | |
| 
 | |
| `src/main/resources/schema-all.sql`
 | |
| [source,sql]
 | |
| ----
 | |
| include::initial/src/main/resources/schema-all.sql[]
 | |
| ----
 | |
| 
 | |
| NOTE: Spring Boot runs `schema-@@platform@@.sql` automatically during startup. `-all` is the default for all platforms.
 | |
| 
 | |
| 
 | |
| [[initial]]
 | |
| == Create a business class
 | |
| 
 | |
| Now that you see the format of data inputs and outputs, you write code to represent a row of data.
 | |
| 
 | |
| `src/main/java/hello/Person.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::complete/src/main/java/hello/Person.java[]
 | |
| ----
 | |
| 
 | |
| You can instantiate the `Person` class either with first and last name through a constructor, or by setting the properties.
 | |
| 
 | |
| 
 | |
| == Create an intermediate processor
 | |
| 
 | |
| A common paradigm in batch processing is to ingest data, transform it, and then pipe it out somewhere else. Here you write a simple transformer that converts the names to uppercase.
 | |
| 
 | |
| `src/main/java/hello/PersonItemProcessor.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::complete/src/main/java/hello/PersonItemProcessor.java[]
 | |
| ----
 | |
| 
 | |
| `PersonItemProcessor` implements Spring Batch's `ItemProcessor` interface. This makes it easy to wire the code into a batch job that you define further down in this guide. According to the interface, you receive an incoming `Person` object, after which you transform it to an upper-cased `Person`.
 | |
| 
 | |
| NOTE: There is no requirement that the input and output types be the same. In fact, after one source of data is read, sometimes the application's data flow needs a different data type.
 | |
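The point about input and output types can be sketched in plain Java. The `Processor<I, O>` interface below is a hypothetical stand-in modeled on the one-item-in, one-item-out shape described above, and the `Person` class here is a simplified assumption rather than the guide's `hello.Person`. The second processor, `FullNameProcessor`, is invented purely to show an output type that differs from the input type.

```java
import java.util.Locale;

// Plain-Java illustration only; no Spring Batch classes are used here.
class ProcessorSketch {

    // Hypothetical interface: receives one item of type I and returns one item of type O.
    interface Processor<I, O> {
        O process(I item) throws Exception;
    }

    // Simplified stand-in for the guide's Person class.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Same input and output type: Person in, upper-cased Person out.
    static final class UppercaseProcessor implements Processor<Person, Person> {
        @Override
        public Person process(Person person) {
            // Locale.ROOT is this sketch's choice, to keep results independent of the default locale.
            return new Person(
                    person.firstName.toUpperCase(Locale.ROOT),
                    person.lastName.toUpperCase(Locale.ROOT));
        }
    }

    // Different output type: Person in, String out.
    static final class FullNameProcessor implements Processor<Person, String> {
        @Override
        public String process(Person person) {
            return person.firstName + " " + person.lastName;
        }
    }

    public static void main(String[] args) throws Exception {
        Person upper = new UppercaseProcessor().process(new Person("Jill", "Doe"));
        System.out.println(new FullNameProcessor().process(upper)); // prints "JILL DOE"
    }
}
```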
| 
 | |
| 
 | |
| == Put together a batch job
 | |
| 
 | |
| Now you put together the actual batch job. Spring Batch provides many utility classes that reduce the need to write custom code. Instead, you can focus on the business logic.
 | |
| 
 | |
| `src/main/java/hello/BatchConfiguration.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::complete/src/main/java/hello/BatchConfiguration.java[]
 | |
| ----
 | |
| 
 | |
| For starters, the `@EnableBatchProcessing` annotation adds many critical beans that support jobs and saves you a lot of legwork. This example uses a memory-based database (provided by `@EnableBatchProcessing`), so the data is gone once the application finishes.
 | |
| 
 | |
| Break it down:
 | |
| 
 | |
| `src/main/java/hello/BatchConfiguration.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::/complete/src/main/java/hello/BatchConfiguration.java[tag=readerwriterprocessor]
 | |
| ----
 | |
| 
 | |
| The first chunk of code defines the input, processor, and output.
 | |
| - `reader()` creates an `ItemReader`. It looks for a file called `sample-data.csv` and parses each line item with enough information to turn it into a `Person`.
 | |
| - `processor()` creates an instance of the `PersonItemProcessor` you defined earlier, which converts the data to uppercase.
 | |
| - `writer(DataSource)` creates an `ItemWriter`. This one is aimed at a JDBC destination and automatically gets a copy of the `DataSource` created by `@EnableBatchProcessing`. It includes the SQL statement needed to insert a single `Person`, driven by Java bean properties.
 | |
| 
 | |
| The next chunk focuses on the actual job configuration.
 | |
| 
 | |
| `src/main/java/hello/BatchConfiguration.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::/complete/src/main/java/hello/BatchConfiguration.java[tag=jobstep]
 | |
| ----
 | |
| 
 | |
| The first method defines the job and the second one defines a single step. Jobs are built from steps, where each step can involve a reader, a processor, and a writer. 
 | |
| 
 | |
| In this job definition, you need an incrementer because jobs use a database to maintain execution state. You then list each step; this job has only one. The job ends, and the Java API produces a perfectly configured job.
 | |
| 
 | |
| The `listener()` method lets you hook into the engine and detect when the job is complete, triggering the verification of results.
 | |
| 
 | |
| In the step definition, you define how much data to write at a time. In this case, it writes up to ten records at a time. Next, you configure the reader, processor, and writer using the injected bits from earlier. 
 | |
| 
 | |
| NOTE: `chunk()` is prefixed with `<Person,Person>` because it's a generic method. This represents the input and output types of each "chunk" of processing, and lines up with `ItemReader<Person>` and `ItemWriter<Person>`.
 | |
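The idea of writing "up to ten records at a time" can be sketched as a plain-Java loop. This is a deliberately simplified illustration, not how Spring Batch is implemented: it ignores transactions, restarts, and error handling, and the `ChunkLoopSketch` class and its `run` method are hypothetical names. The reader is modeled as an `Iterator`, the processor as an uppercase conversion, and the writer as a list that records each chunk it receives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; no Spring Batch classes are used here.
class ChunkLoopSketch {

    // Reads and processes items one at a time, then hands them to the
    // "writer" in lists of at most chunkSize items.
    static List<List<String>> run(Iterator<String> reader, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> written = new ArrayList<>();
        List<String> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            // "Process" step: transform the item that was just read.
            chunk.add(reader.next().toUpperCase(Locale.ROOT));
            if (chunk.size() == chunkSize) {
                // "Write" step: one write per full chunk.
                written.add(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            // The final chunk may hold fewer than chunkSize items.
            written.add(chunk);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Jill", "Joe", "Justin", "Jane", "John");
        // A chunk size of 2 is used here so the grouping is visible with five items.
        System.out.println(run(names.iterator(), 2));
        // prints [[JILL, JOE], [JUSTIN, JANE], [JOHN]]
    }
}
```

With a chunk size of ten and the five sample rows, the same loop would produce a single chunk containing all five names.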
| 
 | |
| `src/main/java/hello/JobCompletionNotificationListener.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::/complete/src/main/java/hello/JobCompletionNotificationListener.java[]
 | |
| ----
 | |
| 
 | |
| This code listens for when a job is `BatchStatus.COMPLETED`, and then uses `JdbcTemplate` to inspect the results.
 | |
| 
 | |
| 
 | |
| == Make the application executable
 | |
| 
 | |
| Although batch processing can be embedded in web apps and WAR files, the simpler approach demonstrated below creates a standalone application. You package everything in a single, executable JAR file, driven by a good old Java `main()` method.
 | |
| 
 | |
| 
 | |
| `src/main/java/hello/Application.java`
 | |
| [source,java]
 | |
| ----
 | |
| include::complete/src/main/java/hello/Application.java[]
 | |
| ----
 | |
| 
 | |
| `@SpringBootApplication` is a convenience annotation that adds all of the following:
 | |
|     
 | |
| - `@Configuration` tags the class as a source of bean definitions for the application context.
 | |
| - `@EnableAutoConfiguration` tells Spring Boot to start adding beans based on classpath settings, other beans, and various property settings.
 | |
| - Normally you would add `@EnableWebMvc` for a Spring MVC app, but Spring Boot adds it automatically when it sees **spring-webmvc** on the classpath. This flags the application as a web application and activates key behaviors such as setting up a `DispatcherServlet`.
 | |
| - `@ComponentScan` tells Spring to look for other components, configurations, and services in the `hello` package, allowing it to find classes such as `BatchConfiguration`.
 | |
| 
 | |
| The `main()` method uses Spring Boot's `SpringApplication.run()` method to launch an application. Did you notice that there wasn't a single line of XML? This application is 100% pure Java and you didn't have to deal with configuring any plumbing or infrastructure.
 | |
| 
 | |
| For demonstration purposes, there is code to create a `JdbcTemplate`, query the database, and print out the names of people the batch job inserts.
 | |
| 
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/build_an_executable_jar_subhead.adoc[]
 | |
| 
 | |
| include::https://raw.githubusercontent.com/spring-guides/getting-started-macros/master/build_an_executable_jar_with_both.adoc[]
 | |
| 
 | |
| The job prints out a line for each person that gets transformed. After the job runs, you can also see the output from querying the database.
 | |
| 
 | |
| ....
 | |
| Converting (firstName: Jill, lastName: Doe) into (firstName: JILL, lastName: DOE)
 | |
| Converting (firstName: Joe, lastName: Doe) into (firstName: JOE, lastName: DOE)
 | |
| Converting (firstName: Justin, lastName: Doe) into (firstName: JUSTIN, lastName: DOE)
 | |
| Converting (firstName: Jane, lastName: Doe) into (firstName: JANE, lastName: DOE)
 | |
| Converting (firstName: John, lastName: Doe) into (firstName: JOHN, lastName: DOE)
 | |
| Found <firstName: JILL, lastName: DOE> in the database.
 | |
| Found <firstName: JOE, lastName: DOE> in the database.
 | |
| Found <firstName: JUSTIN, lastName: DOE> in the database.
 | |
| Found <firstName: JANE, lastName: DOE> in the database.
 | |
| Found <firstName: JOHN, lastName: DOE> in the database.
 | |
| ....
 | |
| 
 | |
| 
 | |
| == Summary
 | |
| 
 | |
| Congratulations! You built a batch job that ingested data from a spreadsheet, processed it, and wrote it to a database.
 | |
| 
 | |
| 
 | |
| 
 |