Search results
Preventing N+1 SELECT problem using Spring Data JPA EntityGraph
1. OVERVIEW
What’s the N+1 SELECT problem?
The N+1 SELECT problem happens when an ORM like Hibernate executes one SQL query to retrieve the main entity from a parent-child relationship and then one SQL query for each child object.
The more associated entities you have in your domain model, the more queries will be executed. The more results you get when retrieving a parent entity, the more queries will be executed. This will impact the performance of your application.
This blog post helps you understand what the N+1 SELECT problem is and how to fix it for Spring Boot applications using Spring Data JPA Entity Graph.
2. DATABASE TABLES
Let’s start with a partial ER diagram for the db_dvdrental
relational database and its JPA mappings:
DVD Rental Entity Relationship Diagram
film
, actor
, film_actor
, film_category
, and language
are the relevant tables which I’ll cover in this post.
3. JPA ENTITIES
Let’s now generate the JPA entities mapped to these tables using a Maven plugin:
Film.java
:
FilmActor.java
:
Actor.java
:
Language.java
:
Notice the @OneToMany, @ManyToOne entity associations are LAZY loaded.
4. SPRING DATA JPA REPOSITORIES - N+1 SELECT PROBLEM
Let’s start with the simplest repository implementation using Spring Data JPA to perform CRUD operations for Film JPA entities:
FilmDao.java
:
And that is it. For now.
To reproduce the N+1 SELECT problem using the accompanying source code, let’s first understand the execution flow:
Retrieve films Sequence Diagram
The methods relevant to this post are:
- FilmDao.findAll(page): Retrieves films from the RDBMS.
- FilmResourceMapper.map(films): Maps from the internal domain to the API contracted response.
Sending a GET
request to /api/films
outputs these logs:
logging:
level:
org.hibernate.type.descriptor.sql.BasicBinder: TRACE
spring:
jpa:
show-sql: true
properties:
hibernate:
format_sql: true
Hibernate first executes a SELECT statement to retrieve all films.
For each parent entity Film, Hibernate executes another SELECT statement to retrieve all its FilmActor associated entities (@OneToMany).
And for each FilmActor entity retrieved, Hibernate executes another SELECT statement to retrieve its Actor child entity (@ManyToOne).
But there is more, for each Film, Hibernates also executes yet another SELECT statement to retrieve the Film’s Language (@ManyToOne).
To illustrate the N+1 SELECT problem with an example:
If you retrieve two films, one with five actors, in English and the second film with eight actors, in Spanish, Hibernate ends up executing:
- One SELECT to retrieve both films.
- Two SELECT to retrieve film actors associations, one for each film.
- Five SELECT to retrieve actors for the first film.
- Eight SELECT to retrieve actors for the second film.
- One SELECT to retrieve English language for the first film.
- One last SELECT to retrieve Spanish language associated to the second film.
These add up to eighteen SELECT statements. A very inefficient approach to retrieve associated JPA entities. A lot of round trips to the RDBMS. You’ll very-likely have performance problems once your traffic starts to grow.
5. HIBERNATE’S LazyInitializationException
Before showing how to avoid the N+1 SELECT problem using @EntityGraph, let’s ask a question.
Where were these SELECT statements executed?
The first SELECT from the film table statement is executed via the FilmDao.findAll(page) method. The other SELECT statements are executed when the REST controller converts the Film entities to REST resources using FilmResourceMapper.
Don’t expose your internal domain as your external interface. It might lead to API compatibility issues.
The application’s internal domain might evolve at a different pace than the contracted API responses.
Map or transform the internal domain and send the result as the API response.
But the database transaction is committed after the execution of DvdRentalService.retrieveFilms() as a result of its @Transactional annotation, how does the mapping even work?
It works as a consequence of Spring Boot’s behavior of setting spring.jpa.open-in-view=true
by default. This approach keeps the Hibernate Session open after a database transaction has been committed.
Once you set spring.jpa.open-in-view=false
, which is the recommended practice, sending the same GET
request to /api/films
fails.
The Entity Manager (wraps the Hibernate Session) is closed after executing DvdRentalService.retrieveFilms(). The JPA entities are in a DETACHED state and navigating through the filmActors or any other child entity throws LazyInitializationException.
How do you avoid the N+1 SELECT problem whitout using the Open Entity Manager in View pattern and preventing the LazyInitializationException?
You would have to eagerly-fetch the associated entities of the parent JPA entity. A few ways to do it are:
- SQL JOIN clauses if you are using native queries.
- FETCH JOIN clauses if you are using JPQL or Criteria API.
- Spring Data JPA’s @EntityGraph if you are implementing your repositories using Spring Data JPA. The next section helps you with this option.
6. SPRING DATA JPA REPOSITORIES - @EntityGraph
@EntityGraph, added in JPA 2.1 allows you to define which relations or associated entities to load for your specific use cases.
Let’s now update the Spring Data JPA repository:
FilmDao.java
:
When retrieving Films using FilmDao.findAll(page), the @EntityGraph annotation causes Spring Data JPA and Hibernate to fetch the associated entities included in attributePaths
according to the EntityGraphType specified, FETCH, or LOAD.
Type | Description |
EntityGraphType.FETCH | The attributes included in attributePaths are treated as FetchType.EAGER. Other attributes found in the JPA entities that are not included in attributePaths are treated FetchType.LAZY. |
EntityGraphType.LOAD | The attributes included in attributePaths are treated as FetchType.EAGER. Other attributes found in the JPA entities that are not included in attributePaths are treated according to their specified or default FetchType. |
Notice attributesPath
can also include the nested attributes of a JPA entity collection. filmActors is a collection of FilmActor. And @EntityGraph includes the actor associated entity via filmActors.actor
.
This time sending a GET
request to /api/films
logs:
A single SELECT statement to retrieve Films and its associated JPA entities via JOIN clauses.
7. CONCLUSION
The N+1 SELECT issue will impact the performance of your Java application.
If you are already using Spring Data JPA to interact with a relational database, you can take advantage of its support for JPA’s @EntityGraph.
Spring Data JPA’s own @EntityGraph is easy to add to your existing repositories. It helps to execute only one SELECT statement preventing the N+1 SELECT problem.
8. UPCOMING
Lets take a closer look at the Hibernate logs this request generates:
WARN 84147 --- [nio-8080-exec-1] o.h.h.internal.ast.QueryTranslatorImpl : HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
HHH000104: firstResult/maxResults specified with collection fetch; applying in memory! might be a huge problem depending on the size of your dataset. Stay tuned, I’ll cover how to fix Hibernate HHH000104 firstResult/maxResults warning using Spring Data JPA.
Thanks for reading and as always, feedback is very much appreciated. If you found this post helpful and would like to receive updates when content like this gets published, sign up to the newsletter.
9. SOURCE CODE
Accompanying source code for this blog post can be found at: