IRCTC System Design (Data modeling and Parallel Booking Management System)
What is IRCTC?
Indian Railway Catering and Tourism Corporation is the only authorized entity to provide certain services to Indian railways. The primary function includes online ticket booking, which we will be discussing in this article in detail.
Why are we designing IRCTC?
IRCTC is one of the most complex tickets reservation systems that we have come across. It involves all kinds of services in online ticket booking which includes search, parallel bookings, and cancellations. Hence it comes as a good exercise to try and think how such a complex system is catering to the needs of the massive Indian population with increasing demands.
Functional Requirements
We will be covering a small section of the complete set of requirements in this article. Particularly focussed on the schema structure, train search, and booking management.
- Each user should have a profile with personal details
- Users should be able to search trains by train number and via source and destination.
- Users should be able to see the number of available seats, given the train, source, and destination.
- Users should be able to book tickets (The system should be able to handle concurrent booking requests in a fair manner).
- Seat Reusability of booked seats IS allowed. (User A books a seat S1 from station1 to station2 then UserB should be able to book the same seat S1 from station3 onwards).
- Users should be able to cancel a booked ticket.
Assumptions:
- A train will run from source to destination once a day. In the case of the reverse journey, the ID of the train will change and it will be treated as a new train itself.
- To keep things simple, we will not consider the berth and coach preference in this article.
- We will also only book tickets that are confirmed, waitlisting is not supported.
Database Selection:
We will be using a relational database due to the following reasons:
- Structured data (Every row will have a standard set of information, no unidentified data)
- Transactional and consistent system (Focus is on making a consistent state of the system and atomicity is required while booking trains, a relational database is best suited for this)
Schema Structure:
User: UserId (PK), userName, Email, phoneNo, Address (Stores the details of users to be used at the time of booking).
Train: TrainId(PK), TrainName, SeatCount, CoachCount, RunsOnDays (Stores the details of all train related information)
Station: StationId(PK), StationName, PlatformCount (Stores standard station related information)
Route: RouteId(PK), TrainId(FK), StationId(FK), ArrivalTime, DepartureTime, Date(Stores the information associated with a single route). e.g. ([routeId1, Train1, Delhi, 9am, 9:02am, 04–02–2022], [routeId2, Train1, Bhopal, 11am, 11:02am, 04–02–2022], [routeId3, Train2, Meerut, 9am, 9:02am, 04–02–2022]). This table will be primarily used to implement train search functionality between 2 stations. We will discuss this in detail in further sections.
StationToSeatMapping: Id(PK), TrainId(FK), StationId(FK), Seats(Array), Date(This table maintains the booked seat numbers per train and station combination. This will be used to support partial bookings and get seat availability.) We will discuss this logic in detail.
Booking: Bookingid(PK), TrainId(FK), From_station(FK), To_station(FK), userId(FK), seatId(FK), status(Confirmed/cancelled), Date (Used to maintain booking specific information)
Search Train between two stations logic:
Let's take a look at the ‘Routes’ table above. We have added several routes to the table. Each route has a unique routeId, a reference to TrainId, a reference to StationId, and Arrival and departure time on that station.
To query or search for trains running from Delhi to Bhopal, we can use the above JOIN to get the result.
If we want to search the itinerary for a particular train, that can be done by a simple select query on Routes table order by time.
Seat Availability, Booking, and Reusability Logic:
- Store train information in trains table with seatCount
2. How seat booking will look like:
Let’s say we get a request to book a train from Delhi to Allahabad for a train T. (User would have searched all the trains and selected the one they want the booking on using previous logic)
Step 1: Check whether the seat is available between Delhi and Allahabad for Train T.
Check all the stations between source and destination and see how many seats have already been allocated. If number of seats allocated < total seats then booking CAN be made.
In the above screenshot, seat allocated = 0 and 0<5(seatCount) hence we can allot a seat to this user.
Step 2: Allot the seat if the condition is satisfied and add an entry in booking table and StationToSeatMapping.
Number of available seats = (TotalSeats)-(Number of seats that have been booked between source and destination)
3. SEAT REUSABILITY (PARTIAL SEAT BOOKING)
Reallocate the seats which are available. User A books a seat S1 from station1 to station2 then UserB should be able to book the same seat S1 from station3 onwards.
Consider a Train ‘Train1’ which has a total of 5 seats. Let’s consider that the path followed by train is (Delhi -> Rohtak -> Chandigarh->Shimla).
As per the above table data, we have the following observations:
- A user booked a ticket from Delhi to Rohtak. Let’s say this user was allotted seat number ‘1’. We added ‘1’ in the “seats” Array corresponding to station ‘Delhi’. We didn’t add it in Rohtak as the seat is available from Rohtak onwards.
- Another user booked a ticket from Delhi to Chandigarh. Let’s say this user was allotted seat number ‘2’. We added ‘2’ in the “seats” array corresponding to stations Delhi and Rohtak. (Didn’t add it in Chandigarh as the seat is available from Chandigarh onwards).
- Another user booked the ticket from Chandigarh to Shimla. He got seat number ‘1’ again as it was available.
By maintaining the association between every seat of each train and the corresponding station, we are able to calculate which seat to allocate between stations A and B.
How application logic will work:
a) A user checks the seat availability for a particular train from station A to Station B.
b) If the seat is available, the user raises a request to book the ticket
c) The application gets the rows associated with the required (train, source) and (train, destination). Consider the above example, let's say we get one more booking from Rohtak to Shimla.
- We cannot book seats that are already booked. The seat count for this train is 5. We know the seats which are available [1,3,5]. These are our potential seats that can be allocated.
- We will go to the next station and check which seats have been booked from our potential seats. 1 is already booked so now we are left with [3,5]. Any of the two seats can be allocated to this user.
P.S. The application logic can be made further smarter by allotting the seats which are available partially for short journeys and rest for the bigger journeys.
After the seat is allocated, an entry is made in the Bookings table to record the booking details per user.
4. BOOKING CANCELLATION
If a user requests for cancellation of a BookingId, we can get the associated TrainId, sourceStationId, destinationStationId and SeatId from the Bookings table. Using this information we can change the status of booking to “Cancelled” and vacate the allotted seat from stationToSeatMapping table.
With this schema structure, cancellation is straight forward and database integrity is maintained.
Handling Concurrent User Requests At the Time of Booking
Concurrency is one of the most dreaded things where candidates get confused during an interview. But it’s not that difficult :)
In the design that we discussed above, you might ask how will we manage the consistency of our system if 1000 users try to book a seat for the same train at the same time.
In this design, we can employ pessimistic locking on the StationToSeatMapping table whenever a user wants to book a ticket. To understand this concept in detail you’ll have to learn about isolation levels in the database and a little about locking. (Article to learn about locking in detail)
Reason for choosing pessimistic locking rather than optimistic locking in IRCTC:
- No bad user experience: Instead of throwing an error after the request is made, the user is stopped from making a request if there is a chance of failure.
- Data Integrity: Maintains cleanliness in the database, less rollback as number of requests erroring out are less. Consistency is always maintained.
Logistics:
When a user makes a request to book a ticket between stations A and B for a train T. A lock will be taken on all the rows in SeatMappingTable for Train T and StationId between A and B.
Conclusion
What we did:
- How to design IRCTC schema?
- How will various functional requirements can be fulfilled using the designed schema structure.
- Search trains between two stations
- Seat Availability, Booking and Reusability
- Handling concurrent requests
I hope you liked the article, let us know if you want to understand other segments of IRCTC design as well, such as:
- How will APIs work with the above schema structure?
- How will the microservice architecture look like?
- Granular level details on Payments system etc