In June 2015 our customer GALERIA Kaufhof relaunched their e-commerce platform galeria-kaufhof.de.
Several teams worked for about a year on this greenfield project, which aimed to build a new foundation for their customer shopping experience.
Prior to the official relaunch we conducted comprehensive performance and load tests for quality assurance, configuration testing and capacity planning.
First, a few words on the general architecture of this endeavor.
The goal of the rebuild was to get rid of the monolithic system and introduce a new scalable, shared-nothing architecture of self-contained systems, ready for future features with a reduced time-to-market. Everything from the operational environment up to the user interface and user experience was redesigned and built from scratch.
Kaufhof's teams and architecture are divided along business domain areas into scrum teams:
- Front-end Integration: Integrating the systems into a coherent website experience
- Explore: Controlling teasers
- Search: Product search and navigation
- Evaluate: Details on products
- Order: Order process
- Control: Customer account handling
- Foundation Systems: Horizontal services like media & asset delivery, feature toggles etc.
- Platform Engineering: The machine room. Covering tools, deployment and platform operations.
Systems contained in these domain areas, also called "verticals", are designed to
- not share any code
- be loosely coupled and
- be technologically independent (they currently have Java/Scala and Ruby on Rails teams).
The only thing these self-contained systems share is the platform they operate on. Each system has authority over everything from its front end and business logic down to its data storage.
We joined the effort backed by the platform engineering team (PENG for short). PENG provided insights like log and monitoring data while we conducted our testing and analysis.
We generally assess systems bottom-up. For this project we ended up with three test categories:
Service Tests: We first identified performance-critical components of each vertical and developed a test case scenario for each. The purpose of these fine-grained test cases is to get a starting point and baseline for further testing. These isolated tests are also used to artificially stress code paths when troubleshooting performance-related issues.
Vertical Tests: The next step is to compile these isolated tests into vertical or service-wide tests that cover multiple endpoints and features of a given service. These tests are still isolated on a vertical level so that each team can run them without affecting other systems.
Combined Tests: In the last step more complete tests are compiled. These tests are modeled after reference data from the existing shop system and reuse scenarios developed in previous tests. By design these tests cover almost all verticals and are aimed at identifying previously undiscovered performance-critical dependencies. The goal is to get a broader view of the system, and it is the first time a more user-centric workload is modeled.
There were also a couple of tests to establish a baseline for network performance and latency, as well as a set of specialized tests to stress the shop's content delivery and caching architecture.
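The relationship between the three categories can be sketched in a few lines of Python. This is only an illustration of the bottom-up composition, not the tooling used in the project: the names `Scenario`, `LoadPlan`, `vertical_test` and `combined_test`, the endpoint paths and the idea of expressing the traffic mix as integer weights are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scenario:
    """One isolated service test for a single performance-critical component."""
    vertical: str  # hypothetical label, e.g. "search"
    name: str
    path: str      # hypothetical endpoint path


@dataclass
class LoadPlan:
    """A weighted mix of scenarios; a weight is a relative traffic share."""
    name: str
    mix: dict = field(default_factory=dict)  # Scenario -> weight

    def add(self, scenario, weight):
        self.mix[scenario] = self.mix.get(scenario, 0) + weight
        return self

    def verticals(self):
        """Set of verticals touched by this plan."""
        return {s.vertical for s in self.mix}

    def shares(self):
        """Fraction of the total load assigned to each scenario name."""
        total = sum(self.mix.values())
        return {s.name: w / total for s, w in self.mix.items()}


def vertical_test(vertical, scenarios):
    """Vertical test: every scenario of one vertical, equally weighted."""
    plan = LoadPlan(f"vertical-{vertical}")
    for s in scenarios:
        if s.vertical == vertical:
            plan.add(s, 1)
    return plan


def combined_test(weighted_scenarios):
    """Combined test: reuse scenarios across verticals with a traffic mix
    (in practice derived from reference data of the existing shop)."""
    plan = LoadPlan("combined")
    for scenario, weight in weighted_scenarios:
        plan.add(scenario, weight)
    return plan
```

The point of the structure is reuse: a scenario is written once as a service test and then appears unchanged in the vertical test of its team and, with a weight, in the combined test.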
Infrastructure Provider Evaluation
Using test cases from our first step as a reference, we evaluated different OpenStack-based providers, their flavors of compute instances and other configuration aspects. All providers were located in Europe, but we still had to verify that the bandwidth and base latency between the individual providers and our load generators in Frankfurt (AWS eu-central-1) and Dublin (AWS eu-west-1) were well known and understood.
The platform engineering team built an operating environment to which each team can deploy its application. They make this possible by using the OpenStack APIs and automating the entire provisioning process with Puppet. This made it very simple to bootstrap and manage different target environments and, for example, to make configuration-related changes.
For our customer we ended up comparing multiple providers with traffic originating from Frankfurt and Dublin.
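To make "well known and understood" concrete, a baseline can be reduced to a small table of latency statistics per pair of load generator region and provider. The following is a minimal sketch under assumptions: the function names, the choice of median and 95th percentile as summary values, and the input shape (a dict keyed by `(origin, provider)` with latency samples in milliseconds) are made up for illustration and do not describe how the measurements in this project were actually processed.

```python
import statistics


def summarize(samples_ms):
    """Return median and 95th percentile of latency samples in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(samples_ms, n=20, method="inclusive")[-1]
    return {"median": statistics.median(samples_ms), "p95": p95}


def baseline(measurements):
    """Map each (origin, provider) pair to its latency summary."""
    return {pair: summarize(samples) for pair, samples in measurements.items()}


def rank_providers(measurements, origin):
    """Order providers by median latency as seen from one origin region."""
    table = baseline(measurements)
    rows = [
        (provider, summary["median"])
        for (source, provider), summary in table.items()
        if source == origin
    ]
    return [provider for provider, _ in sorted(rows, key=lambda row: row[1])]
```

Calling `rank_providers(measurements, "eu-central-1")` would list the providers as seen from Frankfurt. Keeping the origin in the key matters because the same provider can look quite different from Frankfurt and from Dublin.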
"The flexibility of StormForger enabled our platform engineering team to run large load tests targeting different environments and data centers with ease."
— Torsten Hamper, Head of System Engineering eShop Systems, GALERIA Kaufhof GmbH
GALERIA Kaufhof succeeded with their relaunch project and managed to create a modern, well-designed and scalable e-commerce platform.
Evaluating different target environments is as important as testing system- and application-level configurations. Understanding the basic performance characteristics is also crucial for capacity planning and resource estimation, for example when you have to ensure the system is ready for big marketing events and traffic spikes.
In case you speak German and would like to read more about the ongoing development, check out GALERIA Kaufhof's blog at galeria-kaufhof.github.io.
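As a rough illustration of how a measured baseline feeds into resource estimation, the arithmetic can be written down as a small function. This is a simplified sketch, not the method used in the project: it assumes throughput scales linearly with the number of instances, and the function name, its parameters and the default headroom of 30 % are assumptions chosen for the example.

```python
import math


def instances_needed(peak_rps, per_instance_rps, headroom=0.3):
    """Instances required to serve peak_rps requests per second while
    keeping a fraction `headroom` of each instance's capacity in reserve.

    per_instance_rps is the sustained throughput one instance reached in
    an isolated load test (assumed to scale linearly across instances).
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_instance_rps * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)
```

The linear-scaling assumption is exactly what combined tests are meant to challenge, so an estimate like this is a starting point to verify under load rather than a result.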