Suo Lu

Thinking in Data


What I Learned from Migrating a Large Enterprise Microservices System to Kubernetes

We built a SIEM system. To meet a customer requirement, we recently reworked it and migrated it to Kubernetes (K8s).
Here is some advice to share:

1. Get to Know Your System Well

Going Cloud Native is not simply moving processes into Docker. You need to fully understand your app and estimate the migration and modification work; it may even involve a new system or hardware plan.

2. Step by Step

Rome was not built in a day. Make a reasonable, safe plan.

3. System Resource Planning

Be aware that Cloud Native requires more resources to run, especially when your system was carefully designed around specific hardware.

4. Application Changes

If the app was designed to be tightly coupled with the OS or hardware, parts of the program may need to be rewritten or reconsidered.

Our Practices

Before Cloud Native

A multi-node, mixed-style distributed system:

After Cloud Native

Lessons Learned

At the beginning, we tried to remove Eureka completely, which was a mistake. Eureka was the foundation of the microservices, and every Java service depended on it. Removing it from a single service required several config changes and even code changes. Worse, we realized that some services were too hard to convert quickly, which meant Eureka could not be removed for the moment. Two weeks were wasted.

Make one common Docker image, and move the differences out so they are loaded dynamically from remote storage.
We first tried to build a customized image for each service on openjdk-alpine, which led to two critical problems:

  1. Tons of images add up to a huge install package, even though many of them are actually very similar.
  2. openjdk-alpine is indeed small. However, it has a critical debugging problem (ref4 ref5). For use cases that do not need to scale the app elastically very often, one big common image is not an issue. BTW, there is a great image-shrinking tool for Docker called docker-slim, and K8s Ephemeral Containers are still in alpha as of this writing.

So we turned to building one common image for all services and moved the runtime (app jar, JDK, shared libs, Python, etc.) to network storage. We use a global ConfigMap to keep every environment value the same.
We keep logging to a “local file” for now, by changing the file name format to include the hostname.
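
To make the idea concrete, here is a minimal sketch of what a launcher inside the common image could look like. It is not our actual startup script: the mount points /mnt/runtime and /mnt/logs, the directory layout under them, and the SERVICE_NAME variable are all hypothetical names chosen for illustration.

```python
import os
import socket
from pathlib import PurePosixPath


def launch_command(service, runtime_root="/mnt/runtime"):
    """Build the java command for one service from the shared runtime mount.

    The layout <root>/jdk and <root>/apps/<service>.jar is an assumption.
    """
    root = PurePosixPath(runtime_root)
    java = root / "jdk" / "bin" / "java"
    jar = root / "apps" / f"{service}.jar"
    return [str(java), "-jar", str(jar)]


def log_file_path(service, log_dir="/mnt/logs", hostname=None):
    """Put the hostname (the pod name inside K8s) into the log file name,
    so replicas of the same service never write to the same file."""
    host = hostname or socket.gethostname()
    return str(PurePosixPath(log_dir) / f"{service}-{host}.log")


if __name__ == "__main__":
    # SERVICE_NAME is a hypothetical variable that each Deployment would set.
    service = os.environ.get("SERVICE_NAME", "demo-service")
    print(launch_command(service))
    print(log_file_path(service))
```

With this shape, the image itself never changes between services; only the variable that selects the service does.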

Network changes have to be made.
The direct way to get the client IP is broken after the migration because of how K8s Services are designed. An extra LB is required if you need the client IP, as we do.
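
As a rough illustration for HTTP traffic only, the sketch below assumes the extra LB is HTTP-aware and adds an X-Forwarded-For header; this post does not say which LB we used, and non-HTTP inputs need a different mechanism. The function name and the plain-dict headers are made up for the example.

```python
import ipaddress


def client_ip(headers, peer_addr):
    """Return the original client IP for a request that came through an LB.

    headers   -- dict of HTTP request headers
    peer_addr -- address of the TCP peer, which after the migration is
                 usually a node or proxy address, not the real client
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    forwarded = normalized.get("x-forwarded-for", "")
    # The left-most entry is the address the first proxy saw.
    first = forwarded.split(",")[0].strip()
    if first:
        try:
            return str(ipaddress.ip_address(first))
        except ValueError:
            pass  # malformed header: fall back to the TCP peer
    return peer_addr
```

Only trust this header when every request is guaranteed to pass through your own LB, since a client can set it to anything.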
Spark submit in client mode does not work in a K8s pod, because the server can't reach back to the pod. You need to change the app to use cluster mode and pass all the environment settings at startup (client code that reads the environment on the fly will no longer make sense).
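
A minimal sketch of such a submit call, assuming the settings are handed over as Spark conf entries. `spark.executorEnv.<NAME>` is a standard Spark property; the driver-side property depends on the cluster manager (for example `spark.yarn.appMasterEnv.` on YARN or `spark.kubernetes.driverEnv.` on K8s), so it is left as a parameter here. The master URL, class name, jar path and variable names are hypothetical.

```python
def build_submit_command(master, main_class, app_jar, env, driver_env_prefix=None):
    """Build a spark-submit command line that uses cluster mode and passes
    every needed environment value explicitly at submit time."""
    cmd = [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",  # the driver runs in the cluster, not in the pod
        "--class", main_class,
    ]
    for name, value in sorted(env.items()):
        # Environment for the executors.
        cmd += ["--conf", f"spark.executorEnv.{name}={value}"]
        if driver_env_prefix:
            # Environment for the driver; the prefix depends on the cluster manager.
            cmd += ["--conf", f"{driver_env_prefix}{name}={value}"]
    cmd.append(app_jar)
    return cmd


if __name__ == "__main__":
    command = build_submit_command(
        master="spark://spark-master:7077",      # hypothetical master URL
        main_class="com.example.Job",            # hypothetical job class
        app_jar="/mnt/runtime/apps/job.jar",     # hypothetical jar path
        env={"ES_HOST": "es:9200"},              # hypothetical setting
    )
    print(" ".join(command))
```

The point is that everything the job needs is fixed in the command itself, so nothing has to be looked up from the submitting pod once the job is running.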


28 Jun 2020