Welcome to the inaugural episode of the All Things Devops Podcast. This is a brand new podcast from the BigBinary Team where we will discuss all things related to devops.

Today we have Rahul and Vishal with us and we will be discussing Rancher, Kubernetes and a couple of the tools and methods that we have been using at BigBinary.

Rahul currently handles all the infrastructure at BigBinary, where over the past six months the team has been shipping containers to production using Kubernetes. Like Rahul, Vishal also works at BigBinary and is currently helping develop an internal tool to deploy apps on Kubernetes with a single click. Today we discuss everything from the process and challenges of integrating with Kubernetes, image building, segregating and labelling apps, and the latest app-building tools. Take a listen!

Key Points From This Episode:

  • The setup process for deploying apps on Kubernetes.
  • Implementing with Rancher versus Kubernetes.
  • Why choose Kubernetes?
  • Alternatives to Kubernetes.
  • Integrating existing development cycles with Kubernetes.
  • Image building with Docker Cloud and Docker Hub.
  • Base images and community image building.
  • Cluster, AWS and Kops communication.
  • Segregating apps on servers.
  • The challenges faced setting up the infrastructure for Kubernetes.
  • And much more!

Transcript

Please note that in the discussion Rahul mistakenly said “Katkoda” instead of “Stratoscale”.

[INTRO] [0:00:00.8] VIPUL: Welcome to the inaugural episode of the All Things DevOps Podcast. This is a brand new podcast from the BigBinary team where we will discuss, well, all things related to DevOps.

[EPISODE]

[0:00:09.8] VIPUL: Today we have Rahul and Vishal with us and we will be discussing all things about Kubernetes and Rancher, and a couple of the things that we have been using at BigBinary. First off, I’m Vipul and I work with BigBinary, and I’ll let Rahul and Vishal introduce themselves. Why don’t you go ahead, Rahul.

[0:00:28.6] RAHUL: Hi everyone. It’s great to be on BigBinary’s podcast. I’m Rahul, currently handling all the infrastructure stuff at BigBinary. For the last six months we have been busy shipping containers to production, and we are using Kubernetes for that. So we are busy optimizing the deployment flow with containers.

[0:00:49.4] VIPUL: Awesome, how about you Vishal?

[0:00:51.1] VISHAL: Yeah, thank you Vipul. My name is Vishal, I work at BigBinary and I primarily work as a Ruby on Rails developer, but for the past few months I have been working on Kubernetes. Right now we are working on building an internal tool to deploy apps on Kubernetes with a single click.

[0:01:13.4] VIPUL: Awesome. How did the whole containerization thing get started, and when did you start using containers, maybe Docker or other container-related technologies?

[0:01:24.3] RAHUL: Yeah, so in one of our projects we have a software-as-a-service kind of product. Whenever our client onboards one new client, we have to create an application under a new domain. Traditionally it was being created on servers, that is on EC2, and it was automated using Chef, Ansible and CloudFormation. For each new client, for each new app, our client needed to boot up some servers and apply all the configuration, and that used to go live.

This was the traditional setup, but when it came to resource management and other things, it was a bit cumbersome, and we were in search of something which could save on resources as well as fit the architectural way microservices run.

We came across containers; we first tried them out locally using Docker, and that turned out to be really useful. But after trying those containers, we had to make a call on which orchestration tool to use. We explored options like Docker Datacenter, Docker Swarm and Elastic Container Service from Amazon.

Some of them worked for us, but some had one feature while missing another, and then we tried technologies like Rancher and Kubernetes. Rancher was good, and we started off with it, but due to our scale and the features we wanted, we were not able to go all the way with Rancher. Then we tried out Kubernetes, and it seemed like it was something which fit our needs, which gave us the scalability and the resource management we were looking for.

We started exploring Kubernetes from Kubernetes 1.2, and that is when we started containerizing our app. The first thing we did was segregate services in terms of a microservices architecture. This is how we decided to use Kubernetes and started containerizing the app.

[0:03:34.4] VISHAL: I have a bit more to add, I think, on using Rancher a while back. That was for one of our other projects, which is Trinity. We added Rancher there because I myself was not too familiar with the whole setup of using Kubernetes at that time, or maybe something even bigger. I believe we started using Rancher because, for someone like me who mostly works on Rails, something like Rancher was pretty straightforward; I could just provide it with a Docker machine.

For the environment that we needed, we were using images that had been provided by, say, CircleCI. We could just start using those images, provide those machines to Rancher and get started with it. Setting those things up was pretty straightforward, so that someone like myself, not being a DevOps person, did not need to know exactly how the containerization takes place and all of the orchestration behind it; I did not need to do those things all by myself.

[0:04:38.3] RAHUL: Exactly. Rancher is made for developers who don’t have much of an ops background and want to run their containerized apps in production; Rancher is the place you can choose. But in our other project’s case, we tried using Rancher, but we had different components, we heavily used services like Redis, and we also needed robust service discovery.

With all that, implementing it with Rancher, we faced some challenges around service discovery, and with Kubernetes it just worked, because Kubernetes offers some features which really fit our needs. I would say Rancher was a great fit in general, but for our other project it was not, and native Kubernetes had all the features we were looking for. We started containerizing and automating the steps, and that went well.

[0:05:39.5] VIPUL: You did give a try to Rancher on the same project you’re speaking about, right? How was that experience? I believe Rancher also provides support for Kubernetes. So that didn’t work really well?

[0:05:54.3] RAHUL: Yeah, one thing about Rancher is that we tried Rancher, but on top of Rancher, instead of using Cattle, we used Kubernetes. It started off well, but as we went on we realized that some features of Kubernetes were still not ported to Rancher’s Kubernetes.

[0:06:15.3] VIPUL: Which is what?

[0:06:16.6] RAHUL: The features which were not yet supported by Rancher. When Kubernetes 1.4 was out, Rancher was still supporting Kubernetes 1.3. We were not able to easily upgrade to Kubernetes 1.4 and use its features. Instead of that, we thought of using native Kubernetes and removing the barrier of Rancher.

The only challenge we faced initially was that we were not going to have the cool UI that Rancher offers.

[0:06:47.2] VIPUL: That is quite interesting, because for a person who doesn’t know much about DevOps, I feel that if I try to look at some of the things which Kubernetes provides, it’s really hard to see all of those things as a person who mostly does not work on ops; it’s mostly just using the Kubernetes CLI, if I’m not mistaken, right?

[0:07:06.2] VISHAL: Yes, but Kubernetes is working more on the UI side to make it more user friendly. Over the past two releases there have been so many improvements in the native Kubernetes UI. If you used Rancher a couple of months ago and you try Kubernetes now, it will be more, you know, handy, with the 1.6 release.

[0:07:35.2] RAHUL: Yeah, and taking advantage of this, folks like OpenShift, [inaudible] and Tectonic from CoreOS came up with cool UIs on top of Kubernetes, but the UI was one thing folks like OpenShift and [inaudible] gave to the public.

That was one thing when we started off, but now with the latest version, 1.7, we are keeping an eye on the future 1.8 release. That is what’s happening in Kubernetes to stabilize the dashboard support. I would say, when it comes to running Kubernetes, we have to handle two things. First is maintaining the Kubernetes cluster, and for developers it should only be about deploying apps on Kubernetes, and this is where we need to segregate a dashboard or UI specific to developers.

This led us to think of an internal automation tool which we can hand over to developers, specifically just for deploying apps on Kubernetes.

[0:08:38.0] VIPUL: What about looking at options apart from Rancher and Kubernetes? What other things did you try before actually deciding on Kubernetes, and was Kubernetes your tool of choice for the new project?

[0:08:51.5] RAHUL: Yeah, we started off with Docker Swarm, I don’t remember the [inaudible] name. It was pretty young and not many people were using it in production. What we started off with just did not scale to our need; then came Docker Datacenter from Docker itself.

Docker bought a project called Tutum, or the way we pronounce it, “tum-tum”, that had a UI for Docker, and Docker Datacenter started off from that. It was in beta, but we found that we were getting into a vendor lock-in situation if we used Docker Datacenter; we would be tightly coupled to Docker, and this is where technologies like Mesos or Kubernetes came up.

Kubernetes was something we could even deploy to bare metal, or we could even deploy Minishift on a Raspberry Pi. That was one thing we spent time on, along with Elastic Container Service from Amazon, because our project was on AWS and we would have preferred it. But at that time ECS was not that mature, and the architecture we wanted for deploying containers was not really going to run efficiently on ECS. After evaluating ECS, we shifted to trying out Kubernetes, and this is where Kubernetes seemed to be a great option over ECS. That was when we decided not to go with ECS.

[0:10:28.6] VIPUL: Correct, that is interesting to hear, because I hear Amazon itself is also working on providing native support for Kubernetes-like orchestration. Am I correct on that?

[0:10:40.7] VISHAL: Yeah, I mean, they haven’t made that statement specifically, but last week AWS joined the CNCF, the Cloud Native Computing Foundation, and they’re sponsoring it as a platinum member. The month before, Microsoft also joined, and Microsoft came up with Azure Container Instances, so something like that is what we can expect from AWS beyond ECS, and almost 62% of people run their Kubernetes clusters on AWS.

This is a good sign for AWS, if they can come up with Kubernetes as a service and a product. I don’t know how they will manage both ECS and Kubernetes as a service, but it would be interesting.

But along those lines, we have some other players like Stratoscale and Tectonic from CoreOS. Initially CoreOS played a major part in Kubernetes, as it still does now, and they offer Tectonic as a service for Kubernetes. So yeah, with AWS joining the CNCF, we can expect something like that.

But I don’t think that is the only thing we need, to have Kubernetes as a service from AWS. It would just be a kind of automation from the point of view of cluster scaling, but the issues which the native applications face would not be addressed, and I don’t think that alone would be a great success. When we look at Google Container Engine, what is that? It is a container service from Google on top of Kubernetes. But nowadays, people don’t prefer a vendor lock-in kind of thing.

[0:12:17.6] VIPUL: Right.

[0:12:18.1] RAHUL: I’m sure AWS might have a plan, but we have to wait and watch.

[0:12:22.2] VIPUL: Which is also surprising if you think about it, because ultimately people are getting vendor-locked into AWS even after using Kubernetes.

The thing I wanted to discuss is: after you picked Kubernetes, what else did you have to do to get started with integrating with Kubernetes, and how exactly did you map the whole of Kubernetes onto the existing development cycle?

[0:12:47.4] RAHUL: Yeah, the first thing is that we had to dockerize our app. Our application project was split into four major components, and when we want to deploy or provision the application, we have to bring up all four components.

The first challenge was dockerizing all four modules of the application. Basically they were using Ruby on Rails, so we started off with dockerizing them and building an image. After that, the second challenge, or the second choice, was which Docker image registry to choose and how to build images for Docker automatically. That one was basically addressed by Vishal, and he can shed more light on how we automated our image building pipeline for our application.

[0:13:42.2] VISHAL: Yeah, thanks Rahul. Our first challenge was to build the Docker image for our application, and as Rahul mentioned, there were a couple of modules, or components, that our application was divided into. So it was a bit challenging to build a Docker image for that application. Initially, we did it with a basic Dockerfile, not worrying too much about cache layering and cache busting in the Dockerfile.
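For context, a first pass of the kind Vishal describes often looks something like the sketch below. This is a hypothetical example for a Rails app, not BigBinary’s actual Dockerfile:

```dockerfile
# Naive first attempt: it works, but it is cache-unfriendly.
# Because COPY . . runs before bundle install, any code change
# busts the layer cache and forces a full gem re-install.
FROM ruby:2.4

RUN apt-get update && apt-get install -y nodejs

WORKDIR /app
COPY . .
RUN bundle install

CMD ["bundle", "exec", "puma"]
```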

Initially, we tried to build the image manually, and we also did not worry about how much time it took to build that image. With some experience, we then optimized the image building logic. Then the next challenge was…

[0:14:29.0] VIPUL: Do you remember how long it took initially to build your images?

[0:14:33.6] VISHAL: Yes, initially it was taking about half an hour; then with some optimizations we reduced that time to just five to six minutes. Yeah, there was a lot of learning from those optimizations.

[0:14:50.1] VIPUL: How can that be such a big optimization? Did you use a different service, or what exactly did you do to gain that? I believe you are doing this image building by yourself now, right?

[0:15:03.9] VISHAL: Yes. I believe earlier it was taking around half an hour to build an image, but that was happening when we built the image on Docker Cloud, or Docker Hub actually. Docker has two services: one is Docker Hub and one is Docker Cloud.

On both of those, if we built the image, it was taking so much time due to resource constraints, and the image building logic they have is kind of sequential; you cannot build two images at the same time. So ultimately we found that it was not working as intended, so instead of building on Docker Cloud, we thought of building it ourselves, and we decided to use Jenkins to build the images on our own servers.

In that case, with the help of some optimizations in the Dockerfile, we were able to reduce that time to just five to six minutes.
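The self-hosted flow Vishal describes boils down to building and pushing from your own CI box, where the local layer cache survives between builds. A minimal sketch with placeholder registry and image names:

```sh
# On the Jenkins worker: build with the local layer cache,
# tag with the current commit, and push to the registry.
TAG=$(git rev-parse --short HEAD)
docker build -t myregistry/app:$TAG .
docker push myregistry/app:$TAG
```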

[0:16:06.8] VIPUL: You mean it is not allowed to run in parallel, but did the tool still support image caching, or layer caching, for Docker or not? Or was that also a hindrance for you?

[0:16:18.5] VISHAL: So Docker images are built in such a way that layers are generated when a Docker image is built. Those layers are reused when building a new image. Now, there are many factors which affect whether an image’s cached layer will be reused, or be expired, or busted. One factor is the checksum of that particular cache layer. If it doesn’t match, then Docker will just expire the previous cache and will try to generate a new image layer; that’s how the layered cache works.

So Docker itself supports these things, but as platforms, Docker Hub as well as Docker Cloud were not respecting them too much, and that is probably why the image building process was taking so much time.

[0:17:13.5] VIPUL: Interesting and [inaudible] optimizations for your build cycle, I think that sped up your image build.

[0:17:22.3] VISHAL: Exactly.

[0:17:23.1] VIPUL: Interesting. What different things do you think are important to consider? I will put it this way: what other things did you improve for your image caching?

[0:17:35.1] VISHAL: Yeah, so the first thing is that your Dockerfile tells you where caching will be applied, or where exactly the cache will be busted or expired. A simple example would be: you have specified in your Dockerfile that a package needs to be retrieved, or installed, from a third-party URL, and if that URL contains some dynamic part, then every time you build the image, the checksum that is generated after installing that package will be different.

And Docker will assume that the package is different, or that the line has a different checksum. In that case it will expire, or bust, the previous cache, and after that all the lines which you had in your Dockerfile will be run over again. So ideally, whatever lines or commands you place in your Dockerfile shouldn’t be dynamic; they should not needlessly expire the cache.
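As a rough illustration of the layer-ordering side of this (again a hypothetical sketch, not the actual BigBinary Dockerfile): copy only the dependency manifests before installing gems, so the expensive install layer is reused until the Gemfile itself changes, and keep every instruction deterministic so its checksum stays stable:

```dockerfile
FROM ruby:2.4

# Deterministic system dependencies: no dynamic URLs, so this
# instruction's checksum never changes between builds.
RUN apt-get update && apt-get install -y nodejs

WORKDIR /app

# Copy only the gem manifests first. The bundle install layer
# is served from cache until Gemfile or Gemfile.lock changes.
COPY Gemfile Gemfile.lock ./
RUN bundle install --jobs 4

# Application code changes often, so it is copied last; only
# the layers from here down are rebuilt on a typical change.
COPY . .

CMD ["bundle", "exec", "puma"]
```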

[0:18:52.3] VIPUL: I think another thing is that, I believe, you are using some community-supported images as well? Like, you could just use those images and then build on top of them, right?

[0:19:03.1] VISHAL: Yes, so always start with a base image the community took care of. For example, if you are trying to build an image for a Ruby on Rails application, then you should actually use a base image provided by the Ruby or Rails community, and the community has taken care of most of the things, like installing the necessary dependencies and whatnot. So instead of starting from scratch, you should actually try to use a base image created by the community.

[0:19:36.5] VIPUL: And in this case, were you using some base image, or were you still building your own? Or were you doing it a different way, where you have a base image that you built yourself and then use that as a base for other images?

[0:19:49.3] VISHAL: Yeah, so in our case, as our application had four different components, our base image was built from a Ruby base image, in particular Ruby [inaudible], and after that this base image is used as the base image for the different components. So you can actually have multiple base images, and you can reuse those base images lower down, in the subcomponents of your application.

So it is not necessary to have just one Dockerfile. You can have multiple Dockerfiles and multiple base images, actually. Just ensure that the base image you are using does not contain any dynamic logic, like setting up the environment dynamically, or adding things to the build context that change on every build, or something like that. It shouldn’t be dynamic. In that case you can safely use a base image.
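A minimal sketch of the pattern Vishal describes, with hypothetical image and registry names: one shared internal base image built from the official Ruby image, and each component’s Dockerfile starting from that internal base.

```dockerfile
# base/Dockerfile — shared internal base image, built once and
# pushed as, say, myregistry/app-base:latest (placeholder name).
FROM ruby:2.4
RUN apt-get update && apt-get install -y nodejs libpq-dev
WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle install
```

```dockerfile
# web/Dockerfile — one of the components, reusing the base so
# its own build only adds the component-specific layers.
FROM myregistry/app-base:latest
COPY . .
CMD ["bundle", "exec", "puma"]
```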

[0:20:49.8] VIPUL: And this is a whole discussion in itself; we could keep touching on all the different ways we could avoid cache busting for Docker. I want to move on to the next thing, which is: after you’re done building this image, what different parts of Docker have you touched, or how are you using it in your application? Can you just list out all of those things and what you are using right now?

[0:21:11.0] RAHUL: So after building an image, the first thing was how to [inaudible] a Kubernetes cluster and how to [inaudible] and stay highly available with Kubernetes. That was one thing, but to answer the question of which Docker terminologies we use: the first thing is the image, then the image build cycle. After that comes continuous integration with our deployment pipeline.

Once our image building was done, we wanted to provision a Kubernetes cluster which was highly available and secure. So we came across different things like Kubeadm, KubeSpray and Kops. We chose Kops, which is an incubating project from Kubernetes itself, and it works on top of Terraform, provisioning Kubernetes clusters on AWS. It can provision the cluster in a private subnet, which gives you a secured network cluster. We played with that and provisioned our trial cluster using Kops, though we had some issues and troubleshooting to do.
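For a flavor of what provisioning with Kops looks like, here is a hedged sketch with placeholder names; exact flags vary across Kops versions:

```sh
# Kops keeps cluster state in an S3 bucket you own (placeholder).
export KOPS_STATE_STORE=s3://my-kops-state-bucket

# Define a cluster with a private topology, as discussed.
kops create cluster \
  --name=k8s.example.com \
  --zones=us-east-1a,us-east-1b \
  --topology=private \
  --networking=weave \
  --node-count=3

# Review the plan, then apply it to actually boot AWS resources.
kops update cluster k8s.example.com --yes
```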

But for our initial tryout, we were using Kops 1.5, and I think at that time Kops was supporting Kubernetes 1.4. After that, one important part was scaling the cluster, then also scaling apps, monitoring the cluster, and alerting on system resources: if one of your services is down, or one of your nodes is down, notify the developers and your ops people.

This led us to think about monitoring and logging. For logging we chose ELK, that is Elasticsearch, Logstash and Kibana, which worked out of the box and came as an add-on from the Kubernetes community itself. For monitoring we thought of Prometheus, using Prometheus with Grafana. So we clubbed all of those things together, and after that we started to look into core application performance and scaling issues, like auto-scaling nodes. Running a Kubernetes cluster on AWS with Kops doesn’t give you auto-scaler functionality by itself.
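The gap Rahul mentions is usually filled with the upstream cluster-autoscaler add-on he describes next; a trimmed, hypothetical fragment of its Deployment manifest on AWS might look like this (image tag and group name are placeholders):

```yaml
# Fragment of a cluster-autoscaler Deployment spec.
containers:
- name: cluster-autoscaler
  image: k8s.gcr.io/cluster-autoscaler:v1.1.0
  command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  # min:max:ASG-name — nodes are added or removed within these
  # bounds based on pending pods.
  - --nodes=2:10:nodes.k8s.example.com
```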

We had to implement the cluster autoscaler, where you specify your minimum and maximum limits, and using that, it can scale your nodes up and down within those limits. The other important thing we do is segregating the apps using labels and namespaces. In our traditional architecture, let’s say our client was adding one new app; he was adding it on two servers, maybe one front end and the other back end somewhere, and that environment was completely segregated.

But when you run your apps across a cluster, maybe a Kubernetes cluster, we don’t have control over where our pod is scheduled. Let’s say our four containers are running on four different systems and when our –

[0:24:19.2] VIPUL: You mean physical machines?

[0:24:20.4] RAHUL: Yeah, on four physical machines, or nodes, and when our application user has access to that particular container, or pod, he might see the pods running on the same node. This led us to think about namespaces, which are again a great feature of Kubernetes, and we decided to club all of our application-specific things into a single namespace. So for each application, or for each client of our application, we are creating a unique namespace.

Under that, we have our front end and back end pods and services, and this is how we segregate. But apart from that, node allocation is still not in our control. So there is the labeling concept from Kubernetes, where we can label the nodes, or physical machines, with something like app equals XYZ, and we specify that label in our pod manifest or deployment manifest, so that our pod is scheduled to run on the particular node which matches the label. Things like that made the deployment cycle easy.
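A minimal sketch of the two mechanisms Rahul describes, with hypothetical names: a per-client namespace, a node labeled by hand, and a pod pinned to matching nodes through a nodeSelector.

```yaml
# Per-client namespace (name is a placeholder).
apiVersion: v1
kind: Namespace
metadata:
  name: client-xyz
---
# Pod scheduled only onto nodes labeled app=xyz, e.g. after:
#   kubectl label nodes <node-name> app=xyz
apiVersion: v1
kind: Pod
metadata:
  name: web
  namespace: client-xyz
spec:
  nodeSelector:
    app: xyz
  containers:
  - name: web
    image: myregistry/web:latest   # placeholder image
```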

[0:25:30.4] VIPUL: Yeah, we are touching on a lot of things here, so I will just try to restrict it. We are discussing a lot of things about Kops, how we are doing auto-scaling at the cluster level as well as, I believe, auto-scaling at the AWS level. I think we can tease these things apart and discuss each of them individually, but since we are running out of time, I would like to take any final words on the challenges you faced in whatever we have discussed: for example, any other challenges you faced while doing image caching, or any challenges one can face while setting up Kops or the whole infrastructure that you use right now for Kubernetes.

Apart from that, we will continue the discussion about exactly how all of these layers work in our next podcast. So, any final words, Rahul?

[0:26:18.0] RAHUL: Yeah, so if you have made the decision to use Kubernetes and you have questions like “where do I host my Kubernetes cluster?”, and you decide to use AWS, there are some limitations and challenges when you want to host your production-grade Kubernetes cluster on AWS. Think of it like: bring your own VPC, and Kops provisions your cluster based on your DNS, so you most probably have to bring your own DNS. I am not sure if this challenge has been sorted out in the latest releases of Kops and Kubernetes, but your domain has to be in Route 53.

Because using the Route 53 service discovery mechanism, Kops provisions the cluster, the API communication happens, and it automatically allocates and registers domains. So VPC and DNS are important things if you are trying to explore Kubernetes or provision a Kubernetes cluster on AWS using Kops.
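For illustration, Kops exposes flags for exactly these two requirements. A hedged sketch with placeholder values (flag availability varies by Kops version):

```sh
# Reuse an existing VPC and host the cluster domain in a
# Route 53 zone you already own (all values are placeholders).
kops create cluster \
  --name=k8s.example.com \
  --zones=us-east-1a \
  --vpc=vpc-0123456789abcdef0 \
  --dns-zone=example.com
```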

[0:27:19.9] VIPUL: Okay, any final words Vishal?

[0:27:22.6] VISHAL: Yes, so while getting started with Kubernetes, one should take a look at the challenges which are involved in setting up Kubernetes. Kubernetes is just an orchestration tool which manages your Docker images, so Kubernetes will not help you build Docker images; you need to think about how to build those in the continuous integration cycle that you may have already designed, so you need to look into that.

Another challenge is setting up the Kubernetes cluster itself. There are multiple tools; Rahul has discussed Kops in more detail, and we will discuss it in the next podcast. Other challenges are maintaining the manifest files, which are needed to create resources on Kubernetes for the deployments, for services and whatnot. You’ll need to either check in those files, or you need an internal tool or a custom tool to at least hold those files, with some way to modify them and apply them on Kubernetes in order to roll out the changes, or actually, the rolling deployment of the changes that you made.
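As a tiny illustration of that workflow (the path and names are hypothetical): manifests live in version control and are applied as a batch; kubectl apply diffs against the live state, so re-running it rolls out only what changed.

```sh
# Keep manifests under version control, then apply the directory.
kubectl apply -f manifests/

# Watch the rolling deployment triggered by the change.
kubectl rollout status deployment/web --namespace=client-xyz
```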

Another challenge is related to authorization and security: who can access what, and with what role, across namespaces; there are multiple things there. Also, one thing I just missed: Kubernetes is for production, or for working in a kind of server environment, but locally you need some way to use it on your development machine. For that you need some different techniques to orchestrate the Docker images on your local machine. I guess we’ll discuss this in more detail next time. But yeah, these are the thoughts from my end.

[0:29:22.1] VIPUL: Yep, we have tons of things to discuss; this was just to get our feet wet with the things that we will be discussing in our upcoming episodes. So yeah, again, thanks Vishal and Rahul. That’s it for this episode, and in the next episode we will continue our discussion about the different things, or different layers, of Kubernetes: for example the tools, as well as things like Kops, optimizations and other topics.

Yeah, and moreover, using internal tools for managing your manifests and other things, as well as how all of these things add up in your own tool when you are deploying. For example, I believe you are doing Rails application deployments, and how exactly that works for your application. Again, thanks Rahul and Vishal. That’s it for this episode. Thanks.

[0:30:07.6] VISHAL: Thanks, Vipul.

[0:30:09.6] RAHUL: Thank you.

[END]