This document describes all the steps needed to bring up the production cluster on Hetzner. It covers:
- server installation
- database
- frontend apps
- backend apps
- SSL
- Grafana + Loki
1 Install servers
We buy the servers from the cloud web interface. For each server, do the following steps when buying:
- Add it to the `brandName-net-01` private network (used to access the NFS storage). In the future we might run the whole cluster on this network.
- Add it to the `brandName-firewall-01` firewall.
- Add it to the `brandName-01` placement group (this way the nodes won't end up on the same physical server, so if one fails the others stay up).
- Add the public IP to the `brandName-firewall-01` firewall; we have two rules that allow traffic between those servers. This is due to the fact that we couldn't make it (the rke2 cluster, here's something similar) work on the private addresses.
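If you prefer the CLI to the web interface, the same attachments can be done when creating the server with hcloud. This is only a sketch: the server name, type, image and ssh key below are placeholders, and it assumes a recent hcloud version that supports these flags.
# Hypothetical example; adjust name/type/image/ssh-key to the real values
hcloud server create \
  --name brandName-node-03 \
  --type cx41 \
  --image ubuntu-20.04 \
  --ssh-key my-key \
  --network brandName-net-01 \
  --placement-group brandName-01 \
  --firewall brandName-firewall-01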
1.1 Change root pass
After buying a new server, we receive an email with the root password. Connect to the server manually and change it.
We also need to add it to the inventory of rke2-ansible.
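A minimal sketch of the manual password change (SERVER-IP is a placeholder for the address from the Hetzner email):
ssh root@SERVER-IP   # log in with the password from the email
passwd               # set the new root password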
1.2 Local utilities and preparations
We need to add the users to the new servers and install the requirements.
sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install ansible
Prepare the key for the ansible_noob user, which will be used to install everything on the nodes.
Generate the key:
ssh-keygen -t rsa -b 4096 -C "ansible_noob"
1.3 Add ansible_noob user
This adds the ansible_noob user to all servers, copies the key, and makes the user a sudoer.
To run this, you will need sshpass installed on your PC:
sudo apt-get install sshpass
ansible-playbook -v -i hosts/hetzner/hosts_ansible_noobs ansible_noob.yml
Note: When you want to install a new node, add it to the ansible_noobs group and run the ansible_noob.yml, then comment/remove the hosts from that group.
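For reference, a hypothetical sketch of the ansible_noobs group in hosts/hetzner/hosts_ansible_noobs (the IPs are placeholders; keep whatever variables the real file already uses):
# Illustration of the inventory format only; normally you'd just edit the file by hand
cat >> hosts/hetzner/hosts_ansible_noobs <<'EOF'
[ansible_noobs]
SERVER-IP-1
SERVER-IP-2
EOF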
1.4 Init server - install utilities for rke
- Update + upgrade
- Add developer users
- NFS server on nfs_servers
ansible-playbook -v -i hosts/hetzner/hosts init_rke2_hetzner.yml
# Or
ansible-playbook -v -i hosts/hetzner/hosts --key-file "~/.ssh/ansible_noob_id_rsa" init_rke2_hetzner.yml
Note: When you want to install a new node, add it to the new_nodes group and run init_rke2_hetzner.yml, then remove the hosts from that group.
To test that the NFS export works, mount it on another server and check that the file shows up:
ssh ansible_noob@SERVER-IP-1
sudo mkdir test
sudo mount 10.112.0.2:/var/nfs/general $(pwd)/test  # mount the export from master1
cd test
touch file
cd ..
sudo umount $(pwd)/test
exit
ssh ansible_noob@SERVER-IP-2
cd /var/nfs/general  # check directly on the NFS server
ls
# file should be there
1.5 Install RKE2
git clone git@github.com:rancherfederal/rke2-ansible.git
cd rke2-ansible/
ansible-galaxy collection install -r requirements.yml
cd inventory/
ln -s ../../rke2_inventory/hetzner/ hetzner
ansible-playbook site.yml -i inventory/hetzner/hosts.ini
To get the kubeconfig (we can omit this, because we can get it from rancher):
ssh ansible_noob@SERVER-IP-2
sudo cp /etc/rancher/rke2/rke2.yaml .
sudo chown ansible_noob: rke2.yaml
exit
scp ansible_noob@SERVER-IP-2:/home/ansible_noob/rke2.yaml $(pwd)/inventory/hetzner/credentials/
# Edit the server IP in rke2.yaml (it points to 127.0.0.1 by default)
export KUBECONFIG=/path/rke2_inventory/hetzner/credentials/rke2.yaml
kubectl get nodes
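For the "edit the server IP" step above, something like this should do it, assuming the copied rke2.yaml still points at the default 127.0.0.1 (SERVER-IP-1 is a placeholder for the first master):
sed -i 's/127.0.0.1/SERVER-IP-1/g' inventory/hetzner/credentials/rke2.yaml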
1.6 Post RKE2 install
Things that we need to do after RKE2 is installed. This is needed for rancher:
cd .. # get back in the ansible folder
# Make sure that the master node is not commented in the new_nodes section
ansible-playbook -v -i hosts/hetzner/hosts post_rke2.yml
1.7 Install rancher
Install helm on your PC, add the rancher repository, and create the namespace for rancher:
# Helm install
curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
sudo apt-get install apt-transport-https --yes
echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
kubectl create namespace cattle-system
Install cert-manager:
# If you have installed the CRDs manually instead of with the `--set installCRDs=true` option added to your Helm install command, you should upgrade your CRD resources before upgrading the Helm chart:
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.crds.yaml
# Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io
# Update your local Helm chart repository cache
helm repo update
# Install the cert-manager Helm chart
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.6.1
# See the cert manager pods
kubectl get pods --namespace cert-manager
Install rancher with rancher-generated certificates; the external certificates will be provided by Cloudflare:
helm install rancher rancher-stable/rancher \
--namespace cattle-system \
--set hostname=rancher-hetzner.brandName.com \
--set replicas=3
# To uninstall
helm uninstall rancher
# Wait for it to finish installing:
kubectl -n cattle-system rollout status deploy/rancher
kubectl -n cattle-system get deploy rancher
Get the link for the first setup:
echo https://rancher-hetzner.brandName.com/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')
Open that in a browser and set the password.
If you forget the password, use this.
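As far as I remember, the reset boils down to running the reset-password helper inside a rancher pod; treat this as a sketch and double-check it against the linked docs:
kubectl -n cattle-system exec -it deploy/rancher -- reset-password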
2 Post install rancher
2.1 Add helm repositories in rancher
In Apps & Marketplace > Repositories > Create, add:
https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
The rest can be postponed:
https://charts.helm.sh/stable
https://charts.helm.sh/incubator
https://charts.jetstack.io
2.2 Prepare secrets
From Storage > Secrets > Create > Opaque.
For rancher backups: backblaze-brandName-hetzner-rancher (source):
accessKey: KEYID
secretKey: SECRET
For the vitess backup: backblaze-brandName-hetzner-vitess, the key should be brandName-hetzner-vitess-key and the value should be this (it looks silly, I know..):
[default]
aws_access_key_id=KEYID
aws_secret_access_key=SECRET
Note: It must be in ~/.aws/credentials format as stated in the docs.
For mailjet: mailjet-api, it should contain two keys:
MAILJET_API_KEY - value
MAILJET_API_SECRET - value
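The same secrets can also be created from the CLI instead of the UI; a sketch for the two backup secrets (KEYID/SECRET are the placeholders from above, and the namespace may need adjusting to wherever the consuming charts expect the secret):
kubectl create secret generic backblaze-brandName-hetzner-rancher \
  --from-literal=accessKey=KEYID \
  --from-literal=secretKey=SECRET

# The vitess secret holds a whole ~/.aws/credentials-style file under one key
cat > /tmp/vitess-creds <<'EOF'
[default]
aws_access_key_id=KEYID
aws_secret_access_key=SECRET
EOF
kubectl create secret generic backblaze-brandName-hetzner-vitess \
  --from-file=brandName-hetzner-vitess-key=/tmp/vitess-creds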
2.3 Install cluster tools
- nfs-subdir-external-provisioner provider (this will pop up when installing the nfs-subdir-external-provisioner; set it as the default class and set archive to true)
- Rancher Backups
These we can install when we really need them:
- Monitoring: 10Gb - 10d
- Alerting Drivers - I'm not sure if we should install this.
NFS install: go to Apps & Marketplace > Charts and search for nfs:
Name: nfs-master1-storage
path: /var/nfs/general
server: 10.112.0.2 # Master1 private IP
allowVolumeExpansion: true
archiveOnDelete: true
defaultClass: true
name: nfs-master1-storage
For Rancher backups use the following: Cluster Tools > Rancher Backups
secret: backblaze-brandName-hetzner-rancher
region: eu-central-003
endpoint: s3.eu-central-003.backblazeb2.com
bucket name: brandName-hetzner-rancher
Then go to Rancher Backups > Backup > Create and create a recurring backup, every day at 12 AM: 0 0 * * * (UTC -> 03:00 RO). The name should be backup-rancher-to-backblaze.
Retention: 30
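If we ever want the same recurring backup as YAML instead of clicking through the UI, something along these lines should match the settings above (a sketch against the rancher-backup CRD; credentialSecretNamespace is an assumption, and field names should be verified against the installed chart version):
kubectl apply -f - <<'EOF'
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: backup-rancher-to-backblaze
spec:
  schedule: "0 0 * * *"            # every day at 00:00 UTC (03:00 RO)
  retentionCount: 30
  resourceSetName: rancher-resource-set
  storageLocation:
    s3:
      credentialSecretName: backblaze-brandName-hetzner-rancher
      credentialSecretNamespace: default   # assumption, adjust to the real namespace
      bucketName: brandName-hetzner-rancher
      region: eu-central-003
      endpoint: s3.eu-central-003.backblazeb2.com
EOF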
3 Prepare database
Everything needed to prepare the database environment for our apps.
3.1 Vitess
Install the operator for vitess:
kubectl apply -f https://raw.githubusercontent.com/vitessio/vitess/main/examples/operator/operator.yaml
Install vitess:
Before this, you should add a backup inside the bucket so it can initialize, or comment out the initializeBackup: true line in the vitess config. Check this for more info on the initial import:
1. Initial schema import - this should be one-time only, we shouldn't need it anymore.
TypeORM doesn't work with vitess atm, I've opened an issue here, so to initialize the database I did the following:
- Created an empty database locally: domain-com-prod-schema
- Started the backend and connected to it
- Ran mysqldump -d -u root -p domain-com-prod-schema > domain-com-prod.sql
- Commented the initializeBackup: true from the vitess cluster, because there is no backup for it.
- Started the vitess cluster - todo link - it will auto-upload a backup to backblaze.
- Uncommented that line and applied the vitess config again. To be sure, I deleted the cluster and re-deployed with the line uncommented.
- Ran the pf.sh script from vitess: bash pf.sh
- Created an alias for mysql: alias mysql="mysql -h 127.0.0.1 -P 15306 -u domain-com_admin" - you need to use the admin user.
- Imported the schema: mysql -pdomain-com_admin < domain-com-prod.sql
2. Update database
If you need to add a new table:
- Run mysqldump -d -u root -p domain-com-prod-schema > domain-com-prod.sql
- Get the sql for that specific table
- Run the pf.sh script from vitess in the specific cluster: bash pf.sh
- Create an alias for mysql: alias mysql="mysql -h 127.0.0.1 -P 15306 -u domain-com_admin" - you need to use the admin user.
- Open the mysql client: mysql -pdomain-com_admin
- Run the query
Where pf.sh is this:
#!/bin/sh
# Forward vtctld locally (web UI on 15000, grpc on 15999)
kubectl port-forward --address localhost "$(kubectl get service --selector="planetscale.com/component=vtctld" -o name | head -n1)" 15000 15999 &
process_id1=$!
# Forward vtgate's mysql port to localhost:15306
kubectl port-forward --address localhost "$(kubectl get service --selector="planetscale.com/component=vtgate,!planetscale.com/cell" -o name | head -n1)" 15306:3306 &
process_id2=$!
sleep 2
echo "You may point your browser to http://localhost:15000, use the following aliases as shortcuts:"
echo 'alias vtctlclient="vtctlclient -server=localhost:15999 -logtostderr"'
echo 'alias mysql="mysql -h 127.0.0.1 -P 15306 -u user"'
echo "Hit Ctrl-C to stop the port forwards"
wait $process_id1
wait $process_id2
Go into the backblaze account and download the last snapshot from contabo, then upload it into the hetzner bucket. Make sure you use the correct folder path: Buckets/brandName-hetzner-vitess/vt/domain-com/-/2021-11-19.000002.dehetznernuremberg-1009888160/. The 2021-11-19.000002.dehetznernuremberg-1009888160 part is important, it should contain the same cell name as the cluster: dehetznernuremberg. I think..
cd vitess
kubectl apply -f hetzner/vitess-cluster.yaml
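Then watch the pods come up (etcd, vtctld, vtgate and the vttablet should all reach Running):
kubectl get pods -w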
Notes:
- If no backups are found in the bucket, it won't start, so we need to set initializeBackup to false.
- Sometimes kubectl doesn't start the vttablet pod; this can be fixed by copying the yaml to another file and re-running it.
Install the vitess client locally (if you don't have it):
wget https://github.com/vitessio/vitess/releases/download/v11.0.1/vitess_11.0.1-92ac1ff_amd64.deb
sudo dpkg -i vitess_11.0.1-92ac1ff_amd64.deb
Check database:
# Port-forward vtctld and vtgate and apply schema and vschema
bash pf.sh &
alias mysql="mysql -h 127.0.0.1 -P 15306 -u domain-com_admin"
alias vtctlclient="vtctlclient -server localhost:15999 -alsologtostderr"
# Pass: domain-com_admin_brandName2
# Go to `http://localhost:15000/app/dashboard` to see the dashboard.
mysql -pdomain-com_admin_brandName2
vtctlclient BackupShard -allow_primary domain-com/-
Atm, typeorm doesn't initialize the db, so we need to do it manually: first create it locally and then import it into vitess:
mysqldump -u root -proot test_typeorm > domain-com.sql
mysql -pdomain-com_admin_brandName2 < domain-com.sql
This should be done whenever we update something, before going into production. We're still waiting for this.
3.2 Backup database
We will have to create a recurring CronJob that creates a backup of the vitess database.
Create it from Workload > CronJobs > Create with the following:
- name: backup-vitess-domain-com
- schedule: 0 0 * * *
- container image: vitess/lite:v12.0.2-mysql80
- pull policy: IfNotPresent
- command: /vt/bin/vtctlclient
- args: -logtostderr --server vt-vtctld-f26eb0bb:15999 BackupShard -allow_primary domain-com/-
Note: when we have multiple replicas, we can remove the allow_primary.
Be sure to check that --server vt-vtctld-f26eb0bb matches the current name of that vtctld service. To do this, run kubectl get svc and check the name of the vt-vtctld service.
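For reference, the equivalent CronJob manifest would look roughly like this (a sketch; vt-vtctld-f26eb0bb is whatever name kubectl get svc reports at the time):
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-vitess-domain-com
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup-vitess-domain-com
              image: vitess/lite:v12.0.2-mysql80
              imagePullPolicy: IfNotPresent
              command: ["/vt/bin/vtctlclient"]
              args: ["-logtostderr", "--server", "vt-vtctld-f26eb0bb:15999",
                     "BackupShard", "-allow_primary", "domain-com/-"]
EOF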
4. Gitlab registry
Add it to Storage -> Secrets -> Create -> Registry.
Registry:
- name: registry-gitlab-com
- url: registry.gitlab.com
- user: DEPLOY_TOKEN_USER
- token: secret
The token was created here with only read_registry access.
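The equivalent imperative command, in case the UI is not handy (DEPLOY_TOKEN_USER and the token value are the placeholders from above):
kubectl create secret docker-registry registry-gitlab-com \
  --docker-server=registry.gitlab.com \
  --docker-username=DEPLOY_TOKEN_USER \
  --docker-password=<deploy-token>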
5. Install other utilities
- Loki stack - next post
6. Deploy apps
6.1 Deploy backend(+admin) & frontend
cd deployment_domain-com
kubectl apply -f hetzner/domain.com-backend.yaml
kubectl apply -f hetzner/domain.com-frontend.yaml
kubectl apply -f contabo/domain.com-backend-admin.yaml
6.2 Deploy certificates
We will have to deploy a ClusterIssuer. We will use the staging certificates from Let's Encrypt to start; we can switch to the production ones in production.
Search for ClusterIssuer in rancher > Create from YAML and add a yaml like the one here:
# Example for production
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
email: mail@gmail.com
preferredChain: ""
privateKeySecretRef:
name: letsencrypt-prod
server: https://acme-v02.api.letsencrypt.org/directory
solvers:
- http01:
ingress:
class: nginx
selector: {}
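Since we start with staging certificates, a matching staging issuer is the same yaml pointed at Let's Encrypt's staging endpoint; a sketch:
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: mail@gmail.com
    preferredChain: ""
    privateKeySecretRef:
      name: letsencrypt-staging
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    solvers:
      - http01:
          ingress:
            class: nginx
        selector: {}
EOF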
6.3 Deploy ingress for frontend & backend
Add the DNS record in cloudflare first, otherwise the certificate won't be generated.
From the rancher UI, Service Discovery -> Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
cert-manager.io/cluster-issuer: letsencrypt-staging
kubernetes.io/ingress.class: nginx
name: domain-com-frontend-ingress
namespace: default
spec:
  rules:
    - host: 'dev.domain.com'
      http:
        paths:
          - path: /app/
            pathType: Prefix
            backend:
              service:
                name: domain-com-frontend-service
                port:
                  number: 80
  tls:
    - hosts:
        - dev.domain.com
      secretName: domain.com-cert # Autogenerated
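After applying the ingress, it's worth checking that cert-manager actually issued the certificate:
kubectl get certificate -n default
kubectl describe certificate domain.com-cert -n default   # Ready should become True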
6.4 Deploy a service
apiVersion: apps/v1
kind: Deployment
metadata:
name: domain-com-backend
labels:
app: domain-com-backend
spec:
replicas: 1
selector:
matchLabels:
app: domain-com-backend
template:
metadata:
labels:
app: domain-com-backend
spec:
imagePullSecrets:
- name: registry-gitlab-com
containers:
- name: domain-com-backend
image: registry.gitlab.com/backend:1.3.0_master_111111
imagePullPolicy: IfNotPresent
ports:
- containerPort: 6060
env:
- name: DB_USER
valueFrom:
secretKeyRef:
name: domain-com-backend-secret
key: db_user
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: domain-com-backend-secret
key: db_password
- name: DB_HOST
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: db_host
- name: DB_NAME
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: db_name
- name: DB_LOGGING
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: db_logging
- name: DB_SYNCHRONIZE
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: db_synchronize
- name: LOG_LEVEL
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: log_level
- name: JWT_EXPIRES_IN
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: jwt_expires_in
- name: JWT_ALGORITHM
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: jwt_algorithm
- name: JWT_SECRET
valueFrom:
secretKeyRef:
name: domain-com-backend-secret
key: jwt_secret
- name: MAILJET_API_KEY
valueFrom:
secretKeyRef:
name: mailjet-api
key: MAILJET_API_KEY
- name: MAILJET_API_SECRET
valueFrom:
secretKeyRef:
name: mailjet-api
key: MAILJET_API_SECRET
- name: FRONTEND_URL
valueFrom:
configMapKeyRef:
name: domain-com-backend-configmap
key: frontend_url
---
apiVersion: v1
kind: Service
metadata:
name: domain-com-backend-service
spec:
selector:
app: domain-com-backend
ports:
- protocol: TCP
port: 6060
targetPort: 6060
---
# Create this first before the deployment
apiVersion: v1
kind: ConfigMap
metadata:
name: domain-com-backend-configmap
data:
db_host: vt-vtgate-41864810
db_name: "domain-com"
db_synchronize: "false"
db_logging: "all"
jwt_expires_in: "200d"
jwt_algorithm: "HS256"
log_level: "debug"
frontend_url: "domain.com"
# the service name
---
# Apply this secret first, before the deployment
apiVersion: v1
kind: Secret
metadata:
name: domain-com-backend-secret
type: Opaque
# data: This is just like stringData only that it's base64 encoded
# db_user: sXumcm3laZU=
# db_password: eGezc2dvcuZ=
# jwt_secret: sdadas
stringData:
db_user: domain-com_backend
db_password: "domain-com_backend_domain-com"
jwt_secret: "jwt"
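Assuming the manifests above are saved in the hetzner/domain.com-backend.yaml file referenced in 6.1, applying and sanity-checking the deployment looks like this:
kubectl apply -f hetzner/domain.com-backend.yaml
kubectl rollout status deploy/domain-com-backend
kubectl logs deploy/domain-com-backend --tail=20   # quick look at the backend logs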