Configuring multi-tenant Cloud Pak for Data environment on OpenShift

Tomasz Hanusiak
Sep 14, 2020


In some scenarios you need to deploy multiple Cloud Pak for Data instances (for example Prod/Dev environments, an HA configuration, or separation of different services); in other cases you need to install Cloud Pak for Data next to existing solutions and want to make sure the products don’t affect each other.

Since Cloud Pak for Data deploys into a dedicated namespace there is some initial separation; this, however, may not be enough. Let me share a couple of additional mechanisms that can be used to keep the workloads independent.

Please make sure that you review your license agreements before you deploy multiple CP4D instances on top of a single OpenShift cluster.

1. Dedicating a group of workers to a namespace.

Allocating specific resources (workers) to Cloud Pak for Data can be done in various ways: taints, labels, tolerations, and so on.
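For comparison, a taint-based setup might look roughly like the sketch below; the taint key/value and the node name are purely illustrative, not a Cloud Pak for Data requirement.

oc adm taint nodes mynode dedicated=cp4d:NoSchedule

The workload then needs a matching toleration in its pod spec:

tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "cp4d"
  effect: "NoSchedule"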

In my opinion the easiest mechanism is the node-selector approach. This solution allows you to fine-tune your configuration in a few simple steps.

1. Label nodes

oc label node mynode type=cp4d
oc get node mynode --show-labels

2. Change your project configuration

oc patch namespace myproject -p '{"metadata":{"annotations":{"openshift.io/node-selector":"type=cp4d"}}}'

The above will not relocate the already running pods, but any new workload will be placed on the desired workers.
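If you also want existing workloads to move onto the labeled workers, one option (my own suggestion, not part of the original steps) is to trigger a reschedule, for example:

oc rollout restart deployment <deployment-name> -n myproject
# or simply delete a pod and let its controller recreate it on a matching node
oc delete pod mypod -n myproject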

You can verify this by describing a newly created pod and grepping for Node-Selectors:

oc describe pod mypod | grep Node-Selectors
Node-Selectors: type=cp4d

More details can be found in the OpenShift documentation.

2. Create a dedicated Storage Class for each namespace.

Usually a single Storage Class is used across multiple namespaces, and while this works fine, sometimes a project-specific storage definition may be needed.

This can be especially useful when using a storage mechanism which is not namespace-aware — for example NFS.

Instead of keeping all the data in a single location (export path), you can separate them by defining a new Storage Class; this allows you to back up or transfer the data and configure security restrictions per project.

For example (assuming NFS):

Provider 1:

- env:
  - name: PROVISIONER_NAME
    value: cpd-storage-dev.io/nfs
  - name: NFS_SERVER
    value: 10.10.10.10
  - name: NFS_PATH
    value: /nfsdev

Provider 2:

- env:
  - name: PROVISIONER_NAME
    value: cpd-storage-prod.io/nfs
  - name: NFS_SERVER
    value: 10.10.10.10
  - name: NFS_PATH
    value: /nfsprod
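The corresponding Storage Classes then point at these provisioner names; a minimal sketch (the class names below are illustrative):

cat << EOF > cp4d-storageclasses.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cpd-storage-dev
provisioner: cpd-storage-dev.io/nfs
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cpd-storage-prod
provisioner: cpd-storage-prod.io/nfs
EOF
oc create -f cp4d-storageclasses.yaml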

3. Create a new Machine Config Pool for individual projects

This approach allows you to apply different changes to specific workers instead of modifying all workers at once.

As of OpenShift Container Platform 4.3 and Cloud Pak for Data 3.0.1, we suggest the following steps to configure nodes:

https://www.ibm.com/support/knowledgecenter/en/SSQNUZ_3.0.1/cpd/install/node-settings.html#node-settings__crio

Please note that some MachineConfig configuration is needed and that by default only one group of workers exists in an OCP cluster.

To change this, follow this process:

Label the desired nodes

oc label node <node-name> node-role.kubernetes.io/cp4d=""
oc get node

Create a new MachineConfigPool

cat << EOF > cp4dmcp.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: cp4d
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,cp4d]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/cp4d: ""
EOF
oc create -f cp4dmcp.yaml
oc get mcp

Test it with

cat << EOF > cp4dmc-test.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: cp4d
  name: 31-cp4d
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:,cp4d-test
        filesystem: root
        mode: 0644
        path: /etc/cp4dtest
EOF
oc create -f cp4dmc-test.yaml
oc get mc | grep 31-cp4d
Wait for the node to restart (monitor `oc get node`), and finally run:

ssh core@<node-name> 'cat /etc/cp4dtest'
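While waiting, you can also follow the rollout on the pool itself; the UPDATED/UPDATING columns show the progress, and a node is temporarily marked SchedulingDisabled while it is drained and rebooted:

oc get mcp cp4d
oc get node -w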

More references:
https://access.redhat.com/solutions/4287111

4. Configure egress IPs and/or redirection through a load balancer

This method allows you to control the traffic coming out of your project/namespace. You may need to do so if your worker nodes are dynamic, if you need to limit the number of firewall rules, or if you simply need to clearly differentiate between two Cloud Pak for Data instances.

The first solution involves egress IP configuration and can be achieved by modifying the NetNamespace resource and then patching hosts’ subnet settings.
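As a rough sketch (assuming OpenShift SDN; the project name, node name, and egress IP below are examples only):

oc patch netnamespace myproject --type=merge -p '{"egressIPs": ["192.168.1.100"]}'
oc patch hostsubnet <node-name> --type=merge -p '{"egressIPs": ["192.168.1.100"]}'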

Please see the OpenShift documentation on egress IPs for details.

The second solution involves redirecting all outgoing traffic through your load balancer; this is particularly useful when dealing with air-gapped environments.

Start by modifying the load balancer configuration (for example haproxy):

Assume that there is a service running on exampleHost (1.2.3.4), listening on port 678.

Edit the configuration file of the load balancer (/etc/haproxy/haproxy.cfg) and restart the load balancer after applying the changes. Add a `frontend` and a `backend` section:

frontend example
    bind *:30678
    mode tcp
    default_backend example-tcp
    option tcplog

backend example-tcp
    mode tcp
    balance source
    server exampleHost 1.2.3.4:678 check

Check that you can talk to the service through the load balancer (`curl -k -v <load_balancer_ip>:30678`).

Once the load balancer has been configured, let’s create a Service and an Endpoints object that point to the load balancer. This way we don’t need to use the load balancer details in our application, and the users can simply talk to example-service as if it were the application running on exampleHost.

cat << EOF > objects.yaml
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  clusterIP: None
  ports:
  - protocol: TCP
    port: 30678
    targetPort: 30678
---
apiVersion: v1
kind: Endpoints
metadata:
  name: example-service
subsets:
- addresses:
  - ip: <load_balancer_ip>
  ports:
  - port: 30678
EOF
oc create -f objects.yaml
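From a pod inside the same namespace you should now be able to reach the service by name alone (the headless Service resolves to the load balancer endpoint defined above), for example:

curl -k -v example-service:30678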
