Debugging Argo CD and OIDC logins

Over the past years a recurring pain in my daily job has been spinning up a fresh Kubernetes cluster and signing into Argo CD for the first time. We are greeted with the following failure:

failed to get token: oauth2: "invalid client" "invalid client credentials."

We know the issue lies with Argo CD itself, since our Identity Provider is used for logging in to 5+ different applications within the same cluster.

Now I have lost hope that this login issue will be resolved by updating Argo CD, at least without someone (me, I guess) pinpointing the root cause.

Let me take you through my journey of discovery, or you can skip ahead to The Root Cause.

The Test Setup

First I created an empty Kind cluster and started building a minimal setup for an OIDC login into Argo CD, using the following products:

- Kind to run a local Kubernetes cluster
- ingress-nginx to route traffic into the cluster
- Dex as the OpenID Connect provider
- Argo CD as the application to log in to

This means this simple cluster looks like this:

[Diagram: a Kind cluster with an ingress routing to Argo CD and Dex.]

Knowing I would need to run and re-run setups, teardowns and tests many, many, many times to check whether the issue was consistently reproducible, I figured it would be a good idea to organize everything in a Makefile.

Some of the code snippets in this post contain variables that are replaced with actual values by envsubst when run through make. This ensures the values are identical across all of the configuration.

Example variables:

- $ARGO: the Argo CD hostname (argocd.host)
- $DEX: the Dex hostname (dex.host)
- $CLIENT_ID: the OIDC client id (argo)
- $CLIENT_SECRET: the shared client secret
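
As a rough sketch of how that substitution works (this mirrors the _setup recipe in the Makefile shown at the end), rendering the Argo CD values file looks roughly like this:

ARGO=argocd.host DEX=dex.host CLIENT_ID=argo CLIENT_SECRET=some-secret-here \
  envsubst '$ARGO,$DEX,$CLIENT_ID,$CLIENT_SECRET' \
  < values/argocd.yaml > /tmp/debugging-argocd/values-argocd.yaml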

Let us review the rest of the relevant configuration before we mimic the OIDC login flow.

Dex

Dex is a widely used OpenID Connect Provider that can be configured with static clients and static user logins, which is beneficial for our test setup.

The installation uses its Helm chart, and for completeness you may review the values file used below.

View Dex values file
envFrom:
  - secretRef:
      name: dex-client-secrets

config:
  issuer: http://$DEX

  staticClients:
    - id: $CLIENT_ID
      name: ArgoCD
      secretEnv: CLIENT_SECRET
      redirectURIs:
        - http://$ARGO/auth/callback

  enablePasswordDB: true

  staticPasswords:
    - email: "admin@example.com"
      # bcrypt hash of the string "password": $(echo password | htpasswd -BinC 10 admin | cut -d: -f2)
      hash: "$2a$10$2b2cU8CPhOTaGrs1HRQuAueS7JTT5ZHsHSzYiFPm1leZck7Mc8T4W"
      username: "admin"
      userID: "08a8684b-db88-4b73-90a9-3cd1661f5466"

  oauth2:
    skipApprovalScreen: true
    passwordConnector: local

  storage:
    type: sqlite3
    config:
      file: /var/dex/dex.db
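
With Dex running behind the ingress, a quick sanity check from outside the cluster is to fetch the standard OIDC discovery document (using the same curl --resolve trick described further down; jq is only there for readability):

curl -sf --resolve dex.host:80:127.0.0.1 \
  http://dex.host/.well-known/openid-configuration | jq .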

Argo CD

Argo CD is installed using its Helm chart with mostly default values. The important non-default settings are:

- server.insecure: true so the server accepts plain http behind the ingress
- admin.enabled: false so logins have to go through OIDC
- oidc.config pointing at Dex as the external OIDC provider, with the client secret read from a secret reference
- dex.enabled: false to disable the Dex instance bundled with the Argo CD chart

That makes the values file look like this:

configs:
  params:
    server.insecure: true

  cm:
    url: http://$ARGO
    admin.enabled: false
    oidc.config: |
      name: Dex
      issuer: http://$DEX
      clientID: $CLIENT_ID
      clientSecret: $argocd-client-secrets:clientSecret
      requestedScopes:
        - openid
        - profile
        - email
        - groups      

dex:
  enabled: false
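
After installing the chart you can confirm that the rendered OIDC settings actually ended up in the argocd-cm ConfigMap; a quick check, assuming the release is installed into the argocd namespace:

kubectl -n argocd get configmap argocd-cm \
  -o jsonpath='{.data.oidc\.config}'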

Kind

It was not trouble-free to set up Kind. You must configure port mappings to make anything inside the Kind cluster available on the outside.

You could use port forwarding instead of configuring Kind; however, managing those ports becomes painful when you want to run multiple make recipes as test cases.

The easier solution was to create the Kind cluster with the following configuration:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "ingress-ready=true"        
    extraPortMappings:
      - containerPort: 80
        hostPort: 80
        protocol: TCP
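
With that configuration saved as kind.config, creating the cluster is a single command (this mirrors what the Makefile does; the cluster name is arbitrary):

kind create cluster --name debugging-argocd --config kind.config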

Ingress

There was a guide for setting up ingress on Kind, and we only needed simple host mapping. You can review the ingress manifests below.

View ingress manifests
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd
  namespace: argocd
spec:
  rules:
    - host: $ARGO
      http:
        paths:
          - pathType: ImplementationSpecific
            backend:
              service:
                name: argocd-server
                port:
                  number: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dex
  namespace: dex
spec:
  rules:
    - host: $DEX
      http:
        paths:
          - pathType: ImplementationSpecific
            backend:
              service:
                name: dex
                port:
                  number: 5556

Client Secrets

For the OIDC login flow to function, both Argo CD and Dex need to know a shared client secret, and both applications can be configured to read this client secret from Kubernetes secrets.

The Makefile contains a variable that is used as the shared secret; it is rendered into two Kubernetes secrets with the same value, one for each application.

For instance, the secret provided to Dex looks like this:

apiVersion: v1
kind: Secret
metadata:
  name: dex-client-secrets
  namespace: dex
type: Opaque
stringData:
  CLIENT_SECRET: $CLIENT_SECRET
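
The secret on the Argo CD side is not shown here, but it is roughly equivalent to the following sketch. Note the app.kubernetes.io/part-of: argocd label, which becomes important in the root cause section:

kubectl -n argocd create secret generic argocd-client-secrets \
  --from-literal=clientSecret="$CLIENT_SECRET"
kubectl -n argocd label secret argocd-client-secrets \
  app.kubernetes.io/part-of=argocd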

Bypass HTTPS/SSL requirements

It is common to run into issues with SSL when testing locally because you want to keep everything simple while you debug. It is also common for there to be a way to turn the SSL requirement off.

Currently we only have port 80/http mapped in our Kind cluster. The ingress happily routes traffic on port 80 towards our applications. Dex does not - at least not by default - enforce SSL. Argo CD needs to be configured to allow unencrypted http. This is done with the following line in its values file:

server.insecure: true

Bypass DNS requirements

Both applications we have set up are reached via the hostnames dex.host and argocd.host, and these hostnames do not exist in any DNS anywhere.

From the Outside

I wanted to make sure running the make recipes for testing purposes would work in isolation, so adding the hostnames to the /etc/hosts file was a no-go.

It was a good thing that I went looking for a different way to handle hostnames because I learned a new thing about curl from this list of name resolving tricks.

The --resolve option overrides the resolution of a hostname:port combination to a specific address (or addresses). This means I can run a curl command against the Argo CD instance running in my Kind cluster like this:

curl -v \
 --resolve argocd.host:80:127.0.0.1 \
 http://argocd.host/

Running curl in verbose mode shows that the option simply adds fake records to the DNS cache for the current invocation, as seen in the following output:

* Added argocd.host:80:127.0.0.1 to DNS cache
* Hostname argocd.host was found in DNS cache
*   Trying 127.0.0.1:80...
* Connected to argocd.host (127.0.0.1) port 80
...

On the Inside

As you will see later, Argo CD needs to resolve the hostname for Dex internally in the Kind cluster. We do not have much choice on the inside of the cluster and will need to manipulate CoreDNS.

Thankfully manipulating CoreDNS is as easy as providing a custom ConfigMap with its configuration.

We grab the current configuration and extend it to make CoreDNS resolve the hostname for Dex to the service named http.dex.svc. See the rewrite line in this ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        rewrite name $DEX http.dex.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }    

Note that http.dex.svc is a service we will have to add ourselves. You can review the service manifest below.

View service manifest
apiVersion: v1
kind: Service
metadata:
  name: http
  namespace: dex
spec:
  selector:
    app.kubernetes.io/instance: dex
    app.kubernetes.io/name: dex
  ports:
    - appProtocol: http
      name: http
      port: 80
      protocol: TCP
      targetPort: http
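
After applying the updated ConfigMap, CoreDNS needs a restart to pick up the change (the Makefile does the same), and the rewrite can be verified from a throwaway pod; the busybox image is just one convenient choice:

kubectl -n kube-system rollout restart deployment/coredns
kubectl run dns-check --rm -it --restart=Never --image=busybox:1.36 \
  -- nslookup dex.host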

Mimicking OIDC Login Flow

After preparing the setup and configuration we are now ready to run tests. Naturally, I also ran the setup and test recipes many times just to get the setup and configuration working.

My very first step before debugging this issue was to find some way of using curl to mimic OIDC logins, and I found this StackOverflow answer. I was able to piece together the commands that lead to a token from the curl commands in that answer (run in verbose mode), supported by inspecting logins into an Argo CD in the wild.

I am letting the diagram below do most of the explaining with HTTP verbs and paths, but there will be more explanation after the diagram.

[Sequence diagram: the numbered steps of the OIDC login between curl, Argo CD (argocd.host) and Dex (dex.host): a GET to /auth/login, 303 redirects over to Dex, fetching the Dex login form, a POST of the credentials, a 303 back to Argo CD's /auth/callback, Argo CD exchanging the code for a token with Dex, and a final response that sets the argocd.token cookie.]

Diagram additional explanations

#2 The cookies must be captured since Argo CD uses them for verification in #8.

#5 The HTML of the login form must be captured since we need to POST to the action endpoint of that form in #6.

#6 This step includes the username and password in the POST payload.

#9 This step was the problematic DNS lookup that required tinkering with CoreDNS.

#11 The cookies must be captured since they now contain the JWT.
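
For reference, here is a condensed shell sketch of the same flow, essentially what the test recipe in the Makefile further down does, using the hostnames and static credentials from this setup:

# Cookie jar shared across all requests; Argo CD relies on it (see #2 and #11).
JAR=$(mktemp)
CURL="curl -sf --resolve argocd.host:80:127.0.0.1 --resolve dex.host:80:127.0.0.1 --cookie $JAR --cookie-jar $JAR"

# Roughly steps 1-5: start the login, follow the redirects to Dex and capture the login form.
$CURL -L -o login.html http://argocd.host/auth/login
ACTION=$(grep -o 'action="[^"]*"' login.html | cut -d'"' -f2 | sed 's/&amp;/\&/g')

# Roughly steps 6-7: POST the static credentials to the form's action endpoint on Dex.
$CURL -D headers.txt -X POST -d 'login=admin@example.com&password=password' "http://dex.host$ACTION"
CALLBACK=$(grep '^Location' headers.txt | cut -d' ' -f2 | tr -d '\r')

# Roughly steps 8-11: follow the redirect back to Argo CD, which sets the argocd.token cookie.
$CURL -o /dev/null "$CALLBACK"
grep argocd.token "$JAR" | cut -f7-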

Conclusion

At this point, while getting the OIDC login into a working state, I had already stumbled over the problem that leads to the failure message:

failed to get token: oauth2: "invalid client" "invalid client credentials."

The root cause I found is only relevant when Argo CD is configured to use "SSO clientSecret with secret references"; see the Argo CD documentation for details.

We are referencing such a secret on the clientSecret line of the Argo CD values file.

configs:
  params:
    server.insecure: true

  cm:
    url: http://$ARGO
    admin.enabled: false
    oidc.config: |
      name: Dex
      issuer: http://$DEX
      clientID: $CLIENT_ID
      clientSecret: $argocd-client-secrets:clientSecret
      requestedScopes:
        - openid
        - profile
        - email
        - groups      

dex:
  enabled: false

The problem is that secrets and config maps with an app.kubernetes.io/part-of: argocd label are queried and substituted into the Argo CD configuration only under certain conditions: in practice, only when the Argo CD components start up, so a referenced secret created or changed afterwards is not picked up.

You may check my setup and test cases by reviewing the Makefile I have been referencing, or run its recipes:

# start out working and then break it
make working test
make break
make test
# start out broken and then fix it
make broken test
make fix
make test
Show Makefile
cluster := debugging-argocd
argo := argocd.host
dex := dex.host

argoVersion := 7.4.4
dexVersion := 0.19.1
ingressVersion := 4.11.2

clientId := argo
clientSecret := some-secret-here

tmp := /tmp/$(cluster)
jar := $(tmp)/cookie.jar

kubectl := kubectl --context kind-$(cluster)
curl := curl -sf --resolve $(argo):80:127.0.0.1 --resolve $(dex):80:127.0.0.1 --cookie $(jar) --cookie-jar $(jar)

_ := $(shell mkdir -p $(tmp) ; find $(tmp) -delete -mindepth 1)

verify-deps:
	which docker kind kubectl helm curl jq envsubst base64 > /dev/null

clean: verify-deps
	-kind -q delete cluster --name $(cluster)
	-find $(tmp) -delete -mindepth 1

_setup: verify-deps clean
	ARGO=$(argo) DEX=$(dex) CLIENT_ID=$(clientId) CLIENT_SECRET=$(clientSecret) envsubst '$$ARGO,$$DEX,$$CLIENT_ID,$$CLIENT_SECRET' < values/argocd.yaml > $(tmp)/values-argocd.yaml
	ARGO=$(argo) DEX=$(dex) CLIENT_ID=$(clientId) CLIENT_SECRET=$(clientSecret) envsubst '$$ARGO,$$DEX,$$CLIENT_ID,$$CLIENT_SECRET' < values/dex.yaml > $(tmp)/values-dex.yaml
	ARGO=$(argo) DEX=$(dex) CLIENT_ID=$(clientId) CLIENT_SECRET=$(clientSecret) envsubst '$$ARGO,$$DEX,$$CLIENT_ID,$$CLIENT_SECRET' < manifests/ingress.yaml > $(tmp)/ingress.yaml
	ARGO=$(argo) DEX=$(dex) CLIENT_ID=$(clientId) CLIENT_SECRET=$(clientSecret) envsubst '$$ARGO,$$DEX,$$CLIENT_ID,$$CLIENT_SECRET' < manifests/coredns.yaml > $(tmp)/coredns.yaml
	ARGO=$(argo) DEX=$(dex) CLIENT_ID=$(clientId) CLIENT_SECRET=$(clientSecret) envsubst '$$ARGO,$$DEX,$$CLIENT_ID,$$CLIENT_SECRET' < manifests/dex-secret.yaml > $(tmp)/dex-secret.yaml
	ARGO=$(argo) DEX=$(dex) CLIENT_ID=$(clientId) CLIENT_SECRET=$(clientSecret) envsubst '$$ARGO,$$DEX,$$CLIENT_ID,$$CLIENT_SECRET' < manifests/argocd-secret.yaml > $(tmp)/argocd-secret.yaml

	curl -sfL https://raw.githubusercontent.com/kubernetes/ingress-nginx/helm-chart-$(ingressVersion)/deploy/static/provider/kind/deploy.yaml > $(tmp)/ingress-nginx.yaml
	helm template argocd argo-cd --version $(argoVersion) --repo https://argoproj.github.io/argo-helm -n argocd -f $(tmp)/values-argocd.yaml --create-namespace > $(tmp)/argocd.yaml
	helm template dex dex --version $(dexVersion) --repo https://charts.dexidp.io -n dex -f $(tmp)/values-dex.yaml --create-namespace > $(tmp)/dex.yaml

	kind -q create cluster --config kind.config --name $(cluster)
	@echo

	$(kubectl) create namespace ingress-nginx
	$(kubectl) create namespace dex
	$(kubectl) create namespace argocd
	@echo

	$(kubectl) apply -ningress-nginx -f $(tmp)/ingress-nginx.yaml
	sleep 5 # `kubectl wait` requires the resource to exist
	$(kubectl) wait --namespace ingress-nginx \
		--for=condition=ready pod \
		--selector=app.kubernetes.io/component=controller \
		--timeout=90s
	@echo

	$(kubectl) apply --filename $(tmp)/ingress.yaml
	$(kubectl) apply --filename $(tmp)/coredns.yaml
	$(kubectl) rollout restart -n kube-system deployment/coredns
	@echo

	$(kubectl) apply --filename $(tmp)/dex-secret.yaml
	$(kubectl) apply --filename manifests/dex-service.yaml
	$(kubectl) apply -ndex -f $(tmp)/dex.yaml
	sleep 5 # `kubectl wait` requires the resource to exist
	$(kubectl) wait -n dex \
		--for=condition=Ready pod \
		--selector=app.kubernetes.io/name=dex \
		--timeout=90s
	@echo

working: _setup
	$(kubectl) apply --filename $(tmp)/argocd-secret.yaml
	$(kubectl) apply -nargocd -f $(tmp)/argocd.yaml
	sleep 5 # `kubectl wait` requires the resource to exist
	$(kubectl) wait -n argocd \
		--for=condition=Ready pod \
		--selector=app.kubernetes.io/name=argocd-server \
		--timeout=90s
	@sleep 2
	@echo

broken: _setup
	$(kubectl) apply -nargocd -f $(tmp)/argocd.yaml
	sleep 5 # `kubectl wait` requires the resource to exist
	$(kubectl) wait -n argocd \
		--for=condition=Ready pod \
		--selector=app.kubernetes.io/name=argocd-server \
		--timeout=90s
	$(kubectl) apply --filename $(tmp)/argocd-secret.yaml
	@sleep 2
	@echo

test:
	touch $(jar)
	@echo

	$(curl) -Lo $(tmp)/login.html http://$(argo)/auth/login
	grep -o 'action="[^"]*"' < $(tmp)/login.html | cut -d\" -f2 | sed 's/&amp;/\&/g' > $(tmp)/path
	@echo

	$(curl) -D $(tmp)/header.log -XPOST -d "login=admin@example.com&password=password" "http://$(dex)$$(cat $(tmp)/path)"
	grep ^Location $(tmp)/header.log | cut -d' ' -f2 | tr -d '\r' > $(tmp)/endpoint
	@echo

	$(curl) -o /dev/null "$$(cat $(tmp)/endpoint)"
	@echo

	grep argocd.token $(jar) | cut -f7- | tee $(tmp)/token
	@echo

	@echo Token payload:
	(cut -d. -f2 < $(tmp)/token|tr -d '\n'; echo '===') | base64 -d | jq
	@echo

fix:
	$(kubectl) rollout restart -nargocd deployment
	$(kubectl) rollout restart -nargocd sts

_break:
	@head -c 12 /dev/random | base64 | base64 > $(tmp)/random-secret
break: _break
	$(kubectl) patch -nargocd secret argocd-client-secrets --type='json' -p='[{"op" : "replace" ,"path" : "/data/clientSecret" ,"value" : "$(shell cat $(tmp)/random-secret)"}]'
	$(kubectl) patch -ndex secret dex-client-secrets --type='json' -p='[{"op" : "replace" ,"path" : "/data/CLIENT_SECRET" ,"value" : "$(shell cat $(tmp)/random-secret)"}]'
	$(kubectl) rollout restart -ndex deployment

logs:
	$(kubectl) logs -nargocd -l app.kubernetes.io/name=argocd-server --since=1m
	$(kubectl) logs -ndex -l app.kubernetes.io/name=dex --since=1m

The Root Cause

In summary, Argo CD does not detect when any of the referenced secrets change.

Therefore it is imperative that all secrets are created prior to installing Argo CD, or that the Argo CD deployments are restarted whenever a secret is updated.
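
In practice that means the same thing the fix recipe in the Makefile does, for example:

kubectl -n argocd rollout restart deployment
kubectl -n argocd rollout restart statefulset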

Not really a satisfying solution, but good enough for now.
