Automating Bug Bounties using Kubernetes and Argo

I have experimented a lot with building fast and efficient workflows to automate my recon process during bug hunting. I started out with Python, RabbitMQ and Docker, but I have now moved the whole process into GKE using Kubernetes and Argo Events. With Argo I can easily pass data from one pod to another inside a workflow and still use RabbitMQ as a message queue.

Argo Workflows & Events

Argo Events can help you automate your workflows inside Kubernetes and offers a variety of event sources to trigger them, such as webhooks, message queues, etc. Personally I have only used webhooks and RabbitMQ message queues. Both are very simple to use: the former can be triggered with a classic curl request and the latter using Python's pika module or a quick Go program.

Setting up Argo in GCP

You can start by creating an account in GCP, which gives you a number of credits for the first 90 days of use; that should be enough for you to see whether GCP and Argo are the right tools for your automation. You will also need to install gcloud and kubectl.
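
If you install the gcloud SDK first, kubectl can usually be added as a gcloud component (this assumes a standard SDK install; package-manager installs may ship kubectl separately):

gcloud components install kubectl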

Start by creating a project and selecting it, so you can set up Argo inside it.

gcloud projects create bug-bounty-argo
gcloud config set project bug-bounty-argo
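
On a brand new project you may also need to enable the GKE API before a cluster can be created:

gcloud services enable container.googleapis.com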

You will then need to create a cluster which you can do with:

gcloud container clusters create bug-bounty-argo --cluster-version 1.18.16-gke.300 --disk-size 200 --num-nodes 1 --machine-type e2-small --no-enable-cloud-logging --no-enable-cloud-monitoring --zone europe-west2-a
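
Note that the pinned GKE version above will eventually be retired. If it is no longer available, something like the below should work instead; the --release-channel, --logging and --monitoring flags are assumptions based on more recent gcloud releases:

gcloud container clusters create bug-bounty-argo --release-channel regular --disk-size 200 --num-nodes 1 --machine-type e2-small --logging=NONE --monitoring=NONE --zone europe-west2-a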

With kubectl installed locally, you can run the commands below to get access to the cluster:

gcloud container clusters get-credentials bug-bounty-argo --zone europe-west2-a
kubectl get nodes

Installing Argo and Argo Events

The fastest way to get a cluster with Argo installed is to run the commands below. However, I would first recommend going through the YAML files and understanding what each one does.

kubectl create ns argo
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo-workflows/release-2.12.4/manifests/install.yaml

Proceed with the Argo Events installation:

kubectl create namespace argo-events
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-events/v1.2.3/manifests/install.yaml
kubectl apply -n argo-events -f https://raw.githubusercontent.com/argoproj/argo-events/v1.2.3/examples/eventbus/native.yaml

Example Argo Webhook Workflow

To check that everything is installed correctly and with no errors, you can now set up a webhook event that will listen for events on port 12000. You will first need to set up the Event Source and Sensor:

kubectl apply -n argo-events -f https://raw.githubusercontent.com/argoproj/argo-events/stable/examples/event-sources/webhook.yaml
kubectl apply -n argo-events -f https://raw.githubusercontent.com/argoproj/argo-events/stable/examples/sensors/webhook.yaml
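
Before port forwarding it is worth checking that the Event Source and Sensor pods have come up:

kubectl -n argo-events get pods
kubectl -n argo-events get eventsources,sensors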

Once the Event Source and Sensor are running, you can create a port forward on 12000 so you can send messages to the Event Source.

kubectl -n argo-events port-forward $(kubectl -n argo-events get pod -l eventsource-name=webhook -o name) 12000:12000

Finally, curl can be used to send a message:

curl -d '{"message":"this is my first webhook"}' -H "Content-Type: application/json" -X POST http://localhost:12000/example

You can now download the Argo CLI tool and use it to submit workflows or view their progress. For example:

argo list -n argo-events

NAME            STATUS    AGE   DURATION   PRIORITY
webhook-g2x8h   Running   0s    1s         0

argo get -n argo-events @latest

Name:                webhook-g2x8h
Namespace:           argo-events
Status:              Succeeded
Conditions:          
 Completed           True
Progress:            1/1
Parameters:          
  message:           {"context":{"id":"39663737663965312d383431622d346336622d613865662d323836346430366265386461","source":"webhook","specversion":"1.0","type":"webhook","datacontenttype":"application/json","subject":"example","time":"2021-04-12T16:50:48Z"},"data":"eyJoZWFkZXIiOnsiQWNjZXB0IjpbIiovKiJdLCJDb250ZW50LUxlbmd0aCI6WyIzOCJdLCJDb250ZW50LVR5cGUiOlsiYXBwbGljYXRpb24vanNvbiJdLCJVc2VyLUFnZW50IjpbImN1cmwvNy42OC4wIl19LCJib2R5Ijp7Im1lc3NhZ2UiOiJ0aGlzIGlzIG15IGZpcnN0IHdlYmhvb2sifX0="}

STEP              TEMPLATE  PODNAME        DURATION  MESSAGE
 ✔ webhook-g2x8h  whalesay  webhook-g2x8h  24s         

argo logs -n argo-events @latest

webhook-g2x8h: / {"context":{"id":"39663737663965312d383 \
webhook-g2x8h: | 431622d346336622d613865662d323836346430 |
webhook-g2x8h: | 366265386461","source":"webhook","specv |
webhook-g2x8h: | ersion":"1.0","type":"webhook","datacon |
webhook-g2x8h: | tenttype":"application/json","subject": |
webhook-g2x8h: | "example","time":"2021-04-12T16:50:48Z" |
webhook-g2x8h: | },"data":"eyJoZWFkZXIiOnsiQWNjZXB0IjpbI |
webhook-g2x8h: | iovKiJdLCJDb250ZW50LUxlbmd0aCI6WyIzOCJd |
webhook-g2x8h: | LCJDb250ZW50LVR5cGUiOlsiYXBwbGljYXRpb24 |
webhook-g2x8h: | vanNvbiJdLCJVc2VyLUFnZW50IjpbImN1cmwvNy |
webhook-g2x8h: | 42OC4wIl19LCJib2R5Ijp7Im1lc3NhZ2UiOiJ0a |
webhook-g2x8h: \ GlzIGlzIG15IGZpcnN0IHdlYmhvb2sifX0="}   /
webhook-g2x8h:  ----------------------------------------- 
[...]
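
The data field in the parameter and the logs is base64-encoded; to inspect the original request you can decode it locally, which for the call above gives back the request headers and the {"message":"this is my first webhook"} body:

echo '<data value>' | base64 -d | jq .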

Setting up Google Storage for Argo

Argo supports Google Storage buckets that you can use to save your scan results. You will first need to create a bucket, and then create a secret that stores the bucket's access and secret keys. For example, the contents of gcs-secret.yml are shown below:

apiVersion: v1
kind: Secret
metadata:
  name: my-gcs-cred
type: Opaque
data:
  accessKey: <base64-access>
  secretKey: <base64-secret>
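
The access and secret keys here are the bucket's HMAC interoperability keys (Argo talks to the bucket through the S3-compatible storage.googleapis.com endpoint) and their values need to be base64-encoded, which you can do with echo -n '<key>' | base64. Assuming the manifest is saved as gcs-secret.yml, apply the secret in both namespaces that will run workflows:

kubectl -n argo apply -f gcs-secret.yml
kubectl -n argo-events apply -f gcs-secret.yml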

You will then need to create or edit the workflow controller ConfigMap in both namespaces so that results are stored in the bucket. You can do this with the below, saved as workflow-controller-configmap.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  artifactRepository: |
    s3:
      bucket: bounty-argo-workflows
      endpoint: storage.googleapis.com
      insecure: true                  
      accessKeySecret:               
        name: my-gcs-cred
        key: accessKey
      secretKeySecret:           
        name: my-gcs-cred
        key: secretKey
      useSDKCreds: false        

Finally, apply changes in both namespaces:

kubectl -n argo apply -f workflow-controller-configmap.yaml
kubectl -n argo-events apply -f workflow-controller-configmap.yaml

Running an Example Workflow

We can now write a workflow that receives a domain and starts by running httprobe. It stores the results both in a file and in a JSON array. The latter is required in order to split the results (if there are multiple) and pass them separately into dirsearch. The next step runs httpx, nuclei and dirsearch in parallel.
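
The JSON array is simply jq reading each line of httprobe's output as a raw string and slurping the lines into a single array, which is what withParam fans out over later. For example:

printf 'https://uber.com\nhttp://uber.com\n' | jq -R . | jq -s .

[
  "https://uber.com",
  "http://uber.com"
]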

You can find the example below and others in my Gitlab Repository.

apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: amqp
spec:
  template:
    serviceAccountName: argo-events-sa
  dependencies:
    - name: test-dep
      eventSourceName: amqp
      eventName: example
  triggers:
    - template:
        name: amqp-workflow-trigger
        k8s:
          group: argoproj.io
          version: v1alpha1
          resource: workflows
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: endpoints-
              spec:
                entrypoint: combo
                serviceAccountName: argo-events-sa
    
                arguments:
                  parameters:
                  - name: message
                    value:

                volumes:
                - name: webhook-volume
                  persistentVolumeClaim:
                    claimName: webhook-claim
      
                templates:
                - name: combo
                  steps:
                  - - name: httprobe
                      template: httprobe
                  - - name: httpx
                      template: httpx
                      arguments:
                        artifacts:
                        - name: httprobe-results
                          from: "{{steps.httprobe.outputs.artifacts.httprobe}}"
                    - name: nuclei
                      template: nuclei
                      arguments:
                        artifacts:
                        - name: httprobe-results
                          from: "{{steps.httprobe.outputs.artifacts.httprobe}}"
                    - name: dirsearch
                      template: dirsearch
                      arguments:
                        parameters:
                        - name: target
                          value: "{{item}}"
                      withParam: "{{steps.httprobe.outputs.parameters.livedomains}}"

                - name: httprobe
                  container:
                    image: thealchemist11/tomnomnom-tools
                    command: [sh, -c]
                    args: ["echo {{workflow.parameters.message}} | httprobe -p 8080,https:8443,80,https:443 -c 150 -t 15000 > /tmp/httprobe.txt; cat /tmp/httprobe.txt | jq -R . | jq -s . > /tmp/httprobe.json"]
                  outputs:
                    parameters:
                      - name: livedomains
                        valueFrom: 
                          path: /tmp/httprobe.json
                    artifacts:
                    - name: httprobe
                      path: /tmp/httprobe.txt

                - name: nuclei
                  inputs:
                    artifacts:
                    - name: httprobe-results
                      path: /tmp/httprobe
                  container:
                    image: thealchemist11/nuclei
                    command: [sh, -c]
                    args: ['/tools/nuclei -l /tmp/httprobe -t /tools/nuclei-templates/cves/ -t /tools/nuclei-templates/technologies -silent -retries 3 -timeout 10 -o /tmp/nuclei.txt']
                  outputs:
                    artifacts:
                    - name: nuclei
                      path: /tmp/nuclei.txt

                - name: httpx
                  inputs:
                    artifacts:
                    - name: httprobe-results
                      path: /tmp/httprobe
                  container:
                    image: projectdiscovery/httpx
                    command: [sh, -c]
                    args: ['httpx -l /tmp/httprobe -retries 4 -title -content-length -status-code -follow-redirects -o /tmp/httpx.txt']
                  outputs:
                    artifacts:
                    - name: httpx
                      path: /tmp/httpx.txt

                - name: dirsearch
                  inputs:
                    parameters:
                      - name: target
                  container:
                    image: jradik/dirsearch-docker
                    command: [sh, -c]
                    args: ['/root/dirsearch/dirsearch.py -u {{inputs.parameters.target}} -e .php,.js,.aspx -x 400,429,501,503,520 --plain-text-report=/tmp/dirsearch.txt']
                  outputs:
                    artifacts:
                    - name: dirsearch
                      path: /tmp/dirsearch.txt

          parameters:
            - src:
                dependencyName: test-dep
                dataKey: body.message
              dest: spec.arguments.parameters.0.value
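
Note that both the Sensor and the generated Workflow reference an argo-events-sa service account, which the install manifests above do not create. A minimal sketch of creating it with enough rights to submit workflows and let the workflow pods save their outputs is below; the role names are my own and the exact verbs and resources may vary with your Argo versions:

kubectl -n argo-events create serviceaccount argo-events-sa
kubectl -n argo-events create role workflow-submit --verb=create,get,list,watch,update,patch --resource=workflows.argoproj.io
kubectl -n argo-events create rolebinding workflow-submit --role=workflow-submit --serviceaccount=argo-events:argo-events-sa
kubectl -n argo-events create role workflow-pods --verb=get,watch,patch --resource=pods
kubectl -n argo-events create rolebinding workflow-pods --role=workflow-pods --serviceaccount=argo-events:argo-events-sa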

As this Argo Event uses AMQP, we will need to set up a RabbitMQ pod together with the Event Source in order to listen to the queue and send messages to it. You can apply the below, with the Sensor manifest above saved as workflow.yaml:

kubectl -n argo-events apply -f https://raw.githubusercontent.com/argoproj/argo-events/master/examples/event-sources/amqp.yaml
kubectl -n argo-events apply -f workflow.yaml

Then create the RabbitMQ Service and ReplicationController, saved as rabbit.yaml:

apiVersion: v1
kind: Service
metadata:
  labels:
    component: rabbitmq
  name: rabbitmq-service
spec:
  ports:
    - port: 5672
  selector:
    app: taskQueue
    component: rabbitmq
---
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    component: rabbitmq
  name: rabbitmq-controller
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: taskQueue
        component: rabbitmq
    spec:
      containers:
        - image: rabbitmq
          name: rabbitmq
          ports:
            - containerPort: 5672
          resources:
            limits:
              cpu: 100m

Now apply the YAML file into the argo-events namespace:

kubectl -n argo-events apply -f rabbit.yaml

Then locate the RabbitMQ Pod name and port forward on 5672:

kubectl -n argo-events get pods
kubectl -n argo-events port-forward <rabbitmq-pod-name> 5672:5672

A quick Python script to trigger the workflow and send the domains to scan into RabbitMQ would be:

import pika

# Reuse a single connection/channel for all messages
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

with open('./subdomains.txt', 'r') as f:
    for subdomain in f:
        sub = subdomain.rstrip()
        print(sub)
        # Publish each domain as the JSON body the Sensor's dataKey (body.message) expects
        channel.basic_publish(exchange='test', routing_key='hello', body='{"message": "' + sub + '"}')

connection.close()
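
The script above targets Python 3 and needs pika installed; assuming it is saved as publish.py (a filename I made up), you can run it with:

pip3 install pika
python3 publish.py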

Now, running a quick scan on uber.com and checking the workflow, we get the below (also notice how Argo split the two live URLs identified by httprobe and scheduled two dirsearch scans):

argo -n argo-events get @latest

Name:                endpoints-btnkf
Namespace:           argo-events
ServiceAccount:      argo-events-sa
Status:              Succeeded
Conditions:          
 Completed           True
Duration:            1 hour 6 minutes
Progress:            5/5
ResourcesDuration:   4h19m31s*(1 cpu),4h19m31s*(100Mi memory)
Parameters:          
  message:           uber.com

STEP                                  TEMPLATE   PODNAME                     DURATION  MESSAGE
 ✔ endpoints-btnkf                    combo                                              
 ├───✔ httprobe                       httprobe   endpoints-btnkf-3851781355  9s          
 └─┬─✔ dirsearch(0:https://uber.com)  dirsearch  endpoints-btnkf-3047568844  1h          
   ├─✔ dirsearch(1:http://uber.com)   dirsearch  endpoints-btnkf-1686601986  1h          
   ├─✔ httpx                          httpx      endpoints-btnkf-2676845826  22s         
   └─✔ nuclei                         nuclei     endpoints-btnkf-3145352274  1m          

The scan results from all tools are stored in the bucket we created earlier:

Storage Bucket Results

All the above shows how easily you can use Argo to run multiple tools and pass data between pods, something that would potentially require many lines of code to implement in Bash or Python.