meh: Hiding Our Shellcode - Pre and Post Processing

Intro #

In the past, when writing or using little implants and agents, embedding shellcode has been a recurring blocker. Shellcode that has malicious intent and is designed to give us a shell or execute some form of ‘dangerous’ process has likely been caught before, signatured and had the word spread to AV vendors. For that reason, just having the shellcode embedded can often mean we are caught as soon as it touches disk.

The Problem #

Due to the nature of shellcode and how you use it, it has to exist in its raw format within the binary before being written into the process and ultimately executed. As stated in the intro, malicious shellcode that has been designed for tasks such as spawning shells has likely been flagged in the past, and vendors have implemented checks and signatures for it.

The problem is pretty obvious, right? We need the shellcode in a usable state to place in our process and execute it - however, we can’t just embed it in the code in the raw state as it will be seen and flagged.

The Solution #

A common method of working around this problem is to encrypt the shellcode payload. XOR-ing seems to be a go-to approach that still works quite well. Digging into this method is out of scope for this blog; however, a quick description is given below to help understand the approach:

When we talk about XOR, we are talking about the bitwise operation that compares two binary values bit by bit. It works as follows:

If the bits are different, the result is “1”.
If the bits are the same, the result is “0”.

A binary example for this is:

0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0

Now when we apply XOR to two binary sequences (e.g., text and key), it scrambles the data. The good thing about XOR is that it is reversible, meaning applying the same key to the encrypted result will return the original plaintext. This is perfect for what we want to achieve!
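As a quick sanity check of the rules above, here is a minimal Go sketch that prints the truth table and shows the reversibility on a single byte. The helper name `xorBit` and the sample values are just assumptions for illustration; they are not part of meh.

```go
package main

import "fmt"

// xorBit returns 1 when the two bits differ and 0 when they are the same.
// (Hypothetical helper, only here to mirror the truth table.)
func xorBit(a, b uint8) uint8 {
	return (a ^ b) & 1
}

func main() {
	// Print every row of the XOR truth table.
	for _, a := range []uint8{0, 1} {
		for _, b := range []uint8{0, 1} {
			fmt.Printf("%d XOR %d = %d\n", a, b, xorBit(a, b))
		}
	}

	// Reversibility: XOR-ing with the same key twice gives back the input.
	var plain, key uint8 = 0x41, 0x5A
	enc := plain ^ key
	dec := enc ^ key
	fmt.Printf("plain=%#x enc=%#x dec=%#x\n", plain, enc, dec)
}
```

The second half is the property we care about: `(plain ^ key) ^ key` always equals `plain`, whatever the key is.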

We can ultimately take our raw shellcode as a ‘plaintext’ string, encrypt (XOR) it with a key and place the XOR’d shellcode in our source code. All we have to do to make it usable again is XOR the encrypted data with the same key and we have our shellcode back:

At development time: Raw Shellcode -> XOR -> Add encrypted value to source code

At run time: Encrypted value -> XOR -> Raw Shellcode
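The two flows above can be sketched with a repeating-key XOR over a byte slice. This is a minimal sketch and not the code meh generates: `xorWithKey`, the key and the placeholder input are all assumptions, and the input is ordinary text standing in for real bytes.

```go
package main

import "fmt"

// xorWithKey XORs every byte of data with the key, repeating the key when
// data is longer than it. Calling it twice with the same key returns the
// original bytes. (Hypothetical helper name.)
func xorWithKey(data, key []byte) []byte {
	out := make([]byte, len(data))
	if len(key) == 0 {
		// Nothing to XOR with, so hand back an unmodified copy.
		copy(out, data)
		return out
	}
	for i, b := range data {
		out[i] = b ^ key[i%len(key)]
	}
	return out
}

func main() {
	raw := []byte("placeholder payload bytes") // stand-in for the raw input
	key := []byte("example-key")               // assumed key, illustration only

	// "Development time": raw -> XOR -> value stored in source.
	encoded := xorWithKey(raw, key)

	// "Run time": stored value -> XOR -> raw again.
	decoded := xorWithKey(encoded, key)

	fmt.Printf("encoded: %x\n", encoded)
	fmt.Printf("round trip ok: %v\n", string(decoded) == string(raw))
}
```

Working on `[]byte` rather than `string` avoids any surprises with non-printable bytes, which the encoded output will almost always contain.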

My Approach #

This approach is something I’m pretty proud of. For the record, I am simply XORing my shellcode as of right now, however, anyone who has been following these posts will know that I have been randomly generating my agents and building the final solution by merging snippets/templates and randomly generated values together.

As this is the case, I needed to implement this method of encrypting/decrypting shellcode with various approaches to ensure the results aren’t always the same. I could have simply hardcoded a single XOR function into the final result and had the user add their encrypted shellcode and key. But that wouldn’t be scalable in the same way as my overall solution.

Instead, I have come up with a pre-process and post-process feature for my generator. The pre-processor takes the raw shellcode that the user provides when running meh and randomly selects a method of encrypting it. As of right now, this is just a basic XOR approach. It will randomly generate a key, use it to encrypt the data and keep a reference to both the method and the key. This is the pre-process.

Now when the agent’s source is being built, the previously referenced encryption method and key are used to select a decryption method that works correctly, and this is placed in the agent along with the key. When executed, the agent will simply have a blob of encrypted data, a key and a decrypt function. The call has been ‘hard-coded’ to a point to ensure it happens before the shellcode is needed for execution, but with this in mind the solution is scalable for different algorithms and methods of hiding our shellcode. This is the post-process.
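To make the pairing a bit more concrete, here is a rough sketch of how a pre-process step could record its choices so that a post-process step can look up the matching decoder. It is not meh's actual code: the `preResult` type, both lookup tables, the `post/xor.tmpl` path and the 32-character key length are all assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// preResult records what the pre-process did (hypothetical type).
type preResult struct {
	Method  string
	Key     []byte
	Encoded []byte
}

// xorWithKey is a repeating-key XOR; running it twice restores the input.
func xorWithKey(data, key []byte) []byte {
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = b ^ key[i%len(key)]
	}
	return out
}

// encoders maps a method name to its encode function. decoderTemplates maps
// the same name to the snippet holding the matching decoder. The template
// path is made up for this sketch.
var (
	encoders         = map[string]func(data, key []byte) []byte{"xor": xorWithKey}
	decoderTemplates = map[string]string{"xor": "post/xor.tmpl"}
)

const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomKey returns n random letters. The modulo introduces a slight bias,
// which is acceptable for a sketch.
func randomKey(n int) ([]byte, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return nil, err
	}
	for i, b := range buf {
		buf[i] = alphabet[int(b)%len(alphabet)]
	}
	return buf, nil
}

// preProcess encodes raw with the named method and remembers method and key.
func preProcess(raw []byte, method string) (preResult, error) {
	enc, ok := encoders[method]
	if !ok {
		return preResult{}, fmt.Errorf("unknown method %q", method)
	}
	key, err := randomKey(32) // assumed key length
	if err != nil {
		return preResult{}, err
	}
	return preResult{Method: method, Key: key, Encoded: enc(raw, key)}, nil
}

// postProcess picks the decoder template that matches the recorded method.
func postProcess(res preResult) (string, error) {
	tmpl, ok := decoderTemplates[res.Method]
	if !ok {
		return "", fmt.Errorf("no decoder template for %q", res.Method)
	}
	return tmpl, nil
}

func main() {
	res, err := preProcess([]byte("placeholder payload bytes"), "xor")
	if err != nil {
		fmt.Println("pre-process failed:", err)
		return
	}
	tmpl, err := postProcess(res)
	if err != nil {
		fmt.Println("post-process failed:", err)
		return
	}
	fmt.Printf("method=%s key=%s template=%s\n", res.Method, res.Key, tmpl)
}
```

Keying both tables on the same method name is what keeps the two halves in sync: adding a new algorithm means adding one entry to each table, and a missing decoder is caught at build time rather than when the agent runs.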

The diagrams above aren’t perfect, are probably terribly incorrect based on what the icons actually mean, and differ a little from the actual process flow. I was just hoping to visually show what is going on code side…

The following screenshot shows meh at runtime as of now.

As can be seen, a few new steps are present:

The first few steps are the same as in previous updates (ignore the x86 ASM, it’s just a placeholder).
The pre-processing step is chosen following the ‘execution’ template.
A random key is generated, ready for encrypting/decrypting the payload.
Now that we know we are using XOR, we select a post-process file of this type.
The final result is populated and compiled.

As of right now, I am just using XOR to encrypt the shellcode, but I plan to implement a wide range of ways of performing this same algorithm to give the final result a better chance of generating different code and implementations, whilst maintaining correct functionality.

A snipped example of the source code that was generated from the above can be seen here:

<... snipped ...>

func basicExec() {
	shellcodeStr := `QZ <MJU9)zS`
	shellcodeStr = xor(shellcodeStr)

	// shellcode execution is called in here
	
	<... snipped ...>
}

func xor(data string) string {
	key := []byte("PzhjUDtOHeseeQCcsuChCwRIpDHNZpVn")

	<... snipped ...>
}

func main() {
	fmt.Println("Starting agent...")
	
	basicExec()
	
	fmt.Println("Agent executed")
}

As can be seen, the encrypted payload is added as expected from the pre-process steps. The key has been embedded and the correct post-process function, xor(), has also been added to match the pre-process steps.

As mentioned before, I’m really quite happy with this solution so far. It’s far from bleeding edge, probably exists somewhere else and is unlikely to ever be used. BUT, we are learning, we are coding, and we are having fun!

Next Steps #

Implement some logic to prevent multiple bypasses of the same nature from being selected. For example, I need to stop it using 2 sleeps.

Further obfuscation should be added. I think I will add a bunch of padding to the final solution: attempt to generate a bunch of unused and pointless functions and see if the compiler ignores them or includes them. This will hopefully help bloat the binary and possibly distract if it were being reversed?

Now I have shellcode usage, I would also like to add a pre-written selection of shellcode to the project for the user to select from. It should be ‘modifiable’ to a point, so I will need to template it. For example, selecting a reverse shell should allow a custom IP and Port to be chosen. I might even wrap, and include, something like msfvenom if it exists on the system that meh is being used on. That way it can just generate and embed the results straight away. Either solution will be a nice addition I think!

Thanks for following along :)