meh: Obfuscation
First Steps #
As the title states, this post focuses primarily on some obfuscation steps I am taking for my generated source code. All templates and code blobs that are used and merged together through the entire process are clear text, commented, very human readable and have obviously named functions and variables.
I was doing some reading online about obfuscation techniques in general, not specifically focused on Go or malware, and the obvious results address the following:
- convert function names into nonsensical names
- convert variable names into nonsensical names
As a baseline, this is probably pretty handy if someone were to try and identify functionality based on naming conventions used within the source code. So as a first step into hiding what I am trying to do, it seems like a good one.
Functions and Variables #
To achieve this, I created a new module which runs within my generator, assuming the user wants the source code to be obfuscated. This module is relatively ‘black box’ and its functions can just be called as required. Provided it is given the raw source code, it will run through each obfuscation function I have implemented and return an obfuscated blob of Go, ready for compiling.
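The ‘run through each obfuscation function’ idea can be sketched as a simple chain of passes. This is only an illustration of the shape, not the actual module in meh: the `pass` type, the `obfuscate` function and the two stand-in passes are all hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// pass is a hypothetical type: one obfuscation step that takes Go source
// as text and returns the modified source.
type pass func(src string) string

// obfuscate runs each pass in order, feeding the output of one into the next.
func obfuscate(src string, passes ...pass) string {
	for _, p := range passes {
		src = p(src)
	}
	return src
}

func main() {
	// Two stand-in passes, here only to show the chaining.
	trim := func(s string) string { return strings.TrimSpace(s) }
	mark := func(s string) string { return s + " // processed" }

	fmt.Println(obfuscate("  raw source  ", trim, mark))
}
```

Keeping every step behind the same `func(string) string` signature means a new step can be added or dropped without touching the rest of the generator.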
To obfuscate function names, I am using a regex to find each function declaration within the source file. At the moment, it looks like:
re := regexp.MustCompile(`func\s+(\w+)\(`)
I’m not a regex wizard, so I’m not sure this will catch ALL instances (methods declared with a receiver, for example, would not match), but for argument’s sake, the two variations I need to find in any source code are the following:
package main
func name() {
// body
}
func otherName(someParam string, anotherParam string) string {
// body
}
func main() {
name()
otherName("one", "two")
}
Currently, my implementation gives me both ‘name’ and ‘otherName’. As I loop through the source, finding each function declaration one by one, I generate a random function name and find and replace the original with the new one. Right now, it’s just simple string generation with an upper and lower length limit, using lowercase and uppercase letters. This covers all function declarations as well as calls to those functions in other areas of the code.
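A minimal sketch of that find, generate and replace loop is below. It reuses the regex from above, but the function names (`randomName`, `renameFuncs`), the injected name generator and the decision to leave `main` alone are assumptions made for the example, not a copy of what meh does.

```go
package main

import (
	"fmt"
	"math/rand"
	"regexp"
)

// funcDecl is the pattern from the post: it captures the name in "func name(".
var funcDecl = regexp.MustCompile(`func\s+(\w+)\(`)

const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomName returns a random string of letters whose length is between
// min and max, inclusive.
func randomName(min, max int) string {
	b := make([]byte, min+rand.Intn(max-min+1))
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

// renameFuncs replaces every declared function name, and every other use of
// that name, with a name produced by gen. It returns the new source and the
// old-to-new mapping.
func renameFuncs(src string, gen func() string) (string, map[string]string) {
	mapping := map[string]string{}
	for _, m := range funcDecl.FindAllStringSubmatch(src, -1) {
		name := m[1]
		if name == "main" { // assumption for this sketch: keep the entry point
			continue
		}
		if _, seen := mapping[name]; !seen {
			mapping[name] = gen()
		}
	}
	for old, repl := range mapping {
		// \b stops a short name such as "get" from also matching inside "getAll".
		re := regexp.MustCompile(`\b` + regexp.QuoteMeta(old) + `\b`)
		src = re.ReplaceAllString(src, repl)
	}
	return src, mapping
}

func main() {
	src := "package main\n\nfunc name() {}\n\nfunc main() {\n\tname()\n}\n"
	out, _ := renameFuncs(src, func() string { return randomName(8, 14) })
	fmt.Print(out)
}
```

Because this is a plain text replace, it will also rewrite the same word inside string literals, and it does nothing to stop two random names from colliding. Both are worth keeping in mind before relying on it.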
Taking the earlier example with ‘name()’ and ‘otherName()’, once the obfuscation has run, I end up with something similar to the following.
package agbHAidNAUSnejd
func qgHnsjQDSoap() {
// body
}
func aibOEndkMSNf(someParam string, anotherParam string) string {
// body
}
func agbHAidNAUSnejd() {
qgHnsjQDSoap()
aibOEndkMSNf("one", "two")
}
Note that I also obfuscate the ‘main()’ function name, which tripped me up a little when I realised the agent still ran when compiled. I haven’t really dug into how it finds the entry point regardless of the name. As far as I know, Go expects an executable to have a package named ‘main’ containing a ‘main()’ function, so this is surprising. Perhaps it’s related to there being a single source file, with the package name and the function name both becoming ‘agbHAidNAUSnejd’? I need to look into this more.
Although it’s not perfect, if the function was called ‘sleepForFiveSeconds’, then it’s a bit more obvious to the human eye what this function does compared to ‘qgHnsjQDSoap’.
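On the earlier question of whether the regex catches every declaration: one way to check is to let Go’s own parser list the declarations and compare. The sketch below uses the standard `go/parser` and `go/ast` packages. The `declaredFuncs` helper and the `agent.go` file name are made up for the example, and this is an alternative to the regex approach in this post, not what meh currently does.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// declaredFuncs parses src and returns the names of all top-level functions
// and methods, in source order.
func declaredFuncs(src string) ([]string, error) {
	fset := token.NewFileSet()
	// "agent.go" is only a label used in error messages.
	file, err := parser.ParseFile(fset, "agent.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	src := `package main

type agent struct{}

func (a *agent) run() {}

func sleepForFiveSeconds() {}

func main() {}
`
	names, err := declaredFuncs(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Here the parser reports the `run` method as well, whereas the `func\s+(\w+)\(` pattern skips it because a receiver sits between `func` and the name.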
As for the declared variables within the source code, I handle these in a very similar manner to function names, except that the regex is a little different so that it finds declared variables. In Go, a short variable declaration puts the new variable on the left side of ‘:=’, making it relatively easy to identify within raw source code. Variables declared with ‘var’ are not matched by this pattern.
re := regexp.MustCompile(`\b(\w+)\s*:=`)
An example of before and after might look something like the following two snippets, the original first and the obfuscated version second:
func hello() {
toSay := "hi there"
fmt.Println(toSay)
}
func hello() {
kjhaVA := "hi there"
fmt.Println(kjhaVA)
}
Again, not a huge difference or a mind-blowing step, but it further hides human-readable information when coupled with the function names.
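The variable pass can be sketched in the same way as the function pass. The pattern below is a slightly widened version of the one above, so that it also picks up every name in a declaration such as `data, err :=`. The `renameVars` helper and the injected name generator are assumptions for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// shortDecl matches the whole left-hand side of a short variable declaration,
// for example "toSay :=" or "data, err :=".
var shortDecl = regexp.MustCompile(`(\w+(?:\s*,\s*\w+)*)\s*:=`)

// renameVars replaces every variable introduced with := by a name from gen.
func renameVars(src string, gen func() string) string {
	mapping := map[string]string{}
	for _, m := range shortDecl.FindAllStringSubmatch(src, -1) {
		for _, name := range strings.Split(m[1], ",") {
			name = strings.TrimSpace(name)
			if name == "_" { // the blank identifier must stay as it is
				continue
			}
			if _, seen := mapping[name]; !seen {
				mapping[name] = gen()
			}
		}
	}
	for old, repl := range mapping {
		re := regexp.MustCompile(`\b` + regexp.QuoteMeta(old) + `\b`)
		src = re.ReplaceAllString(src, repl)
	}
	return src
}

func main() {
	src := "func hello() {\n\ttoSay := \"hi there\"\n\tfmt.Println(toSay)\n}\n"
	fmt.Print(renameVars(src, func() string { return "kjhaVA" }))
}
```

As with the function names, this is a text-level replace. A struct field, a package name or a word inside a string literal that happens to match a variable name would be rewritten too.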
Stomping out STDOUT #
Up to now, a bunch of the templates/snippets that I have been using have included calls to the following Go functions, which print data to stdout:
- fmt.Println()
- fmt.Printf()
When running the agent after generating and compiling, these calls give an indication of what the agent is doing, leaking its current state and some bypass functionality. The strings are also included in the binary, where they could be analysed statically. An example of one of the hardcoded prints can be seen below at runtime:
The below is a little noisy, but running strings against the same binary (and grepping for a known string for ease of demonstration) shows the ‘Starting agent…’ string hardcoded on the second line of the output.
In an attempt to address this sort of output leak, I have been finding and removing all instances of these functions with the following regex:
re := regexp.MustCompile(`fmt.Print.*`)
Once found, I simply replace the instances with nothing, completely removing the calls and their hardcoded strings within the function parameters.
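A slightly stricter sketch of the same idea is below. It escapes the dot, removes the whole line rather than leaving an empty one behind, and drops the `fmt` import when nothing else uses it, since Go refuses to compile a file with an unused import. The `stripPrints` helper and both patterns are assumptions for the example, and only single-line calls are handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// printCall matches a full line holding a single-line fmt.Print, fmt.Printf
// or fmt.Println call, including the trailing newline.
var printCall = regexp.MustCompile(`(?m)^[ \t]*fmt\.Print(f|ln)?\(.*\)[ \t]*\r?\n`)

// fmtImport matches a standalone `import "fmt"` line, or a `"fmt"` line
// inside an import block.
var fmtImport = regexp.MustCompile(`(?m)^[ \t]*(import[ \t]+)?"fmt"[ \t]*\r?\n`)

// stripPrints removes the print calls and, if fmt is no longer referenced,
// the fmt import as well.
func stripPrints(src string) string {
	src = printCall.ReplaceAllString(src, "")
	if !strings.Contains(src, "fmt.") {
		src = fmtImport.ReplaceAllString(src, "")
	}
	return src
}

func main() {
	src := `package main

import "fmt"

func main() {
	fmt.Println("Starting agent…")
	run()
}
`
	fmt.Print(stripPrints(src))
}
```

One more thing to watch for: if a variable was only ever used inside a removed print call, the compiler will then complain that it is declared and not used.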
Bonus: Removing Comments #
I don’t think this has any effect whatsoever on the final compiled binary, as the compiler is already massively smarter than me and, as far as I know, discards ordinary comments when compiling. I can’t think of any reason it would care about keeping them. But to obfuscate the raw source code (in the event it leaks, can be identified by some analyst, whatever), I find and remove all comments in a similar manner to the Print functions. The regex I currently use is:
re := regexp.MustCompile(`//.*`)
The observant reader might immediately think “well, what about multi-line comments that are wrapped in /* */?” and to that, I simply tip my hat and say “yes, what does happen to those?”
Well, they are left in the source, as I haven’t addressed them yet. I’m sure the regex can be adapted very easily, but I just haven’t tried it yet.
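One possible adaptation for the `/* */` case is sketched below, assuming a second pattern with the `(?s)` flag so that `.` can cross line breaks. The `stripComments` helper is a made-up name, and this has not been run against the real templates.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// blockComment matches /* ... */ across lines; .*? stops at the first */.
	blockComment = regexp.MustCompile(`(?s)/\*.*?\*/`)
	// lineComment is the pattern from the post: // up to the end of the line.
	lineComment = regexp.MustCompile(`//.*`)
)

// stripComments removes block comments first, then line comments.
func stripComments(src string) string {
	src = blockComment.ReplaceAllString(src, "")
	return lineComment.ReplaceAllString(src, "")
}

func main() {
	src := "a := 1 // set a\n/* multi\nline */\nb := 2\n"
	fmt.Printf("%q\n", stripComments(src))

	// Caveat: the patterns know nothing about string literals.
	fmt.Printf("%q\n", stripComments(`u := "http://example.com"`))
}
```

The second call shows the main risk of the regex approach: a `//` inside a string literal, such as a URL, is treated as the start of a comment and the rest of the line is cut off. Compiler directives written as comments, such as `//go:build`, would be removed as well.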
Conclusion and Next Steps #
To wrap up this post, the obfuscation steps are not very complex and only address the basic requirements. Right now, however, they do enough to prove that I can obfuscate the source code and still have a compiled agent returned from ‘meh’.
The next step for me is to implement a form of pre and post processing for attempting to hide and obfuscate shellcode within the agent.
For less than a week’s worth of work, I’m loving where meh is right now and excited to see where it goes! Thanks for following along :)
First commit: 15/10/2024