Email Address Enumeration Using Python


Welcome back one and all! It’s been a while, I apologise, but I’ve been a very busy man. I finally managed to escape the clutches of mechanical engineering, and landed myself a job in Cyber Security! Very excited about that, but that’s not the topic of the blog, so moving on for now:

Recently, I have been involved in a few penetration tests. And I’ve learnt that one of the very first steps is a phishing campaign. What’s the first thing you need for a phishing campaign?

Coffee of course. 

Closely followed by email addresses, I guess.

Whilst not as important as the coffee, having valid email addresses is somewhat integral to pulling off a successful phishing campaign, resulting in those juicy creds you’re looking for. 

So where do you get those email addresses from? You could get lucky and be handed a list by the client. But more often than not, it’s down to you to find them yourself. 

I'm going to show you three steps today, along with the code needed to automate them. They are as follows:

1) Using the client's webpage

2) Using LinkedIn to guess email addresses

3) Using Microsoft 365 to validate the email addresses we have found

1) Using the client's webpage

Ok, so the first port of call is having a look at the client's webpage. Search each page meticulously for any sign of email addresses, and add them to your list. Don’t forget the source code! Sometimes they may be added as comments by the creator. Have fun, and I’ll see you in a few hours…

Pretty boring right? And a complete waste of your time? And your fingers are sore from ctrl-c ctrl-v? Ok, fair enough, let’s automate the process:


Ok, here you go, no hard work involved - I've done it all for you:

htemail (github.com/the-jcksn)

"So how does it work?" - I hear you enthusiastically ask as you fire up your terminal and grab those email addresses hassle free (sarcasm, that was sarcasm - but if you are interested, read on!).

TLDR: It pulls the webpage's source code and looks for strings matching email address format.

Firstly, you need to specify some arguments when running the script (help with these is available through the -h flag). One of these arguments is for pulling email addresses from a URL, which is the one we want in this instance.

It takes the URL supplied and assigns it to a variable, then uses that variable to make a GET request, writing the source code to another variable and then reading it. Basically, the script is now reading the webpage for us. Result.
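If you want to picture that step, here is a minimal sketch. It is not lifted from htemail: the function name fetch_source, the User-Agent value and the use of the standard library's urllib (rather than whatever the real script uses) are all my own assumptions for illustration.

```python
import urllib.request


def fetch_source(url: str, timeout: float = 10.0) -> str:
    """Make a GET request to `url` and return the response body as text.

    Hypothetical helper for illustration; the real script may differ.
    """
    # The User-Agent value here is an arbitrary assumption.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_source("https://example.com") would hand back the page source as one big string, ready for the clean-up step. Only point it at targets that are in scope for your test.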

Source code is a mess, so the next step is cleaning it up a bit in order to make it more 'searchable'. For this we assign each line of the source code to a variable, then check each line for a list of special characters (such as '<'), removing them and replacing them with a space. This isolates everything in the line into individual strings, rather than one big line of gobbledygook. 
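A rough sketch of that clean-up might look like the following. The exact character list is an assumption on my part (the only one named above is '<'), and split_line is a made-up name, so treat it as an illustration rather than the real htemail code.

```python
# Characters to swap for spaces. This particular list is an assumption.
SPECIAL_CHARACTERS = '<>"\'()[]{},;:=/\\'


def split_line(line: str, special: str = SPECIAL_CHARACTERS) -> list:
    """Replace each special character with a space, then split on whitespace."""
    table = str.maketrans({character: " " for character in special})
    return line.translate(table).split()
```

Feed it a line of HTML and every tag, attribute and address comes back as its own string. Note that ':' is in the list so that 'mailto:' gets separated from the address that follows it.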

Ok, so we have nice individual strings, and much less nonsense. Time to tell the script what it is actually looking for. For this we will use a regex (regular expression), a string that (in this instance) covers email address formats, and something we can compare everything in the source code to. If you want to see the regex, have a look in the htemail code I posted a link to above. 
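For the impatient, here is the general idea. The pattern below is a deliberately simple one I am assuming for illustration; it is not the regex from htemail and it will not cover every address the email standards allow.

```python
import re

# A simplified email pattern (an assumption, not the regex used in htemail):
# local part, an @, a domain, a dot, and a top-level domain of 2+ letters.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def looks_like_email(token: str) -> bool:
    """Return True if the whole token matches the email pattern."""
    return EMAIL_PATTERN.fullmatch(token) is not None
```

Using fullmatch rather than search means the entire string has to look like an address, which cuts down on junk such as half a URL with an '@' buried in it.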

After comparing each string in each line of the source code to the regex, we are left with a list of things that are (probably, and/or hopefully) email addresses. From here it is just a case of checking we haven't got any duplicates and then adding the results to a file. Step one complete, enjoy your new email addresses! Why not send them a hello email introducing yourself? Or maybe not, probably best to just use them for phishing.
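The de-duplicating and saving part could be sketched like this. Both function names are hypothetical, and treating addresses that differ only by case as duplicates is my own assumption rather than something the script is confirmed to do.

```python
from pathlib import Path


def unique_emails(candidates):
    """Drop duplicates while keeping the order the addresses were found in.

    Comparison is case-insensitive (an assumption for this sketch).
    """
    seen = set()
    result = []
    for candidate in candidates:
        key = candidate.lower()
        if key not in seen:
            seen.add(key)
            result.append(candidate)
    return result


def save_emails(emails, path):
    """Write one address per line to the file at `path`."""
    Path(path).write_text("\n".join(emails) + "\n", encoding="utf-8")
```

One address per line is a handy format because most other tools, including the validator later in this post, are happy to take it as input.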

2) Using LinkedIn to guess email addresses

LinkedIn is great for OSINT. And (lucky for us) this especially rings true for guessing email addresses. LinkedIn users often display their first and last names, and companies often use firstname.lastname@domain.com as their naming convention for email addresses. Put 2 and 2 together and you got yourself a nice juicy list of staff emails, ripe for harvesting those all important creds.

What are you waiting for? Grab yourself a pen and paper and get writing all those 3000 staff members' names down!

What? What do you mean you're not doing that? Fine, I'll automate it for you:



linkedemail (github.com/the-jcksn)

Ok, so this code is a bit messy, and it requires about 2 minutes prep on your behalf. The issue was that you need to be logged into LinkedIn and on the right page in order to pull the company's staff members. But once you've followed the instructions and done that, it DOES work, and gets staff email addresses. With regards to automating this bit, I quote "Ain't nobody got time for that".

"So how does it work" - I again hear you not asking.

TLDR: Similar to last time, it reads the source code, looks for first-last names and then converts these to email addresses by adding a dot and a domain.

Firstly, you need to provide it with the source code this time. As mentioned above, there is a log-in barrier preventing me from pulling this with a request (for now, it is do-able but I have better things to do, like washing my hair).

It works in a similar way to htemail, reading the source code line by line, this time looking for links to staff members' profiles.

After separating these links into their own list, it reads the list and strips the URL part, leaving only the LinkedIn usernames of the staff members. Thankfully, the default username for LinkedIn is firstname-lastname.
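As a sketch of that link-hunting step, something like the following would do. I am assuming profile links take the form linkedin.com/in/username, and both the pattern and the function name are mine rather than taken from linkedemail.

```python
import re

# Assumed shape of a profile link: linkedin.com/in/<username>
PROFILE_LINK = re.compile(r"linkedin\.com/in/([A-Za-z0-9%_-]+)")


def extract_usernames(source: str) -> list:
    """Read saved page source line by line and collect unique profile usernames."""
    usernames = []
    for line in source.splitlines():
        for username in PROFILE_LINK.findall(line):
            if username not in usernames:
                usernames.append(username)
    return usernames
```

The capture group in the regex does the URL stripping for us, so what comes back is just the username part of each link.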

If any of the staff usernames in our list use this naming convention (which most usually do), it then removes the hyphen, replaces it with a dot, and appends a user-supplied domain to the end. This gives us firstname.lastname@domain.com, one of the most common email naming conventions businesses use! Simples.
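In code, the conversion is only a few lines. This is a minimal sketch with a made-up function name; the rule that a username must be exactly two alphabetic parts is my own assumption about how to spot the firstname-lastname convention.

```python
def username_to_email(username: str, domain: str):
    """Turn 'firstname-lastname' into 'firstname.lastname@domain'.

    Returns None when the username does not follow that convention
    (the two-alphabetic-parts rule is an assumption for this sketch).
    """
    parts = username.lower().split("-")
    if len(parts) != 2 or not all(part.isalpha() for part in parts):
        return None
    first, last = parts
    return f"{first}.{last}@{domain}"
```

Usernames with extra bits on the end (LinkedIn sometimes adds a string of letters and numbers) come back as None, so you can skip them or deal with them by hand.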

I then added the option to create a CSV file containing firstname, lastname, and email address, as this is the format GoPhish requires for setting up user lists. You're welcome.
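Here is roughly what building that CSV looks like with Python's csv module. The header names are an assumption on my part, so check them against the CSV template GoPhish gives you before importing; the function name is also hypothetical.

```python
import csv
import io


def build_gophish_csv(people) -> str:
    """Build CSV text from (first name, last name, email) tuples.

    The header row is an assumption; compare it with GoPhish's own template.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["First Name", "Last Name", "Email"])
    for first, last, email in people:
        writer.writerow([first, last, email])
    return buffer.getvalue()
```

Letting the csv module do the writing means names containing commas or quotes get escaped properly instead of silently breaking your import.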

It's not perfect, but I knocked it up in an hour or so. If it gets any traffic I will improve it and make it more point and shoot. If not, then just use linkedin2username by initstring. It's much better and simpler, and saves me more coding.

3) Using Microsoft 365 to validate the email addresses we have found

Ok! So there we have it, you have some amazing lists of hundreds of email addresses. You're ready to go phishing!

No, no you're not. Not yet. One last thing I'm afraid.

So you have all these email addresses you have pulled/generated. But how do you know they are actually valid? You don't. So you could send out your campaign and get nothing back, and cry yourself to sleep. But you don't have to! There's a way to validate them.

I'm assuming you are targeting Microsoft 365 accounts here, because that's what a lot of businesses use. If you're not, then this bit isn't for you and you can go away. Shoo, leave. If you are though, then read on...

To validate the email addresses, you can go to the Office 365 login page, type the email addresses in one by one, and see if it asks you for a password or tells you that the user doesn't exist. Done, you now know which are and aren't valid email addresses! Hurray, continue phishing.

What do you mean you have 5000 email addresses to check? Better put the kettle on, you're in for a long night.

Or not, yet again, I'll do the hard work so you don't have to:


365validator (github.com/the-jcksn)

Ok, so this one I'm pretty proud of. Point and shoot script that does all the hard work for you, just supply it with a list of email addresses and it checks the validity of each against the 365 login page. Again, use the '-h' flag to see the available options.

"So how does it work" - I know you want to know, so I'll explain.

TLDR: It fires each username against the login page and analyses the response to find if they are valid or not.

Firstly, I spent a while buried deep within Burp Suite Repeater, checking the different responses to valid and invalid email addresses. I started to notice a pattern: every valid email address I sent produced one of the following in the response:

"ifExistsResult":5

"ifExistsResult":0

Whilst invalid email addresses produced:

"ifExistsResult":1

Bingo, we nailed it. Now before you go running off and trying to submit this as a bug bounty to Microsoft: I already checked, they don't accept submissions for user-enumeration. So no cash for you. Sorry.

Now, the requests I sent in Burp Suite contained a LOT of fields of data, cookies, headers, and all sorts of nonsense. So I removed them one by one, and worked out the bare minimum we needed to send in order to still get our results. After deleting all the rubbish, we ended up only having to send the following to get our "ifExistsResult" response rather than errors:

{"username":"user@example.com", "isOtherIdpSupported":true, "country":"GB"}
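Building that body for each address is a one-liner in Python. A minimal sketch, assuming the three fields above are all that is needed (build_payload is a hypothetical name, not one from 365validator):

```python
import json


def build_payload(email: str, country: str = "GB") -> str:
    """Return the minimal JSON body shown above for a single email address."""
    return json.dumps(
        {"username": email, "isOtherIdpSupported": True, "country": country}
    )
```

Using json.dumps instead of gluing strings together means any odd characters in an address get escaped properly, and Python's True comes out as the lowercase true that JSON expects.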

From here we just need to fire each email address in the list at the login page, with a POST request containing the above, replacing the username field with each email address, and check if the above "ifExistsResult" strings for valid emails are in the response. If they are then we print them out as valid, if they aren't then we ignore them. Done and dusted. We got our emails, time to go phishing.
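The checking logic can be sketched as below. Two things to flag: this version parses the response as JSON rather than searching for the strings as described above, and `send` is a hypothetical callable standing in for the POST request, since the exact login endpoint isn't given in this post. The function names are mine too.

```python
import json

# Values of "ifExistsResult" that indicated a valid address in my testing.
VALID_RESULTS = {0, 5}


def is_valid_response(body: str) -> bool:
    """Return True if the response body reports a valid address."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # Compare key names case-insensitively in case the casing differs.
    lowered = {key.lower(): value for key, value in data.items()}
    result = lowered.get("ifexistsresult")
    return type(result) is int and result in VALID_RESULTS


def find_valid(emails, send):
    """Return the addresses whose response marks them as valid.

    `send` is a hypothetical callable: it takes the JSON payload as a
    string and returns the response body as a string.
    """
    valid = []
    for email in emails:
        payload = json.dumps(
            {"username": email, "isOtherIdpSupported": True, "country": "GB"}
        )
        if is_valid_response(send(payload)):
            valid.append(email)
    return valid
```

Keeping the request behind `send` also makes the logic easy to test with canned responses before you point it at anything real, and only ever at accounts that are in scope for your engagement.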

Ta-raa for now.

I hope you learnt something reading this blog, and even if you didn't, I hope the tools I produced make your life a little easier. If they did then feel free to reach out to my LinkedIn and say hello! 

Now go get those creds.
