Devlog #1: Automating Garbageposting

kuantum

🍔Arbiters? Intern🍔
Since Tech never makes their newsletters anymore, I have decided to make my own because I truly have nothing better to do.

On the docket today, we have Garbageposting. Specifically, threads of mine where I should be responding every day, yet I simply…don't. It would be great if I had a tool that automatically posted to a thread every day so I don't have to. And as it turns out, I made such a tool! I'll teach you how I did it because, let's face it, reading ban contests, CLs, MAUL notes, and other hidden gems gets very old very fast, and it's not like you're doing anything else of value.

First, we need to know what tools are available that can help achieve this objective. The first solution is the forum's API itself. Pretty easy to do, all I need is to find the API first and foremost.


Looks like I need an API key. Simple to fix, just gotta find a local Tech Team member to get a key made!


1656807831131.png

Damn. So using the most available and correct tool in front of me is not an option. Well, that's not so bad as we have two more options available!

Our second solution is web scraping. Web scraping is a crude way of extracting data from a website that doesn't offer an API. On all websites, data gets from the server to your web browser as HTML, CSS, and JS. HTML is how the data is laid out on the site, CSS is how the data should look, and JS is how you interact with the data. So we need to be able to do the following things:
  1. Be able to click buttons.
  2. Read data.
  3. Fill data in a box.
  4. Do this every day at a certain time.
For something this simple, we look to scripting languages, like Python. Using a library called bs4, we can start to look at the HTML and start our plan of attack!

Let's write some very simple code first.
Python:
PS C:\Users\mrsam> python
Python 3.10.5 (tags/v3.10.5:f377153, Jun  6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> import requests
>>> res = requests.get("https://edgegamers.com")
>>> soup = BeautifulSoup(res.content, "html.parser")
>>> soup
…
<h2 data-translate="why_captcha_headline">Why do I have to complete a CAPTCHA?</h2>
…
Welp. Basically, Cloudflare doesn't like web scraping, and when it figures out that I'm scraping, it's not going to allow it. There's another downside too: a scraper like this can only read pages, not interact with them. So even if it weren't blocked, I couldn't click or type anything on the site.
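If a script is going to hit that wall, it might as well notice it instead of happily parsing the challenge page. Here's a minimal sketch of that check. It uses Python's built-in html.parser instead of bs4 so it runs without extra installs, the data-translate value is the one from the output above, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class CaptchaDetector(HTMLParser):
    """Flags pages that contain the challenge headline seen in the output above."""

    def __init__(self):
        super().__init__()
        self.blocked = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if dict(attrs).get("data-translate") == "why_captcha_headline":
            self.blocked = True


def is_challenge_page(html: str) -> bool:
    """Return True if the HTML looks like the CAPTCHA page rather than the forum."""
    detector = CaptchaDetector()
    detector.feed(html)
    return detector.blocked


sample = '<h2 data-translate="why_captcha_headline">Why do I have to complete a CAPTCHA?</h2>'
print(is_challenge_page(sample))             # True
print(is_challenge_page("<h2>Forums</h2>"))  # False
```

This only detects the block. It does nothing to get around it, which is why we keep looking for another tool.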

Our final option: Selenium. Selenium started out as a browser testing framework. When web developers test their site, ideally they don't want to click through everything by hand; they'd rather have the computer do the testing for them. Selenium allows that, but it also lets just about anyone automate their browser, though not in the typical sense. More on that later. For now, we need to test Selenium and start working on a solution.

First, we need to know how Selenium works. From their own documentation, "Selenium Python bindings provide a convenient API to access Selenium WebDrivers like Firefox, Ie, Chrome, Remote etc. The current supported Python versions are 3.5 and above". The key note here is "WebDrivers". To let tools like Selenium control a real browser, browser makers publish a companion program, called a driver, that takes commands from your code and passes them on to the browser. For example, Chromium (the open-source project underlying Chrome) has ChromeDriver, Firefox has GeckoDriver, etc.

Second, we need to interact with ChromeDriver using Python. From the docs, we just write a one-liner and we get our own ChromeDriver instance up and running!
Python:
PS C:\Users\mrsam> python
Python 3.10.5 (tags/v3.10.5:f377153, Jun  6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from selenium import webdriver
>>> driver = webdriver.Chrome()

DevTools listening on ws://127.0.0.1:63799/devtools/browser/cd8356db-bad1-4e4f-a450-bdbe6c0fba1b

1656810249888.png


Great! Now we need to navigate to the forums, so we'll tell the driver to go there!
Python:
>>> driver.get("https://edgegamers.com")

1656810351945.png

Let's go. Now that we're here, we need to log in to the forums. Normally, you'd click on "Log in", put in your username, password, and TOTP code if applicable, and Bob's your uncle, you are in! But that's not quite so easy for Selenium. To tell it to click on a button, we first need to find some information about the buttons and inputs. Let's navigate to https://edgegamers.com/login to get started.
1656810575867.png


First, we'll press F12 and find the exact HTML block that houses the username input box.
1656854098407.png
1656854093878.png

Now we have found the exact HTML for the input box, and we know its class name, type, and name. Selenium has a helper class called By. When you try to find elements, you'll want to use By to filter the search and make it exact. Since there's only one "login" input box, we can search By.NAME using the following code:
Python:
>>> from selenium.webdriver.common.by import By
>>> element = driver.find_element(By.NAME, "login")
You'll notice the website hasn't changed at all, but rest assured Selenium knows where the login box is. Now, all we have to do is type our username.
1656854392918.png

Very nice. Now, we need to deal with passwords. It's the same thing, but just search for the password box instead.
1656854486959.png

Nice! Now, we need to focus on how to complete the login. There are a couple of ways. We could do the same thing with Selenium again: find the exact HTML for the button and have it click on it. But do you do that when you log in? More often, you just press Enter, right? Turns out, we can do that too! Selenium has another helper class called Keys that mimics key presses. So let's get that imported and try sending an Enter key to log in.

Python:
>>> from selenium.webdriver.common.keys import Keys
>>> element.send_keys(Keys.RETURN)

1656854774667.png

And it worked! But now we're at an impasse. We need our TOTP code before we can continue the login process.

TOTP (Time-based One-Time Password) is mostly self-explanatory. To generate those six-digit passcodes, two very important parts are needed. One part is a secret key. This could be anything so long as the website and the user both have access to it. The other part is time, but not your local time or time zone time. Have you ever been doing something and thought an hour had passed when only 15 minutes had? It's key to understand that time is relative to the person observing. And in order to have some sort of stability, we need something concrete and rock solid enough to generate these codes across continents and planets, even!

This stability is known as Unix Time. Unix Time doesn't have a timezone, and it doesn't have "time" clocks as we know them. Instead of generating a time, like 8:30 AM, it generates a timestamp: how many seconds have passed since January 1st, 1970, at exactly midnight UTC.
1656855429431.png
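In Python, grabbing that timestamp is one call. A quick sketch (the variable names are mine, and UTC-5 is just an example offset):

```python
import time
from datetime import datetime, timedelta, timezone

# Seconds since 1970-01-01 00:00:00 UTC; the same number everywhere on Earth.
now = int(time.time())
print(now)

# Timestamp 0 is the starting point itself.
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
print(epoch.isoformat())  # 1970-01-01T00:00:00+00:00

# The same instant shown on a clock that runs five hours behind UTC.
eastern = timezone(timedelta(hours=-5))
epoch_local = datetime.fromtimestamp(0, tz=eastern)
print(epoch_local.isoformat())  # 1969-12-31T19:00:00-05:00
```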
From there, our computers can format that timestamp into something workable, like a human-readable date and time. Bonus tip: if you have ever seen a computer's clock set to something as old as December 31st, 1969, it is because it either lost power and needs you to reset it, or its timestamp got turned back to 0 (which shows up as the evening of December 31st, 1969, in timezones behind UTC).

Now, how do we generate a TOTP code with Python? Well, the good news is that we don't have to do it by hand; we can use a module called mintotp. mintotp does the fancy work of making those codes without us having to worry about actually doing it.
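For the curious, here's roughly how the secret key and the timestamp get mashed together. This is my own sketch of the standard algorithm (RFC 6238, built on RFC 4226) using only the standard library, not mintotp's actual source. It assumes the usual defaults of SHA-1, 30-second steps, and six digits, which may not match every site.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # "Dynamic truncation": the low 4 bits of the last byte pick an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Time-based code: HOTP where the counter is Unix time divided by the step."""
    cleaned = secret_b32.replace(" ", "").upper()
    # Base32 secrets are often shown without padding; add it back before decoding.
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)
    if now is None:
        now = time.time()
    return hotp(key, int(now) // step, digits)


# RFC test secret ("12345678901234567890" in Base32) at Unix time 59.
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", now=59))  # 287082
```

Because the counter is just the timestamp divided by 30, the website and your device land on the same code without ever talking to each other. In practice, a library like mintotp wraps all of this for you.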

Let's go ahead and import it.
Python:
>>> import mintotp
From their documentation, mintotp has a function called totp that requires your secret key to generate these codes. Here's their example:
Python:
$ python3
>>> import mintotp
>>> mintotp.totp('ZYTYYE5FOAGW5ML7LRWUL4WTZLNJAMZS')
So let's try it! I use Bitwarden to house my passwords as well as my TOTP codes (which isn't advised ever, so don't do it) so I'll go ahead and grab my secret key and have it generate a TOTP code.
Python:
>>> import mintotp
>>> mintotp.totp("THISISNTAREALTOTPSECRETKEY")
'754878'
Nice! That works, so now we need to do a couple of things!
  1. Find the TOTP code input box.
  2. Fill that box with the value from mintotp.totp.
  3. Press enter.
Very simple. So let's run through it.
1656856274132.png

Step 1 finished.
1656856327145.png

Step 2 finished.
1656856420764.png

Step 3 finished! We're logged in! Now, we have a lot of things we could do. I'm going to choose to participate in the Count to the Million thread. Let's navigate over there.
Python:
>>> driver.get("https://www.edgegamers.com/threads/333944/")
Now, we need to post the next number. I'm going to cheat a bit here and look at the last number posted, which is currently 25382. So let's figure out our next steps here.
  1. Find the message box.
  2. Type in the digits.
  3. Send a key combo CTRL and Enter.
We need to send the key combo this time as Enter only does a new line when you are writing a post.
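As for the cheating part: a full script would have to read the latest post itself instead of me eyeballing it. Here's a tiny sketch of just the arithmetic, assuming the text of the last post has already been pulled off the page somehow (next_count is a hypothetical helper, not something Selenium provides):

```python
import re


def next_count(last_post_text: str) -> int:
    """Find the first number in the last post and return the one that follows it."""
    match = re.search(r"\d[\d,]*", last_post_text)
    if match is None:
        raise ValueError("no number found in the last post")
    # Drop thousands separators such as "25,382" before converting.
    return int(match.group().replace(",", "")) + 1


print(next_count("25382"))  # 25383
```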
Let's find the message box here:
1656856757025.png

It's a div with two class names, fr-element and fr-view. Let's find the div.
Python:
>>> driver.get("https://www.edgegamers.com/threads/333944/")
>>> message_box = driver.find_element(By.CSS_SELECTOR, "div.fr-element.fr-view")
Let's type in the new number:
1656857083025.png

Now to simulate the key presses and...
1656857120655.png

We did it!

This isn't automation since we technically did this by hand. But now that we did it by hand, we can have Python run a script every so often to post whatever we want, whenever we want. We could have it just log in for us and do nothing else to keep our activity...well, active. And if you have any posts (like I do) that require daily attention, you can just automate it away and never have to do anything about it.
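The "every so often" part can be as simple as sleeping until a set time each day. Here's a minimal sketch using only the standard library. The function names and the 9:00 example are mine, and the job is whatever function does the logging in and posting.

```python
import time
from datetime import datetime, time as dtime, timedelta


def seconds_until(run_at: dtime, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of the wall-clock time `run_at`."""
    target = datetime.combine(now.date(), run_at)
    if target <= now:
        # Today's slot has already passed (or is right now), so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(job, run_at: dtime):
    """Call `job()` once a day at `run_at` local time, forever."""
    while True:
        time.sleep(seconds_until(run_at, datetime.now()))
        job()


# Example: at 8:00, the next 9:00 run is an hour away.
print(seconds_until(dtime(9, 0), datetime(2022, 7, 3, 8, 0)))  # 3600.0
# run_daily(my_posting_job, dtime(9, 0))  # my_posting_job is hypothetical
```

The operating system's own scheduler (cron on Linux, Task Scheduler on Windows) does the same job without needing a Python process running all day.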

And before you ask, "oh but kuantum, won't this be used to spam the forums", my answer is that it already is. Selenium has been around since 2004. XenForo forums and the like have similar DOMs and HTML pages, so while, yes, this tool could be used maliciously, it's probably already been used maliciously. On the forums and elsewhere.

If you would like to take a quick look at some of the source code I've written, you can direct yourself to my GitHub project, xenposter.

Til next time, nerds
 
This is very detailed and one of the few longer posts I’ve ever read. I can’t say I learned much but this was very interesting to read. Nice job nerd👍🏽
 
couldn't you have skipped the selenium bullshit by adding cookies & a user agent to your initial request? probably not but just wondering if someone smart could let me know
 
I probably could've, but the problem is that webscraping can't interact with web pages anyway, at least bs4 can't. So Selenium is what I used.

There are also other reasons for Selenium. For one, I don't have to use my actual computer to run the scripts and can run a grid through Docker, but that's a topic for later.
 