Fightlike File: How I Built a Mastodon Bot

  1. Conception
  2. Theory
    1. The Chipp and Seth Problem
    2. The Robo-Ky Dilemma
  3. Technology
    1. Libraries
  4. Code
    1. Rundown
      1. Connect to Mastodon
      2. Usage
      3. Processing Input
      4. Outstanding Mentions
      5. Streaming
    2. Database
      1. Aggregation Pipeline
    3. Principles
      1. Separation of Concerns
      2. DRY (Don't Repeat Yourself)
      3. YAGNI (You Aren't Gonna Need It)
      4. Clean Code
    4. Testing
    5. Potential Improvements
  5. Growth and Challenges
    1. Database Roadblock
      1. Character Submission Form
    2. Recommendations Accuracy
    3. Mastodon
    4. Hosting
  6. Summary

Conception

Like other multiplayer-focused video game genres, fighting games are widely regarded, by fans and non-fans alike, as difficult. They ask players to invest a lot of time and effort, and to take tens, hundreds, sometimes even thousands of beatdowns before they can win with any consistency.

Like any hobby, fighting games can often be discouraging, which is why a player's connection to their character is vital to their progress. A weak connection can turn improvement from fun into a chore, which is a good reason to abandon a hobby. To quote Reggie Fils-Aimé:

If it's not fun, why bother?

It is close to impossible for me to commit to a fighting game if no character in its roster grabs me. It is almost as if I gravitate towards characters, not games. To help myself and other Fediverse users like me, I set out to build some kind of character recommender, which eventually became Fightlike.

Theory

On a basic level, my assumption is there are three possible criteria to group characters:

  1. Visual attributes—size, outfit, physical features… etc.
  2. Archetypes—e.g. grappler vs. shoto
  3. Gameplay attributes—fast, rekka, teleport… etc.

Since Fightlike is gameplay-focused, its primary concern is archetypes and gameplay attributes. Including visual attributes is not out of the question, but it is not within the bot's current scope.

Initially, I thought grouping characters based on archetypes should be enough; however, I quickly realized the more data there is about any character, the easier it becomes to find others like them.

The Chipp and Seth Problem

Chipp and Seth challenged, and eliminated, two misconceptions I had:

  1. Grouping characters by archetype is enough to provide accurate recommendations.
  2. Archetypes should take precedence over singular gameplay attributes.

In theory, Chipp and Seth belong to two different archetypes: pixie and glass cannon; however, in practice, they have many similarities—both are fast, mobile characters with a teleport, and high mixup potential.

It can certainly be argued that, in terms of gameplay, Seth is the closest character to Chipp in UNISC (Under Night In-Birth II Sys:Celes). In other words: if a player mains Chipp, they will most likely enjoy playing Seth in UNISC.

With this assumption in mind, if Fightlike relies only on archetypes, it will not recommend Seth to Chipp mains looking for a main in UNISC; ergo, some of its recommendations will be inaccurate.

Giving archetypes precedence over singular gameplay attributes leads to the same outcome.

The Robo-Ky Dilemma

What is Robo-Ky's archetype? There does not seem to be a definite answer.

Robo-Ky's dilemma is not exclusive to characters without any obvious archetype; it also encompasses characters with a unique archetype, like Jack-O.

If a character's archetype is not obvious, or if it is one of a kind, then, in the absence of gameplay attributes, Fightlike will not find any similar character to recommend.

A workaround could be using an arbitrary label, like oddball, in lieu of an archetype; however, this does not necessarily lead to accurate recommendations.

The Robo-Ky Dilemma, combined with The Chipp and Seth Problem, led me to conclude that singular gameplay attributes are as important as archetypes.

Technology

While a Mastodon bot is sufficient proof of concept, my plan has always been to make a free and open-source web app.

With that in mind, I decided to host Fightlike's database at MongoDB's Atlas because:

  1. It is cloud-based, which allows me to simultaneously hook into it from the bot and the web app, while relieving myself of the burden of having to host it.
  2. It supports multiple languages.
  3. It is free, to an extent.
  4. It supports aggregation.

For the bot, I chose Python to write the middle layer between the database and Mastodon because of its extensive libraries and its write-less, do-more approach.

Although this is not set in stone, my plan is to use SvelteKit to write the web app.

Libraries

The Mastodon bot relies on the following Python libraries:

  1. python-dotenv to process environment variables.
  2. PyMongo to connect to MongoDB's Atlas.
  3. Mastodon.py to connect to botsin.space, where Fightlike is hosted.
  4. Beautiful Soup to parse toots.
  5. pytest for unit testing.

Code

This section covers only the Mastodon bot's codebase. When finished, the web app will likely get its own blog post.

Rundown

Fightlike is designed to accomplish the following objectives:

  1. Connect to the Mastodon instance where the bot is hosted.
  2. Check if any user mentioned the bot while it was offline, and reply to every mention on a first-in, first-out basis.
  3. Listen, and reply, to any mentions as long as it is active.

Connect to Mastodon

python-dotenv and Mastodon.py make connecting to Mastodon a breeze. For the sake of re-usability, and to facilitate unit-testing, I elected to wrap each in a separate function, and call it whenever an environment variable, or a connection to Mastodon is required.

getEnvironmentVariables() returns an instance of a dataclass with every variable stored in .env:

import os
from dataclasses import dataclass

from dotenv import load_dotenv


def getEnvironmentVariables():

    @dataclass
    class environmentVariables():
        api_base_url: str
        db_uri: str

    # Load variables from .env.
    try:
        load_dotenv()
    except Exception as e:
        print(e)

    # os.getenv() returns None for any variable missing from .env.
    return environmentVariables(
        api_base_url=os.getenv("API_BASE_URL"),
        db_uri=os.getenv("DB_URI")
    )

getMastodonConnection() creates and returns an API instance that can be passed around, and used, from anywhere inside the codebase:

from mastodon import Mastodon


def getMastodonConnection():

    api_base_url: str = getEnvironmentVariables().api_base_url

    # access_token is the path to the file that holds the token.
    mastodon = Mastodon(
        access_token="mastodon_access_token.secret",
        api_base_url=api_base_url
    )

    return mastodon

Usage

When all of them are present, Fightlike splits every toot into three separate parts:

  1. Character name, or slug (a string).
  2. Game title, to specify the character's version, prefixed with !! (another string).
  3. Filters, each prefixed with a single !, which could be netcode ("rollback") or a franchise ("guilty gear"); together they form an array of strings.

A typical prompt could be:

@fightlike bridget !!ggst !rollback !blazblue

or:

@fightlike Bridget !!Guilty Gear -STRIVE- !under night !rollback

Only the character name is required. As long as the exclamation mark rule is respected, all flags are optional and can appear in any order.

Processing Input

Fightlike processes user input (toots) in six separate, relatively simple steps:

  1. In getTootText(), using Beautiful Soup, the bot parses toots, then returns their full text, beginning with its own mention:
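from bs4 import BeautifulSoup


def getTootText(tootContent: str) -> str:
    parsedTootContent: BeautifulSoup = BeautifulSoup(
        tootContent, "html.parser"
    )

    # Strip the HTML markup and any surrounding whitespace.
    tootText: str = parsedTootContent.get_text().strip(' \t\n\r')

    # Discard anything that precedes the bot's own mention.
    beginIndex: int = tootText.find("@fightlike")

    return tootText[beginIndex:]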
  2. getCharName() is written to process single and full character names. After eliminating flags and @fightlike, the function checks for, and returns, a full name when provided; otherwise, it returns the name as is:
def getCharName(tootText: str) -> str:

    flaglessToot: str = tootText.split(" !").pop(0)

    charName: str = flaglessToot[11:]

    if (len(charName.split(" ")) == 2):

        firstName: str = charName.split(" ").pop(0).capitalize()

        lastName: str = charName.split(" ").pop(-1).capitalize()

        return firstName + " " + lastName

    else:

        return charName.capitalize()
  3. The simple getFlags() returns a list of strings without their exclamation mark. The exception is the game title, which starts with two exclamation marks and therefore keeps one:
def getFlags(tootText: str) -> list[str]:
    return tootText.split(" !")[1:]
  4. tootID, used to dismiss processed notifications, charName, and flags are all stored, and returned, as an instance of the dataclass TootInfo:
def getTootInfo(notification: dict):

    tootContent: str = notification["status"]["content"]

    tootText: str = getTootText(tootContent)

    @dataclass
    class TootInfo():
        tootID: int
        charName: str
        flags: list[str]

    return TootInfo(
        tootID=notification["status"]["id"],
        charName=getCharName(tootText),
        flags=getFlags(tootText)
    )
  5. Before performing the database lookup, using an exclamation mark as an identifier, getGameTitleFromFlags() returns the game title flag when provided, or returns False when not:
def getGameTitleFromFlags(flags: list[str]) -> str | bool:

    gameTitle: str | bool = False

    for flag in flags:
        if (flag.startswith("!")):
            gameTitle = flag[1:]

    return gameTitle
  6. Finally, the bot loops through all provided flags. It skips the game title, which has already been extracted, checks for the values "rollback" or "delay" to identify netcode, and assigns any remaining flag to query["game.franchise"] (if several remain, the last one wins):
for flag in flags:
    if (flag.startswith("!")):
        continue
    elif (flag == "rollback" or flag == "delay"):
        query["game.netcode"] = flag
    else:
        query["game.franchise"] = flag
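To see the six steps working together, here is a condensed sketch that turns one of the sample prompts into a character name, a game title, and a query. ParsedPrompt and parsePrompt() are hypothetical names used only for illustration; the bot itself spreads this work across the functions shown above, and this sketch leaves out the HTML parsing and the capitalization of names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParsedPrompt:
    # Hypothetical container, not part of Fightlike's codebase.
    charName: str
    gameTitle: Optional[str]
    query: dict = field(default_factory=dict)


def parsePrompt(tootText: str) -> ParsedPrompt:
    # Split on " !" exactly as getCharName() and getFlags() do.
    head, *flags = tootText.split(" !")

    # Drop the leading "@fightlike " mention (11 characters).
    charName = head[len("@fightlike "):]

    gameTitle = None
    query = {}

    for flag in flags:
        if flag.startswith("!"):
            # "!!title" loses one "!" in the split and keeps the other.
            gameTitle = flag[1:]
        elif flag in ("rollback", "delay"):
            query["game.netcode"] = flag
        else:
            query["game.franchise"] = flag

    return ParsedPrompt(charName, gameTitle, query)


parsed = parsePrompt(
    "@fightlike Bridget !!Guilty Gear -STRIVE- !under night !rollback"
)
print(parsed.charName)   # Bridget
print(parsed.gameTitle)  # Guilty Gear -STRIVE-
print(parsed.query)      # {'game.franchise': 'under night', 'game.netcode': 'rollback'}
```

Because every flag is recognized by its content rather than its position, the flags can come in any order, which matches the usage rules described earlier.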

Outstanding Mentions

Passing mentions_only=True to Mastodon.py's mastodon.notifications() is all it takes to filter notifications down to only mentions, and the rest is fairly straightforward:

def handleMissedNotifications() -> None:

    mastodon = getMastodonConnection()

    notifications = mastodon.notifications(mentions_only=True)

    if (len(notifications) > 0):

        for notification in notifications:

            replyToToot(notification)

            dismissNotification(notification)
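The loop above replies in whatever order the API returns the notifications. If that order turns out to be newest first (an assumption worth checking against the instance), a small sorting step would guarantee the first-in, first-out behavior listed in the objectives. The sketch below uses a hypothetical helper, oldestFirst(), and stand-in dictionaries; it assumes each notification carries a "created_at" timestamp.

```python
from datetime import datetime, timezone


def oldestFirst(notifications: list[dict]) -> list[dict]:
    # Hypothetical helper: sort by the "created_at" timestamp so the
    # earliest mention is answered first, whatever order the API used.
    return sorted(notifications, key=lambda n: n["created_at"])


# Stand-in notifications; real ones carry many more fields.
sample = [
    {"id": 3, "created_at": datetime(2023, 9, 12, 12, 30, tzinfo=timezone.utc)},
    {"id": 1, "created_at": datetime(2023, 9, 12, 10, 0, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2023, 9, 12, 11, 15, tzinfo=timezone.utc)},
]

print([n["id"] for n in oldestFirst(sample)])  # [1, 2, 3]
```

Inside handleMissedNotifications(), this would amount to iterating over oldestFirst(notifications) instead of notifications.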

Streaming

Although it is fairly simple, code-side, figuring out how to stream notifications was probably the most challenging objective. It can be done in two steps:

  1. Creating a StreamListener subclass that overrides the on_notification() method:
class CustomListener(StreamListener):

    def on_notification(self, notification):

        if (notification["type"] == "mention"):

            replyToToot(notification)

            dismissNotification(notification)
  2. Passing an instance of the subclass to mastodon.stream_user():
# Create the listener, then begin listening to notifications.
notificationListener = CustomListener()
mastodon.stream_user(listener=notificationListener)

To do more with streaming, the Mastodon.py documentation is an excellent resource.

Database

After going through several iterations, this is what a typical character document in Fightlike's database currently looks like:

{
    _id: 650060820a48eee8cba77d1c,
    name: "Eddie",
    game: {
        title: "Guilty Gear XX Accent Core Plus R",
        netcode: "rollback",
        franchise: "guilty gear",
        slug: "ggxxacpr"
    },
    slug: "zato",
    keywords: [
        "puppet",
        "summoner",
        "flight",
        "negative edge"
    ],
}

Initially, archetypes and keywords were two different keys, but, for reasons mentioned in the theory section, I decided to flatten them into one array.

Aggregation Pipeline

To cut down on code density and processing time, I decided early on to delegate filtering and sorting results to MongoDB's aggregation pipelines:

pipeline = [
    {"$match": query},
    {"$unwind": "$keywords"},
    {"$match": query},
    {"$group": {"_id": {
        "name": "$name",
        "game": "$game.title",
        "netcode": "$game.netcode"
    },
        "count": {"$sum": 1}}
    },
    {"$sort": SON([("count", -1)])},
    {"$limit": 10}
]

What the above pipeline does:

  1. Finds every character in Fightlike's database that matches the user's query.
  2. Deconstructs the keywords array of each document, creating a separate document for each keyword.
  3. Applies the same query to the unwound documents, which filters them at the level of individual keywords.
  4. Groups the remaining documents by name, game title, and netcode, and adds a count field for the total number of matching keywords.
  5. Sorts the results by count in descending order, and limits their number to ten.
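To make the counting concrete, the sketch below imitates the unwind, match, group, sort, and limit stages in plain Python. It is an illustration, not how the bot queries Atlas. It assumes the query selects on keywords shared with the reference character, and rankByKeywordOverlap() is a hypothetical name. "Example A" and "Example B" are made-up entries; Seth's keywords are taken from the description in the theory section.

```python
from collections import Counter


def rankByKeywordOverlap(characters, wantedKeywords, limit=10):
    wanted = set(wantedKeywords)
    counts = Counter()

    for character in characters:
        key = (character["name"], character["game"]["title"])
        # $unwind plus $match: inspect keywords one at a time and
        # keep only the ones the query asks for.
        for keyword in character["keywords"]:
            if keyword in wanted:
                # $group with {"$sum": 1}: one point per matching keyword.
                counts[key] += 1

    # $sort by count, descending, followed by $limit.
    return counts.most_common(limit)


characters = [
    {"name": "Example A", "game": {"title": "Game X"},
     "keywords": ["grappler", "slow"]},
    {"name": "Seth", "game": {"title": "Under Night In-Birth II Sys:Celes"},
     "keywords": ["fast", "mobile", "teleport", "mixup"]},
    {"name": "Example B", "game": {"title": "Game Y"},
     "keywords": ["fast", "zoner"]},
]

ranking = rankByKeywordOverlap(
    characters, ["fast", "mobile", "teleport", "mixup"]
)
for (name, game), count in ranking:
    print(name, "in", game, "matches", count, "keyword(s)")
# Seth in Under Night In-Birth II Sys:Celes matches 4 keyword(s)
# Example B in Game Y matches 1 keyword(s)
```

Characters with no matching keyword never enter the ranking, and the more keywords two characters share, the higher the recommendation lands. This is why flattening archetypes and keywords into one array works well with this kind of pipeline.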

Principles

Throughout Fightlike's development cycle, I was conscious of the following principles:

Separation of Concerns

On macro and micro levels, I tried to observe separation of concerns—every module is restricted to one file, and one section:

connect_to_mastodon.py
custom_stream_listener.py
database.py
environment_variables.py
missed_notifications.py
reply_and_dismiss.py
toot_info.py

Every function, inside every module, carries out one task only, without overlap:

def replyToToot(notification: dict) -> None:

    tootInfo = getTootInfo(notification)

    mastodon = getMastodonConnection()

    reply: str = composeReply(tootInfo.charName, tootInfo.flags)

    try:
        mastodon.status_reply(
            to_status=mastodon.status(tootInfo.tootID),
            status=reply,
            in_reply_to_id=tootInfo.tootID
        )
    except Exception as e:
        print(e)

def dismissNotification(notification: dict) -> None:

    mastodon = getMastodonConnection()

    notificationID: int = notification["id"]

    try:
        mastodon.notifications_dismiss(notificationID)
    except Exception as e:
        print(e)

DRY (Don't Repeat Yourself)

I took every opportunity possible to write reusable functions—e.g. replyToToot() and dismissNotification() are written to be reused inside CustomListener and handleMissedNotifications():

def handleMissedNotifications() -> None:

    mastodon = getMastodonConnection()

    notifications = mastodon.notifications(mentions_only=True)

    if (len(notifications) > 0):

        for notification in notifications:

            replyToToot(notification)

            dismissNotification(notification)

DRY can also be observed when getEnvironmentVariables() is reused inside connect_to_mastodon.py and database.py.

Lastly, DRY makes automated unit testing far more effective throughout Fightlike's codebase.

YAGNI (You Aren't Gonna Need It)

Fightlike's codebase covers only its core functions, while staying conscious of any potential technical debt.

As a result, it is lightweight, and easily extensible.

Clean Code

Being an open-source project, I tried to ensure Fightlike's codebase is as clean and human-readable as possible:

  1. Function names are descriptive and follow camel case.

  2. File names are descriptive of their content, and follow snake case.

  3. Comments are clear, dispel any possible confusion, and do not duplicate code.

  4. All data units and arguments have type hints.

  5. Global scope pollution is nonexistent. Every distinct piece of code is wrapped in a relevant function, and only the main function, runFightlike, is called at the global scope, inside fightlike.py.

Testing

Written using pytest, Fightlike's unit tests cover the following processes:

  1. Fightlike's connection to Mastodon has been successfully established (test_connect_to_mastodon.py).

  2. All necessary toot info has been successfully extracted (test_toot_info.py).

  3. The Atlas database is accessible, and character recommendations are retrievable (test_database.py).
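As an illustration of the second item, tests for the flag helpers could look roughly like the sketch below. This is not the content of test_toot_info.py; the test names are made up, and the two helpers are copied in so that the example runs on its own.

```python
from typing import Union


def getFlags(tootText: str) -> list[str]:
    return tootText.split(" !")[1:]


def getGameTitleFromFlags(flags: list[str]) -> Union[str, bool]:
    gameTitle: Union[str, bool] = False
    for flag in flags:
        if flag.startswith("!"):
            gameTitle = flag[1:]
    return gameTitle


def test_getFlags_splits_on_exclamation_marks():
    toot = "@fightlike bridget !!ggst !rollback !blazblue"
    # The game title keeps one of its two exclamation marks.
    assert getFlags(toot) == ["!ggst", "rollback", "blazblue"]


def test_getFlags_returns_empty_list_without_flags():
    assert getFlags("@fightlike bridget") == []


def test_getGameTitleFromFlags_finds_title_or_returns_false():
    assert getGameTitleFromFlags(["!ggst", "rollback"]) == "ggst"
    assert getGameTitleFromFlags(["rollback"]) is False
```

pytest collects any function whose name starts with test_, so saving the sketch in a file named test_*.py and running pytest in that directory is enough to execute it.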

Potential Improvements

Although the written unit tests are sufficient to cover the common breaking points, they do not cover Fightlike's ability to reply to toots, or dismiss notifications.

One route I have considered to achieve this:

  1. Create, and hook into a test bot.
  2. Use it to test Fightlike's replyToToot() and dismissNotification().
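A different route, which avoids a live test bot altogether, is to hand the reply logic a stand-in client built with unittest.mock from the standard library. The sketch below assumes a hypothetical variant of replyToToot(), called replyWithClient() here, that receives the Mastodon client as an argument; the bot's actual function creates its own connection, so it would need a small refactor before this approach applies.

```python
from unittest.mock import MagicMock


def replyWithClient(mastodon, tootID: int, reply: str) -> None:
    # Hypothetical variant of replyToToot(): the client is injected
    # so a test can pass a mock instead of a live connection.
    mastodon.status_reply(
        to_status=mastodon.status(tootID),
        status=reply,
        in_reply_to_id=tootID,
    )


def test_replyWithClient_sends_one_reply():
    fakeMastodon = MagicMock()

    replyWithClient(fakeMastodon, 42, "Fightlike recommends: ...")

    # The original toot is looked up once, and one reply is sent.
    fakeMastodon.status.assert_called_once_with(42)
    fakeMastodon.status_reply.assert_called_once()

    kwargs = fakeMastodon.status_reply.call_args.kwargs
    assert kwargs["status"] == "Fightlike recommends: ..."
    assert kwargs["in_reply_to_id"] == 42
```

The trade-off is that a mock only verifies which calls were made and with which arguments. It cannot tell whether the instance would accept them, which is where a real test bot still has the edge.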

In terms of readability, the return values of core PyMongo and Mastodon.py functions still need type hints, particularly getMastodonConnection() and getMongoDBCollection().

Lastly, error handling throughout the codebase is rudimentary, and needs further development.

Growth and Challenges

The main avenues of potential growth, and source of obstacles, can be separated into three different categories: database, Mastodon, and hosting.

Database Roadblock

The most challenging aspect of Fightlike has been building, and maintaining, the database:

  1. Because Dustloop, Mizuumi, and SuperCombo are community-maintained, building a custom parser to reliably obtain character-defining keywords from them did not bear fruit.
  2. Open-source large language models, particularly Llama, could not provide consistent results.

To build the database, I had to read through every character page and fill in 200+ database entries, which is neither practical nor sustainable in the long run, making it a roadblock impeding Fightlike's potential growth.

In addition to that, I play only two fighting games, Guilty Gear and Under Night In-Birth, so I do not feel qualified to collect data related to 3D games, like Tekken, or platform fighters, like Smash.

Character Submission Form

In an attempt to get around the aforementioned database roadblock, I decided to build a basic, anonymous character submission form and host it on Fightlike's dedicated web portal.

What the form does: it mails its content to a dedicated Fightlike email address for review, before I add the submission to the database. The Fightlike form source code

Fightlike's portal also includes a data dump of the bot's entire database, in the form of an automatically updating table.

The Fightlike table source code

My hope is: if players find Fightlike useful, my role will be reduced to reviewing, and carrying, their submissions to the database, while occasionally contributing to data collection efforts.

Recommendations Accuracy

If adding more characters to Fightlike's database is a breadth problem, recommendation accuracy is a depth problem.

Ideally, the keywords key, in the database, should include every possible description of how every character plays. Currently, most characters have just enough keywords stored to provide accurate yet unrefined recommendations.

I plan to continue to add more keywords to every character for as long as I am working on Fightlike; however, a better solution could be creating an addendum to Fightlike's character submission form to allow the community to submit keywords for existing characters.

Mastodon

The current layout of Fightlike's reply toot:

@mohab Fightlike recommends:

-Valentine in Skullgirls 2nd Encore+.

-Chipp Zanuff in Guilty Gear XX Accent Core Plus R.

-Orie in Under Night In-Birth II Sys:Celes.

-Spectre in DNF Duel.

-Ushiwakamaru in Melty Blood: Type Lumina.

-Swift Master in DNF Duel.

-Filia in Skullgirls 2nd Encore+.

-Robo-Fortune in Skullgirls 2nd Encore+.

-Double in Skullgirls 2nd Encore+.

The lack of official Markdown or HTML support means wiki links cannot be added without impeding the toot's readability, which is already suboptimal in the absence of ordered lists.

Hopefully, a future update will introduce this much needed change.
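For reference, the reply layout shown above could be produced from the aggregation results along the lines of the sketch below. formatReply() is a hypothetical stand-in, since composeReply() itself is not shown in this post. The result shape follows the pipeline's group stage, and the count values are made up for the example.

```python
def formatReply(username: str, results: list[dict]) -> str:
    # Hypothetical helper; only "name" and "game" are read from "_id".
    lines = [f"@{username} Fightlike recommends:"]

    for result in results:
        character = result["_id"]
        lines.append(f"-{character['name']} in {character['game']}.")

    # A blank line between entries mirrors the layout shown above.
    return "\n\n".join(lines)


results = [
    {"_id": {"name": "Valentine", "game": "Skullgirls 2nd Encore+"},
     "count": 3},
    {"_id": {"name": "Chipp Zanuff",
             "game": "Guilty Gear XX Accent Core Plus R"},
     "count": 2},
]

print(formatReply("mohab", results))
```

Plain text keeps the reply readable on every client, at the cost of the links and ordered lists mentioned above.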

Hosting

My hosting plan, where mohab.xyz and fightlike.mohab.xyz are hosted, does not allow me to run a Python bot. Unfortunately, I currently do not own a spare server to host it locally either.

As a workaround, I set it up to run at startup when I boot into Linux, and inside WSL when I boot into Windows.

Needless to say, this is a makeshift solution, and I will be trying to move it to a cloud server at some point in the future.

Summary

Apart from some time-consuming refactoring, thanks to Mastodon.py's excellent documentation, and my previous experience with MongoDB's Atlas, coding the Mastodon bot was fairly simple and straightforward.

The main challenge was, and remains, maintaining and growing the database.

With the bot in a functioning state, the next stop for Fightlike will be a dockerized, self-hosted, open-source web app.