Hogumathi: Content System API. Still learning Python industry standards, lack of CI is an issue.

Posted: Sun Aug 13, 2023 7:13 pm
by harlanji
Hogumathi started as a Twitter and Mastodon client and evolved into what I call a Social Media CMS (Content Management System). It hasn't made the full transition from client app to CMS yet: I'd intended to find a host CMS to integrate with, and to decouple the core model from the Flask-based web app I've constructed around it.

Progress got interrupted for several months, mainly due to the Twitter API going premium, and I'm now going back and looking at the code with a fresh set of eyes. Twitter is far from the only backend supported, but it was the first backend I built for and is the most developed. While implementing several backends I reached the point of wanting to cache and query content generically, which led to the introduction of the Content System API.

Is it over-engineering? Is it a misguided abstraction? On re-examination, it feels that way. It's pretty similar to Android's ContentProvider/ContentResolver API, but I think they chose a better approach in using a uniform URI scheme, whereas my IDs are matched with a per-source regexp, which ends up being uniform in practice. The motivating use case was the Brand page and Collections, which can feature content from across sources using the same FeedItem interface. The next was fetching Tweets from an Archive where possible, and from the Live API where we don't have a local copy or want to grab the latest metrics using a chosen provider. An edge case that some providers allow is fetching several pieces of content at once; the Twitter API, for instance, allows fetching a list of Tweets.
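
To make the comparison concrete, here is a minimal sketch of the two ID styles side by side. Everything in it is an assumption for illustration: the ID format `twitter:tweet:1001`, the `content://` URI layout, and both helper names are hypothetical and not taken from the Hogumathi code.

```python
import re
from urllib.parse import urlsplit


def parse_prefixed_id(content_id, id_prefix, id_pattern=r'(\d+)'):
    """Prefix + regexp style: return the captured groups, or None if no match."""
    if not content_id.startswith(id_prefix):
        return None
    match = re.fullmatch(id_pattern, content_id[len(id_prefix):])
    return match.groups() if match else None


def parse_content_uri(uri):
    """Uniform URI style: return (authority, path segments), or None for other schemes."""
    parts = urlsplit(uri)
    if parts.scheme != 'content':
        return None
    return parts.netloc, [seg for seg in parts.path.split('/') if seg]


# Hypothetical IDs; the real ID format is not shown in this post.
print(parse_prefixed_id('twitter:tweet:1001', 'twitter:tweet:'))  # ('1001',)
print(parse_content_uri('content://twitter/tweet/1001'))          # ('twitter', ['tweet', '1001'])
```

With the regexp style each source decides how its IDs look, so uniformity is a convention; with the URI style the parser enforces one shape for every source.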

Code:

    def register_content_source(self, id_prefix, content_source_fn, id_pattern=r'(\d+)', source_id=None, weight=None):

    def find_content_id_args(self, id_pattern, content_id):

    def resolve_content_source(self, content_id, content_source_id=None, *extra_args, **extra_kwargs):

    # Note: the default get_ttl_hash(60) is evaluated once, when the method is defined, not on each call.
    def get_content(self, content_id, content_source_id=None, ttl_hash=get_ttl_hash(60), *extra_args, **extra_kwargs):

    def get_all_content(self, content_ids, enable_bulk_fetch=False, ttl_hash=get_ttl_hash(60)):

    def register_hook(self, hook_type, hook_fn, *extra_args, **extra_kwargs):

    def invoke_hooks(self, hook_type, *args, **kwargs):
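
The method bodies are not shown above, so the following is only a sketch of how registration and resolution could fit together for the Archive-then-Live use case. The method names follow the listing; the bodies, the "higher weight is tried first" rule, the fall-through on `None`, and the example IDs and source functions are all assumptions.

```python
import re
from collections import defaultdict


class ContentSystem:
    """Toy registry: method names follow the post, bodies are guesses."""

    def __init__(self):
        # id_prefix -> list of (weight, source_id, compiled pattern, source function)
        self._sources = defaultdict(list)

    def register_content_source(self, id_prefix, content_source_fn,
                                id_pattern=r'(\d+)', source_id=None, weight=None):
        source_id = source_id or content_source_fn.__name__
        weight = 0 if weight is None else weight
        entries = self._sources[id_prefix]
        entries.append((weight, source_id, re.compile(id_pattern), content_source_fn))
        # Assumption: sources with a higher weight are tried first.
        entries.sort(key=lambda entry: entry[0], reverse=True)

    def resolve_content_source(self, content_id, content_source_id=None):
        """Yield (source_fn, args) for each matching source, highest weight first."""
        for id_prefix, entries in self._sources.items():
            if not content_id.startswith(id_prefix):
                continue
            rest = content_id[len(id_prefix):]
            for _weight, source_id, pattern, source_fn in entries:
                if content_source_id is not None and source_id != content_source_id:
                    continue
                match = pattern.fullmatch(rest)
                if match:
                    yield source_fn, match.groups()

    def get_content(self, content_id, content_source_id=None):
        """Return the first non-None result, falling through to the next source."""
        for source_fn, args in self.resolve_content_source(content_id, content_source_id):
            content = source_fn(*args)
            if content is not None:
                return content
        return None


# Hypothetical stand-ins for a local archive and a live API client.
ARCHIVE = {'1001': {'id': '1001', 'text': 'archived tweet', 'source': 'archive'}}


def archive_tweet(tweet_id):
    return ARCHIVE.get(tweet_id)


def live_tweet(tweet_id):
    return {'id': tweet_id, 'text': 'fetched live', 'source': 'live'}


cs = ContentSystem()
cs.register_content_source('twitter:tweet:', archive_tweet, source_id='twitter:archive', weight=10)
cs.register_content_source('twitter:tweet:', live_tweet, source_id='twitter:live', weight=1)

print(cs.get_content('twitter:tweet:1001')['source'])  # archive
print(cs.get_content('twitter:tweet:2002')['source'])  # live
print(cs.get_content('twitter:tweet:1001', content_source_id='twitter:live')['source'])  # live
```

Passing `content_source_id` is how this sketch models "a chosen provider", for example forcing the live source to refresh metrics for a Tweet that is already archived.
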
I'm still not 100% up to industry standards in the Python ecosystem; for example, I haven't figured out the right documentation tools to use. I'm good at documenting my code once it's stabilized, but I've rarely gotten to that point in my Python apps because I'm still figuring out the best ways to structure them. The only thing worse than no documentation is wrong documentation, so I've pretty much relied on 'self-documenting' code until things are more solidified.

This module on its own is a good candidate for extraction and thorough testing. I have indeed created some tests around the code using pytest, but they aren't run as part of a regular build process and have fallen out of date. Visible CI reports are missing from my stack; I used to just use Jenkins. I guess finding a lightweight CI that works for Python might be an upcoming adventure.
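
As an illustration of the kind of test that could run on every build, here is a small pytest-style sketch. It assumes `get_ttl_hash` follows the common time-bucket idiom used with `functools.lru_cache`; that helper's body, `FakeSource`, and `make_cached_getter` are hypothetical and not the real Hogumathi code. Because `ttl_hash` is an explicit argument, the tests can pin it and never touch the clock or the network.

```python
import time
from functools import lru_cache


def get_ttl_hash(seconds=60):
    """Assumed helper: a value that changes roughly once every `seconds` seconds."""
    return round(time.time() / seconds)


class FakeSource:
    """Test double that counts how often it is asked for content."""

    def __init__(self):
        self.calls = 0

    def __call__(self, tweet_id):
        self.calls += 1
        return {'id': tweet_id, 'text': 'fake tweet'}


def make_cached_getter(source_fn):
    """Wrap source_fn so results are cached per (content_id, ttl_hash) pair."""

    @lru_cache(maxsize=128)
    def get_content(content_id, ttl_hash=None):
        # ttl_hash is unused in the body; it only takes part in the cache key.
        return source_fn(content_id)

    return get_content


def test_same_ttl_bucket_hits_cache():
    source = FakeSource()
    get_content = make_cached_getter(source)
    get_content('1001', ttl_hash=1)
    get_content('1001', ttl_hash=1)
    assert source.calls == 1


def test_new_ttl_bucket_refetches():
    source = FakeSource()
    get_content = make_cached_getter(source)
    get_content('1001', ttl_hash=1)
    get_content('1001', ttl_hash=2)
    assert source.calls == 2
```

pytest collects the `test_` functions automatically; in application code the caller would pass `ttl_hash=get_ttl_hash(60)` at call time so the bucket advances as time passes.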