'On Writing Software Well' II

'On Writing Software Well' II

Callbacks vs. Listeners

DHH remarks that he is a big fan of callbacks since they allow you to move a lot of incidental complexity off to the side while the rest of the code can pretend to be on a simple, default path shielding the programmer from a lot of the cognitive load that lives in callbacks instead.

To see what that means in practice, we are going to trace the mentions feature in Basecamp all the way down and pay attention to how callbacks are used to that end.

The entry point is the create action of the messages controller, which simply records a new message on a bucket (a bucket is an abstraction used to group certain entities, which will be explained in future episodes). The new_message method in turn simply instantiates a message, note that logic pertaining to the creation of mentions or actual recordings is missing from the controller.

class MessagesController < ApplicationController
...
  def create
    @recording = @bucket.record new_message,
      parent: @parent_recording,
      status: status_param,
      subscribers: find_subscribers,
      category: find_category

    ...
  end
  ...
  def new_message
    Message.new params.require(:message).permit(:subject, :content)
  end
...
end

A mention is a model joining a mentioner and mentionee to a specific recording:

class Mention < ActiveRecord::Base
  ...
  belongs_to: :recording

  belongs_to: mentionee, class_name: 'Person', inverse_of: :mentions
  belongs_to: mentioner, class_name: 'Person'
  ...
  after_commit :deliver, unless: -> { mentioner == mentionee }, on: [:create, :update]
end

Mentions are a simple concern which orchestrates when mentions are to be scheduled.

module Recording::Mentions
  extend ActiveSupport::Concern

  included do
    has_many :mentions
    after_save :remember_to_eavesdrop
    after_commit :eavesdrop_for_mentions, on: %i[ create update ], if: :eavesdropping?
  end
  ...
  private
  
  def remember_to_eavesdrop
    @eavesdropping = active_or_archived_recordable_changed? || draft_became_active?
  end

  def eavesdropping?
    @eavesdropping && !Mention::Eavesdropper.suppressed? && has_mentions? 
  end

  def eavesdrop_for_mentions
    Mention::EavesdroppingJob.perform_later self, mentioner: Current.person
  end
end

DHH points out a trick to track dirty attributes, circumventing a problem that many Rails developers have also run into; when you run an after_commit callback you can no longer access to which attributes changed invoking neither changed_attributes nor the _changed? methods, since they only persist within a database transaction.

We simply check before the transaction is committed in an after_save callback which attributes changed, make a note of it in an instance variable so that we can access the information later (e.g. in the after_commit callback).

Here, remember_to_eavesdrop records whether the content of the recordable record actually changed or whether a recordable which might contain mentions became active before we scan for mentions.

The eavesdropping? query method simply checks whether the instance variable is set, that mentions exists and that the eavesdropping callback has not been disabled via suppress. To the last point, DDH explains that while callbacks are supposed to contain code that should run by default, it might sometimes be necessary to disable them.

Finally, after checking whether we should perform any work and scan for mentions, the actual work is delegated to a job via eavesdrop_for_mentions, the job simply instantiates an instance of Mention::Eavesdropper which creates the actual mentions. Also, note how the method Current, a class that allows global, per-request storage of attributes, is used to pass the current user as mentioner to the job.

class Mention::EavesdroppingJob < ApplicationJob
  queue_as :background

  def perform(recording, mentioner)
    Current.set(account: recording.account) do
      Mention::EavesDropper.new(recording, mentioner).create_mentions
    end
  end
end

The EavesDropper in turn invokes a scanner that finds mentionees and creates mentions.

class Mention::Eavesdropper
  extend Suppressible
  ...
  def create_mentions
    recording.with_lock do
      mentionees.each do |mentionee, callsign|
        create_mention mentionee, callsign
      end
    end
  end
end

That is it, we moved the ancillary concern of creating mentions off to the side, by handling it in callbacks as response to certain life cycle events of our model as opposed to the 'main path' of our code inside the controller action. A developer interested in the main path i.e. creating messages is not confronted with the complexity of creating mentions right away. While it is true that this reduces some cognitive load in that specific case, it comes at non-negligible cost.
Note how we had to trace the feature of creating mentions in response to a change to a recordable record all the way from the controller, through the model's life cycle methods to a job and finally a service creating the mentions. Along the way we are given hints that this level of coupling is fraught with some amount amount of complication.

Tracking dirty attributes

First off, we need intricate knowledge about Rails life cycle methods in order to be able to track changes to a recording and know whether we should even check for mentions. I need to be cognizant of database transaction and how they relate to callbacks to even become aware of how to track model changes in after_commit. Talk about incidental complexity.

Checking for suppression in callbacks

Secondly, apparently, there are use cases where the client (whoever is initiating those model updates) might not want to listen for mentions, maybe I am seeding data or going through an admin API that I don't want to trigger sending emails. Quite plausible. In those cases, I need to check whether creating mention eavesdroppers has explicitly been suppressed. The problems introduced by this sort of coupling have been addressed in this post. But it again strikes me as very counterintuitive and error-prone to reach into a completely different class, whose internal state has been modified elsewhere in order to decide whether to run a callback or not.

Using Current to store request-wide state

Finally, a problem that results from handling these types of interactions deep down in active record models is that I still need information from the controller. In this case, a global object is used to register that information making it globally accessible in the entire application. That should be the clearest indication that I might be performing work in a class that has to know too much in order to perform it and hence might be the wrong place to do it.

The controller as mediator

That's enough for criticisms. I think the highlighted problems all indicate that we shouldn't know what the we are trying to know inside the callback, because we are too far removed from where those decisions occur; the controller.

I have always thought of the controller, more specifically a controller action, as a mediator encapsulating knowledge about a particular use case and deciding which models need to talk to which and what they need to know to accomplish their particular tasks. The controller orchestrates, passes on information and creates side effects, much in the vein of Gary Bernhardt's functional core / imperative shell.

At speqtor.com, we have a similar feature to Basecamp's mentions where certain updates to models create notifications for different users subscribed to that model.

A typical controller action looks like this:

def update
  load_criterion
  build_criterion
  authorize_criterion

  subscribe_listeners(@criterion)

  save_criterion

  decorate_criterion
  render_criterion
end

We like sticking to the same structure in every controller which makes them easy to understand and to spot where interesting things are happening (See the excellent Growing Rails Applications in Practice). Here, we are updating a criterion that indicates how complex a project is going to be. In this specific use case, a user directly interacts with our web app, as opposed to an importer job or the rails console. In this context we want a number of side effects to happen as a result of certain model events. This is achieved by registering event listeners, which in turn decide what is supposed to happen as a result of those changes.

In our example, we want to listen to successful updates in order to notify other users.

This happens inside the SubscribesListeners concern:

def subscribe_notification_listener(options = {})
  with_load_error_guard do
    listener_class = options[:notification_listener_class] || infer_listener
    listener = listener_class.new

    listener.current_user = current_user
    listener.changes = subscription_target.changes.transform_values do |val|
      val.map(&:to_s)
    end

    subscription_target.subscribe listener, async: true
  end
end

Here, we are instantiating the listener class, pass in the information it needs, i.e. model changes and the current user and subscribe it to the target model (in this case the criterion). Often we add other information available in the controller, such as the scope of the current project or user permissions. The listener, in turn, simply creates the notification.

Simple enough.

How are the above problems solved here?

Regarding dirty attribute tracking; since we haven't persisted the model yet, we can still access model changes though the attributes api, when the model is saved and the database transaction completes, the listener is merely notified of its success or failure.

As we are still inside the controller context we can also pass any information such as the current user to the listener without having to awkwardly store it in a global Current.

Lastly, the listener is maximally decoupled, we have to explicitly opt into creating notifications depending on the current use case, as opposed to anticipating every use case by checking related models for suppression.

An additional benefit is that, we can now easily background the listener without having to worry  about implicit state in the form of model suppression or Current registries.

So what should callbacks be used for?

I am not a big fan of hard and fast rules in software design, but sometimes it's prudent to have certain guidelines to stick to unless there is a very good reason for violating them.

One of them is that callbacks should only deal with immediate model concerns, which are in declining order of popularity:

  1. Maintaining data integrity and mutating the model into a valid state, examples are normalizing or splitting attributes.
  2. Mutate a closely associated model, for instance counter caches in a one-to-many association.
  3. Small side effects for related or derived data such as busting caches.