This commit shows how to move away from rf/defn
f12c7401d1/src/utils/re_frame.clj (L1-L90)
& rf/merge
f12c7401d1/src/utils/re_frame.cljs (L39-L85)
and why we should do it.
## Problems
Before jumping to solutions, let's understand the problems first, in no order of
importance.
### Problem 1: Cyclic dependencies
If you have ever tried to move event handlers, or the functions used inside
them, to different files in status-mobile, you have probably run into cyclic
dependency errors.
When an event is registered in re-frame, it is globally available for any other
place to dispatch. The dispatch mechanism relies on globally unique keywords,
the so-called event IDs, e.g. :chat/mute-successfully. This means that event
namespaces don't need to require other event namespaces, just like you don't
need to require subscription namespaces in views.
rf/merge increases the likelihood of cyclic dependencies because it forces
event namespaces to require each other. Although not common, this has happened a
few times to devs in the team, and it can be a big hassle to fix if you are
unlucky. It is a problem we should not have in the first place (at least not as
easily).
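As a minimal sketch of this decoupling, assume two hypothetical handlers that
live in different namespaces. The handler bodies, the `:settings/mute-requested`
ID, and the app db shape are illustrative assumptions, not code from
status-mobile. The sketch also assumes stock re-frame, where the event vector
still starts with the event ID.

```clojure
;; Hypothetical handler that would live in a chat events namespace.
(defn mute-successfully
  [{:keys [db]} [_event-id chat-id]]
  {:db (assoc-in db [:chats chat-id :muted] true)})

;; Hypothetical handler in a *different* namespace. It refers to the chat
;; event only through its keyword ID, so it never requires the chat namespace.
(defn mute-requested
  [_cofx [_event-id chat-id]]
  {:fx [[:dispatch [:chat/mute-successfully chat-id]]]})

(comment
  ;; Registration, assuming re-frame.core is available.
  (re-frame.core/reg-event-fx :chat/mute-successfully mute-successfully)
  (re-frame.core/reg-event-fx :settings/mute-requested mute-requested))
```

The only thing shared between the two namespaces is the keyword, so neither
needs the other in its `:require` list.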
### Problem 2: We are not linting thousands of lines of code
The linter (clj-kondo) is incapable of understanding the rf/defn macro. In
theory, we could teach clj-kondo what the macro produces. I tried this, but gave
up after a few tries.
This is a big deal: clj-kondo can catch many issues and will catch even more as
it continues to evolve. It's hard to precisely count how many lines are
affected, but `find src/ -type f -name 'events.cljs' -exec wc -l {} +` gives us
more than 4k LOC.
### Problem 3: Blocking RN's UI thread for too long
Re-frame has a routing mechanism to manage events. When an event is dispatched,
it is enqueued and scheduled to run some time later (very soon). This process is
asynchronous and is optimized in such a way as to balance responsiveness vs the
time to empty the queue.
>[...] when processing events, one after the other, do ALL the currently queued
>events. Don't stop. Don't yield to the browser. Hog that CPU.
>
>[...] but if any new events are dispatched during this cycle of processing,
>don't do them immediately. Leave them queued.
>
>-- https://github.com/day8/re-frame/blob/master/src/re_frame/router.cljc#L8-L60
Decisions were made (way back in 2017) to reduce the number of registered
re-frame events and, more importantly, to coalesce events into bigger ones with
the rf/merge pattern. I tried to find evidence of the real problems those
decisions were meant to solve, but my understanding is that they were largely
based on personal architectural preferences.
Fast-forward to 2023, and we are in a situation where we have many heavy events
that process a LOT of stuff in one go using rf/merge, thus blocking the UI
thread longer than we should. See, for example,
[status-im2.contexts.profile.login.events/login-existing-profile](3082605d1e/src/status_im2/contexts/profile/login/events.cljs (L69)),
[status-im2.contexts.profile.login.events/get-chats-callback](3082605d1e/src/status_im2/contexts/profile/login/events.cljs (L98)),
and many others.
The following excerpt was generally used to justify the idea that coalescing
events would make the app perform better.
> We will reduce the the amount of subscription re-computations, as for each
> distinct action, :db effect will be produced and swapped into app-db only once
>
> -- https://github.com/status-im/swarms/issues/31#issuecomment-346345981
This is in fact incorrect. Ever since 2015 (so before the original discussions
in 2017), re-frame has processed events in batches, which means subscriptions
won't re-run after every dispatched event, and thus components won't re-render
either. Re-frame is smarter than that.
> groups of events queued up will be handled in a batch, one after the other,
> without yielding to the browser (previously re-frame yielded to the browser
> before every single event).
>
> -- 39adca9367/docs/releases/2015.md (050--2015-11-5)
Here's a practical example you can try in a shadow-cljs :mobile REPL to see the
batching behavior in practice.
```clojure
;; A dummy event that toggles between DEBUG and INFO levels.
(re-frame/reg-event-fx :dummy-event
  (fn [{:keys [db]}]
    {:db (update-in db
                    [:profile/profile :log-level]
                    (fn [level]
                      (if (= "DEBUG" level)
                        "INFO"
                        "DEBUG")))}))

(def timer
  (js/setInterval #(re-frame/dispatch [:dummy-event])
                  50))

;; 1. In component status-im.ui.screens.advanced-settings.views/advanced-settings,
;; add a print call to see when it's re-rendered by Reagent because the
;; subscription :log-level/current-log-level will be affected by our dummy event.
;;
;; 2. Also add a print call to the subscription :log-level/current-log-level to
;; see that the subscription does NOT re-run on every dispatch.

;; Remember to eval this expression to cancel the timer.
(js/clearInterval timer)
```
If you run the above timer with 50ms interval, you'll see a fair amount of
batching happening. You can infer that's the case because you'll see way less
than 20 print statements per second (so way less than 20 recomputations of the
subscription, which is the max theoretical limit).
When the interval is reduced even more, to say 10ms (to simulate lots of
dispatches in a row), sometimes you don't see a single recomputation in a 5s
window because re-frame is too busy processing events.
This shows just how critical it is to have event handlers finishing as soon as
possible to relinquish control back to the UI thread, otherwise responsiveness
is affected. It also shows that too many dispatches in a row can be bad, just as
big event handlers would block the batch for too long. You see here that
dispatching events in succession does NOT cause needless re-computations.
Of course there's an overhead in using re-frame.core/dispatch instead of calling
a Clojure function, but the trade-off is clearly documented: the more we
process in a single event, the less responsive the app may be because re-frame
won't be able to relinquish control back to the UI thread. The time to process
the batch increases, and re-frame can't stop in the middle of a handler the way
it can between separate dispatches.
Thus, I believe this rf/merge pattern is harmful as a default practice in an
environment such as ours, where we want end users to feel a snappy RN app. I
actually firmly believe we can improve the app's responsiveness by not
coalescing events by default. We're also preventing status-mobile from taking
full advantage of future improvements in re-frame's scheduler. I can
totally see us experimenting with other algorithms in the scheduler to best fit
our needs. We should not blindly reduce the number of events as stated here
https://github.com/status-im/status-mobile/pull/2634#discussion_r155243127.
Solution: only coalesce events into one pile when it's strictly desirable to
atomically update the app db to avoid inconsistencies. Otherwise, let the
re-frame scheduler do its job by using fx, not rf/merge. When needed, embrace
*eventual app db consistency* as a way to achieve lower UI latency, i.e. write
fast and short events, and intentionally use :dispatch-later or other timing
effects to bend re-frame's scheduler to your will.
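A minimal sketch of what such a handler could look like, assuming stock
re-frame (event vector starting with the event ID). The handler name, the
dispatched event IDs, and the 200ms delay are hypothetical and only illustrate
the shape of the returned effects.

```clojure
;; Hypothetical handler: only the update that must be atomic goes in :db.
;; Everything else is handed back to re-frame's queue as separate events.
(defn login-success
  [{:keys [db]} [_event-id profile]]
  {:db (assoc db :profile/profile profile)
   :fx [;; Queued right away and handled after this handler returns.
        [:dispatch [:chats/fetch]]
        ;; Deliberately delayed so more urgent work can run first.
        [:dispatch-later {:ms 200 :dispatch [:notifications/fetch]}]]})

(comment
  ;; Registration, assuming re-frame.core is available.
  (re-frame.core/reg-event-fx :profile/login-success login-success))
```

Each dispatched event becomes its own short handler, which leaves re-frame
free to decide when to run it.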
There's another argument in favor of using something like rf/merge which I would
like to deconstruct. rf/merge gives us a way to reuse computations from
different events, which is nice. The thing here is that we don't need rf/merge
or re-frame to reuse functions across namespaces. rf/merge complects re-frame
with the need to reuse transformations.
Instead, the solution is as trivial as it gets: reuse app db "transformers"
across events by extracting the logic to data store namespaces
(src/status_im/data_store). This solution has the added benefit of not causing
cyclic dependency errors.
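Here is a small sketch of that idea. The function names and the app db shape
are assumptions made for illustration. The point is only that the shared logic
is a plain function of the db, which any event handler can call.

```clojure
;; Hypothetical transformer that could live under src/status_im/data_store.
;; It knows the shape of app db, but nothing about re-frame.
(defn mark-notification
  [db notification-id attrs]
  (update-in db
             [:activity-center :notifications]
             (fn [notifications]
               (mapv (fn [notification]
                       (if (= notification-id (:id notification))
                         (merge notification attrs)
                         notification))
                     notifications))))

;; Two hypothetical event handlers reuse the transformer as a plain function.
(defn mark-as-read
  [{:keys [db]} [_event-id notification-id]]
  {:db (mark-notification db notification-id {:read true})})

(defn accept
  [{:keys [db]} [_event-id notification-id]]
  {:db (mark-notification db notification-id {:read true :accepted true})})
```

Because `mark-notification` takes and returns a db, it can be unit tested
without any coeffects or effects maps.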
### Problem 4: Clojure's language server doesn't understand code under rf/defn
Nowadays, many (if not most) Clojure devs rely on the Clojure Language Server
https://github.com/clojure-lsp/clojure-lsp to be more effective. It is an
invaluable tool, but it doesn't work well with the macro rf/defn, and it's a
constant source of frustration when working in event namespaces. Renaming
symbols, finding references, jumping to local bindings, etc. don't work inside
the macro.
Solution: don't use rf/defn, instead use re-frame's reg-event-fx function and
clojure-lsp will understand all the code inside event handlers.
### Problem 5: Unit tests for events need to "test the world"
Re-frame's author strongly recommends testing events that contain non-trivial
data transformations, and we do have many in status-mobile (note: let's not
confuse these with the integration tests in status_im/integration_test.cljs).
Non-trivial layer-3 subscriptions should be covered too. The reasoning is that
if we have a well developed and tested state layer, many UI bugs can be
prevented as the software evolves, since the UI is partially or greatly derived
from the global state. See re-frame: What to Test?
39adca9367/docs/Testing.md (what-to-test).
See PR Introduce subscription tests
https://github.com/status-im/status-mobile/pull/14472, where I share more
details about re-frame's testing practices.
When we use rf/merge, we make unit testing events a perennial source of
frustration because too many responsibilities are aggregated in a single event.
Unfortunately, not many devs in the team have attempted to write unit tests for
events, so few can confirm my claim, but no worries, let's dive into a real
example.
In a unit test for an event, we want to test that, given a cofx and args, the
event handler returns the expected map of effects with the correct values
(usually db transformations).
Let's assume we need to test the following event. The code is still using the
combo rf/defn & rf/merge.
```clojure
(rf/defn accept-notification-success
  {:events [:activity-center.notifications/accept-success]}
  [{:keys [db] :as cofx} notification-id {:keys [chats]}]
  (when-let [notification (get-notification db notification-id)]
    (rf/merge cofx
              (chat.events/ensure-chats (map data-store.chats/<-rpc chats))
              (notifications-reconcile [(assoc notification :read true :accepted true)]))))
```
As you can see, we're "rf/merging" two other functions, namely ensure-chats and
notifications-reconcile. In fact, ensure-chats is not registered in re-frame,
but it's 99% defined as if it's one because it needs to be "mergeable" according
to the rules of rf/merge. Both of these "events" are quite complicated under the
hood and should be unit tested on their own.
Now here goes the unit test. Don't worry about the details, except for the expected output.
```clojure
(deftest accept-notification-success-test
  (testing "marks notification as accepted and read, then reconciles"
    (let [notif-1 {:id "0x1" :type types/private-group-chat}
          notif-2 {:id "0x2" :type types/private-group-chat}
          notif-2-accepted (assoc notif-2 :accepted true :read true)
          cofx {:db {:activity-center {:filter {:type types/no-type :status :all}
                                       :notifications [notif-2 notif-1]}}}
          expected {:db {:activity-center {:filter {:type 0 :status :all}
                                           :notifications [notif-2-accepted notif-1]}
                         :chats {}
                         :chats-home-list nil}
                    ;; *** HERE ***
                    :dispatch-n [[:activity-center.notifications/fetch-unread-count]
                                 [:activity-center.notifications/fetch-pending-contact-requests]]}
          actual (events/accept-notification-success cofx (:id notif-2) nil)]
      (is (= expected actual)))))
```
Notice the map has a :dispatch-n effect and other stuff inside of it that are
not the responsibility of the event under test to care about. This happens
because rf/merge forces the event handler to compute/call everything in one go.
And things get MUCH worse when you want to test an event A that uses rf/merge,
but A calls other events B and C that also use rf/merge (e.g. event
:profile.login/get-chats-callback). At that point you flip the table in horror
😱, but testing events and maintaining them should be trivial.
Solution: Use re-frame's `fx` effect.
Here's the improved implementation and its accompanying test.
```clojure
(defn accept-notification-success
  [{:keys [db]} [notification-id {:keys [chats]}]]
  (when-let [notification (get-notification db notification-id)]
    (let [new-notifications [(assoc notification :read true :accepted true)]]
      {:fx [[:dispatch [:chat/ensure-chats (map data-store.chats/<-rpc chats)]]
            [:dispatch [:activity-center.notifications/reconcile new-notifications]]]})))

(re-frame/reg-event-fx :activity-center.notifications/accept-success accept-notification-success)

(deftest accept-notification-success-test
  (testing "marks notification as accepted and read, then reconciles"
    (let [notif-1 {:id "0x1" :type types/private-group-chat}
          notif-2 {:id "0x2" :type types/private-group-chat}
          notif-2-accepted (assoc notif-2 :accepted true :read true)
          cofx {:db {:activity-center {:filter {:type types/no-type :status :all}
                                       :notifications [notif-2 notif-1]}}}
          ;; *** HERE ***
          expected {:fx [[:dispatch [:chat/ensure-chats []]]
                         [:dispatch [:activity-center.notifications/reconcile [notif-2-accepted]]]]}
          actual (events/accept-notification-success cofx [(:id notif-2) nil])]
      (is (= expected actual)))))
```
Notice how the test expectation is NOT verifying what other events do (it's
actually "impossible" now). Using fx completely decouples events and makes
testing them a joy again.
### Problem 6: Unordered effects
status-mobile still uses the legacy way to describe the effects map, which has
the problem that their order is unpredictable.
> Prior to v1.1.0, the answer is: no guarantees were provided about ordering.
> Actual order is an implementation detail upon which you should not rely.
>
> -- 39adca9367/docs/Effects.md (order-of-effects)
> In fact, with v1.1.0 best practice changed to event handlers should only
> return two effects :db and :fx, in which case :db was always done first and
> then :fx, and within :fx the ordering is sequential. This new approach is more
> about making it easier to compose event handlers from many smaller functions,
> but more specificity around ordering was a consequence.
>
> -- 39adca9367/docs/Effects.md (order-of-effects)
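
To make the difference concrete, here is the same set of effects written both
ways, as plain data. The `:keychain/save` effect and the `:profile/loaded`
event are hypothetical names used only for illustration.

```clojure
;; Legacy style: sibling keys in one map. Per the docs quoted above, no
;; ordering is guaranteed between them.
(def legacy-effects
  {:db            {:logged-in? true}
   :keychain/save {:key "token"}
   :dispatch      [:profile/loaded]})

;; v1.1.0+ style: :db is applied first, then the :fx entries run in sequence.
(def ordered-effects
  {:db {:logged-in? true}
   :fx [[:keychain/save {:key "token"}]
        [:dispatch [:profile/loaded]]]})

;; The order of the side effects is now visible in the data itself.
(mapv first (:fx ordered-effects))
;; => [:keychain/save :dispatch]
```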
### Problem 7: Usage of deprecated effect dispatch-n
We have 35 usages of dispatch-n, the majority in new code. The effect has been
officially deprecated in favor of multiple dispatch tuples in fx. See
39adca9367/docs/api-builtin-effects.md (L114)
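The rewrite is mechanical. As a rough sketch, the hypothetical helper below
(not part of re-frame or status-mobile) shows the transformation on an effects
map. It appends the new tuples after any existing :fx entries, which is an
assumption about the desired order.

```clojure
(defn dispatch-n->fx
  "Hypothetical migration helper: moves the events under :dispatch-n into
  :dispatch tuples appended to :fx, skipping nil events."
  [{:keys [dispatch-n] :as effects}]
  (let [tuples (mapv (fn [event] [:dispatch event])
                     (remove nil? dispatch-n))]
    (cond-> (dissoc effects :dispatch-n)
      (seq tuples) (update :fx (fnil into []) tuples))))

(dispatch-n->fx
 {:db         {}
  :dispatch-n [[:activity-center.notifications/fetch-unread-count]
               [:activity-center.notifications/fetch-pending-contact-requests]]})
;; => {:db {}
;;     :fx [[:dispatch [:activity-center.notifications/fetch-unread-count]]
;;          [:dispatch [:activity-center.notifications/fetch-pending-contact-requests]]]}
```

In practice the change would be made by hand in each handler. The helper only
documents the before and after shapes.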
### Problem 8: Complexity 🧙‍♂️
Have you ever tried to understand and/or explain how rf/merge and rf/defn work?
They have their fair share of complexity and have tripped up many contributors.
This is not ideal if we want to create a project where contributors can learn
re-frame as quickly as possible. Re-frame is already complicated enough to grasp
for many, so any added abstraction should be valuable enough to justify its
cost.
Interestingly, rf/merge is a stateful function, and although this is not a
problem in practice, it is partially violating re-frame's spirit of only using
pure functions inside event handlers.
### Problem 9: Using a wrapping macro rf/defn instead of global interceptors
When rf/defn was created inside status-mobile, re-frame didn't have global
interceptors yet (which were introduced 3+ years ago). We no longer have this
limitation after we upgraded our old re-frame version in PR
https://github.com/status-im/status-mobile/pull/15997.
Global interceptors are a simple and functional abstraction to specify functions
that should run on every event, for example, for debugging during development,
logging, etc. This PR already shows this is possible by removing the wrapping
function utils.re-frame/register-handler-fx without causing any breakage.
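As an illustration, a debugging interceptor could be sketched like this. The
`log-event` function and the `:debug/log-event` ID are hypothetical. The
registration uses re-frame's `reg-global-interceptor` and `->interceptor` and
is kept in a `comment` form so the function itself stays a plain, testable
function.

```clojure
(defn log-event
  "`:before` function of an interceptor: receives the context map, prints
  the ID of the event being handled, and returns the context unchanged."
  [context]
  (let [[event-id] (get-in context [:coeffects :event])]
    (println "Handling event" event-id)
    context))

(comment
  ;; Registered once, the interceptor runs for every event, so handlers no
  ;; longer need to be wrapped by a macro to get this behavior.
  (re-frame.core/reg-global-interceptor
   (re-frame.core/->interceptor
    :id     :debug/log-event
    :before log-event)))
```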
## Conclusion
By embracing re-frame's best practices for describing effects
39adca9367/docs/FAQs/BestPractice.md (use-the-fx-effect),
we can solve long standing issues that affect every contributor at different
levels and bring the following benefits:
- Simplify the codebase.
- Bring back the DX we all deserve, i.e. Clojure Language Server and clj-kondo
fully working in event namespaces.
- Greatly facilitate the testability of events.
- Give devs more flexibility to make the app more responsive, because the new
default would not coalesce events, which in turn, would block the UI thread
for shorter periods of time. At least that's the theory, but exceptions will
be found.
The actions to achieve those benefits are:
- Don't use the macro approach, replace rf/defn with
re-frame.core/reg-event-fx.
- Don't use rf/merge, simply use re-frame's built-in effect :fx.
- Don't call event handlers as normal functions, just as we don't directly call
subscription handlers. Use re-frame's built-in effect :fx.
## How do we refactor the remainder of the code?
Some numbers first:
- There are 228 events defined with rf/defn in src/status_im2/.
- There are 34 usages of rf/merge in src/status_im2/.
## Resources
- Release notes where fx was introduced in re-frame:
39adca9367/docs/releases/2020.md (110-2020-08-24)
* nix: upgrade zprint from 1.2.4 to 1.2.5
To address issue described in:
https://github.com/kkinnear/zprint/issues/273
Signed-off-by: Jakub Sokołowski <jakub@status.im>
* chore: use zprint :multi-lhs-hang
* refactor: re-format clojure using zprint 1.2.5
---------
Signed-off-by: Jakub Sokołowski <jakub@status.im>
Co-authored-by: yqrashawn <namy.19@gmail.com>