Reimplement edx-platform static asset processing#
Overview#
edx-platform has a complicated process for managing its static frontend assets. It slows down both developers and site operators.
We will deprecate the current paver-based asset processing system in favor of a new implementation based primarily on frontend tools and bash.
After one named release, the deprecated paver system will be removed.
Status#
Provisional
This was originally authored in March 2023. We modified it in July 2023 based on learnings from the implementation process.
The status will be moved to Accepted upon completion of reimplementation. Related work:
Context#
State of edx-platform frontends (early 2023)#
New Open edX frontend development has largely moved to React-based micro-frontends (MFEs). However, edx-platform still has a few categories of important static frontend assets:
| Name | Description | Example | Expected direction |
|---|---|---|---|
| Legacy LMS Frontends | JS, SCSS, and other resources powering LMS views that have not yet been replatformed into MFEs | Instructor Dashboard assets | Replatform & DEPR |
| Legacy CMS Frontends | JS, SCSS, and other resources powering Studio views that have not yet been replatformed into MFEs | Course outline editor and unit editor assets | Replatform & DEPR |
| Shared Frontend Files | JS modules, SCSS partials, and other resources, usable by both Legacy LMS and CMS Frontends. This includes a few vendor libraries that have been committed to edx-platform in their entirety. | Legacy cookie policy banner; CodeMirror | Remove as part of full LMS/CMS frontend replatforming |
| npm-installed Assets | JS modules and CSS files installed via NPM. Not committed to edx-platform. | React, studio-frontend, paragon | Uninstall as part of full LMS/CMS frontend replatforming |
| XModule Fragments | JS and SCSS belonging to the older XModule-style XBlocks defined in edx-platform | ProblemBlock (aka CAPA) assets | Convert to pure XBlock fragments |
| XBlock Fragments | JS and CSS belonging to the pure XBlocks defined in edx-platform | library_sourced_block.js | Keep and/or extract to pip-installed, per-XBlock repositories |
| pip-installed Assets | Pre-compiled static assets shipped with several Python libraries that we install, including XBlocks. Not committed to edx-platform. | Django Admin, Swagger, Drag-And-Drop XBlock V2 | Keep |
Note: this table excludes HTML templates. Templates are part of the frontend, but they are dynamically rendered by the Web application and therefore must be handled differently than static assets.
So, with the exception of XBlock fragments and pip-installed assets, which are very simple for edx-platform to handle, we plan to eventually remove all edx-platform static frontend assets. However, given the number of remaining edx-platform frontends and the speed at which they are currently being replatformed, estimates for completion of this process range from one to five years. Thus, in the medium term, we feel that timeboxed improvements to how edx-platform handles static assets are worthwhile, especially when they address an acute pain point.
Current pain points#
Three particular issues have surfaced in Developer Experience Working Group discussions recently, each with some mitigations involving static assets:
| Pain Point | Potential solution(s) |
|---|---|
| edx-platform Docker images are too large and/or take too long to build. | Switch from large, legacy tooling packages (such as libsass-python and paver) to industry-standard, pre-compiled ones (like node-sass or dart-sass). Remove unnecessary & slow calls to Django management commands. |
| edx-platform Docker image layers seem to be rebuilt more often than they should. | Remove all Python dependencies from the static asset build process, such that changes to Python code or requirements do not always have to result in a static asset rebuild. |
| In Tutor, using a local copy of edx-platform overwrites the Docker image’s pre-installed node_modules and pre-built static assets, requiring developers to reinstall & rebuild in order to get a working platform. | Better parameterize the input and output paths of the edx-platform asset build, such that it may search for node_modules outside of edx-platform and generate assets outside of edx-platform. |
All of these potential solutions would involve refactoring or entirely replacing parts of the current asset processing system.
Decision#
We will largely reimplement edx-platform’s asset processing system. We will aim to:
Use well-known, npm-installed frontend tooling wherever possible.
When bespoke processing is required, use standard POSIX tools like Bash.
When Python is absolutely required, minimize the scope of its usage and the set of required Python libraries.
Avoid unnecessary indirection or abstraction. For this task, extensibility is a non-goal, and simplicity is a virtue.
Provide a clear migration path from the old system to the new one.
Enable the future removal of as much legacy frontend tooling code as possible.
Consequences#
Reimplementation Specification#
Commands and stages#
The three top-level edx-platform asset processing actions are build, collect, and watch. The build action can be further broken down into five stages. Here is how those actions and stages will be reimplemented:
| Description | Old implementation | New implementation |
|---|---|---|
| Build: All stages. Compile, generate, copy, and otherwise process static assets so that they can be used by the Django webserver or collected elsewhere. For many Web applications, all static asset building would be coordinated via Webpack or another NPM-managed tool. Due to the age of edx-platform and its legacy XModule and Comprehensive Theming systems, though, there are five stages which need to be performed in a particular order. | A Python-defined task that calls out to each build stage. | Simple NPM wrappers around the build stages. The wrappers will be written in Bash and tested on both GNU+Linux and macOS. These commands are a “one stop shop” for building assets, but more efficiency-oriented users may choose to run build stages individually. |
| | Implemented in Python within update_assets. There is no standalone command for it. | An NPM post-install hook will automatically call scripts/copy-node-modules.sh, a pure Bash reimplementation of the node_modules asset copying, whenever |
| | Equivalent paver task and console script, both pointing to an application-level Python module. That module inspects attributes from legacy XModule-style XBlock classes in order to determine which static assets to copy and what to name them. | (step no longer needed) |
| | Python wrapper around a call to webpack. Invokes the | Simple shell script defined in package.json to invoke Webpack in prod or dev mode. The script will look for several environment variables, with a default defined for each one. See Build Configuration for details. The script will NOT invoke To continue using |
| | Paver task that invokes Note: We compile SCSS using | A functionally equivalent reimplementation, wrapped as an If and when we upgrade from libsass-python to a more modern tool like |
| | The management command is a wrapper around the paver task. The former looks up the list of theme search directories from Django settings and site configuration; the latter requires them to be supplied as arguments. | The management command will remain available, but it will be updated to point at |
| Collect the built static assets from edx-platform to another location (the | Paver task wrapping a call to the standard Django collectstatic command. It adds (This command also builds assets. The collect action could not be run on its own without calling pavelib’s Python interface.) | The standard Django interface will be used without a wrapper. The ignore patterns will be added to edx-platform’s staticfiles app configuration so that they do not need to be supplied as part of the command. |
| Watch static assets for changes in the background. When a change occurs, rebuild them automatically, so that the Django webserver picks up the changes. This is only necessary in development environments. A few different sets of assets may be watched: XModule fragments, Webpack assets, default SCSS, and theme SCSS. | Paver task that invokes | Bash wrappers around invocations of the watchdog library for themable/themed assets, and webpack --watch for Webpack-managed assets. Both of these tools are available via dependencies that are already installed into edx-platform. We considered using watchman, a popular file-watching library maintained by Meta, but found that the Python release of the library is poorly maintained (latest release 2017) and the documentation is difficult to follow. Django uses pywatchman but is planning to migrate off of it and onto watchfiles. We considered watchfiles, but decided against adding another developer dependency to edx-platform. Future developers could consider migrating to watchfiles if it seemed worthwhile. |
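The ordering requirement described in the Build row can be sketched as a small wrapper that runs each stage in sequence and stops at the first failure. This is only an illustration under stated assumptions: `npm run webpack` appears elsewhere in this ADR, but the other two script names and the `DRY_RUN` switch are hypothetical and are not the names the real wrappers use.

```shell
#!/usr/bin/env bash
# Sketch of a build wrapper that runs stages in a fixed order.
# ASSUMPTION: "example-compile-sass" and "example-compile-themes" are
# hypothetical npm script names; DRY_RUN is a hypothetical switch.
set -eu

# Order matters: theme SCSS is compiled after default SCSS, which follows Webpack.
STAGES="webpack example-compile-sass example-compile-themes"

run_stage() {
    local stage="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default here): only report what would be executed.
        echo "would run: npm run ${stage}"
    else
        npm run "${stage}"
    fi
}

build_all() {
    local stage
    for stage in $STAGES; do
        # With `set -e`, a failing stage aborts the remaining ones.
        run_stage "${stage}"
    done
}

build_all
```

Running the sketch with `DRY_RUN=0` would invoke the npm scripts for real, assuming scripts with those names were defined in package.json.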
Build Configuration#
To facilitate a generally Python-free build reimplementation, we will require that certain Django settings now be specified as environment variables, which can be passed to the build like so:
MY_ENV_VAR="my value" npm run build # Set for the whole build.
MY_ENV_VAR="my value" npm run webpack # Set for just a single step, like webpack.
For Docker-based distributions like Tutor, these environment variables can instead be set in the Dockerfile.
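As a rough sketch of how a build step might read such variables while falling back to defaults, consider the following. The variable names `EXAMPLE_STATIC_ROOT` and `EXAMPLE_NODE_ENV` and their default values are hypothetical placeholders chosen for illustration; they are not the variables this ADR defines.

```shell
#!/usr/bin/env bash
# Sketch: resolve build settings from environment variables, with defaults.
# ASSUMPTION: EXAMPLE_STATIC_ROOT and EXAMPLE_NODE_ENV are hypothetical
# names, not the variables defined by this ADR.
set -eu

resolve_settings() {
    # ${VAR:-default} expands to the default when VAR is unset or empty,
    # so the build works with no configuration at all.
    local static_root="${EXAMPLE_STATIC_ROOT:-./static-out}"
    local node_env="${EXAMPLE_NODE_ENV:-production}"
    echo "static_root=${static_root} node_env=${node_env}"
}

resolve_settings
```

Because every variable has a default, an operator only needs to set the values that differ in their environment, either inline on the command line or once in a Dockerfile.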
Some of these options will remain as Django settings because they are used in edx-platform application code. Others will be removed, as they were only read by the asset build.
| Django Setting (Before) | Description | Django Setting (After) | Environment Variable (After) |
|---|---|---|---|
| | Path to Webpack config file. Defaults to | removed | |
| | Path to which LMS’s static assets will be collected. Defaults to | | |
| | Path to which CMS’s static assets will be collected. Defaults to | | |
| | Global configuration object available to edx-platform JS modules. Specified as a JSON string. Defaults to the empty object ( | removed | |
| | Directories that will be searched when compiling themes. | | |
Migration#
We will communicate the deprecation of the old asset system upon provisional acceptance of this ADR.
The old and new systems will both be available for at least one named release. Operators will be encouraged to try the new asset processing system and report any issues they find. The old asset system will print deprecation warnings, recommending equivalent new commands to operators. Eventually, the old asset processing system will be entirely removed.
Tutor migration guide#
Tutor provides the openedx-assets Python script on its edx-platform images for building, collecting, and watching assets. The script uses a mix of its own implementation and calls out to edx-platform’s paver tasks, avoiding the most troublesome parts of the paver tasks. The script and its interface were the inspiration for the new build-assets.sh that this ADR describes.
As a consequence of this ADR, Tutor will need to either:
reimplement the script as a thin wrapper around the new asset processing commands, or
deprecate and remove the script.
Either way, the migration path is straightforward:
Existing Tutor-provided command |
New upstream command |
---|---|
|
|
|
|
|
(no longer needed) |
|
|
|
|
|
|
|
|
|
|
The options accepted by openedx-assets will all be valid inputs to scripts/build-assets.sh.
non-Tutor migration guide#
Operators using distributions other than Tutor should refer to the upstream edx-platform changes described above in Reimplementation Specification, and adapt them to their distribution.
See also#
OpenCraft has also performed a discovery on a modernized system for static assets for XBlocks in xmodule. Its scope overlaps with this ADR’s in a way that makes it great supplemental reading.
Rejected Alternatives#
Live with the problem#
We could avoid committing any work to edx-platform asset tooling, and instead just wait until all frontends have been replatformed into MFEs. See the Context section above for why this was rejected.
Improve existing system#
Rather than replace it, we could try to improve the existing Paver-based asset processing system. However, entirely dropping Paver and mostly dropping Python has promising benefits:
Asset build independence#
When building a container image, we want to be able to build static assets without first copying any Python code or requirements lists from edx-platform into the build context. That way, only changes to system requirements, npm requirements, or the assets themselves would trigger an asset rebuild.
Encouraging simplicity#
The asset pipeline only needs to perform a handful of simple tasks, primarily copying files and invoking shell commands. It does NOT need to be extensible, as we do not want new frontend features to be added to the edx-platform repository. On the contrary, simplicity and obviousness of implementation are virtues. Bash is particularly suited to this sort of script.
However, Python (like any modern application language) encourages developers to modularize, build abstractions, use clever control flow, and employ indirection. This is particularly noticeable with the Paver assets build, which is a thousand lines long and difficult to understand.
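To illustrate the kind of script this section has in mind, here is a minimal sketch of a file-copying helper in plain shell. The function name, the directory layout, and the file list are all hypothetical; this is not the content of scripts/copy-node-modules.sh, only an example of how little code such a task needs.

```shell
#!/usr/bin/env bash
# Sketch: copy selected vendor files out of node_modules into a static dir.
# ASSUMPTION: copy_vendor_files and every path passed to it are hypothetical.
set -eu

copy_vendor_files() {
    local src_root="$1"   # e.g. a node_modules directory
    local dest_dir="$2"   # e.g. a vendor directory under the static assets
    shift 2
    local rel_path
    mkdir -p "${dest_dir}"
    for rel_path in "$@"; do
        # Fail loudly if an expected file is missing instead of skipping it.
        if [ ! -f "${src_root}/${rel_path}" ]; then
            echo "missing: ${src_root}/${rel_path}" >&2
            return 1
        fi
        cp "${src_root}/${rel_path}" "${dest_dir}/"
    done
}
```

A call such as `copy_vendor_files node_modules static/vendor pkg/dist/pkg.min.js` (paths again hypothetical) would copy one file and return a non-zero status if it were absent.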
Better interop with standard tools#
It is best if the build can stem from a single call to npm install && npm run build rather than a call to a bespoke script (whether Paver or Bash). Generally speaking, the more edx-platform can work with standard frontend tooling, the easier it’ll be for folks to use, understand, and maintain it.