Compiling and Running a MATLAB Training Script on Databricks (Linux)
Scope
Compile a MATLAB deep-learning training script into a Linux standalone executable with MATLAB Compiler, package it with MATLAB Runtime in a custom Docker image, run it on a Databricks cluster, and persist checkpoints to DBFS. References provided inline.
Prerequisites
- MATLAB and MATLAB Compiler on a Linux build machine. Compiled artifacts are OS-specific, so build on Linux for a Linux target. (MathWorks)
- Access to MATLAB Runtime of the same release as the compiler. (MathWorks)
- Databricks workspace with permission to run clusters and specify a custom container image. (Databricks documentation)
Step 1: Refactor your training code for headless execution
Create an entry-point function (no GUI, no interactive figures). Save it as train_model.m.
function exit_code = train_model(dataPath, outPath)
% dataPath: directory with prepared training/validation data
% outPath:  where to write logs, checkpoints, and the final model
% Note: when launched as a compiled standalone, both arguments arrive
% from the command line as character vectors, which fullfile accepts.

    % Load data
    imdsTrain = imageDatastore(fullfile(dataPath,"train"), ...
        "IncludeSubfolders",true, "LabelSource","foldernames");
    imdsVal = imageDatastore(fullfile(dataPath,"val"), ...
        "IncludeSubfolders",true, "LabelSource","foldernames");

    % Define net and training options (no plots; headless-safe)
    lgraph = layerGraph(alexnet); % placeholder; replace with your model
    opts = trainingOptions("sgdm", ...
        "MaxEpochs",5, ...
        "MiniBatchSize",64, ...
        "ValidationData",imdsVal, ...
        "Verbose",true, ...
        "Plots","none", ...
        "OutputFcn",@(info)checkpointFcn(info,outPath));

    % Train
    net = trainNetwork(imdsTrain, lgraph, opts);

    % Save final artifact
    if ~isfolder(outPath); mkdir(outPath); end
    save(fullfile(outPath,"final_model.mat"),"net","-v7.3");
    exit_code = 0;
end

function stop = checkpointFcn(info,outPath)
    % Persist the training-progress struct every 100 iterations
    stop = false;
    if info.State == "iteration" && mod(info.Iteration,100) == 0
        if ~isfolder(outPath); mkdir(outPath); end
        save(fullfile(outPath, sprintf("ckpt_iter_%06d.mat", info.Iteration)), ...
            "-struct", "info");
    end
end
Notes: MATLAB Compiler supports packaging command-line programs; compiled apps run under MATLAB Runtime without an interactive desktop. (MathWorks)
Step 2: Compile on a Linux build machine
Use mcc to create a Linux standalone. The -m flag builds a command-line executable; -o names the binary. See the mcc reference for the full flag list, including -a for bundling assets such as data files. (MathWorks)
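A minimal invocation, assuming the entry point from Step 1 and the binary name train_model_cli used throughout this guide:
# Produce a standalone executable named train_model_cli from train_model.m
mcc -m train_model.m -o train_model_cli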
The compiler also emits a helper script, run_train_model_cli.sh, that sets required environment variables before launching the binary with MATLAB Runtime. (MathWorks)
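For a quick smoke test on the build machine, the generated script takes the MATLAB Runtime root as its first argument, followed by your app's arguments (a sketch; the Runtime root assumes the Step 3 install location, and the data paths are illustrative):
./run_train_model_cli.sh /opt/mcr/R2024b /data/train_data /data/outputs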
Step 3: Obtain and install MATLAB Runtime (Linux)
Download the matching Runtime and install silently:
# Example: place the MATLAB Runtime installer in /tmp/MATLAB_Runtime_R2024b_glnxa64
# Create a response file (installer control text)
cat >/tmp/mcr_silent.txt <<'EOF'
agreeToLicense=yes
destinationFolder=/opt/mcr
outputFile=/var/log/mcr_install.log
EOF
# Run installer silently
/tmp/MATLAB_Runtime_R2024b_glnxa64/install -mode silent -inputFile /tmp/mcr_silent.txt
- Silent/noninteractive options are documented for the MATLAB Runtime installer. With destinationFolder=/opt/mcr, current releases install into a release-named subfolder such as /opt/mcr/R2024b, which the wrapper in Step 4 relies on. (MathWorks)
Step 4: Build a Databricks-ready Docker image
Create a minimal image that contains:
- MATLAB Runtime (installed to /opt/mcr)
- Your compiled app artifacts (train_model_cli, run_train_model_cli.sh)
- An entrypoint convenience script
Dockerfile (example):
FROM ubuntu:22.04
# System deps
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates libxext6 libxrender1 libxt6 libxi6 libxrandr2 libxfixes3 libxcursor1 \
unzip curl bash && \
rm -rf /var/lib/apt/lists/*
# Copy MATLAB Runtime installer payload and response file
# Expect these to be staged next to Dockerfile
COPY MATLAB_Runtime_R2024b_glnxa64 /tmp/MATLAB_Runtime_R2024b_glnxa64
COPY mcr_silent.txt /tmp/mcr_silent.txt
# Install MATLAB Runtime silently
RUN /tmp/MATLAB_Runtime_R2024b_glnxa64/install -mode silent -inputFile /tmp/mcr_silent.txt && \
rm -rf /tmp/MATLAB_Runtime_R2024b_glnxa64 /tmp/mcr_silent.txt
# App files
WORKDIR /opt/app
COPY train_model_cli .
COPY run_train_model_cli.sh .
# Ensure exec bits regardless of how the files were staged
RUN chmod +x /opt/app/train_model_cli /opt/app/run_train_model_cli.sh
# Helper: wrapper that sets LD_LIBRARY_PATH then launches the app
RUN printf '%s\n' \
'#!/usr/bin/env bash' \
'export MCRROOT=/opt/mcr/R2024b' \
'export LD_LIBRARY_PATH=$MCRROOT/runtime/glnxa64:$MCRROOT/bin/glnxa64:$MCRROOT/sys/os/glnxa64:$LD_LIBRARY_PATH' \
'exec /opt/app/train_model_cli "$@"' > /usr/local/bin/run_train && \
chmod +x /usr/local/bin/run_train
# Default working dir for jobs
WORKDIR /workdir
ENTRYPOINT ["/usr/local/bin/run_train"]
- Custom containers are the recommended way to standardize Databricks environments. (Databricks documentation)
- Setting the MATLAB Runtime library path at run time is required; MathWorks documents LD_LIBRARY_PATH usage for deployment. Replace R2024b in the wrapper with the folder matching your Runtime release (releases before R2022b install into version folders such as v912). (MathWorks)
Build and push:
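A typical sequence, using the image name <registry>/<repo>/mcr-train:latest referenced in Step 5 (substitute your own registry and repository):
docker build -t <registry>/<repo>/mcr-train:latest .
docker push <registry>/<repo>/mcr-train:latest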
Databricks publishes reference container examples for guidance when adapting Dockerfiles. (GitHub)
Step 5: Create a Databricks cluster with this image
- In the cluster configuration, specify the custom Docker image <registry>/<repo>/mcr-train:latest; a sketch of the equivalent Clusters API payload follows below.
- Databricks Container Services docs cover setup, GPU notes, and init scripts if you need additional boot-time steps. (Databricks documentation)
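If you create the cluster via the API instead of the UI, the custom image goes in the docker_image field. A minimal sketch of the request payload, assuming basic-auth registry credentials; the spark_version, node_type_id, and credential values are placeholders to fill in for your workspace:
{
  "cluster_name": "mcr-train",
  "spark_version": "<runtime-version>",
  "node_type_id": "<node-type>",
  "num_workers": 0,
  "docker_image": {
    "url": "<registry>/<repo>/mcr-train:latest",
    "basic_auth": {
      "username": "<registry-user>",
      "password": "<registry-token>"
    }
  }
}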
If you must run shell configuration or mount commands at startup, use init scripts; a minimal sketch follows. (Databricks documentation, Microsoft Learn)
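A minimal cluster-scoped init script, assuming all you need is a scratch directory prepared on each node before the job starts (the directory path is illustrative):
#!/bin/bash
# Runs on every node at startup, before Spark services come up.
set -euo pipefail
mkdir -p /local_disk0/matlab_tmp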
Step 6: Stage data and outputs on DBFS
Upload your prepared training data and pick an output directory:
# From your workstation using Databricks CLI (example)
databricks fs cp -r ./local_data dbfs:/mnt/data/myproject/train_data
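To confirm the upload before launching a run, list the target directory with the same CLI (path as in the copy command above):
databricks fs ls dbfs:/mnt/data/myproject/train_data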
Pass dataPath and outPath as paths visible to the driver process. In notebooks and shell sessions on the cluster, paths under /dbfs/... map to DBFS. (community.databricks.com)
Step 7: Run the compiled training job on Databricks
From a Databricks notebook cell:
%sh
# Example invocation. The run_train wrapper baked into the image at
# /usr/local/bin/run_train is on PATH inside the container.
# Map DBFS paths through /dbfs for the Linux process.
DATA_PATH=/dbfs/mnt/data/myproject/train_data
OUT_PATH=/dbfs/mnt/outputs/myproject/run_001
mkdir -p "$OUT_PATH"
run_train "$DATA_PATH" "$OUT_PATH"
You can also open the web terminal on the cluster’s driver to run shell commands interactively. (Databricks documentation, Microsoft Learn, Databricks)
GPU acceleration (optional)
Use a GPU-enabled Databricks runtime and GPU nodes, and ensure the CUDA/NVIDIA driver stack is compatible with your MATLAB release and Runtime. Databricks documents running custom containers on GPU compute. (Databricks documentation)
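A quick way to confirm the driver stack is visible from inside the container (nvidia-smi ships with the NVIDIA driver; whether it is exposed in a custom container depends on your GPU image setup):
%sh
nvidia-smi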
Parallelism and multi-node considerations
Compiled MATLAB apps execute as single processes by default. Multi-node or Spark-integrated workflows require MATLAB Parallel Server and specific Spark/cluster configuration; this is outside the standard compiled-app pattern for Databricks. (MathWorks)
Troubleshooting checklist
- Binary fails to start: verify the Runtime version matches the build release and that LD_LIBRARY_PATH is set as documented for MATLAB Runtime; see the quick check below. (MathWorks)
- DBFS file access: invoke paths through /dbfs/... from shell; confirm permissions. (community.databricks.com)
- Container boot customization: move repeated setup into init scripts. (Databricks documentation, Microsoft Learn)
References
- mcc reference (MATLAB Compiler). (MathWorks)
- Standalone applications with MATLAB Compiler (platform-specific executables). (MathWorks)
- Download and install MATLAB Runtime; silent/noninteractive options. (MathWorks)
- Set MATLAB Runtime library path for deployment. (MathWorks)
- Databricks custom containers (Docker) for clusters, including GPU notes. (Databricks documentation)
- Databricks container examples (reference Dockerfiles). (GitHub)
- Databricks init scripts documentation. (Databricks documentation, Microsoft Learn)
- MATLAB Parallel Server on Spark/Databricks (advanced, optional). (MathWorks)