
Compiling and Running a MATLAB Training Script on Databricks (Linux)

Scope

Compile a MATLAB deep-learning training script into a Linux standalone executable with MATLAB Compiler, package it with MATLAB Runtime in a custom Docker image, run it on a Databricks cluster, and persist checkpoints to DBFS. References provided inline.


Prerequisites

  • MATLAB + MATLAB Compiler on a Linux build machine. Compiled artifacts are OS-specific. Build Linux for Linux. (MathWorks)
  • Access to MATLAB Runtime of the same release as the compiler. (MathWorks)
  • Databricks workspace with permission to run clusters and specify a custom container image. (Databricks documentation)

Step 1: Refactor your training code for headless execution

Create an entry-point function (no GUI, no interactive figures). Save as train_model.m.

function exit_code = train_model(dataPath, outPath)
% dataPath: directory with prepared training/validation data
% outPath: where to write logs, checkpoints, final model

% Example skeleton using Deep Learning Toolbox API
% Load data
imdsTrain = imageDatastore(fullfile(dataPath,"train"), "IncludeSubfolders",true, "LabelSource","foldernames");
imdsVal   = imageDatastore(fullfile(dataPath,"val"),   "IncludeSubfolders",true, "LabelSource","foldernames");

% Define net and training options (no plots)
lgraph = layerGraph(alexnet); % placeholder (alexnet requires its support package); replace with your model
opts = trainingOptions("sgdm", ...
    "MaxEpochs",5, ...
    "MiniBatchSize",64, ...
    "ValidationData",imdsVal, ...
    "Verbose",true, ...
    "Plots","none", ...
    "OutputFcn",@(info)checkpointFcn(info,outPath));

% Train
net = trainNetwork(imdsTrain, lgraph, opts);

% Save final artifact
if ~isfolder(outPath); mkdir(outPath); end
save(fullfile(outPath,"final_model.mat"),"net","-v7.3");

exit_code = 0;
end

function stop = checkpointFcn(info,outPath)
stop = false;
if info.State == "iteration" && mod(info.Iteration,100)==0
    if ~isfolder(outPath); mkdir(outPath); end
    save(fullfile(outPath, sprintf("ckpt_iter_%06d.mat", info.Iteration)), "-struct", "info");
end
end

Notes: MATLAB Compiler supports packaging command-line programs; compiled apps run under MATLAB Runtime without an interactive desktop. (MathWorks)
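Before compiling, it can help to confirm the entry point runs headlessly end to end. A minimal sketch, assuming matlab is on PATH; the data and output paths are placeholders:

```shell
# Hypothetical pre-compile smoke test; /data/prepared and /tmp/smoke_out are placeholders.
smoke_test() {
  if command -v matlab >/dev/null 2>&1; then
    # -batch runs MATLAB without a desktop and propagates the exit status
    matlab -batch "rc = train_model('/data/prepared','/tmp/smoke_out'); exit(rc)"
  else
    echo "matlab not found; skipping smoke test"
  fi
}
smoke_test
```

Catching a GUI dependency or a hard-coded path here is much cheaper than discovering it after compilation and a Docker build.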


Step 2: Compile on a Linux build machine

Use mcc to create a Linux standalone. Example:

# From the folder containing train_model.m
mcc -m train_model.m -o train_model_cli

  • -m builds a command-line executable; -o names the output binary. See the mcc reference for other flags, including -a to package additional assets. (MathWorks)

The compiler also emits a helper script run_train_model_cli.sh that sets required environment variables before launching the binary with MATLAB Runtime. (MathWorks)
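Before moving on, a quick check that both compile artifacts exist can save a failed docker build later. A sketch; the file names follow the -o name used above:

```shell
# Verify the mcc outputs exist in a directory before building the image
check_artifacts() {
  local dir="$1" f
  for f in train_model_cli run_train_model_cli.sh; do
    [ -f "$dir/$f" ] || { echo "missing: $f"; return 1; }
  done
  echo "artifacts present"
}
```

Typical use from the build folder: check_artifacts . || exit 1, placed just before docker build.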


Step 3: Obtain and install MATLAB Runtime (Linux)

Download the matching Runtime and install silently:

# Example: place the MATLAB Runtime installer in /tmp/MATLAB_Runtime_R2024b_glnxa64
# Create a response file (installer control text)
cat >/tmp/mcr_silent.txt <<'EOF'
agreeToLicense=yes
destinationFolder=/opt/mcr
outputFile=/var/log/mcr_install.log
EOF

# Run installer silently
/tmp/MATLAB_Runtime_R2024b_glnxa64/install -mode silent -inputFile /tmp/mcr_silent.txt

  • Silent/noninteractive options are documented for MATLAB Runtime installers. (MathWorks)
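After the installer finishes, note the release folder it creates under the destination; the library paths in later steps depend on it. A small sketch for discovering it, assuming exactly one Runtime under the install root:

```shell
# Print the (single) release folder under a Runtime install root,
# e.g. a release name like R2024b for recent releases, or vNNN for older ones
runtime_release() { ls -1 "$1" | head -n 1; }
```

Typical use on the build machine or in the image: runtime_release /opt/mcr.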

Step 4: Build a Databricks-ready Docker image

Create a minimalist image that contains:

  • MATLAB Runtime (installed to /opt/mcr)
  • Your compiled app artifacts (train_model_cli, run_train_model_cli.sh)
  • Entrypoint convenience script

Dockerfile (example):

FROM ubuntu:22.04

# System deps
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates libxext6 libxrender1 libxt6 libxi6 libxrandr2 libxfixes3 libxcursor1 \
    unzip curl bash && \
    rm -rf /var/lib/apt/lists/*

# Copy MATLAB Runtime installer payload and response file
# Expect these to be staged next to Dockerfile
COPY MATLAB_Runtime_R2024b_glnxa64 /tmp/MATLAB_Runtime_R2024b_glnxa64
COPY mcr_silent.txt /tmp/mcr_silent.txt

# Install MATLAB Runtime silently
RUN /tmp/MATLAB_Runtime_R2024b_glnxa64/install -mode silent -inputFile /tmp/mcr_silent.txt && \
    rm -rf /tmp/MATLAB_Runtime_R2024b_glnxa64 /tmp/mcr_silent.txt

# App files
WORKDIR /opt/app
COPY train_model_cli .
COPY run_train_model_cli.sh .

# Helper: wrapper that sets LD_LIBRARY_PATH then launches the app
RUN printf '%s\n' \
'#!/usr/bin/env bash' \
'export MCRROOT=/opt/mcr' \
'export LD_LIBRARY_PATH=$MCRROOT/R2024b/runtime/glnxa64:$MCRROOT/R2024b/bin/glnxa64:$MCRROOT/R2024b/sys/os/glnxa64:$LD_LIBRARY_PATH' \
'exec /opt/app/train_model_cli "$@"' > /usr/local/bin/run_train && \
    chmod +x /usr/local/bin/run_train

# Default working dir for jobs
WORKDIR /workdir
ENTRYPOINT ["/usr/local/bin/run_train"]

  • Custom containers are the recommended way to standardize Databricks environments. (Databricks documentation)
  • Setting the MATLAB Runtime library path at run time is required; MathWorks documents LD_LIBRARY_PATH usage for deployment. Replace R2024b with the folder actually created under /opt/mcr: recent releases install into a release-named folder, while older releases used version folders such as v912. (MathWorks)
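The three library directories follow a fixed pattern under the Runtime root, so the path can be computed rather than hand-typed. A sketch; the root and release folder below are illustrative:

```shell
# Compose the LD_LIBRARY_PATH entries for a given install root and release folder
mcr_ldpath() {
  local root="$1" ver="$2"
  printf '%s\n' "$root/$ver/runtime/glnxa64:$root/$ver/bin/glnxa64:$root/$ver/sys/os/glnxa64"
}
mcr_ldpath /opt/mcr R2024b
```

Generating the string this way keeps the wrapper script and any debugging shells consistent when the Runtime release changes.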

Build and push:

docker build -t <registry>/<repo>/mcr-train:latest .
docker push <registry>/<repo>/mcr-train:latest

Databricks publishes reference container examples for guidance when adapting Dockerfiles. (GitHub)


Step 5: Create a Databricks cluster with this image

  • In cluster configuration, specify the custom Docker image <registry>/<repo>/mcr-train:latest.
  • Databricks Container Services docs cover setup, GPU notes, and init scripts if you need additional boot-time steps. (Databricks documentation)

If you must run shell configuration or mount commands at startup, use init scripts. (Databricks documentation, Microsoft Learn)


Step 6: Stage data and outputs on DBFS

Upload your prepared training data and pick an output directory:

# From your workstation using Databricks CLI (example)
databricks fs cp --recursive ./local_data dbfs:/mnt/data/myproject/train_data

Pass dataPath and outPath as driver-visible paths: in notebooks and shell sessions on the cluster, paths under /dbfs/... map to DBFS. (community.databricks.com)
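The mapping between a dbfs:/ URI and its driver-local path is a simple prefix swap, which can be scripted when wiring CLI uploads to job arguments. A sketch:

```shell
# Convert a dbfs:/ URI to the /dbfs/... path visible to Linux processes on the driver
to_local() { printf '%s\n' "$1" | sed 's|^dbfs:/|/dbfs/|'; }
to_local dbfs:/mnt/data/myproject/train_data   # -> /dbfs/mnt/data/myproject/train_data
```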


Step 7: Run the compiled training job on Databricks

From a Databricks notebook cell:

%sh
# Example invocation. run_train is the wrapper installed at /usr/local/bin in the image.
# Map DBFS paths through /dbfs for the Linux process.
DATA_PATH=/dbfs/mnt/data/myproject/train_data
OUT_PATH=/dbfs/mnt/outputs/myproject/run_001

mkdir -p "$OUT_PATH"
run_train "$DATA_PATH" "$OUT_PATH"

You can also open the Web Terminal on the cluster’s driver to run shell commands interactively. (Databricks documentation, Microsoft Learn, Databricks)
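Because checkpointFcn above writes zero-padded names (ckpt_iter_NNNNNN.mat), the files sort lexically, and the most recent checkpoint can be located with a plain sort. A sketch:

```shell
# Newest checkpoint in an output directory (names follow checkpointFcn above)
latest_ckpt() { ls -1 "$1"/ckpt_iter_*.mat 2>/dev/null | sort | tail -n 1; }
```

Typical use after a run, or before resuming: latest_ckpt "$OUT_PATH".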


GPU acceleration (optional)

Use a GPU-enabled Databricks runtime and GPU nodes; ensure the CUDA/NVIDIA driver stack is compatible with your MATLAB release and Runtime. Databricks documents running custom containers on GPU compute. (Databricks documentation)


Parallelism and multi-node considerations

Compiled MATLAB apps execute as single processes by default. Multi-node or Spark-integrated workflows require MATLAB Parallel Server and specific Spark/cluster configuration; this is outside the standard compiled-app pattern for Databricks. (MathWorks)


Troubleshooting checklist

  • Binary fails to start: verify Runtime version matches the build and LD_LIBRARY_PATH is set as documented for MATLAB Runtime. (MathWorks)
  • DBFS file access: invoke paths through /dbfs/... from shell; confirm permissions. (community.databricks.com)
  • Container boot customization: move repeated setup into init scripts. (Databricks documentation, Microsoft Learn)
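For the first item, running ldd against the compiled binary shows which shared libraries fail to resolve before LD_LIBRARY_PATH is set correctly. A sketch:

```shell
# List unresolved shared libraries for a binary (empty output = all resolved)
missing_libs() { ldd "$1" 2>/dev/null | grep 'not found' || true; }
```

Typical use inside the container: missing_libs /opt/app/train_model_cli, first with and then without the Runtime paths exported, to confirm the wrapper is doing its job.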

References

  • mcc reference (MATLAB Compiler). (MathWorks)
  • Standalone applications with MATLAB Compiler (platform-specific executables). (MathWorks)
  • Download and install MATLAB Runtime; silent/noninteractive options. (MathWorks)
  • Set MATLAB Runtime library path for deployment. (MathWorks)
  • Databricks custom containers (Docker) for clusters, including GPU notes. (Databricks documentation)
  • Databricks container examples (reference Dockerfiles). (GitHub)
  • Databricks init scripts documentation. (Databricks documentation, Microsoft Learn)
  • MATLAB Parallel Server on Spark/Databricks (advanced, optional). (MathWorks)