Argonne Leadership Computing Facility
Sophia
ALCF's NVIDIA-accelerated machine for AI workloads, used here for MLIP inference and fine-tuning.
Specifications
| Specification | Value |
|---|---|
| Architecture | x86_64 |
| GPU types | NVIDIA A100 40GB |
| Scheduler | PBS Pro |
| Operating institution | Argonne National Laboratory |
| Allocation model | ALCC and Director's Discretionary; INCITE for production work |
| Documentation | https://docs.alcf.anl.gov/sophia/ |
Available checkpoints
Live from Rootstock · 46 checkpoints across 18 environments
env ani_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| ani-1ccx | ANI | 2026-05-06 | ○ not recent |
| ani-1x | ANI | 2026-05-06 | ○ not recent |
| ani-2x | ANI | 2026-05-06 | ○ not recent |
env dimenet_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| dimenet-plus-plus-s2ef-oc20-200k | DimeNet++ | 2026-05-06 | ○ not recent |
| dimenet-plus-plus-s2ef-oc20-20m | DimeNet++ | 2026-05-06 | ○ not recent |
| dimenet-plus-plus-s2ef-oc20-2m | DimeNet++ | 2026-05-06 | ○ not recent |
| dimenet-plus-plus-s2ef-oc20-all | DimeNet++ | 2026-05-06 | ○ not recent |
env equiformer_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| equiformer-v2-153m-s2ef-oc20-all-md | EquiformerV2 | 2026-05-06 | ○ not recent |
| equiformer-v2-31m-s2ef-oc20-all-md | EquiformerV2 | 2026-05-06 | ○ not recent |
| equiformer-v2-83m-s2ef-oc20-2m | EquiformerV2 | 2026-05-06 | ○ not recent |
env escn_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| escn-l4-m2-lay12-s2ef-oc20-2m | eSCN | 2026-05-06 | ○ not recent |
| escn-l6-m2-lay12-s2ef-oc20-2m | eSCN | 2026-05-06 | ○ not recent |
| escn-l6-m2-lay12-s2ef-oc20-all-md | eSCN | 2026-05-06 | ○ not recent |
| escn-l6-m3-lay20-s2ef-oc20-all-md | eSCN | 2026-05-06 | ○ not recent |
env esen
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| esen-md-direct-all-omol | eSEN | 2026-05-06 | ○ not recent |
| esen-sm-conserving-all-omol | eSEN | 2026-05-06 | ○ not recent |
| esen-sm-direct-all-omol | eSEN | 2026-05-06 | ○ not recent |
env gemnet_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| gemnet-dt-s2ef-oc20-all | GemNet | 2026-05-07 | ○ not recent |
| gemnet-oc-large-s2ef-oc20-all-md | GemNet | — | ○ errored |
| gemnet-oc-s2ef-oc20-all | GemNet | — | ○ errored |
| gemnet-oc-s2ef-oc20-all-md | GemNet | — | ○ errored |
- gemnet-oc-large-s2ef-oc20-all-md: verify: ConnectionResetError: [Errno 104] Connection reset by peer
- gemnet-oc-s2ef-oc20-all: verify: ConnectionResetError: [Errno 104] Connection reset by peer
- gemnet-oc-s2ef-oc20-all-md: verify: ConnectionResetError: [Errno 104] Connection reset by peer
env m3gnet_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| m3gnet-mp-2021-2-8-pes | — no catalog entry | — | ○ errored |
- m3gnet-mp-2021-2-8-pes: verify: RuntimeError: Worker process died with code 1. stdout: `CHGNet v0.3.0 initialized with 412,525 parameters`. stderr traceback, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/m3gnet_env/`, Python 3.10, PBS job 153128): `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 45, `setup` (`return CHGNetCalculator(use_device=device)`) → `chgnet/model/dynamics.py` line 102, `__init__` → `chgnet/model/model.py` line 741, `load` (`model = model.to(device)`) → `torch/cuda/__init__.py` line 478, `_lazy_init` (`torch._C._cuda_init()`). Final error: RuntimeError: The NVIDIA driver on your system is too old (found version 12040).
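The `found version 12040` in the driver error follows the usual CUDA convention of encoding a version as 1000 × major + 10 × minor, so 12040 corresponds to a driver that supports up to CUDA 12.4. The sketch below decodes that integer and makes a coarse comparison against the CUDA version a PyTorch build targets. The function names are hypothetical (they are not part of Rootstock or PyTorch), and the CUDA version each environment's PyTorch was built against is not shown on this page, so the values in the usage lines are illustrative only.

```python
def decode_cuda_version(code: int) -> tuple[int, int]:
    """Split an encoded CUDA version such as 12040 into (major, minor)."""
    major, rest = divmod(code, 1000)
    return major, rest // 10


def is_driver_too_old(driver_code: int, build_cuda: str) -> bool:
    """Coarse check: is the driver's CUDA major version below the build's?

    Only major versions are compared, on the assumption that CUDA's
    minor-version compatibility generally lets a 12.x build run on an
    older 12.y driver. This is a diagnostic aid, not the check PyTorch
    itself performs.
    """
    build_major = int(build_cuda.split(".")[0])
    driver_major, _ = decode_cuda_version(driver_code)
    return driver_major < build_major


# Illustrative values: the driver code is from the error message,
# the build versions are assumptions.
print(decode_cuda_version(12040))
print(is_driver_too_old(12040, "13.0"))
print(is_driver_too_old(12040, "12.6"))
```

Inside one of the environments, `torch.version.cuda` reports the build's CUDA version as a string (or `None` for a CPU-only build) and could be passed as `build_cuda`.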
env mace
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| mace-mp-0-large | MACE-MP-0 | 2026-05-06 | ○ not recent |
| mace-mp-0-medium | MACE-MP-0 | 2026-05-06 | ○ not recent |
| mace-mp-0-small | MACE-MP-0 | 2026-05-06 | ○ not recent |
| mace-off23-large | MACE-OFF23 | 2026-05-06 | ○ not recent |
| mace-off23-medium | MACE-OFF23 | 2026-05-06 | ○ not recent |
| mace-off23-small | MACE-OFF23 | 2026-05-06 | ○ not recent |
env mattersim_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| mattersim-v1-0-0-1m | MatterSim | — | ○ errored |
| mattersim-v1-0-0-5m | MatterSim | — | ○ errored |
- mattersim-v1-0-0-1m: verify: RuntimeError: Worker process died with code 1. stdout (ANSI color codes stripped): `2026-05-06 23:16:56.251 | INFO | mattersim.forcefield.potential:from_checkpoint:873 - Loading the pre-trained mattersim-v1.0.0-1M.pth model`. stderr, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/mattersim_env/`, Python 3.10, PBS job 153128): UserWarning at `torch/__config__.py:9`: CUDA initialization: The NVIDIA driver on your system is too old (found version 12040). Traceback: `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 39, `setup` (`return MatterSimCalculator(load_path=CHECKPOINTS[checkpoint], device=device)`) → `mattersim/forcefield/potential.py` line 1161, `__init__` → line 892, `from_checkpoint` (`checkpoint = torch.load(load_path, map_location=device)`) → `torch/serialization.py` line 634, `_validate_device`. Final error: RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
- mattersim-v1-0-0-5m: verify: RuntimeError: Worker process died with code 1. stdout (ANSI color codes stripped): `2026-05-06 23:16:30.452 | INFO | mattersim.forcefield.potential:from_checkpoint:887 - Loading the pre-trained mattersim-v1.0.0-5M.pth model`. stderr, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/mattersim_env/`, Python 3.10, PBS job 153128): UserWarning at `torch/__config__.py:9`: CUDA initialization: The NVIDIA driver on your system is too old (found version 12040). Traceback: `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 39, `setup` (`return MatterSimCalculator(load_path=CHECKPOINTS[checkpoint], device=device)`) → `mattersim/forcefield/potential.py` line 1161, `__init__` → line 892, `from_checkpoint` (`checkpoint = torch.load(load_path, map_location=device)`) → `torch/serialization.py` line 634, `_validate_device`. Final error: RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
env orb
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| orb-v2 | Orb-v2 | — | ○ errored |
- orb-v2: verify: RuntimeError: Worker process died with code 1. stdout: empty. stderr, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/orb/`, Python 3.10, PBS job 153128): FutureWarning at `google/api_core/_python_version_support.py:273`: Python 3.10.19 will stop being supported by new releases of google.api_core once it reaches end of life (2026-10-04). Traceback: `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 24, `setup` (`orbff = load_fn(device=torch.device(device))`) → `orb_models/forcefield/pretrained.py` line 643, `orb_v2` → line 163, `orb_v2_architecture` (`model.cuda(device)`) → `torch/cuda/__init__.py` line 478, `_lazy_init` (`torch._C._cuda_init()`). Final error: RuntimeError: The NVIDIA driver on your system is too old (found version 12040).
env orb_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| orb-v3-conservative-inf-omat | Orb-v3 | — | ○ errored |
| orb-v3-direct-inf-omat | Orb-v3 | — | ○ errored |
- orb-v3-conservative-inf-omat: verify: RuntimeError: Worker process died with code 1. stdout: empty. stderr, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/orb_env/`, Python 3.10, PBS job 153128): FutureWarning at `google/api_core/_python_version_support.py:273`: Python 3.10.19 will stop being supported by new releases of google.api_core once it reaches end of life (2026-10-04). Traceback: `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 48, `setup` (`orbff = load_fn(device=torch.device(device))`) → `orb_models/forcefield/pretrained.py` line 435, `orb_v3_conservative_inf_omat` → line 238, `orb_v3_conservative_architecture` (`model.cuda(device)`) → `torch/cuda/__init__.py` line 478, `_lazy_init` (`torch._C._cuda_init()`). Final error: RuntimeError: The NVIDIA driver on your system is too old (found version 12040).
- orb-v3-direct-inf-omat: verify: RuntimeError: Worker process died with code 1. stdout: empty. stderr, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/orb_env/`, Python 3.10, PBS job 153128): FutureWarning at `google/api_core/_python_version_support.py:273`: Python 3.10.19 will stop being supported by new releases of google.api_core once it reaches end of life (2026-10-04). Traceback: `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 48, `setup` (`orbff = load_fn(device=torch.device(device))`) → `orb_models/forcefield/pretrained.py` line 473, `orb_v3_direct_inf_omat` → line 329, `orb_v3_direct_architecture` (`model.cuda(device)`) → `torch/cuda/__init__.py` line 478, `_lazy_init` (`torch._C._cuda_init()`). Final error: RuntimeError: The NVIDIA driver on your system is too old (found version 12040).
env painn_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| painn-s2ef-oc20-all | PaiNN | 2026-05-06 | ○ not recent |
env schnet_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| schnet-s2ef-oc20-200k | SchNet | — | ○ errored |
| schnet-s2ef-oc20-20m | SchNet | 2026-05-06 | ○ not recent |
| schnet-s2ef-oc20-2m | SchNet | 2026-05-06 | ○ not recent |
| schnet-s2ef-oc20-all | SchNet | 2026-05-06 | ○ not recent |
- schnet-s2ef-oc20-200k: verify: forces are all (near-)zero — model likely returned zeros
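The "forces are all (near-)zero" failure suggests a sanity check on the predicted forces. How Rootstock implements that check is not shown here; the following is only a minimal sketch of what such a test could look like, with a hypothetical function name and an assumed tolerance.

```python
from typing import Sequence


def forces_look_degenerate(
    forces: Sequence[Sequence[float]], tol: float = 1e-8
) -> bool:
    """Return True when every force component is smaller than `tol` in magnitude.

    `forces` holds one (fx, fy, fz) row per atom. An empty input also
    returns True, since there is nothing to show the model produced
    real output. The default tolerance is an assumption.
    """
    return all(abs(component) < tol for atom in forces for component in atom)


# A structure where the model returned zeros for every atom is flagged.
print(forces_look_degenerate([[0.0, 0.0, 0.0], [0.0, 1e-12, 0.0]]))
# Any component above the tolerance is enough to pass.
print(forces_look_degenerate([[0.0, 0.1, 0.0], [0.0, 0.0, 0.0]]))
```

A check like this is best run on a deliberately perturbed structure, because a perfectly relaxed or highly symmetric geometry can have genuinely tiny forces.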
env scn_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| scn-s2ef-oc20-2m | SCN | 2026-05-06 | ○ not recent |
| scn-s2ef-oc20-all-md | SCN | 2026-05-06 | ○ not recent |
| scn-t4-b2-s2ef-oc20-2m | SCN | 2026-05-06 | ○ not recent |
env tensornet
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| tensornet-matpes-pbe-2025-2 | TensorNet | — | ○ errored |
- tensornet-matpes-pbe-2025-2: verify: RuntimeError: Worker process died with code 1. stdout: empty. stderr, condensed (paths relative to `/lus/eagle/projects/Garden-Ai/rootstock/envs/tensornet/`, Python 3.11, PBS job 153128): UserWarning at `torch/__config__.py:9`: CUDA initialization: The NVIDIA driver on your system is too old (found version 12040). Four UserWarnings at `torch_geometric/__init__.py:4`: the `_version_cuda.so` libraries of `torch-scatter`, `torch-cluster`, `torch-spline-conv`, and `torch-sparse` could not be loaded, so their usage was disabled. Traceback: `rootstock/worker.py` line 273, `run_worker` → `env_source.py` line 62, `setup` (`from matgl.ext.ase import PESCalculator`) → `matgl/ext/ase.py` line 18 → `matgl/ext/_ase_pyg.py` line 38 → `matgl/graph/_converters_pyg.py` line 9 (`from torch_geometric.data import Data`) → `torch_geometric/__init__.py` line 22 → `torch_geometric/datasets/__init__.py` line 18 → `torch_geometric/datasets/qm9.py` line 22 (`conversion = torch.tensor([`) → `torch/utils/_device.py` line 116, `__torch_function__` → `torch/cuda/__init__.py` line 478, `_lazy_init` (`torch._C._cuda_init()`). Final error: RuntimeError: The NVIDIA driver on your system is too old (found version 12040).
env tensornet_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| tensornet-dgl-matpes-pbe-2025-2 | — no catalog entry | — | ○ errored |
| tensornet-matpes-r2scan-2025-2 | — no catalog entry | — | ○ errored |
- tensornet-dgl-matpes-pbe-2025-2: download: ModuleNotFoundError: No module named 'matgl'
- tensornet-matpes-r2scan-2025-2: download: ModuleNotFoundError: No module named 'matgl'
env uma
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| uma-s-1p1 | UMA | 2026-05-07 | ○ not recent |
env uma_env
built 2026-05-06
| Checkpoint | Page | Verified | Status |
|---|---|---|---|
| uma-m-1p1 | UMA | 2026-05-07 | ○ not recent |
Running on Sophia
Sophia uses PBS rather than Slurm, so `sbatch` examples must be translated into their `qsub` equivalents. Rootstock is preinstalled at `/soft/rootstock` on Sophia and exposes the same `rootstock` CLI as Della. Cluster-local model caches live under `/eagle/garden-cache`; set `GARDEN_CACHE` to that path, or run `module load rootstock` to pick it up automatically.
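As a rough aid for translating Slurm job headers, the sketch below rewrites a handful of common `#SBATCH` long options into PBS Pro directives. The mapping covers only generic options. Site-specific requirements on Sophia (queue names, GPU or filesystem resource requests) are not covered and should be taken from the ALCF documentation. Short options such as `-N` and Slurm's day-prefixed time format (`1-00:00:00`) are passed through or copied unchanged. The helper names are hypothetical.

```python
import re

# Generic Slurm long options and the PBS Pro directive each one maps to.
SLURM_TO_PBS = {
    "job-name": "-N {v}",
    "account": "-A {v}",
    "partition": "-q {v}",
    "time": "-l walltime={v}",
    "nodes": "-l select={v}",
}

# Matches "#SBATCH --key=value" and "#SBATCH --key value".
_DIRECTIVE = re.compile(r"^#SBATCH\s+--([a-z-]+)[=\s]+(\S+)\s*$")


def translate_line(line: str) -> str:
    """Translate one #SBATCH line; leave every other line untouched."""
    match = _DIRECTIVE.match(line)
    if match is None:
        return line
    key, value = match.groups()
    template = SLURM_TO_PBS.get(key)
    if template is None:
        # Keep unknown options visible so they can be handled by hand.
        return f"# TODO(no PBS mapping): {line}"
    return "#PBS " + template.format(v=value)


def translate_script(text: str) -> str:
    """Apply translate_line to every line of a job script."""
    return "\n".join(translate_line(line) for line in text.splitlines())


slurm_script = """#!/bin/bash
#SBATCH --job-name=mlip-infer
#SBATCH --nodes=1
#SBATCH --time=01:00:00
#SBATCH --gres=gpu:1
module load rootstock
"""
print(translate_script(slurm_script))
```

The translated script would then be submitted with `qsub` instead of `sbatch`; lines marked `TODO` need a manual, site-specific replacement.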
Sophia carries a wider set of environments than Della, including `orb_env` and `esen` in addition to the FAIR-Chem and MACE stacks, because it serves as the primary fine-tuning host for newer architectures.