Changelog
Unreleased
Added
SkipDerivative exception: nodes can now raise SkipDerivative to signal that a source file is intentionally not processable by a given derivative, as distinct from an unexpected error. neurodags catches this, writes a .skip marker file alongside where the artifact would have been saved, and propagates the skip to all parent derivatives that depend on it (each writing its own .skip marker). Skipped derivatives are not retried on subsequent runs unless the .skip file is deleted or overwrite: true is set. Motivation: in multi-condition studies, some subjects may not have undergone every condition; without SkipDerivative, their missing conditions showed as missing in neurodags status, indistinguishable from derivatives that simply had not run yet, which made pipeline completion state ambiguous. (definitions.SkipDerivative, dag.run_derivative, neurodags.SkipDerivative)
neurodags status reports skipped derivatives: the status table now includes a skipped column alongside done, missing, and errored. Derivatives with a .skip marker are reported as skipped, not missing, so pipeline operators can distinguish “will never compute for this file” from “has not run yet”. The note at the bottom of the table explains what skipped means. JSON output (--format json) also includes the skipped count per derivative. (cli._status_classify, cli._cmd_status)
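The skip flow described above can be sketched in a few lines. This is a minimal illustration, not the neurodags implementation: the SkipDerivative class below is a local stand-in so the snippet runs without neurodags installed, extract_condition and run_node are hypothetical names, and the marker naming (artifact file name plus ".skip") is an assumption. Propagation of the skip to parent derivatives is not shown.

```python
import tempfile
from pathlib import Path


class SkipDerivative(Exception):
    """Local stand-in for neurodags.SkipDerivative (assumption: a plain Exception subclass)."""


def extract_condition(source: Path, condition: str, available: set) -> str:
    """Hypothetical node: raise SkipDerivative when the subject never did this condition."""
    if condition not in available:
        raise SkipDerivative(f"{source.name} has no '{condition}' recording")
    return f"{source.stem}:{condition}"


def run_node(node, artifact_path: Path, *args) -> str:
    """Toy runner: classify the outcome and write a .skip marker on an intentional skip."""
    # Assumed marker naming: the artifact file name with ".skip" appended.
    skip_marker = artifact_path.with_name(artifact_path.name + ".skip")
    if skip_marker.exists():
        return "skipped"  # not retried until the marker is deleted
    try:
        artifact_path.write_text(node(*args))
        return "done"
    except SkipDerivative as exc:
        skip_marker.write_text(str(exc))  # intentional skip, recorded on disk
        return "skipped"
    except Exception:
        return "errored"  # unexpected failure, reported separately from a skip


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "sub-01_condB.txt"
    # First run: the subject only has condA, so the node raises SkipDerivative.
    first = run_node(extract_condition, out, Path("sub-01.fif"), "condB", {"condA"})
    # Second run: the marker already exists, so the node is not called again.
    second = run_node(extract_condition, out, Path("sub-01.fif"), "condB", {"condA", "condB"})
    marker_written = (Path(tmp) / "sub-01_condB.txt.skip").exists()
    print(first, second, marker_written)
```

Keeping the skip on disk as a marker file, rather than in memory, is what lets a later status check tell "skipped" apart from "missing" without re-running the node.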
Fixed
Sub-derivative cache respected when parent has overwrite: True: previously, a derivative with overwrite: True forced cached_here = False for all of its sub-derivative inputs, causing them to re-execute even when they had overwrite: False and valid cached files on disk. The cache check now uses the child derivative’s own overwrite flag rather than the parent’s, so only the derivative that explicitly sets overwrite: True is recomputed. (dag.run_derivative)
{"cached": [...]} dict no longer leaks into node arguments: when a sub-derivative early-returns its internal {"cached": [path, ...]} sentinel (e.g. because it hit its own cache), the parent previously stored that dict raw in store[sid] and passed it on to downstream node functions as if it were a real value, causing AttributeError at runtime. The parent now resolves the cached dict to the matching path string before storing, so downstream nodes always receive a proper path or NodeResult. (dag.run_derivative)
Changed
neurodags status exit code: now exits 1 when any derivatives are missing or errored (previously only errored triggered a non-zero exit). Enables use in CI and shell dependency chains: neurodags status pipeline.yml || sbatch resubmit.sh.
neurodags status --format json: new flag emits machine-readable JSON with config, n_files, per-derivative counts, grand_total, and a complete boolean. Useful for scripted post-cluster checks and quota estimation.
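The exit-code rule above can be written as a one-line predicate over per-derivative counts. The sketch below is illustrative only: the function name exit_code, the derivative names, and the shape of the counts dictionary are hypothetical, and treating skipped files as not triggering a non-zero exit is an assumption drawn from the wording "missing or errored".

```python
def exit_code(counts_by_derivative: dict) -> int:
    """Return 1 if any derivative has missing or errored files, else 0."""
    incomplete = any(
        counts.get("missing", 0) > 0 or counts.get("errored", 0) > 0
        for counts in counts_by_derivative.values()
    )
    return 1 if incomplete else 0


# Hypothetical counts; skipped files are assumed not to affect the exit code.
counts = {
    "Epochs": {"done": 10, "missing": 0, "errored": 0, "skipped": 2},
    "Spectrum": {"done": 9, "missing": 1, "errored": 0, "skipped": 2},
}
code = exit_code(counts)
print(code)
```

With a rule of this shape, a shell chain using || only triggers the follow-up command while work is still outstanding.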
Changed
DAG HTML visualization uses ELK layout by default: Mermaid diagrams now use the ELK layout engine (orthogonal edge routing, crossing minimisation) instead of dagre with bezier curves. Significantly cleaner for dense pipelines. Use --layout dagre for offline use. The raw Mermaid text output (neurodags dag without --html) is unchanged.
neurodags count renamed to neurodags count-inputs: clarifies that the command counts source (input) files the pipeline will process, not output files or derivative instances. One input file may produce multiple output files depending on the derivatives. All generated SLURM templates, documentation, and tests updated accordingly.
Added
neurodags dag --layout: new flag for HTML DAG output selecting the layout engine. elk (default) uses orthogonal routing via ELK and requires CDN access. dagre uses right-angle step edges with no CDN dependency, which makes it suitable for offline environments. Also available as layout= in the Python API (pipeline_to_html, derivative_to_html, save_mermaid_html).
Dataset-level variables (vars:): dataset entries in datasets.yml can now declare a vars: block of arbitrary key-value pairs. Any pipeline node arg whose string value matches $identifier is substituted with the corresponding value from the active dataset entry’s vars at runtime, after id.N reference resolution. Only whole-string values are substituted; an embedded $ in paths or other strings is left untouched. Variables may be any YAML type (string, int, float, bool, list). Referencing an undefined variable raises KeyError with the list of available vars. Primary use case: encoding a condition name (or any dataset-specific parameter) in the dataset entry so that activating a different entry changes both derivatives_path and pipeline behaviour in one step, with no pipeline YAML edits required. (definitions.DatasetConfig.vars, dag._resolve_vars, dag._prep_kwargs)
In-memory multi-artifact selection: when a node returns a NodeResult with multiple artifacts (e.g. a splitter that produces one artifact per condition), downstream derivatives can now select a specific artifact using the existing dot-extension syntax (derivative: SplitterName.condA.fif), even when the splitter has not yet been written to disk. Previously this selection only worked for on-disk (cached) artifacts; the in-memory path passed the full NodeResult and relied on the _unwrap_for_arg heuristic, which returned the first matching artifact regardless of the requested suffix. The fix applies the same suffix filter to the in-memory NodeResult that was already applied to on-disk candidates, making both paths consistent. A warning is logged when the requested suffix is absent from the splitter’s artifacts. (dag.run_derivative)
Config snapshot on neurodags run: before executing any derivatives, the pipeline YAML, new_definitions file(s), and datasets YAML are copied to derivatives_path/code/. A neurodags_env.json file is also written with the installed neurodags version, git commit of the source repo (when installed from a checkout), and a UTC timestamp. Skipped on dry runs; failures are warnings, never errors. (orchestrators._snapshot_pipeline_config)
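The whole-string $identifier substitution described for dataset-level vars in this section can be sketched as follows. This is a minimal illustration rather than the code in dag._resolve_vars: the function name resolve_vars, the identifier grammar in the regular expression, and the example argument names are all assumptions.

```python
import re

# Assumed identifier grammar: a letter or underscore, then letters, digits, underscores.
_VAR = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def resolve_vars(args: dict, dataset_vars: dict) -> dict:
    """Replace args whose entire string value is $identifier with the dataset variable."""
    resolved = {}
    for key, value in args.items():
        # fullmatch means "raw/$condition/x" is NOT treated as a variable reference.
        match = _VAR.fullmatch(value) if isinstance(value, str) else None
        if match is None:
            resolved[key] = value
            continue
        name = match.group(1)
        if name not in dataset_vars:
            raise KeyError(f"undefined variable ${name}; available: {sorted(dataset_vars)}")
        resolved[key] = dataset_vars[name]  # keeps the variable's type (int, list, ...)
    return resolved


# Hypothetical node args and dataset vars.
node_args = {
    "condition": "$condition",          # whole-string reference: substituted
    "path": "raw/$condition/data.fif",  # embedded $: left untouched
    "n_jobs": "$n_jobs",                # substituted with an int
    "l_freq": 1.0,                      # non-string: passed through
}
dataset_vars = {"condition": "rest", "n_jobs": 4}
resolved = resolve_vars(node_args, dataset_vars)
print(resolved)
```

Restricting substitution to whole-string values keeps the rule unambiguous and lets a variable carry a non-string type straight into the node argument.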
0.1.0
Initial release of the template.