Zwillingssterns Weltenwald
Published on Zwillingssterns Weltenwald (http://www.zwillingsstern.de)

Org-mode with Parallel Babel

Update 2017: In recent versions of Emacs, a block containing sem -j ... seems to block until all subtasks are done. It would be great if someone could figure out why (though it is likely the right thing to do). As a workaround, you can send the job to the background inside sem, but that might have unwanted side effects: sem "[job] &"

Table of Contents

  • 1. Babel in Org
  • 2. GNU Parallel to the rescue! Process-pool made easy.
  • 3. A word of caution: Shell escapes
  • 4. Summary

Babel in Org

Emacs [1] Org-mode [2] provides the wonderful Babel capability: you can include code blocks in many languages directly in plain-text Org-mode documents and execute them from there.

By default, though, running such code freezes my Emacs until the code has finished.

Until a few weeks ago, I worked around this with a custom function which spawns a new Emacs as a script runner for the specific code block:

; Execute babel source blocks asynchronously by just opening a new emacs.
(defun bab/org-babel-execute-src-block-new-emacs ()
  "Execute the current source block in a separate emacs,
so we do not block the current emacs."
  (interactive)
  (let ((line (line-number-at-pos))
        (file (buffer-file-name)))
    (async-shell-command (concat 
                          "TERM=vt200 emacs -nw --find-file " 
                          (shell-quote-argument file) ; survive whitespace in the path
                          " --eval '(goto-line "
                          (number-to-string line) 
                          ")' --eval "
     "'(let ((org-confirm-babel-evaluate nil))(org-babel-execute-src-block t))' "
                          "--eval '(kill-emacs 0)'"))))

and its companion for exporting to beamer-latex presentation pdf:

; Export as pdf asynchronously by just opening a new emacs.
(defun bab/org-beamer-export-new-emacs ()
  "Export the current file in a separate emacs,
so we do not block the current emacs."
  (interactive)
  (let ((line (line-number-at-pos))
        (file (buffer-file-name)))
    (async-shell-command (concat 
                          "TERM=vt200 emacs -nw --find-file " 
                          (shell-quote-argument file) ; survive whitespace in the path
                          " --eval '(goto-line " 
                          (number-to-string line) 
                          ")' --eval "
     "'(let ((org-confirm-babel-evaluate nil))(org-beamer-export-to-pdf))' "
                          "--eval '(kill-emacs 0)'"))))

But for shell-scripts there’s a much simpler alternative:

GNU Parallel to the rescue! Process-pool made easy.

Instead of spawning an external process, I can just use GNU Parallel [3] for the long-running program-calls in the shell-code. For example like this (real code-block):

#+BEGIN_SRC sh :exports none
  oldPWD=$(pwd)
  cd ~/tm5tools/plotting
  filename="./obsheat-increasing.png"
  sem -j -1 ./plotstation.py -c ~/sun-work/ct-production-out-5x7e300m1.0 -C "aircraft" -c ~/sun-work/ct-production-out-5x7e300m1.0no-aircraft -C "continuous"  --obsheat --station allnoaa --title "\"Reducing observation coverage\"" -o ${oldPWD}/${filename}
  cd -
#+END_SRC

Let me explain this.

sem is the part of GNU parallel which makes parallel execution easy (it is shorthand for parallel --semaphore). Essentially it gives us a simple version of the convenience we know from make -j.

for i in {1..100}; do 
    sem -j -1 [code] # run N-1 processes with N as the number of
                     # processors in my computer
done

This means that the above org-mode block will finish instantly, but there will be a second process managed by GNU parallel which executes the plotting script.

The big advantage here is that I can also set this to execute on exporting a document which might run hundreds of code-blocks. If I did this with naive multiprocessing, that would spawn 100 processes which overwhelm the memory of my system (yes, I did that…).

sem -j -1 ensures that this does not happen. Essentially it provides a process pool in which it executes the code.
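To make the contrast concrete, here is a small sketch of the naive approach that sem replaces. The job body (sleep 1) and the job count of 8 are placeholders chosen for illustration, not the plotting script from above. Every job is started with & right away, so all of them run at the same time:

```shell
#!/bin/sh
# Naive multiprocessing: background every job immediately with "&".
# "sleep 1" is a placeholder for a long-running program call.
jobs_total=8
start=$(date +%s)

i=1
while [ "$i" -le "$jobs_total" ]; do
    sleep 1 &          # nothing limits how many of these run at once
    i=$((i + 1))
done
wait                   # block until every background job has exited

end=$(date +%s)
elapsed=$((end - start))
echo "ran $jobs_total one-second jobs in about ${elapsed}s"
```

The total time stays far below eight seconds because the jobs overlap. With hundreds of memory-hungry jobs, this unlimited overlap is what exhausts the system; sem instead holds back new jobs until a slot in the pool is free.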

If you use this on export, take care to add a final code-block which waits until all other blocks have finished:

sem --wait
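Put together, queueing jobs and waiting for them could look like the following sketch. It assumes GNU Parallel is installed; the semaphore name passed to --id, the fixed pool size of 2 (instead of -j -1) and the sleep jobs are made up for this example so that it behaves the same on every machine:

```shell
#!/bin/sh
# Sketch of a sem-managed process pool. Assumes GNU Parallel is installed;
# the semaphore name, the job count and the "sleep 1" job are made up.
log=$(mktemp)
pool="babel-demo-$$"        # hypothetical semaphore name, unique per run

if command -v sem >/dev/null 2>&1; then
    for i in 1 2 3 4; do
        # At most 2 jobs run at once; sem blocks here while the pool is full
        # and otherwise returns as soon as the job runs in the background.
        sem --id "$pool" -j 2 "sleep 1; echo job $i done >> $log"
    done
    sem --id "$pool" --wait  # the final block: wait for all queued jobs
    status="finished"
else
    status="skipped"         # sem is not available on this machine
fi

finished_jobs=$(wc -l < "$log" | tr -d ' ')
echo "$status: $finished_jobs jobs wrote to the log"
rm -f "$log"
```

Using a dedicated --id keeps the waiting step from being held up by unrelated sem jobs that share the default semaphore. In an Org document, the loop body would live in the individual code-blocks and only the --wait call in the final one.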

A word of caution: Shell escapes

If you use GNU parallel to run programs, the arguments are interpreted twice: once by the shell when you pass them to sem, and a second time when sem passes them on for execution. Because of this, you have to add escaped quotation marks around every string which contains whitespace. This can look like the following code (the example above reduced to its essential parts):

sem -j -1 ./plotstation.py --title "\"Reducing observation coverage\""

I stumbled over this a few times, but the convenience of GNU parallel is worth the small extra caution.
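The double interpretation can be reproduced without GNU parallel. In the following sketch, the function reparse is a hypothetical stand-in that only mimics the second parsing step (it is not how sem is implemented), and the helper script just prints the arguments it finally receives:

```shell
#!/bin/sh
# Hypothetical stand-in for sem (NOT its real implementation): join all
# arguments with spaces and let a second shell parse the result again.
reparse() {
    sh -c "$*"
}

# Small helper script that prints one received argument per line.
helper=$(mktemp)
printf '%s\n' '#!/bin/sh' 'for arg in "$@"; do echo "arg: $arg"; done' > "$helper"

# Plain quotes are consumed by the first shell, so the second shell
# splits the title into three separate words.
unescaped=$(reparse sh "$helper" --title "Reducing observation coverage")

# Escaped inner quotes survive the first pass and keep the title
# together in the second pass.
escaped=$(reparse sh "$helper" --title "\"Reducing observation coverage\"")

echo "without escaping:"; echo "$unescaped"
echo "with escaping:";    echo "$escaped"
rm -f "$helper"
```

Without the escaped quotes the helper sees four arguments (--title plus three words); with them it sees two, which is what plotstation.py needs to receive the title as a single string.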

As an aside: for easier editing of source code inside Org documents, set org-src-fontify-natively to true (t), either via M-x customize-variable or by adding the following to your .emacs:

(setq org-src-fontify-natively t)

Summary

With the tool sem from GNU parallel you get parallel execution of shell code-blocks in emacs org-mode using the familiar syntax from make:

sem -j -1 [escaped code]

Works by Arne Babenhauserheide. Licensed, unless stated otherwise, under the GPLv3 or later and other free licenses.


Source URL: http://www.zwillingsstern.de/english/emacs/parallel-babel

Links:
[1] http://gnu.org/s/emacs
[2] http://orgmode.org
[3] http://gnu.org/s/parallel