I'm working with TeX files that were converted from other formats. One thing that shows up in them is horizontal lines in tables written like this:

\cline{1-1}\cline{2-2}...\cline{n-n}

This can be expressed in a compressed fashion as

\cline{1-n}

It is further complicated by the fact that multiple such ranges might appear in the same line, for example 1-5, 7-11 and 15-17 all might appear on the same line, which prevents me from just looking for the first and last ones and using those.

I attempted to compress these with query-replace-regexp, where I tried to replace \\cline{\([0-9]+\)-\1}\\cline{\,(+ 1 \#1)-\,(+ 1 \#1)} with \cline{\1-\,(+ 1 \#1)}, which did not work. Can this be done, and if so, how? I'd also prefer this as an Elisp macro so that I can easily reuse it.

Shylock

2 Answers

Here's some code to "compress" each "cline sequence" in the buffer. To use it do M-x compress-all-cline-sequences.

I may have gone a little overboard in generality: this also handles clines that are out of order or that overlap in complicated ways. For example, it will replace

\cline{1-3}\cline{9-13}\cline{11-15}\cline{5-7}\cline{4-4}

with

\cline{1-7}\cline{9-15}

Here's the code:

(defun compress-intervals (intervals)
  "Compress a list of intervals of integers given as a list of conses."
  (let ((compressed '())
        (left nil)
        (count 0))
    (dolist (pt (sort
                 (append
                  (mapcar (lambda (I) (cons (car I) 'start)) intervals)
                  (mapcar (lambda (I) (cons (1+ (cdr I)) 'end)) intervals))
                 (lambda (p q)
                   (or (< (car p) (car q))
                       (and (= (car p) (car q))
                            (eq (cdr p) 'start))))))
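      (pcase (cdr pt)
        ;; Plain `setq' is used instead of `incf'/`decf', which need the `cl' library.
        ('start (setq count (1+ count))
                (unless left (setq left (car pt))))
        ('end (setq count (1- count))
              (when (zerop count)
                (push (cons left (1- (car pt))) compressed)
                (setq left nil)))))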
    (nreverse compressed)))

(defun extract-cline-intervals (str)
  "Extract the intervals from a string of the form \\cline{a-b}\\cline{c-d}..."
  (let ((pos 0)
        (intervals '()))
    (save-match-data
      (while (string-match "\\\\cline{\\([0-9]+\\)-\\([0-9]+\\)}" str pos)
        (push (cons (string-to-number (match-string 1 str))
                    (string-to-number (match-string 2 str)))
              intervals)
        (setq pos (match-end 0))))
    (nreverse intervals)))

(defun compress-cline-sequence (str)
  "Compress a string of the form \\cline{a-b}\\cline{c-d}... to minimal form."
  (apply #'concat
         (mapcar (lambda (I) (format "\\cline{%d-%d}" (car I) (cdr I)))
                 (compress-intervals (extract-cline-intervals str)))))

(defun compress-all-cline-sequences ()
  "Find all sequences of consecutive clines in the buffer and compress each to minimal form."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "\\(\\\\cline{[0-9]+-[0-9]+}\\)+" nil t)
      (replace-match (compress-cline-sequence (match-string 0)) t t))))
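
For comparison, the merging step can also be written as a sort-and-merge pass instead of the start/end sweep used above. This is only a sketch, assuming the intervals are (START . END) conses of integers; my-merge-intervals is a made-up name and not part of the code above.

```elisp
;; Hypothetical helper for illustration: merge intervals by sorting on the
;; left endpoint and folding each one into the previous interval if they
;; overlap or are directly adjacent.
(defun my-merge-intervals (intervals)
  "Merge INTERVALS, a list of (START . END) conses of integers.
Intervals that overlap or touch (e.g. 1-3 and 4-4) are combined."
  (let ((sorted (sort (copy-sequence intervals)
                      (lambda (a b) (< (car a) (car b)))))
        (merged '()))
    (dolist (iv sorted)
      (if (and merged (<= (car iv) (1+ (cdr (car merged)))))
          ;; IV overlaps or touches the last merged interval: extend it.
          (setcdr (car merged) (max (cdr (car merged)) (cdr iv)))
        ;; Otherwise start a new interval (a fresh cons, so the input
        ;; list is left untouched).
        (push (cons (car iv) (cdr iv)) merged)))
    (nreverse merged)))

(my-merge-intervals '((1 . 3) (9 . 13) (11 . 15) (5 . 7) (4 . 4)))
;; => ((1 . 7) (9 . 15))
```

This gives the same result as the example at the top of the answer; the sweep over start and end points above is the more general formulation, while the sorted fold is a little shorter.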

Edit for Emacs 24.3:

As Lupino pointed out in the comments, on Emacs 24.3 the pcase patterns in compress-intervals have to be written with a backquote (`start and `end) instead of a quote, like this:

(defun compress-intervals (intervals)
  "Compress a list of intervals of integers given as a list of conses."
  (let ((compressed '())
        (left nil)
        (count 0))
    (dolist (pt (sort
                 (append
                  (mapcar (lambda (I) (cons (car I) 'start)) intervals)
                  (mapcar (lambda (I) (cons (1+ (cdr I)) 'end)) intervals))
                 (lambda (p q)
                   (or (< (car p) (car q))
                       (and (= (car p) (car q))
                            (eq (cdr p) 'start))))))
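      (pcase (cdr pt)
        ;; Plain `setq' is used instead of `incf'/`decf', which need the `cl' library.
        (`start (setq count (1+ count))
                (unless left (setq left (car pt))))
        (`end (setq count (1- count))
              (when (zerop count)
                (push (cons left (1- (car pt))) compressed)
                (setq left nil)))))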
    (nreverse compressed)))

(defun extract-cline-intervals (str)
  "Extract the intervals from a string of the form \\cline{a-b}\\cline{c-d}..."
  (let ((pos 0)
        (intervals '()))
    (save-match-data
      (while (string-match "\\\\cline{\\([0-9]+\\)-\\([0-9]+\\)}" str pos)
        (push (cons (string-to-number (match-string 1 str))
                    (string-to-number (match-string 2 str)))
              intervals)
        (setq pos (match-end 0))))
    (nreverse intervals)))

(defun compress-cline-sequence (str)
  "Compress a string of the form \\cline{a-b}\\cline{c-d}... to minimal form."
  (apply #'concat
         (mapcar (lambda (I) (format "\\cline{%d-%d}" (car I) (cdr I)))
                 (compress-intervals (extract-cline-intervals str)))))

(defun compress-all-cline-sequences ()
  "Find all sequences of consecutive clines in the buffer and compress each to minimal form."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "\\(\\\\cline{[0-9]+-[0-9]+}\\)+" nil t)
      (replace-match (compress-cline-sequence (match-string-no-properties 0)) t t))))
Shylock
Omar
  • First of all, thanks a lot, it looks really impressive! However, if I try to run it, I get Symbol's function definition is void: compress-cline-sequence. Any idea why this happens? I use a fairly old Emacs (24.3.1) on openSUSE if that helps. – Shylock Oct 11 '17 at 12:24
  • In 24.3 this only works if you replace 'start and 'end in (compress-intervals) by `start and `end respectively. (Backquotes are Stack Exchange's control symbol for inline code, dammit. How can you escape those chars?!?) – Lupino Oct 12 '17 at 12:52
  • Sorry, @Shylock, I've been away from Stack Exchange for a few days. I take it Lupino's change made it work for you? – Omar Oct 15 '17 at 02:06

Assuming all your \cline{...} statements are on a single line and you're interested in the range between the first and last numbers, you can use

\\cline{\([0-9]+\)-.*-\([0-9]+\)}$

to capture the first number after \cline{ in a capturing group (denoted by \(..\)), then skip to the last number before the final } and store that in a second capturing group. Here -.*- skips everything in between up to the last hyphen, so that the greedy .* cannot swallow leading digits of the last number, and }$ matches the } at the end of the line.

The replacement string then becomes

\\cline{\1-\2}

which inserts \cline{, then the string in the first capturing group (the first number), a -, the string in the second capturing group (the last number) and finally a }.


You can wrap this in a function using some extra backslash-escaping like this:

(defun compress-cline ()
  "Collapse a line-final run of \\cline commands into a single one."
  (interactive)
  (query-replace-regexp "\\\\cline{\\([0-9]+\\)-.*-\\([0-9]+\\)}$"
                        "\\\\cline{\\1-\\2}"))

which you can then bind to a key, or call using M-x compress-cline.
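
To try the pattern outside an interactive replace, here is a small sketch that applies it to a string with replace-regexp-in-string. The function name my-compress-cline-string is made up for this illustration, and the pattern anchors on the hyphens so that the greedy .* cannot swallow digits of the last number.

```elisp
;; Hypothetical helper for illustration only; it assumes the line ends in
;; one contiguous run of \cline commands that should become a single range.
(defun my-compress-cline-string (line)
  "Return LINE with a line-final run of \\cline{a-b} commands collapsed."
  (replace-regexp-in-string
   "\\\\cline{\\([0-9]+\\)-.*-\\([0-9]+\\)}$"
   "\\\\cline{\\1-\\2}"
   line))

(my-compress-cline-string "\\cline{1-1}\\cline{2-2}\\cline{10-10}")
;; => "\\cline{1-10}"
```

Without the hyphens in the pattern, the same input would come out as \cline{1-0}, because .* takes every digit it can and leaves only the final 0 for the second group. A line holding a single \cline{a-b} is left unchanged, since the pattern needs at least two hyphens to match.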

Arnot
  • Thanks, but I just realized that my question was oversimplifying the real issue. In fact it is usually multiple ranges, that's why this approach unfortunately won't work. I will edit the question accordingly. – Shylock Oct 04 '17 at 09:50