I'm trying to write a function that takes a string and compresses its repeating blocks. My current implementation wraps every block in a count, so a single character like 'a' becomes '1(a)', which is longer than the input.

The code is something like this:

import re


def _format_so(bchars, brep):
    return '%i(%s)' % (brep, bchars) if bchars else ''


def char_rep(txt, _format=_format_so):
    output, lastend = [], 0

    for match in re.finditer(r"""(?ms)(?P<repeat>(?P<chars>.+?)(?:(?P=chars))+)""", txt):
        beginpos, endpos = match.span()
        repeat, chars = match.group('repeat'), match.group('chars')

        if lastend < beginpos:
            output.append(_format(txt[lastend:beginpos], 1))
        output.append(_format(chars, repeat.count(chars)))
        lastend = endpos
    output = ''.join(output) + _format(txt[lastend:], 1)
    return output


givenList = ['dwdawdawd', 'aaaaaaaaa', 'abcabcabca']
newList = []

for txt in givenList:
    output_so = char_rep(txt, _format=_format_so)

    newList.append(output_so)

print(newList)


Output = ['1(d)2(wda)1(wd)', '9(a)', '3(abc)1(a)']

I want the output to be as short as possible. For the example above, the output should be ['d2(wda)wd', '9(a)', '3(abc)a'].

What do you suggest as the best approach for solving this problem?
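
One possible starting point, sketched under assumptions rather than offered as a definitive solution: keep the same greedy regex scan and change only the formatter so that it emits whichever is shorter, the literal text or the count(chars) form. The name format_shortest is hypothetical and not part of the code above, and ties are resolved in favour of the counted form so that 'wdawda' still becomes '2(wda)'. This only picks the shorter encoding for each block the regex happens to find, so it does not guarantee a globally shortest result, and it assumes the input contains no digits or parentheses that would make decoding ambiguous.

```python
import re


def format_shortest(bchars, brep):
    """Hypothetical formatter: return the shorter of the literal text and 'count(chars)'."""
    if not bchars:
        return ''
    literal = bchars * brep
    counted = '%i(%s)' % (brep, bchars)
    # Ties go to the counted form (an assumption made to match the desired output).
    return counted if len(counted) <= len(literal) else literal


def char_rep(txt, _format=format_shortest):
    """Greedy left-to-right scan, same idea as in the question; only the formatter differs."""
    output, lastend = [], 0
    for match in re.finditer(r'(?s)(?P<chars>.+?)(?P=chars)+', txt):
        beginpos, endpos = match.span()
        chars = match.group('chars')
        if lastend < beginpos:
            # Text between two repeated blocks occurs once.
            output.append(_format(txt[lastend:beginpos], 1))
        # The whole match is `chars` repeated, so the count is a length ratio.
        output.append(_format(chars, (endpos - beginpos) // len(chars)))
        lastend = endpos
    output.append(_format(txt[lastend:], 1))
    return ''.join(output)


print([char_rep(t) for t in ['dwdawdawd', 'aaaaaaaaa', 'abcabcabca']])
```

With the three sample strings this prints ['d2(wda)wd', '9(a)', '3(abc)a']. A short repeat such as 'aa' is left as 'aa', because '2(a)' would be longer.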

  • Programming is off-topic here. – Yuval Filmus May 01 '20 at 10:47
  • Are you familiar with this question? – Yuval Filmus May 01 '20 at 10:47
  • "Smallest length possible" doesn't mean anything. For any given string $x$, you can construct a compression mechanism in which $x$ shrinks to the empty string. – Yuval Filmus May 01 '20 at 10:56
  • Welcome! It is nice to say "thank you in advance" in normal socializing. However, on all Stack Overflow sites, what actually says "thank you" is upvoting and accepting answers. Most "thank you" phrases in questions are considered unnecessary and distracting, so I removed that phrase from the question. – John L. May 01 '20 at 17:35
