Questions tagged [encoding-scheme]
140 questions
45 votes, 9 answers
Is UTF-8 the final character encoding for all future time?
It seems to me that Unicode is the "final" character encoding. I cannot imagine anything else replacing it at this point. I'm frankly confused about why UTF-16 and UTF-32 etc. exist at all, not to mention all the non-Unicode character encodings…

Timone
22 votes, 3 answers
Why is this code uniquely decodable?
Source alphabet: $\{a, b, c, d, e, f\}$
Code alphabet: $\{0, 1\}$
$a\colon 0101$
$b\colon 1001$
$c\colon 10$
$d\colon 000$
$e\colon 11$
$f\colon 100$
I thought that for a code to be uniquely decodable, it had to be prefix-free.
But in this code,…
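For what it's worth, unique decodability of a finite code can be checked mechanically with the Sardinas-Patterson test. The following is only a minimal sketch, assuming the codewords are distinct, non-empty strings; the function name `is_uniquely_decodable` and the helper `dangling` are made up for illustration.

```python
def is_uniquely_decodable(codewords):
    """Sardinas-Patterson test; assumes distinct, non-empty codewords."""
    code = set(codewords)

    def dangling(prefixes, words):
        # Suffixes left over when a string in `prefixes` is a proper
        # prefix of a string in `words`.
        return {w[len(p):] for p in prefixes for w in words
                if len(w) > len(p) and w.startswith(p)}

    current = dangling(code, code)
    seen = set()
    while current:
        if current & code:      # a dangling suffix is itself a codeword
            return False
        key = frozenset(current)
        if key in seen:         # the suffix sets repeat: no ambiguity will appear
            return True
        seen.add(key)
        current = dangling(code, current) | dangling(current, code)
    return True


code = {"a": "0101", "b": "1001", "c": "10", "d": "000", "e": "11", "f": "100"}
print(is_uniquely_decodable(code.values()))      # the code from the question
print(is_uniquely_decodable(["0", "01", "10"]))  # "010" parses as 0|10 or as 01|0
```

Run on the six codewords above, the test reports the code as uniquely decodable even though `10` is a prefix of `100` and `1001`. Being prefix-free is sufficient for unique decodability, not necessary.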

2000mroliver
6 votes, 2 answers
Can nested structures be encoded more "readably" with a single delimiter?
Imagine you have two systems of delimiting. One with paired delimiters, [ and ]:
[abc]
Then another system which uses a single interstitial delimiter, /:
a/b/c
It's easy to see how to encode structure in the first case, as they nest…
3 votes, 3 answers
Reconstructing files from binary
The popular view is that any kind of information is just a collection of bits, that is, zeroes and ones placed in a particular order. This got me thinking.
Suppose that I have some kind of file such as a text document, a PDF or anything…

Kurt
2 votes, 1 answer
Encoding two 6-bit positive integers compactly
Is it possible to encode two 6-bit wide positive integers a and b guaranteed to be in [0, 63] in a way that a and b are recoverable -- in fewer than 12 bits? We could obviously concatenate the binary representations (a << 6 | b). The scheme would…
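As a rough illustration of the concatenation scheme mentioned in the excerpt, here is a small sketch. The names `pack` and `unpack` are hypothetical, and it assumes nothing about `a` and `b` beyond the stated range [0, 63]; the truncated question may add constraints not visible here.

```python
def pack(a, b):
    """Concatenate two 6-bit values into one 12-bit word: (a << 6) | b."""
    if not (0 <= a <= 63 and 0 <= b <= 63):
        raise ValueError("both values must lie in [0, 63]")
    return (a << 6) | b


def unpack(word):
    """Recover (a, b) from a 12-bit word produced by pack()."""
    return word >> 6, word & 0x3F


pairs = [(a, b) for a in range(64) for b in range(64)]
packed = {pack(a, b) for a, b in pairs}

# Every pair round-trips, and every pair gets its own 12-bit word.
assert all(unpack(pack(a, b)) == (a, b) for a, b in pairs)
print(len(pairs), len(packed), max(packed).bit_length())
```

There are 64 × 64 = 4096 = 2^12 distinct pairs, so without further restrictions on `a` and `b` a fixed-length lossless encoding needs all 12 bits.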

user110001
1 vote, 2 answers
Understanding the binary and hexadecimal representations of UTF-8
Here is a code point: U+091D. The symbol it represents in UTF-8 is झ. In hex the symbol requires three bytes: e0 a4 9d. But it looks like the number 091D requires two bytes. So why do we need three bytes to encode the symbol? Probably due to the…
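A small sketch may make the byte layout concrete. A two-byte UTF-8 sequence carries only 11 payload bits (up to U+07FF), while 0x091D needs 12 bits, so the three-byte form `1110xxxx 10xxxxxx 10xxxxxx` is used. The helper `utf8_three_byte` below is a hypothetical name and only handles that three-byte range.

```python
def utf8_three_byte(code_point):
    """Hand-encode a code point in U+0800..U+FFFF (surrogates excluded)."""
    if not 0x0800 <= code_point <= 0xFFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("code point is outside the three-byte UTF-8 range")
    return bytes([
        0xE0 | (code_point >> 12),          # 1110xxxx: top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
        0x80 | (code_point & 0x3F),         # 10xxxxxx: low 6 bits
    ])


cp = 0x091D
encoded = utf8_three_byte(cp)
print(cp.bit_length())                       # payload bits the code point needs
print(" ".join(f"{b:02x}" for b in encoded)) # bytes produced by hand
print(encoded == chr(cp).encode("utf-8"))    # compare with Python's own encoder
```

The marker bits (`1110`, `10`, `10`) take up 8 of the 24 bits, which is why a 16-bit number ends up occupying three bytes.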

Maksim Dmitriev
1 vote, 0 answers
Base-N encoding with smallest output
I have a set of bytes (UTF-8) and need to encode them into the smallest dataset possible using a Base-N encoding scheme. Is it simply that the higher N is (i.e. something like Base85 encoding), the smaller the output will be, or is there…
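To get a feel for how the output size depends on N, here is a quick sketch using the encoders in Python's standard `base64` module. The 100-byte payload is an arbitrary stand-in, and the "ideal" column is just the information-theoretic count of output symbols, ignoring padding and block sizes.

```python
import base64
import math

data = bytes(range(100))  # hypothetical 100-byte payload

encoded_sizes = {
    16: len(base64.b16encode(data)),
    32: len(base64.b32encode(data)),
    64: len(base64.b64encode(data)),
    85: len(base64.b85encode(data)),
}

for base, size in encoded_sizes.items():
    # Each output symbol carries log2(base) bits of the 8 * len(data) input bits.
    ideal = len(data) * 8 / math.log2(base)
    print(base, size, round(ideal, 1))
```

The output shrinks as N grows, but only logarithmically, and it never drops below the input size when each output symbol is stored in one byte. Actually reducing the size would take compression before encoding.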

SamG101
0 votes, 0 answers
Algorithm for encoding segmentations in RGBA data
I'm trying to figure out a good way of encoding object segmentation data into a single RGBA image. I'd like to not only distinguish between classes, but also individuals within classes, and be able to have multiple classes and objects per pixel.
For…

waspinator
0 votes, 0 answers
What is bi-binary code?
This is used in the data bits transmitted by GLONASS satellites. I am sure it is different from other codes such as 'binary code', which is what the search engine turns up.

Aadishri
0 votes, 1 answer
Probability of certain symbols in a mapping of symbol and code word
Let's say you have this mapping of symbols and codewords:
$$
\begin{array}{cc}
\hline \text { Symbol } & \text { Codeword } \\
\hline \text { A } & 101 \\
\text { B } & 100 \\
\text { C } & 01 \\
\text { D } & 00 \\
\text { E } & 110 \\
\text { F }…

Rico1990
0 votes, 1 answer
Are all prefix codes uniquely decodable?
I can't think of any counterexample, but I can't find any such statement on the internet or in my textbook either. I know that for each uniquely decodable code, there exists a prefix code with the same average length.
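One way to see why no counterexample turns up: in a prefix code, no codeword is a prefix of another, so the first codeword of any encoded string is forced, and the same argument then applies to the rest of the string. The sketch below decodes greedily from left to right on that basis. The codebook and the name `decode_prefix_code` are hypothetical, chosen only for illustration.

```python
def decode_prefix_code(bits, codebook):
    """Greedy left-to-right decoding.

    `codebook` maps symbol -> codeword and is assumed to be prefix-free.
    """
    by_codeword = {word: symbol for symbol, word in codebook.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        # At most one codeword can match here, because no codeword
        # is a prefix of another.
        if buffer in by_codeword:
            decoded.append(by_codeword[buffer])
            buffer = ""
    if buffer:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(decoded)


codebook = {"A": "0", "B": "10", "C": "11"}  # hypothetical prefix code
message = "ABCAB"
bits = "".join(codebook[symbol] for symbol in message)
print(bits, decode_prefix_code(bits, codebook))
```

Since the decoder never has a choice to make, each encoded string has only one parse, which is what unique decodability means.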

Black Jack 21