I am developing an isometric 2D exploration game. I have run into a problem: the game takes too much disk space. The game world is currently about 1 square kilometer and takes about 50MB.

I need to shrink it somehow. Should I compress it into an archive, or is there some kind of game file packing technique?

What about binary? Can someone explain the magic behind it? I heard that a lot of people use it, but when I tried it, it took the same amount of space as a simple .txt file.

I'm new to file formats, so I would be grateful for any ideas. Thanks.

Tom
  • What kind of data are you storing for your tiles now and in what format? If it's something like XML you're using too much space. Please give more details on your current method if you want suggestions for improvements, otherwise you'll have people suggesting things that aren't compatible with what you have, or things you already have. – House Jun 05 '13 at 15:52
  • Without more information about the kind of data on disk, it's going to be hard to give you any advice about how to shrink things down other than the generic "compression" one. And is 50 megs too big really? – Tetrad Jun 05 '13 at 15:53
  • Oh, sorry. Each line of my world chunk file looks like this: {xCoordinates yCoordinates xChunkCoordinates yChunkCoordinates TextureId isWalkable} – Tom Jun 05 '13 at 15:56
  • I calculated that with this file format, 10x10 kilometers would take about 3GB. That's really too much. – Tom Jun 05 '13 at 16:07
  • That's a little more help, but what are the datatypes? Int? Long? Byte? Is your system tile-based? etc. – XNargaHuntress Jun 05 '13 at 16:09
  • {xCoordinates yCoordinates xChunkCoordinates yChunkCoordinates TextureId isWalkable}

    xCoordinates/yCoordinates - int; xChunkCoordinates/yChunkCoordinates - int; TextureId - string; isWalkable - int.

    Yes, it's tile based.

    – Tom Jun 05 '13 at 16:12
  • Can you provide an example entry? – John McDonald Jun 05 '13 at 16:16
  • If tiles are the same size, and tiles are pre-defined as walkable or not, and chunks are made up of tiles: you could use standard tile mapping methods to cut your size down tremendously. Also, why aren't you using a boolean for isWalkable? – XNargaHuntress Jun 05 '13 at 16:19
  • Sure thing. You can find an example file here http://txtup.co/Yfga @XGundam05 Can you give me a tile mapping example? I do use a boolean. 0 means false and 1 means true :) – Tom Jun 05 '13 at 16:21
  • See Byte56's answer. Also, you defined isWalkable as an int type and not a boolean. An actual boolean should take up less memory (typically). – XNargaHuntress Jun 05 '13 at 16:31
  • I would really not worry about this. Modern hard drives are big, and they're fast. Modern CPUs are fast too. You're not likely to take very long to load and parse a save file written in a textual format like JSON; what will take a long time is preparing the level itself. (Loading images and sounds into memory, building a representation of the level, etc.) – Mason Wheeler Jun 05 '13 at 21:25
  • You're completely right, but wouldn't 5GB of disk space be a strange requirement for a non-AAA indie game? :D – Tom Jun 05 '13 at 22:08
  • Have you thought about compressing your maps with zlib? It could do better than a binary format since it would also compress based on repeating structures in your map. And you could keep all the advantages of a plain text map. – Kasper Jun 10 '13 at 23:46

2 Answers

Use the organization of the data to your benefit. You can always expect the data to be in the same order, so you know what the next bytes belong to. For example (not specific to your data), when reading in the data, always expect two bytes for tile type, two bytes for lighting information and then two bytes for extra info. The reader then knows that after 6 bytes, it's time to move on to the next tile. Don't store strings for your tile types; with two bytes you have many thousands of different types. It can take two bytes just to store one character, depending on the encoding.
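
As a rough illustration of the fixed 6-byte record described above, here is a minimal Python sketch (the asker uses C#, where `BinaryWriter` plays the same role). The field order, the little-endian layout, and the names `TILE_RECORD`, `pack_tiles` and `unpack_tiles` are assumptions made for this example, not part of the asker's format.

```python
import struct

# Hypothetical record layout: tile type, lighting, extra info,
# each an unsigned 16-bit integer ("<HHH" = little-endian, 3 x 2 bytes).
TILE_RECORD = struct.Struct("<HHH")


def pack_tiles(tiles):
    """Serialize (tile_type, lighting, extra) tuples into fixed 6-byte records."""
    return b"".join(TILE_RECORD.pack(*tile) for tile in tiles)


def unpack_tiles(data):
    """Read the records back; every 6 bytes is one tile, no separators needed."""
    return list(TILE_RECORD.iter_unpack(data))


tiles = [(2, 255, 0), (230, 128, 7), (65535, 0, 1)]
blob = pack_tiles(tiles)
print(len(blob))                    # 18: 3 tiles * 6 bytes each
print(unpack_tiles(blob) == tiles)  # True: the round trip is lossless
```

Because every record has the same size, the reader never needs delimiters or field names; the byte offset alone says which field it is looking at.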

Don't store position information; it should be implied by the tile order. Always store the tile information in chunks, one column at a time (or one row at a time). This allows you to know the position of the next tile without needing to read it from the byte stream. You read the starting position of the chunk, then the first tile is placed at that position. Then you know the second tile will be placed at the chunk position plus one in the Y direction (if storing column by column).
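
To make the implied-position idea concrete, here is a small Python sketch. It assumes a chunk stored column by column with a fixed, known column height; `CHUNK_HEIGHT` and `tile_position` are hypothetical names chosen for this example.

```python
CHUNK_HEIGHT = 4  # hypothetical number of tiles in one column of a chunk


def tile_position(chunk_x, chunk_y, index, height=CHUNK_HEIGHT):
    """Recover a tile's position from its index in a column-by-column stream."""
    column, row = divmod(index, height)
    return chunk_x + column, chunk_y + row


# The chunk's starting position is read once; everything else follows from order.
print(tile_position(10, 20, 0))  # (10, 20): first tile sits at the chunk origin
print(tile_position(10, 20, 1))  # (10, 21): next tile is one step in Y
print(tile_position(10, 20, 4))  # (11, 20): a new column starts after 4 tiles
```

Only the chunk origin is stored; the per-tile coordinates in the original text format cost several bytes per tile and carry no extra information.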

If your goal is to make the file smaller, you are probably going to have to give up human readability (which is what your current formatting looks like it was designed for).

As for defining tiles as numbers, you just create a "conversion chart" in your code:

Decimal : Name     : Byte Value
0       : None     : 0000 0000
1       : Dirt     : 0000 0001
2       : Grass    : 0000 0010
3       : Snow     : 0000 0011
...
230     : Lava     : 1110 0110

If you have at most 256 different types of tiles (dirt, grass, sand, etc.), which is likely, you should just write a single byte to store the value of the tile.

When you write the data, make sure you're writing in a binary format. If you're writing numbers as characters, you're not doing it right. When you read the data back in, you check the byte value against your chart and load the appropriate tile into the game.
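
The difference between writing numbers as characters and writing raw bytes can be sketched like this in Python. The chart entries come from the table above; the sample tile list and the names `TILE_NAMES`, `as_text`, `as_binary` and `loaded` are made up for the illustration.

```python
# Conversion chart from the table above (only the listed entries).
TILE_NAMES = {0: "None", 1: "Dirt", 2: "Grass", 3: "Snow", 230: "Lava"}

tile_ids = [2, 2, 1, 230, 3, 0]  # hypothetical run of six tiles

# Text: every digit and every separator becomes its own byte.
as_text = " ".join(str(t) for t in tile_ids).encode("ascii")

# Binary: each tile value (0-255) is stored directly as one byte.
as_binary = bytes(tile_ids)

print(len(as_text), len(as_binary))  # the text form is more than twice as long

# Reading back: iterating over bytes yields integers, which index the chart.
loaded = [TILE_NAMES[value] for value in as_binary]
print(loaded)
```

In C#, the equivalent of the binary branch would be passing a `byte` to `BinaryWriter.Write`, rather than a string.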

Look into how your language of choice can write to a byte stream, or write bytes directly. It's clear that you're writing text to your file because you have new lines for each tile. A binary format is probably going to be unreadable for your average text file editor. But your game can read it, so it doesn't matter.

House
  • How can you have many thousands of different types with only two bytes? – Tom Jun 05 '13 at 16:31
  • Each type is represented by a number. Two bytes can represent the numbers 0 to 65,535. One number for each type means many thousands of types. – House Jun 05 '13 at 16:34
  • I wrote 2 numbers into my .txt file and it takes 2 bytes already. How can 2 bytes represent the numbers from 0 to 65,535? I think I don't understand something about bytes. – Tom Jun 05 '13 at 16:54
  • @Tom 1 byte is 8 bits which can represent 256 values, so 2 bytes can represent 256*256 values, which is 65,536 values, 0-65,535 – Maik Semder Jun 05 '13 at 17:01
  • A simpler way to look at it: one byte isn't just 0-9. It can be 0-9a-zA-Z and there are special symbols as well. This will give you (10+26+26) different symbols. So, with two bytes, you have 62*62 different tile types. – Utkarsh Sinha Jun 05 '13 at 17:04
  • @Tom you aren't storing numbers in your text file, you're storing numbers represented by characters. A single character is a single byte. And the character '1' doesn't actually have the integer value of 1. – XNargaHuntress Jun 05 '13 at 17:22
  • I tried to modify my chunk file format. Now 1 square kilometer takes a little less and 10x10 kilometers takes 762MB. But I'm only saving the tileType number into it at this moment. Still too much space. Any other ideas, guys? – Tom Jun 05 '13 at 18:13
  • Well, in my test I tried using the maximum number that you defined. I set the type of every chunk tile to 65535. My file now looks like this http://justpaste.it/2sdo – Tom Jun 05 '13 at 18:17
  • I wrote a binary file using BinaryWriter and now 5 square km takes only 110MB. But it's still too much IMO, because I'm writing only the tile type. What if I had some dynamic data for a specific tile? In that case the size would increase. Here's my file example now https://www.dropbox.com/s/4iqg4v383v9spyy/World.bin – Tom Jun 05 '13 at 18:48
  • You're not writing it properly. If you look at your file in a hex editor, you can clearly see it's not in the correct format. http://en.webhex.net/view/49bbc981ff3dc74b389ad6db55605e0d/0 You may be using BinaryWriter, but you're using it to write text. – House Jun 05 '13 at 18:53
  • Can you explain to me what's wrong with the file format? Looks like I don't understand binary at all. – Tom Jun 05 '13 at 19:05
  • This is how I do it in C#. Maybe you have some ideas about what's wrong with it? string str = "444Hello World. How is it going?"; byte[] arr = System.Text.Encoding.ASCII.GetBytes(str); BinaryWriter writer = new BinaryWriter(File.Open("Data.bin", FileMode.Create)); writer.Write(str); writer.Close(); – Tom Jun 05 '13 at 19:12
  • Yes, the issue is you're converting it to ASCII encoding. Make a variable of type byte, set it equal to your tile value, and pass it to the BinaryWriter's regular Write method. Or you can use integers and 'Convert.ToByte(...)', but be sure you don't go larger than a byte can hold. Also, it'll help you to read up about binary and how bytes work. However, we're getting off the original topic now. So ask a new question if you still don't know how to do it. – House Jun 05 '13 at 19:25
  • At least for now I know where to go with world files. I have another question about data for NPCs, items, etc. There would be text inside them of course. What method should I use to make those files as small as possible? – Tom Jun 05 '13 at 19:52
  • As small as possible, and as small as reasonably possible are different things. As small as reasonably possible, I would just use the binary format, only store what's absolutely required. Beyond that, you're getting into compression algorithms and the like. – House Jun 05 '13 at 19:55

What Byte56 says makes sense.

I'm not sure why you need 2 separate coordinates (chunk coords and xy coords). But since a map is always rectangular, the most basic representation is a 2D array of ASCII chars:

..,,.....@
..,,@@@@@@
...,,,...@
......,..@

So that is the specification of a 4x10 map in 40 bytes. A 10,000 x 10,000 map would take 100,000,000 bytes (about 95MB), which isn't that bad. Because of the repetition, simple compression using zlib would probably reduce the size a lot.
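
Here is a rough Python sketch of both points: the size arithmetic and the zlib step. The sample rows are the ones shown above; repeating them 1000 times is only a stand-in for a large map, and real maps are less repetitive, so the ratio seen here says nothing about actual worlds.

```python
import zlib

# One byte per tile: 10,000 x 10,000 tiles.
map_bytes = 10_000 * 10_000
print(map_bytes / 2**20)  # size in MiB, roughly 95

rows = ["..,,.....@", "..,,@@@@@@", "...,,,...@", "......,..@"]
raw = "".join(rows).encode("ascii") * 1000  # artificial 40,000-byte "map"

compressed = zlib.compress(raw, 9)     # level 9 = best compression
restored = zlib.decompress(compressed)

print(len(raw), len(compressed))       # compressed is far smaller on this input
print(restored == raw)                 # True: zlib is lossless
```

The game would decompress a chunk when loading it and work on the raw bytes in memory, so the on-disk format stays simple.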

To make this work, you can merge the information of tile-type and walkability together. So @ is an impassable tree, for example, and . is (passable) grass, and , is (passable) dirt. The world space coordinates of each tile are implied, as each "chunk" here is a fixed size (say 1 world unit x 1 world unit). The bottom left corner of the world could be placed at 0,0.

Using single-byte characters means each tile can represent one of up to 256 different things. If you want 65,536 different things represented, you could use Unicode. If you want more, you could use UTF-32.

To add items and NPCs, you could use an item layer:

.......$$.
..t.......
....W.....
..........

Where . means empty and $ means money.

You could probably store the map size and some kind of "key" as to what each character means in the header of this map file format.
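
One possible shape for such a header, sketched in Python. The layout (width, height and key count as 16-bit integers, then one entry per symbol, then the tiles) and the names `write_map` and `read_map` are assumptions for this example; the answer does not prescribe a specific layout.

```python
import struct


def write_map(rows, key):
    """Header: width, height, key count (uint16 each), key entries, then tiles."""
    height, width = len(rows), len(rows[0])
    out = bytearray(struct.pack("<HHH", width, height, len(key)))
    for symbol, name in key.items():
        encoded = name.encode("utf-8")
        # One entry: the symbol byte, the name length, then the name itself.
        out += struct.pack("<cB", symbol.encode("ascii"), len(encoded)) + encoded
    out += "".join(rows).encode("ascii")
    return bytes(out)


def read_map(data):
    """Parse the header first, then slice the tile bytes back into rows."""
    width, height, count = struct.unpack_from("<HHH", data, 0)
    offset = 6
    key = {}
    for _ in range(count):
        symbol, length = struct.unpack_from("<cB", data, offset)
        offset += 2
        key[symbol.decode("ascii")] = data[offset:offset + length].decode("utf-8")
        offset += length
    body = data[offset:offset + width * height].decode("ascii")
    rows = [body[i:i + width] for i in range(0, len(body), width)]
    return rows, key


rows = ["..,,.....@", "..,,@@@@@@", "...,,,...@", "......,..@"]
key = {".": "grass", ",": "dirt", "@": "tree"}
data = write_map(rows, key)
print(read_map(data) == (rows, key))  # True: size and key survive the round trip
```

With the size in the header, the reader no longer needs newlines to find row boundaries, and the key lets tools interpret the file without hard-coding the symbols.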

bobobobo