Anyone working on decoding the replay file format?

Started by Pixie, February 27, 2010, 02:59:59 PM


newbiz

I'm working on a C++ library to handle SC2 replays.
So far I've reversed almost all of "replayinfo", but only the messages from "messageevents".

I'll try to provide a public SVN soon so that people can join & participate.

Pixie

Sounds great. Please let me know when it's ready. Thanks :)

zeeg

I will be posting the info on Sourcepeek.com as soon as I get time.

We've got a semi-working parser on Nibbits.com, but it appears that the file has to be read bit by bit (rather than byte by byte), so it's sort of complex.


Right now, we are able to read most of the file by bytes, up to right after the "Needed Modules", which are the 40-byte blobs that contain "s2ma", the realm ID, and the file hash. There's a chunk in here which requires a bit stream, but then it's fine again once you hit the map title. We ended up reverse-seeking as a quick solution on Nibbits.


Newbiz, let me know when you get that up; hit me up here or on IRC (dcramer[nibbits] on Rizon).
Got feedback regarding Nibbits.com? Look no further.

newbiz

I have the exact same problem here :/

At the end of the replay.info there are 5 chunks of 40 bytes, followed by a variable-length chunk that runs up to the name of the map.
I can't find any logical way to compute the length of this chunk.

Currently, I'm just using trial and error to guess which bytes might describe the length of the chunk.

Here are some dumps; I marked the bytes that should store the length, one way or another:


10 0E 02 06 08 01    64 02 06 15    80 24 2F 3F A6 AF 00                                  00 00 00 00
20 0C 02 06 08 03 01 64 02 06 15 01 80 24 2C 06 8F B4 00                                  00 00 00 00
10 0E 02 06 08 03 02 64 02 06 15 01 80 24 95 B6 1B A2 00                                  00 00 00 00
20 0C 02 06 08 01    64 02 06 15 01 81 24 A6 BF E3 22 00                                  00 00 00 00
20 2C 00 06 08 01    64 02 06 14 01 80 24 02 05 8F 00 C0 04 82 13 49 00 C0 04 94 B7 9E 32 00 00 00 00
20 2C 00 06 08 03 01 64 02 06 14 03 80 24 02 05 0F    C0 04 82 13 09    C0 84 0E FB 69 19 00 00 00 00
               ^^ ^^          ^^ ^^                   ^^          ^  ^^

By the way, it would be interesting to share our parsers. My only sources of information are http://code.google.com/p/vgce/source/browse/trunk/docs/Blizzard/Starcraft%20II/replays.txt (which is often wrong) and nibbits.com for downloading replays ;)

zeeg

Here's Nibbits' initial parser (incomplete).


It's failing on the List(Boolean) chunk at the moment, but the structure is accurate.


http://github.com/dcramer/nibbits-shared/blob/master/sc2/parsers/replays.py

newbiz

Has the replay file format changed recently due to patches?
My dumps from the first release are clearly different from the latest ones :/

newbiz

Here is a dump of the message events file. These are chat messages.
A message is easy to parse: it is basically a timestamp (relative to the previous message) followed by a player ID (starting at 1). Then comes the message string, preceded by its length.
The timestamp and player ID are terminated by a 0 byte (like C strings), since the timestamp has a variable length.

I can't figure out the global file header for now. It seems to have a fixed 40-byte header, and then something variable :/

------------- 2 players: TO ALL ---------------
      2C 01 00 02 68 69
                   h  i

      68 01 00 05 66 72 6F 6D 3F
                   f  r  o  m  ?

      F8 02 00 06 73 77 65 64 65 6E
                   s  w  e  d  e  n

      50 02 00 04 79 6F 75 3F
                   y  o  u  ?

   01 68 01 00 07 66 69 6E 6C 61 6E 64
                   f  i  n  l  a  n  d

   01 60 01 00 0B 6D 69 6E 64 2E 6B 69 76 76 69 3F
                   m  i  n  d  .  k  i  v  v  i  ?

   01 98 02 00 0E 6E 65 76 65 72 20 68 65 61 72 64 20 6F 66
                   n  e  v  e  r     h  e  a  r  d     o  f

   01 E7 01 00 04 67 6C 68 66
                   g  l  h  f

      E0 02 00 04 73 61 6D 65
                   s  a  m  e

02 44 BF 01 00 02 67 67
                   g  g

   01 54 02 00 02 67 67
                   g  g
               ^^ message size
            ^^ end of header
         ^^ player ID
   ^^ ^^ timestamp


------------- 4 players: TO ALL ---------------


                                    51 02 00 02 67 67
                                                 g  g
                                   
                                    8C 03 00 02 67 67
                                                 g  g
                                   
                                    C0 04 00 02 67 67
                                                 g  g

09 BD 03 83 00 32 8E 84 00 97 94 06 28 01 00 1E 68 6F 77 20 64 6F 20 79 6F 75 20 73 74 6F 70 20 6D 61 73 73 20 73 74 61 6C 6B 65 72 73 3F
                                                 h  o  w     d  o     y  o  u     s  t  o  p     m  a  s  s     s  t  a  l  k  e  r  s  ?

                                 05 18 03 00 12 6D 61 73 73 20 73 74 61 6C 6B 65 72 73 20 73 75 63 6B
                                                 m  a  s  s     s  t  a  l  k  e  r  s     s  u  c  k

                                    C8 04 00 15 69 64 6B 20 49 20 64 6F 6E 27 74 20 70 6C 61 79 20 74 6F 73 73
                                                 i  d  k     I     d  o  n  '  t     p  l  a  y     t  o  s  s

                                 09 0D 01 00 18 79 6F 75 20 74 68 69 6E 6B 20 73 74 61 6C 6B 65 72 73 20 73 75 63 6B 3F
                                                 y  o  u     t  h  i  n  k     s  t  a  l  k  e  r  s     s  u  c  k  ?
                                 
                                 05 1A 03 00 07 74 68 65 79 20 64 6F
                                                 t  h  e  y     d  o
                                             ^^ message size
                                          ^^ end of header
                                       ^^ player ID
^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ timestamp

chuanhsing

= Message Structure (replay.message.events) =


* Unknown Function, 7 bytes for each, always at header
** 0x0, maybe Frame ?
** unknown
** 0x80, maybe OPcode ?
** 0x0
** 0x0
** unknown
** unknown
* Message Function
** Accumulate Frames, 1-N bytes, Big Endian and Frame/64=Second
** Sender, 1 byte
** OPcode, 1 byte, 0x0 to all, 0x2 to alliance
** Message Len, 1 byte
** Message Content, n bytes from Message Len
* Blink Function (alt+g on Map)
** Accumulate Frames, 1-N bytes
** Player, 1 byte
** OPCode, 1 byte, 0x83
** X, 4 bytes
** Y, 4 bytes
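
As a rough illustration of the Message and Blink layouts listed above, here is a minimal Python sketch. It assumes the caller already knows how many bytes the frames field occupies (`frames_len`), because how to determine that length has not been worked out yet. `Record`, `parse_record` and `frames_len` are names made up for this example, and X/Y are kept as raw bytes since their byte order is not stated. The two sample records are taken from the dumps posted earlier in the thread.

```python
from dataclasses import dataclass

OP_TO_ALL = 0x00     # chat to everyone
OP_TO_ALLIES = 0x02  # chat to alliance
OP_BLINK = 0x83      # minimap ping (alt+g)


@dataclass
class Record:
    frames: int     # frames since the previous record (big-endian field)
    player: int
    opcode: int
    payload: bytes  # chat text, or the 8 raw X/Y bytes of a blink
    size: int       # number of bytes this record occupies


def parse_record(data: bytes, frames_len: int) -> Record:
    """Parse one record whose frames field is frames_len bytes long (assumed known)."""
    frames = int.from_bytes(data[:frames_len], "big")
    player = data[frames_len]
    opcode = data[frames_len + 1]
    body = frames_len + 2
    if opcode == OP_BLINK:
        end = body + 8               # X (4 bytes) + Y (4 bytes)
        payload = data[body:end]
    elif opcode in (OP_TO_ALL, OP_TO_ALLIES):
        end = body + 1 + data[body]  # 1 length byte + message content
        payload = data[body + 1:end]
    else:
        raise ValueError(f"unknown opcode 0x{opcode:02X}")
    return Record(frames, player, opcode, payload, end)


chat = parse_record(bytes.fromhex("01 68 01 00 07 66 69 6E 6C 61 6E 64"), 2)
blink = parse_record(bytes.fromhex("09 BD 03 83 00 32 8E 84 00 97 94 06"), 2)
print(chat.payload.decode("ascii"), chat.frames)
print(hex(blink.opcode), blink.payload.hex())
```

Returning `size` lets a caller step to the next record, but only once the length of the following frames field can be determined some other way.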

= Info Structure (replay.info) =


* Players, 1 byte, always 0x10
** Player Name Length, 1 byte
** Player Name, N bytes from Player Name Length
** Player Info, 5 bytes
* MapInfo
** Unknown, 4 bytes
** unkDLen, 1 byte, always 0x4
** unkDefault, unkDLen bytes, "Dflt"
** allianceLocked, 1 byte, (allianceLocked & 0x01) if alliances are locked
** Unknown, 1 byte
** Game Speed, 1 byte, 0 ~ 4, 4 is Faster
** Unknown, 11 bytes
** Map Cache Length, 1 byte, always 0x4B
** Map Cache Path, 0x4B bytes
* Unknown, 686 bytes
* S2MA * 5
** S2MA, 4 bytes, "s2ma"
** Zero, 1 byte
** BattleNet, 3 bytes, "KRB"=KR Battle.Net, "EUB", "USB"
** MD5?, 32 bytes
* Unknown, X bytes
* MapLen1, 1 byte
* MapLen2, 1 byte
* Map Name, MapLen1<<2+MapLen2 bytes
* Zero, 1 byte, always 0x0
* Players, 4 bytes, always 0x10
** Player Name Length, 1 byte
** Player Name, N bytes from Player Name Length
** Player Race Length, 1 byte
** Player Race, N bytes from Player Race Length
** Player Color Length, 1 byte
** Player Color, N bytes from Player Color Length
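
To make the S2MA and player entries above more concrete, here is a small Python sketch of how they could be read. It is only an illustration of the layout as listed: `parse_s2ma` and `read_prefixed` are hypothetical helper names, and the sample blob and player entry are made-up bytes, not taken from a real replay.

```python
BLOB_SIZE = 40  # "s2ma" (4) + zero (1) + region (3) + hash (32)


def parse_s2ma(blob: bytes):
    """Split one 40-byte blob into (region, hash bytes)."""
    if len(blob) != BLOB_SIZE or blob[:4] != b"s2ma":
        raise ValueError("not an s2ma blob")
    region = blob[5:8].decode("ascii")  # e.g. "EUB"; byte 4 is the zero byte
    return region, blob[8:40]


def read_prefixed(data: bytes, pos: int):
    """Read a 1-byte length and that many bytes; return (value, next position)."""
    length = data[pos]
    start = pos + 1
    return data[start:start + length], start + length


# Made-up sample data, for illustration only.
blob = b"s2ma" + b"\x00" + b"EUB" + bytes(range(32))
region, file_hash = parse_s2ma(blob)

entry = b"\x06newbiz\x04Zerg\x03Red"  # name, race, color (hypothetical values)
name, pos = read_prefixed(entry, 0)
race, pos = read_prefixed(entry, pos)
color, pos = read_prefixed(entry, pos)
print(region, file_hash.hex(), name, race, color)
```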

= Sync Structure (replay.sync.events) =

* 1 byte, always 4
* 2 bytes, unknown
* 1 byte, always 0 or 1


Elapsed Time (sec) = (filesize of replay.sync.events) / 64;
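
A tiny Python sketch of that formula, for anyone who wants to try it on an extracted file. The function name and the sample size are made up for illustration; for a real file the size could come from `os.path.getsize`.

```python
def elapsed_seconds(sync_size_bytes: int) -> float:
    """Game length in seconds, using the size/64 rule stated above."""
    return sync_size_bytes / 64


size = 76_800  # made-up size of replay.sync.events, in bytes
secs = elapsed_seconds(size)
print(f"{int(secs // 60)}:{int(secs % 60):02d}")  # 20:00
```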



= Game Structure (replay.game.events) =


Start OPcode
* 00 01 1B, player1 initial
* 00 02 1B, player2 initial
* 00 10 05, game start


Action OPcode
* Accumulate Frames, 1~N bytes, Big Endian
* Player, 1 byte, 0x21 or 0x61 -> player1
* OpCode, 1 byte


OPCodes
* 0x81, Move Camera
** X, 4 bytes
** Y, 4 bytes
** Hor, 4 bytes
** Ver, 4 bytes
** Unknown, 4 bytes
* 0x0B, Unit Action, like building, morph, research, upgrade, order, ability
* 0x3C,
* 0xAC, Select Unit
* 0x0D,
* 0x1D,

newbiz

Thank you very much! This is very valuable information ^^
Did you figure it out yourself, or are you part of a parser project? If not, you can join my effort at http://projects.coderbasement.com/projects/show/sc2replay


Btw, could you explain the messageevents header structure a little?
* Message Header, 7 bytes for each, pos 0 3 4 = 0x0, pos 2 = 0x80

I don't really understand this line. What do you mean by "pos 0 3 4 = 0x0, pos 2 = 0x80"?

Thanks again

newbiz

Okay thanks a lot.


So I end up with something like :

00 21 80 00 00 1D 00  // Header 0
00 21 80 00 00 20 00  // Header 1
00 21 80 00 00 24 00  // Header 2
00 21 80 00 00 26 01  // Header 3
00 21 80 00 00 2E 00  // Header 4
00 21 80 00 00 32 00  // Header 5


elapsed frames | player id | zero | msg length | msg
         05 2C |        01 |   00 |         02 |                                     68 69
            68 |        01 |   00 |         05 |                            66 72 6F 6D 3F
            F8 |        02 |   00 |         06 |                         73 77 65 64 65 6E
            50 |        02 |   00 |         04 |                               79 6F 75 3F
         01 68 |        01 |   00 |         07 |                      66 69 6E 6C 61 6E 64
         01 60 |        01 |   00 |         0B |          6D 69 6E 64 2E 6B 69 76 76 69 3F
         01 98 |        02 |   00 |         0E | 6E 65 76 65 72 20 68 65 61 72 64 20 6F 66
         01 E7 |        01 |   00 |         04 |                               67 6C 68 66
            E0 |        02 |   00 |         04 |                               73 61 6D 65
      02 44 BF |        01 |   00 |         02 |                                     67 67
         01 54 |        02 |   00 |         02 |                                     67 67



Any idea how to guess the number of headers at the beginning of the file ?

chuanhsing

Quote from: newbiz on March 04, 2010, 02:07:31 AM
Any idea how to guess the number of headers at the beginning of the file ?

A poor idea is:


// Skip the 7-byte headers: bytes 0, 3 and 4 are 0x00 and byte 2 is 0x80.
while (BMessage[HeadPointer] == 0 && BMessage[HeadPointer + 2] == 0x80 &&
       BMessage[HeadPointer + 3] == 0 && BMessage[HeadPointer + 4] == 0) {
    HeadPointer += sizeof(SC2MessageHead);
}
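
The same heuristic as a self-contained Python sketch, run against the six headers and the first "hi" message from the dump posted earlier. `count_headers` and `HEADER_SIZE` are names invented for this example, with 7 assumed as the size of `SC2MessageHead`. It is still only a guess: data after the headers that happens to match the 00 ?? 80 00 00 pattern would be miscounted as a header.

```python
HEADER_SIZE = 7  # assumed size of SC2MessageHead, per the 7-byte headers


def count_headers(data: bytes) -> int:
    """Count leading 7-byte records matching the 00 ?? 80 00 00 ?? ?? pattern."""
    pos = 0
    while (
        pos + HEADER_SIZE <= len(data)  # never read past the end of the buffer
        and data[pos] == 0x00
        and data[pos + 2] == 0x80
        and data[pos + 3] == 0x00
        and data[pos + 4] == 0x00
    ):
        pos += HEADER_SIZE
    return pos // HEADER_SIZE


sample = bytes.fromhex(
    "00218000001D00" "00218000002000" "00218000002400"
    "00218000002601" "00218000002E00" "00218000003200"
    "052C0100026869"  # first chat message ("hi") follows the headers
)
print(count_headers(sample))  # 6
```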

newbiz

lol ok ^^


If that's not considered harassment, would you mind explaining the "Accumulate Frames, 1-N bytes, Big Endian and Frame/64=Second" line?

I got a replay where the first message is a "gg" at the very end of the game, so the accumulated frames field is quite long (193 bytes). How should I interpret it?
As a sum of 64-bit big-endian integers storing elapsed frames?

chuanhsing


elapsed frames | player id | zero | msg length | msg
         05 2C |        01 |   00 |         02 |                                     68 69
            68 |        01 |   00 |         05 |                            66 72 6F 6D 3F

The second message was sent at (0x52c+0x68)/64 sec.


I don't have any frame field longer than 5 bytes. Maybe something is wrong.
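
The timing rule above, written out as a short Python sketch: each frames field is read as a big-endian integer, the values are summed, and the running total is divided by 64. `message_times` is a name made up for this example, and the two fields are the ones from the table.

```python
FRAMES_PER_SECOND = 64  # from "Frame/64=Second" in the structure notes


def message_times(frame_fields):
    """Yield the absolute time in seconds of each message.

    Each item is the raw big-endian "elapsed frames" field of one message,
    counted from the previous message.
    """
    total_frames = 0
    for field in frame_fields:
        total_frames += int.from_bytes(field, "big")
        yield total_frames / FRAMES_PER_SECOND


fields = [bytes.fromhex("052C"), bytes.fromhex("68")]
print(list(message_times(fields)))  # [20.6875, 22.3125]
```

So the first message lands at about 20.7 seconds and the second at about 22.3 seconds.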

newbiz

I attached the dump (renamed to .txt so that I can upload it here).


From what you wrote, I have the following headers:
00 22 80 00 00 1D 00
00 21 80 00 00 1D 00
00 22 80 00 00 32 00
00 21 80 00 00 32 00
00 23 80 00 00 1C 00
00 23 80 00 00 1E 01
00 23 80 00 00 25 00
00 23 80 00 00 27 01
00 23 80 00 00 2F 00
00 23 80 00 00 32 00



And then 193 (unknown?) bytes until the first message.
I have no clue how to interpret them.