Tapes created in 3.3.0.1 (AMD64) misfire in 3.3.1.2


Post by filbo »

I have on my system two RnD binaries. One was provided by Canonical / Ubuntu in rocksndiamonds_3.3.0.1+dfsg1-2_amd64.deb (available from https://launchpad.net/ubuntu/oneiric/am ... .1+dfsg1-2). I built the other myself from source. Both are 64-bit (AMD64) binaries.

Some tapes originally created under that 3.3.0.1 binary play back incorrectly on 3.3.1.2.

I have many level tapes and have only checked a few. The ones I was just looking at are levels 002, 003 and 004 of levels/Contributions 1995 - 2006/Contributions_2005/rnd_martijn_mooij_iii/.

For these three tapes I carefully verified that they play back correctly under 3.3.0.1, and fail under 3.3.1.2. Everything else is identical -- both binaries are compiled to use the same directory hierarchies; they are picking up the same level & tape files. I just ran one binary, played the tapes, exited, ran the other, played tapes.

Holger, I understand that 3.3.0.1 was not yet intentionally ported to 64-bit, and 3.3.1.2 is. There was a bug about different random number initialization, but I thought that was fixed in 3.3.1.2. Maybe the problem is that my saves are from the not-64-bit-fixed random number initialization?

Is there a way for me to convert these bad tapes automatically? Or even better, could RnD somehow recognize the problem at load time and automatically correct them?

I attach my files .rocksndiamonds/tapes/rnd_martijn_mooij_iii/00[234].tape:
- 002.tape (11.11 KiB): level 2 of rnd_martijn_mooij_iii -- works on 3.3.0.1 AMD64, dies at time=199 on 3.3.1.2 AMD64
- 003.tape (7.86 KiB): level 3 of rnd_martijn_mooij_iii -- works on 3.3.0.1 AMD64, dies at time=301 on 3.3.1.2 AMD64
- 004.tape (11.77 KiB): level 4 of rnd_martijn_mooij_iii -- works on 3.3.0.1 AMD64, dies at time=148 on 3.3.1.2 AMD64 (maybe starts acting up around time=120?)

I believe I have hundreds of these, so I'm looking for an automated solution... 8)

Thanks! Thanks in general for a wonderful game, and in specific, in advance, for fixing this :D

Post by Holger »

Hi filbo,

Just a quick reply on this, as from my point of view it is somewhat "urgent": broken tapes are one of the worst things that can happen in R'n'D. (And yes, I'm also thinking of a still-unfixed bug in the quickload/quicksave tape system, although it is not related to this problem.)

Yes, right: tapes recorded with a 64-bit binary of a pre-64-bit R'n'D version end up broken, because the random seed stored in those tape files differs from the one that results when seeding is done "right". This leads to two questions:

- Is it possible at all to fix a broken tape?
- If yes, is it possible to automatically detect such broken tapes and fix them automatically?

Unfortunately, I'm currently unsure even about the first question, because the 32/64-bit dilemma affects not only the random seed as stored in the tape file, but also certain variables within the R'n'D game engine. So a tape with a corrected random seed might still break due to memory data being interpreted differently when looked at with "32-bit eyes" or "64-bit eyes". Maybe there are no such problems at all besides the random seed, but I just cannot say at the moment. (A diff between the non-fixed and the fixed version of the source code should tell more.)
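
To illustrate the "32-bit eyes" vs. "64-bit eyes" concern, here is a minimal C sketch. It is not taken from the R'n'D source: the function names are hypothetical, and it only assumes that a seed might be held in a type that is 64 bits wide on AMD64 but 32 bits wide on a 32-bit build (as `long` is on Linux). A value that fits in 32 bits looks the same either way; anything larger does not.

```c
#include <stdint.h>

/* Hypothetical helper (not R'n'D code): the value a 32-bit build would see
   if only the low 32 bits of a 64-bit seed survive. */
uint32_t seed_with_32bit_eyes(uint64_t seed)
{
  return (uint32_t)(seed & 0xFFFFFFFFu);
}

/* Returns 1 if the seed reads the same at both widths, 0 otherwise. */
int seed_is_width_independent(uint64_t seed)
{
  return (uint64_t)seed_with_32bit_eyes(seed) == seed;
}
```

A seed such as 12345 is width-independent, while 0x100000001 would be read as 1 by the 32-bit view. Storing seeds in a fixed-width type such as `uint32_t` is one common way to avoid this class of problem.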

I will definitely investigate this and try to find a solution, if possible (and, if at all possible, one that requires as little manual user interaction as possible). Unfortunately I won't be able to do this in the next two or three weeks (but hopefully afterwards).

An interim solution (to find out how many tapes are affected) would be to auto-play all your tapes automatically over night, and see which tapes succeed, and which (and how many) are broken when played with the latest version. (See the "autoplay LEVELDIR [NR ...]" command line command for this.)

Post by filbo »

> An interim solution (to find out how many tapes are affected) would be to auto-play all your tapes automatically over night, and see which tapes succeed, and which (and how many) are broken when played with the latest version. (See the "autoplay LEVELDIR [NR ...]" command line command for this.)

I started playing with this and have a bunch of observations:

- It displays the start(?) of each level as it prepares to solve it

- This display is done in the artwork set of whatever you were playing last: the "rnd_tutorial_aaron_davidson" levels look quite bizarre in the artwork of "jue_puzznic"!

- Because of the display, I assume (but did not test) that I will not be able to do "autoplay" from an ssh. I'm sitting here on "the slow computer" (1GHz Pentium III w/384MiB RAM!) while my oldest daughter hogs "the fast computer" (2.1GHz Core i3 w/8GiB RAM) playing Minecraft... (Also, of course, I cannot test any 64-bit issues on this PIII system!)

- The doc and command-line help do not appear to clarify what is meant by "LEVELDIR". I tried just "-e autoplay rnd_tutorial_aaron_davidson" and that worked; now I'm trying to distinguish, since both I and the game hierarchy have full sets of 26 tapes for this level, whether it is stepping through /usr/share/games/rocksndiamonds/levels/Tutorials/rnd_tutorial_aaron_davidson/tapes/* or /home/filbo/.rocksndiamonds/tapes/rnd_tutorial_aaron_davidson/*

- Each startup takes a fairly long time on this slow machine, with all levelsets and artworks installed

- Each exit reminds me of a bug I've been experiencing, where the process hangs stuck in what `strace` shows as:

Code:

$ strace -p5501
Process 5501 attached - interrupt to quit
futex(0xb2f03bd8, FUTEX_WAIT, 5504, NULL

Mostly this dies when I kill it (-15); sometimes I have to `kill -9`; exactly twice I have seen:

Code:

Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting.

on killing it -- so it looks like some misinteraction with pulseaudio libraries.

- The output "Level 004: playing tape ... solved." looks like it's intended to be printed in two stages, but in fact the entire line appears at the moment that level 4 is solved (so for the entire time that 4 is running, the last output on the terminal is "Level 003: playing tape ... solved.\n")

- Ah, it has just started autoplaying level 7 of "rnd_krystian_abramowicz" while I only have tapes for 001..006, so that proves it's playing the distributed tapes

- I then tested with a levelset for which I have tapes, but /usr/share/games/... does not; it didn't find my tapes. So that proves that it's only playing the distributed tapes.

- How, other than clumsy tricks with moving directories around the filesystem, can I have it play my tapes?

- Thought towards a fix: it should be possible to implement the wrong not-64-fixed random init code as a separate step. Tape files give the version number as e.g. 03 03 01 02, but not the 32/64-bitness of the engine. So (1) add that to the save format; (2) for tapes made before the 64-bit fix, let the user somehow control whether you use the correct or broken-64-bit random seed initialization procedure. Obviously it would be nice to automatically determine this, but I don't see how. (If the time stamp is old enough, dating back to before AMD64 CPUs shipped, you could assume a 32-bit save -- I don't know if that's a big help...)

- Ah, in particular, when doing "autoplay", if a tape fails then you could try the other random initialization; if it then succeeds, you could stamp the file with the 32- or 64-bit tag. Or stamp it with what the random number seed should have been, using the correct init algorithm. But "autoplay" should not edit files -- just have it display:

Level 004: playing tape ... solved.
Level 005: playing tape ... failed. Retrying with alternate random number seed algorithm:
Level 005: playing tape ... solved. Level 005 tape needs its random number seed corrected.

and add an "autofix" which actually applies such changes.

- Please add "rocksndiamonds --version" flag! It should print, as text, the same as you see in Info Screen -> Version Info. Please also add CPU info to that screen.

- -e "dump level foo.level" and -e "dump tape foo.tape" should print the date stamp from the file

- the tokens "tape" and "level" aren't (always) necessary since many of the formats you read have IFF chunks identifying what they are, or probably other similar things. Plus in many cases they're named "whatever.tape" or "whatever.level" (detect by extension) or "/blah/blah/levels/blah/12" (detect by last subdirectory named "levels" or "tapes" in path).

- WOULD IT HELP for me to go back and post individual bug topics for each of the above points which call out a specific bug? I am inclined to do this, but I want to do whatever is least annoying to you: (A) leave it like this, with 5-10 minor bug reports munged into a single thread reply or (B) post 5-10 separate threads about them...
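
The retry idea for "autoplay" described above can be sketched in C as follows. Everything here is hypothetical: the enum names, `check_tape()` and the `play_tape_fn` callback are not R'n'D functions, and `demo_play()` is only a stand-in so the sketch compiles without the game engine. The checker never modifies a tape; it only classifies it, leaving any rewriting to a separate "autofix" step.

```c
/* Hypothetical names throughout; this is not R'n'D source code. */
enum seed_mode   { SEED_CORRECT, SEED_LEGACY_64BIT };
enum tape_result { TAPE_SOLVED, TAPE_NEEDS_SEED_FIX, TAPE_FAILED };

/* Callback that replays one tape; returns nonzero if the level was solved. */
typedef int (*play_tape_fn)(int level_nr, enum seed_mode mode);

/* Try the correct seed init first, then the old not-64-bit-fixed one. */
enum tape_result check_tape(int level_nr, play_tape_fn play)
{
  if (play(level_nr, SEED_CORRECT))
    return TAPE_SOLVED;

  if (play(level_nr, SEED_LEGACY_64BIT))
    return TAPE_NEEDS_SEED_FIX;

  return TAPE_FAILED;
}

/* Stand-in for the real engine: level 5 only works with the legacy seed
   init, level 6 never works, everything else works as-is. */
int demo_play(int level_nr, enum seed_mode mode)
{
  if (level_nr == 5)
    return mode == SEED_LEGACY_64BIT;
  if (level_nr == 6)
    return 0;
  return 1;
}
```

With the stand-in, `check_tape(5, demo_play)` yields `TAPE_NEEDS_SEED_FIX`, which corresponds to the "tape needs its random number seed corrected" message proposed above.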

Post by Holger »

Last question answered first: I think it's fine to keep all these bugs and related observations inside this single post/thread, as they are mostly related. If I should manage to set up (and use) a decent bug tracker for R'n'D again, it may make sense to file them as separate bugs, but for now it's fine as it is.

> - It displays the start(?) of each level as it prepares to solve it

Yeah, that's right. First I planned to run it completely inside the shell (to be able to batch-run it regularly), but I've lost focus on this, as it would require changes regarding graphic and sound output at several places, while I usually just start it manually from a shell over night anyway and don't care about graphical output. (Although it is some sort of progress information for long-running level sets, as one can see the currently test-played level number.)

> - This display is done in the artwork set of whatever you were playing last: the "rnd_tutorial_aaron_davidson" levels look quite bizarre in the artwork of "jue_puzznic"!

Right. :-)

Loading the correct artwork may take some time, so I just skip it in this case. But I could change it to run with a black playfield by preventing playfield screen updates in test run mode...

> - Because of the display, I assume (but did not test) that I will not be able to do "autoplay" from an ssh.

Well, as X11 is network capable, it should work (although probably a bit slow). Just use "ssh -X" to enable X11 forwarding.

> - The doc and command-line help do not appear to clarify what is meant by "LEVELDIR".

It's just the internal symbolic name of the level set, which is identical with the sub-directory name of the level set.

About solution tape precedence ("public" vs. "private" directory), I'm not totally sure at the moment.

> - Each startup takes a fairly long time on this slow machine, with all levelsets and artworks installed

This is the result of bad programming and is already fixed/improved in the current working version / will be fixed/improved in the next release version.

(out of time -- to be continued -- please stand by)

Post by filbo »

>> - It displays the start(?) of each level as it prepares to solve it

> Yeah, that's right. First I planned to run it completely inside the shell (to be able to batch-run it regularly), but I've lost focus on this, as it would require changes regarding graphic and sound output at several places, while I usually just start it manually from a shell over night anyway and don't care about graphical output. (Although it is some sort of progress information for long-running level sets, as one can see the currently test-played level number.)

Maybe as a separate flag (since the current behavior could be desired for some of those reasons), but some day it would be nice to have a way to do batch tape-testing runs without any X dependency at all. e.g. I can imagine adding into the Debian / Ubuntu package build process a test script that verifies the shipped tapes -- but I am sure that their automated builds are run under conditions which do not allow X.

`ssh -X`, as you just recommended, appears to work sufficiently well for my simple requirements of the moment. It isn't even noticeably slow (this means I can play RnD while my kids play Minecraft! The "slow" machine is perfectly adequate to display what's running on the "fast" machine, and it won't hurt their Minecraft performance too much...)

It already prints to stdout "Level %03d: playing tape ... solved.", which seems sufficient progress reporting. Except the bug I mentioned above: it should flush output after "playing tape ..." so that "solved" appears at a later time than "playing".
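
A minimal C sketch of that flush fix, assuming the message is printed in two steps with stdio. The function names are made up for illustration and are not from the R'n'D source, and the "failed." text is an assumption about what an unsolved tape would print. When stdout is a terminal it is typically line-buffered, so a partial line is held back until the newline arrives unless it is flushed explicitly.

```c
#include <stdio.h>

/* Hypothetical helper: print the first half of the status line and push it
   out immediately, so it is visible while the tape is still playing. */
void announce_tape_start(FILE *out, int level_nr)
{
  fprintf(out, "Level %03d: playing tape ... ", level_nr);
  fflush(out);
}

/* Hypothetical helper: finish the status line once the result is known. */
void announce_tape_result(FILE *out, int solved)
{
  fprintf(out, "%s\n", solved ? "solved." : "failed.");
  fflush(out);
}
```

Calling `announce_tape_start(stdout, 4)` before the tape runs and `announce_tape_result(stdout, 1)` afterwards produces the same final line as today, but the first half shows up at the right time.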

> Loading the correct artwork may take some time, so I just skip it in this case. But I could change it to run with a black playfield by preventing playfield screen updates in test run mode...

Hiding the slightly ugly wrong-artwork display of the levels isn't at all important to me. The only gain would be the other flag I'm asking for (at a very far off "wishlist" priority) which would have you not even initialize the X or SDL libraries, suitable for running out of a `cron` job or whatever.

>> - The doc and command-line help do not appear to clarify what is meant by "LEVELDIR".

> It's just the internal symbolic name of the level set, which is identical with the sub-directory name of the level set.

Sure, and that's obvious to the author and eventually figure-out-able to the user :wink: -- but the help could be better. e.g.:

Code:

  "autoplay LEVELDIR [NR ...]"     play level tapes for LEVELDIR
  "convert  LEVELDIR [NR]"         convert levels in LEVELDIR
           (LEVELDIR is name of levelset's leaf directory)

(here I am assuming that "convert" takes the same sort of LEVELDIR...)

> About solution tape precedence ("public" vs. "private" directory), I'm not totally sure at the moment.

In my testing there is no "precedence": it tests *only* the public directory and ignores my own tapes.

Post by Holger »

Quick update on the following:

> Please add "rocksndiamonds --version" flag! It should print, as text, the same as you see in Info Screen -> Version Info.

Done (current development version from Git):

Code:

$ ./rocksndiamonds --version
Rocks'n'Diamonds 4.0.0.0
$ ./rocksndiamonds --debug --version
Rocks'n'Diamonds 4.0.0.0
- SDL 2.0.1
- SDL_image 2.0.0
- SDL_mixer 2.0.0
- SDL_net 2.0.0
$ 

> Please also add CPU info to that screen.

What do you mean by "CPU info"?

Post by filbo »

>> Please also add CPU info to that screen.

> What do you mean by "CPU info"?

It would be nice to see "i386" or "x86_64" -- some string identifying for what CPU architecture the binary being --version'd was compiled. My own concern is entirely about 32-bit x86 vs. 64-bit x86, but it'd be best if the solution worked for arbitrary CPUs (there must be a standard C or Unix manifest define for "what CPU am I?" -- probably emulated or available somehow for Windows as well...)
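
ISO C itself does not define a "what CPU am I?" macro, but compilers predefine architecture macros that can be checked at build time. A minimal sketch, assuming the GCC/Clang-style and MSVC-style macro names listed below; `build_arch()` and `build_pointer_bits()` are hypothetical names, not existing R'n'D functions:

```c
#include <limits.h>

/* Hypothetical helper: name of the CPU architecture this binary was
   compiled for, based on compiler-predefined macros. */
const char *build_arch(void)
{
#if defined(__x86_64__) || defined(_M_X64)
  return "x86_64";
#elif defined(__i386__) || defined(_M_IX86)
  return "i386";
#elif defined(__aarch64__) || defined(_M_ARM64)
  return "aarch64";
#elif defined(__arm__) || defined(_M_ARM)
  return "arm";
#else
  return "unknown";
#endif
}

/* Pointer width in bits, e.g. to tell a 32-bit build from a 64-bit one. */
int build_pointer_bits(void)
{
  return (int)(sizeof(void *) * CHAR_BIT);
}
```

The `--version` output could then append something like the result of `build_arch()` and `build_pointer_bits()` to the version line; architectures not covered by the checks fall through to "unknown".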