AmoebeAbleger(): This should never happen!

filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Post by filbo »

I'm running a massive autotest of all my saved tapes [and boy, a huge fraction aren't working :( ]

Noticed some error messages flowing by -- a whole bunch similar to:

AmoebeAbleger(): newax = 38, neway = 22
AmoebeAbleger(): This should never happen!

regarding level 39 of 'rnd_yamyam_palace'. Attaching the tape.
Attachments
rnd_yamyam_palace-039.tape
(560 Bytes) Downloaded 185 times
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

I re-ran it and it spewed tons of output, but nothing interesting ended up in ~/.rocksndiamonds/std{out,err}.

So I reran with > /tmp/out 2>/tmp/err, and all the interesting stuff ended up in stdout, which is attached.
Attachments
rnd_yamyam_palace-039-autotest-out.txt.bz2
(1.35 KiB) Downloaded 207 times
Holger
Site Admin
Posts: 4073
Joined: Fri Jun 18, 2004 4:13 pm
Location: Germany

Re: AmoebeAbleger(): This should never happen!

Post by Holger »

> I'm running a massive autotest of all my saved tapes [and boy, a huge fraction aren't working :( ]

So the most interesting question will be whether these tapes fail due to engine bugs or because they were broken by the buggy snapshot system. :-o

At least this question can usually be answered easily: take a few of these broken tapes, run the "file" command on them (with the "magic" file extensions for R'n'D files) to see which game version they were recorded with, and replay/autotest them with exactly that version. If they succeed, I've added engine incompatibilities (or bugs) in later versions. If they still fail, they are really broken (for whatever reason, use of the snapshot system being the most likely one).

A third possible cause, unfortunately, is running an older version with 32-bit support only that was compiled on a 64-bit system. This was fixed in one of the latest 3.x versions (3.1.0.0, I think), but an overlooked 32/64-bit bug in the EM engine was only fixed in 4.0.1.2.

I will have a look at that example tape and error output later!

Update: 64-bit support was added in version 3.3.1.0.
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

I get this output (which you'll also have seen):

$ file 039.tape
039.tape: Rocks'n'Diamonds data, tape file, version 3.2.6-1, engine 1.4.0-0, date 20110111, level set "rnd_yamyam_palace", level 39

-- which ought to be new enough, if you're right. The levelset 'rnd_yamyam_palace' uses the RnD engine if I'm not mistaken.
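For anyone checking many tapes this way, the version fields can be pulled out of the "file" output mechanically. The following is only a sketch in Python: the pattern is modelled on the single sample line quoted above, and the names `TAPE_INFO` and `parse_tape_info` are made up for illustration. Tapes from other versions may print a line that does not match.

```python
import re

# Pattern modelled on the one sample line quoted in this post (assumption:
# other tapes use the same field order and separators).
TAPE_INFO = re.compile(
    r'version (?P<version>[\d.]+-\d+), '
    r'engine (?P<engine>[\d.]+-\d+), '
    r'date (?P<date>\d{8}), '
    r'level set "(?P<level_set>[^"]+)", '
    r'level (?P<level>\d+)'
)


def parse_tape_info(file_output):
    """Return the recorded version fields as a dict, or None if no match."""
    match = TAPE_INFO.search(file_output)
    if match is None:
        return None
    info = match.groupdict()
    info["level"] = int(info["level"])
    return info


sample = (
    "039.tape: Rocks'n'Diamonds data, tape file, version 3.2.6-1, "
    'engine 1.4.0-0, date 20110111, level set "rnd_yamyam_palace", level 39'
)
info = parse_tape_info(sample)
print(info["version"], info["engine"], info["level_set"], info["level"])
```

The "version" field is the one to match against the binary used for the replay test.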

Unfortunately my systematic directory of old RnD personal builds only goes back to 2015-04-01, version 4.0.0.0, 64-bit. Fortunately, I also have a few ad hoc old binaries, including a 3.3.0.1 (64-bit) and 3.3.1.2 (32 & 64). Off to try that with those binaries...

... ok, with 3.3.0.1 (which I believe came from an Ubuntu repository), it plays to 197s and dies, with no interesting stderr output. With 3.3.1.2 64-bit it prints the same style of messages (didn't try to confirm if _same_ messages) and dies somewhere along the way. The other binaries don't work (shared object problems).

For a *long* time now (at least 3y, but I think significantly longer), I have not been using the F1/F2 save system. Only the 'replay from start until near death' system. I also make it a habit to replay the level immediately after succeeding (I mean either '555' 'play as fast as possible through the end', or '5555' 'play as fast as possible (no display) to the end'). So I am rather puzzled to find that *still*, recently played levels have lots of problems.

e.g. emc_eagle_mine_07, with tapes dated 2018-02-12 through 19, has 35 of 81 levels NOT SOLVED.

According to tape & binary timestamps, these were created with a 4.0.1.1 64-bit binary dated 2018-02-04.

I can play back the tapes manually with that binary (i.e. go to that levelset & level number, then 'play'). But this:

$ bin/20180204-sdl1/rocksndiamonds --mytapes -e "autotest emc_eagle_mine_07"

results in an immediate seg fault! : (

Oh, interesting: this works (currently in progress):

$ bin/20180204-sdl1/rocksndiamonds --mytapes -e "autowarp emc_eagle_mine_07"

For some reason it can autowarp (play as fast as possible on-screen) but not autotest (play/test with no window). It has so far played levels 0-27 with no deaths (better than 4.0.1.3 current).

(all of which is moderately off-topic from the amoeba messages in old 3.x tapes...)
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

Do you want a tarball of 81 4.0.1.1 emc_eagle_mine_07 tapes which work perfectly on 4.0.1.1 and 40% die on 4.0.1.3 current?

I'll probably end up deciding to upload that in 5 minutes when the set completes with zero deaths... (and so saying, caused it to die on level 40...) (and then I check my notes for that levelset and find 'Level 40 is not solvable -- emeralds fall into magic wall far too quickly to collect enough to win'...)
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

Ok, 80 of 81 solved, with the failure expected since I didn't actually solve the level.

See attached tarball: 81 tapes, autoplay report from 4.0.1.3 (46 solved, 34 failed unexpectedly, 1 failed since tape is not a solution).

BTW, please add to allowed upload extensions: .tbz .tgz .txz .xz (didn't check if tgz is already ok).
Attachments
emc_eagle_mine_07-tapes.tar.bz2
(36.58 KiB) Downloaded 191 times
Holger
Site Admin
Posts: 4073
Joined: Fri Jun 18, 2004 4:13 pm
Location: Germany

Re: AmoebeAbleger(): This should never happen!

Post by Holger »

OK, checked all that. Here's what I found out:

Regarding tape 039 for level set "rnd_yamyam_palace": Even though this is a *very* old level from prehistoric version 1.4.0, and the tape dates from the pre-64-bit era (version 3.2.6.1, while R'n'D went 64-bit with 3.3.1.0), the tape you provided is still solvable with the latest R'n'D 4.0.1.3 as taken from the shrink-wrapped package!

However, it fails for self-compiled binaries (also 4.0.1.3)! How come? As I've just found out, the code does some sanity checks in amoeba handling when DEBUG mode is enabled, and fails early on some conditions recognized as inconsistent. Without DEBUG mode enabled, these checks are skipped, and the tape succeeds. (I will try to find out what these checks, which I wrote over 20 years ago, really do, and whether they should be removed.)

Regarding the tapes for level set "emc_eagle_mine_07": As you said, you recorded them with version 4.0.1.1. One version later (4.0.1.2), I fixed an old bug left over from migrating R'n'D from 32 to 64 bit. (There was still a "long" that should have been an "int", so that it is 32 bits wide on both 32-bit and 64-bit systems -- "long" has a different size on these systems!)
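Why the width of one variable can change a replay can be shown with a small simulation. This is a sketch only: the thread does not show the engine's real random number recurrence, so the code uses a textbook linear congruential generator (the "Numerical Recipes" constants) and emulates 32-bit and 64-bit state with bit masks. The names `lcg_states`, `MASK32` and `MASK64` are illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def lcg_states(seed, steps, mask):
    """Run a textbook LCG, truncating the state to the width given by mask."""
    state = seed & mask
    states = []
    for _ in range(steps):
        # Well-known textbook constants, NOT the engine's real recurrence
        # (which is not shown in this thread).
        state = (state * 1664525 + 1013904223) & mask
        states.append(state)
    return states


as_int = lcg_states(1, 10, MASK32)    # state behaves like a 32-bit "int"
as_long = lcg_states(1, 10, MASK64)   # state behaves like a 64-bit "long"

# The low 32 bits always agree, but the full values part ways as soon as
# the product no longer fits in 32 bits. Anything derived from the whole
# value (a modulo by a non-power-of-two, a comparison) can then differ.
for step, (a, b) in enumerate(zip(as_int, as_long), start=1):
    print(step, a == b, a % 100, b % 100)
```

Once a single derived value differs, the amoeba grows in a different direction and the recorded key presses no longer fit the level state, which is consistent with only amoeba levels being affected.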

This bug only affects EM engine levels that use the amoeba -- all those levels that result in "NOT SOLVED" contain amoeba and are affected by this bug / bugfix.

Writing a workaround for this kind of bug that works on all systems is practically impossible, as it is unknown what the size of "long" was on the system the tape was recorded on. For example, the Windows version is completely unaffected by this bug, as all Windows binaries of R'n'D have always been 32-bit, up to today.

So this bug effectively affects all Linux and Mac systems, and versions of R'n'D between 3.3.1.0 and 4.0.1.2.

The only solution I can see for this bug is to re-record the tapes. :-(
Holger
Site Admin
Posts: 4073
Joined: Fri Jun 18, 2004 4:13 pm
Location: Germany

Re: AmoebeAbleger(): This should never happen!

Post by Holger »

P.S.:

What I could think of would be a "fix-tape" function to explicitly fix a tape by setting a new bit in the tape that forces such a workaround (to do some calculations in 64-bit instead of 32-bit for amoeba in the EM engine).

If you should have a huge amount of such tapes, it might be worth it... :-/
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

You seem to be saying that those broken tapes would be OK if a single-bit decision were taken the other way: if RnD recognized that some previous 'long' field was 64-bit rather than the 32 bits it believes.

Is there an unused header field somewhere in those tapes? Could you provide a simple standalone program to 'stamp' each of those tapes with a retro 'hey, this was built with 64-bit longs' flag, and have RnD 4.0.1.4 recognize that flag?

I see commit c92edfe90dcec71c4d01e435b2a1baba29d187cd where you changed 'unsigned long random' to uint. But how does this play out in the tapes? Where is the code which (I think you're saying) loads the tape and tries to figure out whether that field was 32- or 64-bit, and sometimes gets it wrong?
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

Heh. Even though my post is an hour after your PS post, somehow I hadn't seen that one and basically proposed exactly what you'd already said!

So, it seems that I have a total of 13776 tapes in my collection, of which 12112 are 'solved' by 4.0.1.1, and 12031 by 4.0.1.3; the remainder are not solved. About 1200 of the not-solved levels are in directories with 'emc' in their names. I suspect a large fraction of these might be susceptible to such a fix.

I could easily script --

for each unsolved level {
save an unmodified copy of the tape
poke the tape's 'other amoeba interpretation' bit
see if it's solved
if not, put back the unmodified copy
}

-- if I had the 'poke that bit' utility + a version of RnD which recognized it. (Or maybe: a command-line flag to RnD telling it what value to 'see' of that bit, then I wouldn't need to pre-modify the tapes before determining whether flipping the bit would fix them...)
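The loop above could be sketched in Python as follows. Neither the bit-poking utility nor an engine that honours such a bit exists at this point in the thread, so both are passed in as callables (`is_solved` and `flip_flag` are placeholder names), and the function only handles the backup and restore bookkeeping.

```python
import shutil
from pathlib import Path


def try_flag_on_tapes(tapes, is_solved, flip_flag):
    """Flip a flag on each unsolved tape and keep the change only if it helps.

    is_solved(path) -> bool and flip_flag(path) -> None are supplied by the
    caller; both stand in for tools that do not exist yet (hypothetical).
    Returns the list of tapes that ended up fixed.
    """
    fixed = []
    for tape in map(Path, tapes):
        if is_solved(tape):
            continue                        # already works, leave it alone
        backup = tape.with_name(tape.name + ".orig")
        shutil.copy2(tape, backup)          # save an unmodified copy
        flip_flag(tape)                     # poke the hypothetical bit
        if is_solved(tape):
            fixed.append(tape)
            backup.unlink()                 # fix worked, drop the backup
        else:
            backup.replace(tape)            # put back the unmodified copy
    return fixed
```

In practice `is_solved` would wrap an autotest run of a single tape and check its result; how that result is reported is left open here.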

====

Then, after 1000-or-so tapes are fixed by this, we can figure out the next most popular problem in my tape collection and fix that, etc. :)

... basically retconning all my old tapes to work with current-to-future code. heh.
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

Ok, so maybe something like:

$ rocksndiamonds --mytapes --trytape amoeba32 --execute "autotest LEVELDIR"

-- test all the tapes in LEVELDIR, with the amoeba32 flag forced, and report results (DO NOT actually edit the tapes)

$ rocksndiamonds --mytapes --fixtape amoeba32 --execute "autotest LEVELDIR 12 42"

-- test tapes 12 & 42 in LEVELDIR, with amoeba32 flag forced. For each tape, if this makes it solvable, edit the tape file.

In fact, try it 3 ways:

1. without forcing the flag -- if it solves, leave it alone
2. forcing the flag one way -- if it solves, patch it
3. forcing the other way -- if so, patch it
4. if NONE of those options make it work, print a little report showing how much time (maybe number of frames) the player survived in each mode -- as this might help the human decide whether to force the flag one way or another despite it not completely fixing the tape. Don't actually touch the tape.
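The four steps above amount to a small decision function. Here is a sketch in Python, assuming a hypothetical `replay(tape, forced)` callable that returns whether the tape solved the level and how many frames the player survived; no such override exists in R'n'D as far as this thread shows.

```python
def decide(tape, replay, flag_values=(False, True)):
    """Pick an action for one tape, following the four steps listed above.

    replay(tape, forced) -> (solved, frames_survived). forced is None for
    "use whatever the tape says", otherwise the flag value to force.
    The callable is hypothetical: it stands in for an engine that accepts
    such an override.
    """
    solved, frames = replay(tape, None)
    if solved:
        return ("leave", None)              # step 1: works as recorded
    report = {None: frames}
    for value in flag_values:               # steps 2 and 3
        solved, frames = replay(tape, value)
        if solved:
            return ("patch", value)
        report[value] = frames
    return ("report", report)               # step 4: do not touch the tape
```

The "report" result carries the survival time per mode, which is the information a human would need to decide whether to force the flag anyway.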

Trying N ways is a matter of convenience, as the same directory of tapes might have mixed data from various sources (e.g. for a while I was playing on several different machines and using `scp` or `rsync` to merge sets of tapes -- without much regard for or attention to whether those tapes actually worked once arrived).

The same sort of thing can be imagined for other compatibility flags you might not be able to intuit from the actual tape data. This could get ugly / complicated if there are 5 different such flags and you might need to try all possible combinations to find the one that makes the tape work -- so don't try to solve that. The 'got this far' report will help a human work it out -- maybe. (The possible values of the 'wrap this way' flag could be another candidate...)

There should be a way to tell it to apply patch flags without testing the tapes, e.g.:

$ rocksndiamonds --mytapes --fixtape amoeba32 --execute "fixtape LEVELDIR 12 42"

-- or something like that, meaning 'stamp these tapes this way, without trying to play-test them'.

Syntax is ugly, due to trying to integrate with existing --execute.

All of these should be sensitive to "--mytapes", i.e. without it they should operate on system files (if user has permission). Or maybe --mytapes should be abolished (made a no-op), replaced with opposite "--systapes"...
Holger
Site Admin
Posts: 4073
Joined: Fri Jun 18, 2004 4:13 pm
Location: Germany

Re: AmoebeAbleger(): This should never happen!

Post by Holger »

> So, it seems that I have a total of 13776 tapes in my collection, of which 12112 are 'solved' by 4.0.1.1, and 12031 by 4.0.1.3; the remainder are not solved. About 1200 of the not-solved levels are in directories with 'emc' in their names. I suspect a large fraction of these might be susceptible to such a fix.

If 12112 levels are solved by 4.0.1.1, and 12031 levels are solved by 4.0.1.3, this means that 81 levels are affected by a change between version 4.0.1.1 and 4.0.1.3, most probably the long/int bugfix in version 4.0.1.2.

Therefore, I think that handling this issue is not worth the trouble (although I do understand that broken tapes are a pain in the ass): it cannot be auto-detected from the game engine version that was used to record a given tape (only by successively replaying the tape with different game engine versions to see which one works), it only affects a small part of the user base (those not using Windows), and it only affects 0.7 % of your tapes, theoretically recorded with that bug over a period of about five years.

However, more interesting to me seem to be the remaining 1664 tapes that got broken _before_ 4.0.1.1 (that's 12 % of your tapes!) -- if you provide me with an example tape, I could check which engine change caused them to be broken, and might be able to add a fix for it. You could then "autotest" all tapes again to check how many tapes got repaired, and we would continue with the next example tape.
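The figures in this post can be rechecked from the three counts quoted above. A short Python sketch follows; which total the 0.7 % was taken against is not stated, so using the tapes solved by 4.0.1.1 as the base is an assumption.

```python
total = 13776            # tapes in the collection
solved_4011 = 12112      # solved by 4.0.1.1
solved_4013 = 12031      # solved by 4.0.1.3

newly_broken = solved_4011 - solved_4013    # broken between 4.0.1.1 and 4.0.1.3
broken_before = total - solved_4011         # already unsolved with 4.0.1.1

share_new = f"{100 * newly_broken / solved_4011:.1f} %"   # base is an assumption
share_old = f"{100 * broken_before / total:.0f} %"

print(newly_broken, broken_before, share_new, share_old)
```

This gives 81 and 1664 tapes, and shares of 0.7 % and 12 %, matching the numbers used in the argument.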

Aside from those tapes that were recorded for levels using the EM engine and containing amoeba (which may render them broken with the latest version again), there's a good chance of improving compatibility with older tapes.

Regarding future changes with the potential of breaking existing tapes, I regularly run automated tape tests with a huge number of tapes from the last 20+ years, to lower the possibility of broken tapes. Unfortunately, tapes recorded with the EM and SP engines were more or less missing from these automated tape tests for whatever reason, which I have now improved by adding several hundred EM and SP solution tapes to my tape test environment.
Holger
Site Admin
Posts: 4073
Joined: Fri Jun 18, 2004 4:13 pm
Location: Germany

Re: AmoebeAbleger(): This should never happen!

Post by Holger »

P.S.:

Nearly forgot that one:
> According to tape & binary timestamps, these were created with a 4.0.1.1 64-bit binary dated 2018-02-04.
>
> I can play back the tapes manually with that binary (i.e. go to that levelset & level number, then 'play'). But this:
>
> $ bin/20180204-sdl1/rocksndiamonds --mytapes -e "autotest emc_eagle_mine_07"
>
> results in an immediate seg fault! : (

Ouch! This should never happen! (Ahh, we're on-topic again! ;-) )

Could you run this within "gdb" and do a "bt" (to dump a backtrace) once it segfaults? That would be ultra-helpful!

I will also try that exact version on that exact level set with your tapes using the same command, to see if I can reproduce this crash! :-o
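The gdb run requested above can be done non-interactively, which is handy when the crash happens right at startup. A sketch in Python that only assembles the command line: `-batch`, `-ex` and `--args` are standard gdb options, while the helper names `gdb_backtrace_command` and `run_under_gdb` are made up for illustration.

```python
import subprocess


def gdb_backtrace_command(binary, *args):
    """Build a gdb invocation that runs the program and prints a backtrace
    once it stops (for example on SIGSEGV), then exits."""
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", binary, *args]


def run_under_gdb(cmd):
    """Run the command and return gdb's combined output (needs gdb installed)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout + result.stderr


cmd = gdb_backtrace_command(
    "bin/20180204-sdl1/rocksndiamonds",
    "--mytapes", "-e", "autotest emc_eagle_mine_07",
)
print(" ".join(cmd))
```

The top frames of the backtrace show whether the crash sits in R'n'D code or in an SDL library; a binary built with debug symbols makes that much easier to read.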
filbo
Posts: 647
Joined: Fri Jun 20, 2014 10:06 am

Re: AmoebeAbleger(): This should never happen!

Post by filbo »

>> $ bin/20180204-sdl1/rocksndiamonds --mytapes -e "autotest emc_eagle_mine_07"
>>
>> results in an immediate seg fault! : (

> Ouch! This should never happen! (Ahh, we're on-topic again! ;-) )

Hmmm, probably not. My top-level build script builds both SDL1 and SDL2 binaries, ever since you added SDL2 support. But I have been *playing* only the SDL2 for a long time now. For some reason (slip of finger or whatever) I ran the SDL1 in this case. Shortly after posting, I re-ran with SDL2 and it was fine.

So yes, there is some problem, but I think it is probably a really early startup fault in SDL1 on my machine, and probably represents some misconfiguration of SDL that's way outside of RnD's fault.
Holger
Site Admin
Posts: 4073
Joined: Fri Jun 18, 2004 4:13 pm
Location: Germany

Re: AmoebeAbleger(): This should never happen!

Post by Holger »

> So yes, there is some problem, but I think it is probably a really early startup fault in SDL1 on my machine, and probably represents some misconfiguration of SDL that's way outside of RnD's fault.

Yes, this may be the case... But if there's any chance to see whether the crash occurs inside R'n'D or inside SDL (by running it once inside "gdb"), this would be really good to know! :-)