Update!

I got a chance to look at the European release of ICO, and pretty much immediately noticed the new files SRCFILE.TXT and TRFILE.TXT. SRCFILE is the complete 'objdump -d' output of the game, with the debugging line numbers, and TRFILE is the complete linker log, which includes these function names:

00136cc0:0003215616:0710:ffff:huft_build():fumi/ios/inflate.c:119
00137488:0003244212:0160:ffff:inflate_codes():fumi/ios/inflate.c:335
00137bd0:0003268411:00c0:ffff:inflate_stored():fumi/ios/inflate.c:439
00137ef0:0003278744:04d0:ffff:inflate_fixed():fumi/ios/inflate.c:485
00138150:0003288348:05e0:ffff:inflate_dynamic():fumi/ios/inflate.c:549
00138a68:0003319614:ffff:ffff:inflate_start():fumi/ios/inflate.c:706
00138ab8:0003321500:0030:ffff:close_inflate_handler():fumi/ios/inflate.c:750
00138b80:0003324593:00d0:ffff:inflate():fumi/ios/inflate.c:772
00139048:0003340442:0040:ffff:open_inflate_handler():fumi/ios/inflate.c:730
001390d8:0003343118:0060:ffff:fill_inbuf():fumi/ios/inflate.c:887
001391b8:0003346411:0020:ffff:huft_free():fumi/ios/inflate.c:309

00139568:0003361590:0040:ffff:new_mblock_node():fumi/ios/mblock.c:16
00139668:0003365214:ffff:ffff:reuse_mblock1():fumi/ios/mblock.c:95
00139690:0003365880:ffff:ffff:init_mblock():fumi/ios/mblock.c:12
001396a0:0003366175:0030:ffff:new_segment():fumi/ios/mblock.c:72
00139748:0003369314:0030:ffff:reuse_mblock():fumi/ios/mblock.c:105
001397a0:0003370659:0060:ffff:strdup_mblock():fumi/ios/mblock.c:123

This matches perfectly with all the stuff down here. I'm going to stop looking now, in case the Japanese release has Fumito Ueda's credit card numbers on it or something.
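
If you want to pull the names out of TRFILE.TXT yourself, here's a rough sketch of a parser for one line. It assumes the first colon-separated field is a hex address and the last three are the function name, source path and line number; I'm not guessing at the three middle fields, so they're skipped. TrEntry and parse_tr_line() are names I made up for this example.

```c
#include <stdio.h>

/* Hypothetical record for one TRFILE.TXT line. Only the fields whose
   meaning seems clear from the listing are kept. */
typedef struct {
    unsigned long address;   /* first field, read as hex (assumed address) */
    char function[64];       /* e.g. "huft_build()" */
    char source[128];        /* e.g. "fumi/ios/inflate.c" */
    int line;                /* e.g. 119 */
} TrEntry;

/* Returns 1 if the line has the seven-field shape, 0 otherwise. */
int parse_tr_line(const char *text, TrEntry *out)
{
    /* %*[^:] skips a field without storing it, so only four
       conversions are counted in sscanf's return value. */
    int n = sscanf(text, "%lx:%*[^:]:%*[^:]:%*[^:]:%63[^:]:%127[^:]:%d",
                   &out->address, out->function, out->source, &out->line);
    return n == 4;
}
```

Feed it each line of the log and keep the entries whose source path ends in inflate.c or mblock.c.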

I haven't succeeded in contacting anyone about this; SCEI and ONICOS/Izumo don't read their email. Someone who speaks better Japanese than me should try writing them a letter.


Summary

ICO, a video game by Sony Computer Entertainment for the PlayStation 2, seems to be using parts of the GPL library libarc for compressed data handling. It doesn't credit the author or mention libarc or the GPL.

This isn't a big problem in terms of code: the two files from libarc used are under 1500 lines put together, and one is a heavily-edited copy of Mark Adler's original inflate.c, which is public domain (its header says "Not copyrighted"). But it's a GPL violation, and they need to fix it.

Evidence

To follow along with this, you'll need:

Reverse-engineering (interesting)

First, check out these strings from the data section:
ios/inflate.c
 incomplete literal tree
 incomplete distance tree
ios/mblock.c

ICO, helpfully, has all its debug logging still in the release binary. Here we can see the names of two files from libarc. Note the space before "incomplete" in both strings; this indicates a really old version of zlib. Even find-zlib, which claims to go back to zlib 0.1, doesn't have these. (It also doesn't find any data tables.)
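
To check for these strings in your own copy, something like the following sketch works once the executable is loaded into memory. find_cstring() is a name I made up; it matches the string together with its terminating NUL, so the leading space in " incomplete literal tree\n" has to be part of the search text if you want the offset where the string really starts.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of `needle`, including its
   terminating NUL byte, inside `image`, or -1 if it is not there. */
long find_cstring(const unsigned char *image, size_t image_len,
                  const char *needle)
{
    size_t n = strlen(needle) + 1;   /* compare the trailing '\0' too */
    size_t i;

    if (n > image_len)
        return -1;
    for (i = 0; i + n <= image_len; i++) {
        if (memcmp(image + i, needle, n) == 0)
            return (long)i;
    }
    return -1;
}
```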

From inflate.c:

/*
    Copyright (C) 2000 Masanao Izumo <mo@goice.co.jp>

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/

/* inflate.c -- Not copyrighted 1992 by Mark Adler
   version c10p1, 10 January 1993 */

/* You can do whatever you like with this source file, though I would
   prefer that if you modify it and redistribute it that you include
   comments to that effect with your name and the date.  Thank you.
   [The history has been moved to the file ChangeLog.]
 */
And:
    /* build the decoding tables for literal/length and distance codes */
    bl = lbits;
    i = huft_build(ll, nl, 257, cplens, cplext, &tl, &bl, &decoder->pool);
    if(bl == 0)                            /* no literals or lengths */
      i = 1;
    if(i)
    {
        if(i == 1)
            fprintf(stderr, " incomplete literal tree\n");
        reuse_mblock(&decoder->pool);
        return -1;            /* incomplete code set */
    }
    bd = dbits;
    i = huft_build(ll + nl, nd, 0, cpdist, cpdext, &td, &bd, &decoder->pool);
    if(bd == 0 && nl > 257)    /* lengths but no distances */
    {
        fprintf(stderr, " incomplete distance tree\n");
        reuse_mblock(&decoder->pool);
        return -1;
    }

    if(i == 1) {
#ifdef PKZIP_BUG_WORKAROUND
        i = 0;
#else
        fprintf(stderr, " incomplete distance tree\n");
#endif
    }
    if(i)
    {
        reuse_mblock(&decoder->pool);
        return -1;
    }
libarc uses a very old copy of INFLATE with the same error messages.

Reverse-engineering (boring)

Now that we've seen that, it's time for MIPS assembly!
I'll be using ps2dis's output here.
The equivalent to fprintf() is located at 0x001A6E28 in the binary. It's been simplified: the first argument (the FILE * stream) is missing, so it's called like printf(), but I'll use the same name for clarity.
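
If you'd rather not trust the disassembler's labels, the call target can be recovered from the raw instruction word. Here's a small sketch of decoding a MIPS jal; decode_jal() is my own helper, not something from ps2dis.

```c
#include <stdint.h>

/* Returns 1 and stores the call target if `word` is a MIPS `jal`
   instruction located at address `pc`; returns 0 otherwise. */
int decode_jal(uint32_t word, uint32_t pc, uint32_t *target)
{
    if ((word >> 26) != 0x03)          /* opcode 000011 = jal */
        return 0;
    /* The low 26 bits are a word index. Shift it to a byte address and
       take the top four bits from the delay-slot address (pc + 4). */
    *target = ((pc + 4) & 0xF0000000u) | ((word & 0x03FFFFFFu) << 2);
    return 1;
}
```

For the word 0c069b8a at 0013531c this gives 0x001a6e28, which is the address above.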

Searching for those error strings finds this:

        jal             $001a6e28               # 0013531c:0c069b8a     v fprintf
        addiu           a0, a0, $6b10           # 00135320:24846b10     a0=" incomplete literal tree\n"
        beq             zero, zero, $0013540c   # 00135324:10000039     v __0013540c
        daddu           a0, s0, zero            # 00135328:0200202d     
__0013532c:                                     # 
        addiu           v0, zero, $0006         # 0013532c:24020006     v0=$00000006
        lui             a3, $0028               # 00135330:3c070028     a3=$00280000
        lui             t0, $0028               # 00135334:3c080028     t0=$00280000
        sll             a0, v1, 2               # 00135338:00032080     
        lw              a1, $0514(sp)           # 0013533c:8fa50514     
        sw              v0, $04fc(sp)           # 00135340:afa204fc     
        addu            a0, sp, a0              # 00135344:03a42021     
        addiu           a3, a3, $0b20           # 00135348:24e70b20     a3=$00280b20
        addiu           t0, t0, $0b60           # 0013534c:25080b60     t0=$00280b60
        daddu           a2, zero, zero          # 00135350:0000302d     
        addiu           t1, sp, $04f8           # 00135354:27a904f8     
        addiu           t2, sp, $04fc           # 00135358:27aa04fc     
        jal             $001336c0               # 0013535c:0c04cdb0     ^ FNC_001336c0
        daddu           t3, s0, zero            # 00135360:0200582d     
        lw              v1, $04fc(sp)           # 00135364:8fa304fc     
        bne             v1, zero, $00135394     # 00135368:1460000a     v __00135394
        daddu           s4, v0, zero            # 0013536c:0040a02d     s4=$00000006
        lw              v1, $0510(sp)           # 00135370:8fa30510     
        sltiu           v0, v1, $0102           # 00135374:2c620102     
        bne             v0, zero, $00135398     # 00135378:14400007     v __00135398
        addiu           v0, zero, $0001         # 0013537c:24020001     v0=$00000001
        lui             a0, $0055               # 00135380:3c040055     a0=$00550000
        jal             $001a6e28               # 00135384:0c069b8a     v fprintf
        addiu           a0, a0, $6b30           # 00135388:24846b30     a0=" incomplete distance tree\n"
        beq             zero, zero, $0013540c   # 0013538c:1000001f     v __0013540c
        daddu           a0, s0, zero            # 00135390:0200202d     
__00135394:                                     # 
        addiu           v0, zero, $0001         # 00135394:24020001     v0=$00000001
__00135398:                                     # 
        bne             s4, v0, $001353a8       # 00135398:16820003     v __001353a8
        lui             a0, $0055               # 0013539c:3c040055     a0=$00550000
        jal             $001a6e28               # 001353a0:0c069b8a     v fprintf
        addiu           a0, a0, $6b30           # 001353a4:24846b30     a0=" incomplete distance tree\n"
This calls fprintf three times, as you can see.
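
The string argument of each call is built from a lui/addiu pair, and the addiu sits in the jal's delay slot, so it runs before the call takes effect. A sketch of how the two instruction words combine into the address shown in the comments (lui_addiu_value() is a made-up helper name):

```c
#include <stdint.h>

/* Combines `lui rt, hi` followed by `addiu rt, rt, lo` into the 32-bit
   constant they load. addiu sign-extends its 16-bit immediate. */
uint32_t lui_addiu_value(uint32_t lui_word, uint32_t addiu_word)
{
    uint32_t hi = (lui_word & 0xFFFFu) << 16;
    int32_t lo = (int32_t)(addiu_word & 0xFFFFu);

    if (lo & 0x8000)
        lo -= 0x10000;                 /* sign-extend the immediate */
    return hi + (uint32_t)lo;
}
```

With the words 3c040055 and 24846b30 from the listing, the result is 0x00556b30, the address of " incomplete distance tree\n".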

Now, let's do a Google Code Search for the errors. Almost all of these are the same — they're either commented out or there's only one call to each. The only different one is TiMidity++, which turns out to use libarc!

After the error message, all three paths jump to here:

__0013540c:                                     # 
        jal             $00136140               # 0013540c:0c04d850     v FNC_00136140
        nop                                     # 00135410:00000000     
        beq             zero, zero, $00135434   # 00135414:10000007     v __00135434
        addiu           v0, zero, $ffff         # 00135418:2402ffff     v0=$ffffffff

which goes to:

FNC_00136140:                                   # 
        addiu           sp, sp, $ffd0           # 00136140:27bdffd0     
        sd              s1, $0010(sp)           # 00136144:ffb10010     
        sd              ra, $0020(sp)           # 00136148:ffbf0020     
        daddu           s1, a0, zero            # 0013614c:0080882d     
        sd              s0, $0000(sp)           # 00136150:ffb00000     
        lw              s0, $0000(s1)           # 00136154:8e300000     
        beq             s0, zero, $00136184     # 00136158:1200000a     v __00136184
        ld              ra, $0020(sp)           # 0013615c:dfbf0020     
        daddu           a0, s0, zero            # 00136160:0200202d     
        nop                                     # 00136164:00000000     
__00136168:                                     # 
        jal             $00136060               # 00136168:0c04d818     ^ FNC_00136060
        lw              s0, $000c(s0)           # 0013616c:8e10000c     
        bne             s0, zero, $00136168     # 00136170:1600fffd     ^ __00136168
        daddu           a0, s0, zero            # 00136174:0200202d     
[...]
and further to:
FNC_00136060:                                   # 
        lw              v0, $0004(a0)           # 00136060:8c820004     
        sltiu           v0, v0, $2001           # 00136064:2c422001     
        bne             v0, zero, $00136078     # 00136068:14400003     v __00136078
        lw              v0, $9758(gp)           # 0013606c:8f829758     v0=$00632048
        j               $00139598               # 00136070:0804e566     v FNC_00139598
        lw              a0, $0000(a0)           # 00136074:8c840000     
__00136078:                                     # 
        sw              a0, $9758(gp)           # 00136078:af849758     [00632048]
        jr              ra                      # 0013607c:03e00008     
        sw              v0, $000c(a0)           # 00136080:ac82000c     
        nop                                     # 00136084:00000000     
__00136088:                                     # 
        sw              zero, $0004(a0)         # 00136088:ac800004     
        jr              ra                      # 0013608c:03e00008     
        sw              zero, $0000(a0)         # 00136090:ac800000     
        nop                                     # 00136094:00000000     
and finally:
FNC_00139598:                                   # 
        addiu           sp, sp, $fb70           # 00139598:27bdfb70     
        sd              s5, $0450(sp)           # 0013959c:ffb50450     
        daddu           s5, a0, zero            # 001395a0:0080a82d     
        sd              ra, $0480(sp)           # 001395a4:ffbf0480     
        lui             a0, $0055               # 001395a8:3c040055     a0=$00550000
        sd              s7, $0470(sp)           # 001395ac:ffb70470     
        sd              s6, $0460(sp)           # 001395b0:ffb60460     
        addiu           a0, a0, $72d8           # 001395b4:248472d8     a0="mem:free "
        sd              s4, $0440(sp)           # 001395b8:ffb40440     
        sd              s3, $0430(sp)           # 001395bc:ffb30430     
        sd              s2, $0420(sp)           # 001395c0:ffb20420     
        sd              s1, $0410(sp)           # 001395c4:ffb10410     
        jal             $001a6e28               # 001395c8:0c069b8a     v fprintf
        sd              s0, $0400(sp)           # 001395cc:ffb00400     
        bne             s5, zero, $00139618     # 001395d0:16a00011     v __00139618
        addiu           s1, s5, $fff0           # 001395d4:26b1fff0     
        lui             a0, $0055               # 001395d8:3c040055     a0=$00550000
        jal             $001a6e28               # 001395dc:0c069b8a     v fprintf
        addiu           a0, a0, $72e8           # 001395e0:248472e8     a0="null memory pointer\n"
        break           (00000)                 # 001395e4:0000000d     
        lui             s0, $0055               # 001395e8:3c100055     s0=$00550000
        lui             a2, $0055               # 001395ec:3c060055     a2=$00550000
        addiu           s0, s0, $70e0           # 001395f0:261070e0     s0="ios/memory.c"
        addiu           a2, a2, $7300           # 001395f4:24c67300     a2="IOSFREE(): NULL MEMORY POINTER\n"
        daddu           a0, s0, zero            # 001395f8:0200202d     a0="ios/memory.c"
        jal             $001ad748               # 001395fc:0c06b5d2     v FNC_001ad748
        addiu           a1, zero, $0334         # 00139600:24050334     a1=$00000334
        lui             a2, $0063               # 00139604:3c060063     a2=$00630000
        daddu           a0, s0, zero            # 00139608:0200202d     a0="ios/memory.c"
        addiu           a2, a2, $20b8           # 0013960c:24c620b8     a2=$006320b8
        beq             zero, zero, $001399b8   # 00139610:100000e9     v __001399b8
        addiu           a1, zero, $0334         # 00139614:24050334     a1=$00000334
[...]

That last function sure looks like free() to me.
From mblock.c:

static void reuse_mblock1(MBlockNode *p)
{
    if(p->block_size > MIN_MBLOCK_SIZE)
        free(p);
    else /* p->block_size <= MIN_MBLOCK_SIZE */
    {
        p->next = free_mblock_list;
        free_mblock_list = p;
    }
}

void reuse_mblock(MBlockList *mblock)
{
    MBlockNode *p;

    if((p = mblock->first) == NULL)
        return;                   /* There is nothing to collect memory */

    while(p)
    {
        MBlockNode *tmp;

        tmp = p;
        p = p->next;
        reuse_mblock1(tmp);
    }
    init_mblock(mblock);
}
Assuming you can read MIPS assembly (and you can, right?) they're obviously the same code. The memory management code here (reuse_mblock()) is original to libarc; nothing that merely uses zlib compression would have it unless it also used libarc.
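
For anyone who can't read MIPS, here is my reading of FNC_00136060 turned back into C. Treat it as a sketch: the struct layout is only guessed from the offsets 0x0, 0x4 and 0xc in the listing (on a 32-bit target), all the names are invented, and the constant comes from the sltiu against $2001. One thing that stands out is that the assembly hands the word at offset 0 to the free routine instead of the node pointer itself, so ICO's node layout may have been modified a little.

```c
#include <stddef.h>
#include <stdlib.h>

#define MIN_BLOCK 0x2000u    /* from `sltiu v0, v0, $2001` */

/* Hypothetical layout guessed from the load/store offsets; the real
   field names and types are unknown. */
typedef struct Node {
    void *base;              /* offset 0x0: passed on to the free routine */
    unsigned block_size;     /* offset 0x4: compared against 0x2000 */
    unsigned unknown;        /* offset 0x8: not touched by this function */
    struct Node *next;       /* offset 0xc: link in the free list */
} Node;

static Node *free_list;      /* stands in for the word at $9758(gp) */

void reuse_node(Node *p)
{
    if (p->block_size > MIN_BLOCK) {
        free(p->base);       /* j FNC_00139598 with a0 loaded from 0(a0) */
    } else {
        p->next = free_list; /* sw v0, $000c(a0) */
        free_list = p;       /* sw a0, $9758(gp) */
    }
}
```

The shape is the same as reuse_mblock1(): big blocks are released, small ones are pushed onto a global free list for reuse.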

I could go further, but pointing out more of the same control flow in a bunch of assembly text isn't really needed.

Instead, I wrote a tool to decompress ICO's data archive, using libarc. libarc's compressor (in deflate.c) uses the same DEFLATE algorithm as gzip, but doesn't store a gzip or zip header. Nevertheless, it decompresses all the files perfectly* without needing any changes to the compressed stream. Get it in the links below.
("advertise.pss" is an MPEG-2 video and will play in VLC, although it won't have sound.)

* It doesn't have a checksum, so it might not actually be perfect, but it doesn't error at least!
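
A quick way to see that a compressed entry really has no container header is to sniff its first bytes. This is a minimal sketch, not part of my tool: sniff_format() and the enum are invented names, and FMT_UNKNOWN only means "none of the usual headers", which is what you'd expect from a bare DEFLATE stream. A raw stream can also happen to pass the zlib check by accident, so treat the answer as a hint.

```c
#include <stddef.h>

typedef enum { FMT_GZIP, FMT_ZIP, FMT_ZLIB, FMT_UNKNOWN } StreamFormat;

/* Guesses the container format from the first bytes of a buffer. */
StreamFormat sniff_format(const unsigned char *buf, size_t len)
{
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
        return FMT_GZIP;                 /* gzip magic (RFC 1952) */
    if (len >= 4 && buf[0] == 'P' && buf[1] == 'K' &&
        buf[2] == 0x03 && buf[3] == 0x04)
        return FMT_ZIP;                  /* zip local file header */
    if (len >= 2 && (buf[0] & 0x0f) == 8 &&
        ((buf[0] << 8) | buf[1]) % 31 == 0)
        return FMT_ZLIB;                 /* CMF/FLG check (RFC 1950) */
    return FMT_UNKNOWN;                  /* possibly raw DEFLATE */
}
```

If you wanted to read such a stream with zlib instead of libarc, inflateInit2() with a negative windowBits value is the documented way to ask for raw DEFLATE.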

Links

Etc.

Shadow of the Colossus, the "sequel" to ICO, doesn't seem to use any other code. I haven't disassembled it, but it's even more helpful: function names aren't stripped at all!
All of them look safe to me, aside from being as unorganized as any game code.

I tried contacting Masanao Izumo, the author of libarc, but one of his email addresses (mo@goice.co.jp) no longer works and I haven't received a response on the other (iz@onicos.co.jp). Maybe he can be reached through ONICOS?

Thanks to !WAHa.06x36 for helping me with the format of DATA.DF.

(Why are the default colors for code2html so ugly? Why does tidy destroy text with CSS white-space: pre?)

http://astrange.ithinksw.net/