Rockbox Technical Forums

Support and General Use => Hardware => Topic started by: maraz on July 25, 2010, 09:33:34 PM

Title: Nano 1G: ATA DMA broken in some devices aka "works only when cooled"
Post by: maraz on July 25, 2010, 09:33:34 PM
This has already been discussed at some length on #rockbox, but I thought best to archive this information here for possible future reference.

Enabling ATA DMA on pp5020 targets, as introduced in r24405 (http://svn.rockbox.org/viewvc.cgi?view=rev&revision=24405), breaks disk use on my Nano 1G 2GB (which I originally acquired right after the launch). Possible problems include:

1) "No .rockbox directory / Installation incomplete" on svn, or on earlier revisions I've tried, "No partition found. Insert USB cable and fix it.", on boot. Firmware boots normally, but doesn't see anything on the disk.
2) All writes by rockbox to the file system get corrupted with funny filenames. (~NVR@I.IN, nvr`m*"in, NVRAM.BHN, etc)
3) Extremely slow/stalled boot (constant disk I/O, so probably bottlenecked by it).

Cooling the device below the ambient 25°C/77°F before booting lets it start up normally, but after a song or two playback starts skipping, then loops over random fragments read from the file system (pieces of different songs, accompanied by MPEG clicks and pops) before either halting or crashing with a "*PANIC* Dir entry 2 in sector 0 is not free! E1 50 4F 44" or similar message.

Conversely, heating the unit makes even the slightest problem much worse. The original firmware works fine. I also did a full restore to see whether a fresh install of Rockbox would help, but it did not.

Someone on #rockbox suggested that the UDMA timings might be at fault. I tested this by forcing ATA_MAX_UDMA to always be 1 (instead of 2 when >= 30MHz), but this did not correct the problem.

So, I took r27545, commented out the following line from /firmware/export/config/ipodnano1g.h to disable DMA entirely (falling back to PIO)...

Code:
/* DMA is used only for reading on PP502x because although reads are ~8x faster
 * writes appear to be ~25% slower.
 */
#define HAVE_ATA_DMA

... and some compiling later, everything works, even when the unit is heated until the backplate is hot to the touch. So, at least on my Nano 1G, something is evidently wrong with the ATA DMA implementation.
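
For readers unfamiliar with how a config define like this gates a code path, here is a minimal, self-contained sketch. It is not the Rockbox driver: `choose_path()` and the `xfer_path` enum are hypothetical names, and only `HAVE_ATA_DMA` and the "DMA for reads only" rule come from the config comment quoted above. With the define commented out, the preprocessor drops the DMA branch and every transfer falls back to PIO.

```c
#include <assert.h>

/* Commented out, as in the modified ipodnano1g.h described above.
 * Remove the comment markers to compile the DMA branch back in. */
/* #define HAVE_ATA_DMA */

enum xfer_path { XFER_PIO, XFER_DMA };

/* Hypothetical helper: pick the transfer path for one request. */
enum xfer_path choose_path(int sectors, int is_write)
{
    assert(sectors > 0);
#ifdef HAVE_ATA_DMA
    /* Per the config comment: DMA is used only for reading on PP502x. */
    if (!is_write)
        return XFER_DMA;
#else
    (void)is_write; /* unused when DMA support is compiled out */
#endif
    return XFER_PIO;
}
```

In this sketch the decision is made at compile time, which is why a rebuild is needed after editing the header.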

My problems originally started after upgrading Rockbox to current, so it took some work to find out what caused the sudden problems. As most nanos are just fine with DMA reads, it's certainly a mystery why mine is the only one showing symptoms of this. It's possible that someone else has run into this bug but attributed it to aging hardware or just unstable software.

Attached is a patch to disable ATA DMA altogether, in case I'm not alone with my troubles.


Here's a list of revisions I tried as written proof of the existence of the problem (there are some doubting Thomases ;)

r27544 BAD
r25642 BAD
r24691 BAD
r24572 BAD
r24453 BAD
r24424 BAD
r24413 BAD
r24405 BAD  enables ATA DMA on pp5020
r24404 GOOD   
r24397 GOOD   
r24394 GOOD
r24334 GOOD
r24216 GOOD
r23740 GOOD
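
The GOOD/BAD list above amounts to a manual bisection. A minimal sketch of the same search in C follows. `first_bad()` and the `is_bad` callback are hypothetical names, and the callback stands in for building a revision and testing it on the device. The sketch assumes every revision before the culprit is good and every revision from it onward is bad.

```c
#include <assert.h>

/* Returns the lowest revision in (good, bad] for which is_bad() is true.
 * Precondition: `good` is known good, `bad` is known bad, good < bad. */
int first_bad(int good, int bad, int (*is_bad)(int rev))
{
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;   /* culprit is at mid or earlier */
        else
            good = mid;  /* culprit is after mid */
    }
    return bad;
}

/* Stand-in for a real test on hardware: per the list above,
 * r24405 is the first revision that misbehaves. */
int example_is_bad(int rev)
{
    return rev >= 24405;
}
```

Calling `first_bad(23740, 27544, example_is_bad)` narrows the range down to r24405 in about a dozen test builds.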
Title: Re: Nano 1G: ATA DMA broken in some devices
Post by: dreamlayers on July 27, 2010, 12:16:00 AM
Do errors still happen when the CPU is boosted?

You could also try multi-word DMA.  That is an older slower kind of ATA DMA.

It's interesting that you get corruption instead of read errors.  UltraDMA is supposed to safeguard data integrity using CRC, and CRC errors should be reported as read/write errors.
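
To illustrate the point about CRC: the receiver recomputes a checksum over the data it got and compares it with the sender's, so a bit flipped on the bus should show up as a mismatch (and thus an error) rather than as silently corrupted data. The sketch below is a generic bitwise CRC-16 using the polynomial x^16 + x^12 + x^5 + 1 (0x1021). The function name and the choice of initial value are assumptions for this example, and it is not claimed to be bit-exact with what an UltraDMA host and device compute on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic MSB-first CRC-16 with polynomial 0x1021.
 * `crc` is the initial value (seed) chosen by the caller. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len, uint16_t crc)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

Any single flipped bit in the buffer changes the result, which is what lets the receiving side reject a damaged transfer.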
Title: Re: Nano 1G: ATA DMA broken in some devices
Post by: maraz on July 27, 2010, 01:10:57 PM
Yes, errors still happen when the CPU is boosted. In fact, they seem to occur more frequently then, presumably due to the increased heat output.

As for MWDMA, I'd be happy to try it - is there a patch floating around somewhere?
Title: Re: Nano 1G: ATA DMA broken in some devices
Post by: dreamlayers on July 27, 2010, 03:10:46 PM
The code should support MWDMA.  Normally, it uses MWDMA when the device doesn't support UDMA but supports MWDMA.  Search for ATA_MAX_UDMA in firmware/drivers/ata.c and comment out the following:
Code:
    if (identify_info[53] & (1<<2))
        /* Ultra DMA mode info present, find a mode */
        dma_mode = get_best_mode(identify_info[88], ATA_MAX_UDMA, 0x40);
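
For context, here is a minimal sketch of how a mode-selection helper of this shape could work. It is an assumption about the logic, not a copy of firmware/drivers/ata.c: `pick_best_mode()` and `choose_dma_mode()` are hypothetical names, the mode-type values 0x40 (Ultra DMA) and 0x20 (multi-word DMA) follow the ATA SET FEATURES transfer-mode encoding, and the identify words used (53, 63, 88) follow the ATA specification. Passing `allow_udma = 0` mimics commenting out the UDMA branch, so that only the multi-word DMA check remains.

```c
#include <assert.h>
#include <stdint.h>

#define MODE_MWDMA 0x20  /* SET FEATURES transfer mode: multi-word DMA */
#define MODE_UDMA  0x40  /* SET FEATURES transfer mode: Ultra DMA */

/* Return (modetype | n) for the highest mode n <= max whose bit is set
 * in `supported`, or 0 if no such mode exists. */
int pick_best_mode(uint16_t supported, int max, int modetype)
{
    assert(max >= 0 && max < 16);
    for (int n = max; n >= 0; n--) {
        if (supported & (1u << n))
            return modetype | n;
    }
    return 0;
}

/* Hypothetical top-level choice: prefer UDMA if allowed and advertised,
 * otherwise fall back to multi-word DMA. Returns 0 if neither is usable. */
int choose_dma_mode(const uint16_t *identify, int max_udma, int max_mwdma,
                    int allow_udma)
{
    int mode = 0;
    if (allow_udma && (identify[53] & (1u << 2)))  /* word 88 is valid */
        mode = pick_best_mode(identify[88], max_udma, MODE_UDMA);
    if (!mode)
        mode = pick_best_mode(identify[63], max_mwdma, MODE_MWDMA);
    return mode;
}
```

Capping `max_udma` at 1 in this sketch corresponds to the ATA_MAX_UDMA experiment described earlier in the thread.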
Title: Re: Nano 1G: ATA DMA broken in some devices
Post by: maraz on July 29, 2010, 07:31:35 AM
Alright, I tested an MWDMA-only build of r27545. While it certainly seems a lot more stable than UDMA (no "No .rockbox directory" on startup, not once!), there are still read problems such as garbled fonts and skipping music, and writes tend to corrupt the file system. The symptoms take a lot longer to set in and seem less intense.
Title: Re: Nano 1G: ATA DMA broken in some devices
Post by: maraz on September 23, 2010, 04:04:23 PM
As there seem to be other cases of this bug appearing, it would be interesting to see whether it is tied to a specific chip model. Attached is the ATA identify info of my faulting nano, dumped by Rockbox.

Others suffering from the ATA problem, please go to System -> Debug -> Dump ATA identify info. This generates an identify_info.bin in the device root, which you can then post here.
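
For comparing dumps, a small host-side decoder can help. The sketch below is hypothetical and makes two assumptions that should be checked against a real dump: that the file holds the 256 identify words as 16-bit little-endian values, and that the word layout follows the ATA specification (word 53 bit 2 marks word 88 as valid, word 63 bits 0-2 list the supported multi-word DMA modes, word 88 bits 0-6 list the supported Ultra DMA modes). The names `identify_word()`, `decode_dma_caps()` and `dma_caps` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDENTIFY_BYTES 512  /* 256 words of 16 bits each */

struct dma_caps {
    unsigned mwdma_supported;  /* bitmask, bit n = MWDMA mode n */
    unsigned udma_supported;   /* bitmask, bit n = UDMA mode n;
                                  0 if word 88 is not marked valid */
};

/* Read identify word `index` from a raw dump (assumed little-endian). */
uint16_t identify_word(const uint8_t *buf, size_t index)
{
    assert(index < IDENTIFY_BYTES / 2);
    return (uint16_t)(buf[2 * index] | (buf[2 * index + 1] << 8));
}

/* Extract the DMA capability bitmasks from a 512-byte identify dump. */
struct dma_caps decode_dma_caps(const uint8_t *buf)
{
    struct dma_caps caps;
    caps.mwdma_supported = identify_word(buf, 63) & 0x07u;
    if (identify_word(buf, 53) & (1u << 2))
        caps.udma_supported = identify_word(buf, 88) & 0x7Fu;
    else
        caps.udma_supported = 0;
    return caps;
}
```

To use it on a dump, read the 512 bytes of identify_info.bin into a buffer (for example with `fread`) and pass the buffer to `decode_dma_caps()`.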
Title: Re: Nano 1G: ATA DMA broken in some devices
Post by: gevaerts on September 23, 2010, 05:08:06 PM
Quote from: maraz on September 23, 2010, 04:04:23 PM
Others suffering from the ATA problem, please go to System -> Debug -> Dump ATA identify info. This generates an identify_info.bin in the device root, which you can then post here.

Dumps from people without the problem (but still with a 1st gen nano, of course) would probably be useful too, to see which differences actually matter. Please say whether your dump is from a nano with or without the problem.
Title: Re: Nano 1G: ATA DMA broken in some devices aka "works only when cooled"
Post by: soap on September 24, 2010, 08:52:21 PM
Heat-exacerbated memory problems have affected some Nanos for at least three years (maraz is no stranger to this):
http://forums.rockbox.org/index.php?topic=12326.0
http://forums.rockbox.org/index.php?topic=14059.0
Title: Re: Nano 1G: ATA DMA broken in some devices aka "works only when cooled"
Post by: linkpalmer on May 10, 2011, 02:36:51 PM
This is likely a n00b question, but I suspect this is the problem with my iPod. How do I apply the .patch file to my build? Thanks guys
Title: Re: Nano 1G: ATA DMA broken in some devices aka "works only when cooled"
Post by: Buschel on May 10, 2011, 05:18:56 PM
ATA DMA is switched off in Rockbox 3.8.1 and in all trunk builds from r29476 on (March 1, 2011).