Rockbox Technical Forums



Author Topic: Data abort on Sansa Clip+ with latest daily build  (Read 3124 times)

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #15 on: February 16, 2023, 10:02:44 AM »
I'm getting more and more data aborts. Transcript of the latest one:

Data abort at 30063908
FSR 0x8
(domain 0, fault 8 )
address 0x772046d0
pc: 30063908
sp: 300d47a8
bt end
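
As an aside, here is a minimal sketch of how the two numbers in parentheses relate to the FSR value. It assumes the classic ARM fault status register layout (fault status in bits 3:0, domain in bits 7:4), which agrees with the transcript above. The function name decode_fsr is made up for illustration and is not a Rockbox function.

```python
from typing import Tuple


def decode_fsr(fsr: int) -> Tuple[int, int]:
    """Split an FSR value into (domain, fault status).

    Assumption: classic ARM layout, status in bits [3:0], domain in bits [7:4].
    """
    domain = (fsr >> 4) & 0xF
    fault = fsr & 0xF
    return domain, fault


if __name__ == "__main__":
    domain, fault = decode_fsr(0x8)
    # Mirrors the "(domain 0, fault 8)" line of the crash screen.
    print(f"(domain {domain}, fault {fault})")
```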

I'm running the 20230215 daily with the following config (recreated from scratch and not imported from a previous firmware version):

volume: -40
bass: 22
treble: 1
repeat: all
contrast: 50
battery display: numeric
eq enabled: on
eq low shelf filter: 32, 7, -1
surround enabled: 8
afr enabled: moderate
pbe: 75
compressor knee: hard knee
fms: -
wps: /.rockbox/wps/informativebeauty.wps
iconset: -
viewers iconset: -
qs top: -
qs left: shuffle
qs right: repeat
qs bottom: -
Semitone pitch change: on

I do not have the additional fonts package installed, just the Informative Beauty theme. Tonight I'll try to perform more bisecting and try to reproduce the bug with the steps bahus pointed out.
« Last Edit: February 16, 2023, 10:14:16 AM by Carlo »
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #16 on: February 17, 2023, 12:42:24 PM »
(Replying here from the Sansa Zip thread)

I'm still getting crashes, although they seem less frequent than before: I had three in an hour and a half. Could you please provide me with your build+mapfile? Thanks.

In case it helps, I've noticed the crashes happen more often while cycling through relatively few files (10-15) inside the same directory than through larger directories (100+ files), and they always happen immediately after switching to the next or previous track. The earliest build I can confirm has the bug is this one:

2023-01-15 buflib: Add CONFIG_BUFLIB_BACKEND for selecting a buflib backend

Could the bug be related to one of these commits?

2023-01-12 add chunk_alloc to playlist.c #2   or
2023-01-12 [BugFix] playlist.c DIRCACHE stop scanning when changing indices
« Last Edit: February 17, 2023, 02:02:14 PM by Carlo »
Logged

Offline Bilgus

  • Developer
  • Member
  • *
  • Posts: 913
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #17 on: February 17, 2023, 03:13:11 PM »
Who knows. Random bugs are extremely hard to figure out without a way to reliably reproduce them.
If you can get me something that will crash every time, git bisect is the fastest way to figure it out.

Speculating and guessing at which commit is responsible is a hard way to find bugs,
unless it's a glaringly obvious bug.

The last two bugs I fixed were not obvious to me at first glance,
but I had an order of operations that made for a reproducible bug.
Then I had a place to start looking and a way to check whether it was fixed afterwards.

If they were obvious I'd have caught them when I was coding and testing.
The same goes for the others' commits as well, and probably more so.

If you are compiling your own builds you can grab the mapfile for your build; it's going to be in the build dir and it's called rockbox.map.
Every build updates the mapfile, so you just need to type
Code:
git checkout master
Code:
make -j2 && make fullzip
Wait for Rockbox to build, then open 'rockbox.map' in the same directory as 'rockbox-full.zip'.

Later, if you want to show us, just copy and paste the section your pc was pointing at, with a bit of context.

Example:
So I got a pc of 3006a080.
The closest I can find is 0x00000000300699b4                font_load_ex

Quote
.text        0x0000000030069180     0x1098 /rockbox/.BUILDS/ClipZip/firmware/libfirmware.a(font.o)
                ..........
                0x00000000300697a0                font_get
                0x0000000030069800                font_get_width
                0x00000000300698bc                font_get_bits
                0x00000000300699b4                font_load_ex
                0x000000003006a128                font_load
                0x000000003006a138                font_getstringnsize
                .........

So now I'll start looking for commits that touched this code first.
But the thing is, even armed with that data, it still might be the wrong place.
Who's to say it's not code elsewhere? A buffer overrun could cause issues
in a different place too, by overwriting someone else's memory or code.

Your map file could help with that too, once you're armed with enough info to go looking for memory addresses of buffers and functions.

Still, for recently introduced bugs git bisect is way faster than all of that; it allows you to narrow it down to the commit that broke it.
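
The manual lookup described above (find the symbol with the highest address that is still at or below the pc) can be sketched in a few lines. This is only an illustration, not a Rockbox tool: the helper names load_symbols and nearest_symbol are made up, and the parser only understands lines shaped like the ones in the quoted excerpt (an address followed by a single symbol name). The sample data is the excerpt and pc from the example above.

```python
import re

# Matches map lines of the form "   0x00000000300699b4   font_load_ex".
SYMBOL_LINE = re.compile(r"^\s*0x([0-9a-fA-F]+)\s+([A-Za-z_][\w.$]*)\s*$")


def load_symbols(map_text):
    """Return a sorted list of (address, name) pairs found in the map text."""
    symbols = []
    for line in map_text.splitlines():
        match = SYMBOL_LINE.match(line)
        if match:
            symbols.append((int(match.group(1), 16), match.group(2)))
    return sorted(symbols)


def nearest_symbol(symbols, pc):
    """Return the (address, name) pair with the highest address <= pc, or None."""
    best = None
    for address, name in symbols:
        if address > pc:
            break
        best = (address, name)
    return best


# Excerpt copied from the quoted map section above.
EXCERPT = """
                0x00000000300697a0                font_get
                0x0000000030069800                font_get_width
                0x00000000300698bc                font_get_bits
                0x00000000300699b4                font_load_ex
                0x000000003006a128                font_load
                0x000000003006a138                font_getstringnsize
"""

if __name__ == "__main__":
    pc = 0x3006A080
    address, name = nearest_symbol(load_symbols(EXCERPT), pc)
    print(f"{name} + 0x{pc - address:x}")  # prints: font_load_ex + 0x6cc
```

For a real build, the text of rockbox.map would be read from the build directory instead of the inline excerpt. The result is only a starting point: the map may not list every function, and as noted above the faulting code is not necessarily where the damage was done.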
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #18 on: February 17, 2023, 04:32:45 PM »
I am indeed using bisect too. However, as you said, there's no surefire way to trigger the bug, so I may tag as good a commit that's actually bad. After I updated to the latest master and installed the build on my Clip+, the bug showed up almost 40 minutes after booting, so it may indeed take quite a bit of time to trigger it. Again, I have used Rockbox for years on several devices with a very similar configuration and setup, and I never encountered a single data abort error.

The issue might also be present in the 20221230 and 20221231 builds, even though I used them for several hours without encountering any data abort.

I'll keep bisecting and trying to pinpoint a specific commit. However, I'm taking things slowly, as I want to be absolutely sure that a given commit is good before proceeding to the next one. I'll try testing on multiple Clip+ players at once.
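
A rough way to judge how long a commit needs to run before it can be marked good: assuming (and this is only an assumption) that crashes arrive at random with a constant average rate, the chance that a bad build survives a test run without crashing is exp(-test time / mean time to crash). The 40-minute mean below is borrowed from the single observation above and is not a measured average; the function name is illustrative.

```python
import math


def chance_bad_build_survives(test_minutes, mean_minutes_to_crash):
    """Probability that a buggy build runs for test_minutes without crashing.

    Assumption: exponentially distributed time to crash (constant crash rate).
    """
    return math.exp(-test_minutes / mean_minutes_to_crash)


if __name__ == "__main__":
    for test_minutes in (60, 120, 180):
        p = chance_bad_build_survives(test_minutes, 40)
        print(f"{test_minutes} min: {p:.1%} chance a bad build shows no crash")
```

Under these assumptions a bad build would still pass a one-hour test about 22% of the time, a two-hour test about 5% of the time and a three-hour test about 1% of the time, which is why a "good" verdict after a single hour deserves some caution.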
« Last Edit: February 17, 2023, 06:16:10 PM by Carlo »
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #19 on: February 18, 2023, 10:14:50 AM »
A quick update about the bisecting saga: I've used the following build for more than one hour and didn't get any crash:

2023-01-12 Add INIT_ATTR to system_init()    <- This one seems good

After the next bisection step I compiled and installed the following:

2023-01-13 toolchain: Bump zlib to 1.2.13 due to 1.2.12 being withdrawn   <- Got a crash here

and got a crash before the two-minute mark:

Data abort at 30062e54
FSR 0x8
(domain 0, fault 8 )
address 0x8831881d
pc: 30062e54
sp: 300d3960
bt end

Looking at rockbox.map, it seems 0x30062e54 is related to font_load_ex(), as you already discovered:

0x0000000030061fdc                font_get_bits
0x00000000300620d4                font_load_ex
0x0000000030062850                font_load

The only commit between 20230112 and 20230113 related to fonts is this one:

"Avoid using buflib names for storing font paths"
https://git.rockbox.org/cgit/rockbox.git/commit/?id=879888b158376f1ea2c92dd49e0c7617d07fd5b2

I'll perform some more testing later tonight.


Edit: 20230113 "Remove buflib allocation names, part one" crashes too. I've narrowed down the list of candidates to just 7 commits or so. The latest crash:

Data abort at 30063348
..
pc:30063348
sp:300d4040

0x00000000300632f0                lcd_setfont
0x0000000030063304                lcd_getfont
0x0000000030063318                lcd_getstringsize


Edit 2: this is my current bisection range:

2023-01-13 Remove buflib allocation names, part one  <- Bad (crash < 2 minutes after booting)
2023-01-13 Avoid using buflib names for storing font paths  <- Bad (crash < 2 minutes after booting)
2023-01-13 keyboard.c make editline respect current UI font <- Bad (crash after 30-40 mins, pc 3006241c, font_get)
2023-01-12 add chunk_alloc to playlist.c #2
2023-01-12 [BugFix] playlist.c DIRCACHE stop scanning when changing indices  <- Good (no crash after 1+ hour)
2023-01-12 Fix red in 7e5fc4076a
2023-01-12 Add INIT_ATTR to i2c_init()
2023-01-12 Add INIT_ATTR to system_init() <- Good (no crash after 1h+ hour)

The pc on my latest crash (2023-01-13 keyboard.c make editline respect current UI font) is 3006241c:

0x00000000300623f8   font_get
0x0000000030062458  font_get_width
0x0000000030062514  font_get_bits

However, I have the strong impression that the latest crash may be caused by the font_filename or chunk_alloc() bugs that you've fixed in your last commits, as it happened at least 30 minutes after booting, instead of after just a few minutes like with the later builds. The issue that still hasn't been fixed gets triggered in just a few minutes on my device.

I will test "2023-01-13 Avoid using buflib names for storing font paths" and see how long it takes to crash, as I believe the latest crash isn't related to the issue that remains undiscovered. (It crashes after a few minutes).


« Last Edit: February 18, 2023, 01:57:28 PM by Carlo »
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #20 on: February 18, 2023, 01:59:59 PM »
Creating a new message to avoid confusion. This is my final bisecting report:

2023-01-13 Remove buflib allocation names, part one  <- Bad (crash < 2 minutes after booting)
2023-01-13 Avoid using buflib names for storing font paths  <- Bad (crash < 2 minutes after booting)
2023-01-13 keyboard.c make editline respect current UI font <- Bad (crash after 30-40 mins, pc 3006241c, font_get, bug already fixed?)
2023-01-12 add chunk_alloc to playlist.c #2
2023-01-12 [BugFix] playlist.c DIRCACHE stop scanning when changing indices  <- Good (no crash after 3+ hours)
2023-01-12 Fix red in 7e5fc4076a
2023-01-12 Add INIT_ATTR to i2c_init()
2023-01-12 Add INIT_ATTR to system_init() <- Good (no crash after 1+ hour)

I believe the bug that has yet to be identified and fixed was introduced in "2023-01-13 Avoid using buflib names for storing font paths", and that the crash I had with "2023-01-13 keyboard.c make editline respect current UI font" was related instead to one of the bugs you've fixed in the latest commits.

"2023-01-13 Avoid using buflib names for storing font paths" crashes extremely quickly for me; I just need to skip a few songs inside a directory to trigger the bug. I believe that's the culprit.
« Last Edit: February 18, 2023, 06:33:13 PM by Carlo »
Logged

Offline Bilgus

  • Developer
  • Member
  • *
  • Posts: 913
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #21 on: February 18, 2023, 10:23:39 PM »
Is this one good or bad: '2023-01-12 add chunk_alloc to playlist.c #2'?
I would figure it's the culprit.
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #22 on: February 19, 2023, 07:08:37 AM »
I thought that I already had implicitly tested "2023-01-12 add chunk_alloc to playlist.c #2" when I tried "2023-01-13 keyboard.c make editline..", as it's the next immediate commit and the keyboard changes seem absolutely minimal and safe.

"2023-01-13 Avoid using buflib names.." crashes almost immediately. This is the first commit that's extremely unstable for me. I simply have to boot the Clip+, start playing a file at random and then quickly skip tracks to get a crash. The crash is triggered immediately after a song is loaded, never while it's playing.

"2023-01-13 keyboard.c make editline.." (so, transitively, "2023-01-12 add chunk_alloc.." too) crashed once after quite a bit of time, so I believe the crash itself may be caused by one of the chunk_alloc bugs you've already fixed, as the different bug I'm experiencing is triggered way more quickly.

"2023-01-12 [BugFix] playlist.c DIRCACHE stop scanning" was tested extensively and never crashed.
« Last Edit: February 19, 2023, 08:23:00 AM by Carlo »
Logged

Offline Bilgus

  • Developer
  • Member
  • *
  • Posts: 913
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #23 on: February 19, 2023, 11:32:00 AM »
The thing is, "2023-01-13 Avoid using buflib names.."
is probably a red herring. Likely it's just more susceptible
to the bug, since a buffer overrun might be more likely to wipe out important data.
Let's try removing that chunked alloc, and I'll give you a build
to try.
I'd expect there to be a bug in my code more than amachronic's anyway,
but it still might be either, and moving stuff around or adding more memory pressure is
causing an existing bug to be exposed.

Here is a build to try with playlist name chunking disabled.
Since you are building your own builds you should just be able to compile it:
Code:
git checkout 2703cc0599034f6d72b2c0eeee7bd4d6ac399bed

https://gerrit.rockbox.org/r/c/rockbox/+/5147
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #24 on: February 19, 2023, 12:03:16 PM »
Thank you, I'm going to compile a new build right now.

In order to fetch your changes I had to do the following:

rb@debian:~/rockbox$ git ls-remote | grep -i '2703cc05'
From git://git.rockbox.org/rockbox
2703cc0599034f6d72b2c0eeee7bd4d6ac399bed        refs/changes/47/5147/1

rb@debian:~/rockbox$ git pull origin refs/changes/47/5147/1

I can confirm playlist.c and playlist.h have been updated, and I'll test the new build and give you an update ASAP.

Edit: no crashes after one hour of testing.
« Last Edit: February 19, 2023, 01:08:29 PM by Carlo »
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #25 on: February 19, 2023, 05:07:26 PM »
I've used my Clip+ for more than two hours without a single issue, so it seems your workaround nailed it. Thank you!
Logged

Offline Bilgus

  • Developer
  • Member
  • *
  • Posts: 913
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #26 on: February 19, 2023, 09:47:54 PM »
Well, it's apparently something with my code. Do me a favor and continue testing that build.
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #27 on: February 20, 2023, 05:12:30 AM »
Absolutely, I will continue to use it on my daily driver. I'm also available for testing your additional builds on my other Clip+ players before they get pushed to master.
Logged

Offline Carlo

  • Member
  • *
  • Posts: 49
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #28 on: February 20, 2023, 12:35:53 PM »
I've used my Clip+ for several hours and it's been completely bug free. Zero crashes!
Logged

Offline Bilgus

  • Developer
  • Member
  • *
  • Posts: 913
Re: Data abort on Sansa Clip+ with latest daily build
« Reply #29 on: February 21, 2023, 02:25:59 PM »
I've added a test function to test the data integrity of the chunk allocator,
but I've yet to find any bugs in my implementation.
I'll keep trying, but it makes me wonder whether it's just putting pressure on buflib and causing a bug to appear, or whether there is something I'm missing in the playlist functions using chunk_alloc.
Logged
