
data writing-to-file causing intermittent hanging

qmay123
I've got a C++ program recording various telemetry data read from serial ports, etc. The program records data at approximately 12 KB/s, buffers it for about 4 seconds, and writes it to file in chunks of about 48 KB.

I noticed that around every 30 seconds the program would 'freeze', causing a loss of incoming data. I switched from a Class 4 to a Class 10 microSD card, which reduced the frequency of the 'freeze'. I then multi-threaded the program to double-buffer the write data, which reduced the frequency even more, but it still hangs about once every 2.5 minutes and loses a good number of samples in the process. FYI, the code was tested on a desktop Windows machine and a desktop Ubuntu machine and works exactly as expected: no errors, great data.

I'm using the pre-built image from Dec. 2011 (the most recent pre-built image). I'm wondering if there are any I/O settings or something else that would cause this kind of behavior. The data size and speed are well within the SD card's capability and the code has been thoroughly debugged. Any suggestions?

Re: data writing-to-file causing intermittent hanging

qmay123
In case it helps anyone, initial testing shows that moving to an ext4 filesystem further reduced this problem (although it did not eliminate it completely).

Testing the identical process on a temporary RAMFS completely eliminated the issue. That is unacceptable in terms of memory loss, but it does point to the data-writing process as the culprit rather than the code routine itself.

If anyone has any tips to make data writing safer/more reliable it'd be appreciated!

Also, has anyone worked with industrial-grade microSD cards, such as these: http://www.atpinc.com/p2-4a.php?sn=00000391 ?

Thanks!

Re: data writing-to-file causing intermittent hanging

Ryan
qmay,

Maybe you could run the OS from the NAND flash and just store the data on the microSD card? I haven't used SLC microSD cards before, but I have used SLC USB drives and the write speeds are significantly faster. It is very noticeable when writing large amounts of data or cloning entire drives.

Re: data writing-to-file causing intermittent hanging

Scott Ellis
In reply to this post by qmay123
Some code or pseudo code from you might help.

Are you dropping data because the serial reader thread is blocked
by the writer thread?

Or is your whole system somehow freezing while disk writes are
going on causing you to miss data?

You could test this second case by running your current program
writing to a RAMFS and at the same time running another program
doing disk writes to try and freeze the system.


Hopefully it's the first case.

If so, it's not clear to me what your double-buffering step entails
or how adding an extra copy helps things.


Our typical multi-threaded data routines look like this

Collection Thread (queue writer)

loop:
  alloc buffer (or more typically grab from a pre-allocated pool)
  read data from device
  add buffer to queue


Disk Thread (queue reader)

loop:
  if queue empty
    sleep
  else
    remove buffer from queue
    write data to disk
    free buffer (or return to pool)


With only one queue writer and reader, moving only the head
or tail respectively, you don't even need locking.

The only way the collection thread gets blocked with this algorithm is
if we run out of memory for buffers. But if that's happening then
we are truly exceeding the throughput limits of the system.
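
A minimal C++11 sketch of the single-writer, single-reader queue described above, under a few assumptions that are not from the original routine: a fixed-size ring instead of a linked queue, and std::atomic indices with acquire/release ordering so neither the compiler nor the CPU can move the slot write past the index update. The names SpscQueue and Buffer are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical buffer type; a pre-allocated pool would work the same way.
using Buffer = std::vector<unsigned char>;

// Single-producer / single-consumer ring of N slots. One slot always stays
// empty so that "full" and "empty" can be told apart: N-1 usable slots.
template <std::size_t N>
class SpscQueue {
    static_assert(N >= 2, "need at least one usable slot");

public:
    // Call only from the collection thread. Returns false when the queue is
    // full, i.e. the disk thread has fallen N-1 buffers behind.
    bool push(Buffer&& b) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full: the caller decides whether to drop or wait
        }
        slots_[tail] = std::move(b);
        tail_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Call only from the disk thread. Returns false when the queue is empty.
    bool pop(Buffer& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty: the caller can sleep and retry
        }
        out = std::move(slots_[head]);
        head_.store((head + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<Buffer, N> slots_{};
    std::atomic<std::size_t> head_{0};  // advanced only by the consumer
    std::atomic<std::size_t> tail_{0};  // advanced only by the producer
};
```

In this sketch the collection thread never waits on the disk thread. The only failure it can see is push() returning false, which corresponds to the "out of buffers" case above.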

Your throughput requirements are not that big.

Re: data writing-to-file causing intermittent hanging

qmay123
Thanks Ryan. I might try that, though I would need to flash a NAND with an updated kernel first.


jumpnow:

Pseudo code:

Collection Thread

loop:
  read data from device
  if data is new
    store data in shared global vars (lock var mutex before, unlock after)


Buffer Thread

loop:
  if time_elapsed > x
    grab data from shared global vars (lock var mutex before, unlock after)
    place data in preallocated buffer
    if data has been grabbed 100 times
      trip the write flag


Write Thread

loop:
  if write flag is tripped
    copy buffer locally
    clear buffer
    write(local copy)
  sleep for a little bit before checking the write flag again

The purpose of the write thread is to let the OS decide the CPU priority given to writing data, in the hope that it can schedule the write more intelligently than if it were in the buffer thread, and to keep the write from blocking the buffer thread.

The 'hang' that I see is that the collection loop stops updating the shared data while the buffer thread continues to grab it at the same rate, so I see the same values repeated for about 0.5 seconds' worth of samples (several lines). So the whole system is not frozen, since the buffer thread is still going. I thought for a while that the collection thread had a bug, but the experiments and fixes I described all point to data writing as the bottleneck, if I'm not mistaken. Running flawlessly when writing to ramfs, in my mind, confirms that suspicion. What do you think?





Re: data writing-to-file causing intermittent hanging

Scott Ellis
I guess I can't tell from your pseudo code where the problem is.

I agree that disk writing latency is probably your issue, but latencies in the
disk writing thread should not be impacting the collection thread.

If it does, then you've negated a primary reason to multi-thread.

As long as the system can sustain your total write throughput requirements,
then your algorithm should be buffering any speed differences between the
threads.

You could test your algorithm by putting a sleep (maybe random) into your
current disk writer code when you are using a ram disk. See if you get the
same symptoms.
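
A rough sketch of that test in C++11. The helper name inject_write_delay and the upper bound are hypothetical; the idea is just to call it right before the write() in the writer thread while the output file lives on the ram disk.

```cpp
#include <chrono>
#include <random>
#include <thread>

// Hypothetical helper: sleeps for a random time in [0, max_ms] milliseconds
// to imitate a slow disk write, and returns the delay that was chosen so it
// can be logged next to any dropped samples.
std::chrono::milliseconds inject_write_delay(std::mt19937& rng, int max_ms) {
    std::uniform_int_distribution<int> dist(0, max_ms);
    const std::chrono::milliseconds delay(dist(rng));
    std::this_thread::sleep_for(delay);
    return delay;
}
```

An upper bound of around 500 ms would roughly match the 0.5 second gaps reported earlier in the thread. If the repeated samples come back with the ram disk plus this delay, the threading design is letting writer latency reach the collection side.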

Putting a timer around your disk writing code could get you some idea of the
variability and worst case times when using the disk.
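
One possible way to do that, as a sketch only. WriteStats and timed_write are hypothetical names, and the sketch assumes the writer uses C stdio (fwrite/fflush); the same pattern applies to whatever write call the real code uses.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>

// Running statistics over the measured write times, in milliseconds.
struct WriteStats {
    std::size_t count = 0;
    double total_ms = 0.0;
    double worst_ms = 0.0;

    void record(double ms) {
        ++count;
        total_ms += ms;
        if (ms > worst_ms) worst_ms = ms;
    }

    double mean_ms() const {
        return count ? total_ms / static_cast<double>(count) : 0.0;
    }
};

// Times one fwrite + fflush of len bytes and records the elapsed time.
// Returns the number of bytes fwrite reported as written.
std::size_t timed_write(std::FILE* f, const void* data, std::size_t len,
                        WriteStats& stats) {
    const auto t0 = std::chrono::steady_clock::now();
    const std::size_t n = std::fwrite(data, 1, len, f);
    std::fflush(f);  // hands the data to the kernel, not necessarily the card
    const auto t1 = std::chrono::steady_clock::now();
    stats.record(std::chrono::duration<double, std::milli>(t1 - t0).count());
    return n;
}
```

Printing worst_ms and mean_ms() every few minutes should show whether the occasional write takes far longer than the average. Note that fflush only pushes data into the kernel's page cache, so a stall may show up on a later call than the one that queued the data.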

You could also copy some files from the shell and measure the O/S disk write
speeds on your system. There's no reason your app shouldn't be able to match
this.

Just some ideas.

FWIW, I know we've done systems with higher throughput than this on the Gumstix.





Re: data writing-to-file causing intermittent hanging

qmay123
Thanks jump, I appreciate the thorough answers. I might go a little timer crazy on some of the processes and see if I notice anything.

I think I'll also run the process in ramfs and have a shell script copy over the write file at a certain rate and see if the behavior persists. Will report results.

Thanks!