unbuffered file write
Posted: Fri Dec 20, 2013 2:17 pm
by julia.s
Hello everyone,
I am new here, thank you for letting me register with your discussion club.
I am working on cluster monitoring and management and one of the functions is to periodically monitor I/O degradation with load.
My current monitoring script is shell-based and ends up spawning many processes. For disk I/O it uses the popular
dd if=/dev/zero of=test_$$ bs=64k count=64 conv=fdatasync
but I would like to replace it with a PHP implementation in an effort to reduce process spawning and the impact of the monitoring solution on the monitored container. Shell math is awkward, and each expression or captured scrap of output requires a process spawn.
I have this:
Code: Select all
$f = fopen($path, 'wb');
stream_set_write_buffer($f, 0);
for ($i = 0; $i < $count; $i++) {
    fwrite($f, $blk);
}
fflush($f);
fclose($f);
But no matter what, the operation is always buffered up to a certain file size. I have a routine that keeps calling the test and doubling the file size until there is a significant drop in the calculated I/O speed, but it ends up going all the way to a 1-2 GB file before it gets a believable I/O rate, which defeats the original purpose of migrating the script to PHP to lower its impact.
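For reference, the timing harness around my write loop looks roughly like this (a sketch; the 64k block and 16 MB total are just to match the dd test, and note that fflush() only empties PHP's stream buffer, not the kernel page cache, which is why small files report cache speed):

```php
<?php
// Timed write test (sketch). fflush() flushes PHP's stream buffer
// only; the kernel page cache can still absorb the whole file, so
// for small files this measures memory speed, not disk speed.
$path  = 'test_' . getmypid();
$blk   = str_repeat('x', 65536);   // 64k block, like the dd test
$count = 256;                      // 256 * 64k = 16 MB total

$t0 = microtime(true);
$f = fopen($path, 'wb');
stream_set_write_buffer($f, 0);
for ($i = 0; $i < $count; $i++) {
    fwrite($f, $blk);
}
fflush($f);
fclose($f);
$elapsed = microtime(true) - $t0;

printf("%.2f MB/s\n", ($count * 65536 / 1048576) / $elapsed);
unlink($path);
```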
Is there a way to do a true unbuffered disk write in PHP with small files (say 16 MB), or should I just stick with calling dd? Spawning 1-2 processes is still way better than calling hundreds of them every 5 minutes.
Thank you
Julia
Re: unbuffered file write
Posted: Mon Dec 23, 2013 2:34 am
by Eric!
For starters, check what stream_set_write_buffer() returns. If it returns 0 then it worked; otherwise the stream is still buffered.
Why are you starting hundreds of processes to measure disk I/O?
Re: unbuffered file write
Posted: Wed Dec 25, 2013 10:20 pm
by julia.s
As shown in my question and example, I already use stream_set_write_buffer(). It does not disable buffering.
There are not hundreds of processes just to measure disk I/O; it is a shell script that performs a dozen other functions besides I/O monitoring. The I/O portion is:
iorate=$( (dd if=/dev/zero of=test_$$ bs=64k count=$((ioblk*16)) conv=fdatasync && rm -f test_$$) 2>&1 | tail -1 | awk '{ print $(NF-1) $NF }' )
As you can see, that line alone spawns 5 processes. I have the script report its PID into a CSV file along with the measurements, and the PID increases by ~135 every 5 minutes. Even something as trivial as echo is a process. Multiplying two values spawns two more processes:
rate=$(echo $rate*1000 | bc)
In PHP the same thing is a single expression with no process spawn: $rate = $rate * 1000;
Re: unbuffered file write
Posted: Thu Dec 26, 2013 5:00 pm
by Eric!
Specifically I meant for you to check what it returns like this:
Code: Select all
$result = stream_set_write_buffer($f, 0);
if ($result === 0) {
    echo "Unbuffered write enabled";
} else {
    var_dump($result);
}
Can you just write blocks of data, buffered or not, then flush and close the file, and measure the elapsed time of the whole operation? You'll have some PHP overhead built into it, but your measurements are mostly relative anyway, right?
Re: unbuffered file write
Posted: Thu Dec 26, 2013 11:34 pm
by julia.s
Correct, the measurements are relative and only need to show degradation. Unfortunately, writing data into a buffer tells me nothing about the disk.
I found several ways to perform non-buffered I/O. If a file has not been accessed in quite some time, it is not going to be in the cache; the only question is how long the filesystem holds a file in the cache. Actually, there are several caching layers, including the hard drive itself.
The first thing I tried is reading the root device directly. This requires root; adding the user to the "disk" or "root" groups does not seem to help. This access is also buffered, but I found the cache only holds for less than a minute on my test machine, and it can be defeated by moving the seek position. Keep in mind some skip is needed for read-ahead; reading 16 MB and then skipping the next 16 MB worked on my machine. Perhaps taking several reads from different areas of the disk and throwing away the max/min would help remove some cache hits.
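The raw-device read I described looks roughly like this (a sketch; /dev/sda is an assumption for the root device, the 16 MB read/skip sizes are what worked on my machine, and it must run as root):

```php
<?php
// Read 16 MB chunks from the raw device, skipping 16 MB between
// reads to defeat read-ahead. Requires root. /dev/sda is an
// assumption; substitute your actual root device.
$chunk = 16 * 1048576;
$reads = 4;

$f = fopen('/dev/sda', 'rb');
$t0 = microtime(true);
for ($i = 0; $i < $reads; $i++) {
    fread($f, $chunk);
    fseek($f, $chunk, SEEK_CUR);   // skip past the read-ahead window
}
$elapsed = microtime(true) - $t0;
fclose($f);

printf("%.2f MB/s\n", ($reads * $chunk / 1048576) / $elapsed);
```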
The second thing I tried was the DIO PECL extension. It does not require root privileges. On my machine the cache held for less than 30 seconds, but this may vary from hard drive to hard drive and depend on how much is going on; this machine has several daemons that periodically access a database and log into several files. If the hard drive has 64 MB of cache and there is not much going on, the data may still be there even after several minutes. I will have the script self-calibrate on its first run on an idle machine and record certain system specs, like the maximum I/O rate for the machine, so if I accidentally hit the cache it may be safe to assume the maximum rate from the first run instead.
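A minimal version of my DIO test (a sketch; assumes the dio extension is loaded, and the kernel page cache of course still sits underneath the raw write() calls):

```php
<?php
// Sketch using the PECL dio extension (no root required).
// dio_* calls go straight to read()/write() with no PHP stream
// buffering; the kernel page cache still applies.
$path = 'dio_test_' . getmypid();
$blk  = str_repeat('x', 65536);    // 64k block

$fd = dio_open($path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
$t0 = microtime(true);
for ($i = 0; $i < 256; $i++) {     // 256 * 64k = 16 MB
    dio_write($fd, $blk);
}
dio_close($fd);
$elapsed = microtime(true) - $t0;

printf("%.2f MB/s\n", 16 / $elapsed);
unlink($path);
```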
Obviously, if I am not running as root and the DIO extension is not available, there is no helping it: as a fallback I have to call dd.
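Even the dd fallback is a single shell invocation from PHP rather than the whole pipeline (a sketch; parsing the last two fields of dd's summary line is an assumption about the dd version and locale):

```php
<?php
// Fallback: one shell invocation of dd instead of the full
// dd|tail|awk pipeline. GNU dd prints a summary on stderr like
// "16777216 bytes (17 MB) copied, 0.21 s, 80.1 MB/s"; we take
// the last two whitespace-separated fields as the rate.
$path = 'test_' . getmypid();
$out  = shell_exec(
    "dd if=/dev/zero of=$path bs=64k count=256 conv=fdatasync 2>&1"
);
unlink($path);

$lines  = explode("\n", trim($out));
$fields = preg_split('/\s+/', end($lines));
$rate   = $fields[count($fields) - 2] . ' ' . $fields[count($fields) - 1];
echo $rate, "\n";
```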
Re: unbuffered file write
Posted: Fri Dec 27, 2013 12:14 am
by Eric!
If you call either fflush() or fclose(), that should purge the buffer immediately. If you have hardware caching going on, that is a different story. Here's some pseudocode:
1. Build a $block string, say 65536 characters, to match your 64k dd block size
2. fopen() a test file
3. Start the timer
4. fwrite() the $block
5. fclose()
6. Stop the timer
Do this about 20 times and see if the results are close to what dd produces and/or if they are consistent.
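In PHP the steps above might look like this (a sketch; the 20-run count and file name are placeholders):

```php
<?php
// The six pseudocode steps, repeated 20 times.
$block = str_repeat('x', 65536);       // step 1: 64k, matching dd
$rates = array();

for ($run = 0; $run < 20; $run++) {
    $path = 'timing_test_' . getmypid();
    $f = fopen($path, 'wb');           // step 2
    $t0 = microtime(true);             // step 3
    fwrite($f, $block);                // step 4
    fclose($f);                        // step 5
    $elapsed = microtime(true) - $t0;  // step 6
    $rates[] = (65536 / 1048576) / $elapsed;   // MB/s
    unlink($path);
}

printf("min %.1f  max %.1f MB/s\n", min($rates), max($rates));
```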
How do you know how long the data is staying in the cache? Is this a HW or SW cache?