The input and output in a web
application usually flow between browser, server, and database, but there are
many circumstances in which files are involved too. Files are useful for
retrieving remote web pages for local processing, storing data without a
database, and saving information that other programs need access to. Plus, as
PHP becomes a tool for more than just pumping out web pages, the file I/O
functions are even more useful.
PHP's interface for file I/O is similar
to C's, although less complicated. The fundamental unit for identifying a file to
read from or write to is a file handle. This
handle identifies your connection to a specific file, and you use it for
operations on the file. This chapter focuses on opening and closing files and
manipulating file handles in PHP, as well as what you can do with the file
contents once you've opened a file. Chapter 19 deals with
directories and file metadata such as permissions.
This code opens /tmp/cookie-data, writes the value of a cookie to it, and
closes the file:
$fh = fopen('/tmp/cookie-data','w') or die("can't open file");
if (false === fwrite($fh,$_COOKIE['flavor'])) { die("can't write data"); }
fclose($fh) or die("can't close file");
The function fopen( ) returns a
file handle if its attempt to open the file is successful. If it can't open the
file (because of incorrect permissions, for example), it returns false.
Section 18.2 and Section 18.4 cover ways to open files.
The function fwrite( ) writes
the value of the flavor cookie to the file handle. It returns the
number of bytes written. If it can't write the string (not enough disk space,
for example), it returns false.
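The return values described above can also be checked without calling die( ). The following is a minimal sketch, not code from this chapter: the helper name save_string( ) and the temporary file are assumptions made for illustration. It treats a short byte count from fwrite( ) as a failure and checks fclose( ) as well.

```php
<?php
// Hypothetical helper (not from the chapter): write $data to $path and
// report success or failure instead of calling die( ).
function save_string(string $path, string $data): bool
{
    $fh = fopen($path, 'w');
    if ($fh === false) {
        return false;   // could not open (permissions, bad path, ...)
    }
    // fwrite( ) returns the number of bytes written, or false on failure.
    // A count shorter than the string means only part of it was written.
    $written = fwrite($fh, $data);
    $ok = ($written !== false && $written === strlen($data));
    // fclose( ) also returns false on failure, so check it too.
    return fclose($fh) && $ok;
}

// Usage: write to a throwaway file in the system temp directory.
$path = tempnam(sys_get_temp_dir(), 'cookie-data');
$saved = save_string($path, 'chocolate chip');
var_dump($saved);
```

Returning a boolean leaves the decision about what to do on failure (retry, log, or stop) to the caller.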
Last, fclose( ) closes the file
handle. This is done automatically at the end of a request, but it's a good idea
to explicitly close all files you open anyway. It prevents problems using the
code in a command-line context and frees up system resources. It also allows you
to check the return code from fclose( ). Buffered data might not actually be
written to disk until fclose( ) is called, so it's here that
"disk full" errors are sometimes reported.
As with other processes, PHP must have the correct permissions
to read from and write to a file. This is usually straightforward in a
command-line context but can cause confusion when running scripts within a web
server. Your web server (and consequently your PHP scripts) probably runs as a
specific user dedicated to web serving (or perhaps as user nobody). For
good security reasons, this user often has restricted permissions on what files
it can access. If your script is having trouble with a file operation, make sure
the web server's user or group — not yours — has permission to perform that file
operation. Some web serving setups may run your script as you, though, in which
case you need to make sure that your scripts can't accidentally read or write
personal files that aren't part of your web site.
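When a file operation fails for permission reasons, it can help to ask PHP what the current process is allowed to do. This is a small sketch using is_readable( ) and is_writable( ); the helper name describe_access( ) is made up for illustration, and the system temp directory stands in for whatever path your script uses.

```php
<?php
// Hypothetical helper: describe what the current PHP process may do with $path.
// The answer reflects the user PHP runs as (often the web server's user),
// not the user who owns or edits the script.
function describe_access(string $path): string
{
    return sprintf(
        '%s: readable=%s writable=%s',
        $path,
        is_readable($path) ? 'yes' : 'no',
        is_writable($path) ? 'yes' : 'no'
    );
}

$report = describe_access(sys_get_temp_dir());
print $report . "\n";
```

Running the same check from the command line and from a web page is a quick way to see whether the two contexts have different permissions.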
Because most file-handling functions just return false
on error, you have to do some additional work to find more details about that
error. When the track_errors configuration directive is on,
each error message is put in the global variable $php_errormsg.
Including this variable as part of your error output makes debugging easier:
$fh = fopen('/tmp/cookie-data','w') or die("can't open: $php_errormsg");
if (false === fwrite($fh,$_COOKIE['flavor'])) { die("can't write: $php_errormsg"); }
fclose($fh) or die("can't close: $php_errormsg");
If you don't have permission to write to /tmp/cookie-data, the example dies with this error
output:
can't open: fopen("/tmp/cookie-data", "w") - Permission denied
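The track_errors directive and $php_errormsg were deprecated in PHP 7.2 and removed in PHP 8.0. On those versions, error_get_last( ) returns the same information as an array. The following is a minimal sketch under that assumption; the path is invented so that fopen( ) is certain to fail, and the exact wording of the message varies between PHP versions.

```php
<?php
// Hypothetical path inside a directory that does not exist, so fopen( ) fails.
$path = sys_get_temp_dir() . '/no-such-dir-' . uniqid() . '/cookie-data';

// @ keeps the warning off the page; the error is still recorded.
$fh = @fopen($path, 'w');
$msg = '';
if ($fh === false) {
    // error_get_last( ) returns an array with 'message', 'type', 'file', 'line'.
    $err = error_get_last();
    $msg = "can't open: " . $err['message'];
    print $msg . "\n";
}
```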
There are differences in how files are treated by Windows and by
Unix. To ensure your file access code works appropriately on Unix and Windows,
take care to handle line-delimiter characters and
pathnames correctly.
A line delimiter on Windows is two characters: ASCII 13 (carriage return) followed by ASCII 10 (linefeed or newline). On Unix, it's just ASCII 10. The
typewriter-era names for these characters explain why you can get
"stair-stepped" text when printing out a Unix-delimited file. Imagine these
character names as commands to the platen in a typewriter or character-at-a-time
printer. A carriage return sends the platen back to the beginning of the line
it's on, and a line feed advances the paper by one line. A misconfigured printer
encountering a Unix-delimited file dutifully follows instructions and does a
linefeed at the end of each line. This advances to the next line but doesn't
move the horizontal printing position back to the left margin. The next
stair-stepped line of text begins (horizontally) where the previous line left
off.
PHP functions that use a newline as a
line-ending delimiter (for example, fgets( )) work on both Windows and
Unix because a newline is the character at the end of the line on either
platform. To remove any line-delimiter characters, use rtrim( ):
$fh = fopen('/tmp/lines-of-data.txt','r') or die($php_errormsg);
while (false !== ($s = fgets($fh,1024))) {
    $s = rtrim($s);
    // do something with $s ...
}
fclose($fh) or die($php_errormsg);
The rtrim( ) function removes any trailing whitespace in the line,
including ASCII 13 and ASCII 10 (as well as tab and space). If there's
whitespace at the end of a line that you want to preserve, but you still want to
remove carriage returns and line feeds, use an appropriate regular expression:
$fh = fopen('/tmp/lines-of-data.txt','r') or die($php_errormsg);
while (false !== ($s = fgets($fh,1024))) {
    $s = preg_replace('/\r?\n$/','',$s);
    // do something with $s ...
}
fclose($fh) or die($php_errormsg);
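To see how the two approaches differ, here is a short sketch applied to one Windows-delimited line that ends with a tab worth keeping. The sample string is made up for illustration. It also shows a third option, passing a character list to rtrim( ), which strips only carriage returns and line feeds.

```php
<?php
// A Windows-delimited line whose trailing tab should survive.
$line = "vanilla\t\r\n";

$all  = rtrim($line);                         // strips the tab, CR, and LF
$eol  = rtrim($line, "\r\n");                 // strips only CR and LF characters
$once = preg_replace('/\r?\n$/', '', $line);  // strips a single line ending

var_dump($all === "vanilla");
var_dump($eol === "vanilla\t");
var_dump($once === "vanilla\t");
```

One difference to keep in mind: rtrim( ) with a character list removes every trailing carriage return and line feed, while the regular expression removes only the final line ending.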
Unix and Windows also differ on the character used to
separate directories in pathnames. Unix uses a slash
(/), and Windows uses a backslash (\).
PHP makes sorting this out easy, however, because the Windows version of PHP
also understands / as a directory separator. For example, this code
successfully prints the contents of C:\Alligator\Crocodile Menu.txt:
$fh = fopen('c:/alligator/crocodile menu.txt','r') or die($php_errormsg);
while (false !== ($s = fgets($fh,1024))) {
    print $s;
}
fclose($fh) or die($php_errormsg);
This piece of code also takes advantage of the fact that
Windows filenames aren't case-sensitive. However, Unix filenames are.
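If a pathname arrives with backslashes (from a configuration file, for example), one way to keep the rest of your code platform-neutral is to convert it to forward slashes before use. This is only a sketch; the helper name to_slashes( ) is an assumption for illustration.

```php
<?php
// Hypothetical helper: rewrite Windows-style separators as forward slashes,
// which PHP accepts on both Windows and Unix.
function to_slashes(string $path): string
{
    return str_replace('\\', '/', $path);
}

// Single quotes keep the backslashes literal in this string.
$path = to_slashes('C:\Alligator\Crocodile Menu.txt');
print $path . "\n";
```

The conversion changes only the separators. The case of each name is left alone, which still matters on Unix.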
Linebreak confusion is a problem not only in code that reads and writes files
but in your source code as well. If multiple people work on a project, make
sure all developers configure their editors to use the same kind of
linebreaks.
Once you've
opened a file, PHP gives you many tools to process its data. In keeping with
PHP's C-like I/O interface, the two basic functions to read data from a file are
fread( ), which reads a
specified number of bytes, and fgets( ), which reads a line at a time
(up to a specified number of bytes). Given a length argument, fgets( ) reads
at most one byte fewer than that length, so this code handles lines up to 255
bytes long, including the newline:
$fh = fopen('orders.txt','r') or die($php_errormsg);
while (false !== ($s = fgets($fh,256))) {
    process_order($s);
}
fclose($fh) or die($php_errormsg);
If orders.txt has a 300-byte
line, fgets( ) returns only the first 255 bytes. The next fgets( )
returns the remaining 45 bytes and stops when it finds the newline. The next
fgets( ) moves to the next line of the file. Examples in this chapter
generally give fgets( ) a second argument of 1048576: 1 MB. This is
longer than lines in most text files, but the presence of such an outlandish
number should serve as a reminder to consider your maximum expected line length
when using fgets().
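The effect of the length argument is easy to check with an in-memory stream. In this sketch, the stream contents are made up for illustration: a 300-byte line (299 characters plus a newline) followed by a short line. Note that fgets( ) reads at most one byte fewer than the length it is given.

```php
<?php
// Build a 300-byte line followed by a 5-byte line in a temporary memory stream.
$fh = fopen('php://memory', 'w+');
fwrite($fh, str_repeat('x', 299) . "\n" . "next\n");
rewind($fh);

// Record how many bytes each fgets( ) call returns.
$lengths = [];
while (false !== ($s = fgets($fh, 256))) {
    $lengths[] = strlen($s);
}
fclose($fh);

print implode(',', $lengths) . "\n";   // prints 255,45,5
```

The long line comes back in two pieces, so code that treats each fgets( ) result as a complete line needs a length larger than the longest line it expects.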
Many operations on file contents, such as picking a line at
random (see Section 18.11), are conceptually simpler (and require less code)
if the entire file is read into a string or array. Section 18.6 provides a
method for reading a file into a string, and the file( ) function puts each
line of a file into an array.
The tradeoff for simplicity, however, is memory consumption. This can be
especially harmful when you are using PHP as a server module. Generally, when a
process (such as a web server process with PHP embedded in it) allocates memory
(as PHP does to read an entire file into a string or array), it can't return
that memory to the operating system until it dies. This means that calling
file( ) on a 1 MB file from PHP running as an Apache module increases
the size of that Apache process by 1 MB until the process dies. Repeated a few
times, this decreases server efficiency. There are certainly good reasons for
processing an entire file at once, but be conscious of the memory-use
implications when you do.
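The two styles can be compared side by side. This sketch finds the length of the longest line in a file, first with file( ) and then one line at a time with fgets( ). The temporary file and its contents are assumptions made so the example is self-contained.

```php
<?php
// Build a small throwaway file to read.
$path = tempnam(sys_get_temp_dir(), 'orders');
file_put_contents($path, "apple\nbanana\ncherry\n");

// Whole file at once: less code, but every line is in memory together.
$lines = file($path, FILE_IGNORE_NEW_LINES);
$longest_all = max(array_map('strlen', $lines));

// One line at a time: only the current line is held in memory.
$longest_stream = 0;
$fh = fopen($path, 'r') or die("can't open $path");
while (false !== ($s = fgets($fh))) {
    $longest_stream = max($longest_stream, strlen(rtrim($s, "\r\n")));
}
fclose($fh);
unlink($path);

print "$longest_all $longest_stream\n";   // prints 6 6
```

Both give the same answer. The second version keeps memory use roughly constant no matter how large the file grows.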
Section 18.21 through Section 18.24 deal with running other programs from
within a PHP program. Some program-execution operators or functions offer ways
to run a program and read its output all at once (backticks) or read its last
line of output (system( )). PHP can use pipes to run a program, pass it input,
or read its output.
Because a pipe is read with standard I/O functions (fgets( ) and
fread( )), you decide how you want the input and you can do other tasks
between reading chunks of input. Similarly, writing to a pipe is done with
fputs( ) and fwrite( ), so you can pass input to a program in
arbitrary increments.
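As a rough illustration of reading from a pipe, this sketch opens one with popen( ) and reads the program's output a line at a time with fgets( ). It assumes the shell provides an echo command; the command itself is just a stand-in for whatever program you need to run.

```php
<?php
// Assumption: the shell has an "echo" command (true on Unix and Windows).
$ph = popen('echo hello', 'r') or die("can't open pipe");

// Read the program's output line by line, like any other file handle.
$lines = [];
while (false !== ($s = fgets($ph))) {
    $lines[] = rtrim($s);   // drop the trailing line delimiter
    // other work could happen here between reads
}

// pclose( ) closes the pipe and returns the program's termination status.
$status = pclose($ph);
print_r($lines);
```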
Pipes have the same permission issues as regular files. The PHP
process must have execute permission on the program being opened as a pipe. If
you have trouble opening a pipe, especially if PHP is running as a special web
server user, make sure the user is allowed to execute the program you are
opening a pipe to.