Understanding File Handling Functions in C

Discussion in 'C' started by poornaMoksha, Nov 29, 2011.

  1. A Linux system programmer frequently needs to work with files kept on disk: opening a file, reading it (one character at a time, one line at a time, or the whole file at once), seeking within it, and finally closing it. The C library provides functions for these operations, such as:

    fopen() // open a file
    fclose() // Close a file
    fgets() // Read a '\n' terminated line from file
    fread() // Read a chunk of bytes from file
    fseek() // Seek inside a file
    ftell() // Tell the current file pointer position in file
    etc...

    There is also another category of POSIX-compliant functions, such as:

    open()
    close()
    read()
    lseek()
    etc...

    The differences between these two categories are:
    1. Functions like fopen(), fclose() etc. are C standard functions, while open(), close() etc. are specified by POSIX and are not part of standard C. This means that code written with open(), close() etc. is not standard C and is portable only to POSIX-like systems, whereas code written with fopen(), fclose() etc. is standard C and can be ported to any system with a conforming C implementation.
    2. open(), close() etc. are usually found on *nix systems, while fopen(), fclose() etc. are available on any system that supports C/C++.
    3. fopen(), fclose() etc. are library calls, while open(), close() etc. are system calls.
    4. Being closer to the system, functions like open(), close() etc. provide finer control than their library counterparts.
    I'd like to split this article into two parts and explain the library functions used for file handling (like fopen(), fclose() etc.).
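
As a rough illustration of the two categories compared above, here is a minimal sketch that counts the bytes in a file twice: once with the standard C library calls and once with the POSIX calls. The helper names count_bytes_stdio() and count_bytes_posix() are made up for this example, and the second one assumes a POSIX system. Both return -1 if the file cannot be opened or read.

```c
#include <stdio.h>   /* fopen, fread, ferror, fclose: standard C */
#include <fcntl.h>   /* open, O_RDONLY: POSIX */
#include <unistd.h>  /* read, close, ssize_t: POSIX */

/* Hypothetical helper: count the bytes in a file using the C library. */
long count_bytes_stdio(const char *path)
{
    char chunk[256];
    long total = 0;
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    /* fread() returns the number of items read; 0 means EOF or error. */
    while ((n = fread(chunk, 1, sizeof(chunk), fp)) > 0)
        total += (long)n;

    /* Distinguish a read error from a normal end of file. */
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }

    fclose(fp);
    return total;
}

/* Hypothetical helper: the same job using POSIX calls (not standard C). */
long count_bytes_posix(const char *path)
{
    char chunk[256];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);  /* an int file descriptor, not a FILE* */

    if (fd == -1)
        return -1;

    /* read() returns the bytes read, 0 at end of file, -1 on error. */
    while ((n = read(fd, chunk, sizeof(chunk))) > 0)
        total += (long)n;

    close(fd);
    return (n < 0) ? -1 : total;
}
```

Calling either helper with the same path should give the same count. The visible difference is the handle type: fopen() hands back a FILE* stream, while open() hands back a plain integer descriptor.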

    Explanation



    Let's take a single program and explain the following functions through it:
    1. fopen()
    2. fgets()
    3. ftell()
    4. fclose()
    Here is the code :

    Code:
    #include<stdio.h> 
    #include<stdlib.h> 
    #include<string.h> 
    #include<unistd.h> 
     
    int main(int argc, char* argv[]) 
    { 
        char buff[1024]; 
     
        char *ptr = NULL; 
        FILE *fp  = NULL; 
     
        // Fill the buffer with null bytes ('\0') 
        memset(buff,'\0', sizeof(buff)); 
     
        // Open the file with fopen() 
        // The first argument is the path to the file. It can be 
        // absolute or relative; a relative path is resolved against 
        // the current working directory of the process. 
     
        // The second argument is the mode in which we want to 
        // open the file. The mode can be read(r), write(w),  
        // read-write(r+) etc. For more information, read the 
        // man page of fopen() 
     
        // On success, this function returns a FILE* (a pointer to a 
        // stream). This is not the integer file descriptor that 
        // open() returns, but it plays a similar role: the other 
        // stdio functions take it to identify the open file. 
     
        //Here we are trying to open a file ps.txt in read only mode. 
        fp = fopen("ps.txt", "r"); 
     
        // If any error occurs, fopen() returns NULL and errno is set 
     
        // Check the error and return if some error did occur. 
        if(NULL == fp) 
        { 
            printf("\n File open failed\n"); 
            return -1; 
        } 
     
        // The function ftell() gives the current file pointer position in the file. 
        // This file pointer position is a numeric value and should be zero at the  
        // beginning of the file. As we start reading the file through functions like  
        // fgets(), fread() etc then the file pointer position increments by the number 
        // of bytes read. ftell() returns a long, or -1L on error. 
         
        // This function takes the FILE* stream (returned by fopen()) as a parameter. 
     
        // Here we are trying to print the value returned by ftell(). 
        printf("\n TO begin with, the file pointer position is at : %ld\n", ftell(fp)); 
     
     
        // From the man page : 
         
        // fgets() reads in at most one less than size characters from stream and stores  
        // them into the buffer pointed to by s.  Reading stops after an EOF or a newline. 
        // If a newline is read, it is stored into the buffer.  A '\0' is stored after the  
        // last character in the buffer. 
     
        // Here we are trying to read the file ps.txt line by line through its stream fp 
        // The line read is stored in the buffer 'buff' 
        while(NULL != fgets(buff, sizeof(buff), fp)) 
        { 
            printf("\n\n\n The line fetched through fgets is : \n [%s]\n",buff); 
     
            // Search for character '[' in this line 
            ptr = strchr(buff, '['); 
     
            // If no such character found 
            if(NULL == ptr) 
            { 
                //Search for '/'  
                ptr = strchr(buff, '/'); 
            } 
            if(NULL != ptr) 
            { 
               // If any of the two characters were found 
               printf("INFO::Either of the two (/ or [) found\n"); 
            } 
            else 
            { 
                // if none of the two found 
                printf("INFO::Neither / nor [ found in this line\n"); 
            } 
     
            // Wait for one second 
            sleep(1); 
     
            // Reset the buffer with null bytes 
            memset(buff,'\0', sizeof(buff)); 
             
            // Print the current file pointer position before fetching the next line 
            printf("INFO::Now, the file pointer position is at : %ld\n", ftell(fp)); 
        } 
     
        // fclose() flushes and closes the stream returned by fopen(). 
        // After a successful call to this function, the 
        // stream pointer fp is no longer valid.  
     
        // It is not a fatal error if we do not close the file after we are done  
        // with the file operations, because on normal program exit the C library 
        // closes all open streams and the kernel releases the underlying 
        // file descriptors. But the number of files a process (and the 
        // system as a whole) can have open is limited, so a long-running 
        // program that never closes its files can run out. So it's good to  
        // always close a stream when we are done with it. 
     
        // Here we close this stream. 
        fclose(fp); 
     
        return 0; 
    }
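
The program above prints only a generic message when fopen() fails, although its comments mention that errno is set. Here is a minimal sketch of turning errno into a readable reason with strerror(). The helper name open_or_explain() is made up for this example. Note the assumption: POSIX requires fopen() to set errno on failure, but standard C alone does not promise it.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Hypothetical helper: try to open 'path' for reading.
 * On failure, write a human-readable reason into 'msg' and return NULL.
 * On success, 'msg' is set to an empty string. */
FILE *open_or_explain(const char *path, char *msg, size_t msg_size)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        /* Save errno at once; later library calls may overwrite it. */
        int saved = errno;
        snprintf(msg, msg_size, "cannot open %s: %s", path, strerror(saved));
    } else if (msg_size > 0) {
        msg[0] = '\0';
    }

    return fp;
}
```

A caller would check the return value for NULL and print msg in that case. The exact wording of the reason comes from the system and can differ between platforms.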

    The Output


    Since the bulk of the explanation is in the code comments, let's look at the output here.

    Before that, let's see the contents of the text file ps.txt that we opened in this code:

    Code:
    UID        PID  PPID  C STIME TTY          TIME CMD 
    root         1     0  0 05:29 ?        00:00:00 /sbin/init 
    root         2     0  0 05:29 ?        00:00:00 [kthreadd] 
    root         3     2  0 05:29 ?        00:00:00 [migration/0] 
    root         4     2  0 05:29 ?        00:00:00 [ksoftirqd/0] 
    root         5     2  0 05:29 ?        00:00:00 [watchdog/0] 
    root         6     2  0 05:29 ?        00:00:00 [migration/1] 
    root         7     2  0 05:29 ?        00:00:00 [ksoftirqd/1] 
    root         8     2  0 05:29 ?        00:00:00 [watchdog/1] 
    root         9     2  0 05:29 ?        00:00:00 [events/0] 
    root        10     2  0 05:29 ?        00:00:00 [events/1] 
    root        11     2  0 05:29 ?        00:00:00 [cpuset] 
    root        12     2  0 05:29 ?        00:00:00 [khelper] 
    root        13     2  0 05:29 ?        00:00:00 [netns] 
    root        14     2  0 05:29 ?        00:00:00 [async/mgr] 
    root        15     2  0 05:29 ?        00:00:00 [pm] 
    root        17     2  0 05:29 ?        00:00:00 [sync_supers] 
    root        18     2  0 05:29 ?        00:00:00 [bdi-default] 
    root        19     2  0 05:29 ?        00:00:00 [kintegrityd/0] 
    root        20     2  0 05:29 ?        00:00:00 [kintegrityd/1] 
    root        21     2  0 05:29 ?        00:00:00 [kblockd/0] 
    root        22     2  0 05:29 ?        00:00:00 [kblockd/1] 
    root        23     2  0 05:29 ?        00:00:00 [kacpid] 
    root        24     2  0 05:29 ?        00:00:00 [kacpi_notify] 
    root        25     2  0 05:29 ?        00:00:00 [kacpi_hotplug] 
    root        26     2  0 05:29 ?        00:00:00 [ata/0] 
    root        27     2  0 05:29 ?        00:00:00 [ata/1] 
    root        28     2  0 05:29 ?        00:00:00 [ata_aux] 
    root        29     2  0 05:29 ?        00:00:00 [ksuspend_usbd] 
    root        30     2  0 05:29 ?        00:00:00 [khubd] 
    root        31     2  0 05:29 ?        00:00:00 [kseriod] 
    root        32     2  0 05:29 ?        00:00:00 [kmmcd] 
    root        35     2  0 05:29 ?        00:00:00 [khungtaskd] 
    root        36     2  0 05:29 ?        00:00:00 [kswapd0] 
    root        37     2  0 05:29 ?        00:00:00 [ksmd] 
    root        38     2  0 05:29 ?        00:00:00 [aio/0] 
    root        39     2  0 05:29 ?        00:00:00 [aio/1] 
    root        40     2  0 05:29 ?        00:00:00 [ecryptfs-kthrea] 
    root        41     2  0 05:29 ?        00:00:00 [crypto/0] 
    root        42     2  0 05:29 ?        00:00:00 [crypto/1] 
    root        55     2  0 05:29 ?        00:00:00 [scsi_eh_0] 
    root        56     2  0 05:29 ?        00:00:00 [scsi_eh_1] 
    root        59     2  0 05:29 ?        00:00:00 [scsi_eh_2] 
    root        60     2  0 05:29 ?        00:00:00 [scsi_eh_3] 
    root        62     2  0 05:29 ?        00:00:00 [kstriped] 
    root        63     2  0 05:29 ?        00:00:00 [kmpathd/0] 
    root        64     2  0 05:29 ?        00:00:00 [kmpathd/1] 
    ... 
    ... 
    ... 
    
    Note that your ps.txt could contain anything. No need to copy paste from here.

    Here is the output:

    Code:
     ~/practice $ ./file 
     
     TO begin with, the file pointer position is at : 0 
     
     
     
     The line fetched through fgets is :  
     [UID        PID  PPID  C STIME TTY          TIME CMD 
    ] 
    INFO::Neither / nor [ found in this line 
    INFO::Now, the file pointer position is at : 52 
     
     
     
     The line fetched through fgets is :  
     [root         1     0  0 05:29 ?        00:00:00 /sbin/init 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 111 
     
     
     
     The line fetched through fgets is :  
     [root         2     0  0 05:29 ?        00:00:00 [kthreadd] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 170 
     
     
     
     The line fetched through fgets is :  
     [root         3     2  0 05:29 ?        00:00:00 [migration/0] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 232 
     
     
     
     The line fetched through fgets is :  
     [root         4     2  0 05:29 ?        00:00:00 [ksoftirqd/0] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 294 
     
     
     
     The line fetched through fgets is :  
     [root         5     2  0 05:29 ?        00:00:00 [watchdog/0] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 355 
     
     
     
     The line fetched through fgets is :  
     [root         6     2  0 05:29 ?        00:00:00 [migration/1] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 417 
     
     
     
     The line fetched through fgets is :  
     [root         7     2  0 05:29 ?        00:00:00 [ksoftirqd/1] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 479 
     
     
     
     The line fetched through fgets is :  
     [root         8     2  0 05:29 ?        00:00:00 [watchdog/1] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 540 
     
     
     
     The line fetched through fgets is :  
     [root         9     2  0 05:29 ?        00:00:00 [events/0] 
    ] 
    INFO::Either of the two (/ or [) found 
    INFO::Now, the file pointer position is at : 599

    Now that we have seen the output, the behaviour of these functions is easy to follow. The first position that ftell() returns is 0. This is expected, as we have not yet read anything from the file. Each line fetched by fgets() includes its trailing newline character, which is why the closing ']' is printed on the following line in this output. As we read line after line, the value returned by ftell() increases, and finally we close the file using fclose().
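
Since fgets() keeps the newline, a common follow-up step is to strip it before using the line. Here is a minimal sketch of that. The helper name read_trimmed_line() is made up for this example. It returns the number of characters fgets() stored (newline included), which on a POSIX system should match how far ftell() advanced.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line with fgets() and strip the
 * trailing '\n' if there is one.
 * Returns the number of characters fgets() stored (newline included),
 * or -1 at end of file or on a read error. */
long read_trimmed_line(FILE *fp, char *buf, int size)
{
    size_t len;

    if (fgets(buf, size, fp) == NULL)
        return -1;

    len = strlen(buf);

    /* The newline is missing when the line did not fit in the buffer
     * or when the last line of the file has no '\n'. */
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';

    return (long)len;
}
```

Comparing ftell(fp) before and after a call is one way to confirm that the file position moved by exactly the returned count.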


    An exercise for everyone



    For those who were new to file handling before reading this article, try this exercise, see what happens, and then try to analyze why it happened:
    1. Remove ps.txt from your current directory and then execute this code.
    2. Use "w" instead of "r" in the call to fopen() and then execute this code.
    3. Remove the call to fclose() and see if you get any warning or error.
    For any doubts or queries, leave a comment here.

    Conclusion



    To conclude, in this article we studied the difference between the two categories of file handling functions and then discussed fopen(), ftell(), fgets() and fclose() through a practical example. I will take up the rest of the functions in part II.

    Stay tuned for more!!!
     
