Programming with Temporary files in Linux

Discussion in 'C' started by poornaMoksha, Dec 30, 2011.

    If you are a developer, you are probably familiar with temporary files. As the name suggests, a temporary file is short-lived: a process creates one either to hold data for a while or to pass information to another process. A well-behaved process deletes its temporary files as soon as it is done with them.

    An example



    Suppose a developer who is unaware of the standard ways to create and use a temporary file is asked to write code that keeps data in a temporary file for some time and then deletes the file. What would he or she come up with?

    Let's look at a typical attempt:

    Code:
    #include<stdio.h>
    
    int main(void)
    {
        unsigned int i = 0;
    
        // Data to be written
        char str[] = "hello_world";
    
        FILE *fd = NULL;
    
        // Create and open a temporary file
        // named tempfile.txt
        fd = fopen("tempfile.txt","w+");
        if(NULL == fd)
        {
            // Error out if file could not be
            // opened
            printf("\n File open failed\n");
            return -1;
        }
        printf("\n Data to be written is [%s] \n",str);
    
        // Write the data to temporary file
        fwrite((void*)str,sizeof(str),1,fd);
    
        printf("\n Doing some time Consuming stuff.. \n");
    
        // Do some time Consuming stuff
        for(i=0; i<(0XFFFFFFFF)/2; i++);
    
        // Read back the data from file. After the write
        // the file position is at the end, so rewind first
        rewind(fd);
        fread(str,sizeof(str),1,fd);
    
        printf("\n Data read [%s] \n",str);
    
        return 0;
    }
    In the above code :
    • A file named 'tempfile.txt' is opened, or created if it does not exist, as a temporary file.
    • Data is written to it.
    • Some other processing work takes place (In our case we simulated this through a for loop)
    • Now the data is read back and displayed.
    The output of the above program is :

    Code:
    $ ./temp
    
     Data to be written is [hello_world] 
    
     Doing some time Consuming stuff.. 
    
     Data read [hello_world]
    The output matches expectations.

    But let's look beyond the obvious. If working with temporary files were this easy, I would not be writing an article on it :).
    Can you see the flaws in this program? If not, here they are:
    1. The first shortcoming is the name of the temporary file. The name is hard-coded, so every instance of the executable uses it. If more than one instance runs in the same directory, they all read and write the same file and corrupt each other's data.
    2. Suppose an attacker learns that our code creates a temporary file named 'tempfile.txt'. The attacker can create a file or a symbolic link with that name in advance, and the program will then read or overwrite whatever it finds there. That is a serious security hole.
    3. As in this code, the developer can forget to close or delete the file before the program exits. The temporary file then stays on the file system along with any sensitive information in it.
    So now you can see why working with temporary files is not so easy.

    Pitfalls to be kept in mind



    When using temporary files, be aware of the major pitfalls that an attacker can exploit to disrupt your process.

    Here are three major pitfalls:
    1. The name of the temporary file must be different each time the program runs, because more than one instance of your program may run simultaneously.
    2. The permissions of the temporary file should be set so that users who are not authorized to access it cannot do so.
    3. The name of the temporary file should not be easy to guess. This adds an extra layer of protection against attackers.
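
    The first two pitfalls can be illustrated with plain open(). The following is only a sketch: the helper names create_private_file() and second_create_is_refused() are made up for this illustration, and a fixed name is used on purpose so that the collision becomes visible. With O_CREAT | O_EXCL the call fails if the name already exists, and mode 0600 keeps other users out.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: create 'path' only if it does not exist yet,
// readable and writable by the owner alone (mode 0600).
// Returns a descriptor, or -1 with errno set (EEXIST if the name is taken).
int create_private_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}

// Hypothetical helper: returns 1 if a second attempt to create
// the same name is refused with EEXIST, 0 otherwise.
int second_create_is_refused(const char *path)
{
    int first, second, refused;

    unlink(path);                        // start from a clean state
    first = create_private_file(path);
    if (first == -1)
        return 0;

    second = create_private_file(path);  // same name again
    refused = (second == -1 && errno == EEXIST);
    if (second != -1)
        close(second);

    close(first);
    unlink(path);                        // clean up after the demonstration
    return refused;
}
```

    Calling second_create_is_refused() with a path in a writable directory should return 1. This covers exclusive creation and permissions only; it does nothing about the third pitfall, a guessable name.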

    Standard functions used for Temporary files



    As we have seen, nonstandard code for handling temporary files can have several loopholes. Linux provides a number of standard APIs for this task; in this article we discuss two major ones.

    A) mkstemp()

    The signature of this function is :

    Code:
    #include <stdlib.h>
    
    int mkstemp(char *template);
    The mkstemp() function generates a unique temporary filename from template, creates and opens the file, and returns an open file descriptor for
    the file.

    The last six characters of template must be "XXXXXX" and these are replaced with a string that makes the filename unique. Since it will be modified, template must not be a string constant, but should be declared as a character array.

    The file is created with permissions 0600, that is, read plus write for owner only. (In glibc versions 2.06 and earlier, the file is created with
    permissions 0666, that is, read and write for all users.) The returned file descriptor provides both read and write access to the file. The file
    is opened with the open(2) O_EXCL flag, guaranteeing that the caller is the process that creates the file.
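
    A minimal sketch of these properties follows. The helper names make_temp() and permission_bits() and the template "/tmp/demo.XXXXXX" are assumptions made for this illustration. Two calls with the same template should give two different names, and fstat() should report permissions 0600 on a current glibc.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: create a temporary file from a template,
// copy the generated name into 'name_out' and return the descriptor.
// Returns -1 on failure.
int make_temp(char *name_out, size_t name_size)
{
    char name[] = "/tmp/demo.XXXXXX";    // writable array, not a string constant
    int fd = mkstemp(name);              // replaces XXXXXX and opens the file

    if (fd == -1)
        return -1;
    if (strlen(name) + 1 > name_size) {  // caller's buffer is too small
        close(fd);
        unlink(name);
        return -1;
    }
    strcpy(name_out, name);
    return fd;
}

// Hypothetical helper: permission bits of an open file, or -1 on error.
int permission_bits(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == -1)
        return -1;
    return (int)(st.st_mode & 0777);
}
```

    The caller owns both the descriptor and the name, so it has to call close() and unlink() itself when it is done.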

    B) tmpfile()

    The signature of this function is :

    Code:
     #include <stdio.h>
    
    FILE *tmpfile(void);
    The tmpfile() function opens a unique temporary file in binary read/write (w+b) mode. The file will be automatically deleted when it is closed or
    the program terminates.
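
    As a rough sketch of how tmpfile() can be used, the hypothetical helper below (the name roundtrip_through_tmpfile() is made up for this illustration) writes a string to an anonymous temporary file and reads it back. There is no file name to choose and nothing to delete by hand.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: write 'text' to an anonymous temporary file,
// read it back into 'out' (always NUL-terminated) and return the
// number of bytes read, or -1 on error.
long roundtrip_through_tmpfile(const char *text, char *out, size_t out_size)
{
    size_t len = strlen(text);
    size_t got;
    FILE *fp;

    if (out_size == 0)
        return -1;

    fp = tmpfile();                      // opened in w+b mode, no name to manage
    if (fp == NULL)
        return -1;

    if (fwrite(text, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }

    rewind(fp);                          // reposition before switching to reading
    got = fread(out, 1, out_size - 1, fp);
    out[got] = '\0';

    fclose(fp);                          // the file is deleted at this point
    return (long)got;
}
```

    Passing "hello_world" and a 32-byte buffer should return 11 and leave the same string in the buffer.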

    mkstemp() does not delete the temporary file it creates, while tmpfile() does so automatically. So tmpfile() suits the case where a single process needs to keep data in a temporary file for some time and then discard it. mkstemp() is the better choice when the file name is needed, for example when another process also has to access the file; the creator must then delete the file itself.
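
    The second case can be sketched roughly as follows. The helper name share_with_child() and the template "/tmp/shared.XXXXXX" are assumptions for this illustration, and fork() stands in for a separate program that is told only the file name. The parent writes the data, the child opens the file by name and checks it, and the parent removes the file afterwards.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the parent writes 'msg' (shorter than 256 bytes)
// to a mkstemp() file, a child process opens the file by name and checks
// its content. Returns 0 if the child read back exactly 'msg', else -1.
int share_with_child(const char *msg)
{
    char name[] = "/tmp/shared.XXXXXX";
    size_t len = strlen(msg);
    int status = 0;
    pid_t pid;
    int fd = mkstemp(name);

    if (fd == -1)
        return -1;
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        unlink(name);
        return -1;
    }
    close(fd);                           // the name keeps the file alive

    pid = fork();
    if (pid == 0) {
        // Child: knows only the file name, as a separate program would
        char buf[256] = "";
        size_t got = 0;
        FILE *fp = fopen(name, "rb");

        if (fp != NULL) {
            got = fread(buf, 1, sizeof(buf) - 1, fp);
            fclose(fp);
        }
        _exit(got == len && memcmp(buf, msg, len) == 0 ? 0 : 1);
    }

    if (pid > 0)
        waitpid(pid, &status, 0);        // wait until the child has read the file

    unlink(name);                        // mkstemp() never deletes; the creator must
    if (pid < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

    Note that the file is unlinked only after the child has finished. Unlinking it earlier, as the single-process example does, would leave the other process with no name to open.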

    Example code



    Since the two functions perform more or less the same task, we will use mkstemp() to show a working example of handling temporary files.

    Here is the code :

    Code:
    #include <stdlib.h>
    #include <unistd.h>
    #include <stdio.h>
    
    // This function creates a temporary file
    // through the function mkstemp() and writes
    // the required data to it.
    
    // The value of 'temp_filename' follows the rule that
    // mkstemp() sets for its template: the trailing XXXXXX
    // is replaced by something unique, so the temporary
    // file name is always unique.
    int write_tmp (char* buffer, size_t length)
    {
      // Create a template name
      char temp_filename[] = "/tmp/temp_file.XXXXXX";
    
      // Call the mkstemp function with the template
      int fd = mkstemp (temp_filename);
      if (fd == -1)
      {
        perror ("mkstemp");
        return -1;
      }
    
      // Remove the name from the file system right away.
      // The file itself stays alive while the descriptor
      // is open and is deleted when the descriptor is
      // closed or the program quits.
      unlink (temp_filename);
    
      // Write the length in the beginning of file,
      // followed by the data
      if (write (fd, &length, sizeof (length)) != (ssize_t) sizeof (length)
          || write (fd, buffer, length) != (ssize_t) length)
      {
        perror ("write");
        close (fd);
        return -1;
      }
    
      // return the descriptor
      return fd;
    }
    
    
    // This function is used to read in the data that was
    // written to the temporary file.
    
    // It uses the file descriptor returned by mkstemp() function
    // to read the data from the file. When the reading is done,
    // the file is closed by issuing a close() call on the file
    // descriptor which removes the only reference from the file
    // and as we have already called unlink() while writing the data
    // so the file gets automatically deleted.
    // Returns NULL on failure. The caller frees the buffer.
    char* read_tmp (int temp_file, size_t* length)
    {
      char* buffer = NULL;
    
      // The file descriptor obtained while
      // writing the data to file
      int fd = temp_file;
    
      // Seek to beginning and read the length
      if (lseek (fd, 0, SEEK_SET) == -1
          || read (fd, length, sizeof (*length)) != (ssize_t) sizeof (*length))
      {
        close (fd);
        return NULL;
      }
    
      // Allocate the memory to hold the data
      buffer = malloc (*length);
    
      // Read the data from file into the buffer
      if (buffer != NULL
          && read (fd, buffer, *length) != (ssize_t) *length)
      {
        free (buffer);
        buffer = NULL;
      }
    
      // Close the file. This removes the only reference from file
      close (fd);
    
      // return the data to the calling function
      return buffer;
    }
    
    int main(void)
    {
        // Actual data to be read/written
        char buff[] = "HELLO-WORLD";
        // Length of the data
        size_t length = sizeof(buff);
        // Call the write function to write the data
        // into the temporary file
        int fd = write_tmp(buff, sizeof(buff));
        if (fd == -1)
            return 1;
        // Read the data from temporary file
        char *ptr = read_tmp(fd, &length);
        if (ptr == NULL)
            return 1;
        // Print the retrieved data to user
        printf("\n The data retrieved is [%s]\n", ptr);
        // Release the buffer allocated by read_tmp()
        free(ptr);
    
        return 0;
    }
    Most of the explanation is already in the code comments, so let's move on to the output.
    Code:
    $ ./temp
    
     The data retrieved is [HELLO-WORLD]
    So we were able to create a temporary file with mkstemp() without much effort. The function takes care of the unique name, the exclusive creation and the restrictive permissions, so the developer only has to remember to delete the file.

    Conclusion



    To conclude, in this article we looked at what temporary files are, why we need them, what can go wrong with a naive approach to handling them, which standard APIs are available for them, and how to use those APIs.

    Stay tuned for more!!!
     
