1. We have moved from vBulletin to XenForo and you are viewing the site in the middle of the move. Though the functional aspect of everything is working fine, we are still working on other changes including the new design on Xenforo.
    Dismiss Notice

Huffman Encoding in C (Minimum Variance Encoding)

Discussion in 'C' started by rai_gandalf, Nov 30, 2006.

?

How did you find this program?

  1. Perfect!

    30.3%
  2. Good!

    60.6%
  3. Average - Its just about Functional

    6.1%
  4. Bad - Rotten Tomatoes!

    3.0%
  1. rai_gandalf

    rai_gandalf New Member

    Introduction



    After a long hiatus, I am back. Months without programming and then finally I got some time on my hands. So I set out to code the Huffman's Data Compression Algorithm. And the result is here! :D

    The code is well-commented and I hav given some additional documentation. Also, I hav tested it extensively - right from small words to complete Metallica songs :D

    It executes for all, but I dont know if it gives the OPTIMUM huffman code. I tried out manually for a few small words and it matches, but the large ones - man is not gifted with that much patience I am afraid. :p

    But if there are any probs, do lemme know.

    The Code


    Code:
    /* HUFFMAN ENCODING Implementation in C */
    
    /*
      Implemented By :
      Rajiv A Iyer
      TE Comps, SIES GST, Nerul
      contact :  rai_maximus@yahoo.co.in
    */
    
    /* SOURCE CODE */
    #include<stdio.h>
    #include<stdlib.h>
    #include<conio.h>
    #include<string.h>
    
    #define NULL 0
    #define MAX_NO_SYMB 64
    #define MAX_BITS_PER_SYMB 100
    
    /* Symbolic Constants */
    const int TRUE=1,FALSE=0,RESET=-1;
    const int NOT_A_SON=0,LEFT_SON=1,RIGHT_SON=2;
    
    /* Global Variables */
    char *input,*output,code[MAX_NO_SYMB+1][MAX_BITS_PER_SYMB];
    int n,frequency[MAX_NO_SYMB+1];
    
    
    /* Structure Specifications */
    /* Binary Tree Node Specification */
    /* 'bintree_node' is a Binary Tree, that holds a Character, its Serial No &
        its Relative/Accumulative Frequency as its Data Values */
    struct bintree_node
    {
      union data
      {
        char ch;
        char *str;
      }d;
      int ch_srno;
      int ch_freq;
      int son_type;
      struct bintree_node *left,*right,*father;
    };
    
    /* Linked List Node Specification */
    struct linked_list_node
    {
      struct bintree_node *btroot;
      struct linked_list_node *next;
    };
    
    /* 'position' is an array of 'MAX_NO_SYMB' elements, that contains a pointer to
       the node of the Binary Tree representing the 'i'th Symbol */
    struct bintree_node *position[MAX_NO_SYMB+1];
    
    /* 'rootnodes' is an External Pointer to an Ascending Ordered Linked List,
         which contains pointers to roots of partially built Binary Trees */
    struct linked_list_node *rootnodes;
    
    
    
    /* Function Declarations */
    /* I/O Functions */
    void read(void);
    void disp_string(char[]);
    void disp_all_freq(void);
    void disp_code(void);
    
    /* Mapping or Hashing Routines & Frequency Initializations */
    int alpha_mapping(char);
    char alpha_reconv(int);
    void compute_frequencies(void);
    
    /* Binary Tree Related Routines */
    struct bintree_node* makenode(int,int,char[]);
    
    /* Linked List related Routines */
    void ll_insert(struct bintree_node*);
    struct bintree_node* ll_min_delete(void);
    int verify_multi_entries(void);
    int udef_strcmp(char[],char[]);
    
    /* Groundwork Routines Reqd for the MAIN "huffman_encode()" Routine */
    char* compute_composite_name(struct bintree_node*,struct bintree_node*);
    
    /* Main Logical Routine for Huffman's Encoding Algorithm */
    void huffman_encode(void);
    
    /* Temporary De-Bugging Functions */
    void disp_ll(void);
    
    
    // Main Program
    void main()
    {
      clrscr();
      read();
      disp_string(input);
    
      compute_frequencies();
      disp_all_freq();
    
      huffman_encode();
    }
    
    
    
    // INPUT-OUTPUT ROUTINES
    /*
      Name of Routine     :  read
      Type/Classified as  :  I/O
      Function Call       :  read() from Main
      Parameters Passed   :  NONE
      Return Type         :  Void
      Purpose             :  Reads the String that the User Enters & Verifies it
                             for Validity
    */
    void read(void)
    {
      int i,j=0,p,k;
      char tmp_char;
      char temp_char_arr[4];
      int flag_ip,end_ip_flag;
    
      clrscr();
      //printf("\n\n\n");
      printf("Enter the No. Of lines the I/P may consist of :  ");
      scanf("%d",&n);
    
      input=(char*)malloc(n*80*sizeof(char));
    
      printf("\n\nEnter the Message to be Encoded \n(At Any Time Type \"END\" and Press 'ENTER' key to terminate Reading) :\n\n");
      do{
        end_ip_flag=FALSE;
    
        for(i=j;i<n*80 && end_ip_flag==FALSE;i++)
        {
          //flushall();
          tmp_char=getchar();
    
          input[i]=tmp_char;
          if(tmp_char=='\n' && i>=3)
          {
    
            for(k=2,p=i-1;k>=0;k--,p--)
              temp_char_arr[k]=input[p];
            temp_char_arr[3]='\0';
            if(strcmp(temp_char_arr,"END")==0)
            {
              end_ip_flag=TRUE;
              /* Clipping OFF the 'END' entered by the user from the 'input' string */
              input[i-3]='\0';
    
              /* Applying Rectification for the Error NEWLINE Character in input[0] */
              for(k=0;input[k]!='\0';k++)
              input[k]=input[k+1];
    
              break;
            }
          }
        }
    
        /* Checking if String Entered is Valid */
        flag_ip=TRUE;
        for(j=0;input[j]!='\0';j++)
          if(!((input[j]>='a' && input[j]<='z') || (input[j]>='A' && input[j]<='Z') ||
               (input[j]>='0' && input[j]<='9') || (input[j]==' ' || input[j]=='\n')))
          {
            printf("\n\n\n\aINVALID CHARACTER INPUT  at \a%d Position!! \a Please Try Again!!\n\n\n\a",j);
            getch();
            input[j]='\0';
            flag_ip=FALSE;
            break;
          }
    
        clrscr();
    
        if(flag_ip==FALSE)
        {
          printf("Continue to Enter the Message to be Encoded\n(At Any Time Type \"END\" and Press 'ENTER' key to terminate Reading) :  ");
          for(j=0;input[j]!='\0';j++)
            printf("%c",input[j]);
        }
      }while(flag_ip==FALSE);
      getch();
    }
    
    
    /*
      Name of Routine     :  disp
      Type/Classified as  :  I/O
      Function Call       :  disp(input) from Main
      Parameters Passed   :  str[]
      Return Type         :  Void
      Purpose             :  Displays a given string passed to it
    */
    void disp_string(char str[])
    {
      int i;
    
      //clrscr();
      printf("\n\n\n");
      printf("The String is :  ");
      for(i=0;str[i]!='\0';i++)
      {
        //if(str[i]=='\n')
          //printf("_NEWLINE_");
        //else
          printf("%c",str[i]);
      }
      getch();
    }
    
    
    /*
      Name of Routine     :  disp_all_freq
      Type/Classified as  :  I/O
      Function Call       :  disp_all_freq()  from  Main
      Parameters Passed   :  NONE
      Return Type         :  Void
      Purpose             :  Displays the Relative/Accumulated Frequencies of
                             ALL Symbols in Alphabetical Order.
    */
    void disp_all_freq(void)
    {
      int i,max,maxpos;
    
      char c;
    
      clrscr();
      //printf("\n\n\n");
    
      printf("Relative Frequency Table is as shown :\n");
      printf("\nSymbol\t\t\tFrequency\n");
    
      max=frequency[1];
      maxpos=1;
      for(i=1;i<=MAX_NO_SYMB;i++)
        // Comment Below 'IF' to obtain Frequency Chart for ALL Symbols AND not just
        // for those with non-zero frequency
        if(frequency[i]!=0)
        {
          c=alpha_reconv(i);
          if(c=='\n')
            printf("\nNEWLINE");
          else if(c==' ')
            printf("\nSPACE");
          else
            printf("\n\t%c",c);
    
          printf("\t\t\t%d",frequency[i]);
    
          if(frequency[i]>max)
          {
            max=frequency[i];
            maxpos=i;
          }
        }
    
      getch();
      printf("\n\n");
    
      printf("\n\nMax Frequency :  %d  for  Symbol :  ",max);
      c=alpha_reconv(maxpos);
      if(c=='\n')
        printf("\nNEWLINE");
      else if(c==' ')
        printf("\nSPACE");
      else
        printf("%c",c);
      getch();
    }
    
    
    /*
      Name of Routine     :  disp_code
      Type/Classified as  :  I/O
      Function Call       :  disp_code(code)  from  'huffman_encode()'
      Parameters Passed   :  NONE
      Return Type         :  Void
      Purpose             :  Displays the Code Words for ALL Symbols found in Text
    */
    void disp_code(void)
    {
      int i;
      char c;
    
      //printf("\n\n\n\n");
      //clrscr();
      printf("Code-Words Displayed for Alphabets/Symbols in Alphabetical Order is :\n");
    
      for(i=1;i<=MAX_NO_SYMB;i++)
        if(frequency[i]!=0)
        {
          c=alpha_reconv(i);
          if(c==' ')
            printf("\n\nSPACE");
          else if(c=='\n')
            printf("\n\nNEWLINE");
          else
            printf("\n\n%c",c);
          if(code[i]==NULL)
            printf("\n\aERROR!!\a\n");
          else
            printf("\t====>\t%s",code[i]);
        }
      getch();
    
      printf("\n\n\n\n\n");
      printf("Hence, Using Huffman's Data Compression Algorithm, Encoded Message is :\n\n");
      for(i=0;input[i]!='\0';i++)
        printf("%s ",code[alpha_mapping(input[i])]);
      getch();
    }
    
    
    
    // MAPPING/HASHING & FREQUENCY INITIALIZATION ROUTINES
    /*
      Name of Routine     :  alpha_mapping
      Type/Classified as  :  Mapping/Hashing
      Function Call       :  frequency[alpha_mapping(ch)] or code[alpha_mapping(ch)]
      Parameters Passed   :  char ch - Represents the Character, which is to be Mapped
      Return Type         :  int
      Purpose             :  Maps a Character into a corresponding SrNo./location as
                             per the decided Mapping Scheme.
    */
    int alpha_mapping(char ch)
    {
      int loc;
    
      if(ch=='a')
        loc=1;
    
      //if(ch>='a' && ch<='z')
      else if(ch>='b' && ch<='z')
        loc=(ch-'a')+1;
      else if(ch>='A' && ch<='Z')
        loc=(ch-'A')+26+1;
      else if(ch>='0' && ch<='9')
        loc=(ch-'0')+52+1;
      else if(ch==' ')
        loc=63;
      else
        loc=64;
    
      return loc;
    }
    
    /*
      Name of Routine     :  alpha_reconv
      Type/Classified as  :  Mapping/Hashing
      Function Call       :  alpha_reconv(srno)
      Parameters Passed   :  int srno - Represents the Symbol Sr.No., which is to be
                             De-Mapped into a Character
      Return Type         :  char
      Purpose             :  De-Maps an Integer SrNo. into a corresponding Character
                             as per the decided Mapping Scheme.
    */
    char alpha_reconv(int srno)
    {
      char ch;
    
      if(srno>=1 && srno<=26)
        ch=(char)(srno-1+'a');
      else if(srno>=27 && srno<=52)
        ch=(char)(srno-26-1+'A');
      else if(srno>=53 && srno<=62)
        ch=(char)(srno-52-1+'0');
      else if(srno==63)
        ch=' ';
      else
        ch='\n';
    
      return ch;
    }
    
    
    
    /*
      Name of Routine     :  compute_frequencies
      Type/Classified as  :  Mapping/Frequency Initializations
      Function Call       :  compute_frequencies() from Main
      Parameters Passed   :  NONE
      Return Type         :  Void
      Purpose             :  Initializes the Accumulative Frequencies
                             of ALL symbols in the Text
    */
    void compute_frequencies(void)
    {
      int i;
    
      /* Initializing ALL Frequencies to '0' */
      for(i=0;i<=MAX_NO_SYMB;i++)
      {
        frequency[i]=0;
      }
    
      /* Computing ALL the Symbol's Frequencies */
      for(i=0;input[i]!='\0';i++)
        frequency[alpha_mapping(input[i])]++;
    }
    
    
    
    
    // BINARY TREE RELATED ROUTINES
    /*
      Name of Routine     :  makenode
      Type/Classified as  :  Binary Tree Related Routine
      Function Call       :  makenode(srno,frq,str)  from  huffman_encode()
      Parameters Passed   :  1. 'srno' - Represents the Sr.No. of the Symbol(s)
                                 being Entered as a New Node in the Tree - {Refer 'alpha_mapping()'}
                                 This parameter is 'RESET' for Composite Nodes
                             2. 'frq' - Represents the Accumulative Freq. of
                                 the Symbol(s) being Entered as a New node in the Tree
                             3. 'str[]' - Character String for Composite Nodes -
                                 This parameter is "\0" for Simple Nodes
      Return Type         :  struct bintree_node*
      Purpose             :  Creates a NEW Node in the Tree, with the above Data
                             Specifications & then Returns a pointer to this node.
    */
    struct bintree_node* makenode(int srno,int frq,char s[])
    {
      struct bintree_node* pnode;
    
      pnode=(struct bintree_node*)malloc(sizeof(struct bintree_node));
    
      if(pnode==NULL)
      {
        printf("\n\n\n\aMemory Overflow!! \a Program Terminates!!\a\n\n\n");
        exit(1);
      }
      else
      {
        /* Setting the Data Values of the New Node */
        pnode->ch_srno=srno;
        pnode->ch_freq=frq;
        if(srno==RESET)
        // We are dealing with Composite Node creation
        {
          pnode->d.str=(char*)malloc((strlen(s)+1)*sizeof(char));
          strcpy(pnode->d.str,s);
        }
        else
        // We are dealing with Simple Node creation
          pnode->d.ch=alpha_reconv(srno);
    
        /* Setting New Node as a LEAF Node */
        pnode->left=pnode->right=NULL;
    
        /* Setting it as a Root Node */
        pnode->father=NULL;
        pnode->son_type=NOT_A_SON;
    
        return pnode;
      }
      // To Suppress Compiler-Warning:
      return NULL;
    }
    
    
    
    
    // LINKED LIST RELATED ROUTINES
    /*
      Name of Routine     :  ll_insert
      Type/Classified as  :  Linked List Related Routine
                                     AND/OR
                             MAIN BRAIN / LOGICAL CORE
      Function Call       :  ll_insert(p)  from  huffman_encode()
      Parameters Passed   :  'p' - Represents a Node of the Binary Tree, which
                             is to be inserted into the Linked List
      Return Type         :  Void
      Purpose             :  Inserts an Element 'p' into the Linked List 'rootnodes'
                             keeping the LL ordered in Ascending Order (as per
                             Relative Frequency & Alphabetical Order of Binary Tree
                             Root Nodes contained in 'btroot' section)
    */
    void ll_insert(struct bintree_node *p)
    {
      struct linked_list_node *pnode;
      struct linked_list_node *current,*follow,*c,*f;
    
      int tcount1=0,tcount2=0;
    
      /* Allocating Memory for the Linked Lists's node */
      pnode=(struct linked_list_node*)malloc(sizeof(struct linked_list_node));
    
      if(pnode==NULL)
      {
        printf("\n\n\n\aMemory Overflow!! \a Program Terminates!!\a\n\n\n");
        getch();exit(1);
      }
      else
      {
        /* Setting the Data Values */
        pnode->btroot=p;
        pnode->next=NULL;
    
        /* Testing if the Linked List is EMPTY */
        if(rootnodes==NULL)
        {
          /* The Linked List is EMPTY. Hence, inserting the Very 1st node */
          rootnodes=pnode;
          rootnodes->next=NULL;
        }
        else
        {
          /* The Linked List is NOT Empty. Hence, inserting the Node at the
             Appropriate position. Hence, Searching for Place of Insertion */
          follow=NULL;
          current=rootnodes;
          while(current!=NULL)
          {
            // TEMPORARY DE-BUGGING STATEMENTS
            /*
            tcount1++;
            printf("\n\n\ntcount1 = %d",tcount1);
            printf("\ncurrent->btroot->ch_srno = %d\tcurrent->btroot->ch_freq = %d",current->btroot->ch_srno,current->btroot->ch_freq);
            */
    
            if(pnode->btroot->ch_freq < current->btroot->ch_freq)
            {
              // Node to be inserted has lesser cumulative Freq than 'current' node
    
              if(follow==NULL)
                //Inserting 'pnode' as the NEW ROOT node
                rootnodes=pnode;
              else
                //Inserting 'pnode' after 'follow'
                follow->next=pnode;
    
              //Inserting 'pnode' before 'current' OR Re-orienting Remaining LL wrt newly inserted 'pnode'
              pnode->next=current;
              break;
            }
            else if(pnode->btroot->ch_freq == current->btroot->ch_freq)
            {
              // Node to be inserted has EQUAL cumulative Freq as 'current' node
              // Hence to resolve ambiguity of correctly placing the node
              // We Initiate another transversal of LL FROM the node pointed by 'current'
              // To resolve the EQUAL frequency problem, we consider:
              // ALPHABETICAL precedence
              // coupled with
              // SIMPLE_Node>COMPOSITE_Node precedence
    
              f=follow;
              c=current;
              while(c!=NULL && c->btroot->ch_freq == pnode->btroot->ch_freq)
              {
                // TEMPORARY DE-BUGGING STATEMENTS
                /*
                tcount2++;
                printf("\n\ntcount2 = %d",tcount2);
                printf("\nc->btroot->ch_srno = %d\tc->btroot->ch_freq = %d",c->btroot->ch_srno,c->btroot->ch_freq);
                */
    
                if(pnode->btroot->ch_srno==RESET)
                // 'pnode' is a COMPOSITE NODE (a node whose label is a STRING)
                {
                  if(c->btroot->ch_srno==RESET)
                  // 'c' is a COMPOSITE NODE (a node whose label is a STRING)
                    if(udef_strcmp(pnode->btroot->d.str,c->btroot->d.str)<0)
                    // Label of 'pnode' has ALHABETICAL PRECEDENCE over Label of 'c'
                    // Hence, we insert 'pnode' BEFORE 'c'
                      break;
                }
    
                else
                // 'pnode' is a SIMPLE node (a node whose label is a CHARACTER)
                {
                  if(c->btroot->ch_srno==RESET)
                  // 'c' is a COMPOSITE NODE (a node whose label is a STRING)
                  // Since 'pnode' is a SIMPLE node and 'c' is a COMPOSITE node
                  // Hence, we insert 'pnode' BEFORE 'c'
                    break;
    
                  else if(alpha_mapping(pnode->btroot->d.ch) < alpha_mapping(c->btroot->d.ch))
                  // Label of 'pnode' has ALHABETICAL PRECEDENCE over Label of 'c'
                  // Hence, we insert 'pnode' BEFORE 'c'
                    break;
                }
    
                /* Continuing our Rightward Traversal */
                f=c;
                c=c->next;
              }
    
              if(f==NULL)
              // 'pnode' must be inserted as 1st node before 'current'
                rootnodes=pnode;
    
              else
              // 'pnode' is inserted after 'follow'
                f->next=pnode;
    
              // 'pnode' is inserted before 'current' OR Remaining LL re-oriented wrt newly inserted 'pnode'
              pnode->next=c;
              break;
            }
    
            /* Continuing our Rightward Traversal */
            follow=current;
            current=current->next;
          }
    
          if(current==NULL)
          // LL has come to an end - Hence insert 'pnode' as LAST node
          {
            follow->next=pnode;
            pnode->next=current;
          }
        }
    
        // TEMPORARY DE-BUGGING STATEMENTS
        //getch();
      }
    }
    
    
    /*
      Name of Routine     :  ll_min_delete
      Type/Classified as  :  Linked List Related Routine
      Function Call       :  p=ll_min_delete(rootnodes)  from  huffman_encode()
      Parameters Passed   :  NONE
      Return Type         :  struct bintree_node*
      Purpose             :  Removes a Min Valued Element (wrt Frequency & Alphabetical Order)
                             And Returns it. As Insertion is done in an ORDERED fashion,
                             this routine simply removes each time the FIRST node
                             in the link list.
    */
    struct bintree_node* ll_min_delete(void)
    {
      struct linked_list_node *current;
      struct bintree_node *min;
    
      if(rootnodes==NULL)
      {
        // THIS CODE-SEGMENT SHOULD/WILL NEVER EXECUTE
        printf("\n\n\n\aLinked List EMPTY!! \a ERROR!! CANNOT DELETE!! PROGRAM TERMINATES!!\a\n\n\n");
        getch();
        exit(1);
      }
      else
      {
        current=rootnodes;
        rootnodes=current->next;
    
        min=current->btroot;
        free(current);
        return min;
      }
      // To Suppress Compiler-Warning:
      return NULL;
    }
    
    
    /*
      Name of Routine     :  verify_multi_entries
      Type/Classified as  :  Linked List Related Routine
      Function Call       :  if(verify_multi_entries(rootnodes))  from  huffman_encode()
      Parameters Passed   :  NONE
      Return Type         :  int
      Purpose             :  Tests the presence of Multiple nodes in a Linked List
                             If Multiple nodes present, then Returns TRUE
                             Else, Returns FALSE
    */
    int verify_multi_entries(void)
    {
      int count=0;
      struct linked_list_node *current;
    
      current=rootnodes;
    
      while(current!=NULL)
      {
        count++;
        current=current->next;
      }
    
      if(count>1)
        return TRUE;
    
      return FALSE;
    }
    
    
    /*
      Name of Routine     :  udef_strcmp
      Type/Classified as  :  Linked List Related Routine
      Function Call       :  if(udef_strcmp(p->btroot->d.str,c->btroot->d.sr))
                             from  ll_insert()
      Parameters Passed   :  's1[] & s2[]' - Both Strings, which are to be compared
                             as per the Mapping Scheme
      Return Type         :  int
      Purpose             :  Tests the Aplhabetical Precedence of 2 strings passed to
                             it NOT as per ASCII Scheme, but as per the Programmer
                             decided Mapping Scheme. {Refer alpha_mapping()}
    */
    int udef_strcmp(char s1[],char s2[])
    {
      int i;
      int p,q;
    
      for(i=0;s1[i]!='\0' && s2[i]!='\0';i++)
      {
        p=alpha_mapping(s1[i]);
        q=alpha_mapping(s2[i]);
        if(p!=q)
          return (p-q);
      }
    
      if(s1[i]=='\0')
        return (-p);
      else if(s2[i]=='\0')
        return (q);
    
      // To Suppress Compiler-Warning:
      return 1;
    }
    
    
    
    // LL_Insertion_GROUNDWORK ROUTINES
    /*
      Name of Routine     :  compute_composite_name
      Type/Classified as  :  LL_Insertion_Groundwork
      Function Call       :  compute_composite_name(p1,p2)
      Parameters Passed   :  'p1' & 'p2'  -  2 Binary Tree Nodes
                              whose Strings/Characters must be conjoined
                              into a single composite string.
      Return Type         :  char*
      Purpose             :  Joins the 2 Strings (Char-Char,Char-Str,Str-Char,Str-Str)
                             of the nodes 'p1' & 'p2' & returns a pointer to the
                             String Memory block.
    */
    char* compute_composite_name(struct bintree_node *p1,struct bintree_node *p2)
    {
      char *s;
      char temp[10];
    
      if(p1->ch_srno==RESET)
      {
        if(p2->ch_srno==RESET)
        {
          s=(char*)malloc((strlen(p1->d.str)+strlen(p2->d.str)+1)*sizeof(char));
          strcpy(s,p1->d.str);
          strcat(s,p2->d.str);
        }
        else
        {
          switch(p2->d.ch)
          {
            case ' ':
              strcpy(temp,"_SPC_");
              break;
            case '\n':
              strcpy(temp,"_NEW_");
              break;
            default:
              temp[0]=p2->d.ch;
              temp[1]='\0';
          }
          s=(char*)malloc((strlen(p1->d.str)+strlen(temp)+1)*sizeof(char));
          strcpy(s,p1->d.str);
          strcat(s,temp);
        }
      }
      else
      {
        switch(p1->d.ch)
        {
          case ' ':
            strcpy(temp,"_SPC_");
            break;
          case '\n':
            strcpy(temp,"_NEW_");
            break;
          default:
            temp[0]=p1->d.ch;
            temp[1]='\0';
        }
    
        if(p2->ch_srno==RESET)
        {
          s=(char*)malloc((strlen(temp)+strlen(p2->d.str)+1)*sizeof(char));
          strcpy(s,temp);
          strcat(s,p2->d.str);
        }
        else
        {
          s=(char*)malloc((2*strlen(temp)+1)*sizeof(char));
          strcpy(s,temp);
          switch(p2->d.ch)
          {
            case ' ':
              strcpy(temp,"_SPC_");
              break;
            case '\n':
              strcpy(temp,"_NEW_");
              break;
            default:
              temp[0]=p2->d.ch;
              temp[1]='\0';
          }
          strcat(s,temp);
        }
      }
      return s;
    }
    
    
    
    // MAIN BRAIN/LOGICAL CORE ROUTINE(s)
    /*
      Name of Routine     :  huffman_encode
      Type/Classified as  :  MAIN BRAIN / LOGICAL CORE
      Function Call       :  huffman_encode() from Main
      Parameters Passed   :  NONE
      Return Type         :  Void
      Purpose             :  Does the Actual Process of Huffman Encoding - Implements
                             the Algorithm. Thus, its the Brain of the Program
    */
    void huffman_encode(void)
    {
      int i,j,l,effective_no_symb=0;
      char *s;
    
      struct bintree_node *p,*p1,*p2,*root;
    
      /* Forming the Linked List, Initially Empty */
      rootnodes=NULL;
    
      /* Creating 'effective_no_symb' number of Distinct Binary Trees, each of which
         consist initially of only the 'root' element */
      for(i=1;i<=MAX_NO_SYMB;i++)
        if(frequency[i]!=0)
        {
          effective_no_symb++;
    
          /* Construct a NEW Binary Tree with a Single 'root' node */
          p=makenode(i,frequency[i],"\0");
    
          // TEMPORARY DE-BUGGING STATEMENTS
          /*
          clrscr();
          printf("p = %X\tp->ch_srno = %d\tp->ch_freq = %d\tp->d.ch = %c",p,p->ch_srno,p->ch_freq,p->d.ch);
          */
    
          /* Assigning to Pointer contained in 'position[i]', the Address of 'p' */
          position[i]=p;
    
          /* Inserting the Root Node 'P' of the current Distinct Binary Tree into
             the Linked List 'rootnodes' */
          ll_insert(p);
    
          // TEMPORARY DE-BUGGING STATEMENTS
          /*
          printf("\n\nLinked List now contains :\n");
          disp_ll();
          */
        }
    
      // TEMPORARY DE-BUGGING STATEMENTS
      printf("\n\nNodes Inserted into Linked List Totally(i.e. Effective No. of Symb) =  %d\a\n\n",effective_no_symb);
      /*
      printf("\n\nLinked List after insertion of ALL Nodes is :\n");
      disp_ll();
      getch();
      */
    
    
      /* Executing Loop As Long As there are More Than 1 Elements in the Queue 'rootnodes' */
      while(verify_multi_entries())
      {
        /* Removing the 2 Minimum Elements from the Priority Queue */
        p1=ll_min_delete();
        p2=ll_min_delete();
    
        /* Computing the Composite Node's Composite String Name &
           Hence, forming a new node, whose Frequency is Sum of the Frequencies
           of the 2 Minimum Nodes Ejected */
        s=compute_composite_name(p1,p2);
        p=makenode(RESET,(p1->ch_freq+p2->ch_freq),s);
    
    
        // TEMPORARY DE-BUGGING STATEMENTS
        /*
        printf("\n\n\n");
        printf("\n\nTwo Nodes Deleted ('p1' & 'p2') :\n");
        printf("\np1 = %X\tp1->ch_srno = %d\tp1->ch_freq = %d",p1,p1->ch_srno,p1->ch_freq);
        if(p1->ch_srno!=RESET)
          printf("\tp1->d.ch = %c",p1->d.ch);
        else
          printf("\tp1->d.str = %s",p1->d.str);
    
        printf("\n\np2 = %X\tp2->ch_srno = %d\tp2->ch_freq = %d",p2,p2->ch_srno,p2->ch_freq);
        if(p2->ch_srno!=RESET)
          printf("\tp2->d.ch = %c",p2->d.ch);
        else
          printf("\tp2->d.str = %s",p2->d.str);
    
        printf("\n\np = %X\tp->ch_srno = %d\tp->ch_freq = %d",p,p->ch_srno,p->ch_freq);
        if(p->ch_srno!=RESET)
          printf("\tp->d.ch = %c",p->d.ch);
        else
          printf("\tp->d.str = %s",p->d.str);
        getch();
        */
    
        // TEMPORARY DE-BUGGING STATEMENTS
        /*
        printf("\n\n\nLinked List AFTER Removal of the 2 Min Nodes (And BEFORE Insertion of 3rd Composite Node) :\n");
        disp_ll();
        getch();
        */
    
    
        /* Setting 'p1' as the LEFT Son of 'p' & Setting 'p2' as its RIGHT Son */
        p->left=p1;
        p->right=p2;
        p1->son_type=LEFT_SON;
        p2->son_type=RIGHT_SON;
        p1->father=p2->father=p;
    
        /* Inserting Node 'p' back into the Linked List */
        ll_insert(p);
    
    
        // TEMPORARY DE-BUGGING STATEMENTS
        /*
        printf("\n\n\n\nLinked List AFTER Insertion of 3rd Composite Node :\n");
        disp_ll();
        getch();
        */
      }
    
    
      /* Tree is Created. Now, Constructing the Codes */
      /* Extracting the Single Binary Tree that NOW Exists as a
         Singular Entry of Linked List 'rootnodes' */
      root=ll_min_delete();
    
      for(i=0;i<=MAX_NO_SYMB;i++)
        for(j=0;j<=MAX_BITS_PER_SYMB;j++)
          code[i][j]='\0';
    
      //clrscr();
      printf("\n\n\n\n\n");
      printf("FINDING CODE-WORDS :\n");
      for(i=1;i<=MAX_NO_SYMB;i++)
        if(frequency[i]!=0)
        {
          /* Travelling & Finding Code-Word */
          p=position[i];
    
          printf("\n\nFor Symbol  '%c'\nCode-Word is :\t====>\t",p->d.ch);
    
          while(p!=root)
          {
            l=strlen(code[i]);
            for(j=l;j>0;j--)
              code[i][j]=code[i][j-1];
    
            if(p->son_type==LEFT_SON)
            {
              code[i][0]='0';
              printf("0 ( l = %d ) \t",l);
            }
            else
            {
              code[i][0]='1';
              printf("1 ( l = %d ) \t",l);
            }
            code[i][l+1]='\0';
    
            /* Travelling to the Father of 'p' */
            p=p->father;
          }
          printf("\n'code[%d]' contents :  %s\t\tStrlen(code[%d]) = %d",i,code[i],i,strlen(code[i]));
          getch();
        }
    
        /* Displaying the Code-Word formed */
        disp_code();
    }// Routine Ends
    
    
    
    // TEMPORARY DE-BUGGING ROUTINES BEGIN
    /*
      Name of Routine     :  huffman_encode
      Type/Classified as  :  Temporary De-Bugging Routine
                             (U can delete this routine & all its associated
                              Function calls, without affecting the Logical
                              Working of the Program. Its just for the purpose
                              of Demonstrating the inner Working & De-Bugging)
      Function Call       :  huffman_encode() from Main
      Parameters Passed   :  NONE
      Return Type         :  Void
      Purpose             :  Displays the SLL 'rootnodes' as it currently is.
    */
    void disp_ll(void)
    {
      int count=0;
      struct linked_list_node *current=rootnodes;
    
      printf("\n\nLINKED LIST BEGINS :\n");
    
      if(rootnodes==NULL)
        printf("\n\aLinked List EMPTY!!\a\n\n");
    
      //current=rootnodes;
      while(current!=NULL)
      {
        count++;
        //printf("\a");
        printf("\n\nNode :  %d\tcurrent = %X\tcurrent->btroot = %X",count,current,current->btroot);
        printf("\ncurrent->btroot->ch_srno = %d\t",current->btroot->ch_srno);
        if(current->btroot->ch_srno==RESET)
          printf("current->btroot->d.str = %s",current->btroot->d.str);
        else
          printf("current->btroot->d.ch = %c",current->btroot->d.ch);
        printf("\ncurrent->btroot->ch_freq = %d",current->btroot->ch_freq);
        current=current->next;
      }
      printf("\n\nLINKED LIST ENDS!!");
    }
     

    Attached Files:

    Last edited: Feb 11, 2008
  2. shabbir

    shabbir Administrator Staff Member

    Rajiv, Its long time but good to see you again here. I havent gone through the code but I assume it would be something you have always been (great).
     
  3. rai_gandalf

    rai_gandalf New Member

    Documentation n Samples in attached Zip File

    I forgot to mention, download the attached Zip File as it contains not only the C File, but also some documentation and some ready-made samples. :)

    Ciao,
    Rajiv

    PS: A little bit of feedback is always encouragin :eek: :)
     
  4. ajelectrowhiz

    ajelectrowhiz New Member

    Hey... im new around here.. but nice one....
     
  5. ReekenX

    ReekenX New Member

    Nice, I like it :)
     
  6. sun_kangane

    sun_kangane New Member

    hiy that was very nice.................
     
  7. Peter_APIIT

    Peter_APIIT New Member

    Excellent programming techniques. By the way, i don't understand what you writing. I only know is a program to zip file.

    Besides that, i also have a suggestion , why you don't try to code the linux compression algorithm and post back in future. Keep up the good work.
     
  8. mikcarl

    mikcarl New Member

    Do you have a program for Huffman decode the already encoded data in C++.
     
  9. mikcarl

    mikcarl New Member

    Oh and can you create huffman code that reads the data that it has to encode from a text file and then decodes the data and sends it to the text file and the code does not ask for the IP in C++ and by the way when i compile your this program it does not compile something wrong with it.
     
  10. kiranmanjappa

    kiranmanjappa New Member

    good.... looking forward for LZW implementations.... ..:)
     
  11. shwethap

    shwethap New Member

    hi can u plz post the C code of huffman decoder....
    I worked with ur huffman encoder pgm ... The coding technique is fantastic and working fine....
    plz i ll be waiting ....
     
  12. InfoMan

    InfoMan New Member

    Great rai,

    Your Code was very nice, good for ya.
     
  13. jacknespor2000

    jacknespor2000 New Member

    Do you know of how to change your implementation from reading from stdin to reading from a file rather? I'm attempting to use your code to read in a file and compress the text in the file. Could you give me any pointers as to how this should be done?
     
  14. jacknespor2000

    jacknespor2000 New Member

    Actually i figured out the solution to my last question. But now I'm wondering if you've written a decoding function for the huffman algorithm. I've had no luck so far and am pretty lost. Any help you could give me would be extremely appreciated!

    Thanks!
     
  15. hour

    hour New Member

    excuse me, after compress, can this program decompress it? if not, can you show me the algorithm to decompress it. ^^
     
  16. sura

    sura Banned

    It is nice . I like it.....
     

Share This Page