
How to get source of a Web Page in C

Discussion in 'C' started by lionaneesh, Jan 31, 2011.

  1. lionaneesh

    lionaneesh Active Member

    Joined:
    Mar 21, 2010
    Messages:
    848
    Likes Received:
    224
    Trophy Points:
    43
    Occupation:
    Student
    Location:
    India
    Note: The following source only works with servers that do not use chunked transfer encoding. Servers that send their responses with chunked encoding will not work properly with it.

    I assume basic knowledge of the UNIX sockets API and the C language as prerequisites.


    Source


    Code:
    #include<stdio.h>
    #include<netdb.h>
    #include<sys/types.h>
    #include<sys/socket.h>
    #include<arpa/inet.h>
    #include<string.h>
    #include<unistd.h>				// for close()
    
    #define RESPONSE_RECV_LIMIT 3000
    #define SOURCE_START_IDENTIFIER "<!DOCTYPE"
    #define SOURCE_START_IDENTIFIER2 "<html>" 		// fallback marker, used when the page has no DOCTYPE
    #define FILENAME "/"		 		// ENTER THE FILENAME HERE
    #define PORT	"80"			 		// default for web-browsers
    
    int main(int argc , char *argv[])
    {
    	if(argc != 2)
    	{
    		fprintf(stderr,"Usage: %s hostname\n",argv[0]);
    		return(1);
    	}
    
    	char response[RESPONSE_RECV_LIMIT+1];  // + 1 is for null
    	char *source;
    	int sockfd,err;
    	struct addrinfo *p,hints,*res;
    	int len;
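    	char request[512];

    	// snprintf cannot overflow the buffer; a hostname that does not fit is rejected
    	len = snprintf(request,sizeof request,"GET %s HTTP/1.1\r\nHost: %s\r\n\r\n",FILENAME,argv[1]);
    	if(len < 0 || (size_t)len >= sizeof request)
    	{
    		fprintf(stderr,"Hostname too long\n");
    		return(1);
    	}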
    
    	// print the request we are making
    
    	printf("%s\n\n",request);
    
    	memset(&hints,0,sizeof hints);
    
    	hints.ai_socktype=SOCK_STREAM;
    
    	hints.ai_family=AF_UNSPEC;
    
    	if ((err = getaddrinfo(argv[1],PORT, &hints, &res)) != 0)
    	{
    		fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
    		return 1;
    	}
    
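    	for(p=res;p!=NULL;p=p->ai_next)
    	{
    		if( ( sockfd = socket(p->ai_family,p->ai_socktype,p->ai_protocol) ) == -1)
    		{
    			perror("client: socket");
    			continue; // try the next address
    		}

    		if (connect(sockfd, p->ai_addr, p->ai_addrlen) == -1)
    		{
    			close(sockfd);
    			perror("client: connect");
    			continue;
    		}

    		break; // connected, so stop trying further addresses
    	}

    	if(p == NULL)
    	{
    		fprintf(stderr,"client: failed to connect\n");
    		freeaddrinfo(res);
    		return(1);
    	}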
    
    	if(send(sockfd,request,strlen(request),0) != (ssize_t)strlen(request))
    	{
    		perror("send");
    		freeaddrinfo(res);
    		close(sockfd);
    		return(1);
    	}
    
    	freeaddrinfo(res);
    
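    	len = recv(sockfd,response,RESPONSE_RECV_LIMIT,0);
    	if(len <= 0)
    	{
    		if(len < 0)
    			perror("recv");
    		else
    			fprintf(stderr,"recv: connection closed by server\n");
    		close(sockfd);
    		return(1);
    	}
    	response[len] = '\0'; // recv() does not null-terminate, and strstr() needs it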
    
    	close(sockfd); // we dont need it any more
    
    //	printf("%s",response); // for debugging purposes
    
    	source = strstr(response,SOURCE_START_IDENTIFIER);

    	if(source == NULL)
    	{
    		source = strstr(response,SOURCE_START_IDENTIFIER2);
    	}
    	if(source == NULL)
    	{
    		fprintf(stderr,"No page source found in the response\n");
    		return(1);
    	}
    	printf("%s\n",source);
    	return(0);
    }
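
The program above finds the page by searching for "<!DOCTYPE" or "<html>". As a hedged alternative (not part of the original source), the body of an HTTP response can also be located by looking for the blank line that ends the headers. The function name `http_body` is my own invention for this sketch, and it assumes the response buffer is null-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first byte of the body inside a
 * null-terminated HTTP response, or NULL if the blank line
 * ("\r\n\r\n") that ends the headers is not in the buffer.
 * http_body is a hypothetical helper name, not from the article. */
const char *http_body(const char *response)
{
    const char *sep;

    assert(response != NULL);
    sep = strstr(response, "\r\n\r\n");
    return sep ? sep + 4 : NULL;   /* skip the 4 separator bytes */
}
```

This works for pages that start with neither marker, for example plain text or a page whose tag is written as `<HTML>`.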
    
    Compiling:
    Code:
    gcc getSource.c -o getSource 
    

    Sample



    The sample below was run against Apache on my own machine.

    You can see the server's settings here:

    Code:
    aneesh@aneesh-laptop:~/articles/C/getSrc$ telnet 127.0.0.1 80
    Trying 127.0.0.1...
    Connected to 127.0.0.1.
    Escape character is '^]'.
    GET / HTTP/1.1

    HTTP/1.1 400 Bad Request
    Date: Mon, 31 Jan 2011 16:04:45 GMT
    Server: Apache/2.2.14 (Ubuntu)
    Vary: Accept-Encoding
    Content-Length: 301
    Connection: close
    Content-Type: text/html; charset=iso-8859-1
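
Headers like the ones in this telnet session can be read from the response buffer in C as well. The following is only a minimal sketch under my own assumptions: the names `http_status` and `http_content_length` are made up for illustration, the buffer must be null-terminated, and `strncasecmp` comes from POSIX `<strings.h>`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Parse the status code out of a status line such as
 * "HTTP/1.1 400 Bad Request". Returns -1 if it cannot be parsed. */
int http_status(const char *response)
{
    int major, minor, code;

    assert(response != NULL);
    if (sscanf(response, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}

/* Return the value of the Content-Length header, or -1 if the
 * header block does not contain one. Only the headers are scanned:
 * the loop stops at the blank line that precedes the body. */
long http_content_length(const char *response)
{
    static const char name[] = "Content-Length:";
    const char *line = response;

    assert(response != NULL);
    while (line && *line && strncmp(line, "\r\n", 2) != 0) {
        if (strncasecmp(line, name, sizeof name - 1) == 0)
            return strtol(line + sizeof name - 1, NULL, 10);
        line = strstr(line, "\r\n");   /* move to the next header line */
        if (line)
            line += 2;
    }
    return -1;
}
```

For the response shown above, these would give a status of 400 and a content length of 301. The 400 comes from the telnet request having no Host header, which HTTP/1.1 requires; the program sends one.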
    
    Output:
    Code:
    aneesh@aneesh-laptop:~/articles/C/getSrc$ ./getSource 127.0.0.1
    GET / HTTP/1.1
    Host: 127.0.0.1



    <html><body><h1>It works!</h1>
    <p>This is the default web page for this server.</p>
    <p>The web server software is running but no content has been added, yet.</p>
    </body></html>
    
    
    Stay tuned, guys: I am working hard on adding support for chunked data, and maybe I'll write another article about it.
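
For readers curious what chunked support might involve, here is a rough sketch of decoding a chunked body that is already fully in memory. It is not the author's implementation. The function name `decode_chunked` is hypothetical, the input is assumed to be a null-terminated text body, and trailers after the last chunk are ignored. In chunked encoding, each chunk is a hexadecimal size line followed by that many bytes and a CRLF, and a chunk of size 0 ends the body.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Decode a chunked HTTP body held in the null-terminated string `in`
 * into `out` (capacity `cap`, including room for the terminating '\0').
 * Returns the number of decoded bytes, or -1 if the input is malformed,
 * truncated, or does not fit. Note that strtoul is lenient: it also
 * accepts leading whitespace and a "0x" prefix. */
long decode_chunked(const char *in, char *out, size_t cap)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0)
        return -1;

    for (;;) {
        char *end;
        const char *eol;
        unsigned long size = strtoul(in, &end, 16);  /* chunk size is hex */

        if (end == in)
            return -1;                /* no hex digits found */
        eol = strstr(end, "\r\n");    /* skips any chunk extensions */
        if (eol == NULL)
            return -1;
        in = eol + 2;

        if (size == 0)
            break;                    /* last chunk reached */
        if (size >= cap - used)
            return -1;                /* would not fit with the '\0' */
        if (strlen(in) < size + 2)
            return -1;                /* chunk data is truncated */

        memcpy(out + used, in, size);
        used += size;
        in += size;
        if (strncmp(in, "\r\n", 2) != 0)
            return -1;                /* each chunk must end with CRLF */
        in += 2;
    }

    out[used] = '\0';
    return (long)used;
}
```

A real client would also have to keep calling recv() until the final zero-size chunk arrives, which this sketch leaves out.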
     
  2. lionaneesh

    lionaneesh Active Member

    Thanks for accepting my article.
    I hope you guys like it!
     
    Scripting likes this.
  3. nicolerisse

    nicolerisse Banned

    Joined:
    Feb 18, 2011
    Messages:
    6
    Likes Received:
    0
    Trophy Points:
    0
    I don't accept it...
     
  4. lionaneesh

    lionaneesh Active Member

    Sorry, but what is it that you can't accept?
    Please be descriptive in your posts.
     
  5. somay

    somay New Member

    Joined:
    Apr 29, 2011
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    0
    Socket programming in C and C++,
    and TCP/IP programming as well.
     
     
