python - Cache Proxy Server Returning 404 with www.google.com
I have a homework assignment that involves implementing a proxy cache server in Python for web pages. Here is my implementation of it:
```python
from socket import *
import sys

def main():
    # Create server socket, bind to a port, and start listening
    tcpSerSock = socket(AF_INET, SOCK_STREAM)  # initializing socket
    tcpSerSock.bind(("", 8030))                # binding socket to port
    tcpSerSock.listen(5)                       # listening for page requests

    while True:
        # Start receiving data from the client
        print 'Ready to serve...'
        tcpCliSock, addr = tcpSerSock.accept()
        print 'Received a connection from:', addr
        message = tcpCliSock.recv(1024)
        print message

        # Extract the filename from the given message
        filename = ""
        try:
            filename = message.split()[1].partition("/")[2].replace("/", "")
        except:
            continue
        fileExist = False
        try:
            # Check whether the file exists in the cache
            f = open(filename, "r")
            outputdata = f.readlines()
            fileExist = True
            # ProxyServer finds a cache hit and generates a response message
            tcpCliSock.send("HTTP/1.0 200 OK\r\n")
            tcpCliSock.send("Content-Type:text/html\r\n")
            for data in outputdata:
                tcpCliSock.send(data)
            print 'Read from cache'
        except IOError:
            # Error handling for file not found in cache
            if fileExist == False:
                c = socket(AF_INET, SOCK_STREAM)  # create a socket on the proxy server
                try:
                    srv = getaddrinfo(filename, 80)
                    c.connect((filename, 80))
                    # https://docs.python.org/2/library/socket.html
                    # Create a temporary file on this socket and ask port 80
                    # for the file requested by the client
                    fileobj = c.makefile('r', 0)
                    fileobj.write("GET " + "http://" + filename + " HTTP/1.0\r\n")
                    # Read the response into buffer
                    buffr = fileobj.readlines()
                    # Create a new file in the cache for the requested file.
                    # Send the response in the buffer to the client socket and
                    # write the corresponding file to the cache
                    tmpFile = open(filename, "wb")
                    for data in buffr:
                        tmpFile.write(data)
                        tcpCliSock.send(data)
                except:
                    print "Illegal request"
            else:
                # File not found
                print "404: File Not Found"
        tcpCliSock.close()  # Close the client and server sockets

main()
```
I configured my browsers to use this proxy server.
But the problem is that when I run it, no matter which web page I try to access, it returns a 404 error on the initial connection and a connection reset error on subsequent connections. I have no idea why. Any help would be appreciated, thanks!
There are quite a number of issues with your code.
Your URL parser is quite cumbersome. Instead of the line
```python
filename = message.split()[1].partition("/")[2].replace("/", "")
```
I would use
```python
import re
parsed_url = re.match(r'GET\s+http://(([^/]+)(.*))\sHTTP/1.*$', message)
local_path = parsed_url.group(3)
host_name = parsed_url.group(2)
filename = parsed_url.group(1)
```
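For instance, applied to a typical proxy-form request line (the sample request below is my own example; the syntax works under both Python 2 and 3), the three groups break down like this:

```python
import re

# A typical request line as a browser would send it to a proxy (assumed example)
message = "GET http://www.google.com/index.html HTTP/1.0\r\n"

parsed_url = re.match(r'GET\s+http://(([^/]+)(.*))\sHTTP/1.*$', message)
filename = parsed_url.group(1)    # host + path, usable as a cache key
host_name = parsed_url.group(2)   # the host to connect to on port 80
local_path = parsed_url.group(3)  # the path to request from that host
```

Here `filename` is `"www.google.com/index.html"`, `host_name` is `"www.google.com"`, and `local_path` is `"/index.html"`.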
If the match fails there, you should return an error, because it is a request the proxy doesn't understand (e.g. POST).
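That error path could look like this (a sketch: the helper name and the 501 reply are my choices, not part of the assignment):

```python
import re

REQUEST_RE = re.compile(r'GET\s+http://(([^/]+)(.*))\sHTTP/1.*$')

def parse_request(message):
    """Return (filename, host, path), or None if this isn't a GET
    request the proxy understands (e.g. POST, CONNECT)."""
    m = REQUEST_RE.match(message)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)

# In the accept loop you would then do something like:
#   parsed = parse_request(message)
#   if parsed is None:
#       tcpCliSock.send("HTTP/1.0 501 Not Implemented\r\n\r\n")
#       tcpCliSock.close()
#       continue
```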
When you assemble the request to the destination server, use
```python
fileobj.write("GET {object} HTTP/1.0\n".format(object=local_path))
fileobj.write("Host: {host}\n\n".format(host=host_name))
```
Better yet, you should include all of the header lines from the original request, because they can make a major difference to the returned content.
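One way to do that (a sketch, with a helper name of my own choosing; it assumes `message` holds the browser's full request as a string) is to rewrite only the request line and pass the remaining header lines through unchanged:

```python
def build_upstream_request(message, local_path, host_name):
    """Rewrite the proxy-form request line to origin form and keep the
    client's original header lines, adding Host: if it was missing."""
    lines = message.split("\r\n")
    out = ["GET {} HTTP/1.0".format(local_path or "/")]
    has_host = False
    for line in lines[1:]:
        if not line:
            break  # blank line marks the end of the headers
        if line.lower().startswith("host:"):
            has_host = True
        out.append(line)
    if not has_host:
        out.append("Host: " + host_name)
    return "\r\n".join(out) + "\r\n\r\n"
```

You would then send the result of this helper to the upstream socket instead of hand-building the two lines above.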
Furthermore, since you cache the entire response including its header lines, you should not add your own headers when serving from the cache.
What you have doesn't work anyway, because there is no guarantee that you got a 200 response with text/html content. You should check the response code and only cache if you did indeed get a 200.
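A minimal version of that check, sketched as a helper (the function name is mine; it inspects only the status line of the buffered response):

```python
def should_cache(response_head):
    """Return True only if the response status line reports 200."""
    status_line = response_head.split("\r\n", 1)[0]
    parts = status_line.split()
    # Expect e.g. "HTTP/1.0 200 OK"
    return len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1] == "200"
```

In the proxy loop you would always relay the response to the client, but only open and write the cache file when `should_cache(...)` is true.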