[OUTDATED – It has been a while since I looked at this, so it’s probably very outdated. Please check the comments.]
I came across a nice example of a Twisted “man-in-the-middle” style proxy on Stack Overflow. This style of proxy is great for logging traffic between two endpoints, as well as modifying the requests and responses that travel between them.
The original was posted here, and I reproduce the vast majority of the code below with some modifications. My real motivation for posting this is to “get the code out there”, because I had a hard time finding it originally. A big thanks to the original author for posting his code on Stack Overflow.
All you need to do is change the three constants at the top, and add whatever validation/modification logic you want in the dataReceived and write methods. Those four methods are labeled so you know which “hop” the data is taking. A request is going to take the following path: client => proxy => server => proxy => client. For example, the first dataReceived method handles data travelling from the client to your proxy.
```python
#!/usr/bin/env python

LISTEN_PORT = 8000
SERVER_PORT = 1234
SERVER_ADDR = "server address"

from twisted.internet import protocol, reactor

# Adapted from http://stackoverflow.com/a/15645169/221061

class ServerProtocol(protocol.Protocol):
    def __init__(self):
        self.buffer = b""  # data received before the server connection is up
        self.client = None

    def connectionMade(self):
        factory = protocol.ClientFactory()
        factory.protocol = ClientProtocol
        factory.server = self

        reactor.connectTCP(SERVER_ADDR, SERVER_PORT, factory)

    # Client => Proxy
    def dataReceived(self, data):
        if self.client:
            self.client.write(data)
        else:
            self.buffer += data  # accumulate; several chunks may arrive early

    # Proxy => Client
    def write(self, data):
        self.transport.write(data)


class ClientProtocol(protocol.Protocol):
    def connectionMade(self):
        self.factory.server.client = self
        self.write(self.factory.server.buffer)
        self.factory.server.buffer = b""

    # Server => Proxy
    def dataReceived(self, data):
        self.factory.server.write(data)

    # Proxy => Server
    def write(self, data):
        if data:
            self.transport.write(data)


def main():
    factory = protocol.ServerFactory()
    factory.protocol = ServerProtocol

    reactor.listenTCP(LISTEN_PORT, factory)
    reactor.run()


if __name__ == '__main__':
    main()
```
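As a rough illustration of the kind of logic that could go into those four methods, here is a minimal sketch of a per-hop hook. The function `process`, its `log` parameter, the hop labels, and the `User-Agent` rewrite are all hypothetical and not part of the original code; the idea is only that each `dataReceived` or `write` method could pass its bytes through something like this before forwarding them.

```python
def process(hop, data, log=print):
    """Log one chunk for the given hop and return the bytes to forward.

    `hop` is a free-form label such as "client=>proxy"; `data` is the raw
    bytes handed to dataReceived or write. Both names are illustrative.
    """
    log("%s: %d bytes: %r" % (hop, len(data), data[:60]))
    if hop == "client=>proxy":
        # Example modification (an assumption, for illustration only):
        # rewrite a header value before it reaches the server.
        data = data.replace(b"User-Agent: old", b"User-Agent: new")
    return data
```

In the proxy, the first `dataReceived` might then forward `process("client=>proxy", data)` instead of `data`. Keep in mind that TCP delivers data in arbitrary chunks, so a pattern that straddles two chunks would not be matched by a simple per-chunk replace like this one.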
It seems to get slow for some pages, but it is exceptionally fast when using DuckDuckGo or Google. Sometimes, when I activate the Twisted logger, I see “[Uninitialized] Stopping factory twisted” errors, and then the web browser keeps “loading nothing” indefinitely.
That’s odd.
What is the code that you are trying?
I tried your code and set my connection to use the proxy on localhost, port 8000. When I navigate to a website, it loads forever and nothing shows up from the MITM proxy on the console.
I also noticed what @DarkRedman mentioned. It resulted in timeouts. Any way to fix this?
I have never noticed anything like that. Any more details so I can try to reproduce? e.g. code you were using (if more than what I posted)
That’s because this is not a web proxy. It is a TCP proxy, which means that Twisted simply forwards the raw bytes of the TCP stream to the target server without modification. Web proxies rely on a slightly different protocol to proxy requests. This is what a web browser request looks like WITHOUT using a proxy:
GET / HTTP/1.1
Host: foo.com
This is what a web browser request looks like when it is configured to use a proxy:
GET http://foo.com/ HTTP/1.1
Host: foo.com
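To make the difference between the two request lines concrete, here is a minimal sketch of how a web proxy might turn the proxy-style request line back into the plain form and work out where to connect. The function name `to_origin_form` is hypothetical, and the sketch only looks at the first line of the request; it is not a full HTTP parser.

```python
from urllib.parse import urlsplit


def to_origin_form(line):
    """Return (host, port, rewritten_line) for a proxy-style request line.

    A line that already uses a plain path is returned unchanged, with
    host and port set to None.
    """
    method, target, version = line.strip().split(b" ", 2)
    if not target.lower().startswith((b"http://", b"https://")):
        return None, None, line.strip()
    parts = urlsplit(target.decode("ascii"))
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Fall back to the default port of the scheme when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    rewritten = b" ".join([method, path.encode("ascii"), version])
    return parts.hostname, port, rewritten
```

For the example above, `to_origin_form(b"GET http://foo.com/ HTTP/1.1")` gives the host `foo.com`, port 80, and the request line `GET / HTTP/1.1`.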
Even SSL/TLS requests follow different logic when a browser uses a proxy. Instead of starting the SSL/TLS handshake right away, the browser first sends a CONNECT request asking the proxy to open a connection to the target server. Once the proxy confirms with a success response, the browser performs the SSL/TLS handshake through that connection and continues with the request.
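The CONNECT step can be sketched roughly as follows. The names `parse_connect` and `CONNECT_OK` are hypothetical, and the sketch only covers reading the target from the request line; a real proxy would also need to open the upstream connection and then relay bytes in both directions.

```python
def parse_connect(line):
    """Parse b'CONNECT foo.com:443 HTTP/1.1' into (host, port), or None."""
    parts = line.strip().split(b" ")
    if len(parts) != 3 or parts[0].upper() != b"CONNECT":
        return None
    # Split on the last colon so the port is separated from the host.
    host, sep, port = parts[1].rpartition(b":")
    if not sep or not port.isdigit():
        return None
    return host.decode("ascii"), int(port)


# Reply a proxy conventionally sends once the upstream connection is open;
# after this, the browser starts the TLS handshake through the tunnel.
CONNECT_OK = b"HTTP/1.1 200 Connection established\r\n\r\n"
```

After sending `CONNECT_OK`, the proxy behaves much like the TCP proxy in the post: it forwards bytes without interpreting them.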
In conclusion, if you want a web proxy then you’re going to have to use a different code snippet. If you want a TCP proxy then this is the desired solution.
I have tried this in Python 3.6 and in Python 2, but it does not work; it just keeps loading. Has anyone solved this problem?
self.client in line 25 doesn’t seem to be working, but if it did, wouldn’t it be sending the data to the client again in line 38? It seems like line 37 (self.factory.server.client = self) is not adding the client instance to the server instance, so self.client in line 25 is always None, as initially set on line 14. Am I missing something?
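One way to check the flow asked about here is a small stand-alone simulation that does not need Twisted. The classes `FakeTransport`, `ServerSide`, and `ClientSide` below are hypothetical stand-ins that mirror the structure of the two protocol classes, with fake transports that simply record what is written. In this sketch, `self.client` is the proxy’s outbound connection to the real server, so the write on line 38 sends the buffered request on to the server, not back to the original client.

```python
class FakeTransport:
    """Stand-in for a Twisted transport; records what is written."""
    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


class ServerSide:
    """Mirrors ServerProtocol: talks to the original client."""
    def __init__(self, transport):
        self.transport = transport  # connection to the browser/client
        self.buffer = b""
        self.client = None

    def dataReceived(self, data):
        if self.client:
            self.client.write(data)
        else:
            self.buffer += data

    def write(self, data):
        self.transport.write(data)


class ClientSide:
    """Mirrors ClientProtocol: talks to the real server."""
    def __init__(self, transport, server):
        self.transport = transport  # connection to the real server
        self.server = server

    def connectionMade(self):
        self.server.client = self
        self.write(self.server.buffer)
        self.server.buffer = b""

    def dataReceived(self, data):
        self.server.write(data)

    def write(self, data):
        if data:
            self.transport.write(data)


to_client, to_server = FakeTransport(), FakeTransport()
server_side = ServerSide(to_client)
server_side.dataReceived(b"early ")   # arrives before the outbound connection
client_side = ClientSide(to_server, server_side)
client_side.connectionMade()          # flushes the buffer to the server
server_side.dataReceived(b"late")     # now forwarded directly
client_side.dataReceived(b"reply")    # the server's answer goes to the client
```

After this runs, `to_server.sent` holds the two request chunks and `to_client.sent` holds only the reply, which suggests the assignment on line 37 does take effect once the outbound connection is made.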