
Monday, December 13, 2010

BufferedStream Performance


Recently I was asked to optimize the transfer side of our remoting, or to write my own remoting system that would be faster. To make it in time I ended up using some open source code, but all of the solutions were actually slower than plain .NET Remoting over TCP (my custom solution was using TCP too).

I had to modify the code myself and add buffering. I used BufferedStream from .NET, which had a significant impact on performance, and my solution was finally faster than the built-in Remoting of the .NET platform :-).

So let's look at a simple demo of how buffering speeds things up. To keep it simple, I will show how to send data over a NetworkStream with and without BufferedStream.

Simple Example

First the Server code:

//Server

            TcpListener server = new TcpListener(IPAddress.Loopback, 8889);
            server.Start();
           
            while (true)
            { 
                TcpClient client = server.AcceptTcpClient();
                client.NoDelay = true;
                NetworkStream stream = client.GetStream();

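                while (true)
                {
                    // the first byte of every message is the payload length
                    int length = stream.ReadByte();
                    if (length < 0) break; // -1: the client closed the connection

                    byte[] data = new byte[length];
                    // note: Read may return fewer bytes than requested; a robust server would loop until all bytes arrive
                    stream.Read(data, 0, data.Length);
                }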
            }

Nothing fancy here. Normally the server gets a message, processes it if needed, and then sends back the results, but to keep it simple this server only receives messages. Buffering isn't really needed here, but it will be added because in a real-life scenario a server never just reads data from the client.

Next up is the Client:

//Client !

            TcpClient client = new TcpClient("localhost", 8889);
            client.NoDelay = true;
            NetworkStream stream = client.GetStream();
            System.Text.UTF8Encoding encoding = new System.Text.UTF8Encoding();
            
            MemoryStream ms = null;


            Stopwatch watch = new Stopwatch();
            watch.Start();
            for (int i = 0; i < 10000; i++)
            {
                byte[] data = encoding.GetBytes("Hello Server");

                ms = new MemoryStream();
                ms.WriteByte((byte)data.Length);
                ms.Write(data, 0, data.Length);

                byte[] d = ms.ToArray();

                stream.Write(d, 0, d.Length);
                //stream.Flush();
            }
            watch.Stop();

            Console.WriteLine(watch.ElapsedMilliseconds);
            Console.ReadKey();

The client sends a simple message using the length-prefix pattern (always send the data length at the start of the message). Normally the client would also read data, and the server would write and flush the stream, but then we wouldn't see the raw performance gain, so let's keep it simple for now. So how could this help in outperforming remoting? I'll explain later.
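The length-prefix pattern used above can be sketched as a pair of helpers. This is only a minimal illustration, not the code from my framework: the `Framing` class and its method names are made up for this example, and it assumes a one-byte prefix, so a payload can be at most 255 bytes. It works on any `Stream`, so a `MemoryStream` is enough to try it out.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, named here only for illustration.
public static class Framing
{
    // Writes one message: a single length byte followed by the payload.
    public static void WriteFrame(Stream stream, byte[] payload)
    {
        if (payload.Length > byte.MaxValue)
            throw new ArgumentException("payload does not fit a one-byte length prefix");

        stream.WriteByte((byte)payload.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one message; returns null when the stream has ended cleanly.
    public static byte[] ReadFrame(Stream stream)
    {
        int length = stream.ReadByte();
        if (length < 0) return null; // ReadByte returns -1 at end of stream

        byte[] payload = new byte[length];
        int offset = 0;
        // Read may deliver fewer bytes than requested, so keep reading
        // until the whole payload has arrived.
        while (offset < length)
        {
            int read = stream.Read(payload, offset, length - offset);
            if (read == 0) throw new EndOfStreamException("stream ended inside a message");
            offset += read;
        }
        return payload;
    }
}
```

The read loop is the part that is easy to forget: on a real socket a single `Read` call is allowed to return only part of the message.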

Now let's add a BufferedStream to both the client and the server.

//Client !
            System.Text.UTF8Encoding encoding = new System.Text.UTF8Encoding();
            BufferedStream bf = new BufferedStream(stream);
            MemoryStream ms = null;
            ...
            ...
            byte[] d = ms.ToArray();

            bf.Write(d, 0, d.Length);
            //bf.Flush(); // buffered data is only sent on Flush (or when the buffer fills up)

//Server !
            NetworkStream stream = client.GetStream();
            BufferedStream bf = new BufferedStream(stream);

            while (true)
            {
                // read the length through bf too; mixing raw and buffered reads on the same stream breaks the framing
                byte[] data = new byte[bf.ReadByte()];
                bf.Read(data, 0, data.Length);
            }

The Results Of Simple

without the buffering: ~650ms
using BufferedStream: ~20ms


So that's over 30 times faster! But in a real-life example the client and the server will both read and write to each other, and then the BufferedStream will need flushing, which will degrade its performance. Besides, because of its overhead it could potentially be a lot slower when dealing with just a few bytes of data (consider message passing).
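To see where the gain in the simple example most likely comes from, one can count how many writes actually reach the underlying stream. The sketch below is an assumption-laden stand-in, not a benchmark: `CountingStream` and `BufferingDemo` are hypothetical names (they are not part of .NET), the sink replaces the real NetworkStream, and the actual time saved on a socket depends on the OS and the network stack.

```csharp
using System;
using System.IO;

// Hypothetical write-only sink that counts Write calls; stands in for NetworkStream.
public sealed class CountingStream : Stream
{
    public int WriteCalls { get; private set; }
    public long BytesWritten { get; private set; }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        WriteCalls++;          // one call here corresponds to one write on the real stream
        BytesWritten += count;
    }
}

public static class BufferingDemo
{
    // Sends `messages` frames of 13 bytes (1 length byte + 12 payload bytes)
    // and returns how many writes reached the sink.
    public static int CountUnderlyingWrites(bool buffered, int messages)
    {
        byte[] frame = new byte[13];
        frame[0] = 12;

        CountingStream sink = new CountingStream();
        Stream target = buffered ? (Stream)new BufferedStream(sink) : sink;

        for (int i = 0; i < messages; i++)
        {
            target.Write(frame, 0, frame.Length);
        }
        target.Flush(); // push out whatever is still sitting in the buffer

        return sink.WriteCalls;
    }
}
```

Calling `BufferingDemo.CountUnderlyingWrites(false, 10000)` makes one underlying write per message, while the buffered variant collapses them into far fewer, larger writes. Once every message has to be flushed before waiting for a reply, that advantage largely disappears.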

So how does that help outperform remoting?

Well, it helps a lot. In a remoting architecture, the client sends a command to the server, and the server performs at least a few writes and reads for that client, so buffering improves performance. Besides, methods often need to return a lot of data, and the client proxy needs to be re-synced.

Complex Example (well sort of ;p)

By contrast, let's look at a more complex and realistic example in which the client and the server both write and read to each other, and then compare the performance. Note: I will only post the buffered code.

//Server !
           while (true)
           {
               byte[] data = new byte[bf.ReadByte()];
               bf.Read(data, 0, data.Length);
               //Console.WriteLine(encoding.GetString(data));

               ms = new MemoryStream();
               ms.WriteByte((byte)(data.Length * 2)); //send something bigger
               ms.Write(data, 0, data.Length);
               ms.Write(data, 0, data.Length);

               byte[] dataToSend = ms.ToArray();

               bf.Write(dataToSend, 0, dataToSend.Length);
               bf.Flush();
           }

//Client !
            for (int i = 0; i < 10000; i++)
            {
                byte[] data = encoding.GetBytes("Hello Server");

                ms = new MemoryStream();
                ms.WriteByte((byte)data.Length);
                ms.Write(data, 0, data.Length);

                byte[] d = ms.ToArray();
                bf.Write(d, 0, d.Length);

                int readSize = bf.ReadByte();
                byte[] dataToRead = new byte[readSize];
                bf.Read(dataToRead, 0, dataToRead.Length);
            }


The Results Of Complex

without the buffering: ~1900ms
using BufferedStream: ~2200ms


Now the benefit is gone; in this run the buffered version is even a bit slower. Things get better once the data sent from the server is bigger, and to further improve the times the server could do multiple writes per flush. Tuning the buffer size can also help a lot. A real-life comparison of my modified remoting framework against .NET Remoting shows a gain of around 30%, and it could be higher if the client and the server did not just exchange remoting objects per request, but used batching, data mixing, etc.
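The "multiple writes" and buffer size ideas could look roughly like the sketch below. It is only an illustration under my own assumptions: `ReplyWriter` is a hypothetical helper, the 8192-byte buffer is an arbitrary example value rather than a measured optimum, and the reply mirrors the "send the data twice" reply from the server loop above, written straight into the BufferedStream without the intermediate MemoryStream.

```csharp
using System;
using System.IO;

// Hypothetical helper class, named here only for illustration.
public static class ReplyWriter
{
    // Wraps a stream with an explicit buffer size via the
    // BufferedStream(Stream, int) constructor. 8192 is just an example value.
    public static BufferedStream Wrap(Stream inner)
    {
        return new BufferedStream(inner, 8192);
    }

    // Writes the length prefix and the payload twice as separate small writes;
    // the buffer collects them and a single Flush sends the whole reply.
    public static void WriteDoubledReply(Stream buffered, byte[] data)
    {
        if (data.Length * 2 > byte.MaxValue)
            throw new ArgumentException("reply does not fit a one-byte length prefix");

        buffered.WriteByte((byte)(data.Length * 2)); // length prefix
        buffered.Write(data, 0, data.Length);        // first copy
        buffered.Write(data, 0, data.Length);        // second copy
        buffered.Flush();                            // one flush per reply
    }
}
```

On the server this would be used as `BufferedStream bf = ReplyWriter.Wrap(stream);` followed by `ReplyWriter.WriteDoubledReply(bf, data);` inside the loop. Whether a larger or smaller buffer is faster has to be measured for the actual message sizes.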

Summing Up


BufferedStream is worth using when performance is critical, and it also has many applications in disk I/O. That is all :-)
