It's common to define a system's behaviour by its protocol. In practice, though, a protocol mixes up two distinct aspects of communication, namely:
- Encoding of messages
- Semantics and behavior (request/response, signals, state transitions of the communicating parties, ...)
Frequently (not always), these two very distinct aspects are mixed up without need. So we are forced to run the whole internet in "debug mode", as 99% of webservice and webapp communication is done using textual protocols.
The overhead in CPU consumption compared to a well-defined binary encoding is a factor of ~3-5 (JSON) up to >10-20 (XML). The unnecessary waste of bandwidth adds to that greatly (yes, you can zip, but this in turn will waste even more CPU).
I haven't calculated the numbers, but this is environmental pollution on a big scale. Unnecessary CPU consumption to this extent wastes a lot of energy (global warming, anyone?).
The solution is easy:
- Standardize on some simple encodings (pure binary, self-describing binary ("binary JSON"), textual)
- Define the behavioral part of a protocol (excluding encoding)
- Use textual encoding during development, use binary in production.
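A minimal sketch of what this separation could look like, assuming a toy PING/PONG protocol. All names here (`Codec`, `JsonCodec`, `BinaryCodec`, `handle`) and the message layout are made up for illustration and are not taken from any existing library. The behavioral part only talks to a codec interface, so the encoding can be swapped without touching it:

```python
import json
import struct
from typing import Protocol

PING, PONG = 1, 2  # hypothetical message types of the toy protocol


class Codec(Protocol):
    """Encoding only: turns a (msg_type, seq) pair into bytes and back."""

    def encode(self, msg_type: int, seq: int) -> bytes: ...

    def decode(self, data: bytes) -> tuple[int, int]: ...


class JsonCodec:
    """Textual encoding, convenient during development and debugging."""

    def encode(self, msg_type: int, seq: int) -> bytes:
        return json.dumps({"type": msg_type, "seq": seq}).encode("utf-8")

    def decode(self, data: bytes) -> tuple[int, int]:
        obj = json.loads(data.decode("utf-8"))
        return obj["type"], obj["seq"]


class BinaryCodec:
    """Fixed-layout binary encoding: 1 byte type + 8 byte big-endian sequence."""

    _LAYOUT = struct.Struct(">Bq")

    def encode(self, msg_type: int, seq: int) -> bytes:
        return self._LAYOUT.pack(msg_type, seq)

    def decode(self, data: bytes) -> tuple[int, int]:
        msg_type, seq = self._LAYOUT.unpack(data)
        return msg_type, seq


def handle(request: bytes, codec: Codec) -> bytes:
    """Behavior only: answer every PING with a PONG carrying the same sequence."""
    msg_type, seq = codec.decode(request)
    if msg_type != PING:
        raise ValueError(f"unexpected message type {msg_type}")
    return codec.encode(PONG, seq)


if __name__ == "__main__":
    # The same behavior runs unchanged on top of either encoding.
    for codec in (JsonCodec(), BinaryCodec()):
        wire = codec.encode(PING, 42)
        print(type(codec).__name__, len(wire), codec.decode(handle(wire, codec)))
```

In this sketch, switching from "development" to "production" encoding is a matter of passing a different codec object; `handle` never sees the bytes.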
There go all our pretty logs and quick curl tests and such. Text protocols allow a quick look into what's going on; binary protocols don't. As far as I last checked, services don't self-heal: a human has to jump in. Which might be a reflection of other things going wrong, but that is how it is :)
I think that's a narrow view. In fact, text is also binary encoded. It's because of standardization on how to read e.g. UTF-8 that tools "know" how to decode it. One could standardize on a self-describing binary format and provide a similar tool chain of "pretty printers" without problems (think of it like a "special" charset encoding). However, in the first place protocol behaviour must not be mixed up with message encoding.
""" and provide a similar tool chain of "pretty printers" without problems (think of it like a "special" charset encoding)"""
This is an even more narrow view. Text formats work with what we already have, here and now. What you propose is replacing them with something that's not out there, is not interoperable, and that most tools don't know about.
First get the tooling you say is easy to provide "without problems", then ask for the change to binary formats.
I'd like to remind you the internet should serve users, not developers :). Anyway, HTTP/2 will solve the issues. It took only like 15 years and will take like 10 years until it's widely adopted :)
I fully agree with your desire to separate protocol handling from encoding (format). Doing this is an under-appreciated way both to decouple handling and to allow for more efficient handling, by choosing the most suitable encoding for the context. Sometimes this should be a well-known textual format (for interoperability, flexibility and diagnostics); other times a more compact binary representation makes more sense. Protocols should not unnecessarily limit the choice.
I am not sure I share your concern about the performance aspects: without disputing specific encoding numbers, I just question their significance in the big picture. In the case of HTTP, for example, decoding of header values is such a minuscule part of handling that even if it were changed to binary encoding, the benefits in many cases would not be significant at all. The payload remains as-is, and the real complexity in HTTP comes from managing connections, state, liveness checks and other protocol-level aspects.
Regarding the HTTP header, I left out context. E.g. for a low-latency long-polling HTTP server with many clients, the payload consists of a single sequence number (of the last received message), so header parsing indeed adds significant overhead when there are no pending messages (in this special case, processing is just session lookup + sequence number comparison). Another example is DoS protection: it eats significant processing time to weed out DoS requests from application requests.
Of course, in general you are right that "header parsing" is not a good example regarding the big picture...
Any in-between component (proxy, reverse proxy) needs to decode (parse) and encode headers; it's probably of relevance especially for the tiny payloads typical for e.g. webservice remote calls.
My best citation on the mec.symp.group ... The next time I have to write another JSON parser/serializer (brrrr Jackson... :() I'll pray that someone listens to you..
:) thanks Rüdiger
You are doing Jackson wrong: it helps dealing with JSON and afaik is pretty fast, it has not invented JSON ..
The Java binding and the streaming API are not free... in any way. Although Jackson is "pretty" fast, it does not deliver any zero-copy (AFAIK) ability in the serialisation stage... and produces TONS of garbage. Comparing it with a serious serializer (hand-made?) that is really GC-free is simple. But that's another story... so far, if I have to send a long, why on earth do I have to send more than 8 bytes? ^^ P.S. Jackson is not a bad tool per se, and I agree with you that it is a great help if you don't want to deal directly with JSON...
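The "why more than 8 bytes for a long" point can be checked with a few lines. This is only a size comparison for a single bare value, using Python's standard `struct` and `json` modules; the helper name `encoded_sizes` is made up for this sketch, and it says nothing about CPU cost or garbage produced:

```python
import json
import struct


def encoded_sizes(value: int) -> tuple[int, int]:
    """Return (binary_size, text_size) in bytes for one signed 64-bit value."""
    binary = struct.pack(">q", value)         # fixed 8-byte big-endian long
    text = json.dumps(value).encode("ascii")  # decimal digits as JSON text
    # Both encodings must round-trip to the same value.
    assert struct.unpack(">q", binary)[0] == value
    assert int(text) == value
    return len(binary), len(text)


if __name__ == "__main__":
    for value in (7, 1_000_000, 2**62 + 12345):
        print(value, encoded_sizes(value))
```

Note that for small numbers the text form is actually shorter than a fixed 8-byte long, which is one reason binary formats often use variable-length integers. In a real JSON message the field name and punctuation come on top of the digits.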
Hm .. do you have some kind of reusable open-source variant of a zero-copy, low-garbage JSON parser? I'd be interested in something like that, as Jackson duplicates some stuff I already do in the serialization layer, such that the JSON codec is well below what would be possible.
Hi Rüdiger,
I only have custom home-rolled libs that I've developed for my own needs... undocumented too :P
But TextWire of https://github.com/OpenHFT/Chronicle-Bytes looks very promising... if you wait a few weeks: I've contacted Peter Lawrey about contributing to this repo, and maybe there will be a little more docs and examples for it :)
Wrong repo, sorry :P
I really need a coffee this morning... https://github.com/OpenHFT/Chronicle-Wire
Interesting bottom-up approach (many "planned" features though ;) ). In contrast, fast-serialization goes top-down, providing different wire formats to represent serialized object graphs (binary, JSON). Maybe I could add a Chronicle Wire codec to fst once C-wire is in a more mature state.
MsgPack is efficient, schema-less, and has a 1:1 mapping with JSON (unlike BSON despite its name).
I am aware of msgpack. I even tried to build a codec for fast-serialization based on msgpack but somehow lost track. It might be a better alternative to JSON for actor <=> JavaScript remoting. Are there any Java benchmarks?
I used to push Sun's XDR [1] for this very reason. Client-server apps with binary protocols, too. That's because the web was more inefficient, complex, and insecure in about every area. It was crap. Its main advantages were the network effect, instant distribution, and widespread compatibility of HTML/JS. We could've just fixed the problems in our native C/S and P2P models but adopted the web instead.
Two other good alternatives from long ago were Juice [2] for applets and Globe [3] for WAN architecture.
[1] https://en.wikipedia.org/wiki/External_Data_Representation
[2] http://www.modulaware.com/mdlt69.htm
[3] http://www.cs.vu.nl/~philip/globe/
Nick P
Wow, the Globe project looks interesting. Annoyed by the lack of abstraction and poor performance of existing distributed application products, I am active in a somewhat similar direction: http://ruedigermoeller.github.io/kontraktor/ . Well, it's actually mostly JVM-bound, erm, and not that global ;).
We have used binary protocols from day 1 at Aerospike - that's one of the "small" reasons we massively reduce server counts compared to other databases. You might be surprised how hard it is to fight against an entire industry - hardware companies don't like getting cut out, cloud companies don't like getting cut out, open source companies that charge by node count don't like getting cut out. All of these guys pay lip service to efficiency, then bury technology that is actually more efficient. Flash (SSD) storage is similar - you end up paying a lot less for most use cases compared to DRAM and compared to rotational disks, but only a handful of database companies have optimized for Flash.
Agree. In addition, there is a widespread lack of knowledge of what is technically possible. People's gut has adapted to crappy tech. Premature scale-out dominates.
It's better to use a binary protocol for service-to-service communication: much more efficient than any text protocol, and it takes less bandwidth too.