We talk about programming like it is about writing code, but the code ends up being less important than the architecture, and the architecture ends up being less important than social issues.
The Why LSP post discusses the “social issues” solved by LSP. LSP (as a part of the overarching Microsoft strategy) is brilliant, because it moved the world to a new equilibrium where not having basic IDE support is frowned upon. This post instead discusses architectural aspects of LSP, which I personally find not as brilliant (especially given that the Dart Analysis Protocol predates LSP and is technically superior in some aspects). Perhaps it could be useful for someone designing other LSP-shaped protocols! Note that it’s been a couple of years since I was actively involved in LSP; probably the grass is greener these days!
Let’s get to the list of properties, good and bad, in no particular order.
And let’s start with an aspect of the architecture which is genius, and which, I think, is responsible for a big share of LSP’s success on the technical side. If you build a tool for working with multiple programming languages, one of the biggest questions is how to find common ground among different, but ultimately similar, languages. A first attempt is to uncover essential commonality: after all, all languages have files, variables, functions, classes, right? This is… maybe not necessarily a dead end, but definitely a thorny and treacherous path — languages are different, each language is weird in at least some of its aspects, and common ground risks leveling away meaningful distinctions.
So, what does LSP do here? It just doesn’t provide a semantic model of the code base. Instead, it is focused squarely on the presentation. No matter how different each programming language is, they all, in the end, use the same completion widget. So LSP is formulated in terms of what’s shown in the completion widget, not in terms of the underlying semantic language entities. That means that each language has an internal semantic model which is full fidelity for this particular language, and uses it to provide the best completion experience possible for a given completion widget. This is how rust-analyzer is structured internally as well:
The compiler layer deals with the messy language analysis tasks: it derives more structured information (types) from less structured information (source text), explicitly tracking analysis layers and phases.
The HIR (high-level intermediate representation) is a façade around the compiler, which provides a rich graph-based object model of code which looks as if all derived information, like types, is pre-computed.
The IDE layer uses HIR to compute things like completions, and presents them as Rust-specific, but semantics-less, POD structures to be shown to the user in the GUI more or less as-is.
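A minimal sketch of what this layering implies, with made-up names rather than rust-analyzer’s actual types: the IDE layer consumes a semantic, Rust-specific model and produces a flat, presentation-ready item.

```rust
// Illustrative only: the names and shapes are not rust-analyzer's real API.

/// Semantic (HIR-like) view of a function, specific to the language.
struct Function {
    name: String,
    params: Vec<String>,
    ret_type: String,
}

/// Presentation-level POD item: exactly what the completion widget shows.
struct CompletionItem {
    label: String,       // text in the completion list
    detail: String,      // extra text next to the label
    insert_text: String, // snippet inserted on accept ($0 is the cursor position)
}

fn completion_for_fn(f: &Function) -> CompletionItem {
    CompletionItem {
        label: f.name.clone(),
        detail: format!("fn {}({}) -> {}", f.name, f.params.join(", "), f.ret_type),
        insert_text: format!("{}($0)", f.name),
    }
}

fn main() {
    let frobnicate = Function {
        name: "frobnicate".to_string(),
        params: vec!["x: i32".to_string()],
        ret_type: "i32".to_string(),
    };
    let item = completion_for_fn(&frobnicate);
    println!("{}: {}", item.label, item.detail);
}
```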
One consequence of this architecture is that LSP requests map to editor widgets, and not to the underlying language concepts, even when several different widgets are powered by the same underlying data. For example, LSP has separate requests for:
hierarchical outline of a file displayed in the side bar,
“breadcrumbs” shown in the header,
syntax-aware selection ranges,
code folding.
Although all four features are just different views into an AST, there’s no “get AST” request in LSP. Different requests allow fine-tuning the presentation for different use cases, and the details do differ! Semantic selection might contain some sub-syntax ranges inside string literals and comments, breadcrumbs need to include things like conditionals of if expressions, while the outline might want to get rid of less important nodes. An attentive reader will notice that breadcrumbs and the outline actually use the same LSP request. Even LSP doesn’t follow the LSP philosophy fully!
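To illustrate “different views into the same tree” (not real implementations, just the shape of the idea; the LSP method names in the comments are real, the rest is made up):

```rust
// Sketch: one syntax tree, several presentation-level views of it.
// The node kinds and helpers are illustrative, not a real parser.

enum NodeKind {
    Function,
    Struct,
    IfCondition,
    Comment,
}

struct Node {
    kind: NodeKind,
    start_line: u32,
    end_line: u32,
}

/// Outline (roughly "textDocument/documentSymbol"): declarations only.
fn outline(nodes: &[Node]) -> Vec<&Node> {
    nodes
        .iter()
        .filter(|n| matches!(n.kind, NodeKind::Function | NodeKind::Struct))
        .collect()
}

/// Folding (roughly "textDocument/foldingRange"): any multi-line node,
/// including comments, which the outline does not care about.
fn folding_ranges(nodes: &[Node]) -> Vec<(u32, u32)> {
    nodes
        .iter()
        .filter(|n| n.end_line > n.start_line)
        .map(|n| (n.start_line, n.end_line))
        .collect()
}
```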
After a big thing that LSP did right, let’s look at a small thing that it got wrong. Let’s look at how information is transmitted over the wire.
JSON is actually OK! Many people complain that JSON is slow, but that’s not actually the case generally. There are some edge cases where particular client libraries can be slow, as was the case at least at some point with Swift and Emacs, but JSON is definitely fast enough for Rust, Java, and JavaScript. Of course, something substantially better than JSON is possible in theory.
I think ideally we need “WebAssembly for IPC”, a format that:
has dual text and binary encoding,
is stupidly simple,
is thoroughly, readably, and precisely specified,
and, in general, is principled and a joy to use.
There’s no such format yet, so JSON it is. Good enough.
HTTP framing is not OK. On the wire, the messages are framed like this:
Content-Length: 92 \r\n\r\nActual message
That is:
case-insensitive “content-length” header,
followed by the length of the following message, formatted as a decimal number in ASCII,
followed by a double \r\n,
followed by the actual message.
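A minimal sketch of reading one such framed message (simplified error handling; not taken from any particular implementation):

```rust
use std::io::{self, BufRead, Read};

/// Read one "Content-Length"-framed message; a minimal sketch.
fn read_message(input: &mut impl BufRead) -> io::Result<Vec<u8>> {
    let mut content_length: Option<usize> = None;
    loop {
        let mut line = String::new();
        input.read_line(&mut line)?;
        let line = line.trim_end(); // strip the trailing \r\n
        if line.is_empty() {
            break; // the blank line separates headers from the body
        }
        // The header name is case-insensitive; the value is a decimal byte count.
        let lower = line.to_ascii_lowercase();
        if let Some(value) = lower.strip_prefix("content-length:") {
            let n = value.trim().parse::<usize>()
                .map_err(|_| io::Error::new(io::ErrorKind::InvalidData, "bad content-length"))?;
            content_length = Some(n);
        }
    }
    let len = content_length
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "missing content-length"))?;
    let mut body = vec![0u8; len];
    input.read_exact(&mut body)?;
    Ok(body)
}
```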
This resembles HTTP, but is not actual HTTP, so you need to write a bit of custom code (like the sketch above) to deal with the framing. That’s not hard, but the HTTP-plus-JSON-RPC combination has more warts than just the framing:
Prone to complexity amplification, invites a jsonrpc framework with all the latest patterns.
"jsonrpc": "2.0" is meaningless noise which you have to look at during debugging.
Error codes like -32601 (ah, that comes from xml-rpc!).
Includes notifications. Notifications are a big anti-pattern in RPC, for a somewhat subtle reason. More on this later.
What to do instead? Do what Dart does. Some excerpts from the specification:
Messages are delineated by newlines. This means, in particular, that the JSON encoding process must not introduce newlines within a message. Note however that newlines are used in this document for readability.
To ease interoperability with Lisp-based clients (which may not be able to easily distinguish between empty lists, empty maps, and null), client-to-server communication is allowed to replace any instance of “{}” or “[]” with null. The server will always properly represent empty lists as “[]” and empty maps as “{}”.
Clients can make a request of the server and the server will provide a response for each request that it receives. While many of the requests that can be made by a client are informational in nature, we have chosen to always return a response so that clients can know whether the request was received and was correct.
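A sketch of what the newline-delimited framing from the first excerpt boils down to (names are mine):

```rust
use std::io::{self, BufRead, Write};

/// One JSON message per line; the encoder must not emit literal newlines.
fn write_message(out: &mut impl Write, json: &str) -> io::Result<()> {
    debug_assert!(!json.contains('\n'), "encoder must not introduce newlines");
    out.write_all(json.as_bytes())?;
    out.write_all(b"\n")
}

fn read_message(input: &mut impl BufRead) -> io::Result<String> {
    let mut line = String::new();
    input.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}
```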
LSP uses (line, column) pairs for coordinates. The neat thing here is that this solves a significant chunk of \n vs \r\n problems — the client and server may represent line endings differently, but this doesn’t matter, because the coordinates are the same.
Focus on the presentation provides another motivation, because location information received by the client can be directly presented to the user, without the need to parse the underlying file. I have mixed feelings about this.
The problem: the column is counted in UTF-16 code units. This is, like, “no”. For many reasons, but in particular, UTF-16 is definitely the wrong number to show to the user as a “column”.
There’s no entirely obvious answer as to what should be used instead. My personal favorite would be counting UTF-8 code units (so, just bytes). You need some coordinate space. Any reasonable coordinate space won’t be useful for presentation, so you might as well use the space that matches the underlying UTF-8 encoding, so that accessing substrings is O(1).
Using Unicode codepoints would perhaps be the most agreeable solution. Codepoints are useless — you’ll need to convert to grapheme clusters for presentation, and to UTF-8 code units to do anything with the string. Still, codepoints are a common denominator: they are more often correct if incorrectly used for presentation, and they have the nice property that any index less than the length is valid, irrespective of the actual string.
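A self-contained example of how much the “column” of the same cursor position differs depending on the unit:

```rust
fn main() {
    // Cursor placed right after the crab emoji on this line:
    let line = "let crab = \"🦀\";";
    let prefix = "let crab = \"🦀"; // everything before the cursor

    let utf8_bytes = prefix.len();                   // UTF-8 code units (bytes)
    let utf16_units = prefix.encode_utf16().count(); // what LSP counts
    let codepoints = prefix.chars().count();         // Unicode scalar values

    println!("{line}");
    println!("utf-8: {utf8_bytes}, utf-16: {utf16_units}, codepoints: {codepoints}");
    // 🦀 is 4 bytes in UTF-8, 2 code units in UTF-16, and 1 codepoint,
    // so the three "columns" are 16, 14, and 13 respectively.
}
```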
As mentioned above, one drawback of one-way notifications from jsonrpc is that they don’t allow signaling errors. But there’s a more subtle problem here: because you don’t receive a response to a notification, it might be hard to order it relative to other events. The Dart protocol is pretty strict about the ordering of events:
There is no guarantee concerning the order in which responses will be returned, but there is a guarantee that the server will process requests in the order in which they are sent, as long as the transport mechanism also makes this guarantee.
This guarantee ensures that the client and the server mutually understand each other’s state. For every request, the client knows which file modifications happened before it, and which came afterwards.
In LSP, when the client wants to modify the state of a file on the server, it sends a notification. LSP also supports server-initiated edits. Now, if the client sends a didChangeTextDocument notification, and then receives a workspace/applyEdit request from the server, there’s no way for the client to know whether the edit takes the latest change into account or not. Were didChangeTextDocument a request instead, the client could have looked at the relative order of the corresponding response and workspace/applyEdit.
LSP papers over this fundamental loss of causality by including numeric versions of the documents with every edit, but this is a best-effort solution. Edits might be invalidated by changes to unrelated documents. For example, for a rename refactor, if a new usage was introduced in a new file after the refactor was computed, the version numbers of the changed files would wrongly tell you that the edit is still correct, even though it misses this new usage.
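The client-side check then looks roughly like this (a sketch; the real LSP types carry more information):

```rust
/// Sketch of the best-effort, client-side version check for a server-initiated edit.
/// It catches edits to files that changed since the edit was computed, but not edits
/// invalidated by changes to *other* files.
struct VersionedEdit {
    uri: String,
    version: i32, // document version the edit was computed against
    // ...the actual text changes would go here
}

fn can_apply(edit: &VersionedEdit, current_version_of: impl Fn(&str) -> i32) -> bool {
    current_version_of(&edit.uri) == edit.version
}
```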
Practically, this is a small problem — it works most of the time (I think I have seen zero actual bugs caused by causality loss), and even the proper solution can’t order events originating from the client relative to events originating from the file system. But the fix is also very simple — just don’t voluntarily lose causality links!
And this touches what I think is the biggest architectural issue with LSP. LSP is an RPC protocol — it is formed by “edge triggered” requests that make something happen on the other side. But this is not how most IDE features work. What is actually needed is “level triggered” state synchronization. The client and the server need to agree on what something is; deciding the course of action is secondary. It is “to be or not to be” rather than “what is to be done”.
At the bottom is synchronization of text documents — the server and the client need to agree on which files there are, and what their content is.
Above that is synchronization of derived data. For example, there’s a set of errors in the project. This set changes when the underlying text files change. Errors change with some lag, as it takes time to compute them (and sometimes files change faster than the errors can be re-computed).
Things like file outline, syntax highlighting, cross-reference information, etc., all follow the same pattern.
Crucially, predicting which changes to the source invalidate which derived data requires language-specific knowledge. Changing the text of foo.rs might affect syntax highlighting in bar.rs (as syntax highlighting is affected by types).
In LSP, highlighting and such are requests. This means that either the client is incorrect and shows stale highlighting results, or it conservatively re-queries all highlighting results after every change, wasting the CPU, and still sometimes showing stale results when an update happens outside of the client (e.g., when cargo finishes downloading external crates).
The Dart model is more flexible, performant, and elegant. Instead of highlighting being a request, it is a subscription. The client subscribes to syntax highlighting of particular files, and the server notifies the client whenever highlights for the selected files change. That is, two pieces of state are synchronized between the client and the server:
The set of files the client is subscribed to
The actual state of syntax highlighting for these files.
The former is synchronized by sending the whole “current set” of files in a request, whenever the set changes. The latter is synchronized by sending incremental updates.
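In message terms, the two pieces of state could look something like this (illustrative shapes, not the exact Dart protocol types):

```rust
use std::collections::{HashMap, HashSet};

/// Client -> server: the *whole* current subscription set,
/// re-sent whenever it changes.
struct SetSubscriptions {
    /// e.g. "highlights" -> the files to report highlights for
    subscriptions: HashMap<String, HashSet<String>>,
}

/// Server -> client: an incremental update for one subscribed file.
struct HighlightsNotification {
    file: String,
    regions: Vec<HighlightRegion>,
}

struct HighlightRegion {
    offset: usize,
    length: usize,
    kind: String, // e.g. "keyword", "type", "comment"
}
```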
Subscriptions are granular both in terms of the file set, as well as in terms of features. The client might subscribe for errors in the whole project, and for highlights in the currently open documents only.
Subscriptions are implemented in terms of RPC, but they are an overarching organizational pattern followed by the majority of the requests. LSP doesn’t have an equivalent, and has real bugs with outdated information shown to the user.
I don’t think Dart goes as far as possible here. JetBrains Rider, if I understand correctly, does something smarter:
I think the idea behind the Rider protocol is that you directly define the state you want to synchronize between the client and the server as state. The protocol then manages “magic” synchronization of the state by sending minimal diffs.
Consider a “change signature” refactoring: the user selects a function declaration, then rearranges parameters in some way (reorders, removes, adds, renames, changes types, whatever), and then the IDE fixes all call-sites.
The thing that makes this refactor complex is that it is interactive — it’s not an atomic request “rename foo to bar”, it’s a dialog between the IDE and the user. There are many parameters that the user tweaks based on the analysis of the original code and the already specified aspects of the refactoring.
LSP doesn’t support these workflows. Dart somewhat supports them, though each refactoring gets to use custom messages (that is, there’s a quite good overall protocol for multistep refactorings, but each refactoring essentially sends “any” over the wire, and the IDE on the other side hard-codes specific GUIs for specific refactorings). This per-refactoring work is not nice, but it is much better than not having these complex refactorings at all.
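Very roughly, the shape of such a dialog (illustrative messages; the actual Dart requests differ per refactoring):

```rust
/// Client -> server: "the cursor is here, give me feedback for a change-signature
/// refactoring", re-sent every time the user tweaks something in the dialog.
struct RefactoringRequest {
    file: String,
    offset: usize,
    /// The user's choices so far (reordered, renamed, retyped parameters);
    /// empty on the first round-trip.
    options: Vec<Parameter>,
}

struct Parameter {
    name: String,
    ty: String,
}

/// Server -> client: the current signature, problems with the chosen options,
/// and, once everything is valid, the edits to apply.
struct RefactoringFeedback {
    parameters: Vec<Parameter>,
    problems: Vec<String>,
    change: Option<SourceChange>,
}

struct SourceChange {
    edits_per_file: Vec<(String, Vec<TextEdit>)>,
}

struct TextEdit {
    offset: usize,
    length: usize,
    replacement: String,
}
```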
A small one to conclude. A significant chunk of conceptual LSP complexity comes from support for dynamic registration of capabilities. I don’t understand why that feature is there; rust-analyzer uses dynamic registration only for specifying which files should be watched, and that would be much simpler if it used a plain request (or a subscription mechanism).