Overview

August 21, 2024

The F# Language Service (FSharp.Editor, using FSharp.Compiler.Service) is designed to support tooling in Visual Studio and other IDEs. This document gives an overview of the features supported and notes on their technical characteristics.

Kinds of data processed and served in F# tooling

The following tables are split into two categories: syntactic and semantic. They contain common kinds of information requested, the kind of data that is involved, and roughly how expensive the operation is in terms of expected memory allocation and CPU processing.

IDE actions based on syntax

| Action | Data inspected | Data returned | Expected CPU/Allocations (S/M/L/XL) |
|---|---|---|---|
| Syntactic Classification | Current doc's source text | Text span and classification type for each token in the document | S |
| Breakpoint Resolution | Current doc's syntax tree | Text span representing where breakpoints were resolved | S |
| Debugging data tip info | Current doc's source text | Text span representing the token being inspected | S |
| Brace pair matching | Current doc's source text | Text spans representing brace pairs that match in the input document | S |
| "Smart" indentation | Current doc's source text | Indentation location in a document | S |
| Code fixes operating only on syntax | Current doc's source text | Small text change for document | S |
| XML doc template generation | Current doc's syntax tree | Small (usually) text change for document | S |
| Brace pair completion | Current doc's source text | Additional brace pair inserted into source text | S |
| Source document navigation | Current doc's syntax tree | "Navigation Items" with optional child navigation items containing ranges in source code | S |
| Code outlining | Current doc's source text | Text spans representing blocks of F# code that are collapsable as a group | S-M |
| Editor formatting | Current doc's source text | New source text for the document | S-L |
| Syntax diagnostics | Current doc's source text | List of diagnostic data including the span of text corresponding to the diagnostic | S |
| Global construct search and navigation | All syntax trees for all projects | All items that match a user's search pattern with spans of text that represent where a given item is located | S-L |

You likely noticed that nearly all of the syntactic operations are marked S. Aside from extreme cases, such as files with 50k lines or more, syntax-only operations typically finish very quickly. In addition to being computationally inexpensive, they also run asynchronously and are free-threaded.
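To give a feel for why a syntax-only action is cheap, here is a minimal sketch of brace pair matching written in Python for brevity. It is not the FSharp.Editor implementation: `match_braces` and `PAIRS` are hypothetical names, and the sketch scans raw characters, so it ignores strings and comments that a real, token-based implementation would have to skip.

```python
from typing import List, Tuple

# Hypothetical table of brace kinds; purely illustrative.
PAIRS = {"(": ")", "[": "]", "{": "}"}


def match_braces(source: str) -> List[Tuple[int, int]]:
    """Return (open_offset, close_offset) for every matched brace pair."""
    closers = set(PAIRS.values())
    stack = []  # (offset, character) of braces that are still open
    spans = []
    for offset, ch in enumerate(source):
        if ch in PAIRS:
            stack.append((offset, ch))
        elif ch in closers:
            # Pair only with the innermost open brace, and only if the kinds agree.
            if stack and PAIRS[stack[-1][1]] == ch:
                open_offset, _ = stack.pop()
                spans.append((open_offset, offset))
    return sorted(spans)
```

The input is only the current document's source text and the output is a list of text spans, which matches the shape of the "Brace pair matching" row above. The work is a single linear pass with no type information involved.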

Editor formatting is a bit of an exception. Most IDEs offer a command to format an entire document, and although they also offer commands to format a given text selection, users typically choose to format the whole document. This means an entire document has to be inspected and potentially rewritten based on often complex rules. In practice this isn't bad when working with a document that has already been formatted, but it can be expensive for larger documents with strange stylistic choices.

Most of the syntax operations require an entire document's source text or parse tree. It stands to reason that this could be improved by operating on a diff of a parse tree instead of the whole thing. That would likely be very complex to implement, though, since none of the F# compiler infrastructure works this way today.
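As a rough, text-level approximation of that idea, the following Python sketch finds the smallest range of lines that differs between two versions of a document. The function name `changed_line_range` is hypothetical, and this is only the first step an incremental approach would need. A real parse-tree diff would also have to handle edits whose syntactic effect reaches beyond the edited lines, which is a particular concern in an indentation-sensitive language like F#.

```python
from typing import List, Tuple


def changed_line_range(old: List[str], new: List[str]) -> Tuple[int, int, int]:
    """Return (start, old_end, new_end).

    Lines old[start:old_end] were replaced by new[start:new_end];
    every line outside those ranges is identical in both versions.
    """
    limit = min(len(old), len(new))

    # Length of the common prefix.
    start = 0
    while start < limit and old[start] == new[start]:
        start += 1

    # Length of the common suffix, never overlapping the prefix.
    suffix = 0
    while (suffix < limit - start
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1

    return start, len(old) - suffix, len(new) - suffix
```

If only one line in the middle of a file changes, the returned range covers just that line, so a hypothetical incremental action could in principle leave the rest of the document alone.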

IDE actions based on semantics

| Action | Data inspected | Data returned | Expected CPU/Allocations (S/M/L/XL) |
|---|---|---|---|
| Most code fixes | Current document's typecheck data | Set (1 or more) of suggested text replacements | S-M |
| Semantic classification | Current document's typecheck data | Spans of text with semantic classification type for all constructs in a document | S-L |
| Code generation / refactorings | Current document's typecheck data and/or current resolved symbol/symbols | Text replacement(s) | S-L |
| Code completion | Current document's typecheck data and currently-resolved symbol user is typing at | List of all symbols in scope that are "completable" based on where completion is invoked | S-L |
| Editor tooltips | Current document's typecheck data and resolved symbol where user invoked a tooltip | F# tooltip data based on inspecting a type and its declarations, then pretty-printing them | S-XL |
| Diagnostics based on F# semantics | Current document's typecheck data | Diagnostic info for each symbol with diagnostics to show, including the range of text associated with the diagnostic | M-XL |
| Symbol highlighting in a document | Current document's typecheck data and currently-resolved symbol where user's caret is located | Ranges of text representing instances of that symbol in the document | S-M |
| Semantic navigation (for example, Go to Definition) | Current document's typecheck data and currently-resolved symbol where the user invoked navigation | Location of a symbol's declaration | S-M |
| Rename | Graph of all projects that use the symbol that rename is triggered on and the typecheck data for each of those projects | List of all uses of all symbols that are to be renamed | S-XL |
| Find all references | Graph of all projects that Find References is triggered on and the typecheck data for each of those projects | List of all uses of all symbols that are found | S-XL |
| Unused value/symbol analysis | Typecheck data for the current document | List of all symbols that aren't a public API and are unused | S-M |
| Unused open analysis | Typecheck data for the current document and all symbol data brought into scope by each open declaration | List of open declarations whose exposed symbols aren't used in the current document | S-L |
| Missing open analysis | Typecheck data for the current document, resolved symbol with an error, and list of available namespaces or modules | List of candidate namespaces or modules that can be opened | S-M |
| Misspelled name suggestion analysis | Typecheck data for the current document and resolved symbol with an error | List of candidates that are in scope and best match the misspelled name based on a string distance algorithm | S-M |
| Name simplification analysis | Typecheck data for the current document and all symbol data brought into scope by each open declaration | List of text changes available for any fully- or partially-qualified symbol that can be simplified | S-XL |

You likely noticed that every cost associated with an action has a range. This is based on two factors:

  1. Whether the semantic data being operated on is cached
  2. How much semantic data must be processed for the action to be completed

Most actions are S if they operate on cached data and the compiler determines that no data needs to be re-computed. The size of their range is influenced largely by the kind of semantic operations each action has to do, such as:

  • Typechecking a single document and processing the resulting data
  • Typechecking a document and its containing project and then processing the resulting data
  • Resolving a single symbol in a document
  • Resolving the definition of a single symbol in a codebase
  • Inspecting all symbols brought into scope by a given open declaration
  • Inspecting all symbols in a document
  • Inspecting all symbols in all documents contained in a graph of projects
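The effect of caching on cost can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `TypecheckCache` is a hypothetical class, the `typecheck` callable stands in for the expensive compiler work, and invalidation is keyed only on a hash of one document's text. The real language service has to account for more, for example changes in other files or projects that a document depends on.

```python
import hashlib
from typing import Callable, Dict, Tuple


class TypecheckCache:
    """Hypothetical cache holding the latest typecheck result per document."""

    def __init__(self, typecheck: Callable[[str], object]) -> None:
        self._typecheck = typecheck  # stand-in for the expensive operation
        self._results: Dict[str, Tuple[str, object]] = {}
        self.misses = 0  # number of times a result had to be recomputed

    def get(self, path: str, source: str) -> object:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._results.get(path)
        if cached is not None and cached[0] == digest:
            # Cache hit: nothing is recomputed, the cheap end of the range.
            return cached[1]
        # Cache miss: pay for a full typecheck, the expensive end of the range.
        self.misses += 1
        result = self._typecheck(source)
        self._results[path] = (digest, result)
        return result
```

An action that calls `get` repeatedly on an unchanged document pays for the typecheck once. Any edit to the text changes the hash and forces a recomputation, which is the difference between the low and high ends of the ranges in the table.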

For example, commands like Find All References and Rename can be cheap if a codebase is small, hence the lower bound being S. But if the symbol in question is used across many documents in a large project graph, they are very expensive because the entire graph must be crawled and all symbols contained in its documents must be inspected.

In contrast, actions like highlighting all symbols in a document aren't terribly expensive even for very large files. That's because the symbols to be inspected are ultimately only in a single document.
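The contrast between the two kinds of action can be sketched in Python as follows. The data model is hypothetical and heavily simplified: each project maps document names to a list of `(symbol, line)` uses, and `dependents` maps a project to the projects that reference it. Neither function reflects the actual FSharp.Compiler.Service API.

```python
from typing import Dict, List, Tuple

# Hypothetical model: document name -> list of (symbol name, line number) uses.
Project = Dict[str, List[Tuple[str, int]]]


def highlight_in_document(uses: List[Tuple[str, int]], symbol: str) -> List[int]:
    """Single-document action: inspect only the uses in one document."""
    return [line for name, line in uses if name == symbol]


def find_all_references(
    projects: Dict[str, Project],
    dependents: Dict[str, List[str]],
    start: str,
    symbol: str,
) -> List[Tuple[str, str, int]]:
    """Graph-wide action: crawl every project reachable from `start`."""
    seen = set()
    pending = [start]
    found = []
    while pending:
        project = pending.pop()
        if project in seen:
            continue
        seen.add(project)
        # Every document in every reachable project has to be inspected.
        for doc, uses in projects[project].items():
            for line in highlight_in_document(uses, symbol):
                found.append((project, doc, line))
        pending.extend(dependents.get(project, []))
    return sorted(found)
```

The cost of `highlight_in_document` is bounded by the size of one document, while `find_all_references` grows with the number of projects and documents reachable in the graph.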