1 | // Copyright 2018 The Go Authors. All rights reserved. |
---|---|
2 | // Use of this source code is governed by a BSD-style |
3 | // license that can be found in the LICENSE file. |
4 | |
5 | /* |
6 | Package analysis defines the interface between a modular static |
7 | analysis and an analysis driver program. |
8 | |
9 | # Background |
10 | |
11 | A static analysis is a function that inspects a package of Go code and |
12 | reports a set of diagnostics (typically mistakes in the code), and |
13 | perhaps produces other results as well, such as suggested refactorings |
14 | or other facts. An analysis that reports mistakes is informally called a |
15 | "checker". For example, the printf checker reports mistakes in |
16 | fmt.Printf format strings. |
17 | |
18 | A "modular" analysis is one that inspects one package at a time but can |
19 | save information from a lower-level package and use it when inspecting a |
20 | higher-level package, analogous to separate compilation in a toolchain. |
21 | The printf checker is modular: when it discovers that a function such as |
22 | log.Fatalf delegates to fmt.Printf, it records this fact, and checks |
23 | calls to that function too, including calls made from another package. |
24 | |
25 | By implementing a common interface, checkers from a variety of sources |
26 | can be easily selected, incorporated, and reused in a wide range of |
27 | driver programs including command-line tools (such as vet), text editors and |
28 | IDEs, build and test systems (such as go build, Bazel, or Buck), test |
29 | frameworks, code review tools, code-base indexers (such as Sourcegraph), |
30 | documentation viewers (such as godoc), batch pipelines for large code |
31 | bases, and so on. |
32 | |
33 | # Analyzer |
34 | |
35 | The primary type in the API is Analyzer. An Analyzer statically |
36 | describes an analysis function: its name, documentation, flags, |
37 | relationship to other analyzers, and of course, its logic. |
38 | |
39 | To define an analysis, a user declares a (logically constant) variable |
40 | of type Analyzer. Here is a typical example from one of the analyzers in |
41 | the go/analysis/passes/ subdirectory: |
42 | |
43 | package unusedresult |
44 | |
45 | var Analyzer = &analysis.Analyzer{ |
46 | Name: "unusedresult", |
47 | Doc: "check for unused results of calls to some functions", |
48 | Run: run, |
49 | ... |
50 | } |
51 | |
52 | func run(pass *analysis.Pass) (interface{}, error) { |
53 | ... |
54 | } |
55 | |
56 | An analysis driver is a program such as vet that runs a set of |
57 | analyses and prints the diagnostics that they report. |
58 | The driver program must import the list of Analyzers it needs. |
59 | Typically each Analyzer resides in a separate package. |
60 | To add a new Analyzer to an existing driver, add another item to the list: |
61 | |
62 | import ( "unusedresult"; "nilness"; "printf" ) |
63 | |
64 | var analyses = []*analysis.Analyzer{ |
65 | unusedresult.Analyzer, |
66 | nilness.Analyzer, |
67 | printf.Analyzer, |
68 | } |
69 | |
70 | A driver may use the name, flags, and documentation to provide on-line |
71 | help that describes the analyses it performs. |
72 | The doc comment contains a brief one-line summary, |
73 | optionally followed by paragraphs of explanation. |
74 | |
75 | The Analyzer type has more fields besides those shown above: |
76 | |
77 | type Analyzer struct { |
78 | Name string |
79 | Doc string |
80 | Flags flag.FlagSet |
81 | Run func(*Pass) (interface{}, error) |
82 | RunDespiteErrors bool |
83 | ResultType reflect.Type |
84 | Requires []*Analyzer |
85 | FactTypes []Fact |
86 | } |
87 | |
88 | The Flags field declares a set of named (global) flag variables that |
89 | control analysis behavior. Unlike in vet, analysis flags are not declared |
90 | directly in the command line FlagSet; it is up to the driver to set the |
91 | flag variables. A driver for a single analysis, a, might expose its flag |
92 | f directly on the command line as -f, whereas a driver for multiple |
93 | analyses might prefix the flag name by the analysis name (-a.f) to avoid |
94 | ambiguity. An IDE might expose the flags through a graphical interface, |
95 | and a batch pipeline might configure them from a config file. |
96 | See the "findcall" analyzer for an example of flags in action. |
97 | |
98 | The RunDespiteErrors flag indicates whether the analysis is equipped to |
99 | handle ill-typed code. If not, the driver will skip the analysis if |
100 | there were parse or type errors. |
101 | The optional ResultType field specifies the type of the result value |
102 | computed by this analysis and made available to other analyses. |
103 | The Requires field specifies a list of analyses upon which |
104 | this one depends and whose results it may access, and it constrains the |
105 | order in which a driver may run analyses. |
106 | The FactTypes field is discussed in the section on Modularity. |
107 | The analysis package provides a Validate function to perform basic |
108 | sanity checks on an Analyzer, such as that its Requires graph is |
109 | acyclic, its fact and result types are unique, and so on. |
110 | |
111 | Finally, the Run field contains a function to be called by the driver to |
112 | execute the analysis on a single package. The driver passes it an |
113 | instance of the Pass type. |
114 | |
115 | # Pass |
116 | |
117 | A Pass describes a single unit of work: the application of a particular |
118 | Analyzer to a particular package of Go code. |
119 | The Pass provides information to the Analyzer's Run function about the |
120 | package being analyzed, and provides operations to the Run function for |
121 | reporting diagnostics and other information back to the driver. |
122 | |
123 | type Pass struct { |
124 | Fset *token.FileSet |
125 | Files []*ast.File |
126 | OtherFiles []string |
127 | IgnoredFiles []string |
128 | Pkg *types.Package |
129 | TypesInfo *types.Info |
130 | ResultOf map[*Analyzer]interface{} |
131 | Report func(Diagnostic) |
132 | ... |
133 | } |
134 | |
135 | The Fset, Files, Pkg, and TypesInfo fields provide the syntax trees, |
136 | type information, and source positions for a single package of Go code. |
137 | |
138 | The OtherFiles field provides the names, but not the contents, of non-Go |
139 | files such as assembly that are part of this package. See the "asmdecl" |
140 | or "buildtag" analyzers for examples of loading non-Go files and reporting |
141 | diagnostics against them. |
142 | |
143 | The IgnoredFiles field provides the names, but not the contents, |
144 | of ignored Go and non-Go source files that are not part of this package |
145 | with the current build configuration but may be part of other build |
146 | configurations. See the "buildtag" analyzer for an example of loading |
147 | and checking IgnoredFiles. |
148 | |
149 | The ResultOf field provides the results computed by the analyzers |
150 | required by this one, as expressed in its Analyzer.Requires field. The |
151 | driver runs the required analyzers first and makes their results |
152 | available in this map. Each Analyzer must return a value of the type |
153 | described in its Analyzer.ResultType field. |
154 | For example, the "ctrlflow" analyzer returns a *ctrlflow.CFGs, which |
155 | provides a control-flow graph for each function in the package (see |
156 | golang.org/x/tools/go/cfg); the "inspect" analyzer returns a value that |
157 | enables other Analyzers to traverse the syntax trees of the package more |
158 | efficiently; and the "buildssa" analyzer constructs an SSA-form |
159 | intermediate representation. |
160 | Each of these Analyzers extends the capabilities of later Analyzers |
161 | without adding a dependency to the core API, so an analysis tool pays |
162 | only for the extensions it needs. |
163 | |
164 | The Report function emits a diagnostic, a message associated with a |
165 | source position. For most analyses, diagnostics are their primary |
166 | result. |
167 | For convenience, Pass provides a helper method, Reportf, to report a new |
168 | diagnostic by formatting a string. |
169 | Diagnostic is defined as: |
170 | |
171 | type Diagnostic struct { |
172 | Pos token.Pos |
173 | Category string // optional |
174 | Message string |
175 | } |
176 | |
177 | The optional Category field is a short identifier that classifies the |
178 | kind of message when an analysis produces several kinds of diagnostic. |
179 | |
180 | The Diagnostic struct does not have a field to indicate its severity |
181 | because opinions about the relative importance of Analyzers and their |
182 | diagnostics vary widely among users. The design of this framework does |
183 | not hold each Analyzer responsible for identifying the severity of its |
184 | diagnostics. Instead, we expect that drivers will allow the user to |
185 | customize the filtering and prioritization of diagnostics based on the |
186 | producing Analyzer and optional Category, according to the user's |
187 | preferences. |
188 | |
189 | Most Analyzers inspect typed Go syntax trees, but a few, such as asmdecl |
190 | and buildtag, inspect the raw text of Go source files or even non-Go |
191 | files such as assembly. To report a diagnostic against a line of a |
192 | raw text file, use the following sequence: |
193 | |
194 | content, err := os.ReadFile(filename) |
195 | if err != nil { ... } |
196 | tf := fset.AddFile(filename, -1, len(content)) |
197 | tf.SetLinesForContent(content) |
198 | ... |
199 | pass.Reportf(tf.LineStart(line), "oops") |
200 | |
201 | # Modular analysis with Facts |
202 | |
203 | To improve efficiency and scalability, large programs are routinely |
204 | built using separate compilation: units of the program are compiled |
205 | separately, and recompiled only when one of their dependencies changes; |
206 | independent modules may be compiled in parallel. The same technique may |
207 | be applied to static analyses, for the same benefits. Such analyses are |
208 | described as "modular". |
209 | |
210 | A compiler’s type checker is an example of a modular static analysis. |
211 | Many other checkers we would like to apply to Go programs can be |
212 | understood as alternative or non-standard type systems. For example, |
213 | vet's printf checker infers whether a function has the "printf wrapper" |
214 | type, and it applies stricter checks to calls of such functions. In |
215 | addition, it records which functions are printf wrappers for use by |
216 | later analysis passes to identify other printf wrappers by induction. |
217 | A result such as “f is a printf wrapper” that is not interesting by |
218 | itself but serves as a stepping stone to an interesting result (such as |
219 | a diagnostic) is called a "fact". |
220 | |
221 | The analysis API allows an analysis to define new types of facts, to |
222 | associate facts of these types with objects (named entities) declared |
223 | within the current package, or with the package as a whole, and to query |
224 | for an existing fact of a given type associated with an object or |
225 | package. |
226 | |
227 | An Analyzer that uses facts must declare their types: |
228 | |
229 | var Analyzer = &analysis.Analyzer{ |
230 | Name: "printf", |
231 | FactTypes: []analysis.Fact{new(isWrapper)}, |
232 | ... |
233 | } |
234 | |
235 | type isWrapper struct{} // => *types.Func f “is a printf wrapper” |
236 | |
237 | The driver program ensures that facts for a pass’s dependencies are |
238 | generated before analyzing the package and is responsible for propagating |
239 | facts from one package to another, possibly across address spaces. |
240 | Consequently, Facts must be serializable. The API requires that drivers |
241 | use the gob encoding, an efficient, robust, self-describing binary |
242 | protocol. A fact type may implement the GobEncoder/GobDecoder interfaces |
243 | if the default encoding is unsuitable. Facts should be stateless. |
244 | Because serialized facts may appear within build outputs, the gob encoding |
245 | of a fact must be deterministic, to avoid spurious cache misses in |
246 | build systems that use content-addressable caches. |
247 | The driver makes a single call to the gob encoder for all facts |
248 | exported by a given analysis pass, so that the topology of |
249 | shared data structures referenced by multiple facts is preserved. |
250 | |
251 | The Pass type has functions to import and export facts, |
252 | associated either with an object or with a package: |
253 | |
254 | type Pass struct { |
255 | ... |
256 | ExportObjectFact func(types.Object, Fact) |
257 | ImportObjectFact func(types.Object, Fact) bool |
258 | |
259 | ExportPackageFact func(fact Fact) |
260 | ImportPackageFact func(*types.Package, Fact) bool |
261 | } |
262 | |
263 | An Analyzer may only export facts associated with the current package or |
264 | its objects, though it may import facts from any package or object that |
265 | is an import dependency of the current package. |
266 | |
267 | Conceptually, ExportObjectFact(obj, fact) inserts fact into a hidden map keyed by |
268 | the pair (obj, TypeOf(fact)), and the ImportObjectFact function |
269 | retrieves the entry from this map and copies its value into the variable |
270 | pointed to by fact. This scheme assumes that the concrete type of fact |
271 | is a pointer; this assumption is checked by the Validate function. |
272 | See the "printf" analyzer for an example of object facts in action. |
273 | |
274 | Some driver implementations (such as those based on Bazel and Blaze) do |
275 | not currently apply analyzers to packages of the standard library. |
276 | Therefore, for best results, analyzer authors should not rely on |
277 | analysis facts being available for standard packages. |
278 | For example, although the printf checker is capable of deducing during |
279 | analysis of the log package that log.Printf is a printf wrapper, |
280 | this fact is built in to the analyzer so that it correctly checks |
281 | calls to log.Printf even when run in a driver that does not apply |
282 | it to standard packages. We would like to remove this limitation in future. |
283 | |
284 | # Testing an Analyzer |
285 | |
286 | The analysistest subpackage provides utilities for testing an Analyzer. |
287 | In a few lines of code, it is possible to run an analyzer on a package |
288 | of testdata files and check that it reported all the expected |
289 | diagnostics and facts (and no more). Expectations are expressed using |
290 | "// want ..." comments in the input code. |
291 | |
292 | # Standalone commands |
293 | |
294 | Analyzers are provided in the form of packages that a driver program is |
295 | expected to import. The vet command imports a set of several analyzers, |
296 | but users may wish to define their own analysis commands that perform |
297 | additional checks. To simplify the task of creating an analysis command, |
298 | either for a single analyzer or for a whole suite, we provide the |
299 | singlechecker and multichecker subpackages. |
300 | |
301 | The singlechecker package provides the main function for a command that |
302 | runs one analyzer. By convention, each analyzer such as |
303 | go/analysis/passes/findcall should be accompanied by a singlechecker-based |
304 | command such as go/analysis/passes/findcall/cmd/findcall, defined in its |
305 | entirety as: |
306 | |
307 | package main |
308 | |
309 | import ( |
310 | "golang.org/x/tools/go/analysis/passes/findcall" |
311 | "golang.org/x/tools/go/analysis/singlechecker" |
312 | ) |
313 | |
314 | func main() { singlechecker.Main(findcall.Analyzer) } |
315 | |
316 | A tool that provides multiple analyzers can use multichecker in a |
317 | similar way, giving it the list of Analyzers. |
318 | */ |
319 | package analysis |
320 |