diff --git a/src/attributes.md b/src/attributes.md index 5858781510..696e6184b6 100644 --- a/src/attributes.md +++ b/src/attributes.md @@ -17,9 +17,7 @@ AttrInput -> ``` r[attributes.intro] -An _attribute_ is a general, free-form metadatum that is interpreted according -to name, convention, language, and compiler version. Attributes are modeled -on Attributes in [ECMA-335], with the syntax coming from [ECMA-334] \(C#). +An _attribute_ is a general, free-form metadatum that is interpreted according to name, convention, language, and compiler version. Attributes are modeled on Attributes in [ECMA-335], with the syntax coming from [ECMA-334] \(C#). r[attributes.inner] _Inner attributes_, written with a bang (`!`) after the hash (`#`), apply to the form that the attribute is declared within. @@ -62,17 +60,10 @@ _Outer attributes_, written without the bang after the hash, apply to the form t > ``` r[attributes.input] -The attribute consists of a path to the attribute, followed by an optional -delimited token tree whose interpretation is defined by the attribute. -Attributes other than macro attributes also allow the input to be an equals -sign (`=`) followed by an expression. See the [meta item -syntax](#meta-item-attribute-syntax) below for more details. +The attribute consists of a path to the attribute, followed by an optional delimited token tree whose interpretation is defined by the attribute. Attributes other than macro attributes also allow the input to be an equals sign (`=`) followed by an expression. See the [meta item syntax](#meta-item-attribute-syntax) below for more details. r[attributes.safety] -An attribute may be unsafe to apply. To avoid undefined behavior when using -these attributes, certain obligations that cannot be checked by the compiler -must be met. To assert these have been, the attribute is wrapped in -`unsafe(..)`, e.g. `#[unsafe(no_mangle)]`. +An attribute may be unsafe to apply. 
To avoid undefined behavior when using these attributes, certain obligations that cannot be checked by the compiler must be met. To assert these have been, the attribute is wrapped in `unsafe(..)`, e.g. `#[unsafe(no_mangle)]`. The following attributes are unsafe: @@ -92,29 +83,21 @@ Attributes can be classified into the following kinds: r[attributes.allowed-position] Attributes may be applied to many forms in the language: -* All [item declarations] accept outer attributes while [external blocks], - [functions], [implementations], and [modules] accept inner attributes. -* Most [statements] accept outer attributes (see [Expression Attributes] for - limitations on expression statements). -* [Block expressions] accept outer and inner attributes, but only when they are - the outer expression of an [expression statement] or the final expression of - another block expression. +* All [item declarations] accept outer attributes while [external blocks], [functions], [implementations], and [modules] accept inner attributes. +* Most [statements] accept outer attributes (see [Expression Attributes] for limitations on expression statements). +* [Block expressions] accept outer and inner attributes, but only when they are the outer expression of an [expression statement] or the final expression of another block expression. * [Enum] variants and [struct] and [union] fields accept outer attributes. * [Match expression arms][match expressions] accept outer attributes. * [Generic lifetime or type parameter][generics] accept outer attributes. -* Expressions accept outer attributes in limited situations, see [Expression - Attributes] for details. -* [Function][functions], [closure] and [function pointer] - parameters accept outer attributes. This includes attributes on variadic parameters - denoted with `...` in function pointers and [external blocks][variadic functions]. +* Expressions accept outer attributes in limited situations, see [Expression Attributes] for details. 
+* [Function][functions], [closure] and [function pointer] parameters accept outer attributes. This includes attributes on variadic parameters denoted with `...` in function pointers and [external blocks][variadic functions]. * [Inline assembly] template strings and operands accept outer attributes. Only certain attributes are accepted semantically; for details, see [asm.attributes.supported-attributes]. r[attributes.meta] ## Meta item attribute syntax r[attributes.meta.intro] -A "meta item" is the syntax used for the [Attr] rule by most [built-in -attributes]. It has the following grammar: +A "meta item" is the syntax used for the [Attr] rule by most [built-in attributes]. It has the following grammar: r[attributes.meta.syntax] ```grammar,attributes @@ -132,15 +115,10 @@ MetaItemInner -> ``` r[attributes.meta.literal-expr] -Expressions in meta items must macro-expand to literal expressions, which must not -include integer or float type suffixes. Expressions which are not literal expressions -will be syntactically accepted (and can be passed to proc-macros), but will be rejected after parsing. +Expressions in meta items must macro-expand to literal expressions, which must not include integer or float type suffixes. Expressions which are not literal expressions will be syntactically accepted (and can be passed to proc-macros), but will be rejected after parsing. r[attributes.meta.order] -Note that if the attribute appears within another macro, it will be expanded -after that outer macro. For example, the following code will expand the -`Serialize` proc-macro first, which must preserve the `include_str!` call in -order for it to be expanded: +Note that if the attribute appears within another macro, it will be expanded after that outer macro. 
For example, the following code will expand the `Serialize` proc-macro first, which must preserve the `include_str!` call in order for it to be expanded: ```rust ignore #[derive(Serialize)] @@ -162,9 +140,7 @@ fn foo() {} ``` r[attributes.meta.builtin] -Various built-in attributes use different subsets of the meta item syntax to -specify their inputs. The following grammar rules show some commonly used -forms: +Various built-in attributes use different subsets of the meta item syntax to specify their inputs. The following grammar rules show some commonly used forms: r[attributes.meta.builtin.syntax] ```grammar,attributes @@ -198,30 +174,21 @@ r[attributes.activity] ## Active and inert attributes r[attributes.activity.intro] -An attribute is either active or inert. During attribute processing, *active -attributes* remove themselves from the form they are on while *inert attributes* -stay on. +An attribute is either active or inert. During attribute processing, *active attributes* remove themselves from the form they are on while *inert attributes* stay on. -The [`cfg`] and [`cfg_attr`] attributes are active. -[Attribute macros] are active. All other attributes are inert. +The [`cfg`] and [`cfg_attr`] attributes are active. [Attribute macros] are active. All other attributes are inert. r[attributes.tool] ## Tool attributes r[attributes.tool.intro] -The compiler may allow attributes for external tools where each tool resides -in its own module in the [tool prelude]. The first segment of the attribute -path is the name of the tool, with one or more additional segments whose -interpretation is up to the tool. +The compiler may allow attributes for external tools where each tool resides in its own module in the [tool prelude]. The first segment of the attribute path is the name of the tool, with one or more additional segments whose interpretation is up to the tool. 
r[attributes.tool.ignored] -When a tool is not in use, the tool's attributes are accepted without a -warning. When the tool is in use, the tool is responsible for processing and -interpretation of its attributes. +When a tool is not in use, the tool's attributes are accepted without a warning. When the tool is in use, the tool is responsible for processing and interpretation of its attributes. r[attributes.tool.prelude] -Tool attributes are not available if the [`no_implicit_prelude`] attribute is -used. +Tool attributes are not available if the [`no_implicit_prelude`] attribute is used. ```rust // Tells the rustfmt tool to not format the following element. @@ -253,13 +220,11 @@ The following is an index of all built-in attributes. - Derive - [`derive`] --- Automatic trait implementations. - - [`automatically_derived`] --- Marker for implementations created by - `derive`. + - [`automatically_derived`] --- Marker for implementations created by `derive`. - Macros - [`macro_export`] --- Exports a `macro_rules` macro for cross-crate usage. - - [`macro_use`] --- Expands macro visibility, or imports macros from other - crates. + - [`macro_use`] --- Expands macro visibility, or imports macros from other crates. - [`proc_macro`] --- Defines a function-like macro. - [`proc_macro_derive`] --- Defines a derive macro. - [`proc_macro_attribute`] --- Defines an attribute macro. @@ -268,27 +233,21 @@ The following is an index of all built-in attributes. - [`allow`], [`expect`], [`warn`], [`deny`], [`forbid`] --- Alters the default lint level. - [`deprecated`] --- Generates deprecation notices. - [`must_use`] --- Generates a lint for unused values. - - [`diagnostic::on_unimplemented`] --- Hints the compiler to emit a certain error - message if a trait is not implemented. + - [`diagnostic::on_unimplemented`] --- Hints the compiler to emit a certain error message if a trait is not implemented. 
- [`diagnostic::do_not_recommend`] --- Hints the compiler to not show a certain trait impl in error messages. - ABI, linking, symbols, and FFI - [`link`] --- Specifies a native library to link with an `extern` block. - - [`link_name`] --- Specifies the name of the symbol for functions or statics - in an `extern` block. - - [`link_ordinal`] --- Specifies the ordinal of the symbol for functions or - statics in an `extern` block. + - [`link_name`] --- Specifies the name of the symbol for functions or statics in an `extern` block. + - [`link_ordinal`] --- Specifies the ordinal of the symbol for functions or statics in an `extern` block. - [`no_link`] --- Prevents linking an extern crate. - [`repr`] --- Controls type layout. - [`crate_type`] --- Specifies the type of crate (library, executable, etc.). - [`no_main`] --- Disables emitting the `main` symbol. - - [`export_name`] --- Specifies the exported symbol name for a function or - static. - - [`link_section`] --- Specifies the section of an object file to use for a - function or static. + - [`export_name`] --- Specifies the exported symbol name for a function or static. + - [`link_section`] --- Specifies the section of an object file to use for a function or static. - [`no_mangle`] --- Disables symbol name encoding. - - [`used`] --- Forces the compiler to keep a static item in the output - object file. + - [`used`] --- Forces the compiler to keep a static item in the output object file. - [`crate_name`] --- Specifies the crate name. - Code generation @@ -301,8 +260,7 @@ The following is an index of all built-in attributes. - [`instruction_set`] --- Specify the instruction set used to generate a function's code. - Documentation - - `doc` --- Specifies documentation. See [The Rustdoc Book] for more - information. [Doc comments] are transformed into `doc` attributes. + - `doc` --- Specifies documentation. See [The Rustdoc Book] for more information. [Doc comments] are transformed into `doc` attributes. 
- Preludes - [`no_std`] --- Removes std from the prelude. @@ -312,8 +270,7 @@ The following is an index of all built-in attributes. - [`path`] --- Specifies the filename for a module. - Limits - - [`recursion_limit`] --- Sets the maximum recursion limit for certain - compile-time operations. + - [`recursion_limit`] --- Sets the maximum recursion limit for certain compile-time operations. - [`type_length_limit`] --- Sets the maximum size of a polymorphic type. - Runtime @@ -322,12 +279,10 @@ The following is an index of all built-in attributes. - [`windows_subsystem`] --- Specifies the windows subsystem to link with. - Features - - `feature` --- Used to enable unstable or experimental compiler features. See - [The Unstable Book] for features implemented in `rustc`. + - `feature` --- Used to enable unstable or experimental compiler features. See [The Unstable Book] for features implemented in `rustc`. - Type System - - [`non_exhaustive`] --- Indicate that a type will have more fields/variants - added in future. + - [`non_exhaustive`] --- Indicate that a type will have more fields/variants added in future. - Debugger - [`debugger_visualizer`] --- Embeds a file that specifies debugger output for a type. diff --git a/src/conditional-compilation.md b/src/conditional-compilation.md index f03362063c..bacad209f6 100644 --- a/src/conditional-compilation.md +++ b/src/conditional-compilation.md @@ -37,8 +37,7 @@ r[cfg.conditional] Whether to compile can depend on the target architecture of the compiled crate, arbitrary values passed to the compiler, and other things further described below. r[cfg.predicate] -Each form of conditional compilation takes a _configuration predicate_ that -evaluates to true or false. The predicate is one of the following: +Each form of conditional compilation takes a _configuration predicate_ that evaluates to true or false. The predicate is one of the following: r[cfg.predicate.option] * A configuration option. 
The predicate is true if the option is set, and false if it is unset. @@ -74,8 +73,7 @@ r[cfg.options.set] ## Set configuration options r[cfg.options.general] -Which configuration options are set is determined statically during the -compilation of the crate. +Which configuration options are set is determined statically during the compilation of the crate. r[cfg.options.target] Some options are _compiler-set_ based on data about the compilation. @@ -84,8 +82,7 @@ r[cfg.options.other] Other options are _arbitrarily-set_ based on input passed to the compiler outside of the code. r[cfg.options.crate] -It is not possible to set a -configuration option from within the source code of the crate being compiled. +It is not possible to set a configuration option from within the source code of the crate being compiled. > [!NOTE] > For `rustc`, arbitrary-set configuration options are set using the [`--cfg`] flag. Configuration values for a specified target can be displayed with `rustc --print cfg --target $TARGET`. @@ -97,9 +94,7 @@ r[cfg.target_arch] ### `target_arch` r[cfg.target_arch.gen] -Key-value option set once with the target's CPU architecture. The value is -similar to the first element of the platform's target triple, but not -identical. +Key-value option set once with the target's CPU architecture. The value is similar to the first element of the platform's target triple, but not identical. r[cfg.target_arch.values] Example values: @@ -116,8 +111,7 @@ r[cfg.target_feature] ### `target_feature` r[cfg.target_feature.general] -Key-value option set for each platform feature available for the current -compilation target. +Key-value option set for each platform feature available for the current compilation target. r[cfg.target_feature.values] Example values: @@ -130,19 +124,16 @@ Example values: * `"sse2"` * `"sse4.1"` -See the [`target_feature` attribute] for more details on the available -features. 
+See the [`target_feature` attribute] for more details on the available features. r[cfg.target_feature.crt_static] -An additional feature of `crt-static` is available to the -`target_feature` option to indicate that a [static C runtime] is available. +An additional feature of `crt-static` is available to the `target_feature` option to indicate that a [static C runtime] is available. r[cfg.target_os] ### `target_os` r[cfg.target_os.general] -Key-value option set once with the target's operating system. This value is -similar to the second and third element of the platform's target triple. +Key-value option set once with the target's operating system. This value is similar to the second and third element of the platform's target triple. r[cfg.target_os.values] Example values: @@ -162,9 +153,7 @@ r[cfg.target_family] ### `target_family` r[cfg.target_family.general] -Key-value option providing a more generic description of a target, such as the family of the -operating systems or architectures that the target generally falls into. Any number of -`target_family` key-value pairs can be set. +Key-value option providing a more generic description of a target, such as the family of the operating systems or architectures that the target generally falls into. Any number of `target_family` key-value pairs can be set. r[cfg.target_family.values] Example values: @@ -186,13 +175,7 @@ r[cfg.target_env] ### `target_env` r[cfg.target_env.general] -Key-value option set with further disambiguating information about the target -platform with information about the ABI or `libc` used. For historical reasons, -this value is only defined as not the empty-string when actually needed for -disambiguation. Thus, for example, on many GNU platforms, this value will be -empty. This value is similar to the fourth element of the platform's target -triple. One difference is that embedded ABIs such as `gnueabihf` will simply -define `target_env` as `"gnu"`. 
+Key-value option set with further disambiguating information about the target platform with information about the ABI or `libc` used. For historical reasons, this value is only defined as not the empty-string when actually needed for disambiguation. Thus, for example, on many GNU platforms, this value will be empty. This value is similar to the fourth element of the platform's target triple. One difference is that embedded ABIs such as `gnueabihf` will simply define `target_env` as `"gnu"`. r[cfg.target_env.values] Example values: @@ -209,13 +192,10 @@ r[cfg.target_abi] ### `target_abi` r[cfg.target_abi.general] -Key-value option set to further disambiguate the target with information about -the target ABI. +Key-value option set to further disambiguate the target with information about the target ABI. r[cfg.target_abi.disambiguation] -For historical reasons, this value is only defined as not the empty-string when actually -needed for disambiguation. Thus, for example, on many GNU platforms, this value will be -empty. +For historical reasons, this value is only defined as not the empty-string when actually needed for disambiguation. Thus, for example, on many GNU platforms, this value will be empty. r[cfg.target_abi.values] Example values: @@ -228,8 +208,7 @@ Example values: r[cfg.target_endian] ### `target_endian` -Key-value option set once with either a value of "little" or "big" depending -on the endianness of the target's CPU. +Key-value option set once with either a value of "little" or "big" depending on the endianness of the target's CPU. r[cfg.target_pointer_width] ### `target_pointer_width` @@ -262,12 +241,10 @@ r[cfg.target_has_atomic] ### `target_has_atomic` r[cfg.target_has_atomic.general] -Key-value option set for each bit width that the target supports -atomic loads, stores, and compare-and-swap operations. +Key-value option set for each bit width that the target supports atomic loads, stores, and compare-and-swap operations. 
r[cfg.target_has_atomic.stdlib] -When this cfg is present, all of the stable [`core::sync::atomic`] APIs are available for -the relevant atomic width. +When this cfg is present, all of the stable [`core::sync::atomic`] APIs are available for the relevant atomic width. r[cfg.target_has_atomic.values] Possible values: @@ -282,22 +259,17 @@ Possible values: r[cfg.test] ### `test` -Enabled when compiling the test harness. Done with `rustc` by using the -[`--test`] flag. See [Testing] for more on testing support. +Enabled when compiling the test harness. Done with `rustc` by using the [`--test`] flag. See [Testing] for more on testing support. r[cfg.debug_assertions] ### `debug_assertions` -Enabled by default when compiling without optimizations. -This can be used to enable extra debugging code in development but not in -production. For example, it controls the behavior of the standard library's -[`debug_assert!`] macro. +Enabled by default when compiling without optimizations. This can be used to enable extra debugging code in development but not in production. For example, it controls the behavior of the standard library's [`debug_assert!`] macro. r[cfg.proc_macro] ### `proc_macro` -Set when the crate being compiled is being compiled with the `proc_macro` -[crate type]. +Set when the crate being compiled is being compiled with the `proc_macro` [crate type]. r[cfg.panic] ### `panic` @@ -448,9 +420,7 @@ Zero, one, or more attributes may be listed. Multiple attributes will each be ex r[cfg.macro] ### The `cfg` macro -The built-in `cfg` macro takes in a single configuration predicate and evaluates -to the `true` literal when the predicate is true and the `false` literal when -it is false. +The built-in `cfg` macro takes in a single configuration predicate and evaluates to the `true` literal when the predicate is true and the `false` literal when it is false. 
For example: diff --git a/src/crates-and-source-files.md b/src/crates-and-source-files.md index 1ace243ad5..bb05a0971d 100644 --- a/src/crates-and-source-files.md +++ b/src/crates-and-source-files.md @@ -12,52 +12,28 @@ r[crate.syntax] > Although Rust, like any other language, can be implemented by an interpreter as well as a compiler, the only existing implementation is a compiler, and the language has always been designed to be compiled. For these reasons, this section assumes a compiler. r[crate.compile-time] -Rust's semantics obey a *phase distinction* between compile-time and -run-time.[^phase-distinction] Semantic rules that have a *static -interpretation* govern the success or failure of compilation, while -semantic rules that have a *dynamic interpretation* govern the behavior of the -program at run-time. +Rust's semantics obey a *phase distinction* between compile-time and run-time.[^phase-distinction] Semantic rules that have a *static interpretation* govern the success or failure of compilation, while semantic rules that have a *dynamic interpretation* govern the behavior of the program at run-time. r[crate.unit] -The compilation model centers on artifacts called _crates_. Each compilation -processes a single crate in source form, and if successful, produces a single -crate in binary form: either an executable or some sort of -library.[^cratesourcefile] +The compilation model centers on artifacts called _crates_. Each compilation processes a single crate in source form, and if successful, produces a single crate in binary form: either an executable or some sort of library.[^cratesourcefile] r[crate.module] -A _crate_ is a unit of compilation and linking, as well as versioning, -distribution, and runtime loading. A crate contains a _tree_ of nested -[module] scopes. 
The top level of this tree is a module that is -anonymous (from the point of view of paths within the module) and any item -within a crate has a canonical [module path] denoting its location -within the crate's module tree. +A _crate_ is a unit of compilation and linking, as well as versioning, distribution, and runtime loading. A crate contains a _tree_ of nested [module] scopes. The top level of this tree is a module that is anonymous (from the point of view of paths within the module) and any item within a crate has a canonical [module path] denoting its location within the crate's module tree. r[crate.input-source] -The Rust compiler is always invoked with a single source file as input, and -always produces a single output crate. The processing of that source file may -result in other source files being loaded as modules. Source files have the -extension `.rs`. +The Rust compiler is always invoked with a single source file as input, and always produces a single output crate. The processing of that source file may result in other source files being loaded as modules. Source files have the extension `.rs`. r[crate.module-def] -A Rust source file describes a module, the name and location of which — -in the module tree of the current crate — are defined from outside the -source file: either by an explicit [Module][grammar-Module] item in a referencing -source file, or by the name of the crate itself. +A Rust source file describes a module, the name and location of which — in the module tree of the current crate — are defined from outside the source file: either by an explicit [Module][grammar-Module] item in a referencing source file, or by the name of the crate itself. r[crate.inline-module] -Every source file is a -module, but not every module needs its own source file: [module -definitions][module] can be nested within one file. 
+Every source file is a module, but not every module needs its own source file: [module definitions][module] can be nested within one file. r[crate.items] -Each source file contains a sequence of zero or more [Item] definitions, and -may optionally begin with any number of [attributes] -that apply to the containing module, most of which influence the behavior of -the compiler. +Each source file contains a sequence of zero or more [Item] definitions, and may optionally begin with any number of [attributes] that apply to the containing module, most of which influence the behavior of the compiler. r[crate.attributes] -The anonymous crate module can have additional attributes that -apply to the crate as a whole. +The anonymous crate module can have additional attributes that apply to the crate as a whole. > [!NOTE] > The file's contents may be preceded by a [shebang]. @@ -81,9 +57,7 @@ r[crate.main.general] A crate that contains a `main` [function] can be compiled to an executable. r[crate.main.restriction] -If a `main` function is present, it must take no arguments, must not declare any -[trait or lifetime bounds], must not have any [where clauses], and its return -type must implement the [`Termination`] trait. +If a `main` function is present, it must take no arguments, must not declare any [trait or lifetime bounds], must not have any [where clauses], and its return type must implement the [`Termination`] trait. ```rust fn main() {} @@ -139,24 +113,18 @@ r[crate.crate_name] ## The `crate_name` attribute r[crate.crate_name.general] -The *`crate_name` [attribute]* may be applied at the crate level to specify the -name of the crate with the [MetaNameValueStr] syntax. +The *`crate_name` [attribute]* may be applied at the crate level to specify the name of the crate with the [MetaNameValueStr] syntax. 
```rust #![crate_name = "mycrate"] ``` r[crate.crate_name.restriction] -The crate name must not be empty, and must only contain [Unicode alphanumeric] -or `_` (U+005F) characters. +The crate name must not be empty, and must only contain [Unicode alphanumeric] or `_` (U+005F) characters. -[^phase-distinction]: This distinction would also exist in an interpreter. - Static checks like syntactic analysis, type checking, and lints should - happen before the program is executed regardless of when it is executed. +[^phase-distinction]: This distinction would also exist in an interpreter. Static checks like syntactic analysis, type checking, and lints should happen before the program is executed regardless of when it is executed. -[^cratesourcefile]: A crate is somewhat analogous to an *assembly* in the - ECMA-335 CLI model, a *library* in the SML/NJ Compilation Manager, a *unit* - in the Owens and Flatt module system, or a *configuration* in Mesa. +[^cratesourcefile]: A crate is somewhat analogous to an *assembly* in the ECMA-335 CLI model, a *library* in the SML/NJ Compilation Manager, a *unit* in the Owens and Flatt module system, or a *configuration* in Mesa. [Unicode alphanumeric]: char::is_alphanumeric [`!`]: types/never.md diff --git a/src/glossary.md b/src/glossary.md index c6c1656b88..72bb1431a9 100644 --- a/src/glossary.md +++ b/src/glossary.md @@ -3,14 +3,11 @@ r[glossary.ast] ### Abstract syntax tree -An ‘abstract syntax tree’, or ‘AST’, is an intermediate representation of -the structure of the program when the compiler is compiling it. +An ‘abstract syntax tree’, or ‘AST’, is an intermediate representation of the structure of the program when the compiler is compiling it. ### Alignment -The alignment of a value specifies what addresses values are preferred to -start at. Always a power of two. References to a value must be aligned. -[More][alignment]. +The alignment of a value specifies what addresses values are preferred to start at. Always a power of two. 
References to a value must be aligned. [More][alignment]. r[glossary.abi] ### Application binary interface (ABI) @@ -22,52 +19,31 @@ An *application binary interface* (ABI) defines how compiled code interacts with ### Arity -Arity refers to the number of arguments a function or operator takes. -For some examples, `f(2, 3)` and `g(4, 6)` have arity 2, while `h(8, 2, 6)` -has arity 3. The `!` operator has arity 1. +Arity refers to the number of arguments a function or operator takes. For some examples, `f(2, 3)` and `g(4, 6)` have arity 2, while `h(8, 2, 6)` has arity 3. The `!` operator has arity 1. ### Array -An array, sometimes also called a fixed-size array or an inline array, is a value -describing a collection of elements, each selected by an index that can be computed -at run time by the program. It occupies a contiguous region of memory. +An array, sometimes also called a fixed-size array or an inline array, is a value describing a collection of elements, each selected by an index that can be computed at run time by the program. It occupies a contiguous region of memory. ### Associated item -An associated item is an item that is associated with another item. Associated -items are defined in [implementations] and declared in [traits]. Only -functions, constants, and type aliases can be associated. Contrast to a [free -item]. +An associated item is an item that is associated with another item. Associated items are defined in [implementations] and declared in [traits]. Only functions, constants, and type aliases can be associated. Contrast to a [free item]. ### Blanket implementation -Any implementation where a type appears [uncovered](#uncovered-type). `impl<T> Foo -for T`, `impl<T> Bar<T> for T`, `impl<T> Bar<Vec<T>> for T`, and `impl<T> Bar<T> -for Vec<T>` are considered blanket impls. However, `impl<T> Bar<Vec<T>> for -Vec<T>` is not a blanket impl, as all instances of `T` which appear in this `impl` -are covered by `Vec`. +Any implementation where a type appears [uncovered](#uncovered-type). 
`impl<T> Foo for T`, `impl<T> Bar<T> for T`, `impl<T> Bar<Vec<T>> for T`, and `impl<T> Bar<T> for Vec<T>` are considered blanket impls. However, `impl<T> Bar<Vec<T>> for Vec<T>` is not a blanket impl, as all instances of `T` which appear in this `impl` are covered by `Vec`. ### Bound -Bounds are constraints on a type or trait. For example, if a bound -is placed on the argument a function takes, types passed to that function -must abide by that constraint. +Bounds are constraints on a type or trait. For example, if a bound is placed on the argument a function takes, types passed to that function must abide by that constraint. ### Combinator -Combinators are higher-order functions that apply only functions and -earlier defined combinators to provide a result from its arguments. -They can be used to manage control flow in a modular fashion. +Combinators are higher-order functions that apply only functions and earlier defined combinators to provide a result from its arguments. They can be used to manage control flow in a modular fashion. ### Crate -A crate is the unit of compilation and linking. There are different [types of -crates], such as libraries or executables. Crates may link and refer to other -library crates, called external crates. A crate has a self-contained tree of -[modules], starting from an unnamed root module called the crate root. [Items] -may be made visible to other crates by marking them as public in the crate -root, including through [paths] of public modules. -[More][crate]. +A crate is the unit of compilation and linking. There are different [types of crates], such as libraries or executables. Crates may link and refer to other library crates, called external crates. A crate has a self-contained tree of [modules], starting from an unnamed root module called the crate root. [Items] may be made visible to other crates by marking them as public in the crate root, including through [paths] of public modules. [More][crate]. 
### Dispatch @@ -79,136 +55,95 @@ A dynamically sized type (DST) is a type without a statically known size or alig ### Entity -An [*entity*] is a language construct that can be referred to in some way -within the source program, usually via a [path][paths]. Entities include -[types], [items], [generic parameters], [variable bindings], [loop labels], -[lifetimes], [fields], [attributes], and [lints]. +An [*entity*] is a language construct that can be referred to in some way within the source program, usually via a [path][paths]. Entities include [types], [items], [generic parameters], [variable bindings], [loop labels], [lifetimes], [fields], [attributes], and [lints]. ### Expression -An expression is a combination of values, constants, variables, operators -and functions that evaluate to a single value, with or without side-effects. +An expression is a combination of values, constants, variables, operators and functions that evaluate to a single value, with or without side-effects. For example, `2 + (3 * 4)` is an expression that returns the value 14. ### Free item -An [item] that is not a member of an [implementation], such as a *free -function* or a *free const*. Contrast to an [associated item]. +An [item] that is not a member of an [implementation], such as a *free function* or a *free const*. Contrast to an [associated item]. ### Fundamental traits -A fundamental trait is one where adding an impl of it for an existing type is a breaking change. -The `Fn` traits and `Sized` are fundamental. +A fundamental trait is one where adding an impl of it for an existing type is a breaking change. The `Fn` traits and `Sized` are fundamental. ### Fundamental type constructors -A fundamental type constructor is a type where implementing a [blanket implementation](#blanket-implementation) over it -is a breaking change. `&`, `&mut`, `Box`, and `Pin` are fundamental. 
+A fundamental type constructor is a type where implementing a [blanket implementation](#blanket-implementation) over it is a breaking change. `&`, `&mut`, `Box`, and `Pin` are fundamental. -Any time a type `T` is considered [local](#local-type), `&T`, `&mut T`, `Box<T>`, and `Pin<T>` -are also considered local. Fundamental type constructors cannot [cover](#uncovered-type) other types. -Any time the term "covered type" is used, -the `T` in `&T`, `&mut T`, `Box<T>`, and `Pin<T>` is not considered covered. +Any time a type `T` is considered [local](#local-type), `&T`, `&mut T`, `Box<T>`, and `Pin<T>` are also considered local. Fundamental type constructors cannot [cover](#uncovered-type) other types. Any time the term "covered type" is used, the `T` in `&T`, `&mut T`, `Box<T>`, and `Pin<T>` is not considered covered. ### Inhabited -A type is inhabited if it has constructors and therefore can be instantiated. An inhabited type is -not "empty" in the sense that there can be values of the type. Opposite of -[Uninhabited](#uninhabited). +A type is inhabited if it has constructors and therefore can be instantiated. An inhabited type is not "empty" in the sense that there can be values of the type. Opposite of [Uninhabited](#uninhabited). ### Inherent implementation -An [implementation] that applies to a nominal type, not to a trait-type pair. -[More][inherent implementation]. +An [implementation] that applies to a nominal type, not to a trait-type pair. [More][inherent implementation]. ### Inherent method -A [method] defined in an [inherent implementation], not in a trait -implementation. +A [method] defined in an [inherent implementation], not in a trait implementation. ### Initialized -A variable is initialized if it has been assigned a value and hasn't since been -moved from. All other memory locations are assumed to be uninitialized. Only -unsafe Rust can create a memory location without initializing it.
+A variable is initialized if it has been assigned a value and hasn't since been moved from. All other memory locations are assumed to be uninitialized. Only unsafe Rust can create a memory location without initializing it. ### Local trait -A `trait` which was defined in the current crate. A trait definition is local -or not independent of applied type arguments. Given `trait Foo<T, U>`, -`Foo` is always local, regardless of the types substituted for `T` and `U`. +A `trait` which was defined in the current crate. A trait definition is local or not independent of applied type arguments. Given `trait Foo<T, U>`, `Foo` is always local, regardless of the types substituted for `T` and `U`. ### Local type -A `struct`, `enum`, or `union` which was defined in the current crate. -This is not affected by applied type arguments. `struct Foo` is considered local, but -`Vec<Foo>` is not. `LocalType<ForeignType>` is local. Type aliases do not -affect locality. +A `struct`, `enum`, or `union` which was defined in the current crate. This is not affected by applied type arguments. `struct Foo` is considered local, but `Vec<Foo>` is not. `LocalType<ForeignType>` is local. Type aliases do not affect locality. ### Module -A module is a container for zero or more [items]. Modules are organized in a -tree, starting from an unnamed module at the root called the crate root or the -root module. [Paths] may be used to refer to items from other modules, which -may be restricted by [visibility rules]. -[More][modules] +A module is a container for zero or more [items]. Modules are organized in a tree, starting from an unnamed module at the root called the crate root or the root module. [Paths] may be used to refer to items from other modules, which may be restricted by [visibility rules]. [More][modules] ### Name -A [*name*] is an [identifier] or [lifetime or loop label] that refers to an -[entity](#entity). A *name binding* is when an entity declaration introduces -an identifier or label associated with that entity.
[Paths], -identifiers, and labels are used to refer to an entity. +A [*name*] is an [identifier] or [lifetime or loop label] that refers to an [entity](#entity). A *name binding* is when an entity declaration introduces an identifier or label associated with that entity. [Paths], identifiers, and labels are used to refer to an entity. ### Name resolution -[*Name resolution*] is the compile-time process of tying [paths], -[identifiers], and [labels] to [entity](#entity) declarations. +[*Name resolution*] is the compile-time process of tying [paths], [identifiers], and [labels] to [entity](#entity) declarations. ### Namespace -A *namespace* is a logical grouping of declared [names](#name) based on the -kind of [entity](#entity) the name refers to. Namespaces allow the occurrence -of a name in one namespace to not conflict with the same name in another -namespace. +A *namespace* is a logical grouping of declared [names](#name) based on the kind of [entity](#entity) the name refers to. Namespaces allow the occurrence of a name in one namespace to not conflict with the same name in another namespace. -Within a namespace, names are organized in a hierarchy, where each level of -the hierarchy has its own collection of named entities. +Within a namespace, names are organized in a hierarchy, where each level of the hierarchy has its own collection of named entities. ### Nominal types -Types that can be referred to by a path directly. Specifically [enums], -[structs], [unions], and [trait object types]. +Types that can be referred to by a path directly. Specifically [enums], [structs], [unions], and [trait object types]. ### Dyn-compatible traits -[Traits] that can be used in [trait object types] (`dyn Trait`). -Only traits that follow specific [rules][dyn compatibility] are *dyn compatible*. +[Traits] that can be used in [trait object types] (`dyn Trait`). Only traits that follow specific [rules][dyn compatibility] are *dyn compatible*. 
These were formerly known as *object safe* traits. ### Path -A [*path*] is a sequence of one or more path segments used to refer to an -[entity](#entity) in the current scope or other levels of a -[namespace](#namespace) hierarchy. +A [*path*] is a sequence of one or more path segments used to refer to an [entity](#entity) in the current scope or other levels of a [namespace](#namespace) hierarchy. ### Prelude -Prelude, or The Rust Prelude, is a small collection of items - mostly traits - that are -imported into every module of every crate. The traits in the prelude are pervasive. +Prelude, or The Rust Prelude, is a small collection of items - mostly traits - that are imported into every module of every crate. The traits in the prelude are pervasive. ### Scope -A [*scope*] is the region of source text where a named [entity](#entity) may -be referenced with that name. +A [*scope*] is the region of source text where a named [entity](#entity) may be referenced with that name. ### Scrutinee -A scrutinee is the expression that is matched on in `match` expressions and -similar pattern matching constructs. For example, in `match x { A => 1, B => 2 }`, -the expression `x` is the scrutinee. +A scrutinee is the expression that is matched on in `match` expressions and similar pattern matching constructs. For example, in `match x { A => 1, B => 2 }`, the expression `x` is the scrutinee. ### Size @@ -216,12 +151,9 @@ The size of a value has two definitions. The first is that it is how much memory must be allocated to store that value. -The second is that it is the offset in bytes between successive elements in an -array with that item type. +The second is that it is the offset in bytes between successive elements in an array with that item type. -It is a multiple of the alignment, including zero. The size can change -depending on compiler version (as new optimizations are made) and target -platform (similar to how `usize` varies per-platform). 
+It is a multiple of the alignment, including zero. The size can change depending on compiler version (as new optimizations are made) and target platform (similar to how `usize` varies per-platform). [More][alignment]. @@ -229,42 +161,33 @@ platform (similar to how `usize` varies per-platform). A slice is dynamically-sized view into a contiguous sequence, written as `[T]`. -It is often seen in its borrowed forms, either mutable or shared. The shared -slice type is `&[T]`, while the mutable slice type is `&mut [T]`, where `T` represents -the element type. +It is often seen in its borrowed forms, either mutable or shared. The shared slice type is `&[T]`, while the mutable slice type is `&mut [T]`, where `T` represents the element type. ### Statement -A statement is the smallest standalone element of a programming language -that commands a computer to perform an action. +A statement is the smallest standalone element of a programming language that commands a computer to perform an action. ### String literal -A string literal is a string stored directly in the final binary, and so will be -valid for the `'static` duration. +A string literal is a string stored directly in the final binary, and so will be valid for the `'static` duration. Its type is `'static` duration borrowed string slice, `&'static str`. ### String slice -A string slice is the most primitive string type in Rust, written as `str`. It is -often seen in its borrowed forms, either mutable or shared. The shared -string slice type is `&str`, while the mutable string slice type is `&mut str`. +A string slice is the most primitive string type in Rust, written as `str`. It is often seen in its borrowed forms, either mutable or shared. The shared string slice type is `&str`, while the mutable string slice type is `&mut str`. Strings slices are always valid UTF-8. ### Trait -A trait is a language item that is used for describing the functionalities a type must provide. 
-It allows a type to make certain promises about its behavior. +A trait is a language item that is used for describing the functionalities a type must provide. It allows a type to make certain promises about its behavior. Generic functions and generic structs can use traits to constrain, or bound, the types they accept. ### Turbofish -Paths with generic parameters in expressions must prefix the opening brackets with a `::`. -Combined with the angular brackets for generics, this looks like a fish `::<>`. -As such, this syntax is colloquially referred to as turbofish syntax. +Paths with generic parameters in expressions must prefix the opening brackets with a `::`. Combined with the angular brackets for generics, this looks like a fish `::<>`. As such, this syntax is colloquially referred to as turbofish syntax. Examples: @@ -273,28 +196,19 @@ let ok_num = Ok::<_, ()>(5); let vec = [1, 2, 3].iter().map(|n| n * 2).collect::<Vec<_>>(); ``` -This `::` prefix is required to disambiguate generic paths with multiple comparisons in a comma-separate list. -See [the bastion of the turbofish][turbofish test] for an example where not having the prefix would be ambiguous. +This `::` prefix is required to disambiguate generic paths with multiple comparisons in a comma-separate list. See [the bastion of the turbofish][turbofish test] for an example where not having the prefix would be ambiguous. ### Uncovered type -A type which does not appear as an argument to another type. For example, -`T` is uncovered, but the `T` in `Vec<T>` is covered. This is only relevant for -type arguments. +A type which does not appear as an argument to another type. For example, `T` is uncovered, but the `T` in `Vec<T>` is covered. This is only relevant for type arguments. ### Undefined behavior -Compile-time or run-time behavior that is not specified. This may result in, -but is not limited to: process termination or corruption; improper, incorrect, -or unintended computation; or platform-specific results.
-[More][undefined-behavior]. +Compile-time or run-time behavior that is not specified. This may result in, but is not limited to: process termination or corruption; improper, incorrect, or unintended computation; or platform-specific results. [More][undefined-behavior]. ### Uninhabited -A type is uninhabited if it has no constructors and therefore can never be instantiated. An -uninhabited type is "empty" in the sense that there are no values of the type. The canonical -example of an uninhabited type is the [never type] `!`, or an enum with no variants -`enum Never { }`. Opposite of [Inhabited](#inhabited). +A type is uninhabited if it has no constructors and therefore can never be instantiated. An uninhabited type is "empty" in the sense that there are no values of the type. The canonical example of an uninhabited type is the [never type] `!`, or an enum with no variants `enum Never { }`. Opposite of [Inhabited](#inhabited). [`extern` blocks]: items.extern [`extern fn`]: items.fn.extern diff --git a/src/influences.md b/src/influences.md index d056c792de..359c292abb 100644 --- a/src/influences.md +++ b/src/influences.md @@ -1,22 +1,16 @@ # Influences -Rust is not a particularly original language, with design elements coming from -a wide range of sources. Some of these are listed below (including elements -that have since been removed): +Rust is not a particularly original language, with design elements coming from a wide range of sources. 
Some of these are listed below (including elements that have since been removed): -* SML, OCaml: algebraic data types, pattern matching, type inference, - semicolon statement separation -* C++: references, RAII, smart pointers, move semantics, monomorphization, - memory model +* SML, OCaml: algebraic data types, pattern matching, type inference, semicolon statement separation +* C++: references, RAII, smart pointers, move semantics, monomorphization, memory model * ML Kit, Cyclone: region based memory management * Haskell (GHC): typeclasses, type families * Newsqueak, Alef, Limbo: channels, concurrency -* Erlang: message passing, thread failure, ~~linked thread failure~~, - ~~lightweight concurrency~~ +* Erlang: message passing, thread failure, ~~linked thread failure~~, ~~lightweight concurrency~~ * Swift: optional bindings * Scheme: hygienic macros * C#: attributes * Ruby: closure syntax, ~~block syntax~~ * NIL, Hermes: ~~typestate~~ -* [Unicode Annex #31](http://www.unicode.org/reports/tr31/): identifier and - pattern syntax +* [Unicode Annex #31](http://www.unicode.org/reports/tr31/): identifier and pattern syntax diff --git a/src/inline-assembly.md b/src/inline-assembly.md index 4168780859..e47539ceba 100644 --- a/src/inline-assembly.md +++ b/src/inline-assembly.md @@ -2,8 +2,7 @@ r[asm] # Inline assembly r[asm.intro] -Support for inline assembly is provided via the [`asm!`], [`naked_asm!`], and [`global_asm!`] macros. -It can be used to embed handwritten assembly in the assembly output generated by the compiler. +Support for inline assembly is provided via the [`asm!`], [`naked_asm!`], and [`global_asm!`] macros. It can be used to embed handwritten assembly in the assembly output generated by the compiler. [`asm!`]: core::arch::asm [`naked_asm!`]: core::arch::naked_asm @@ -115,9 +114,7 @@ r[asm.scope.intro] Inline assembly can be used in one of three ways. 
r[asm.scope.asm] -With the `asm!` macro, the assembly code is emitted in a function scope and integrated into the compiler-generated assembly code of a function. -This assembly code must obey [strict rules](#rules-for-inline-assembly) to avoid undefined behavior. -Note that in some cases the compiler may choose to emit the assembly code as a separate function and generate a call to it. +With the `asm!` macro, the assembly code is emitted in a function scope and integrated into the compiler-generated assembly code of a function. This assembly code must obey [strict rules](#rules-for-inline-assembly) to avoid undefined behavior. Note that in some cases the compiler may choose to emit the assembly code as a separate function and generate a call to it. ```rust # #[cfg(target_arch = "x86_64")] { @@ -138,8 +135,7 @@ core::arch::naked_asm!("/* {} */", const 0); ``` r[asm.scope.global_asm] -With the `global_asm!` macro, the assembly code is emitted in a global scope, outside a function. -This can be used to hand-write entire functions using assembly code, and generally provides much more freedom to use arbitrary registers and assembler directives. +With the `global_asm!` macro, the assembly code is emitted in a global scope, outside a function. This can be used to hand-write entire functions using assembly code, and generally provides much more freedom to use arbitrary registers and assembler directives. ```rust # fn main() {} @@ -186,8 +182,7 @@ unsafe { core::arch::asm!("/* {x} */"); } // ERROR: no argument named x ``` r[asm.ts-args.one-or-more] -An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them. -The expected usage is for each template string argument to correspond to a line of assembly code. 
+An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them. The expected usage is for each template string argument to correspond to a line of assembly code. ```rust # #[cfg(target_arch = "x86_64")] { @@ -260,12 +255,7 @@ r[asm.ts-args.opaque] The exact assembly code syntax is target-specific and opaque to the compiler except for the way operands are substituted into the template string to form the code passed to the assembler. r[asm.ts-args.llvm-syntax] -Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler which usually corresponds to that of the GNU assembler (GAS). -On x86, the `.intel_syntax noprefix` mode of GAS is used by default. -On ARM, the `.syntax unified` mode is used. -These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string. -Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior. -Further constraints on the directives used by inline assembly are indicated by [Directives Support](#directives-support). +Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler which usually corresponds to that of the GNU assembler (GAS). On x86, the `.intel_syntax noprefix` mode of GAS is used by default. On ARM, the `.syntax unified` mode is used. These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string. Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior. 
Further constraints on the directives used by inline assembly are indicated by [Directives Support](#directives-support). [format-syntax]: std::fmt#syntax [rfc-2795]: https://github.com/rust-lang/rfcs/pull/2795 @@ -316,8 +306,7 @@ Several types of operands are supported: r[asm.operand-type.supported-operands.in] * `in(<reg>) <expr>` - - `<reg>` can refer to a register class or an explicit register. - The allocated register name is substituted into the asm template string. + - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string. - The allocated register will contain the value of `<expr>` at the start of the assembly code. - The allocated register must contain the same value at the end of the assembly code (except if a `lateout` is allocated to the same register). @@ -330,8 +319,7 @@ unsafe { core::arch::asm!("/* {} */", in(reg) 5); } r[asm.operand-type.supported-operands.out] * `out(<reg>) <expr>` - - `<reg>` can refer to a register class or an explicit register. - The allocated register name is substituted into the asm template string. + - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string. - The allocated register will contain an undefined value at the start of the assembly code. - `<expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the assembly code. - An underscore (`_`) may be specified instead of an expression, which will cause the contents of the register to be discarded at the end of the assembly code (effectively acting as a clobber). @@ -362,8 +350,7 @@ assert_eq!(x, 5) r[asm.operand-type.supported-operands.inout] * `inout(<reg>) <expr>` - - `<reg>` can refer to a register class or an explicit register. - The allocated register name is substituted into the asm template string. + - `<reg>` can refer to a register class or an explicit register.
The allocated register name is substituted into the asm template string. - The allocated register will contain the value of `<expr>` at the start of the assembly code. - `<expr>` must be a mutable initialized place expression, to which the contents of the allocated register are written at the end of the assembly code. @@ -466,9 +453,7 @@ unsafe { ``` r[asm.operand-type.left-to-right] -Operand expressions are evaluated from left to right, just like function call arguments. -After the `asm!` has executed, outputs are written to in left to right order. -This is significant if two outputs point to the same place: that place will contain the value of the rightmost output. +Operand expressions are evaluated from left to right, just like function call arguments. After the `asm!` has executed, outputs are written to in left to right order. This is significant if two outputs point to the same place: that place will contain the value of the rightmost output. ```rust # #[cfg(target_arch = "x86_64")] { @@ -507,8 +492,7 @@ r[asm.register-operands] ## Register operands r[asm.register-operands.register-or-class] -Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. -Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`). +Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`). ```rust # #[cfg(target_arch = "x86_64")] { @@ -561,8 +545,7 @@ Only the following types are allowed as operands for inline assembly: - Floating-point numbers - Pointers (thin only) - Function pointers -- SIMD vectors (structs defined with `#[repr(simd)]` and which implement `Copy`).
-This includes architecture-specific vector types defined in `std::arch` such as `__m128` (x86) or `int8x16_t` (ARM). +- SIMD vectors (structs defined with `#[repr(simd)]` and which implement `Copy`). This includes architecture-specific vector types defined in `std::arch` such as `__m128` (x86) or `int8x16_t` (ARM). ```rust # #[cfg(target_arch = "x86_64")] { @@ -651,10 +634,7 @@ Here is the list of currently supported register classes: > - Some register classes are marked as "Only clobbers" which means that registers in these classes cannot be used for inputs or outputs, only clobbers of the form `out() _` or `lateout() _`. r[asm.register-operands.value-type-constraints] -Each register class has constraints on which value types they can be used with. -This is necessary because the way a value is loaded into a register depends on its type. -For example, on big-endian systems, loading a `i32x4` and a `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical. -The availability of supported types for a particular register class may depend on what target features are currently enabled. +Each register class has constraints on which value types they can be used with. This is necessary because the way a value is loaded into a register depends on its type. For example, on big-endian systems, loading a `i32x4` and a `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical. The availability of supported types for a particular register class may depend on what target features are currently enabled. 
| Architecture | Register class | Target feature | Allowed types | | ------------ | -------------- | -------------- | ------------- | @@ -718,8 +698,7 @@ unsafe { core::arch::asm!("/* {} */", in(reg) z); } ``` r[asm.register-operands.smaller-value] -If a value is of a smaller size than the register it is allocated in then the upper bits of that register will have an undefined value for inputs and will be ignored for outputs. -The only exception is the `freg` register class on RISC-V where `f32` values are NaN-boxed in a `f64` as required by the RISC-V architecture. +If a value is of a smaller size than the register it is allocated in then the upper bits of that register will have an undefined value for inputs and will be ignored for outputs. The only exception is the `freg` register class on RISC-V where `f32` values are NaN-boxed in a `f64` as required by the RISC-V architecture. ```rust,no_run @@ -735,9 +714,7 @@ assert_eq!(x & 0xFFFFFFFF, 4); // However, this one will succeed ``` r[asm.register-operands.separate-input-output] -When separate input and output expressions are specified for an `inout` operand, both expressions must have the same type. -The only exception is if both operands are pointers or integers, in which case they are only required to have the same size. -This restriction exists because the register allocators in LLVM and GCC sometimes cannot handle tied operands with different types. +When separate input and output expressions are specified for an `inout` operand, both expressions must have the same type. The only exception is if both operands are pointers or integers, in which case they are only required to have the same size. This restriction exists because the register allocators in LLVM and GCC sometimes cannot handle tied operands with different types. ```rust # #[cfg(target_arch = "x86_64")] { @@ -765,9 +742,7 @@ r[asm.register-names] ## Register names r[asm.register-names.supported-register-aliases] -Some registers have multiple names.
-These are all treated by the compiler as identical to the base register name. -Here is the list of all supported register aliases: +Some registers have multiple names. These are all treated by the compiler as identical to the base register name. Here is the list of all supported register aliases: | Architecture | Base register | Aliases | | ------------ | ------------- | ------- | @@ -882,8 +857,7 @@ r[asm.template-modifiers] ## Template modifiers r[asm.template-modifiers.intro] -The placeholders can be augmented by modifiers which are specified after the `:` in the curly braces. -These modifiers do not affect register allocation, but change the way operands are formatted when inserted into the template string. +The placeholders can be augmented by modifiers which are specified after the `:` in the curly braces. These modifiers do not affect register allocation, but change the way operands are formatted when inserted into the template string. r[asm.template-modifiers.only-one] Only one modifier is allowed per template placeholder. @@ -943,8 +917,7 @@ The supported modifiers are a subset of LLVM's (and GCC's) [asm template argumen > [!NOTE] > - on ARM `e` / `f`: this prints the low or high doubleword register name of a NEON quad (128-bit) register. -> - on x86: our behavior for `reg` with no modifiers differs from what GCC does. -> GCC will infer the modifier based on the operand value type, while we default to the full register size. +> - on x86: our behavior for `reg` with no modifiers differs from what GCC does. GCC will infer the modifier based on the operand value type, while we default to the full register size. > - on x86 `xmm_reg`: the `x`, `t` and `g` LLVM modifiers are not yet implemented in LLVM (they are supported by GCC only), but this should be a simple change. 
```rust @@ -959,10 +932,7 @@ assert_eq!(x, 0x1000u16); ``` r[asm.template-modifiers.smaller-value] -As stated in the previous section, passing an input value smaller than the register width will result in the upper bits of the register containing undefined values. -This is not a problem if the inline asm only accesses the lower bits of the register, which can be done by using a template modifier to use a subregister name in the assembly code (e.g. `ax` instead of `rax`). -Since this an easy pitfall, the compiler will suggest a template modifier to use where appropriate given the input type. -If all references to an operand already have modifiers then the warning is suppressed for that operand. +As stated in the previous section, passing an input value smaller than the register width will result in the upper bits of the register containing undefined values. This is not a problem if the inline asm only accesses the lower bits of the register, which can be done by using a template modifier to use a subregister name in the assembly code (e.g. `ax` instead of `rax`). Since this an easy pitfall, the compiler will suggest a template modifier to use where appropriate given the input type. If all references to an operand already have modifiers then the warning is suppressed for that operand. [llvm-argmod]: http://llvm.org/docs/LangRef.html#asm-template-argument-modifiers @@ -970,8 +940,7 @@ r[asm.abi-clobbers] ## ABI clobbers r[asm.abi-clobbers.intro] -The `clobber_abi` keyword can be used to apply a default set of clobbers to the assembly code. -This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then `lateout("...") _` is implicitly added to the operands list (where the `...` is replaced by the register's name). 
+The `clobber_abi` keyword can be used to apply a default set of clobbers to the assembly code. This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then `lateout("...") _` is implicitly added to the operands list (where the `...` is replaced by the register's name). ```rust # #[cfg(target_arch = "x86_64")] { @@ -1063,13 +1032,10 @@ r[asm.options] ## Options r[asm.options.supported-options] -Flags are used to further influence the behavior of the inline assembly code. -Currently the following options are defined: +Flags are used to further influence the behavior of the inline assembly code. Currently the following options are defined: r[asm.options.supported-options.pure] -- `pure`: The assembly code has no side effects, must eventually return, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` option is also set). - This allows the compiler to execute the assembly code fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used. - The `pure` option must be combined with either the `nomem` or `readonly` options, otherwise a compile-time error is emitted. +- `pure`: The assembly code has no side effects, must eventually return, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` option is also set). This allows the compiler to execute the assembly code fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used. The `pure` option must be combined with either the `nomem` or `readonly` options, otherwise a compile-time error is emitted.
```rust # #[cfg(target_arch = "x86_64")] { @@ -1095,9 +1061,7 @@ assert_eq!(z, 0); ``` r[asm.options.supported-options.nomem] -- `nomem`: The assembly code does not read from or write to any memory accessible outside of the assembly code. - This allows the compiler to cache the values of modified global variables in registers across execution of the assembly code since it knows that they are not read from or written to by it. - The compiler also assumes that the assembly code does not perform any kind of synchronization with other threads, e.g. via fences. +- `nomem`: The assembly code does not read from or write to any memory accessible outside of the assembly code. This allows the compiler to cache the values of modified global variables in registers across execution of the assembly code since it knows that they are not read from or written to by it. The compiler also assumes that the assembly code does not perform any kind of synchronization with other threads, e.g. via fences. ```rust,no_run @@ -1143,9 +1107,7 @@ assert_eq!(z, 1); ``` r[asm.options.supported-options.readonly] -- `readonly`: The assembly code does not write to any memory accessible outside of the assembly code. - This allows the compiler to cache the values of unmodified global variables in registers across execution of the assembly code since it knows that they are not written to by it. - The compiler also assumes that this assembly code does not perform any kind of synchronization with other threads, e.g. via fences. +- `readonly`: The assembly code does not write to any memory accessible outside of the assembly code. This allows the compiler to cache the values of unmodified global variables in registers across execution of the assembly code since it knows that they are not written to by it. The compiler also assumes that this assembly code does not perform any kind of synchronization with other threads, e.g. via fences. 
```rust,no_run @@ -1189,8 +1151,7 @@ assert_eq!(z, 1); ``` r[asm.options.supported-options.preserves_flags] -- `preserves_flags`: The assembly code does not modify the flags register (defined in the rules below). - This allows the compiler to avoid recomputing the condition flags after execution of the assembly code. +- `preserves_flags`: The assembly code does not modify the flags register (defined in the rules below). This allows the compiler to avoid recomputing the condition flags after execution of the assembly code. r[asm.options.supported-options.noreturn] - `noreturn`: The assembly code does not fall through; behavior is undefined if it does. It may still jump to `label` blocks. If any `label` blocks return unit, the `asm!` block will return unit. Otherwise it will return `!` (never). As with a call to a function that does not return, local variables in scope are not dropped before execution of the assembly code. @@ -1225,8 +1186,7 @@ let _: () = unsafe { ``` r[asm.options.supported-options.nostack] -- `nostack`: The assembly code does not push data to the stack, or write to the stack red-zone (if supported by the target). - If this option is *not* used then the stack pointer is guaranteed by the compiler at the start of the assembly code to be suitably aligned (according to the target ABI) for a function call. +- `nostack`: The assembly code does not push data to the stack, or write to the stack red-zone (if supported by the target). If this option is *not* used then the stack pointer is guaranteed by the compiler at the start of the assembly code to be suitably aligned (according to the target ABI) for a function call. ```rust,no_run @@ -1237,8 +1197,7 @@ unsafe { core::arch::asm!("push rax", "pop rax", options(nostack)); } ``` r[asm.options.supported-options.att_syntax] -- `att_syntax`: This option is only valid on x86, and causes the assembler to use the `.att_syntax prefix` mode of the GNU assembler. 
- Register operands are substituted in with a leading `%`. +- `att_syntax`: This option is only valid on x86, and causes the assembler to use the `.att_syntax prefix` mode of the GNU assembler. Register operands are substituted in with a leading `%`. ```rust # #[cfg(target_arch = "x86_64")] { @@ -1257,8 +1216,7 @@ assert_eq!(x, y); ``` r[asm.options.supported-options.raw] -- `raw`: This causes the template string to be parsed as a raw assembly string, with no special handling for `{` and `}`. - This is primarily useful when including raw assembly code from an external file using `include_str!`. +- `raw`: This causes the template string to be parsed as a raw assembly string, with no special handling for `{` and `}`. This is primarily useful when including raw assembly code from an external file using `include_str!`. r[asm.options.checks] The compiler performs some additional checks on options: @@ -1325,15 +1283,12 @@ To avoid undefined behavior, these rules must be followed when using function-sc r[asm.rules.reg-not-input] - Any registers not specified as inputs will contain an undefined value on entry to the assembly code. - - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture. - Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code). + - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture. Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code). r[asm.rules.reg-not-output] - Any registers not specified as outputs must have the same value upon exiting the assembly code as they had on entry, otherwise behavior is undefined. 
- - This only applies to registers which can be specified as an input or output. - Other registers follow target-specific rules. - - Note that a `lateout` may be allocated to the same register as an `in`, in which case this rule does not apply. - Code should not rely on this however since it depends on the results of register allocation. + - This only applies to registers which can be specified as an input or output. Other registers follow target-specific rules. + - Note that a `lateout` may be allocated to the same register as an `in`, in which case this rule does not apply. Code should not rely on this however since it depends on the results of register allocation. r[asm.rules.unwind] - Behavior is undefined if execution unwinds out of the assembly code. @@ -1362,8 +1317,7 @@ r[asm.rules.noreturn] - If the `noreturn` option is set then behavior is undefined if execution falls through the end of the assembly code. r[asm.rules.pure] -- If the `pure` option is set then behavior is undefined if the `asm!` has side-effects other than its direct outputs. - Behavior is also undefined if two executions of the `asm!` code with the same inputs result in different outputs. +- If the `pure` option is set then behavior is undefined if the `asm!` has side-effects other than its direct outputs. Behavior is also undefined if two executions of the `asm!` code with the same inputs result in different outputs. - When used with the `nomem` option, "inputs" are just the direct inputs of the `asm!`. - When used with the `readonly` option, "inputs" comprise the direct inputs of the assembly code and any memory that it is allowed to read. @@ -1439,8 +1393,7 @@ r[asm.rules.only-on-exit] - The requirement of restoring the stack pointer and non-output registers to their original value only applies when exiting the assembly code. 
- This means that assembly code that does not fall through and does not jump to any `label` blocks, even if not marked `noreturn`, doesn't need to preserve these registers. - When returning to the assembly code of a different `asm!` block than you entered (e.g. for context switching), these registers must contain the value they had upon entering the `asm!` block that you are *exiting*. - - You cannot exit the assembly code of an `asm!` block that has not been entered. - Neither can you exit the assembly code of an `asm!` block whose assembly code has already been exited (without first entering it again). + - You cannot exit the assembly code of an `asm!` block that has not been entered. Neither can you exit the assembly code of an `asm!` block whose assembly code has already been exited (without first entering it again). - You are responsible for switching any target-specific state (e.g. thread-local storage, stack bounds). - You cannot jump from an address in one `asm!` block to an address in another, even within the same function or block, without treating their contexts as potentially different and requiring context switching. You cannot assume that any particular value in those contexts (e.g. current stack pointer or temporary values below the stack pointer) will remain unchanged between the two `asm!` blocks. - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited. @@ -1449,8 +1402,7 @@ r[asm.rules.not-successive] - You cannot assume that two `asm!` blocks adjacent in source code, even without any other code between them, will end up in successive addresses in the binary without any other instructions between them. r[asm.rules.not-exactly-once] -- You cannot assume that an `asm!` block will appear exactly once in the output binary. - The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places. 
+- You cannot assume that an `asm!` block will appear exactly once in the output binary. The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places. r[asm.rules.x86-prefix-restriction] - On x86, inline assembly must not end with an instruction prefix (such as `LOCK`) that would apply to instructions generated by the compiler. @@ -1567,13 +1519,7 @@ r[asm.validity] ### Correctness and validity r[asm.validity.necessary-but-not-sufficient] -In addition to all of the previous rules, the string argument to `asm!` must ultimately become--- -after all other arguments are evaluated, formatting is performed, and operands are translated--- -assembly that is both syntactically correct and semantically valid for the target architecture. -The formatting rules allow the compiler to generate assembly with correct syntax. -Rules concerning operands permit valid translation of Rust operands into and out of the assembly code. -Adherence to these rules is necessary, but not sufficient, for the final expanded assembly to be -both correct and valid. For instance: +In addition to all of the previous rules, the string argument to `asm!` must ultimately become---after all other arguments are evaluated, formatting is performed, and operands are translated---assembly that is both syntactically correct and semantically valid for the target architecture. The formatting rules allow the compiler to generate assembly with correct syntax. Rules concerning operands permit valid translation of Rust operands into and out of the assembly code. Adherence to these rules is necessary, but not sufficient, for the final expanded assembly to be both correct and valid. For instance: - arguments may be placed in positions which are syntactically incorrect after formatting - an instruction may be correctly written, but given architecturally invalid operands @@ -1581,20 +1527,13 @@ both correct and valid. 
For instance: - a set of instructions, each correct and valid, may cause undefined behavior if placed in immediate succession r[asm.validity.non-exhaustive] -As a result, these rules are _non-exhaustive_. The compiler is not required to check the -correctness and validity of the initial string nor the final assembly that is generated. -The assembler may check for correctness and validity but is not required to do so. -When using `asm!`, a typographical error may be sufficient to make a program unsound, -and the rules for assembly may include thousands of pages of architectural reference manuals. -Programmers should exercise appropriate care, as invoking this `unsafe` capability comes with -assuming the responsibility of not violating rules of both the compiler or the architecture. +As a result, these rules are _non-exhaustive_. The compiler is not required to check the correctness and validity of the initial string nor the final assembly that is generated. The assembler may check for correctness and validity but is not required to do so. When using `asm!`, a typographical error may be sufficient to make a program unsound, and the rules for assembly may include thousands of pages of architectural reference manuals. Programmers should exercise appropriate care, as invoking this `unsafe` capability comes with assuming the responsibility of not violating rules of both the compiler or the architecture. r[asm.directives] ### Directives support r[asm.directives.subset-supported] -Inline assembly supports a subset of the directives supported by both GNU AS and LLVM's internal assembler, given as follows. -The result of using other directives is assembler-specific (and may cause an error, or may be accepted as-is). +Inline assembly supports a subset of the directives supported by both GNU AS and LLVM's internal assembler, given as follows. The result of using other directives is assembler-specific (and may cause an error, or may be accepted as-is). 
r[asm.directives.stateful] If inline assembly includes any "stateful" directive that modifies how subsequent assembly is processed, the assembly code must undo the effects of any such directives before the inline assembly ends. @@ -1723,8 +1662,7 @@ On x86 targets, both 32-bit and 64-bit, the following additional directives are - `.code32` - `.code64` -Use of the `.code16`, `.code32`, and `.code64` directives is only supported if the state is reset to the default before exiting the assembly code. -32-bit x86 uses `.code32` by default, and x86_64 uses `.code64` by default. +Use of the `.code16`, `.code32`, and `.code64` directives is only supported if the state is reset to the default before exiting the assembly code. 32-bit x86 uses `.code32` by default, and x86_64 uses `.code64` by default. r[asm.target-specific-directives.arm-32-bit] ##### ARM (32-bit) diff --git a/src/macro-ambiguity.md b/src/macro-ambiguity.md index 47dc60f1fb..231fd2c8cf 100644 --- a/src/macro-ambiguity.md +++ b/src/macro-ambiguity.md @@ -1,9 +1,7 @@ r[macro.ambiguity] # Appendix: Macro follow-set ambiguity formal specification -This page documents the formal specification of the follow rules for [Macros -By Example]. They were originally specified in [RFC 550], from which the bulk -of this text is copied, and expanded upon in subsequent RFCs. +This page documents the formal specification of the follow rules for [Macros By Example]. They were originally specified in [RFC 550], from which the bulk of this text is copied, and expanded upon in subsequent RFCs. r[macro.ambiguity.convention] ## Definitions & conventions @@ -11,34 +9,21 @@ r[macro.ambiguity.convention] r[macro.ambiguity.convention.defs] - `macro`: anything invocable as `foo!(...)` in source code. - `MBE`: macro-by-example, a macro defined by `macro_rules`. - - `matcher`: the left-hand-side of a rule in a `macro_rules` invocation, or a - subportion thereof.
- - `macro parser`: the bit of code in the Rust parser that will parse the - input using a grammar derived from all of the matchers. - - `fragment`: The class of Rust syntax that a given matcher will accept (or - "match"). + - `matcher`: the left-hand-side of a rule in a `macro_rules` invocation, or a subportion thereof. + - `macro parser`: the bit of code in the Rust parser that will parse the input using a grammar derived from all of the matchers. + - `fragment`: The class of Rust syntax that a given matcher will accept (or "match"). - `repetition` : a fragment that follows a regular repeating pattern - - `NT`: non-terminal, the various "meta-variables" or repetition matchers - that can appear in a matcher, specified in MBE syntax with a leading `$` - character. + - `NT`: non-terminal, the various "meta-variables" or repetition matchers that can appear in a matcher, specified in MBE syntax with a leading `$` character. - `simple NT`: a "meta-variable" non-terminal (further discussion below). - - `complex NT`: a repetition matching non-terminal, specified via repetition - operators (`*`, `+`, `?`). - - `token`: an atomic element of a matcher; i.e. identifiers, operators, - open/close delimiters, *and* simple NT's. - - `token tree`: a tree structure formed from tokens (the leaves), complex - NT's, and finite sequences of token trees. - - `delimiter token`: a token that is meant to divide the end of one fragment - and the start of the next fragment. - - `separator token`: an optional delimiter token in an complex NT that - separates each pair of elements in the matched repetition. + - `complex NT`: a repetition matching non-terminal, specified via repetition operators (`*`, `+`, `?`). + - `token`: an atomic element of a matcher; i.e. identifiers, operators, open/close delimiters, *and* simple NT's. + - `token tree`: a tree structure formed from tokens (the leaves), complex NT's, and finite sequences of token trees. 
+ - `delimiter token`: a token that is meant to divide the end of one fragment and the start of the next fragment. + - `separator token`: an optional delimiter token in a complex NT that separates each pair of elements in the matched repetition. - `separated complex NT`: a complex NT that has its own separator token. - - `delimited sequence`: a sequence of token trees with appropriate open- and - close-delimiters at the start and end of the sequence. - - `empty fragment`: The class of invisible Rust syntax that separates tokens, - i.e. whitespace, or (in some lexical contexts), the empty token sequence. - - `fragment specifier`: The identifier in a simple NT that specifies which - fragment the NT accepts. + - `delimited sequence`: a sequence of token trees with appropriate open- and close-delimiters at the start and end of the sequence. + - `empty fragment`: The class of invisible Rust syntax that separates tokens, i.e. whitespace, or (in some lexical contexts), the empty token sequence. + - `fragment specifier`: The identifier in a simple NT that specifies which fragment the NT accepts. - `language`: a context-free language. Example: @@ -50,103 +35,52 @@ macro_rules! i_am_an_mbe { ``` r[macro.ambiguity.convention.matcher] -`(start $foo:expr $($i:ident),* end)` is a matcher. The whole matcher is a -delimited sequence (with open- and close-delimiters `(` and `)`), and `$foo` -and `$i` are simple NT's with `expr` and `ident` as their respective fragment -specifiers. +`(start $foo:expr $($i:ident),* end)` is a matcher. The whole matcher is a delimited sequence (with open- and close-delimiters `(` and `)`), and `$foo` and `$i` are simple NT's with `expr` and `ident` as their respective fragment specifiers. r[macro.ambiguity.convention.complex-nt] -`$($i:ident),*` is *also* an NT; it is a complex NT that matches a -comma-separated repetition of identifiers.
The `,` is the separator token for -the complex NT; it occurs in between each pair of elements (if any) of the -matched fragment. +`$($i:ident),*` is *also* an NT; it is a complex NT that matches a comma-separated repetition of identifiers. The `,` is the separator token for the complex NT; it occurs in between each pair of elements (if any) of the matched fragment. -Another example of a complex NT is `$(hi $e:expr ;)+`, which matches any -fragment of the form `hi <expr>; hi <expr>; ...` where `hi <expr>;` occurs at -least once. Note that this complex NT does not have a dedicated separator -token. +Another example of a complex NT is `$(hi $e:expr ;)+`, which matches any fragment of the form `hi <expr>; hi <expr>; ...` where `hi <expr>;` occurs at least once. Note that this complex NT does not have a dedicated separator token. -(Note that Rust's parser ensures that delimited sequences always occur with -proper nesting of token tree structure and correct matching of open- and -close-delimiters.) +(Note that Rust's parser ensures that delimited sequences always occur with proper nesting of token tree structure and correct matching of open- and close-delimiters.) r[macro.ambiguity.convention.vars] -We will tend to use the variable "M" to stand for a matcher, variables "t" and -"u" for arbitrary individual tokens, and the variables "tt" and "uu" for -arbitrary token trees. (The use of "tt" does present potential ambiguity with -its additional role as a fragment specifier; but it will be clear from context -which interpretation is meant.) +We will tend to use the variable "M" to stand for a matcher, variables "t" and "u" for arbitrary individual tokens, and the variables "tt" and "uu" for arbitrary token trees. (The use of "tt" does present potential ambiguity with its additional role as a fragment specifier; but it will be clear from context which interpretation is meant.)
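
As a rough illustration of the matcher terminology defined above, the sketch below uses two hypothetical macros (`first_and_names` and `sum_hi`, along with the `demo_*` helpers, are names invented here and are not part of the reference). The first matcher combines a simple NT with a comma-separated complex NT; the second uses an unseparated complex NT in which `;` belongs to each repeated element. A `;` is placed after `$first:expr` on the assumption that only `=>`, `,`, or `;` may follow an `expr` fragment.

```rust
// Simple NTs: `$first:expr`, `$rest:ident`, `$e:expr`.
// Complex NTs: `$($rest:ident),*` (separator token `,`) and
// `$(hi $e:expr ;)+` (no separator; `;` is part of each element).

// Hypothetical macro: yields the value of the first expression together with
// the names of the identifiers that follow the `;`.
macro_rules! first_and_names {
    (start $first:expr ; $($rest:ident),*) => {{
        let names: Vec<&'static str> = vec![$(stringify!($rest)),*];
        ($first, names)
    }};
}

// Hypothetical macro: sums every expression written as `hi <expr>;`.
// The `+` operator requires at least one `hi <expr>;` element.
macro_rules! sum_hi {
    ($(hi $e:expr ;)+) => { 0 $(+ $e)+ };
}

fn demo_separated() -> (i32, Vec<&'static str>) {
    // `1 + 2` is matched by `$first`; `a`, `b`, `c` by the complex NT.
    first_and_names!(start 1 + 2 ; a, b, c)
}

fn demo_unseparated() -> i32 {
    // Three repetitions of the fragment `hi <expr>;`.
    sum_hi!(hi 1; hi 2 * 3; hi 4;)
}
```

Calling `demo_separated()` exercises the separated repetition, while `demo_unseparated()` exercises the repetition without a dedicated separator token.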
r[macro.ambiguity.convention.set] -"SEP" will range over separator tokens, "OP" over the repetition operators -`*`, `+`, and `?`, "OPEN"/"CLOSE" over matching token pairs surrounding a -delimited sequence (e.g. `[` and `]`). +"SEP" will range over separator tokens, "OP" over the repetition operators `*`, `+`, and `?`, "OPEN"/"CLOSE" over matching token pairs surrounding a delimited sequence (e.g. `[` and `]`). r[macro.ambiguity.convention.sequence-vars] -Greek letters "α" "β" "γ" "δ" stand for potentially empty token-tree sequences. -(However, the Greek letter "ε" (epsilon) has a special role in the presentation -and does not stand for a token-tree sequence.) - - * This Greek letter convention is usually just employed when the presence of - a sequence is a technical detail; in particular, when we wish to *emphasize* - that we are operating on a sequence of token-trees, we will use the notation - "tt ..." for the sequence, not a Greek letter. - -Note that a matcher is merely a token tree. A "simple NT", as mentioned above, -is an meta-variable NT; thus it is a non-repetition. For example, `$foo:ty` is -a simple NT but `$($foo:ty)+` is a complex NT. - -Note also that in the context of this formalism, the term "token" generally -*includes* simple NTs. - -Finally, it is useful for the reader to keep in mind that according to the -definitions of this formalism, no simple NT matches the empty fragment, and -likewise no token matches the empty fragment of Rust syntax. (Thus, the *only* -NT that can match the empty fragment is a complex NT.) This is not actually -true, because the `vis` matcher can match an empty fragment. Thus, for the -purposes of the formalism, we will treat `$v:vis` as actually being -`$($v:vis)?`, with a requirement that the matcher match an empty fragment. +Greek letters "α" "β" "γ" "δ" stand for potentially empty token-tree sequences. 
(However, the Greek letter "ε" (epsilon) has a special role in the presentation and does not stand for a token-tree sequence.) + + * This Greek letter convention is usually just employed when the presence of a sequence is a technical detail; in particular, when we wish to *emphasize* that we are operating on a sequence of token-trees, we will use the notation "tt ..." for the sequence, not a Greek letter. + +Note that a matcher is merely a token tree. A "simple NT", as mentioned above, is a meta-variable NT; thus it is a non-repetition. For example, `$foo:ty` is a simple NT but `$($foo:ty)+` is a complex NT. + +Note also that in the context of this formalism, the term "token" generally *includes* simple NTs. + +Finally, it is useful for the reader to keep in mind that according to the definitions of this formalism, no simple NT matches the empty fragment, and likewise no token matches the empty fragment of Rust syntax. (Thus, the *only* NT that can match the empty fragment is a complex NT.) This is not actually true, because the `vis` matcher can match an empty fragment. Thus, for the purposes of the formalism, we will treat `$v:vis` as actually being `$($v:vis)?`, with a requirement that the matcher match an empty fragment. r[macro.ambiguity.invariant] ### The matcher invariants r[macro.ambiguity.invariant.list] -To be valid, a matcher must meet the following three invariants. The definitions -of FIRST and FOLLOW are described later. +To be valid, a matcher must meet the following three invariants. The definitions of FIRST and FOLLOW are described later. -1. For any two successive token tree sequences in a matcher `M` (i.e. `M = ... - tt uu ...`) with `uu ...` nonempty, we must have FOLLOW(`... tt`) ∪ {ε} ⊇ - FIRST(`uu ...`). -1. For any separated complex NT in a matcher, `M = ... $(tt ...) SEP OP ...`, - we must have `SEP` ∈ FOLLOW(`tt ...`). -1. For an unseparated complex NT in a matcher, `M = ... $(tt ...)
OP ...`, if - OP = `*` or `+`, we must have FOLLOW(`tt ...`) ⊇ FIRST(`tt ...`). +1. For any two successive token tree sequences in a matcher `M` (i.e. `M = ... tt uu ...`) with `uu ...` nonempty, we must have FOLLOW(`... tt`) ∪ {ε} ⊇ FIRST(`uu ...`). +1. For any separated complex NT in a matcher, `M = ... $(tt ...) SEP OP ...`, we must have `SEP` ∈ FOLLOW(`tt ...`). +1. For an unseparated complex NT in a matcher, `M = ... $(tt ...) OP ...`, if OP = `*` or `+`, we must have FOLLOW(`tt ...`) ⊇ FIRST(`tt ...`). r[macro.ambiguity.invariant.follow-matcher] -The first invariant says that whatever actual token that comes after a matcher, -if any, must be somewhere in the predetermined follow set. This ensures that a -legal macro definition will continue to assign the same determination as to -where `... tt` ends and `uu ...` begins, even as new syntactic forms are added -to the language. +The first invariant says that whatever actual token that comes after a matcher, if any, must be somewhere in the predetermined follow set. This ensures that a legal macro definition will continue to assign the same determination as to where `... tt` ends and `uu ...` begins, even as new syntactic forms are added to the language. r[macro.ambiguity.invariant.separated-complex-nt] -The second invariant says that a separated complex NT must use a separator token -that is part of the predetermined follow set for the internal contents of the -NT. This ensures that a legal macro definition will continue to parse an input -fragment into the same delimited sequence of `tt ...`'s, even as new syntactic -forms are added to the language. +The second invariant says that a separated complex NT must use a separator token that is part of the predetermined follow set for the internal contents of the NT. This ensures that a legal macro definition will continue to parse an input fragment into the same delimited sequence of `tt ...`'s, even as new syntactic forms are added to the language. 
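
The first two invariants can be illustrated with a small sketch. The macros `pair` and `exprs` and the `demo_*` helpers are hypothetical names chosen for this example, and the comments assume that the follow set for an `expr` fragment consists of `=>`, `,`, and `;`.

```rust
// Invariant 1: the token after `$e:expr` is `,`, which is assumed to be in
// the follow set of `expr`. Writing `$e:expr $i:ident` instead would place
// an identifier directly after an `expr` fragment and should be rejected.
macro_rules! pair {
    ($e:expr , $i:ident) => { ($e, stringify!($i)) };
}

// Invariant 2: the separator token `;` is assumed to be in the follow set of
// the repeated contents `$e:expr`. A separator outside that set, such as `|`,
// should not be accepted in this position.
macro_rules! exprs {
    ($($e:expr);*) => { vec![$($e),*] };
}

fn demo_pair() -> (i32, &'static str) {
    // `2 * 3` is matched as one `expr`; `total` as the `ident`.
    pair!(2 * 3, total)
}

fn demo_exprs() -> Vec<i32> {
    // Three expressions separated by `;`.
    exprs!(1; 1 + 1; 3)
}
```

Because the boundary between fragments is fixed by tokens from the follow set, these matchers would keep splitting their input the same way even if the expression grammar were extended.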
r[macro.ambiguity.invariant.unseparated-complex-nt] -The third invariant says that when we have a complex NT that can match two or -more copies of the same thing with no separation in between, it must be -permissible for them to be placed next to each other as per the first invariant. -This invariant also requires they be nonempty, which eliminates a possible -ambiguity. +The third invariant says that when we have a complex NT that can match two or more copies of the same thing with no separation in between, it must be permissible for them to be placed next to each other as per the first invariant. This invariant also requires they be nonempty, which eliminates a possible ambiguity. -**NOTE: The third invariant is currently unenforced due to historical oversight -and significant reliance on the behaviour. It is currently undecided what to do -about this going forward. Macros that do not respect the behaviour may become -invalid in a future edition of Rust. See the [tracking issue].** +**NOTE: The third invariant is currently unenforced due to historical oversight and significant reliance on the behaviour. It is currently undecided what to do about this going forward. Macros that do not respect the behaviour may become invalid in a future edition of Rust. See the [tracking issue].** r[macro.ambiguity.sets] ### FIRST and FOLLOW, informally @@ -154,26 +88,20 @@ r[macro.ambiguity.sets] r[macro.ambiguity.sets.intro] A given matcher M maps to three sets: FIRST(M), LAST(M) and FOLLOW(M). -Each of the three sets is made up of tokens. FIRST(M) and LAST(M) may also -contain a distinguished non-token element ε ("epsilon"), which indicates that M -can match the empty fragment. (But FOLLOW(M) is always just a set of tokens.) +Each of the three sets is made up of tokens. FIRST(M) and LAST(M) may also contain a distinguished non-token element ε ("epsilon"), which indicates that M can match the empty fragment. (But FOLLOW(M) is always just a set of tokens.) 
Informally: r[macro.ambiguity.sets.first] - * FIRST(M): collects the tokens potentially used first when matching a - fragment to M. + * FIRST(M): collects the tokens potentially used first when matching a fragment to M. r[macro.ambiguity.sets.last] - * LAST(M): collects the tokens potentially used last when matching a fragment - to M. + * LAST(M): collects the tokens potentially used last when matching a fragment to M. r[macro.ambiguity.sets.follow] - * FOLLOW(M): the set of tokens allowed to follow immediately after some - fragment matched by M. + * FOLLOW(M): the set of tokens allowed to follow immediately after some fragment matched by M. - In other words: t ∈ FOLLOW(M) if and only if there exists (potentially - empty) token sequences α, β, γ, δ where: + In other words: t ∈ FOLLOW(M) if and only if there exists (potentially empty) token sequences α, β, γ, δ where: * M matches β, @@ -182,14 +110,9 @@ r[macro.ambiguity.sets.follow] * The concatenation α β γ δ is a parseable Rust program. r[macro.ambiguity.sets.universe] -We use the shorthand ANYTOKEN to denote the set of all tokens (including simple -NTs). For example, if any token is legal after a matcher M, then FOLLOW(M) = -ANYTOKEN. +We use the shorthand ANYTOKEN to denote the set of all tokens (including simple NTs). For example, if any token is legal after a matcher M, then FOLLOW(M) = ANYTOKEN. -(To review one's understanding of the above informal descriptions, the reader -at this point may want to jump ahead to the [examples of -FIRST/LAST](#examples-of-first-and-last) before reading their formal -definitions.) +(To review one's understanding of the above informal descriptions, the reader at this point may want to jump ahead to the [examples of FIRST/LAST](#examples-of-first-and-last) before reading their formal definitions.) r[macro.ambiguity.sets.def] ### FIRST, LAST @@ -198,15 +121,13 @@ r[macro.ambiguity.sets.def.intro] Below are formal inductive definitions for FIRST and LAST. 
r[macro.ambiguity.sets.def.notation] -"A ∪ B" denotes set union, "A ∩ B" denotes set intersection, and "A \ B" -denotes set difference (i.e. all elements of A that are not present in B). +"A ∪ B" denotes set union, "A ∩ B" denotes set intersection, and "A \ B" denotes set difference (i.e. all elements of A that are not present in B). r[macro.ambiguity.sets.def.first] #### FIRST r[macro.ambiguity.sets.def.first.intro] -FIRST(M) is defined by case analysis on the sequence M and the structure of its -first token-tree (if any): +FIRST(M) is defined by case analysis on the sequence M and the structure of its first token-tree (if any): r[macro.ambiguity.sets.def.first.epsilon] * if M is the empty sequence, then FIRST(M) = { ε }, @@ -214,45 +135,21 @@ r[macro.ambiguity.sets.def.first.epsilon] r[macro.ambiguity.sets.def.first.token] * if M starts with a token t, then FIRST(M) = { t }, - (Note: this covers the case where M starts with a delimited token-tree - sequence, `M = OPEN tt ... CLOSE ...`, in which case `t = OPEN` and thus - FIRST(M) = { `OPEN` }.) + (Note: this covers the case where M starts with a delimited token-tree sequence, `M = OPEN tt ... CLOSE ...`, in which case `t = OPEN` and thus FIRST(M) = { `OPEN` }.) - (Note: this critically relies on the property that no simple NT matches the - empty fragment.) + (Note: this critically relies on the property that no simple NT matches the empty fragment.) r[macro.ambiguity.sets.def.first.complex] - * Otherwise, M is a token-tree sequence starting with a complex NT: `M = $( tt - ... ) OP α`, or `M = $( tt ... ) SEP OP α`, (where `α` is the (potentially - empty) sequence of token trees for the rest of the matcher). + * Otherwise, M is a token-tree sequence starting with a complex NT: `M = $( tt ... ) OP α`, or `M = $( tt ... ) SEP OP α`, (where `α` is the (potentially empty) sequence of token trees for the rest of the matcher). 
- * Let SEP\_SET(M) = { SEP } if SEP is present and ε ∈ FIRST(`tt ...`); - otherwise SEP\_SET(M) = {}. + * Let SEP\_SET(M) = { SEP } if SEP is present and ε ∈ FIRST(`tt ...`); otherwise SEP\_SET(M) = {}. - * Let ALPHA\_SET(M) = FIRST(`α`) if OP = `*` or `?` and ALPHA\_SET(M) = {} if - OP = `+`. + * Let ALPHA\_SET(M) = FIRST(`α`) if OP = `*` or `?` and ALPHA\_SET(M) = {} if OP = `+`. * FIRST(M) = (FIRST(`tt ...`) \\ {ε}) ∪ SEP\_SET(M) ∪ ALPHA\_SET(M). -The definition for complex NTs deserves some justification. SEP\_SET(M) defines -the possibility that the separator could be a valid first token for M, which -happens when there is a separator defined and the repeated fragment could be -empty. ALPHA\_SET(M) defines the possibility that the complex NT could be empty, -meaning that M's valid first tokens are those of the following token-tree -sequences `α`. This occurs when either `*` or `?` is used, in which case there -could be zero repetitions. In theory, this could also occur if `+` was used with -a potentially-empty repeating fragment, but this is forbidden by the third -invariant. - -From there, clearly FIRST(M) can include any token from SEP\_SET(M) or -ALPHA\_SET(M), and if the complex NT match is nonempty, then any token starting -FIRST(`tt ...`) could work too. The last piece to consider is ε. SEP\_SET(M) and -FIRST(`tt ...`) \ {ε} cannot contain ε, but ALPHA\_SET(M) could. Hence, this -definition allows M to accept ε if and only if ε ∈ ALPHA\_SET(M). This is -correct because for M to accept ε in the complex NT case, both the complex NT -and α must accept it. If OP = `+`, meaning that the complex NT cannot be empty, -then by definition ε ∉ ALPHA\_SET(M). Otherwise, the complex NT can accept zero -repetitions, and then ALPHA\_SET(M) = FIRST(`α`). So this definition is correct -with respect to ε as well. +The definition for complex NTs deserves some justification.
SEP\_SET(M) defines the possibility that the separator could be a valid first token for M, which happens when there is a separator defined and the repeated fragment could be empty. ALPHA\_SET(M) defines the possibility that the complex NT could be empty, meaning that M's valid first tokens are those of the following token-tree sequences `α`. This occurs when either `*` or `?` is used, in which case there could be zero repetitions. In theory, this could also occur if `+` was used with a potentially-empty repeating fragment, but this is forbidden by the third invariant. + +From there, clearly FIRST(M) can include any token from SEP\_SET(M) or ALPHA\_SET(M), and if the complex NT match is nonempty, then any token starting FIRST(`tt ...`) could work too. The last piece to consider is ε. SEP\_SET(M) and FIRST(`tt ...`) \ {ε} cannot contain ε, but ALPHA\_SET(M) could. Hence, this definition allows M to accept ε if and only if ε ∈ ALPHA\_SET(M). This is correct because for M to accept ε in the complex NT case, both the complex NT and α must accept it. If OP = `+`, meaning that the complex NT cannot be empty, then by definition ε ∉ ALPHA\_SET(M). Otherwise, the complex NT can accept zero repetitions, and then ALPHA\_SET(M) = FIRST(`α`). So this definition is correct with respect to ε as well. r[macro.ambiguity.sets.def.last] #### LAST @@ -267,52 +164,41 @@ r[macro.ambiguity.sets.def.last.token] * if M is a singleton token t, then LAST(M) = { t } r[macro.ambiguity.sets.def.last.rep-star] - * if M is the singleton complex NT repeating zero or more times, `M = $( tt - ... ) *`, or `M = $( tt ... ) SEP *` + * if M is the singleton complex NT repeating zero or more times, `M = $( tt ... ) *`, or `M = $( tt ... ) SEP *` * Let sep_set = { SEP } if SEP present; otherwise sep_set = {}. * if ε ∈ LAST(`tt ...`) then LAST(M) = LAST(`tt ...`) ∪ sep_set - * otherwise, the sequence `tt ...` must be non-empty; LAST(M) = LAST(`tt - ...`) ∪ {ε}.
+ * otherwise, the sequence `tt ...` must be non-empty; LAST(M) = LAST(`tt ...`) ∪ {ε}. r[macro.ambiguity.sets.def.last.rep-plus] - * if M is the singleton complex NT repeating one or more times, `M = $( tt ... - ) +`, or `M = $( tt ... ) SEP +` + * if M is the singleton complex NT repeating one or more times, `M = $( tt ... ) +`, or `M = $( tt ... ) SEP +` * Let sep_set = { SEP } if SEP present; otherwise sep_set = {}. * if ε ∈ LAST(`tt ...`) then LAST(M) = LAST(`tt ...`) ∪ sep_set - * otherwise, the sequence `tt ...` must be non-empty; LAST(M) = LAST(`tt - ...`) + * otherwise, the sequence `tt ...` must be non-empty; LAST(M) = LAST(`tt ...`) r[macro.ambiguity.sets.def.last.rep-question] - * if M is the singleton complex NT repeating zero or one time, `M = $( tt ...) - ?`, then LAST(M) = LAST(`tt ...`) ∪ {ε}. + * if M is the singleton complex NT repeating zero or one time, `M = $( tt ...) ?`, then LAST(M) = LAST(`tt ...`) ∪ {ε}. r[macro.ambiguity.sets.def.last.delim] - * if M is a delimited token-tree sequence `OPEN tt ... CLOSE`, then LAST(M) = - { `CLOSE` }. + * if M is a delimited token-tree sequence `OPEN tt ... CLOSE`, then LAST(M) = { `CLOSE` }. r[macro.ambiguity.sets.def.last.sequence] * if M is a non-empty sequence of token-trees `tt uu ...`, * If ε ∈ LAST(`uu ...`), then LAST(M) = LAST(`tt`) ∪ (LAST(`uu ...`) \ { ε }). - * Otherwise, the sequence `uu ...` must be non-empty; then LAST(M) = - LAST(`uu ...`). + * Otherwise, the sequence `uu ...` must be non-empty; then LAST(M) = LAST(`uu ...`). ### Examples of FIRST and LAST -Below are some examples of FIRST and LAST. -(Note in particular how the special ε element is introduced and -eliminated based on the interaction between the pieces of the input.) +Below are some examples of FIRST and LAST. (Note in particular how the special ε element is introduced and eliminated based on the interaction between the pieces of the input.) 
-Our first example is presented in a tree structure to elaborate on how -the analysis of the matcher composes. (Some of the simpler subtrees -have been elided.) +Our first example is presented in a tree structure to elaborate on how the analysis of the matcher composes. (Some of the simpler subtrees have been elided.) ```text INPUT: $( $d:ident $e:expr );* $( $( h )* );* $( f ; )+ g @@ -358,8 +244,7 @@ r[macro.ambiguity.sets.def.follow] ### FOLLOW(M) r[macro.ambiguity.sets.def.follow.intro] -Finally, the definition for FOLLOW(M) is built up as follows. pat, expr, etc. -represent simple nonterminals with the given fragment specifier. +Finally, the definition for FOLLOW(M) is built up as follows. pat, expr, etc. represent simple nonterminals with the given fragment specifier. r[macro.ambiguity.sets.def.follow.pat] * FOLLOW(pat) = {`=>`, `,`, `=`, `|`, `if`, `in`}. @@ -368,27 +253,19 @@ r[macro.ambiguity.sets.def.follow.expr-stmt] * FOLLOW(expr) = FOLLOW(expr_2021) = FOLLOW(stmt) = {`=>`, `,`, `;`}. r[macro.ambiguity.sets.def.follow.ty-path] - * FOLLOW(ty) = FOLLOW(path) = {`{`, `[`, `,`, `=>`, `:`, `=`, `>`, `>>`, `;`, - `|`, `as`, `where`, block nonterminals}. + * FOLLOW(ty) = FOLLOW(path) = {`{`, `[`, `,`, `=>`, `:`, `=`, `>`, `>>`, `;`, `|`, `as`, `where`, block nonterminals}. r[macro.ambiguity.sets.def.follow.vis] - * FOLLOW(vis) = {`,`; any keyword or identifier except a non-raw `priv`; any - token that can begin a type; ident, ty, and path nonterminals}. + * FOLLOW(vis) = {`,`; any keyword or identifier except a non-raw `priv`; any token that can begin a type; ident, ty, and path nonterminals}. r[macro.ambiguity.sets.def.follow.simple] - * FOLLOW(t) = ANYTOKEN for any other simple token, including block, ident, - tt, item, lifetime, literal and meta simple nonterminals, and all terminals. + * FOLLOW(t) = ANYTOKEN for any other simple token, including block, ident, tt, item, lifetime, literal and meta simple nonterminals, and all terminals.
r[macro.ambiguity.sets.def.follow.other-matcher] - * FOLLOW(M), for any other M, is defined as the intersection, as t ranges over - (LAST(M) \ {ε}), of FOLLOW(t). + * FOLLOW(M), for any other M, is defined as the intersection, as t ranges over (LAST(M) \ {ε}), of FOLLOW(t). r[macro.ambiguity.sets.def.follow.type-first] -The tokens that can begin a type are, as of this writing, {`(`, `[`, `!`, `*`, -`&`, `&&`, `?`, lifetimes, `>`, `>>`, `::`, any non-keyword identifier, `super`, -`self`, `Self`, `extern`, `crate`, `$crate`, `_`, `for`, `impl`, `fn`, `unsafe`, -`typeof`, `dyn`}, although this list may not be complete because people won't -always remember to update the appendix when new ones are added. +The tokens that can begin a type are, as of this writing, {`(`, `[`, `!`, `*`, `&`, `&&`, `?`, lifetimes, `>`, `>>`, `::`, any non-keyword identifier, `super`, `self`, `Self`, `extern`, `crate`, `$crate`, `_`, `for`, `impl`, `fn`, `unsafe`, `typeof`, `dyn`}, although this list may not be complete because people won't always remember to update the appendix when new ones are added. Examples of FOLLOW for complex M: @@ -398,8 +275,7 @@ Examples of FOLLOW for complex M: ### Examples of valid and invalid matchers -With the above specification in hand, we can present arguments for -why particular matchers are legal and others are not. +With the above specification in hand, we can present arguments for why particular matchers are legal and others are not. * `($ty:ty < foo ,)` : illegal, because FIRST(`< foo ,`) = { `<` } ⊈ FOLLOW(`ty`) diff --git a/src/macros-by-example.md b/src/macros-by-example.md index eafe6eeb4a..392f3d9ed8 100644 --- a/src/macros-by-example.md +++ b/src/macros-by-example.md @@ -40,30 +40,18 @@ MacroTranscriber -> DelimTokenTree ``` r[macro.decl.intro] -`macro_rules` allows users to define syntax extension in a declarative way. We -call such extensions "macros by example" or simply "macros". 
+`macro_rules` allows users to define syntax extension in a declarative way. We call such extensions "macros by example" or simply "macros". -Each macro by example has a name, and one or more _rules_. Each rule has two -parts: a _matcher_, describing the syntax that it matches, and a _transcriber_, -describing the syntax that will replace a successfully matched invocation. Both -the matcher and the transcriber must be surrounded by delimiters. Macros can -expand to expressions, statements, items (including traits, impls, and foreign -items), types, or patterns. +Each macro by example has a name, and one or more _rules_. Each rule has two parts: a _matcher_, describing the syntax that it matches, and a _transcriber_, describing the syntax that will replace a successfully matched invocation. Both the matcher and the transcriber must be surrounded by delimiters. Macros can expand to expressions, statements, items (including traits, impls, and foreign items), types, or patterns. r[macro.decl.transcription] ## Transcribing r[macro.decl.transcription.intro] -When a macro is invoked, the macro expander looks up macro invocations by name, -and tries each macro rule in turn. It transcribes the first successful match; if -this results in an error, then future matches are not tried. +When a macro is invoked, the macro expander looks up macro invocations by name, and tries each macro rule in turn. It transcribes the first successful match; if this results in an error, then future matches are not tried. r[macro.decl.transcription.lookahead] -When matching, no lookahead is performed; if the compiler cannot unambiguously determine how to -parse the macro invocation one token at a time, then it is an error. 
In the -following example, the compiler does not look ahead past the identifier to see -if the following token is a `)`, even though that would allow it to parse the -invocation unambiguously: +When matching, no lookahead is performed; if the compiler cannot unambiguously determine how to parse the macro invocation one token at a time, then it is an error. In the following example, the compiler does not look ahead past the identifier to see if the following token is a `)`, even though that would allow it to parse the invocation unambiguously: ```rust,compile_fail macro_rules! ambiguity { @@ -74,23 +62,12 @@ ambiguity!(error); // Error: local ambiguity ``` r[macro.decl.transcription.syntax] -In both the matcher and the transcriber, the `$` token is used to invoke special -behaviours from the macro engine (described below in [Metavariables] and -[Repetitions]). Tokens that aren't part of such an invocation are matched and -transcribed literally, with one exception. The exception is that the outer -delimiters for the matcher will match any pair of delimiters. Thus, for -instance, the matcher `(())` will match `{()}` but not `{{}}`. The character -`$` cannot be matched or transcribed literally. +In both the matcher and the transcriber, the `$` token is used to invoke special behaviours from the macro engine (described below in [Metavariables] and [Repetitions]). Tokens that aren't part of such an invocation are matched and transcribed literally, with one exception. The exception is that the outer delimiters for the matcher will match any pair of delimiters. Thus, for instance, the matcher `(())` will match `{()}` but not `{{}}`. The character `$` cannot be matched or transcribed literally. r[macro.decl.transcription.fragment] ### Forwarding a matched fragment -When forwarding a matched fragment to another macro-by-example, matchers in -the second macro will see an opaque AST of the fragment type. 
The second macro -can't use literal tokens to match the fragments in the matcher, only a -fragment specifier of the same type. The `ident`, `lifetime`, and `tt` -fragment types are an exception, and *can* be matched by literal tokens. The -following illustrates this restriction: +When forwarding a matched fragment to another macro-by-example, matchers in the second macro will see an opaque AST of the fragment type. The second macro can't use literal tokens to match the fragments in the matcher, only a fragment specifier of the same type. The `ident`, `lifetime`, and `tt` fragment types are an exception, and *can* be matched by literal tokens. The following illustrates this restriction: ```rust,compile_fail macro_rules! foo { @@ -105,8 +82,7 @@ macro_rules! bar { foo!(3); ``` -The following illustrates how tokens can be directly matched after matching a -`tt` fragment: +The following illustrates how tokens can be directly matched after matching a `tt` fragment: ```rust // compiles OK @@ -125,8 +101,7 @@ r[macro.decl.meta] ## Metavariables r[macro.decl.meta.intro] -In the matcher, `$` _name_ `:` _fragment-specifier_ matches a Rust syntax -fragment of the kind specified and binds it to the metavariable `$`_name_. +In the matcher, `$` _name_ `:` _fragment-specifier_ matches a Rust syntax fragment of the kind specified and binds it to the metavariable `$`_name_. r[macro.decl.meta.specifier] Valid fragment specifiers are: @@ -148,10 +123,7 @@ Valid fragment specifiers are: * `vis`: a possibly empty [Visibility] qualifier r[macro.decl.meta.transcription] -In the transcriber, metavariables are referred to simply by `$`_name_, since -the fragment kind is specified in the matcher. Metavariables are replaced with -the syntax element that matched them. -Metavariables can be transcribed more than once or not at all. +In the transcriber, metavariables are referred to simply by `$`_name_, since the fragment kind is specified in the matcher. 
Metavariables are replaced with the syntax element that matched them. Metavariables can be transcribed more than once or not at all. r[macro.decl.meta.dollar-crate] The keyword metavariable [`$crate`] can be used to refer to the current crate. @@ -174,15 +146,10 @@ r[macro.decl.repetition] ## Repetitions r[macro.decl.repetition.intro] -In both the matcher and transcriber, repetitions are indicated by placing the -tokens to be repeated inside `$(`…`)`, followed by a repetition operator, -optionally with a separator token between. +In both the matcher and transcriber, repetitions are indicated by placing the tokens to be repeated inside `$(`…`)`, followed by a repetition operator, optionally with a separator token between. r[macro.decl.repetition.separator] -The separator token can be any token -other than a delimiter or one of the repetition operators, but `;` and `,` are -the most common. For instance, `$( $i:ident ),*` represents any number of -identifiers separated by commas. Nested repetitions are permitted. +The separator token can be any token other than a delimiter or one of the repetition operators, but `;` and `,` are the most common. For instance, `$( $i:ident ),*` represents any number of identifiers separated by commas. Nested repetitions are permitted. r[macro.decl.repetition.operators] The repetition operators are: @@ -192,50 +159,24 @@ The repetition operators are: - `?` --- indicates an optional fragment with zero or one occurrence. r[macro.decl.repetition.optional-restriction] -Since `?` represents at most one occurrence, it cannot be used with a -separator. +Since `?` represents at most one occurrence, it cannot be used with a separator. r[macro.decl.repetition.fragment] -The repeated fragment both matches and transcribes to the specified number of -the fragment, separated by the separator token. Metavariables are matched to -every repetition of their corresponding fragment. 
For instance, the `$( $i:ident -),*` example above matches `$i` to all of the identifiers in the list. - -During transcription, additional restrictions apply to repetitions so that the -compiler knows how to expand them properly: - -1. A metavariable must appear in exactly the same number, kind, and nesting - order of repetitions in the transcriber as it did in the matcher. So for the - matcher `$( $i:ident ),*`, the transcribers `=> { $i }`, - `=> { $( $( $i )* )* }`, and `=> { $( $i )+ }` are all illegal, but - `=> { $( $i );* }` is correct and replaces a comma-separated list of - identifiers with a semicolon-separated list. -2. Each repetition in the transcriber must contain at least one metavariable to - decide how many times to expand it. If multiple metavariables appear in the - same repetition, they must be bound to the same number of fragments. For - instance, `( $( $i:ident ),* ; $( $j:ident ),* ) => (( $( ($i,$j) ),* ))` must - bind the same number of `$i` fragments as `$j` fragments. This means that - invoking the macro with `(a, b, c; d, e, f)` is legal and expands to - `((a,d), (b,e), (c,f))`, but `(a, b, c; d, e)` is illegal because it does - not have the same number. This requirement applies to every layer of nested - repetitions. +The repeated fragment both matches and transcribes to the specified number of the fragment, separated by the separator token. Metavariables are matched to every repetition of their corresponding fragment. For instance, the `$( $i:ident ),*` example above matches `$i` to all of the identifiers in the list. + +During transcription, additional restrictions apply to repetitions so that the compiler knows how to expand them properly: + +1. A metavariable must appear in exactly the same number, kind, and nesting order of repetitions in the transcriber as it did in the matcher. 
So for the matcher `$( $i:ident ),*`, the transcribers `=> { $i }`, `=> { $( $( $i )* )* }`, and `=> { $( $i )+ }` are all illegal, but `=> { $( $i );* }` is correct and replaces a comma-separated list of identifiers with a semicolon-separated list. +2. Each repetition in the transcriber must contain at least one metavariable to decide how many times to expand it. If multiple metavariables appear in the same repetition, they must be bound to the same number of fragments. For instance, `( $( $i:ident ),* ; $( $j:ident ),* ) => (( $( ($i,$j) ),* ))` must bind the same number of `$i` fragments as `$j` fragments. This means that invoking the macro with `(a, b, c; d, e, f)` is legal and expands to `((a,d), (b,e), (c,f))`, but `(a, b, c; d, e)` is illegal because it does not have the same number. This requirement applies to every layer of nested repetitions. r[macro.decl.scope] ## Scoping, exporting, and importing r[macro.decl.scope.intro] -For historical reasons, the scoping of macros by example does not work entirely -like items. Macros have two forms of scope: textual scope, and path-based scope. -Textual scope is based on the order that things appear in source files, or even -across multiple files, and is the default scoping. It is explained further below. -Path-based scope works exactly the same way that item scoping does. The scoping, -exporting, and importing of macros is controlled largely by attributes. +For historical reasons, the scoping of macros by example does not work entirely like items. Macros have two forms of scope: textual scope, and path-based scope. Textual scope is based on the order that things appear in source files, or even across multiple files, and is the default scoping. It is explained further below. Path-based scope works exactly the same way that item scoping does. The scoping, exporting, and importing of macros is controlled largely by attributes. 
r[macro.decl.scope.unqualified] -When a macro is invoked by an unqualified identifier (not part of a multi-part -path), it is first looked up in textual scoping. If this does not yield any -results, then it is looked up in path-based scoping. If the macro's name is -qualified with a path, then it is only looked up in path-based scoping. +When a macro is invoked by an unqualified identifier (not part of a multi-part path), it is first looked up in textual scoping. If this does not yield any results, then it is looked up in path-based scoping. If the macro's name is qualified with a path, then it is only looked up in path-based scoping. ```rust,ignore @@ -253,13 +194,7 @@ r[macro.decl.scope.textual] ### Textual scope r[macro.decl.scope.textual.intro] -Textual scope is based largely on the order that things appear in source files, -and works similarly to the scope of local variables declared with `let` except -it also applies at the module level. When `macro_rules!` is used to define a -macro, the macro enters the scope after the definition (note that it can still -be used recursively, since names are looked up from the invocation site), up -until its surrounding scope, typically a module, is closed. This can enter child -modules and even span across multiple files: +Textual scope is based largely on the order that things appear in source files, and works similarly to the scope of local variables declared with `let` except it also applies at the module level. When `macro_rules!` is used to define a macro, the macro enters the scope after the definition (note that it can still be used recursively, since names are looked up from the invocation site), up until its surrounding scope, typically a module, is closed. 
This can enter child modules and even span across multiple files: ```rust,ignore @@ -283,8 +218,7 @@ m!{} // OK: appears after declaration of m in src/lib.rs ``` r[macro.decl.scope.textual.shadow] -It is not an error to define a macro multiple times; the most recent declaration -will shadow the previous one unless it has gone out of scope. +It is not an error to define a macro multiple times; the most recent declaration will shadow the previous one unless it has gone out of scope. ```rust macro_rules! m { @@ -311,8 +245,7 @@ mod inner { m!(1); ``` -Macros can be declared and used locally inside functions as well, and work -similarly: +Macros can be declared and used locally inside functions as well, and work similarly: ```rust fn foo() { @@ -718,8 +651,7 @@ fn unit() { } ``` -Note that, because `$crate` refers to the current crate, it must be used with a -fully qualified module path when referring to non-macro items: +Note that, because `$crate` refers to the current crate, it must be used with a fully qualified module path when referring to non-macro items: ```rust pub mod inner { @@ -733,11 +665,7 @@ pub mod inner { ``` r[macro.decl.hygiene.vis] -Additionally, even though `$crate` allows a macro to refer to items within its -own crate when expanding, its use has no effect on visibility. An item or macro -referred to must still be visible from the invocation site. In the following -example, any attempt to invoke `call_foo!()` from outside its crate will fail -because `foo()` is not public. +Additionally, even though `$crate` allows a macro to refer to items within its own crate when expanding, its use has no effect on visibility. An item or macro referred to must still be visible from the invocation site. In the following example, any attempt to invoke `call_foo!()` from outside its crate will fail because `foo()` is not public. 
```rust #[macro_export] @@ -755,22 +683,12 @@ r[macro.decl.follow-set] ## Follow-set ambiguity restrictions r[macro.decl.follow-set.intro] -The parser used by the macro system is reasonably powerful, but it is limited in -order to prevent ambiguity in current or future versions of the language. +The parser used by the macro system is reasonably powerful, but it is limited in order to prevent ambiguity in current or future versions of the language. r[macro.decl.follow-set.token-restriction] -In particular, in addition to the rule about ambiguous expansions, a nonterminal -matched by a metavariable must be followed by a token which has been decided can -be safely used after that kind of match. - -As an example, a macro matcher like `$i:expr [ , ]` could in theory be accepted -in Rust today, since `[,]` cannot be part of a legal expression and therefore -the parse would always be unambiguous. However, because `[` can start trailing -expressions, `[` is not a character which can safely be ruled out as coming -after an expression. If `[,]` were accepted in a later version of Rust, this -matcher would become ambiguous or would misparse, breaking working code. -Matchers like `$i:expr,` or `$i:expr;` would be legal, however, because `,` and -`;` are legal expression separators. The specific rules are: +In particular, in addition to the rule about ambiguous expansions, a nonterminal matched by a metavariable must be followed by a token which has been decided can be safely used after that kind of match. + +As an example, a macro matcher like `$i:expr [ , ]` could in theory be accepted in Rust today, since `[,]` cannot be part of a legal expression and therefore the parse would always be unambiguous. However, because `[` can start trailing expressions, `[` is not a character which can safely be ruled out as coming after an expression. If `[,]` were accepted in a later version of Rust, this matcher would become ambiguous or would misparse, breaking working code. 
Matchers like `$i:expr,` or `$i:expr;` would be legal, however, because `,` and `;` are legal expression separators. The specific rules are: r[macro.decl.follow-set.token-expr-stmt] * `expr` and `stmt` may only be followed by one of: `=>`, `,`, or `;`. @@ -782,14 +700,10 @@ r[macro.decl.follow-set.token-pat] * `pat` may only be followed by one of: `=>`, `,`, `=`, `if`, or `in`. r[macro.decl.follow-set.token-path-ty] - * `path` and `ty` may only be followed by one of: `=>`, `,`, `=`, `|`, `;`, - `:`, `>`, `>>`, `[`, `{`, `as`, `where`, or a macro variable of `block` - fragment specifier. + * `path` and `ty` may only be followed by one of: `=>`, `,`, `=`, `|`, `;`, `:`, `>`, `>>`, `[`, `{`, `as`, `where`, or a macro variable of `block` fragment specifier. r[macro.decl.follow-set.token-vis] - * `vis` may only be followed by one of: `,`, an identifier other than a - non-raw `priv`, any token that can begin a type, or a metavariable with a - `ident`, `ty`, or `path` fragment specifier. + * `vis` may only be followed by one of: `,`, an identifier other than a non-raw `priv`, any token that can begin a type, or a metavariable with a `ident`, `ty`, or `path` fragment specifier. r[macro.decl.follow-set.token-other] * All other fragment specifiers have no restrictions. @@ -799,18 +713,12 @@ r[macro.decl.follow-set.edition2021] > Before the 2021 edition, `pat` may also be followed by `|`. r[macro.decl.follow-set.repetition] -When repetitions are involved, then the rules apply to every possible number of -expansions, taking separators into account. This means: - - * If the repetition includes a separator, that separator must be able to - follow the contents of the repetition. - * If the repetition can repeat multiple times (`*` or `+`), then the contents - must be able to follow themselves. - * The contents of the repetition must be able to follow whatever comes - before, and whatever comes after must be able to follow the contents of the - repetition. 
- * If the repetition can match zero times (`*` or `?`), then whatever comes - after must be able to follow whatever comes before. +When repetitions are involved, then the rules apply to every possible number of expansions, taking separators into account. This means: + + * If the repetition includes a separator, that separator must be able to follow the contents of the repetition. + * If the repetition can repeat multiple times (`*` or `+`), then the contents must be able to follow themselves. + * The contents of the repetition must be able to follow whatever comes before, and whatever comes after must be able to follow the contents of the repetition. + * If the repetition can match zero times (`*` or `?`), then whatever comes after must be able to follow whatever comes before. For more detail, see the [formal specification]. diff --git a/src/macros.md b/src/macros.md index efc51e4c65..334be5b2f7 100644 --- a/src/macros.md +++ b/src/macros.md @@ -2,15 +2,12 @@ r[macro] # Macros r[macro.intro] -The functionality and syntax of Rust can be extended with custom definitions -called macros. They are given names, and invoked through a consistent -syntax: `some_extension!(...)`. +The functionality and syntax of Rust can be extended with custom definitions called macros. They are given names, and invoked through a consistent syntax: `some_extension!(...)`. There are two ways to define new macros: * [Macros by Example] define new syntax in a higher-level, declarative way. -* [Procedural Macros] define function-like macros, custom derives, and custom - attributes using functions that operate on input tokens. +* [Procedural Macros] define function-like macros, custom derives, and custom attributes using functions that operate on input tokens. r[macro.invocation] ## Macro invocation @@ -35,9 +32,7 @@ MacroInvocationSemi -> ``` r[macro.invocation.intro] -A macro invocation expands a macro at compile time and replaces the -invocation with the result of the macro. 
Macros may be invoked in the -following situations: +A macro invocation expands a macro at compile time and replaces the invocation with the result of the macro. Macros may be invoked in the following situations: r[macro.invocation.expr] * [Expressions] and [statements] @@ -58,10 +53,7 @@ r[macro.invocation.extern] * [External blocks] r[macro.invocation.item-statement] -When used as an item or a statement, the [MacroInvocationSemi] form is used -where a semicolon is required at the end when not using curly braces. -[Visibility qualifiers] are never allowed before a macro invocation or -[`macro_rules`] definition. +When used as an item or a statement, the [MacroInvocationSemi] form is used where a semicolon is required at the end when not using curly braces. [Visibility qualifiers] are never allowed before a macro invocation or [`macro_rules`] definition. ```rust // Used as an expression. diff --git a/src/names.md b/src/names.md index 9934f8bc84..f5b1bda9bd 100644 --- a/src/names.md +++ b/src/names.md @@ -2,29 +2,19 @@ r[names] # Names r[names.intro] -An *entity* is a language construct that can be referred to in some way within -the source program, usually via a [path]. Entities include [types], [items], -[generic parameters], [variable bindings], [loop labels], [lifetimes], -[fields], [attributes], and [lints]. +An *entity* is a language construct that can be referred to in some way within the source program, usually via a [path]. Entities include [types], [items], [generic parameters], [variable bindings], [loop labels], [lifetimes], [fields], [attributes], and [lints]. -A *declaration* is a syntactical construct that can introduce a *name* to -refer to an entity. Entity names are valid within a [*scope*] --- a region of -source text where that name may be referenced. +A *declaration* is a syntactical construct that can introduce a *name* to refer to an entity. 
Entity names are valid within a [*scope*] --- a region of source text where that name may be referenced. -Some entities are [explicitly declared](#explicitly-declared-entities) in the -source code, and some are [implicitly declared](#implicitly-declared-entities) -as part of the language or compiler extensions. +Some entities are [explicitly declared](#explicitly-declared-entities) in the source code, and some are [implicitly declared](#implicitly-declared-entities) as part of the language or compiler extensions. [*Paths*] are used to refer to an entity, possibly in another module or type. -Lifetimes and loop labels use a [dedicated syntax][lifetimes-and-loop-labels] using a -leading quote. +Lifetimes and loop labels use a [dedicated syntax][lifetimes-and-loop-labels] using a leading quote. -Names are segregated into different [*namespaces*], allowing entities in -different namespaces to share the same name without conflict. +Names are segregated into different [*namespaces*], allowing entities in different namespaces to share the same name without conflict. -[*Name resolution*] is the compile-time process of tying paths, identifiers, -and labels to entity declarations. +[*Name resolution*] is the compile-time process of tying paths, identifiers, and labels to entity declarations. Access to certain names may be restricted based on their [*visibility*]. 
@@ -41,8 +31,7 @@ r[names.explicit.item-decl] * [Use declarations] * [Function declarations] and [function parameters] * [Type aliases] - * [struct], [union], [enum], enum variant declarations, and their named - fields + * [struct], [union], [enum], enum variant declarations, and their named fields * [Constant item declarations] * [Static item declarations] * [Trait item declarations] and their [associated items] @@ -75,15 +64,13 @@ r[names.explicit.macro_export] * The [`macro_export` attribute] can introduce an alias for the macro into the crate root r[names.explicit.macro-invocation] -Additionally, [macro invocations] and [attributes] can introduce names by -expanding to one of the above items. +Additionally, [macro invocations] and [attributes] can introduce names by expanding to one of the above items. r[names.implicit] ## Implicitly declared entities r[names.implicit.list] -The following entities are implicitly defined by the language, or are -introduced by compiler options and extensions: +The following entities are implicitly defined by the language, or are introduced by compiler options and extensions: r[names.implicit.primitive-types] * [Language prelude]: @@ -118,8 +105,7 @@ r[names.implicit.lifetime-static] * The [`'static`] lifetime r[names.implicit.root] -Additionally, the crate root module does not have a name, but can be referred -to with certain [path qualifiers] or aliases. +Additionally, the crate root module does not have a name, but can be referred to with certain [path qualifiers] or aliases. 
[*Name resolution*]: names/name-resolution.md [*namespaces*]: names/namespaces.md diff --git a/src/names/name-resolution.md b/src/names/name-resolution.md index b82554055a..ba76d63514 100644 --- a/src/names/name-resolution.md +++ b/src/names/name-resolution.md @@ -349,9 +349,7 @@ pub fn f() { ``` > [!NOTE] -> This restriction is needed due to implementation details in the compiler, -> specifically the current scope visitation logic and the complexity of supporting -> this behavior. This ambiguity error may be removed in the future. +> This restriction is needed due to implementation details in the compiler, specifically the current scope visitation logic and the complexity of supporting this behavior. This ambiguity error may be removed in the future. r[names.resolution.expansion.macros] ### Macros @@ -375,9 +373,7 @@ The available scope kinds are visited in the following order. Each of these scop > For more info see [derive helper scope]. > [!NOTE] -> This visitation order may change in the future, such as interleaving the -> visitation of textual and path-based scope candidates based on their lexical -> scopes. +> This visitation order may change in the future, such as interleaving the visitation of textual and path-based scope candidates based on their lexical scopes. > [!EDITION-2018] > Starting in edition 2018 the `#[macro_use]` prelude is not visited when [`#[no_implicit_prelude]`][names.preludes.no_implicit_prelude] is present. diff --git a/src/names/namespaces.md b/src/names/namespaces.md index b44d6f7da7..7e76e60b16 100644 --- a/src/names/namespaces.md +++ b/src/names/namespaces.md @@ -2,15 +2,9 @@ r[names.namespaces] # Namespaces r[names.namespaces.intro] -A *namespace* is a logical grouping of declared [names]. Names are segregated -into separate namespaces based on the kind of entity the name refers to. -Namespaces allow the occurrence of a name in one namespace to not conflict -with the same name in another namespace. 
+A *namespace* is a logical grouping of declared [names]. Names are segregated into separate namespaces based on the kind of entity the name refers to. Namespaces allow the occurrence of a name in one namespace to not conflict with the same name in another namespace. -There are several different namespaces that each contain different kinds of -entities. The usage of a name will look for the declaration of that name in -different namespaces, based on the context, as described in the [name -resolution] chapter. +There are several different namespaces that each contain different kinds of entities. The usage of a name will look for the declaration of that name in different namespaces, based on the context, as described in the [name resolution] chapter. r[names.namespaces.kinds] The following is a list of namespaces, with their corresponding entities: @@ -37,8 +31,7 @@ The following is a list of namespaces, with their corresponding entities: * [Generic const parameters] * [Associated const declarations] * [Associated function declarations] - * Local bindings --- [`let`], [`if let`], [`while let`], [`for`], [`match`] - arms, [function parameters], [closure parameters] + * Local bindings --- [`let`], [`if let`], [`while let`], [`for`], [`match`] arms, [function parameters], [closure parameters] * Captured [closure] variables * Macro Namespace * [`macro_rules` declarations] @@ -87,33 +80,23 @@ fn example<'Foo>(f: Foo) { r[names.namespaces.without] ## Named entities without a namespace -The following entities have explicit names, but the names are not a part of -any specific namespace. +The following entities have explicit names, but the names are not a part of any specific namespace. ### Fields r[names.namespaces.without.fields] -Even though struct, enum, and union fields are named, the named fields do not -live in an explicit namespace. They can only be accessed via a [field -expression], which only inspects the field names of the specific type being -accessed. 
+Even though struct, enum, and union fields are named, the named fields do not live in an explicit namespace. They can only be accessed via a [field expression], which only inspects the field names of the specific type being accessed. ### Use declarations r[names.namespaces.without.use] -A [use declaration] has named aliases that it imports into scope, but the -`use` item itself does not belong to a specific namespace. Instead, it can -introduce aliases into multiple namespaces, depending on the item kind being -imported. +A [use declaration] has named aliases that it imports into scope, but the `use` item itself does not belong to a specific namespace. Instead, it can introduce aliases into multiple namespaces, depending on the item kind being imported. r[names.namespaces.sub-namespaces] ## Sub-namespaces r[names.namespaces.sub-namespaces.intro] -The macro namespace is split into two sub-namespaces: one for [bang-style macros] and one for [attributes]. -When an attribute is resolved, any bang-style macros in scope will be ignored. -And conversely resolving a bang-style macro will ignore attribute macros in scope. -This prevents one style from shadowing another. +The macro namespace is split into two sub-namespaces: one for [bang-style macros] and one for [attributes]. When an attribute is resolved, any bang-style macros in scope will be ignored. And conversely resolving a bang-style macro will ignore attribute macros in scope. This prevents one style from shadowing another. For example, the [`cfg` attribute] and the [`cfg` macro] are two different entities with the same name in the macro namespace, but they can still be used in their respective context. 
diff --git a/src/names/preludes.md b/src/names/preludes.md index 62692966a8..d114a7d63c 100644 --- a/src/names/preludes.md +++ b/src/names/preludes.md @@ -2,13 +2,9 @@ r[names.preludes] # Preludes r[names.preludes.intro] -A *prelude* is a collection of names that are automatically brought into scope -of every module in a crate. +A *prelude* is a collection of names that are automatically brought into scope of every module in a crate. -These prelude names are not part of the module itself: they are implicitly -queried during [name resolution]. For example, even though something like -[`Box`] is in scope in every module, you cannot refer to it as `self::Box` -because it is not a member of the current module. +These prelude names are not part of the module itself: they are implicitly queried during [name resolution]. For example, even though something like [`Box`] is in scope in every module, you cannot refer to it as `self::Box` because it is not a member of the current module. r[names.preludes.kinds] There are several different preludes: @@ -49,10 +45,7 @@ r[names.preludes.extern] ## Extern prelude r[names.preludes.extern.intro] -External crates imported with [`extern crate`] in the root module or provided -to the compiler (as with the `--extern` flag with `rustc`) are added to the -*extern prelude*. If imported with an alias such as `extern crate orig_name as -new_name`, then the symbol `new_name` is instead added to the prelude. +External crates imported with [`extern crate`] in the root module or provided to the compiler (as with the `--extern` flag with `rustc`) are added to the *extern prelude*. If imported with an alias such as `extern crate orig_name as new_name`, then the symbol `new_name` is instead added to the prelude. r[names.preludes.extern.core] The [`core`] crate is always added to the extern prelude. @@ -77,8 +70,7 @@ r[names.preludes.extern.edition2018] > Cargo does bring in `proc_macro` to the extern prelude for proc-macro crates only. 
@@ -123,8 +115,7 @@ r[names.preludes.lang] ## Language prelude r[names.preludes.lang.intro] -The language prelude includes names of types and attributes that are built-in -to the language. The language prelude is always in scope. +The language prelude includes names of types and attributes that are built-in to the language. The language prelude is always in scope. r[names.preludes.lang.entities] It includes the following: @@ -143,15 +134,13 @@ r[names.preludes.macro_use] ## `macro_use` prelude r[names.preludes.macro_use.intro] -The `macro_use` prelude includes macros from external crates that were -imported by the [`macro_use` attribute] applied to an [`extern crate`]. +The `macro_use` prelude includes macros from external crates that were imported by the [`macro_use` attribute] applied to an [`extern crate`]. r[names.preludes.tool] ## Tool prelude r[names.preludes.tool.intro] -The tool prelude includes tool names for external tools in the [type -namespace]. See the [tool attributes] section for more details. +The tool prelude includes tool names for external tools in the [type namespace]. See the [tool attributes] section for more details. r[names.preludes.no_implicit_prelude] diff --git a/src/names/scopes.md b/src/names/scopes.md index 382aa29241..1c8b062c14 100644 --- a/src/names/scopes.md +++ b/src/names/scopes.md @@ -2,10 +2,7 @@ r[names.scopes] # Scopes r[names.scopes.intro] -A *scope* is the region of source text where a named [entity] may be referenced with that name. -The following sections provide details on the scoping rules and behavior, which depend on the kind of entity and where it is declared. -The process of how names are resolved to entities is described in the [name resolution] chapter. -More information on "drop scopes" used for the purpose of running destructors may be found in the [destructors] chapter. +A *scope* is the region of source text where a named [entity] may be referenced with that name. 
The following sections provide details on the scoping rules and behavior, which depend on the kind of entity and where it is declared. The process of how names are resolved to entities is described in the [name resolution] chapter. More information on "drop scopes" used for the purpose of running destructors may be found in the [destructors] chapter. r[names.scopes.items] ## Item scopes @@ -17,22 +14,19 @@ r[names.scopes.items.statement] The name of an item declared as a [statement] has a scope that extends from the start of the block the item statement is in until the end of the block. r[names.scopes.items.duplicate] -It is an error to introduce an item with a duplicate name of another item in the same [namespace] within the same module or block. -[Asterisk glob imports] have special behavior for dealing with duplicate names and shadowing, see the linked chapter for more details. +It is an error to introduce an item with a duplicate name of another item in the same [namespace] within the same module or block. [Asterisk glob imports] have special behavior for dealing with duplicate names and shadowing, see the linked chapter for more details. r[names.scopes.items.shadow-prelude] Items in a module may shadow items in a [prelude](#prelude-scopes). r[names.scopes.items.nested-modules] -Item names from outer modules are not in scope within a nested module. -A [path] may be used to refer to an item in another module. +Item names from outer modules are not in scope within a nested module. A [path] may be used to refer to an item in another module. r[names.scopes.associated-items] ### Associated item scopes r[names.scopes.associated-items.scope] -[Associated items] are not scoped and can only be referred to by using a [path] leading from the type or trait they are associated with. -[Methods] can also be referred to via [call expressions]. 
+[Associated items] are not scoped and can only be referred to by using a [path] leading from the type or trait they are associated with. [Methods] can also be referred to via [call expressions]. r[names.scopes.associated-items.duplicate] Similar to items within a module or block, it is an error to introduce an item within a trait or implementation that is a duplicate of another item in the trait or impl in the same namespace. @@ -86,12 +80,10 @@ r[names.scopes.generic-parameters] ## Generic parameter scopes r[names.scopes.generic-parameters.param-list] -Generic parameters are declared in a [GenericParams] list. -The scope of a generic parameter is within the item it is declared on. +Generic parameters are declared in a [GenericParams] list. The scope of a generic parameter is within the item it is declared on. r[names.scopes.generic-parameters.order-independent] -All parameters are in scope within the generic parameter list regardless of the order they are declared. -The following shows some examples where a parameter may be referenced before it is declared: +All parameters are in scope within the generic parameter list regardless of the order they are declared. The following shows some examples where a parameter may be referenced before it is declared: ```rust // The 'b bound is referenced before it is declared. @@ -158,8 +150,7 @@ The `'static` lifetime and [placeholder lifetime] `'_` have a special meaning an #### Lifetime generic parameter scopes r[names.scopes.lifetimes.generic] -[Constant] and [static] items and [const contexts] only ever allow `'static` lifetime references, so no other lifetime may be in scope within them. -[Associated consts] do allow referring to lifetimes declared in their trait or implementation. +[Constant] and [static] items and [const contexts] only ever allow `'static` lifetime references, so no other lifetime may be in scope within them. [Associated consts] do allow referring to lifetimes declared in their trait or implementation. 
#### Higher-ranked trait bound scopes @@ -226,9 +217,7 @@ r[names.scopes.loop-label] ## Loop label scopes r[names.scopes.loop-label.scope] -[Loop labels] may be declared by a [loop expression]. -The scope of a loop label is from the point it is declared till the end of the loop expression. -The scope does not extend into [items], [closures], [async blocks], [const arguments], [const contexts], and the iterator expression of the defining [`for` loop]. +[Loop labels] may be declared by a [loop expression]. The scope of a loop label is from the point it is declared till the end of the loop expression. The scope does not extend into [items], [closures], [async blocks], [const arguments], [const contexts], and the iterator expression of the defining [`for` loop]. ```rust 'a: for n in 0..3 { @@ -258,8 +247,7 @@ The scope does not extend into [items], [closures], [async blocks], [const argum ``` r[names.scopes.loop-label.shadow] -Loop labels may shadow labels of the same name in outer scopes. -References to a label refer to the closest definition. +Loop labels may shadow labels of the same name in outer scopes. References to a label refer to the closest definition. ```rust // Loop label shadowing example. @@ -275,15 +263,13 @@ r[names.scopes.prelude] ## Prelude scopes r[names.scopes.prelude.intro] -[Preludes] bring entities into scope of every module. -The entities are not members of the module, but are implicitly queried during [name resolution]. +[Preludes] bring entities into scope of every module. The entities are not members of the module, but are implicitly queried during [name resolution]. r[names.scopes.prelude.shadow] The prelude names may be shadowed by declarations in a module. r[names.scopes.prelude.layers] -The preludes are layered such that one shadows another if they contain entities of the same name. 
-The order that preludes may shadow other preludes is the following where earlier entries may shadow later ones: +The preludes are layered such that one shadows another if they contain entities of the same name. The order that preludes may shadow other preludes is the following where earlier entries may shadow later ones: 1. [Extern prelude] 2. [Tool prelude] @@ -294,15 +280,13 @@ The order that preludes may shadow other preludes is the following where earlier r[names.scopes.macro_rules] ## `macro_rules` scopes -The scope of `macro_rules` macros is described in the [Macros By Example] chapter. -The behavior depends on the use of the [`macro_use`] and [`macro_export`] attributes. +The scope of `macro_rules` macros is described in the [Macros By Example] chapter. The behavior depends on the use of the [`macro_use`] and [`macro_export`] attributes. r[names.scopes.derive] ## Derive macro helper attributes r[names.scopes.derive.scope] -[Derive macro helper attributes] are in scope in the item where their corresponding [`derive` attribute] is specified. -The scope extends from just after the `derive` attribute to the end of the item. +[Derive macro helper attributes] are in scope in the item where their corresponding [`derive` attribute] is specified. The scope extends from just after the `derive` attribute to the end of the item. r[names.scopes.derive.shadow] Helper attributes shadow other attributes of the same name in scope. diff --git a/src/paths.md b/src/paths.md index 8c6d7965f6..d28877d2a5 100644 --- a/src/paths.md +++ b/src/paths.md @@ -2,8 +2,7 @@ r[paths] # Paths r[paths.intro] -A *path* is a sequence of one or more path segments separated by `::` tokens. -Paths are used to refer to [items], values, [types], [macros], and [attributes]. +A *path* is a sequence of one or more path segments separated by `::` tokens. Paths are used to refer to [items], values, [types], [macros], and [attributes]. 
 Two examples of simple paths consisting of only identifier segments: @@ -28,8 +27,7 @@ SimplePathSegment -> ``` r[paths.simple.intro] -Simple paths are used in [visibility] markers, [attributes], [macros][mbe], and [`use`] items. -For example: +Simple paths are used in [visibility] markers, [attributes], [macros][mbe], and [`use`] items. For example: ```rust use std::io::{self, Write}; @@ -74,12 +72,10 @@ GenericArgsBounds -> ``` r[paths.expr.intro] -Paths in expressions allow for paths with generic arguments to be specified. They are -used in various places in [expressions] and [patterns]. +Paths in expressions allow for paths with generic arguments to be specified. They are used in various places in [expressions] and [patterns]. r[paths.expr.turbofish] -The `::` token is required before the opening `<` for generic arguments to avoid -ambiguity with the less-than operator. This is colloquially known as "turbofish" syntax. +The `::` token is required before the opening `<` for generic arguments to avoid ambiguity with the less-than operator. This is colloquially known as "turbofish" syntax. ```rust (0..10).collect::<Vec<_>>(); @@ -87,8 +83,7 @@ Vec::<u8>::with_capacity(1024); ``` r[paths.expr.argument-order] -The order of generic arguments is restricted to lifetime arguments, then type -arguments, then const arguments, then equality constraints. +The order of generic arguments is restricted to lifetime arguments, then type arguments, then const arguments, then equality constraints. r[paths.expr.complex-const-params] Const arguments must be surrounded by braces unless they are a [literal], an [inferred const], or a single segment path. An [inferred const] may not be surrounded by braces. @@ -117,8 +112,7 @@ let _: [_; 1] = f::<{ _ }>(); > In a generic argument list, an [inferred const] is parsed as an [inferred type][InferredType] but then semantically treated as a separate kind of [const generic argument].
 r[paths.expr.impl-trait-params] -The synthetic type parameters corresponding to `impl Trait` types are implicit, -and these cannot be explicitly specified. +The synthetic type parameters corresponding to `impl Trait` types are implicit, and these cannot be explicitly specified. r[paths.qualified] ## Qualified paths @@ -133,9 +127,7 @@ QualifiedPathInType -> QualifiedPathType (`::` TypePathSegment)+ ``` r[paths.qualified.intro] -Fully qualified paths allow for disambiguating the path for [trait implementations] and -for specifying [canonical paths](#canonical-paths). When used in a type specification, it -supports using the type syntax specified below. +Fully qualified paths allow for disambiguating the path for [trait implementations] and for specifying [canonical paths](#canonical-paths). When used in a type specification, it supports using the type syntax specified below. ```rust struct S; @@ -170,12 +162,10 @@ TypePathFnInputs -> Type (`,` Type)* `,`? ``` r[paths.type.intro] -Type paths are used within type definitions, trait bounds, type parameter bounds, -and qualified paths. +Type paths are used within type definitions, trait bounds, type parameter bounds, and qualified paths. r[paths.type.turbofish] -Although the `::` token is allowed before the generics arguments, it is not required -because there is no ambiguity like there is in [PathInExpression]. +Although the `::` token is allowed before the generics arguments, it is not required because there is no ambiguity like there is in [PathInExpression]. ```rust # mod ops { @@ -196,16 +186,13 @@ type G = std::boxed::Box<dyn std::ops::FnOnce(isize) -> isize>; r[paths.qualifiers] ## Path qualifiers -Paths can be denoted with various leading qualifiers to change the meaning of -how it is resolved. +Paths can be denoted with various leading qualifiers to change the meaning of how it is resolved.
r[paths.qualifiers.global-root] ### `::` r[paths.qualifiers.global-root.intro] -Paths starting with `::` are considered to be *global paths* where the segments of the path -start being resolved from a place which differs based on edition. Each identifier in -the path must resolve to an item. +Paths starting with `::` are considered to be *global paths* where the segments of the path start being resolved from a place which differs based on edition. Each identifier in the path must resolve to an item. r[paths.qualifiers.global-root.edition2018] > [!EDITION-2018] @@ -272,12 +259,10 @@ r[paths.qualifiers.type-self.trait] * In a [trait] definition, it refers to the type implementing the trait. r[paths.qualifiers.type-self.impl] -* In an [implementation], it refers to the type being implemented. - When implementing a tuple or unit [struct], it also refers to the constructor in the [value namespace]. +* In an [implementation], it refers to the type being implemented. When implementing a tuple or unit [struct], it also refers to the constructor in the [value namespace]. r[paths.qualifiers.type-self.type] -* In the definition of a [struct], [enumeration], or [union], it refers to the type being defined. - The definition is not allowed to be infinitely recursive (there must be an indirection). +* In the definition of a [struct], [enumeration], or [union], it refers to the type being defined. The definition is not allowed to be infinitely recursive (there must be an indirection). r[paths.qualifiers.type-self.scope] The scope of `Self` behaves similarly to a generic parameter; see the [`Self` scope] section for more details. @@ -348,8 +333,7 @@ mod b { ``` r[paths.qualifiers.super.repetition] -`super` may be repeated several times after the first `super` or `self` to refer to -ancestor modules. +`super` may be repeated several times after the first `super` or `self` to refer to ancestor modules. 
```rust mod a { @@ -390,13 +374,10 @@ r[paths.qualifiers.macro-crate] ### `$crate` r[paths.qualifiers.macro-crate.allowed-positions] -[`$crate`] is only used within [macro transcribers], and can only be used as the first -segment, without a preceding `::`. +[`$crate`] is only used within [macro transcribers], and can only be used as the first segment, without a preceding `::`. r[paths.qualifiers.macro-crate.hygiene] -[`$crate`] will expand to a path to access items from the -top level of the crate where the macro is defined, regardless of which crate the macro is -invoked. +[`$crate`] will expand to a path to access items from the top level of the crate where the macro is defined, regardless of which crate the macro is invoked. ```rust pub fn increment(x: u32) -> u32 { @@ -414,41 +395,28 @@ r[paths.canonical] ## Canonical paths r[paths.canonical.intro] -Items defined in a module or implementation have a *canonical path* that -corresponds to where within its crate it is defined. +Items defined in a module or implementation have a *canonical path* that corresponds to where within its crate it is defined. r[paths.canonical.alias] All other paths to these items are aliases. r[paths.canonical.def] -The canonical path is defined as a *path prefix* appended by -the path segment the item itself defines. +The canonical path is defined as a *path prefix* appended by the path segment the item itself defines. r[paths.canonical.non-canonical] -[Implementations] and [use declarations] do not have canonical paths, although -the items that implementations define do have them. Items defined in -block expressions do not have canonical paths. Items defined in a module that -does not have a canonical path do not have a canonical path. Associated items -defined in an implementation that refers to an item without a canonical path, -e.g. as the implementing type, the trait being implemented, a type parameter or -bound on a type parameter, do not have canonical paths. 
+[Implementations] and [use declarations] do not have canonical paths, although the items that implementations define do have them. Items defined in block expressions do not have canonical paths. Items defined in a module that does not have a canonical path do not have a canonical path. Associated items defined in an implementation that refers to an item without a canonical path, e.g. as the implementing type, the trait being implemented, a type parameter or bound on a type parameter, do not have canonical paths. r[paths.canonical.module-prefix] The path prefix for modules is the canonical path to that module. r[paths.canonical.bare-impl-prefix] -For bare implementations, it is the canonical path of the item being implemented -surrounded by angle (`<>`) brackets. +For bare implementations, it is the canonical path of the item being implemented surrounded by angle (`<>`) brackets. r[paths.canonical.trait-impl-prefix] -For [trait implementations], it is the canonical path of the item being implemented -followed by `as` followed by the canonical path to the trait all surrounded in -angle (`<>`) brackets. +For [trait implementations], it is the canonical path of the item being implemented followed by `as` followed by the canonical path to the trait all surrounded in angle (`<>`) brackets. r[paths.canonical.local-canonical-path] -The canonical path is only meaningful within a given crate. There is no global -namespace across crates; an item's canonical path merely identifies it within -the crate. +The canonical path is only meaningful within a given crate. There is no global namespace across crates; an item's canonical path merely identifies it within the crate. ```rust // Comments show the canonical path of the item. 
diff --git a/src/procedural-macros.md b/src/procedural-macros.md index 369e1b9d67..00fd21c216 100644 --- a/src/procedural-macros.md +++ b/src/procedural-macros.md @@ -2,21 +2,16 @@ r[macro.proc] # Procedural macros r[macro.proc.intro] -*Procedural macros* allow creating syntax extensions as execution of a function. -Procedural macros come in one of three flavors: +*Procedural macros* allow creating syntax extensions as execution of a function. Procedural macros come in one of three flavors: * [Function-like macros] - `custom!(...)` * [Derive macros] - `#[derive(CustomDerive)]` * [Attribute macros] - `#[CustomAttribute]` -Procedural macros allow you to run code at compile time that operates over Rust -syntax, both consuming and producing Rust syntax. You can sort of think of -procedural macros as functions from an AST to another AST. +Procedural macros allow you to run code at compile time that operates over Rust syntax, both consuming and producing Rust syntax. You can sort of think of procedural macros as functions from an AST to another AST. r[macro.proc.def] -Procedural macros must be defined in the root of a crate with the [crate type] of -`proc-macro`. -The macros may not be used from the crate where they are defined, and can only be used when imported in another crate. +Procedural macros must be defined in the root of a crate with the [crate type] of `proc-macro`. The macros may not be used from the crate where they are defined, and can only be used when imported in another crate. > [!NOTE] > When using Cargo, Procedural macro crates are defined with the `proc-macro` key in your manifest: @@ -27,57 +22,31 @@ The macros may not be used from the crate where they are defined, and can only b > ``` r[macro.proc.result] -As functions, they must either return syntax, panic, or loop endlessly. Returned -syntax either replaces or adds the syntax depending on the kind of procedural -macro. Panics are caught by the compiler and are turned into a compiler error. 
-Endless loops are not caught by the compiler which hangs the compiler. +As functions, they must either return syntax, panic, or loop endlessly. Returned syntax either replaces or adds the syntax depending on the kind of procedural macro. Panics are caught by the compiler and are turned into a compiler error. Endless loops are not caught by the compiler which hangs the compiler. -Procedural macros run during compilation, and thus have the same resources that -the compiler has. For example, standard input, error, and output are the same -that the compiler has access to. Similarly, file access is the same. Because -of this, procedural macros have the same security concerns that [Cargo's -build scripts] have. +Procedural macros run during compilation, and thus have the same resources that the compiler has. For example, standard input, error, and output are the same that the compiler has access to. Similarly, file access is the same. Because of this, procedural macros have the same security concerns that [Cargo's build scripts] have. r[macro.proc.error] -Procedural macros have two ways of reporting errors. The first is to panic. The -second is to emit a [`compile_error`] macro invocation. +Procedural macros have two ways of reporting errors. The first is to panic. The second is to emit a [`compile_error`] macro invocation. r[macro.proc.proc_macro-crate] ## The `proc_macro` crate r[macro.proc.proc_macro-crate.intro] -Procedural macro crates almost always will link to the compiler-provided -[`proc_macro` crate]. The `proc_macro` crate provides types required for -writing procedural macros and facilities to make it easier. +Procedural macro crates almost always will link to the compiler-provided [`proc_macro` crate]. The `proc_macro` crate provides types required for writing procedural macros and facilities to make it easier. r[macro.proc.proc_macro-crate.token-stream] -This crate primarily contains a [`TokenStream`] type. 
Procedural macros operate -over *token streams* instead of AST nodes, which is a far more stable interface -over time for both the compiler and for procedural macros to target. A -*token stream* is roughly equivalent to `Vec<TokenTree>` where a `TokenTree` -can roughly be thought of as lexical token. For example `foo` is an `Ident` -token, `.` is a `Punct` token, and `1.2` is a `Literal` token. The `TokenStream` -type, unlike `Vec<TokenTree>`, is cheap to clone. +This crate primarily contains a [`TokenStream`] type. Procedural macros operate over *token streams* instead of AST nodes, which is a far more stable interface over time for both the compiler and for procedural macros to target. A *token stream* is roughly equivalent to `Vec<TokenTree>` where a `TokenTree` can roughly be thought of as lexical token. For example `foo` is an `Ident` token, `.` is a `Punct` token, and `1.2` is a `Literal` token. The `TokenStream` type, unlike `Vec<TokenTree>`, is cheap to clone. r[macro.proc.proc_macro-crate.span] -All tokens have an associated `Span`. A `Span` is an opaque value that cannot -be modified but can be manufactured. `Span`s represent an extent of source -code within a program and are primarily used for error reporting. While you -cannot modify a `Span` itself, you can always change the `Span` *associated* -with any token, such as through getting a `Span` from another token. +All tokens have an associated `Span`. A `Span` is an opaque value that cannot be modified but can be manufactured. `Span`s represent an extent of source code within a program and are primarily used for error reporting. While you cannot modify a `Span` itself, you can always change the `Span` *associated* with any token, such as through getting a `Span` from another token. r[macro.proc.hygiene] ## Procedural macro hygiene -Procedural macros are *unhygienic*. This means they behave as if the output -token stream was simply written inline to the code it's next to.
This means that -it's affected by external items and also affects external imports. +Procedural macros are *unhygienic*. This means they behave as if the output token stream was simply written inline to the code it's next to. This means that it's affected by external items and also affects external imports. -Macro authors need to be careful to ensure their macros work in as many contexts -as possible given this limitation. This often includes using absolute paths to -items in libraries (for example, `::std::option::Option` instead of `Option`) or -by ensuring that generated functions have names that are unlikely to clash with -other functions (like `__internal_foo` instead of `foo`). +Macro authors need to be careful to ensure their macros work in as many contexts as possible given this limitation. This often includes using absolute paths to items in libraries (for example, `::std::option::Option` instead of `Option`) or by ensuring that generated functions have names that are unlikely to clash with other functions (like `__internal_foo` instead of `foo`). @@ -359,68 +328,47 @@ r[macro.proc.token] ## Declarative macro tokens and procedural macro tokens r[macro.proc.token.intro] -Declarative `macro_rules` macros and procedural macros use similar, but -different definitions for tokens (or rather [`TokenTree`s].) +Declarative `macro_rules` macros and procedural macros use similar, but different definitions for tokens (or rather [`TokenTree`s].) r[macro.proc.token.macro_rules] Token trees in `macro_rules` (corresponding to `tt` matchers) are defined as - Delimited groups (`(...)`, `{...}`, etc) -- All operators supported by the language, both single-character and - multi-character ones (`+`, `+=`). +- All operators supported by the language, both single-character and multi-character ones (`+`, `+=`). - Note that this set doesn't include the single quote `'`. - Literals (`"string"`, `1`, etc) - - Note that negation (e.g. 
`-1`) is never a part of such literal tokens, - but a separate operator token. + - Note that negation (e.g. `-1`) is never a part of such literal tokens, but a separate operator token. - Identifiers, including keywords (`ident`, `r#ident`, `fn`) - Lifetimes (`'ident`) -- Metavariable substitutions in `macro_rules` (e.g. `$my_expr` in - `macro_rules! mac { ($my_expr: expr) => { $my_expr } }` after the `mac`'s - expansion, which will be considered a single token tree regardless of the - passed expression) +- Metavariable substitutions in `macro_rules` (e.g. `$my_expr` in `macro_rules! mac { ($my_expr: expr) => { $my_expr } }` after the `mac`'s expansion, which will be considered a single token tree regardless of the passed expression) r[macro.proc.token.tree] Token trees in procedural macros are defined as - Delimited groups (`(...)`, `{...}`, etc) -- All punctuation characters used in operators supported by the language (`+`, - but not `+=`), and also the single quote `'` character (typically used in - lifetimes, see below for lifetime splitting and joining behavior) +- All punctuation characters used in operators supported by the language (`+`, but not `+=`), and also the single quote `'` character (typically used in lifetimes, see below for lifetime splitting and joining behavior) - Literals (`"string"`, `1`, etc) - - Negation (e.g. `-1`) is supported as a part of integer - and floating point literals. + - Negation (e.g. `-1`) is supported as a part of integer and floating point literals. - Identifiers, including keywords (`ident`, `r#ident`, `fn`) r[macro.proc.token.conversion.intro] -Mismatches between these two definitions are accounted for when token streams -are passed to and from procedural macros. \ -Note that the conversions below may happen lazily, so they might not happen if -the tokens are not actually inspected. +Mismatches between these two definitions are accounted for when token streams are passed to and from procedural macros. 
Note that the conversions below may happen lazily, so they might not happen if the tokens are not actually inspected. r[macro.proc.token.conversion.to-proc_macro] When passed to a proc-macro - All multi-character operators are broken into single characters. - Lifetimes are broken into a `'` character and an identifier. - The keyword metavariable [`$crate`] is passed as a single identifier. -- All other metavariable substitutions are represented as their underlying - token streams. - - Such token streams may be wrapped into delimited groups ([`Group`]) with - implicit delimiters ([`Delimiter::None`]) when it's necessary for - preserving parsing priorities. - - `tt` and `ident` substitutions are never wrapped into such groups and - always represented as their underlying token trees. +- All other metavariable substitutions are represented as their underlying token streams. + - Such token streams may be wrapped into delimited groups ([`Group`]) with implicit delimiters ([`Delimiter::None`]) when it's necessary for preserving parsing priorities. + - `tt` and `ident` substitutions are never wrapped into such groups and always represented as their underlying token trees. r[macro.proc.token.conversion.from-proc_macro] When emitted from a proc macro -- Punctuation characters are glued into multi-character operators - when applicable. +- Punctuation characters are glued into multi-character operators when applicable. - Single quotes `'` joined with identifiers are glued into lifetimes. -- Negative literals are converted into two tokens (the `-` and the literal) - possibly wrapped into a delimited group ([`Group`]) with implicit delimiters - ([`Delimiter::None`]) when it's necessary for preserving parsing priorities. +- Negative literals are converted into two tokens (the `-` and the literal) possibly wrapped into a delimited group ([`Group`]) with implicit delimiters ([`Delimiter::None`]) when it's necessary for preserving parsing priorities. 
r[macro.proc.token.doc-comment] -Note that neither declarative nor procedural macros support doc comment tokens -(e.g. `/// Doc`), so they are always converted to token streams representing -their equivalent `#[doc = r"str"]` attributes when passed to macros. +Note that neither declarative nor procedural macros support doc comment tokens (e.g. `/// Doc`), so they are always converted to token streams representing their equivalent `#[doc = r"str"]` attributes when passed to macros. [Attribute macros]: #the-proc_macro_attribute-attribute [Cargo's build scripts]: ../cargo/reference/build-scripts.html diff --git a/src/visibility-and-privacy.md b/src/visibility-and-privacy.md index 759a992f8b..d5d7b64de9 100644 --- a/src/visibility-and-privacy.md +++ b/src/visibility-and-privacy.md @@ -12,27 +12,16 @@ Visibility -> ``` r[vis.intro] -These two terms are often used interchangeably, and what they are attempting to -convey is the answer to the question "Can this item be used at this location?" +These two terms are often used interchangeably, and what they are attempting to convey is the answer to the question "Can this item be used at this location?" r[vis.name-hierarchy] -Rust's name resolution operates on a global hierarchy of namespaces. Each level -in the hierarchy can be thought of as some item. The items are one of those -mentioned above, but also include external crates. Declaring or defining a new -module can be thought of as inserting a new tree into the hierarchy at the -location of the definition. +Rust's name resolution operates on a global hierarchy of namespaces. Each level in the hierarchy can be thought of as some item. The items are one of those mentioned above, but also include external crates. Declaring or defining a new module can be thought of as inserting a new tree into the hierarchy at the location of the definition. 
r[vis.privacy] -To control whether interfaces can be used across modules, Rust checks each use -of an item to see whether it should be allowed or not. This is where privacy -warnings are generated, or otherwise "you used a private item of another module -and weren't allowed to." +To control whether interfaces can be used across modules, Rust checks each use of an item to see whether it should be allowed or not. This is where privacy warnings are generated, or otherwise "you used a private item of another module and weren't allowed to." r[vis.default] -By default, everything is *private*, with two exceptions: Associated -items in a `pub` Trait are public by default; Enum variants -in a `pub` enum are also public by default. When an item is declared as `pub`, -it can be thought of as being accessible to the outside world. For example: +By default, everything is *private*, with two exceptions: Associated items in a `pub` Trait are public by default; Enum variants in a `pub` enum are also public by default. When an item is declared as `pub`, it can be thought of as being accessible to the outside world. For example: ```rust # fn main() {} @@ -52,51 +41,25 @@ pub enum State { ``` r[vis.access] -With the notion of an item being either public or private, Rust allows item -accesses in two cases: - -1. If an item is public, then it can be accessed externally from some module - `m` if you can access all the item's ancestor modules from `m`. You can - also potentially be able to name the item through re-exports. See below. -2. If an item is private, it may be accessed by the current module and its - descendants. - -These two cases are surprisingly powerful for creating module hierarchies -exposing public APIs while hiding internal implementation details. To help -explain, here's a few use cases and what they would entail: - -* A library developer needs to expose functionality to crates which link - against their library. 
As a consequence of the first case, this means that - anything which is usable externally must be `pub` from the root down to the - destination item. Any private item in the chain will disallow external - accesses. - -* A crate needs a global available "helper module" to itself, but it doesn't - want to expose the helper module as a public API. To accomplish this, the - root of the crate's hierarchy would have a private module which then - internally has a "public API". Because the entire crate is a descendant of - the root, then the entire local crate can access this private module through - the second case. - -* When writing unit tests for a module, it's often a common idiom to have an - immediate child of the module to-be-tested named `mod test`. This module - could access any items of the parent module through the second case, meaning - that internal implementation details could also be seamlessly tested from the - child module. - -In the second case, it mentions that a private item "can be accessed" by the -current module and its descendants, but the exact meaning of accessing an item -depends on what the item is. +With the notion of an item being either public or private, Rust allows item accesses in two cases: + +1. If an item is public, then it can be accessed externally from some module `m` if you can access all the item's ancestor modules from `m`. You can also potentially be able to name the item through re-exports. See below. +2. If an item is private, it may be accessed by the current module and its descendants. + +These two cases are surprisingly powerful for creating module hierarchies exposing public APIs while hiding internal implementation details. To help explain, here's a few use cases and what they would entail: + +* A library developer needs to expose functionality to crates which link against their library. 
As a consequence of the first case, this means that anything which is usable externally must be `pub` from the root down to the destination item. Any private item in the chain will disallow external accesses. + +* A crate needs a global available "helper module" to itself, but it doesn't want to expose the helper module as a public API. To accomplish this, the root of the crate's hierarchy would have a private module which then internally has a "public API". Because the entire crate is a descendant of the root, then the entire local crate can access this private module through the second case. + +* When writing unit tests for a module, it's often a common idiom to have an immediate child of the module to-be-tested named `mod test`. This module could access any items of the parent module through the second case, meaning that internal implementation details could also be seamlessly tested from the child module. + +In the second case, it mentions that a private item "can be accessed" by the current module and its descendants, but the exact meaning of accessing an item depends on what the item is. r[vis.usage] -Accessing a module, for example, would mean looking inside of it (to import more items). On the other hand, accessing a -function would mean that it is invoked. Additionally, path expressions and -import statements are considered to access an item in the sense that the -import/expression is only valid if the destination is in the current visibility -scope. +Accessing a module, for example, would mean looking inside of it (to import more items). On the other hand, accessing a function would mean that it is invoked. Additionally, path expressions and import statements are considered to access an item in the sense that the import/expression is only valid if the destination is in the current visibility scope. 
-Here's an example of a program which exemplifies the three cases outlined -above: +Here's an example of a program which exemplifies the three cases outlined above: ```rust // This module is private, meaning that no external crate can access this @@ -148,33 +111,25 @@ pub mod submodule { # fn main() {} ``` -For a Rust program to pass the privacy checking pass, all paths must be valid -accesses given the two rules above. This includes all use statements, -expressions, types, etc. +For a Rust program to pass the privacy checking pass, all paths must be valid accesses given the two rules above. This includes all use statements, expressions, types, etc. r[vis.scoped] ## `pub(in path)`, `pub(crate)`, `pub(super)`, and `pub(self)` r[vis.scoped.intro] -In addition to public and private, Rust allows users to declare an item as -visible only within a given scope. The rules for `pub` restrictions are as -follows: +In addition to public and private, Rust allows users to declare an item as visible only within a given scope. The rules for `pub` restrictions are as follows: r[vis.scoped.in] -- `pub(in path)` makes an item visible within the provided `path`. - `path` must be a simple path which resolves to an ancestor module of the item whose visibility is being declared. - Each identifier in `path` must refer directly to a module (not to a name introduced by a `use` statement). +- `pub(in path)` makes an item visible within the provided `path`. `path` must be a simple path which resolves to an ancestor module of the item whose visibility is being declared. Each identifier in `path` must refer directly to a module (not to a name introduced by a `use` statement). r[vis.scoped.crate] - `pub(crate)` makes an item visible within the current crate. r[vis.scoped.super] -- `pub(super)` makes an item visible to the parent module. This is equivalent - to `pub(in super)`. +- `pub(super)` makes an item visible to the parent module. This is equivalent to `pub(in super)`. 
r[vis.scoped.self] -- `pub(self)` makes an item visible to the current module. This is equivalent -to `pub(in self)` or not using `pub` at all. +- `pub(self)` makes an item visible to the current module. This is equivalent to `pub(in self)` or not using `pub` at all. r[vis.scoped.edition2018] > [!EDITION-2018] @@ -239,10 +194,7 @@ r[vis.reexports] ## Re-exporting and visibility r[vis.reexports.intro] -Rust allows publicly re-exporting items through a `pub use` directive. Because -this is a public directive, this allows the item to be used in the current -module through the rules above. It essentially allows public access into the -re-exported item. For example, this program is valid: +Rust allows publicly re-exporting items through a `pub use` directive. Because this is a public directive, this allows the item to be used in the current module through the rules above. It essentially allows public access into the re-exported item. For example, this program is valid: ```rust pub use self::implementation::api; @@ -256,10 +208,7 @@ mod implementation { # fn main() {} ``` -This means that any external crate referencing `implementation::api::f` would -receive a privacy violation, while the path `api::f` would be allowed. +This means that any external crate referencing `implementation::api::f` would receive a privacy violation, while the path `api::f` would be allowed. r[vis.reexports.private-item] -When re-exporting a private item, it can be thought of as allowing the "privacy -chain" being short-circuited through the reexport instead of passing through -the namespace hierarchy as it normally would. +When re-exporting a private item, it can be thought of as allowing the "privacy chain" being short-circuited through the reexport instead of passing through the namespace hierarchy as it normally would.