Lints to ensure link text for EIPs should match the EIP's number#99
Lints to ensure link text for EIPs should match the EIP's number#99KatyaRyazantseva wants to merge 16 commits intoethereum:masterfrom
Conversation
SamWilsn
left a comment
There was a problem hiding this comment.
Overall, I'd try to avoid hard coding regexes where possible. If they're hard coded, they can't be changed by the configuration file.
eipw-lint/src/lib.rs
Outdated
| ( | ||
| "markdown-link-other", | ||
| MarkdownLinkOther { | ||
| pattern: markdown::LinkOther(r"^(EIP|ERC)-(\d+)\s*\S*$") |
There was a problem hiding this comment.
Should this be case insensitive ((?i))?
There was a problem hiding this comment.
I think we have the other lint to check for incorrectly cased references. So even if [eip-1](./eip-1.md) passes markdown-link-other, it'll fail the other lint.
Or at least I think it will 🤔
| if self.link_depth > 0 { | ||
| self.link_depth = self.link_depth.checked_sub(1).unwrap(); | ||
| } |
There was a problem hiding this comment.
If self.link_depth is greater than zero, the checked_sub is unnecessary.
There was a problem hiding this comment.
checked_sub in the depart_link function returns self.link_depth to 0. I check link_depth in enter_text fn. If link_depth == 0, it means the text is not from the link, so, I skip it.
There was a problem hiding this comment.
What I mean is that if self.link_depth > 0 then self.link_depth - 1 can never underflow. The checked_sub will never fail.
| if self.link_depth > 0 { | ||
| self.current_link.text = txt.to_owned(); | ||
| self.check(ast)?; | ||
| } |
There was a problem hiding this comment.
How does this behave for content like:
[**EIP-1**5678](./eip-1.md#rationale)There was a problem hiding this comment.
It fails. I will add it to the tests. Should it fail?
There was a problem hiding this comment.
Hm, perhaps I spoke too soon then 🤣
I was worried that because every text node in the link sets self.current_link.text, you could end up in situations where, for example, bold would create two text nodes, breaking the lint. Maybe something like [**EIP-1**EIP-1](./eip-1.md).
If my concerns are unfounded, ignore me!
| | | ||
| 4 | [EIP-1](./eip-2.md) | ||
| | | ||
| = info: link text should match `[EIP|ERC-2]` |
There was a problem hiding this comment.
If you can, I'd rather the specific expected text in the message. Maybe something like:
error[markdown-link-eip]: link text does not match link destination
|
4 | [EIP-1](./eip-2.md)
|
= help: `use [EIP-2](./eip-2.md)` instead
There was a problem hiding this comment.
I am working on the help hint. I check the link and build the correct version of the text based on the link. Is it correct? So, my hints will look like this:
[EIP-1](./eip-2.md) -> [EIP-2](./eip-2.md)
[EIP-1: Foo](./eip-1.md) -> [EIP-1](./eip-1.md)
[Another Proposal](./eip-1.md) -> [EIP-1](./eip-1.md)
[EIP-1](./eip-1.md#motivation) -> [EIP-1: Motivation](./eip-1.md#motivation)
[EIP-2: Hello](./eip-1.md#abstract) -> [EIP-1: Abstract](./eip-1.md#abstract)
[Another Proposal](./eip-1.md#rationale) -> [EIP-1: Rationale](./eip-1.md#rationale)
Example:
error[markdown-link-eip]: link text does not match link destination
|
4 | [Another Proposal](./eip-1.md#rationale)
|
= help: ` [EIP-1: Rationale](./eip-1.md#rationale)` instead
| | | ||
| 4 | [Another Proposal](./eip-1.md#rationale) | ||
| | | ||
| = info: link text should match `[EIP|ERC-1<section-description>]` |
There was a problem hiding this comment.
Similar kind of comment here. The help text should (if possible) include the correct text to use. Authors might not understand how to make something "match".
|
Pushed updates. I tried to fix everything according to comments. Haven't found EIP-1 example in existing eips. So, for now ignoring you)) Can we create a new issue for it in case it comes later? Hardcoded regexes in lib.rs are kind of a style for the whole file, I followed the pattern. If we need to change regexes in config files, we need to refactor it globally. |
The configuration in Other regexes, like this one cannot be overridden. |
|
I've added a test in 09d7de7 that demonstrates the problem I brought up in #99 (comment). |
This one is dynamic. It depends on the lib.rs pattern. I extract the number of eip there and put into a new regex. Do you have any better ideas? |
|
fixed bold text |
|
@SamWilsn, I've done with the fixes. Could you please review it and merge if there are no any comments? |
| fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> Result<String, Error> { | ||
| if let Some(captures) = re.captures(text) { | ||
| Ok(captures | ||
| .get(index) | ||
| .map(|m| m.as_str().to_string()) | ||
| .unwrap_or_default()) | ||
| } else { | ||
| Ok(String::new()) | ||
| } | ||
| } |
There was a problem hiding this comment.
This function never returns an error, so you can simplify it a bit:
| fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> Result<String, Error> { | |
| if let Some(captures) = re.captures(text) { | |
| Ok(captures | |
| .get(index) | |
| .map(|m| m.as_str().to_string()) | |
| .unwrap_or_default()) | |
| } else { | |
| Ok(String::new()) | |
| } | |
| } | |
| fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> String { | |
| if let Some(captures) = re.captures(text) { | |
| captures | |
| .get(index) | |
| .map(|m| m.as_str().to_string()) | |
| .unwrap_or_default() | |
| } else { | |
| String::new() | |
| } | |
| } |
| Ok(captures | ||
| .get(index) | ||
| .map(|m| m.as_str().to_string()) | ||
| .unwrap_or_default()) |
There was a problem hiding this comment.
I'd say that if the regular expression has the wrong number of capture groups, we should inform the user instead of silently returning the empty string.
You can ignore my previous comment about simplifying, and use Error::custom and something like:
#[derive(Debug, Snafu)]
struct BadRegex;| let url_eip_text = self.extract_capture(&self.current_link.url, &self.re, 1)?; | ||
| let url_eip_number = self.extract_capture(&self.current_link.url, &self.re, 2)?; | ||
| let url_section = self.extract_capture(&self.current_link.url, &self.re, 4)?; |
There was a problem hiding this comment.
This repeats the regex search each time, which is pretty inefficient.
| fn check(&self, ast: &Ast) -> Result<Next, Error> { | ||
| let url_eip_text = self.extract_capture(&self.current_link.url, &self.re, 1)?; | ||
| let url_eip_number = self.extract_capture(&self.current_link.url, &self.re, 2)?; | ||
| let url_section = self.extract_capture(&self.current_link.url, &self.re, 4)?; |
There was a problem hiding this comment.
eipw-lint already depends on url for parsing URLs. You could use it here to get the fragment (#...).
| let section_description = Visitor::transform_section_description(&url_section); | ||
| format!( | ||
| "[{}{}: {}]({})", | ||
| url_eip_text.to_uppercase(), |
There was a problem hiding this comment.
Weirdly enough, ERCs are still stored in files named like eip-1234.md, so you can't use the filename to predict whether you need "EIP-..." or "ERC-..."
The correct way to solve this (reading the linked file) won't work for reasons outside eipw's scope, so... I guess the best we can do is something like:
help: use
[EIP-1237](./eip-1237.md)or[ERC-1237](./eip-1237.md)instead
|
@SamWilsn I updated Rust (1.80.1 the latest stable version) and now have this issue with |
|
If you're using rustup, you can use, for example, I'll update the rust version in a separate PR. |
Issue 67. Covered examples with tests.