Use marker type to enforce validation of Address's network#1489
Use marker type to enforce validation of Address's network#1489apoelstra merged 1 commit intorust-bitcoin:masterfrom
Address's network#1489Conversation
Kixunil
left a comment
There was a problem hiding this comment.
Thanks! I think the answer to your serde question is simply don't impl Deserialize for UncheckedNetwork. See my other comments as well.
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
One annoying property of these derives is they require V to implement the traits. We should just implement them manually.
Unrelated but I think deriving PartialOrd was wrong. I would expect the addresses to be lexicographically sorted but this sorts according to the internal representation.
There was a problem hiding this comment.
Why was this marked as resolved if the derives are still here?
There was a problem hiding this comment.
Sorry, I did not understand you would like them implemented manually as part of this PR. Would you be OK if it's done as follow-up? It is not completely related to this change and I can imagine it might spark new set of discussions. I could look at it right after this PR.
There was a problem hiding this comment.
I believe it's strongly related to this PR because that's literally doing this PR right. But since the downsides are mostly about documentation being bloated/less readable and a bunch of more code the compiler must process I think it's not too bad to put it in a separate PR if it helps you.
FTR AFAIK nobody is working in this file so you shouldn't get git conflicts anytime soon if you decide to do it in this one.
There was a problem hiding this comment.
Got it. I prefer opening new PR for it. It might take me some time to figure out some of the implementations.
There was a problem hiding this comment.
The derivative crate can actually help but we have strict policy on dependencies here. @apoelstra what do you think? It has MSRV 1.36, doesn't look maintained but presumably does what's needed so maybe that's not an issue (esp. for proc macro). I don't know the author but I plan to review the crate anyway for my project and I would publish crev proof.
There was a problem hiding this comment.
One thing I realized. If we remove the derives from the marker types (NetworkChecked, NetworkUnchecked), I think the following would not be possible:
#[derive(Clone)]
pub struct MyStruct<V: NetworkValidation> {
pub address: Address<V>
}
#[test]
fn test_my_struct() {
let s: MyStruct<NetworkChecked> = todo!();
let s2 = s.clone();
}This compiles when NetworkChecked has Clone, otherwise it does not.
So although rust-bitcoin library may not need the derives (because we could manually implement all the traits directly on Address), people may want to use the marker types in their own types that use derivation.
At this moment, similar problem exists with Debug because Address<V> does not implement Debug and I guess the compiler does not know that all NetworkValidation implementations have it (they don't have it in this PR, but I added it in my experiments).
In this situation, the compiler suggests to add where Address<V>: Debug to MyStruct or #[derive(Debug)] to Address<V>. I haven't figured out yet how to deal with this situation.
There was a problem hiding this comment.
Hmm, yes, I was thinking that we would deal with the annoyance so downstream users don't have to but maybe it's not that simple. But making the struct generic like you showed already has other problems, so I'm not sure if we should be very worried about it. Most people probably should have the field fixed.
I kinda like that such types in a separate module look like enums.
And also
No, they are not stored stored, just faked via
Because, frankly, the derive macros are retarded and put wrong bounds on generic arguments. I remember complaining about this at |
|
Concept ACK. this To answer your questions
This is triply stupid because (a) every trait ought to be implemented on uninhabited types, and (b) even if it weren't uninhabited, it shouldn't matter what traits are impl'd for phantom data, and (c) even if it did matter, phantoms ought to implement every trait. But Rust doesn't work like that, so here we are. You have to add a bunch of extra |
I like this argument, but I disagree -- I think we should allow displaying etc on unchecked addresses. It's reasonable for some applications that simply pass addresses around to want to parse them/check for validity while still being agnostic about the specific network. |
Same argument as before: tempts people to hard-code networks. Maybe if the adaptor only provided to check that the network is not mainnet (for quick prototyping) I'd be open to it.
impl Iterator for Uninhabited {
type Item = ??? what goes here ???;
// ...
}
It seems to me that the whole
Same issue as above There is a fourth one though. If you have this: #[derive(Clone)]
struct Foo<I: Iterator> {
item: I::Item,
}Then the trait bound created by the derive is
They could just call |
|
Oh, crap, now I realized we can't implement |
|
I think
Hmm, maybe we want to use |
There's a good chance it is. But I also don't like the idea of closing the door for now. I think it would be better to make the representation of /// ...
///
/// `repr(transparent)` is for internal purposes only, do not rely on it!
#[repr(transparent)] // so that we can use `unsafe` to cast the types
struct Address<V: Validation> {
raw: RawAddress,
_phantom: PhantomData<V>,
}
struct RawAddress {
payload: Payload,
network: Network,
}
Oh, yeah, forgot about that one. It seems to make sense to have |
|
I agree, though I'd be okay accepting this PR as-is and then doing a followup to hide the internals. |
|
Thanks a lot to everybody. I reflected most of the comments in these three commits:
Also, part of the last one is polishing of documentation (there were invalid links etc.). What remains is the naming. The reasoning behind my choice of However, I understand why the naming may bother people and I don't insist on it. Also, since you are much more familiar with the project, you all will have better feeling for the names. Alternative I would prefer next is With that, we could have I cannot think of good names with 'valid' instead of 'check' that would make sense. I would quite like |
|
0e448ae3398cd4486855693f524c7e0e40f7f902 looks good to me, except for the naming.
There is no way for this library to know what "valid" means or whether it is true about a given address. The only notion of "validity" we have is whether or not it parses correctly, and therefore any instance of I agree with |
0e448ae to
b9311a4
Compare
|
Renamed to And squashed. |
Address's networkAddress's network
apoelstra
left a comment
There was a problem hiding this comment.
ACK b9311a4cabb3d51a8c09563731dc71e5b3e2c2f3
tcharding
left a comment
There was a problem hiding this comment.
Man, I really like this PR.
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
This inference is cool, I had to play with it to check the doc statement was correct and that it was what we wanted - yes on both accounts!
If you want it, I added
#[test]
fn address_constructor_infers_network_unchecked() {
let addr = Address::new(Bitcoin, Payload::PubkeyHash(hex_into!("162c5ea71c0b23f5b9022ef047c4a86470a5b070")));
// This method is only valid for `Address<NetworkUnchecked>`.
addr.is_valid_for_network(Network::Bitcoin);
}
#[test]
fn address_constructor_infers_network_checked() {
let addr = Address::new(Bitcoin, Payload::PubkeyHash(hex_into!("162c5ea71c0b23f5b9022ef047c4a86470a5b070")));
// This method is only valid for `Address<NetworkChecked>`.
addr.is_standard();
}
Kixunil
left a comment
There was a problem hiding this comment.
Can't focus on a proper review rn, sorry (I'm sick).
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
Does this doc comment have any effect?
There was a problem hiding this comment.
This, and other two on the other impl Addr<…>, were originally plain comments but I turned them into doc as a hint to readers of documentation that implementation of methods and functions are spread across multiple impls.
Would you prefer another way? Not having them as doc comments?
There was a problem hiding this comment.
I was just curious, if it has no effect it doesn't matter which kind of comment it is.
There was a problem hiding this comment.
It gets rendered in the documentation at the beginning of the impl.
0dc373d to
d344f56
Compare
|
Thanks for additional comments. I fixed wording of doc comments of
No worries. Get well, Martin. |
Kixunil
left a comment
There was a problem hiding this comment.
Thankfully my health significantly improved to the point of being able to properly review this. 🎉
This is almost ACK, I'm just unsure about address_type.
bitcoin/examples/ecdsa-psbt.rs
Outdated
There was a problem hiding this comment.
Unrelated but we should have a const constructor for Address if possible.
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
FYI I would normally complain about this not being a separate type but I guess people will usually parse and validate the address at roughly same place so merging them seems OK.
There was a problem hiding this comment.
Originally I started to write it as separate type but when I saw that other parsing-related errors are in the big enum, I added it there, otherwise Address::from_str("…")?.require_network(…)? would not be possible.
There was a problem hiding this comment.
Yeah, we would need a third type which seems too much.
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
I was just curious, if it has no effect it doesn't matter which kind of comment it is.
sanket1729
left a comment
There was a problem hiding this comment.
ACK d344f56bae98103359083fa4a626d98b779204b1. Though I also prefer the name require_network in place of expect_network
7cdf3d8 to
7d1f13b
Compare
|
One more change: |
Kixunil
left a comment
There was a problem hiding this comment.
Unfortunately I just noticed stale derives and have some additional questions.
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
Is : Address<NetworkChecked> even needed here? I think inference should work fine because require_network is only implemented on unchecked address. It'd be nice to show people the simplest version of the code.
There was a problem hiding this comment.
It's not needed for type inference. I added it as an indication for readers to see what's happening with checked/unchecked and to see that Address (variant 2) and Address<NetworkChecked> (variant 3) are equivalent.
Variant 1 could also be simplified all the way to:
let address: Address<_> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf".parse().unwrap();
let address = address.require_network(Network::Bitcoin).unwrap();but I think it adds a bit less value to the readers.
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
Why was this marked as resolved if the derives are still here?
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
These enums shouldn't have any derives since they don't do anything but bloat compile times.
There was a problem hiding this comment.
AFAIK they are needed as long as they are on Address struct. If you are OK with implementing the traits on Address manually (cf. #1489 (comment) ) as follow-up to this PR, I would remove them as part of the follow-up, too.
Otherwise let me know and I will do it as part of this one.
There was a problem hiding this comment.
Yeah, I meant changing them to manual. I know it's annoying but at least you only have 3-field struct which isn't much. I have a project with several structs each having ~6 fields and a I think at least one impl is for a stupid reason... And I can't hack it with PhantomData because I have T::Assoc. 🤷♂️
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
Is this here only because calling Debug is wrong and we want to avoid duplicating the code since debug is (and should be) implemented for both? Note that @apoelstra didn't address my argument that people who want to just validate can call assume_valid. Or display the original string which will be faster anyway.
I suggest to have fmt_unchecked private method and call it from both impls, Display being only available for checked.
Oh and maybe we should mess up debug formatting a bit to discourage invalid usage? (put Address() around it or something).
There was a problem hiding this comment.
Reasonable. I will address this in my next commit.
There was a problem hiding this comment.
This brings an issue with serde and I am not sure what would be the best way around it.
Currently, we have Serialize and Deserialize for Address<NetworkUnchecked> (via macro) and Serialize for Address<NetworkChecked> (manual impl).
Serialize for Address<NetworkUnchecked> uses Display (in the macro). After removing Display from Address<NetworkUnchecked>, it can't be used anymore.
I guess most correct way would be to have only Deserialize for Address<NetworkUnchecked> and Serialize for Address<NetworkChecked> but it seems to me it would hurt ergonomics of the usage.
Alternative is to keep Serialize for Address<NetworkUnchecked> and make it work (via private fmt).
What do people think would be better approach?
There was a problem hiding this comment.
I think the best approach is to use the private fmt method.
There was a problem hiding this comment.
but it seems to me it would hurt ergonomics of the usage.
This might be true. If people have it inside a larger struct then it can get annoying since they have to move it out and in. Ideally they would have the struct tagged as well but there are n Rust-native tools to compose this. I will give it more thought.
|
Perhaps we should only implement |
|
From the issue:
While playing around with this I realized that the current API has not helped to discourage users from hard coding the network e.g., let address = Address::from_str("32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf")
.expect("failed to parse address")
.require_network(Network::Bitcoin)
.expect("wrong network");Where would one get the network from? Everyone either has to feature gate test code or pass the network around? |
bitcoin/src/address.rs
Outdated
There was a problem hiding this comment.
While reviewing I wrote some code, you can use any of this in the examples if you like it
// Use `FromStr` to get an `Address` from a string.
let address = Address::from_str("32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf")
.expect("failed to parse address")
.require_network(Network::Bitcoin)
.expect("wrong network");
// Display or serialize an `Address`.
println!("address: {}", address);
let ser = serde_json::to_string(&address).expect("failed to serialize address");
// Use `parse` to get an unchecked address from a string.
let address: Address<NetworkUnchecked> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf"
.parse()
.expect("failed to parse address");
// `Display` is explicitly *not* implemented for `Address<NetworkUnchecked>`.
println!("address: {:?}", address);
// `Serialize` is also explicitly *not* implemented for `Address<NetworkUnchecked>`.
// let _ = serde_json::to_string(&address).expect("failed to serialize address");
// The generic on `Address<T>` is there to catch incorrect addresses and to prevent hard coding of the network in bitcoin codebases - ouch current API does not encourage _not_ hard coding the network.
let address = address
.require_network(Network::Bitcoin)
.expect("wrong network");
// Equivalent to
let address: Address<NetworkChecked> = "32iVBEu4dxkUQk9dJbZUiBiQdmypcEyJRf"
.parse::<Address<NetworkUnchecked>>()
.expect("failed to parse address")
.require_network(Network::Bitcoin)
.expect("wrong network");
I feel this variant would be more correct. But it would disallow just writing One more option may be something between. |
I don't think it's even possible. I'm not too concerned about that, I'm only concerned about encouraging to hardcode.
Passing around the network from some configuration is the right way to do that. An exception is mobile apps which can not take arguments and so they need to have different builds for different networks. But that still needs to be only hardcoded in a single place of the application so that a build for different network is easy.
Exactly, this would be very annoying.
This sounds interesting, will think about it. |
I played with this idea for a while. While it may look good at first, the issue is that this will not work when the address is in some container ( So given that, I believe providing serializer for |
|
OK, that seems to make sense. What do others think? |
|
+1 to having a serializer for |
7cfe028 to
08cd117
Compare
Parsing addresses from strings required a subsequent validation of network of the parsed address. However, this validation was not enforced by compiler, one had to remember to perform it. This change adds a marker type to `Address` that will assist the compiler in enforcing this validation.
08cd117 to
bef7c6e
Compare
|
I believe this is ready for final review. The current version contains:
|
|
OK, 1 week should be enough, I'll review this. FTR, I recently realized we'll have a similar problem with |
| fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { | ||
| write!(f, "Address<NetworkUnchecked>(")?; | ||
| self.fmt_internal(f)?; | ||
| write!(f, ")") |
There was a problem hiding this comment.
This could be annoying in generic contexts since Address<T: NetworkValidation> doesn't imply Address<T>: Debug. I suggest adding const IS_CHECKED: bool; to the NetworkValidation trait and using it here to determine if the address should be wrapped.
This can go into a separate PR.
There was a problem hiding this comment.
Understand. Will have a look at it.
| } | ||
|
|
||
| impl FromStr for Address { | ||
| // Address can be parsed only with NetworkUnchecked. |
There was a problem hiding this comment.
Nit: could've been doc comment.
|
Where is the serde::Serialize impl for Other than this nit, bef7c6e LGTM. |
It's right after the other impls. rust-bitcoin/bitcoin/src/address.rs Lines 696 to 712 in bef7c6e |
|
Ah! It just doesn't use the macro. Ok, I'm happy. |
…ion>` ebfbe74 Implement `Debug` for generic `Address<V: NetworkValidation>` (Jiri Jakes) Pull request description: Previously `Debug` was implemented for both `Address<NetworkChecked>` and `Address<NetworkUnchecked>`, but not for cases when the `NetworkValidation` parameter was generic. This change adds this ability. Based on Kixunil's tip. With previous implementation, the `test_address_debug()` resulted in error:  The added `Debug` on `NetworkChecked` and `NetworkUnchecked` are required by compiler. --- While dealing with derives and impls, I also attempted to turn all the derives on `Address` into manual impls (see Kixunil's suggestion in #1489 (comment)). The motivation behind this was the possibility to remove derives on `NetworkChecked` and `NetworkUnchecked`, too. However, even with manual impls, all the traits on `NetworkChecked` and `NetworkUnchecked` were still required by compiler in this sort of situations (see also the rest of the same discussion linked above). I do not fully understand why, perhaps limitation of this way of sealing traits? It can be demonstrated by removing `Debug` derivation on `NetworkUnchecked` and `NetworkChecked` in this PR and running `test_address_debug()`. Therefore, if we want to allow users of the library to define types generic in `NetworkValidation` and at the same time derive impls, it seems to me that `NetworkChecked` and `NetworkUnchecked` will have to have the same set of impls as `Address` itself. ACKs for top commit: Kixunil: ACK ebfbe74 tcharding: ACK ebfbe74 apoelstra: ACK ebfbe74 Tree-SHA512: 87f3fa4539602f31bf4513a29543b04e943c3899d8ece36d0d905c3b5a2d76e29eb86242694b5c494faa5e54bb8f69f5048849916c6438ddd35030368f710353
|
Bff, I haven't been active for a while, so I can only complain after the fact. But this change seems totally unnecessary to me and it means using |
|
This change uncovered a bunch of bugs in soon-to-be-production code I maintain, so this is a huge win. I find the annoyance worth it.
In case you're trying to upgrade |
|
Yeah I am continually rebasing my sync+async refactor of the lib: rust-bitcoin/rust-bitcoincore-rpc#212 Not all addresses are Sync because it has a generic parameter now and the generic type isn't guaranteed to be Sync (because the seal thing is a hack and the compiler doesn't know that all possible generic types for the type are in this library and are hence Sync). The fix I made should enforce the Sync-ness. Also, in your bump to 0.30 MR you're not allowing users to enter network-unchecked arguments which is annoying because the return type of the API itself is network-unchecked. |
|
Just saw your PR, makes sense. I did intentionally only accept checked to enforce checking but it's true it's kinda questionable since Core checks it anyway. |
|
I accept full responsibility for the |
Adds marker type
NetworkValidationtoAddressto help compiler enforce network validation. Inspired by Martin's suggestion. Closes #1460.Open questions:
Address:from_strvia macroserde_string_impl!(Address, "a Bitcoin address");. I don't think there is much we can do, so unless somebody has a better idea how to combine serde and network validation, I would just demacroed the macro forAddressand addunsafe_mark_network_validinto it.mod validation? As they are now, they live inaddressnamespace so I don't think mod is necessary.Addressare now onAddress<NetworkValid>except one (address_type) that needs to be called on both and a few that are only onAddress<NetworkUnchecked>(mainlyis_valid_for_network). Some methods (e. g.to_qr_uri,is_standardand perhaps others) could be, theoretically, called on both valid and unchecked. I think we should encourage validating the network ASAP, so I would leave them on NetworkValid only, but I can move them if others have different opinion.NetworkValidandNetworkUncheckedenums have some trait impls derived? ThePartialEqwas necessary for tests (I thinkassert_eqrequired it) but I am not sure whether some other would be good to have. The enums are only used as types so I guess it's not necessary, but also I do not fully understand why thePartialEqwas needed.