Commit 3f614c3

Fix serialization of (Plutus) 'Data'
The Cardano ledger does not enforce a single serialization format for most of the binary data that can end up on-chain. In CBOR, data structures like lists or maps can be encoded in two ways: finite (using an explicit length) or indefinite (using begin/end markers). For example, the list [20] can be encoded either as:

```
8114   # finite; 81 = 80 + 01, indicates a list of length 1
# OR
9F14FF # indefinite; 9F marks the beginning of the list and FF the end
```

This means that the same Haskell (or simply, unmarshalled) data-type can have multiple binary representations. The ledger copes with that by memoizing the original binary representation of decoded data, which avoids re-serializing any data into a _different_ representation. Ogmios, and basically any program manipulating binary data from the chain, MUST use the same strategy.

There is an example case with the following transaction on the testnet:

<https://testnet.cexplorer.io/tx/c8b91a27976836f5ba349275bba3f0b81eb1d51aaf31a4340035ce450fdf83a7>

which carries the following datum (<https://testnet.cexplorer.io/datum/06269f665176dc65b334d685b2f64826c092875ca77e15007b5f690ecc9db1af>):

```
D8798441FFD87982D87982D87982D87981581CC279A3FB3B4E62BBC78E2887
83B58045D4AE82A18867D8352D02775AD87981D87981D87981581C121FD22E
0B57AC206FEFC763F8BFA0771919F5218B40691EEA4514D0D87A80D87A801A
002625A0D87983D879801A000F4240D879811A000FA92E
```

Now, the datum reported by CExplorer (using cardano-db-sync) and Ogmios (before this commit) is:

```
D8799F41FFD8799FD8799FD8799FD8799F581CC279A3FB3B4E62BBC78E2887
83B58045D4AE82A18867D8352D02775AFFD8799FD8799FD8799F581C121FD2
2E0B57AC206FEFC763F8BFA0771919F5218B40691EEA4514D0FFFFFFFFD87A
80FFD87A80FF1A002625A0D8799FD879801A000F4240D8799F1A000FA92EFF
FFFF
```

Both binary representations deserialize to the same high-level data-type, but the former uses finite data structures in binary, whereas the latter uses indefinite structures.

If one tries to re-calculate the hash digest of the reported datum, one obtains (obviously) something different from what the ledger expects. That's because the raw datum has been re-serialized by the programs consuming it from the chain, whereas the hash calculated and used by the ledger is the one corresponding to the original (finite data structures) datum!

This is tricky to reason about and also tricky to discover, because even property tests couldn't catch it (the ledger's arbitrary instances only generate data in the indefinite form!). Given the output of CExplorer, it seems that cardano-db-sync is also affected by this bug, and possibly more client applications down the line. Moreover, hardware devices serialize all data using finite structures, which creates this discrepancy for applications that interact with or integrate hardware devices.
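To make the two encodings and the memoization strategy concrete, here is a minimal sketch that uses only `base`. It hand-rolls a tiny subset of CBOR (lists shorter than 24 items, each item an unsigned integer below 24) and is not how the ledger or Ogmios implement it; `Memo`, `encodeFinite`, `encodeIndefinite`, `decodeList` and `toHex` are hypothetical names introduced for illustration. Note that the byte `14` (hex) is the CBOR encoding of the integer 20.

```haskell
module Main where

import Data.Char (toUpper)
import Data.Word (Word8)
import Numeric (showHex)

-- | Hypothetical wrapper: a decoded value plus the exact bytes it came from.
data Memo a = Memo { memoValue :: a, memoBytes :: [Word8] }

-- | The sketch only supports items that fit in a single CBOR byte (0..23).
smallItems :: [Word8] -> Bool
smallItems = all (< 24)

-- | Finite encoding: header byte 0x80 + length, followed by the items.
encodeFinite :: [Word8] -> Maybe [Word8]
encodeFinite xs
    | length xs < 24 && smallItems xs =
        Just ((0x80 + fromIntegral (length xs)) : xs)
    | otherwise = Nothing

-- | Indefinite encoding: 0x9F, the items, then the 0xFF "break" byte.
encodeIndefinite :: [Word8] -> Maybe [Word8]
encodeIndefinite xs
    | smallItems xs = Just (0x9F : xs ++ [0xFF])
    | otherwise = Nothing

-- | Accept either encoding and remember the original bytes.
decodeList :: [Word8] -> Maybe (Memo [Word8])
decodeList bytes = case bytes of
    (0x9F : rest)
        | (items, [0xFF]) <- span (< 24) rest ->
            Just (Memo items bytes)
    (hdr : items)
        | hdr >= 0x80, hdr < 0x98
        , length items == fromIntegral (hdr - 0x80)
        , smallItems items ->
            Just (Memo items bytes)
    _ -> Nothing

-- | Upper-case, zero-padded hexadecimal rendering.
toHex :: [Word8] -> String
toHex = concatMap byte
  where
    byte w = map toUpper (pad (showHex w ""))
    pad s = replicate (2 - length s) '0' ++ s

main :: IO ()
main = do
    let finite     = encodeFinite [20]
        indefinite = encodeIndefinite [20]
        decoded    = finite >>= decodeList
    print (fmap toHex finite)      -- Just "8114"
    print (fmap toHex indefinite)  -- Just "9F14FF"
    -- Both byte strings decode to the same value...
    print (fmap memoValue decoded == fmap memoValue (indefinite >>= decodeList))
    -- ...but naively re-encoding the value changes the bytes,
    print ((decoded >>= encodeIndefinite . memoValue) == finite)  -- False
    -- whereas returning the memoized bytes preserves them.
    print (fmap memoBytes decoded == finite)                      -- True
```

Since a hash digest is computed over bytes, the `False` case above is exactly where a recomputed datum hash would diverge from the one the ledger uses; returning the memoized bytes avoids that.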
1 parent 9c8ef1b commit 3f614c3

32 files changed

Lines changed: 94 additions & 40 deletions

server/ogmios.cabal

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.

server/package.yaml

Lines changed: 1 addition & 0 deletions
@@ -137,6 +137,7 @@ tests:
 dependencies:
 - aeson
 - aeson-pretty
+- base16
 - bytestring
 - cardano-client
 - cardano-ledger-alonzo

server/src/Ogmios/Data/Json/Alonzo.hs

Lines changed: 5 additions & 6 deletions
@@ -12,8 +12,6 @@ import Cardano.Binary
     ( serialize' )
 import Cardano.Ledger.Crypto
     ( Crypto )
-import Codec.Serialise
-    ( serialise )
 import Data.ByteString.Base16
     ( encodeBase16 )
 import GHC.Records
@@ -200,10 +198,11 @@ encodeCostModels =
     encodeMap stringifyLanguage encodeCostModel . Al.unCostModels

 encodeData
-    :: Al.Data era
+    :: Ledger.Era era
+    => Al.Data era
     -> Json
-encodeData (Al.Data datum) =
-    encodeByteStringBase16 (toStrict (serialise datum))
+encodeData =
+    encodeByteStringBase16 . serialize'

 encodeDataHash
     :: Crypto crypto
@@ -381,7 +380,7 @@ encodeProposedPPUpdates (Sh.ProposedPPUpdates m) =
     encodeMap Shelley.stringifyKeyHash (encodePParams' encodeStrictMaybe) m

 encodeRedeemers
-    :: Ledger.Era era
+    :: forall era. (Ledger.Era era)
     => Al.Redeemers era
     -> Json
 encodeRedeemers (Al.Redeemers redeemers) =

server/test/unit/Ogmios/Data/JsonSpec.hs

Lines changed: 53 additions & 7 deletions
@@ -140,6 +140,8 @@ import Ogmios.Data.Protocol.TxSubmission
     )
 import Ouroboros.Consensus.Cardano.Block
     ( CardanoEras, GenTx, HardForkApplyTxErr (..) )
+import Ouroboros.Consensus.Shelley.Eras
+    ( StandardAlonzo )
 import Ouroboros.Network.Block
     ( Point (..), Tip (..) )
 import Ouroboros.Network.Protocol.LocalStateQuery.Type
@@ -154,6 +156,7 @@ import Test.Generators
     , genBlockNo
     , genBoundResult
     , genCompactGenesisResult
+    , genData
     , genDelegationAndRewardsResult
     , genEpochResult
     , genEvaluateTxResponse
@@ -210,6 +213,7 @@ import Test.QuickCheck
     , conjoin
     , counterexample
     , elements
+    , forAll
     , forAllBlind
     , forAllShrinkBlind
     , genericShrink
@@ -226,17 +230,20 @@ import Test.QuickCheck.Arbitrary.Generic
 import qualified Ogmios.Data.Json.Alonzo as Alonzo
 import qualified Ogmios.Data.Json.Babbage as Babbage

+import qualified Cardano.Ledger.Alonzo.Data as Ledger
 import qualified Codec.Json.Wsp.Handler as Wsp
 import qualified Data.Aeson as Json
 import qualified Data.Aeson.Encode.Pretty as Json
 import qualified Data.Aeson.Types as Json
 import qualified Data.ByteString as BS
+import qualified Data.ByteString.Base16 as B16
+import qualified Data.Text.Encoding as T
 import qualified Test.QuickCheck as QC

-jsonifierToAeson
+encodingToValue
     :: Json
     -> Json.Value
-jsonifierToAeson =
+encodingToValue =
     fromJust . Json.decodeStrict . jsonToByteString

 -- | Generate arbitrary value of a data-type and verify they match a given
@@ -251,7 +258,7 @@ validateToJSON gen encode (n, vectorFilePath) ref = parallel $ do
     runIO $ generateTestVectors (n, vectorFilePath) gen encode
     refs <- runIO $ unsafeReadSchemaRef ref
     specify (toString $ getSchemaRef ref) $ forAllBlind gen
-        (prop_validateToJSON (jsonifierToAeson . encode) refs)
+        (prop_validateToJSON (encodingToValue . encode) refs)

 -- | Similar to 'validateToJSON', but also check that the produce value can be
 -- decoded back to the expected form.
@@ -268,7 +275,7 @@ validateFromJSON gen (encode, decode) (n, vectorFilePath) ref = parallel $ do
     specify (toString $ getSchemaRef ref) $ forAllBlind gen $ \a ->
         let leftSide = decodeWith decode (jsonToByteString (encode a)) in
         conjoin
-            [ prop_validateToJSON (jsonifierToAeson . encode) refs a
+            [ prop_validateToJSON (encodingToValue . encode) refs a
             , leftSide == Just a
                 & counterexample (decodeUtf8 $ Json.encodePretty $ inefficientEncodingToValue $ encode a)
                 & counterexample ("Got: " <> show leftSide)
@@ -312,7 +319,6 @@ spec = do
                 Json.Success (UTxOInBabbageEra utxo') ->
                     utxo' === utxo
                         & counterexample (decodeUtf8 $ Json.encodePretty encoded)
-
         specify "Golden: Utxo_1.json" $ do
             json <- decodeFileThrow "Utxo_1.json"
             case Json.parse (decodeUtxo @StandardCrypto) json of
@@ -355,6 +361,21 @@ spec = do
                 Json.Success UTxOInBabbageEra{} ->
                     fail "successfully decoded an invalid payload( as Babbage Utxo)?"

+    context "Data / BinaryData" $ do
+        prop "arbitrary" $
+            forAll genData propBinaryDataRoundtrip
+
+        prop "Golden (1)" $
+            propBinaryDataRoundtrip $ unsafeDataFromBytes
+                "D8668219019E8201D8668219010182D866821903158140D8668219020C\
+                \83230505"
+
+        prop "Golden (2)" $
+            propBinaryDataRoundtrip $ unsafeDataFromBytes
+                "D8798441FFD87982D87982D87982D87981581CC279A3FB3B4E62BBC78E\
+                \288783B58045D4AE82A18867D8352D02775AD87981D87981D87981581C\
+                \121FD22E0B57AC206FEFC763F8BFA0771919F5218B40691EEA4514D0D8\
+                \7A80D87A801A002625A0D87983D879801A000F4240D879811A000FA92E"

     context "validate chain-sync request/response against JSON-schema" $ do
         validateFromJSON
@@ -862,6 +883,31 @@ instance Arbitrary SerializedTx where
             \ce8473e990d61c1506f6"
         ]

+
+propBinaryDataRoundtrip :: Ledger.Data StandardAlonzo -> Property
+propBinaryDataRoundtrip dat =
+    let json = jsonToByteString (Alonzo.encodeData @StandardAlonzo dat)
+     in case B16.decodeBase16 . T.encodeUtf8 <$> Json.decode (toLazy json) of
+            Just (Right bytes) ->
+                let
+                    dataFromBytes = Ledger.makeBinaryData (toShort bytes)
+                    originalData = Ledger.dataToBinaryData dat
+                 in conjoin
+                        [ dataFromBytes
+                            === Right originalData
+                        , (Ledger.hashBinaryData <$> dataFromBytes)
+                            === Right (Ledger.hashBinaryData originalData)
+                        ] & counterexample (decodeUtf8 json)
+            _ ->
+                property False
+
+unsafeDataFromBytes :: ByteString -> Ledger.Data era
+unsafeDataFromBytes =
+    either (error . show) Ledger.binaryDataToData
+    . Ledger.makeBinaryData
+    . either error toShort
+    . B16.decodeBase16
+
 --
 -- Local State Query
 --
@@ -911,10 +957,10 @@ validateQuery json parser (n, vectorFilepath) resultRef =
             -- max success. In the end, the property run 1 time per era!
             runQuickCheck $ withMaxSuccess 20 $ forAllBlind
                 (genResult Proxy)
-                (prop_validateToJSON (jsonifierToAeson . encodeQueryResponse) resultRefs)
+                (prop_validateToJSON (encodingToValue . encodeQueryResponse) resultRefs)

             let encodeQueryUnavailableInCurrentEra
-                    = jsonifierToAeson
+                    = encodingToValue
                     . _encodeQueryResponse encodeAcquireFailure
                     . Wsp.Response Nothing

server/test/unit/Test/Generators.hs

Lines changed: 7 additions & 0 deletions
@@ -9,6 +9,8 @@ module Test.Generators where

 import Ogmios.Prelude

+import Cardano.Ledger.Alonzo.Data
+    ( Data )
 import Cardano.Ledger.Alonzo.Tools
     ( TransactionScriptFailure (..) )
 import Cardano.Ledger.Alonzo.TxInfo
@@ -724,6 +726,11 @@ genUtxoBabbage
 genUtxoBabbage =
     reasonablySized arbitrary

+genData
+    :: Gen (Data era)
+genData =
+    reasonablySized arbitrary
+
 shrinkUtxo
     :: forall era.
     ( Era era

server/test/vectors/ChainSync/Response/RequestNext/002.json

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.
